Real-time PCR Miner Version 4.0

 

1.      Introduction

 

Real-time PCR Miner provides a simple way to analyze real-time PCR data directly from raw fluorescence readings, without the need for a standard curve. It automatically calculates amplification kinetics and performs the subsequent calculations for relative quantification and assay variability. Amplification efficiencies are also tested to detect anomalous samples within groups and differences between experimental groups (amplification equivalence). Miner is freely accessible online for scientific research at http://www.miner.ewindup.info/. We hope you enjoy using it. For additional details, please contact the author (Dr. Sheng Zhao, Address: Lance Kriegsfeld Lab, Neurobiology Research Laboratory, Department of Psychology and Helen Wills Neuroscience Institute, 3210 Tolman Hall, UC Berkeley, Berkeley, CA 94720, Phone: 1-510-643-9899, Email: windupzs@gmail.com).

 

Basic Use   Explanation of the result   Future improvement   How to cite Miner   How to support this website  

 

2.      Basic Use

 

1)        Basic operation may be summarized as follows

 

i.                     Import raw fluorescence data

ii.                   Assign name for each sample

iii.                 Set the basic options

iv.                 Provide your email and platform you used for Real-time PCR

v.                   Set the direction of the data / results

vi.                 Submit your data and wait for the result

 

2)    Import raw fluorescence data


i.                    Export the raw fluorescence data from the software of your real-time PCR machine to a Microsoft Excel sheet, arranged by column or by row. Excel can only handle data sets with 256 columns or fewer, so arrange the data by row when you have more than 255 samples. Please remember to include the cycle number of each cycle in the first column (or in the first row, when the data are arranged by row). Note that some platforms, such as Bio-Rad's MyIQ and iCycler, use fractional cycle numbers instead of integers. You also need to give a name to each column (or row) in the first row (or column). See the example. For more questions about exporting the pure raw data from different platforms, please see the Frequently Asked Questions.

ii.                  Please make sure you have not included blank samples that show no amplification.

iii.                If any cell is missing data, Miner may respond with error or warning messages.

iv.                 If your platform has special requirements, modify the raw data accordingly before submitting them, although this is usually unnecessary.
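The checks above can be run on your own machine before pasting the data. The following is a minimal sketch, assuming data arranged by column and separated by tabs; the function name check_raw_data is hypothetical and is not part of Miner, and Miner's own validation may differ.

```python
def check_raw_data(text):
    """Return a list of problems found in tab-separated raw data laid out by column.

    Hypothetical helper, not part of Miner: the first row holds the names
    (including a name for the cycle column) and every other row holds numbers.
    """
    rows = [line.split("\t") for line in text.strip("\n").split("\n")]
    header, body = rows[0], rows[1:]
    problems = []
    if any(not name.strip() for name in header):
        problems.append("every column, including the cycle column, needs a name")
    for r, row in enumerate(body, start=2):
        if len(row) != len(header):
            problems.append(f"row {r}: expected {len(header)} cells, found {len(row)}")
            continue
        for c, cell in enumerate(row):
            try:
                float(cell)  # cycle numbers may be fractional on some platforms
            except ValueError:
                label = header[c] or c + 1
                problems.append(f"row {r}, column {label}: missing or non-numeric value {cell!r}")
    return problems


good = "Cycle\tActin_Ctrl_1\tActin_Ctrl_2\n1\t0.01\t0.02\n2\t0.03\t0.04\n"
bad = "Cycle\tActin_Ctrl_1\tActin_Ctrl_2\n1\t0.01\t\n2\t0.03\n"
print(check_raw_data(good))
print(check_raw_data(bad))
```

An empty list means no missing cells were found; each entry otherwise names the row and column to fix.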

 

3)                Assign name for each sample

i.                    For all requested analyses, the user should give each sample a name (in the first row of its column) in the following format: "GeneName_ReplicateName_#No.#". Please use only letters, numbers, and "_".

ii.                  "GeneName", "ReplicateName", and "#No.#" must be separated by "_", so that Miner can use the separator to calculate the mean and standard error for both the replicate group and the gene group. "GeneName" and "ReplicateName" are always required for running Miner.

iii.                "GeneName" can be the name of any gene you used in the experiment.

iv.                 "ReplicateName" can be any label you want to give to the sample. It is typically a label for the treatment type, time point, concentration point, or tissue type.

v.                   "#No.#" is a sequential label for replicate samples (for example triplicates), such as "1", "2", "3", etc. Although "#No.#" can be omitted, it is a good way to distinguish the samples within a replicate group when looking at the details of an individual sample.

vi.                 For example, if there are triplicates for detecting the Actin gene expression level after 24 hours of treatment, the user can name the triplicate samples "Actin_24Hours_1", "Actin_24Hours_2", and "Actin_24Hours_3".

vii.               Note: please also remember to give a name to the cycle number column (or row).

viii.             Copying data from Microsoft Excel formats the data automatically: columns are separated by a tab, and each line in the text area stands for a row.
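The naming rules above can be checked with a short script before submission. This is a sketch only: it applies the stated character rule strictly (letters, numbers, and "_" as separator), and the names parse_sample_name and group_by_replicate are hypothetical helpers, not part of Miner.

```python
import re

# Two required parts and one optional part, separated by "_".
# Only letters and digits are accepted inside each part.
NAME_PATTERN = re.compile(r"([A-Za-z0-9]+)_([A-Za-z0-9]+)(?:_([A-Za-z0-9]+))?")


def parse_sample_name(name):
    """Split 'GeneName_ReplicateName_No' into (gene, replicate, number or None)."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"name does not follow GeneName_ReplicateName_No: {name!r}")
    return match.groups()


def group_by_replicate(names):
    """Map 'GeneName_ReplicateName' to the full sample names in that group."""
    groups = {}
    for name in names:
        gene, replicate, _ = parse_sample_name(name)
        groups.setdefault(f"{gene}_{replicate}", []).append(name)
    return groups


samples = ["Actin_24Hours_1", "Actin_24Hours_2", "Actin_0Hours_1"]
print(group_by_replicate(samples))
```

Grouping by the first two parts mirrors how the replicate group is described above; a name containing any other character raises ValueError.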

 

4)        Set the basic options

The default settings of Miner usually work well as long as the quality of the experiment is good. Running Miner and then visually inspecting the results is also a good way to test the quality of the experiment, especially StdErr_OfReplicateSamplesEfficiency, StdErr_OfReplicateSamplesCT, Stdev_OfGenesEfficiency, MeanCV%_OfGenesEfficiencies, and MeanCV%_OfGenesCTs. In some cases, you might want to adjust the basic options listed below for a particular reason. If you do change the default values, make sure you use the same values for all related data that you submit to Miner, so that the results remain comparable under the same statistical stringency.

i.                    MaxP-value:

The maximal P-value allowed for the non-linear regression with the three-parameter simple exponent model from which efficiency is calculated. The default maximal P-value is 0.05, the minimal requirement for statistical significance. For experiments where the data cannot fit the exponential model well (especially experiments with lower quality data), the user can try a higher maximal P-value. Keep in mind, however, that the larger the maximal P-value you choose, the less statistically significant the efficiency that Miner produces.

ii.                  minEfficiency and maxEfficiency:

The minimum and maximum efficiency can be set by the user. The default values are 0.0 (0%) and 2.0 (200%), which work fine in most cases. If your samples are expected to have an unusual efficiency outside the default range (0% to 200%), you can change these two values to cover it. When Miner produces error messages with the default values for lower quality or unusual samples, you can try a lower minimum efficiency and/or a higher maximum efficiency.

 

5)        Provide your email and platform you used for Real-time PCR

The user must provide a valid email address for receiving the results, and choose or type the brand and model of the real-time PCR system used, for technical support. Once you submit your data, a Job_ID is generated automatically (e.g. "976360283_381464"); you can use it to retrieve your results in the future.

 

6)        Set the direction of the data / results

                     Please indicate the direction of the samples in the submitted data and the results. The default direction is "by column". If you have more than 255 samples and want to use Microsoft Excel (which can only handle data sets with 256 columns or fewer) to handle them, you need to set the direction to "by row".
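If your data are already laid out by column and you need them by row (or the reverse), the tab-separated text can be transposed before pasting. A minimal sketch follows; the function name transpose_tab_text is hypothetical, and the sample names are made up for illustration.

```python
def transpose_tab_text(text):
    """Swap rows and columns of tab-separated text (by column <-> by row)."""
    rows = [line.split("\t") for line in text.strip("\n").split("\n")]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all rows must have the same number of cells")
    # zip(*rows) yields one tuple per original column
    return "\n".join("\t".join(column) for column in zip(*rows))


by_column = "Cycle\tActin_Ctrl_1\tActin_Ctrl_2\n1\t0.1\t0.2\n2\t0.3\t0.4"
by_row = transpose_tab_text(by_column)
print(by_row)
```

Applying the function twice returns the original layout, which is a quick way to confirm that nothing was lost.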

 

7)        Submit your data and wait for the results (results are in .txt format, which can easily be imported into Microsoft Excel. An email with the results attached will also be sent to the address you provided when you submitted your data.)

 

                         i.              Click "Submit raw data" to submit your analysis request.

 

                       ii.              Click "Reset raw data" to reset the analysis request form to the example data and default options.

 

                      iii.              Click "Clean raw data" to clear the data in the text area before you paste your own data.

 

3.      Explanation of the result

 

   In most cases, we recommend using the average efficiency of each gene and the average CT of each set of replicates (replicates from the same sample, or "PCR replicates") for quantification. However, when you are concerned about tissue-specific or treatment-specific inhibition or activation of the PCR reaction for the same gene in your experiment, you might need to use the average efficiency of the replicates instead of the average efficiency of each gene. In this case, you had better use more replicates (6 or more) for each group to get an accurate average efficiency, because you have fewer samples to average (only three if you run triplicates). Here is an example in an Excel file of how to do the quantification using the average efficiency of each gene and the average CT of replicates.
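As a rough illustration of this kind of quantification, the sketch below makes several assumptions that are not stated on this page: that efficiency E is a fraction (1.0 = 100%), so the amplification factor per cycle is (1 + E); that a relative starting quantity can be taken as 1 / (1 + E)^CT; and that a target gene is normalized to a reference gene. The function names are hypothetical, and the linked Excel example may proceed differently.

```python
def relative_start_quantity(efficiency, ct):
    """Relative starting quantity, ASSUMING R0 is proportional to 1 / (1 + E)^CT."""
    return 1.0 / (1.0 + efficiency) ** ct


def relative_expression(target_eff, target_ct, ref_eff, ref_ct):
    """Target gene quantity normalized to a reference gene (assumed design)."""
    return relative_start_quantity(target_eff, target_ct) / relative_start_quantity(ref_eff, ref_ct)


# Average gene efficiencies and average replicate CTs (made-up numbers)
ratio = relative_expression(target_eff=1.0, target_ct=20.0, ref_eff=1.0, ref_ct=18.0)
print(ratio)
```

With both efficiencies at 100%, a target CT two cycles later than the reference corresponds to one quarter of the reference quantity.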

 

Keywords:   SampleNames     Logistic_a     Logistic_b     Logistic_X0     Logistic_Y0     Logistic_Pvalue     Noise(SPE)     EndofExpPhase(SDM)     CP(SPE)     CP(SDM)     DynamicThreshold     LowerCycleNumber     UpperCycleNumber     PointsForRegression     Number_OfRegressionWindows     WeightedAverage_OfPvalue     StdErr_OfWeightedPvalue     WeightedAverage_OfEfficiency     StdErr_OfWeightedEfficiency     CT     TotalSampleNumber     ReplicateSampleNames     AverageEfficiency_OfReplicateSamples     StdErr_OfReplicateSamplesEfficiency     CoefficientVariation (%)_OfReplicateSamplesEfficiencies     AverageCT_OfReplicateSamples     StdErr_OfReplicateSamplesCT     CV%_OfReplicateSamplesCTs     GeneNames     AverageEfficiency_OfGenes     Stdev_OfGenesEfficiency     MeanCV%_OfGenesEfficiencies     MeanCV%_OfGenesCTs     Warnings     Errors

 

1)        SampleNames:

Sample name given by user

 

2)        Logistic_a:

Parameter "a" in the four-parameter logistic model

 

3)        Logistic_b:

Parameter "b" in the four-parameter logistic model

 

4)        Logistic_X0:

Parameter "X0" in the four-parameter logistic model

      

5)        Logistic_Y0:

Parameter "Y0" in the four-parameter logistic model

 

6)        Logistic_Pvalue:

P-value of the F-statistic computed for the regression using the four-parameter logistic model
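This page does not state the formula of the logistic model. As an illustration only, the sketch below assumes one commonly used four-parameter form, y = Y0 + a / (1 + (x / X0)^b), in which a negative b gives a rising curve; the function name logistic4 and the parameter values are made up.

```python
def logistic4(x, a, b, x0, y0):
    """Four-parameter logistic curve, ASSUMED form: y0 + a / (1 + (x / x0) ** b)."""
    return y0 + a / (1.0 + (x / x0) ** b)


# Made-up parameters: baseline 0.5, plateau 10.5, midpoint at cycle 20
params = dict(a=10.0, b=-15.0, x0=20.0, y0=0.5)
for cycle in (5, 20, 40):
    print(cycle, logistic4(cycle, **params))
```

Under this assumed form, Y0 is the baseline fluorescence, Y0 + a is the plateau, and X0 is the cycle at which the curve is halfway between the two.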

      

7)        Noise(SPE):

Standard deviation of noise cycles, also considered as the start point of the exponential phase


8)        EndofExpPhase(SDM):

Fluorescence reading at the second derivative maximum, also considered the end of the exponential phase

      

9)        CP(SPE):

Crossing point of the start point of the exponential phase

 

10)     CP(SDM):

Crossing point of the second positive second derivative maximum

 

11)     DynamicThreshold:

A threshold chosen for each sample dynamically based on its own kinetics

 

12)     LowerCycleNumber:

The lower boundary row number (or column number when data submitted by row) of the exponential phase

 

13)     UpperCycleNumber:

The upper boundary row number (or column number when data submitted by row) of the exponential phase

 

14)     PointsForRegression:

The number of points eventually used for the exponential phase regression

 

15)     Number_OfRegressionWindows:

The number of windows eventually used in exponential phase for regression

 

16)     WeightedAverage_OfPvalue:

The weighted average of the P-value of F-statistic for the regression windows

      

17)     StdErr_OfWeightedPvalue:

The standard error of the weighted average of the P-value of F-statistic for the regression windows

 

18)     WeightedAverage_OfEfficiency:

The weighted average efficiency of each sample

 

19)     StdErr_OfWeightedEfficiency:

The standard error of the weighted average efficiency for each sample

 

20)     CT:

The fractional cycle number at which the sample reaches its dynamic threshold value
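To show what a fractional cycle number at a threshold means, the sketch below interpolates linearly between the two cycles that bracket the threshold. This is an illustration with made-up readings; the function name fractional_ct is hypothetical, and Miner's own procedure for deriving CT may differ.

```python
def fractional_ct(cycles, fluorescence, threshold):
    """Cycle number at which fluorescence first crosses the threshold.

    Uses linear interpolation between neighbouring readings (an assumption
    made for this sketch). Returns None if the threshold is never crossed.
    """
    points = list(zip(cycles, fluorescence))
    for (c0, f0), (c1, f1) in zip(points, points[1:]):
        if f0 < threshold <= f1:
            return c0 + (threshold - f0) * (c1 - c0) / (f1 - f0)
    return None


ct = fractional_ct([18, 19, 20, 21], [0.1, 0.2, 0.4, 0.8], threshold=0.3)
print(ct)
```

Here the threshold of 0.3 lies halfway between the readings at cycles 19 and 20, so the result is close to 19.5.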

 

21)     TotalSampleNumber:

The total number of samples in the submitted data set

 

22)      ReplicateSampleNames:

Partial name of the sample ("GeneName_ReplicateName") taken from the full name ("GeneName_ReplicateName_No."), representing the replicate samples

 

23)     AverageEfficiency_OfReplicateSamples:

The mean Efficiency of the replicate samples

 

24)     StdErr_OfReplicateSamplesEfficiency:

The standard error of the Efficiency for replicate samples

 

25)     CoefficientVariation (%)_OfReplicateSamplesEfficiencies:

Coefficient of variation (%) of the efficiencies within replicate samples


26)     AverageCT_OfReplicateSamples:

The mean CT of replicate samples

 

27)     StdErr_OfReplicateSamplesCT:

The standard error of the CT for replicate samples

 

28)     CV%_OfReplicateSamplesCTs:

Coefficient of variation (%) of the CTs within replicate samples

 

29)     GeneNames:

Partial name of the sample ("GeneName") taken from the full name ("GeneName_ReplicateName_No."), representing the gene detected in a set of samples

 

30)     AverageEfficiency_OfGenes:

The mean Efficiency of each gene

 

31)     Stdev_OfGenesEfficiency:

The standard deviation of the Efficiency for each gene

 

32)     MeanCV%_OfGenesEfficiencies:

The mean coefficient of variation (%) of Efficiency for each gene

 

33)     MeanCV%_OfGenesCTs:

The mean coefficient of variation (%) of CT for each gene

 

34)     Warnings:

Warning messages found by Miner

 

35)     Errors:

Error messages found by Miner
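The replicate-level statistics in the list above (mean, standard error, and CV%) can be reproduced by hand for a quick check. The sketch below uses the usual textbook definitions with the sample standard deviation (n - 1); whether Miner uses exactly these definitions is an assumption, and the function name replicate_summary and the CT values are made up.

```python
from math import sqrt
from statistics import mean, stdev


def replicate_summary(values):
    """Mean, standard error, and coefficient of variation (%) of replicate values."""
    m = mean(values)
    sd = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return {
        "mean": m,
        "stderr": sd / sqrt(len(values)),
        "cv_percent": 100.0 * sd / m,
    }


# CTs of one triplicate group (made-up numbers)
summary = replicate_summary([20.0, 20.5, 21.0])
print(summary)
```

For these three CTs the mean is 20.5 and the standard deviation is 0.5, giving a standard error of about 0.29 and a CV of about 2.4%.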

 

4.      Future improvement

 

1)        Currently, Miner works perfectly with the MyIQ and iCycler systems from Bio-Rad. Some raw data from the ABI PRISM 7700 and 7900, Stratagene MX 3000 and 4000, Roche LightCycler, MJ Research DNA Engine Opticon2, etc. have also passed tests successfully. You are welcome to send us data samples from other platforms for additional tests, so that we can make Miner more powerful and support more platforms.

 

2)        We are continually updating Miner and this website. Your input and discussion are always welcome.

 

3)        Additional options for data analysis may be included in the future.

 

5.      How to cite Miner

 

Sheng Zhao, Russell D. Fernald. Comprehensive algorithm for quantitative real-time polymerase chain reaction. J. Comput. Biol. 2005 Oct;12(8):1045-62.