Package picard.sam
Class DuplicationMetrics
java.lang.Object
htsjdk.samtools.metrics.MetricBase
picard.analysis.MergeableMetricBase
picard.sam.DuplicationMetrics
- Direct Known Subclasses:
FlowBasedDuplicationMetrics
@DocumentedFeature(groupName="Metrics",
summary="Metrics")
public class DuplicationMetrics
extends MergeableMetricBase
Metrics that are calculated during the process of marking duplicates
within a stream of SAMRecords.
-
Nested Class Summary
Nested classes/interfaces inherited from class picard.analysis.MergeableMetricBase
MergeableMetricBase.MergeByAdding, MergeableMetricBase.MergeByAssertEquals, MergeableMetricBase.MergingIsManual, MergeableMetricBase.NoMergingIsDerived, MergeableMetricBase.NoMergingKeepsValue
-
Field Summary
Fields
Modifier and Type, Field, Description
ESTIMATED_LIBRARY_SIZE
The estimated number of unique molecules in the library based on PE duplication.
LIBRARY
The library on which the duplicate marking was performed.
PERCENT_DUPLICATION
The fraction of mapped sequence that is marked as duplicate.
long READ_PAIR_DUPLICATES
The number of read pairs that were marked as duplicates.
long READ_PAIR_OPTICAL_DUPLICATES
The number of read pair duplicates that were caused by optical duplication.
long READ_PAIRS_EXAMINED
The number of mapped read pairs examined.
long SECONDARY_OR_SUPPLEMENTARY_RDS
The number of reads that were either secondary or supplementary.
long UNMAPPED_READS
The total number of unmapped reads examined.
long UNPAIRED_READ_DUPLICATES
The number of fragments that were marked as duplicates.
long UNPAIRED_READS_EXAMINED
The number of mapped reads examined which did not have a mapped mate pair, either because the read is unpaired, or the read is paired to an unmapped mate.
-
Constructor Summary
Constructors
DuplicationMetrics()
-
Method Summary
Modifier and Type, Method, Description
void addDuplicateReadToMetrics(htsjdk.samtools.SAMRecord rec)
Adds a duplicate read to the metrics.
void addReadToLibraryMetrics(htsjdk.samtools.SAMRecord rec)
Adds a read to the metrics.
void calculateDerivedFields()
Fills in the ESTIMATED_LIBRARY_SIZE based on the paired read data examined where possible and the PERCENT_DUPLICATION.
void calculateDerivedMetrics
Deprecated.
htsjdk.samtools.util.Histogram<Double> calculateRoiHistogram
Calculates a histogram using the estimateRoi method to estimate the effective yield of sequencing x times as deeply, for x = 1..10.
static Long estimateLibrarySize(long readPairs, long uniqueReadPairs)
Estimates the size of a library based on the number of paired end molecules observed and the number of unique pairs observed.
static double estimateRoi(long estimatedLibrarySize, double x, long pairs, long uniquePairs)
Estimates the ROI (return on investment) that one would see if a library was sequenced to x times the observed coverage.
static void main
Methods inherited from class picard.analysis.MergeableMetricBase
canMerge, merge, merge, mergeIfCan
Methods inherited from class htsjdk.samtools.metrics.MetricBase
equals, hashCode, toString
-
Field Details
-
LIBRARY
The library on which the duplicate marking was performed.
-
UNPAIRED_READS_EXAMINED
public long UNPAIRED_READS_EXAMINED
The number of mapped reads examined which did not have a mapped mate pair, either because the read is unpaired, or the read is paired to an unmapped mate.
-
READ_PAIRS_EXAMINED
public long READ_PAIRS_EXAMINED
The number of mapped read pairs examined. (Primary, non-supplemental)
-
SECONDARY_OR_SUPPLEMENTARY_RDS
public long SECONDARY_OR_SUPPLEMENTARY_RDS
The number of reads that were either secondary or supplementary.
-
UNMAPPED_READS
public long UNMAPPED_READS
The total number of unmapped reads examined. (Primary, non-supplemental)
-
UNPAIRED_READ_DUPLICATES
public long UNPAIRED_READ_DUPLICATES
The number of fragments that were marked as duplicates.
-
READ_PAIR_DUPLICATES
public long READ_PAIR_DUPLICATES
The number of read pairs that were marked as duplicates.
-
READ_PAIR_OPTICAL_DUPLICATES
public long READ_PAIR_OPTICAL_DUPLICATES
The number of read pair duplicates that were caused by optical duplication. Value is always <= READ_PAIR_DUPLICATES, which counts all duplicates regardless of source.
-
PERCENT_DUPLICATION
The fraction of mapped sequence that is marked as duplicate.
-
ESTIMATED_LIBRARY_SIZE
The estimated number of unique molecules in the library based on PE duplication.
-
-
Constructor Details
-
DuplicationMetrics
public DuplicationMetrics()
-
-
Method Details
-
calculateDerivedFields
public void calculateDerivedFields()
Fills in the ESTIMATED_LIBRARY_SIZE based on the paired read data examined where possible and the PERCENT_DUPLICATION.
- Overrides:
calculateDerivedFields in class MergeableMetricBase
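As an illustration of how a derived field such as PERCENT_DUPLICATION can be computed from the counters documented above, the following is a minimal sketch. It assumes (this page does not state it) that each read pair contributes two reads to both the numerator and the denominator, and that the result is 0 when nothing was examined. The class PercentDuplicationSketch and its method are hypothetical names and are not part of Picard.

```java
/** Hypothetical sketch, not Picard source: a duplication fraction derived from raw counters. */
final class PercentDuplicationSketch {

    /**
     * Assumption: a read pair counts as two reads, a fragment as one.
     * Returns duplicate reads divided by examined mapped reads, or 0.0 if none were examined.
     */
    static double percentDuplication(long unpairedReadsExamined,
                                     long readPairsExamined,
                                     long unpairedReadDuplicates,
                                     long readPairDuplicates) {
        long examinedReads = unpairedReadsExamined + 2 * readPairsExamined;
        if (examinedReads == 0) {
            return 0.0; // assumed convention for an empty library
        }
        long duplicateReads = unpairedReadDuplicates + 2 * readPairDuplicates;
        return (double) duplicateReads / (double) examinedReads;
    }

    public static void main(String[] args) {
        // 100 fragments and 200 pairs examined; 10 fragments and 20 pairs marked as duplicates.
        System.out.println(percentDuplication(100, 200, 10, 20));
    }
}
```

Despite the field name, the value in this sketch is a fraction between 0 and 1, matching the field description, not a percentage.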
-
calculateDerivedMetrics
Deprecated. Use calculateDerivedFields() instead.
Fills in the ESTIMATED_LIBRARY_SIZE based on the paired read data examined where possible and the PERCENT_DUPLICATION.
-
estimateLibrarySize
Estimates the size of a library based on the number of paired end molecules observed and the number of unique pairs observed.
Based on the Lander-Waterman equation that states:
    C/X = 1 - exp( -N/X )
where
    X = number of distinct molecules in library
    N = number of read pairs
    C = number of distinct fragments observed in read pairs
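The equation has no closed-form solution for X, so it has to be solved numerically. The following is a minimal sketch that uses bisection on f(X) = C/X - 1 + exp(-N/X). It is not taken from the Picard source; the class LibrarySizeSketch, the bracketing strategy, the iteration count, and the choice to return null when there are no duplicates are all assumptions made for illustration.

```java
/** Hypothetical sketch, not Picard source: solves C/X = 1 - exp(-N/X) for X by bisection. */
final class LibrarySizeSketch {

    /** f(X) = C/X - 1 + exp(-N/X); a root of f is a library size X that satisfies the equation. */
    static double f(double x, double c, double n) {
        return c / x - 1.0 + Math.exp(-n / x);
    }

    /**
     * Returns the estimated number of distinct molecules, or null when the inputs carry
     * no duplicate information (assumed convention for this sketch).
     */
    static Long estimate(long readPairs, long uniqueReadPairs) {
        long duplicates = readPairs - uniqueReadPairs;
        if (readPairs <= 0 || uniqueReadPairs <= 0 || duplicates <= 0) {
            return null;
        }
        double n = readPairs;
        double c = uniqueReadPairs;

        // Bracket the root: f(C) = exp(-N/C) > 0, and f becomes negative for large X when N > C.
        double lo = c;
        double hi = 2.0 * c;
        while (f(hi, c, n) > 0) {
            lo = hi;
            hi *= 2.0;
        }

        // Bisection: f is positive below the root and negative above it.
        for (int i = 0; i < 100; i++) {
            double mid = (lo + hi) / 2.0;
            if (f(mid, c, n) > 0) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return Math.round((lo + hi) / 2.0);
    }

    public static void main(String[] args) {
        // 1000 read pairs of which 632 are unique: the library is roughly as large as the sample.
        System.out.println(estimate(1000, 632));
    }
}
```

When every observed pair is unique, the data place no upper bound on X, which is why the sketch declines to return an estimate in that case.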
-
estimateRoi
public static double estimateRoi(long estimatedLibrarySize, double x, long pairs, long uniquePairs)
Estimates the ROI (return on investment) that one would see if a library was sequenced to x times the observed coverage.
- Parameters:
estimatedLibrarySize - the estimated number of molecules in the library
x - the multiple of sequencing to be simulated (i.e. how many X sequencing)
pairs - the number of pairs observed in the actual sequencing
uniquePairs - the number of unique pairs observed in the actual sequencing
- Returns:
a number z <= x such that, if pairs*x pairs were sequenced, uniquePairs*z unique pairs would be expected.
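One way to read this description is through the Lander-Waterman relation given for estimateLibrarySize: with a library of X molecules, sequencing pairs*x read pairs is expected to yield X * (1 - exp(-pairs*x / X)) unique pairs, and dividing by uniquePairs gives z. The sketch below implements that reading. It is an assumption about the computation, not a copy of the Picard source, and the class name RoiSketch is hypothetical.

```java
/** Hypothetical sketch, not Picard source: ROI of deeper sequencing under Lander-Waterman. */
final class RoiSketch {

    /**
     * Expected unique pairs at x times the observed depth, expressed as a
     * multiple z of the unique pairs actually observed.
     */
    static double estimateRoi(long estimatedLibrarySize, double x, long pairs, long uniquePairs) {
        double librarySize = estimatedLibrarySize;
        double expectedUnique = librarySize * (1.0 - Math.exp(-(x * pairs) / librarySize));
        return expectedUnique / uniquePairs;
    }

    public static void main(String[] args) {
        // Example inputs: a library of 1000 molecules, 1000 pairs sequenced, 632 of them unique.
        for (int x = 1; x <= 10; x++) {
            System.out.printf("x=%d z=%.3f%n", x, estimateRoi(1000, x, 1000, 632));
        }
    }
}
```

Because the exponential term saturates, z grows more slowly than x: each additional multiple of sequencing returns fewer new unique pairs than the one before it.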
-
calculateRoiHistogram
Calculates a histogram using the estimateRoi method to estimate the effective yield of sequencing x times as deeply, for x = 1..10.
-
main
-
addDuplicateReadToMetrics
public void addDuplicateReadToMetrics(htsjdk.samtools.SAMRecord rec)
Adds a duplicate read to the metrics.
-
addReadToLibraryMetrics
public void addReadToLibraryMetrics(htsjdk.samtools.SAMRecord rec)
Adds a read to the metrics.
-