NGS basics: Quality metrics for NGS library preparation


The aim of every NGS library prep workflow is to convert input DNA or cDNA molecules, by ligating vendor-specific adapters, into a format that can be sequenced on an NGS platform. The concept appears straightforward, but achieving this goal is more complex than it looks, and only rigorous quality control during and after the experiment can help ensure optimal outcomes.

How can you evaluate the quality of an NGS library?

Two determinants of successful sequencing are the quality and quantity of the library material. The fragment size distribution and an accurate library concentration, typically measured by qPCR, are two critical parameters that help you evaluate the quality of your library prior to sequencing. Besides these two standard quality metrics, the following additional metrics enable you to judge the overall quality of your library prep.

Conversion rate

The conversion rate describes how many input molecules were converted into “sequenceable” fragments, that is, fragments that have adapters attached to both ends. One simple way to calculate the conversion rate is to divide the measured yield of specific library by the theoretical maximum yield. This calculation requires assumptions about the efficiency of the library amplification PCR and about DNA loss during cleanup steps. Keep in mind that qPCR-based library quantification does not discriminate between specific library fragments and adapter dimers, so it is not a suitable quantification method for libraries with high adapter-dimer contamination. For such libraries, we recommend quantification with an electrophoresis-based method such as QIAxcel, Agilent Bioanalyzer or Agilent TapeStation.

For example, assuming 95% PCR efficiency and 20% sample loss in cleanup steps, the theoretical maximum yield can be calculated as:

[Equation image not recovered: theoretical maximum yield for the assumed PCR efficiency and cleanup loss]

The conversion rate is then defined as the ratio between the measured yield and the theoretical maximum yield:

Conversion rate = measured library yield / theoretical maximum yield

The conversion rate is largely influenced by the ligation efficiency. A good library prep chemistry has enzyme and buffer formulations optimized for maximum ligation efficiency.
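The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the exact formula from the original example: it assumes that each PCR cycle multiplies the library mass by (1 + efficiency), that the cleanup loss is applied once to the total yield, and that the mass added by adapters is ignored. The function names and the example numbers are hypothetical.

```python
def theoretical_max_yield_ng(input_ng, pcr_cycles, pcr_efficiency=0.95, cleanup_loss=0.20):
    """Yield in ng if every input molecule were converted into library.

    Assumptions (not taken from the original article):
    - each PCR cycle multiplies the mass by (1 + pcr_efficiency)
    - cleanup_loss is the total fraction lost across all cleanup steps
    - adapter mass is ignored
    """
    if input_ng < 0 or pcr_cycles < 0:
        raise ValueError("input_ng and pcr_cycles must be non-negative")
    if not 0.0 <= pcr_efficiency <= 1.0:
        raise ValueError("pcr_efficiency must be between 0 and 1")
    if not 0.0 <= cleanup_loss < 1.0:
        raise ValueError("cleanup_loss must be in [0, 1)")
    amplification = (1.0 + pcr_efficiency) ** pcr_cycles
    return input_ng * amplification * (1.0 - cleanup_loss)


def conversion_rate(measured_yield_ng, max_yield_ng):
    """Ratio between the measured yield and the theoretical maximum yield."""
    if max_yield_ng <= 0:
        raise ValueError("max_yield_ng must be positive")
    return measured_yield_ng / max_yield_ng


# Hypothetical example: 10 ng input, 6 PCR cycles, 150 ng measured library yield.
max_yield = theoretical_max_yield_ng(10, 6)
print(f"theoretical maximum: {max_yield:.1f} ng")
print(f"conversion rate: {conversion_rate(150, max_yield):.1%}")
```

If your cleanup loss is better described per step, or your measured yield includes adapter dimers, the numbers will shift accordingly, so treat the result as a rough comparison between preps rather than an absolute value.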

Complexity

A high conversion rate and low bias in library prep and PCR amplification will capture more unique molecules from your sample. The more unique molecules that are sequenced, the fewer duplicate reads will be present in your data set. Duplicate reads do not add meaningful information to the NGS data set and can lead to skewed variant frequencies. For this reason, duplicate reads are typically removed during data analysis.

Another important factor determining the complexity of an NGS library is the amount of sample used for library prep. The lower the sample input, the less complex an NGS library becomes. Thus, high ligation efficiency and a high conversion rate are especially important for sub-nanogram input amounts, to capture the maximum possible complexity of a limited sample.
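To see how library complexity and sequencing depth interact, the following Python sketch estimates the fraction of duplicate reads you might expect. It assumes that reads are sampled uniformly at random from the unique molecules in the library (a simple Poisson sampling model), which ignores amplification bias, so real duplicate rates are usually somewhat higher. The function names and the fragment keys in the example are hypothetical.

```python
import math


def expected_duplicate_fraction(unique_molecules, reads):
    """Expected fraction of duplicate reads under uniform random sampling.

    Assumption: every unique molecule is equally likely to be sequenced,
    so the expected number of distinct molecules observed is
    N * (1 - exp(-reads / N)).
    """
    if unique_molecules <= 0 or reads <= 0:
        raise ValueError("unique_molecules and reads must be positive")
    expected_unique = unique_molecules * (1.0 - math.exp(-reads / unique_molecules))
    return 1.0 - expected_unique / reads


def observed_duplicate_fraction(fragment_keys):
    """Fraction of reads that repeat an already seen fragment key.

    fragment_keys: one hashable key per read (or read pair), for example
    (chromosome, start, end, strand).
    """
    keys = list(fragment_keys)
    if not keys:
        raise ValueError("fragment_keys must not be empty")
    return 1.0 - len(set(keys)) / len(keys)


# Hypothetical comparison: the same 10 million reads from two libraries.
for molecules in (5_000_000, 100_000_000):
    fraction = expected_duplicate_fraction(molecules, 10_000_000)
    print(f"{molecules:>11,} unique molecules: {fraction:.1%} duplicates expected")
```

The comparison shows why low-input libraries are more sensitive to conversion rate: with fewer unique molecules available, the same sequencing depth yields a larger share of duplicates.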

Uniformity

The coverage uniformity describes how evenly reads are distributed along the genome or across a set of target regions. The more uniform the coverage, the less sequencing is required to reach sufficient depth in all regions of interest. Bias in coverage uniformity is usually introduced in the library prep and library amplification stages. Very often, the coverage shows a strong GC bias, i.e., less or more coverage depending on the GC content. Plotting coverage as a function of GC content helps uncover GC bias. With a good library prep, coverage depends only weakly on GC content.
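One simple way to look at both aspects is sketched below in Python. It assumes you already have, for each genomic window or target region, its GC content in percent and its mean coverage; how you obtain those values depends on your pipeline. The coefficient of variation is used here as one possible uniformity summary, and the function names, bin width and example values are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def coverage_cv(coverages):
    """Coefficient of variation of coverage; lower values mean more uniform coverage."""
    coverages = list(coverages)
    average = mean(coverages)
    if average == 0:
        raise ValueError("mean coverage is zero")
    return pstdev(coverages) / average


def normalized_coverage_by_gc(windows, bin_width_percent=10):
    """Mean coverage per GC bin, divided by the overall mean coverage.

    windows: iterable of (gc_percent, coverage) pairs, gc_percent in 0..100.
    Returns {lower bound of GC bin in percent: normalized coverage}.
    Values near 1.0 in every bin indicate little GC bias.
    """
    windows = list(windows)
    if not windows:
        raise ValueError("windows must not be empty")
    overall = mean(cov for _, cov in windows)
    if overall == 0:
        raise ValueError("mean coverage is zero")
    n_bins = math.ceil(100 / bin_width_percent)
    bins = defaultdict(list)
    for gc_percent, cov in windows:
        # Clamp so that 100% GC falls into the last bin.
        index = min(int(gc_percent // bin_width_percent), n_bins - 1)
        bins[index].append(cov)
    return {
        index * bin_width_percent: mean(covs) / overall
        for index, covs in sorted(bins.items())
    }


# Hypothetical windows: (GC percent, mean coverage)
example = [(25, 18), (28, 22), (45, 31), (48, 29), (72, 14), (75, 12)]
print("CV:", round(coverage_cv(cov for _, cov in example), 3))
for gc_bin, ratio in normalized_coverage_by_gc(example).items():
    print(f"GC {gc_bin}-{gc_bin + 10}%: {ratio:.2f}x of mean coverage")
```

Plotting the per-bin ratios gives the kind of coverage-versus-GC view described above; a flat profile suggests little GC bias.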

Accuracy

The higher the accuracy of your NGS library prep, the more you can trust your variant reporting. Nucleotide errors are typically introduced during library amplification PCR as well as during sequencing itself. Sequencing error rates typically remain below 1%. Library amplification PCR errors can be minimized by using high-fidelity PCR reagents. NGS reference samples with well-defined variants and known variant frequencies help you assess the accuracy of your NGS workflow.
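A comparison against a reference sample can be summarized with a few numbers, as in the following Python sketch. It assumes you have the expected variants and allele frequencies from the reference sample documentation and the variants called from your own data, both keyed by the same variant identifier. The function name, the chosen summary metrics and the example values are hypothetical and only meant as an illustration.

```python
def evaluate_against_reference(expected, observed):
    """Compare called variants with the known variants of a reference sample.

    expected, observed: dicts mapping a variant identifier to its
    allele frequency (0..1).
    """
    if not expected:
        raise ValueError("expected must contain at least one variant")
    detected = expected.keys() & observed.keys()
    missed = expected.keys() - observed.keys()
    unexpected = observed.keys() - expected.keys()

    # Mean absolute difference between observed and expected allele frequency,
    # computed only over the variants that were detected.
    if detected:
        mean_abs_af_error = sum(abs(observed[v] - expected[v]) for v in detected) / len(detected)
    else:
        mean_abs_af_error = None

    return {
        "sensitivity": len(detected) / len(expected),
        "precision": len(detected) / len(observed) if observed else 0.0,
        "mean_abs_af_error": mean_abs_af_error,
        "missed": sorted(missed),
        "unexpected": sorted(unexpected),
    }


# Hypothetical reference sample with three known variants.
expected_variants = {"varA": 0.10, "varB": 0.05, "varC": 0.20}
called_variants = {"varA": 0.12, "varB": 0.04, "varX": 0.01}
for name, value in evaluate_against_reference(expected_variants, called_variants).items():
    print(name, value)
```

Unexpected calls at low allele frequency are candidates for PCR or sequencing errors, while missed variants and shifted frequencies can point to low complexity or biased amplification.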

During our library prep product development cycle, these key quality indicators are taken into account and used to optimize our QIAseq library prep products to obtain better end results. If you’d like to learn more about QIAGEN’s QIAseq NGS Portfolio, click here.

Interested in learning more about NGS library construction technology? Sign up for a free 3-part webinar series on NGS here!

Verena Schramm

Dr. Verena Schramm is a Global Product Manager for NGS library preparation products at QIAGEN. She holds a degree in Molecular Biology and Bioinformatics from the University of Applied Sciences, Gelsenkirchen and has worked on various NGS-related applications in industry and academia since 2008. After completing her PhD at the European Molecular Biology Laboratory (EMBL) in Heidelberg, Germany in 2014 and prior to joining QIAGEN, she worked for a Swiss start-up in the sector of Data Driven Medicine.
