Genotyping by Sequencing for Crop Improvement (multiple authors), page 60

3.2 Basic Steps Involved in Whole‐Genome Sequencing and Resequencing

Whole‐genome sequencing (WGS) can be divided into two groups: de novo WGS and whole‐genome resequencing (WGR) (Bhat et al. 2020). De novo WGS involves the complete assembly of a species genome for the first time (Sevanthi et al. 2018), whereas WGR compares genomic variability within individuals or populations (Patil et al. 2019) and requires a reference genome to be available beforehand for read mapping and variant detection.

In de novo WGS, library preparation begins with the fragmentation of high‐quality genomic DNA, followed by the addition of adaptors to the DNA fragments. Short reads from standard libraries (350–550 bp insert size) are used to detect small structural variants such as INDELs or CNVs (copy number variations), whereas long‐read data or mate‐pair libraries with an insert size of around 2 to 20 kb are required to detect large structural variants. Illumina platforms are often used for high‐throughput sequencing. The reads are aligned to one another on the basis of sequence similarity and assembled into local contigs. Repetitive regions are difficult to resolve with short reads during assembly. In such cases, mate‐pair reads help by linking and orienting contigs into larger sequences, referred to as scaffolds or supercontigs. Gaps of unknown sequence are denoted as Ns. The final result of a genome assembly is therefore a series of scaffolds, each made up of contiguous sequences separated by gaps.
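The relationship between contigs, gaps, and scaffolds can be illustrated with a minimal Python sketch. It assumes the contigs have already been ordered and oriented (for example, from mate‐pair links) and that the gap sizes are known estimates. The function names `build_scaffold` and `split_scaffold` and the toy sequences are hypothetical and do not come from any particular assembler.

```python
import re


def build_scaffold(contigs, gap_sizes):
    """Join ordered, oriented contigs into one scaffold.

    Each gap of unknown sequence between neighbouring contigs is
    written as a run of N characters of the estimated gap size.
    """
    if len(gap_sizes) != len(contigs) - 1:
        raise ValueError("need exactly one gap size between each pair of contigs")
    parts = [contigs[0]]
    for gap, contig in zip(gap_sizes, contigs[1:]):
        parts.append("N" * gap)
        parts.append(contig)
    return "".join(parts)


def split_scaffold(scaffold):
    """Recover the contigs of a scaffold by splitting at runs of N."""
    return [piece for piece in re.split(r"N+", scaffold) if piece]


# Toy example (illustrative sequences only): three contigs, two gaps.
contigs = ["ACGT", "GGCC", "TTA"]
scaffold = build_scaffold(contigs, gap_sizes=[3, 2])
print(scaffold)
print(split_scaffold(scaffold))
```

Splitting at runs of N is also how contig‐level statistics are typically recovered from a scaffold‐level sequence file.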

In contrast to de novo WGS, WGR compares the variable sequences between the genomes of individuals or populations. For WGR, a genome sequence of the species is a prerequisite for read mapping. In the case of an individual, for example, high‐quality genomic DNA is fragmented for library preparation, and adaptors are added to fragments with an average insert size of 350–500 bp. High‐throughput paired‐end sequencing then yields short reads of about 100 bp, which are mapped to the reference genome on the basis of sequence similarity. A single‐nucleotide polymorphism (SNP) is detected wherever a nucleotide in the reads differs from the corresponding reference base. Some SNPs may be missed because the region is not present in the reference genome, because the site is heterozygous, or because coverage is low. In the case of a population, the aim is to sequence a wide range of individuals and analyze their genomic data as a whole. These techniques have a wide range of applications in conservation and management (Fuentes‐Pardo and Ruzzante 2017).
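The logic of SNP detection from mapped reads, and the way low coverage can hide a variant, can be sketched in Python as follows. This is a toy illustration rather than a real variant caller. It assumes ungapped alignments given as (start position, read sequence) pairs. The function name `call_snps` and the thresholds `min_depth` and `min_allele_fraction` are hypothetical choices made for this example.

```python
from collections import Counter


def call_snps(reference, aligned_reads, min_depth=4, min_allele_fraction=0.25):
    """Report positions where the mapped reads disagree with the reference.

    aligned_reads is a list of (start, sequence) pairs, assumed to be
    ungapped alignments with 0-based start positions.
    """
    # Count the bases observed at every reference position (a simple pileup).
    pileup = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    calls = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            # Low coverage: any SNP at this position goes undetected.
            continue
        alleles = sorted(
            base for base, n in counts.items() if n / depth >= min_allele_fraction
        )
        if alleles and alleles != [reference[pos]]:
            zygosity = "het" if len(alleles) > 1 else "hom"
            calls.append((pos, reference[pos], "/".join(alleles), zygosity))
    return calls


# Toy data (illustrative only): four reads cover positions 0-5, one read covers 6-9.
reference = "ACGTACGTAC"
reads = [(0, "ACTTCC"), (0, "ACGTCC"), (0, "ACTTCC"), (0, "ACGTCC"), (6, "TTAC")]
for call in call_snps(reference, reads):
    print(call)
```

In this toy data, the mismatch at position 6 is supported by a single read and is therefore skipped at the default depth threshold. Lowering `min_depth` to 1 would report it, at the cost of trusting a single observation.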
