Development of a hybridization-based capture NGS assay to assess genome-wide diversity in Xylella fastidiosa infected samples


Abstract: Early detection of Xylella fastidiosa (Xf) and its assignment to subspecies and sequence type (ST) are critical for its management, since they help to determine the potential host range of the Xf populations present in epidemic foci and to take appropriate measures for eradication, containment and monitoring. Currently, Xf typing at the subspecies and ST level is based on multilocus sequence typing (MLST). However, Xf detection can fail or be inconsistent because PCR inhibitors interfere with DNA amplification, leading to unclear results or false negatives. In addition, ST-level typing based on seven housekeeping genes does not provide sufficient phylogenetic resolution to determine dispersal paths or relationships among strains that are of biological and quarantine relevance. Consequently, whole-genome sequence (WGS) data should probably be used, especially when developing management strategies for Xf outbreaks. Unfortunately, when a rapid response is needed after detection of an outbreak, it is not always possible to isolate the Xf strain to obtain its genome sequence, or the data obtained by direct NGS analysis of infected samples do not contain enough Xf reads to adequately identify the intercepted strain. In this study, we developed an Xf Targeted Sequence Capture Enrichment (TSCE) procedure combined with High-Throughput Sequencing (HTS) on an Illumina platform to provide efficient access to Xf reads and to identify Xf at the strain level. More than 7,000 baits targeting 140 Xf gene sequences present in the Xf chromosome or plasmids were selected to cover genomic markers of all subspecies and STs described to date. We showed that whereas Xf reads accounted for < 0.25 % of the reads obtained by direct WGS of host DNA, this proportion increased to 41-73 % with the TSCE-HTS approach in individual samples or in mixtures of up to four multiplexed plant samples.
We were able to identify all seven loci commonly used for Xf MLST and to correctly identify the subspecies and ST of all Xf strains present in mock-inoculated and naturally infected plant and insect samples. After assembly of the captured reads, we were able to identify up to 284 Xf coding sequences (CDS), indicating that more CDS than expected were captured. Furthermore, phylogenetic analysis of 90 captured and aligned genes correctly grouped the infected samples with the reference Xf strains known to be infecting them.
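
The enrichment figures reported above (< 0.25 % Xf reads by direct WGS versus 41-73 % after TSCE-HTS) can be restated as a fold change. The following is a minimal sketch of that arithmetic only; the function names are hypothetical and are not part of the study's pipeline. Because the pre-capture proportion is reported as an upper bound (< 0.25 %), the resulting fold values should be read as lower bounds.

```python
def percent_on_target(target_reads: int, total_reads: int) -> float:
    """Percentage of sequenced reads assigned to the target organism (here Xf)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * target_reads / total_reads


def fold_enrichment(pct_after: float, pct_before: float) -> float:
    """Ratio of the on-target percentage after capture to that before capture."""
    if pct_before <= 0:
        raise ValueError("pct_before must be positive")
    return pct_after / pct_before


# Values taken from the abstract: < 0.25 % before capture, 41-73 % after.
# Using 0.25 % as the pre-capture value gives a conservative (lower-bound) estimate.
low = fold_enrichment(41.0, 0.25)
high = fold_enrichment(73.0, 0.25)
print(f"Fold enrichment of at least {low:.0f}x to {high:.0f}x")
```

Under these assumptions the capture step corresponds to at least a 164-fold to 292-fold increase in the proportion of Xf reads.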
