HPV DeepSeq: An Ultra-Fast Method of NGS Data Analysis and Visualization Using Automated Workflows and a Customized Papillomavirus Database in CLC Genomics Workbench

dc.contributor.author: Shen-Gunther, Jane
dc.contributor.author: Xia, Qingqing
dc.contributor.author: Cai, Hong
dc.contributor.author: Wang, Yufeng
dc.date.accessioned: 2021-08-26T13:28:26Z
dc.date.available: 2021-08-26T13:28:26Z
dc.date.issued: 2021-08-13
dc.date.updated: 2021-08-26T13:28:27Z
dc.description.abstract: Next-generation sequencing (NGS) has enabled human papillomavirus (HPV) virome profiling for in-depth investigation of viral evolution and pathogenesis. However, viral computational analysis remains a bottleneck due to semantic discrepancies between computational tools and curated reference genomes. To address this, we developed and tested automated workflows for HPV taxonomic profiling and visualization using a customized papillomavirus database in the CLC Microbial Genomics Module. HPV genomes from Papilloma Virus Episteme were customized and incorporated into CLC “ready-to-use” workflows for stepwise data processing comprising (1) Taxonomic Analysis, (2) Estimate Alpha/Beta Diversities, and (3) Map Reads to Reference. Low-grade (n = 95) and high-grade (n = 60) Pap smears were tested, with the following collective runtimes: Taxonomic Analysis (36 min); Alpha/Beta Diversities (5 s); Map Reads (45 min). Converting tabular output to visualizations took 1–2 keystrokes. Biodiversity analysis comparing low-grade (LSIL) and high-grade squamous intraepithelial lesions (HSIL) revealed loss of species richness and gain of dominance by HPV-16 in HSIL. Integrating clinically relevant, taxonomized HPV reference genomes within automated workflows proved to be an ultra-fast method of virome profiling. The entire process, named “HPV DeepSeq,” provides a simple, accurate, and practical means of NGS data analysis for a broad range of applications in viral research.
dc.description.department: Molecular Microbiology and Immunology
dc.identifier: doi: 10.3390/pathogens10081026
dc.identifier.citation: Pathogens 10 (8): 1026 (2021)
dc.identifier.uri: https://hdl.handle.net/20.500.12588/672
dc.rights: Attribution 4.0 United States
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: bioinformatics
dc.subject: cervical cancer
dc.subject: deep sequencing
dc.subject: human papillomavirus
dc.subject: HPV genotyping
dc.subject: metagenome
dc.subject: next generation sequencing
dc.subject: taxonomic classification
dc.subject: virome
dc.title: HPV DeepSeq: An Ultra-Fast Method of NGS Data Analysis and Visualization Using Automated Workflows and a Customized Papillomavirus Database in CLC Genomics Workbench
dc.type: Article
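
The abstract's biodiversity finding (lower species richness and higher HPV-16 dominance in HSIL) rests on standard alpha-diversity measures. The sketch below shows how such measures can be computed from per-genotype read counts. It is not the CLC Microbial Genomics Module implementation; the function name `alpha_diversity`, the choice of the Shannon and Berger-Parker indices, and the read counts are all illustrative assumptions and are not data from the article.

```python
import math


def alpha_diversity(counts):
    """Return (richness, Shannon index, Berger-Parker dominance).

    `counts` maps a genotype label to the number of reads assigned to it.
    Genotypes with zero reads are ignored.
    """
    present = {taxon: n for taxon, n in counts.items() if n > 0}
    total = sum(present.values())
    if total == 0:
        raise ValueError("no reads assigned to any genotype")

    proportions = [n / total for n in present.values()]
    richness = len(present)                                  # genotypes detected
    shannon = -sum(p * math.log(p) for p in proportions)     # natural-log Shannon index
    dominance = max(proportions)                             # share of the top genotype
    return richness, shannon, dominance


# Hypothetical read counts, made up for illustration only.
lsil_like = {"HPV-16": 250, "HPV-51": 250, "HPV-52": 250, "HPV-66": 250}
hsil_like = {"HPV-16": 900, "HPV-52": 100, "HPV-66": 0}

for label, sample in (("LSIL-like", lsil_like), ("HSIL-like", hsil_like)):
    richness, shannon, dominance = alpha_diversity(sample)
    print(f"{label}: richness={richness}, shannon={shannon:.3f}, dominance={dominance:.2f}")
```

In this toy comparison, the sample dominated by one genotype has lower richness and a lower Shannon index but a higher dominance value, which is the direction of change the abstract reports for HSIL relative to LSIL.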

Files

Original bundle

Name: pathogens-10-01026-v2.pdf
Size: 26.22 MB
Format: Adobe Portable Document Format
