Investigation of a data processing pipeline for high-throughput sequencing-based methylation (Methyl-Seq)
The purpose of this thesis is to understand and implement a pipeline for processing high-throughput DNA sequencing data. The project examines the technology behind the most widely used sequencing platforms in the industry and the interpretation of standard data formats. It also surveys alignment software and the key algorithms used to speed up the alignment process. Finally, a pipeline is put in place so that a raw data file can be aligned and its methylation data visualized. The software is available at the Computational Biology Initiative, so that in the future it will be accessible through the University of Texas at San Antonio.