Citation
Eldjarn Hjoerleifsson, Kristjan (2023) Graph Modeling for Genomics and Epidemiology. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/s32c-a211. https://resolver.caltech.edu/CaltechTHESIS:02122023-103759689
Abstract
The last decades have seen great leaps made in the development of RNA sequencing technologies, yielding lower cost and greater throughput of experiments, to the point where the scale of the data produced on a daily basis is staggering. While computational hardware is also continuously improving, famously (or perhaps infamously) described by Gordon Moore (Moore, 1965), the rate at which data are produced eclipses advances on the hardware front. Over the last few years, many new methods have been proposed for bridging that ever-widening chasm, more than a few of which harness the latent graphical structure of genomic data to reduce the number of calculations required and pack the data tighter in memory. This body of work continues this development on three different, but related, fronts. Firstly, I present developments that greatly improve upon the efficiency of state-of-the-art methods for the quantification of RNA-seq reads, and describe a method that improves the accuracy of quantification without substantially increasing the computational overhead. Secondly, I introduce a procedure for the discovery of associations between novel gene isoforms and phenotypes, without prior knowledge of those isoforms. Lastly, I present the largest reconstruction of the transmission tree of a viral outbreak to date, modeled from viral genome sequences, contact tracing, and symptom data. I then use the reconstructed transmission tree to assess the efficacy of different vaccination strategies.
Item Type: | Thesis (Dissertation (Ph.D.))
---|---
Subject Keywords: | Computational Biology, Molecular Epidemiology, RNA-seq, RNA quantification, SARS-CoV-2
Degree Grantor: | California Institute of Technology
Division: | Engineering and Applied Science
Major Option: | Computing and Mathematical Sciences
Thesis Availability: | Public (worldwide access)
Research Advisor(s): |
Thesis Committee: |
Defense Date: | 16 December 2022
Non-Caltech Author Email: | kristjan (AT) eldjarn.net
Record Number: | CaltechTHESIS:02122023-103759689
Persistent URL: | https://resolver.caltech.edu/CaltechTHESIS:02122023-103759689
DOI: | 10.7907/s32c-a211
Related URLs: |
ORCID: |
Default Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: | 15105
Collection: | CaltechTHESIS
Deposited By: | Kristjan Eldjarn Hjoerleifsson
Deposited On: | 17 Feb 2023 17:48
Last Modified: | 23 May 2023 20:02
Thesis Files
PDF - Final Version (11MB). See Usage Policy.