A Caltech Library Service

Functional Genomic Studies of the Structure and Regulation of Eukaryotic Transcriptomes


Marinov, Georgi Kolev (2014) Functional Genomic Studies of the Structure and Regulation of Eukaryotic Transcriptomes. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/1BXM-ZM46.


The main focus of this thesis is the use of high-throughput sequencing technologies in functional genomics (in particular in the form of ChIP-seq, chromatin immunoprecipitation coupled with sequencing, and RNA-seq) and the study of the structure and regulation of transcriptomes. Some parts of it are of a more methodological nature while others describe the application of these functional genomic tools to address various biological problems. A significant part of the research presented here was conducted as part of the ENCODE (ENCyclopedia Of DNA Elements) Project.

The first part of the thesis focuses on the structure and diversity of the human transcriptome. Chapter 1 contains an analysis of the diversity of the human polyadenylated transcriptome based on RNA-seq data generated for the ENCODE Project. Chapter 2 presents a simulation-based examination of the performance of some of the most popular computational tools used to assemble and quantify transcriptomes. Chapter 3 includes a study of variation in gene expression, alternative splicing and allelic expression bias at the single-cell level and on a genome-wide scale in human lymphoblastoid cells; it also raises a number of methodological considerations critical to the practice of single-cell RNA-seq measurements.

The second part presents several studies applying functional genomic tools to the study of the regulatory biology of organellar genomes, primarily in mammals but also in plants. Chapter 5 contains an analysis of the occupancy of the human mitochondrial genome by TFAM, an important structural and regulatory protein in mitochondria, using ChIP-seq. In Chapter 6, the mitochondrial DNA occupancy of the TFB2M transcriptional regulator, the MTERF termination factor, and the mitochondrial RNA and DNA polymerases is characterized. Chapter 7 consists of an investigation into the curious phenomenon of the physical association of nuclear transcription factors with mitochondrial DNA, based on the diverse collections of transcription factor ChIP-seq datasets generated by the ENCODE, mouseENCODE and modENCODE consortia. In Chapter 8 this line of research is further extended to existing publicly available ChIP-seq datasets in plants and their mitochondrial and plastid genomes.

The third part is dedicated to the analytical and experimental practice of ChIP-seq. As part of the ENCODE Project, a set of metrics for assessing the quality of ChIP-seq experiments was developed, and the results of this activity are presented in Chapter 9. These metrics were later used to carry out a global analysis of ChIP-seq quality in the published literature (Chapter 10). In Chapter 11, the development and initial application of an automated robotic ChIP-seq (in which these metrics also played a major role) is presented.

The fourth part presents the results of some additional projects the author has been involved in, including the study of the role of the Piwi protein in the transcriptional regulation of transposon expression in Drosophila (Chapter 12), and the use of single-cell RNA-seq to characterize the heterogeneity of gene expression during cellular reprogramming (Chapter 13).

The last part of the thesis provides a review of the results of the ENCODE Project and of the interpretation of the complexity of the biochemical activity exhibited by mammalian genomes that those results have revealed (Chapters 15 and 16), an overview of the technical developments expected in the near future and their impact on the field of functional genomics (Chapter 14), and a discussion of some research areas that have so far been insufficiently explored. In the opinion of the author, the future study of these areas will provide deep insights into many fundamental but not yet completely answered questions about the transcriptional biology of eukaryotes and its regulation.

Item Type: Thesis (Dissertation (Ph.D.))
Subject Keywords: ChIP-seq, RNA-seq, transcriptomics, ENCODE, functional genomics
Degree Grantor: California Institute of Technology
Division: Biology and Biological Engineering
Major Option: Biology
Thesis Availability: Public (worldwide access)
Research Advisor(s):
  • Wold, Barbara J.
Thesis Committee:
  • Aravin, Alexei A. (chair)
  • Wold, Barbara J.
  • Davidson, Eric H.
  • Sternberg, Paul W.
Defense Date: 1 May 2014
Funding (agency, grant number):
  • NIH, U54 HG004576
  • NIH, U54 HG006998
  • Beckman Institute Functional Genomics Center, UNSPECIFIED
  • Donald Bren Endowment, UNSPECIFIED
Record Number: CaltechTHESIS:05122014-102729631
Persistent URL:
Related URLs:
  • A: Effects of sequence variation on differential allelic transcription factor occupancy and gene expression
  • B: The ENCODE Project Consortium - An integrated encyclopedia of DNA elements in the human genome
  • C: Landscape of transcription in human cells
  • D + Ch. 9: ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia
  • E + Ch. 4: Gene expression changes in a tumor xenograft by a pyrrole-imidazole polyamide
  • F: Antitumor activity of a pyrrole-imidazole polyamide
  • G: Piwi induces piRNA-guided transcriptional silencing and establishment of a repressive chromatin state
  • Appendix H + Ch. 5: Genome-Wide Analysis Reveals Coating of the Mitochondrial Genome by TFAM (URL type: DOI; pone.0074513)
  • I: Integrating and mining the chromatin landscape of cell-type specificity using self-organizing maps
  • J + Ch. 3: From single-cell to cell-pool transcriptomes: stochasticity in gene expression and RNA splicing
  • K + Ch. 10: Large-scale quality analysis of published ChIP-seq data
  • L + Ch. 7: Evidence for site-specific occupancy of the mitochondrial genome by nuclear transcription factors
  • M + Ch. 15: Defining functional DNA elements in the human genome
  • N: A User's Guide to the Encyclopedia of DNA Elements (ENCODE)
  • N: An encyclopedia of mouse DNA elements (Mouse ENCODE)
Marinov, Georgi Kolev (ORCID: 0000-0003-1822-7273)
Default Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 8228
Deposited By: Georgi Marinov
Deposited On: 22 May 2014 18:45
Last Modified: 08 Nov 2023 00:36

Thesis Files

PDF (Thesis - display version) - Final Version
See Usage Policy.

PDF (Thesis - double spaced version) - Final Version
See Usage Policy.

