Genome Research

  1549-5469

  1088-9051

  United States

Publisher: COLD SPRING HARBOR LAB PRESS, PUBLICATIONS DEPT, Cold Spring Harbor Laboratory Press

Subject areas:
Genetics (clinical), Genetics

Featured articles

Interactive Exploration of Microarray Gene Expression Patterns in a Reduced Dimensional Space
Volume 12, Issue 7 - Pages 1112-1120 - 2002
Jatin Misra, William Schmitt, Daehee Hwang, Li-Li Hsiao, S. R. Gullans, George Stephanopoulos, Gregory Stephanopoulos
The very high dimensional space of gene expression measurements obtained by DNA microarrays impedes the detection of underlying patterns in gene expression data and the identification of discriminatory genes. In this paper we show the use of projection methods such as principal components analysis (PCA) to obtain a direct link between patterns in the genes and patterns in samples. This feature is useful in the initial interactive pattern exploration of gene expression data and data-driven learning of the nature and types of samples. Using oligonucleotide microarray measurements of 40 samples from different normal human tissues, we show that distinct patterns are obtained when the genes are projected on a two-dimensional plane spanned by the loadings of the two major principal components. These patterns define the particular genes associated with a sample class (i.e., tissue). When used separately from the other genes, these class-specific (i.e., tissue-specific) genes in turn define distinct tissue patterns in the projection space spanned by the scores of the two major principal components. In this study, PCA projection facilitated discriminatory gene selection for different tissues and identified tissue-specific gene expression signatures for liver, skeletal muscle, and brain samples. Furthermore, it allowed the classification of nine new samples belonging to these three types using the linear combination of the expression levels of the tissue-specific genes determined from the first set of samples. The application of the technique to other published data sets is also discussed. [Online supplementary material available at www.genome.org.]
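The link between gene loadings and sample scores described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it recovers only the first principal component, it swaps in plain power iteration for a full PCA routine, and the expression values, gene order, and function names are hypothetical.

```python
from math import sqrt


def center_columns(rows):
    """Subtract each gene's (column's) mean across samples."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def first_component(rows, iterations=50):
    """Return (loadings, scores) of the leading principal component.

    Power iteration on X^T X, where X is the centered samples-by-genes matrix.
    Assumes the start vector is not orthogonal to the leading component.
    """
    x = center_columns(rows)
    n_genes = len(x[0])
    v = [1.0] + [0.0] * (n_genes - 1)  # arbitrary start vector
    for _ in range(iterations):
        scores = [sum(a * b for a, b in zip(row, v)) for row in x]  # X v
        w = [sum(row[j] * s for row, s in zip(x, scores)) for j in range(n_genes)]  # X^T (X v)
        norm = sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    scores = [sum(a * b for a, b in zip(row, v)) for row in x]
    return v, scores


# Hypothetical toy data: rows are samples, columns are genes A, B, C.
expression = [
    [10.0, 1.0, 5.0],  # liver sample 1
    [9.0, 2.0, 5.0],   # liver sample 2
    [1.0, 10.0, 5.0],  # muscle sample 1
    [2.0, 9.0, 5.0],   # muscle sample 2
]
loadings, scores = first_component(expression)
```

In this toy case genes A and B receive loadings of opposite sign and the flat gene C receives a loading near zero, while the scores place the two liver samples on one side of the axis and the two muscle samples on the other.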
Reductive evolution of architectural repertoires in proteomes and the birth of the tripartite world
Volume 17, Issue 11 - Pages 1572-1585 - 2007
Minglei Wang, Liudmila S. Yafremava, Derek Caetano-Anollés, Jay E. Mittenthal, Gustavo Caetano‐Anollés
The repertoire of protein architectures in proteomes is evolutionarily conserved and capable of preserving an accurate record of genomic history. Here we use a census of protein architecture in 185 genomes that have been fully sequenced to generate genome-based phylogenies that describe the evolution of the protein world at fold (F) and fold superfamily (FSF) levels. The patterns of representation of F and FSF architectures over evolutionary history suggest three epochs in the evolution of the protein world: (1) architectural diversification, where members of an architecturally rich ancestral community diversified their protein repertoire; (2) superkingdom specification, where superkingdoms Archaea, Bacteria, and Eukarya were specified; and (3) organismal diversification, where F and FSF specific to relatively small sets of organisms appeared as the result of diversification of organismal lineages. Functional annotation of FSF along these architectural chronologies revealed patterns of discovery of biological function. Most importantly, the analysis identified an early and extensive differential loss of architectures occurring primarily in Archaea that segregates the archaeal lineage from the ancient community of organisms and establishes the first organismal divide. Reconstruction of phylogenomic trees of proteomes reflects the timeline of architectural diversification in the emerging lineages. Thus, Archaea undertook a minimalist strategy using only a small subset of the full architectural repertoire and then crystallized into a diversified superkingdom late in evolution. Our analysis also suggests a communal ancestor to all life that was molecularly complex and adopted genomic strategies currently present in Eukarya.
Whole population, genome-wide mapping of hidden relatedness
Volume 19, Issue 2 - Pages 318-326 - 2009
Alexander Gusev, Jennifer K. Lowe, Markus Stoffel, Mark J. Daly, David Altshuler, Jan L. Breslow, Jeffrey M. Friedman, Itsik Pe’er
We present GERMLINE, a robust algorithm for identifying segmental sharing indicative of recent common ancestry between pairs of individuals. Unlike methods with comparable objectives, GERMLINE scales linearly with the number of samples, enabling analysis of whole-genome data in large cohorts. Our approach is based on a dictionary of haplotypes that is used to efficiently discover short exact matches between individuals. We then expand these matches using dynamic programming to identify long, nearly identical segmental sharing that is indicative of relatedness. We use GERMLINE to comprehensively survey hidden relatedness both in the HapMap as well as in a densely typed island population of 3000 individuals. We verify that GERMLINE is in concordance with other methods when they can process the data, and also facilitates analysis of larger scale studies. We bolster these results by demonstrating novel applications of precise analysis of hidden relatedness for (1) identification and resolution of phasing errors and (2) exposing polymorphic deletions that are otherwise challenging to detect. This finding is supported by concordance of detected deletions with other evidence from independent databases and statistical analyses of fluorescence intensity not used by GERMLINE.
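The two stages named in this abstract, a haplotype dictionary for exact seed matches followed by extension into longer, nearly identical segments, can be sketched as follows. This is an illustrative toy and not the GERMLINE code: the function names, window size, and mismatch budget are assumptions, and a simple greedy extension stands in for the dynamic programming step the abstract mentions.

```python
from collections import defaultdict
from itertools import combinations


def seed_matches(haplotypes, window):
    """Return {(id_a, id_b): [window indices where both carry identical alleles]}."""
    n_markers = len(next(iter(haplotypes.values())))
    seeds = defaultdict(list)
    for w, start in enumerate(range(0, n_markers - window + 1, window)):
        # Dictionary step: bucket individuals by the exact allele string in this window.
        buckets = defaultdict(list)
        for ident, hap in haplotypes.items():
            buckets[hap[start:start + window]].append(ident)
        for members in buckets.values():
            for a, b in combinations(sorted(members), 2):
                seeds[(a, b)].append(w)
    return dict(seeds)


def extend_seed(hap_a, hap_b, start, end, max_mismatch):
    """Greedily grow the half-open interval [start, end) within a mismatch budget."""
    mismatches = 0
    while start > 0:
        cost = int(hap_a[start - 1] != hap_b[start - 1])
        if mismatches + cost > max_mismatch:
            break
        mismatches += cost
        start -= 1
    while end < len(hap_a):
        cost = int(hap_a[end] != hap_b[end])
        if mismatches + cost > max_mismatch:
            break
        mismatches += cost
        end += 1
    return start, end


# Hypothetical phased haplotypes encoded as 0/1 allele strings.
haplotypes = {
    "ind1": "0110100111",
    "ind2": "0110100101",
    "ind3": "1001011000",
}
seeds = seed_matches(haplotypes, window=4)
segment = extend_seed(haplotypes["ind1"], haplotypes["ind2"], 0, 4, max_mismatch=1)
```

Bucketing by window content means each individual is hashed once per window, so non-matching pairs are never compared directly. Enumerating pairs inside a bucket is still quadratic in the bucket size, which this sketch makes no attempt to address.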
Genome-Wide In Silico Identification of Transcriptional Regulators Controlling the Cell Cycle in Human Cells
Volume 13, Issue 5 - Pages 773-780 - 2003
Ran Elkon, Chaim Linhart, Roded Sharan, Ron Shamir, Yosef Shiloh
Dissection of regulatory networks that control gene transcription is one of the greatest challenges of functional genomics. Using human genomic sequences, models for binding sites of known transcription factors, and gene expression data, we demonstrate that the reverse engineering approach, which infers regulatory mechanisms from gene expression patterns, can reveal transcriptional networks in human cells. To date, such methodologies were successfully demonstrated only in prokaryotes and low eukaryotes. We developed computational methods for identifying putative binding sites of transcription factors and for evaluating the statistical significance of their prevalence in a given set of promoters. Focusing on transcriptional mechanisms that control cell cycle progression, our computational analyses revealed eight transcription factors whose binding sites are significantly overrepresented in promoters of genes whose expression is cell-cycle-dependent. The enrichment of some of these factors is specific to certain phases of the cell cycle. In addition, several pairs of these transcription factors show a significant co-occurrence rate in cell-cycle-regulated promoters. Each such pair indicates functional cooperation between its members in regulating the transcriptional program associated with cell cycle progression. The methods presented here are general and can be applied to the analysis of transcriptional networks controlling any biological process. [Supplemental material is available online at www.genome.org, including full lists of genes whose promoters were found to contain high scoring sites for any of the enriched transcription factors reported in Tables 1 and 3.]
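One common way to score whether a binding site is overrepresented in a promoter set is a hypergeometric tail probability. The abstract does not state which statistic the authors used, so the sketch below is an assumption for illustration only, and the function name and counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(background_total, background_hits, set_size, set_hits):
    """P(X >= set_hits) when set_size promoters are drawn without replacement
    from background_total promoters, background_hits of which contain the site."""
    upper = min(background_hits, set_size)
    total = comb(background_total, set_size)
    tail = sum(
        comb(background_hits, k) * comb(background_total - background_hits, set_size - k)
        for k in range(set_hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background promoters, 5 with a site;
# a cell-cycle set of 4 promoters, 3 of which carry the site.
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
```

A small p-value here says only that the site count in the set would be unusual under random sampling from the background. A genome-wide scan over many factors would also need a multiple-testing correction.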
Human Haplotype Block Sizes Are Negatively Correlated With Recombination Rates
Volume 14, Issue 7 - Pages 1358-1361 - 2004
Tiffany A. Greenwood, Brinda K. Rana, Nicholas J. Schork
The International Haplotype Map (“HapMap”) Project is motivated, in part, by the belief that the organization of the human genome, the mechanics of recombination, and the population-level behavior of alleles at adjacent loci should allow researchers to parse the genome into small segments, or “blocks,” that show strong linkage disequilibrium (LD) between alleles at loci within those segments. The discovery and evidence for these blocks is to be based solely on the observed LD strength and patterns between alleles at adjacent loci throughout the genome. Although there are many factors that contribute to LD strength, we assessed the correlation between block structure, in terms of length and percentage of the genome assembled into blocks within a region, and recombination rate obtained from two independent sources. We found evidence of a striking negative correlation between the average recombination rate and average block length, suggesting that recombination rate is a strong contributor to haplotype block structure within the genome. We discuss the potential implications of this negative correlation in the context of the organization, properties, and potential ubiquity of a block-like structure in the human genome.
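The negative correlation reported in this abstract can be illustrated with a correlation coefficient computed over genomic regions. The sketch below uses Pearson's r, which is an assumption because the abstract does not name the statistic, and the per-region values are invented toy numbers rather than data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-region averages (not from the paper).
recombination_rate_cm_per_mb = [0.5, 1.0, 1.5, 2.0, 2.5]
mean_block_length_kb = [22.0, 18.0, 15.0, 11.0, 9.0]
r = pearson_r(recombination_rate_cm_per_mb, mean_block_length_kb)
```

A value of r near -1 would mean that regions with higher average recombination rates tend to have shorter average blocks, which is the direction of the relationship the abstract describes.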
Quantitative PCR and RT-PCR in virology.
Volume 2, Issue 3 - Pages 191-196 - 1993
Massimo Clementi, Stefano Menzo, Patrizia Bagnarelli, Aldo Manzin, A Valenza, Pietro E. Varaldo
Time-dependent genetic effects on gene expression implicate aging processes
Volume 27, Issue 4 - Pages 545-552 - 2017
Julien Bryois, Alfonso Buil, Pedro G. Ferreira, Nikolaos Panousis, Andrew Brown, Ana Viñuela, Alexandra Planchon, Deborah Bielser, Kerrin S. Small, Tim D. Spector, Emmanouil T. Dermitzakis
Gene expression is dependent on genetic and environmental factors. In the last decade, a large body of research has significantly improved our understanding of the genetic architecture of gene expression. However, it remains unclear whether genetic effects on gene expression remain stable over time. Here, we show, using longitudinal whole-blood gene expression data from a twin cohort, that the genetic architecture of a subset of genes is unstable over time. In addition, we identified 2213 genes differentially expressed across time points that we linked with aging within and across studies. Interestingly, we discovered that most differentially expressed genes were affected by a subset of 77 putative causal genes. Finally, we observed that putative causal genes and down-regulated genes were affected by a loss of genetic control between time points. Taken together, our data suggest that instability in the genetic architecture of a subset of genes could lead to widespread effects on the transcriptome with an aging signature.
Genome sequence of a proteolytic (Group I) Clostridium botulinum strain Hall A and comparative analysis of the clostridial genomes
Volume 17, Issue 7 - Pages 1082-1092 - 2007
Mohammed Sebaihia, Michael W. Peck, Nigel P. Minton, Nicholas R. Thomson, Matthew T. G. Holden, Wilfrid J. Mitchell, Andrew T. Carter, Stephen D. Bentley, David R. Mason, Lisa Crossman, Catherine J. Paul, Alasdair Ivens, M.H.J. Wells-Bennik, Ian J. Davis, Ana Cerdeño-Tárraga, Carol Churcher, Michael A. Quail, Tracey Chillingworth, Theresa Feltwell, Arnaud Kerhornou, Ian Goodhead, Zahra Hance, Kay Jagels, Natasha Larke, Mark Maddison, Sharon Moule, Andrew J. Mungall, Halina Norbertczak, Ester Rabbinowitsch, Mandy Sanders, Mark Simmonds, Brian R. White, Sally Whithead, Julian Parkhill
Clostridium botulinum is a heterogeneous Gram-positive species that comprises four genetically and physiologically distinct groups of bacteria that share the ability to produce botulinum neurotoxin, the most poisonous toxin known to man, and the causative agent of botulism, a severe disease of humans and animals. We report here the complete genome sequence of a representative of Group I (proteolytic) C. botulinum (strain Hall A, ATCC 3502). The genome consists of a chromosome (3,886,916 bp) and a plasmid (16,344 bp), which carry 3650 and 19 predicted genes, respectively. Consistent with the proteolytic phenotype of this strain, the genome harbors a large number of genes encoding secreted proteases and enzymes involved in uptake and metabolism of amino acids. The genome also reveals a hitherto unknown ability of C. botulinum to degrade chitin. There is a significant lack of recently acquired DNA, indicating a stable genomic content, in strong contrast to the fluid genome of Clostridium difficile, which can form longer-term relationships with its host. Overall, the genome indicates that C. botulinum is adapted to a saprophytic lifestyle both in soil and aquatic environments. This pathogen relies on its toxin to rapidly kill a wide range of prey species, and to gain access to nutrient sources, it releases a large number of extracellular enzymes to soften and destroy rotting or decayed tissues.
Large duplications at reciprocal translocation breakpoints that might be the counterpart of large deletions and could arise from stalled replication bubbles
Volume 21, Issue 4 - Pages 525-534 - 2011
Karen Howarth, Jessica C. Pole, Juliet C. Beavis, Elizabeth M. Batty, Scott Newman, Graham R. Bignell, Paul A. Edwards
Reciprocal chromosome translocations are often not exactly reciprocal. Most familiar are deletions at the breakpoints, up to megabases in extent. We describe here the opposite phenomenon—duplication of tens or hundreds of kilobases at the breakpoint junction, so that the same sequence is present on both products of a translocation. When the products of the translocation are mapped on the genome, they overlap. We report several of these “overlapping-breakpoint” duplications in breast cancer cell lines HCC1187, HCC1806, and DU4475. These lines also had deletions and essentially balanced translocations. In HCC1187 and HCC1806, we identified five cases of duplication ranging between 46 kb and 200 kb, with the partner chromosome showing deletions between 29 bp and 31 Mb. DU4475 had a duplication of at least 200 kb. Breakpoints were mapped using array painting, i.e., hybridization of chromosomes isolated by flow cytometry to custom oligonucleotide microarrays. Duplications were verified by fluorescent in situ hybridization (FISH), PCR on isolated chromosomes, and cloning of breakpoints. We propose that these duplications are the counterpart of deletions and that they are produced at a replication bubble, comprising two replication forks with the duplicated sequence in between. Both copies of the duplicated sequence would go to one daughter cell, on different products of the translocation, while the other daughter cell would show deletion. These duplications may have been overlooked because they may be missed by FISH and array-CGH and may be interpreted as insertions by paired-end sequencing. Such duplications may therefore be quite frequent.
Detecting differential usage of exons from RNA-seq data
Volume 22, Issue 10 - Pages 2008-2017 - 2012
Simon Anders, Alejandro Reyes, Wolfgang Huber
RNA-seq is a powerful tool for the study of alternative splicing and other forms of alternative isoform expression. Understanding the regulation of these processes requires sensitive and specific detection of differential isoform abundance in comparisons between conditions, cell types, or tissues. We present DEXSeq, a statistical method to test for differential exon usage in RNA-seq data. DEXSeq uses generalized linear models and offers reliable control of false discoveries by taking biological variation into account. DEXSeq detects with high sensitivity genes, and in many cases exons, that are subject to differential exon usage. We demonstrate the versatility of DEXSeq by applying it to several data sets. The method facilitates the study of regulation and function of alternative exon usage on a genome-wide scale. An implementation of DEXSeq is available as an R/Bioconductor package.