ARRA IMPACT REPORT:
Discovering the Genetic Variation in Cardiovascular, Lung, and Blood Disease Risk
Public Health Burden
Chronic cardiovascular, lung, and blood diseases account for more than one-third of U.S. deaths each year. All involve genetic causes to a greater or lesser extent, and rare and low-frequency genetic variants are thought to explain a substantial portion of their heritability.
The NHLBI “Grand Opportunity” Exome Sequencing Project (GO–ESP)
GO–ESP was designed to identify variants in the exome (the portion of the genome formed by exons, the regions that code for proteins) that are associated with heart, lung, and blood diseases. The project brought together a multidisciplinary team of researchers representing technological innovation,1 longstanding involvement with large, well-characterized cohorts focused on cardiovascular2 or lung diseases,3 and the computational expertise needed to process and analyze enormous amounts of data. More than 7,000 exome sequences were generated from European Americans and African Americans with diseases or disease manifestations of interest (e.g., cystic fibrosis, early-onset myocardial infarction) or with extreme measured levels of traits of interest (e.g., blood pressure, body mass index). ARRA funding enabled the creation of a rich data resource for the scientific community and provided a unique opportunity to accelerate scientific discovery.
Advances Made and Enabled by the ARRA-funded GO–ESP
GO–ESP captured and sequenced all protein-coding exons (about 1–2 percent of the human genome) so that the rare and low-frequency variants within them could be identified.
The first set of exomes represented 1,351 individuals of European ancestry and 1,088 individuals of African ancestry. Analysis identified over 500,000 genetic variants, the great majority of which were rare (i.e., were found in less than one-half of one percent of the exomes studied).4
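The rarity criterion quoted above (a variant found in less than one-half of one percent of the exomes studied) can be illustrated with a minimal sketch. The cohort sizes come from the text; the variant names and carrier counts are hypothetical, and the function names are illustrative assumptions rather than part of any GO–ESP software.

```python
# Minimal sketch of the rarity criterion described in the text.
# Cohort sizes are from the report; everything else is hypothetical.

RARE_THRESHOLD = 0.005  # one-half of one percent of the exomes studied

TOTAL_EXOMES = 1351 + 1088  # European-ancestry + African-ancestry exomes


def carrier_fraction(carriers: int, n_exomes: int) -> float:
    """Fraction of exomes in which a variant was observed."""
    if n_exomes <= 0:
        raise ValueError("n_exomes must be positive")
    return carriers / n_exomes


def is_rare(carriers: int, n_exomes: int, threshold: float = RARE_THRESHOLD) -> bool:
    """True if the variant is seen in less than `threshold` of the exomes."""
    return carrier_fraction(carriers, n_exomes) < threshold


# Hypothetical carrier counts, for illustration only.
example_counts = {"variant_A": 1, "variant_B": 12, "variant_C": 13, "variant_D": 400}

labels = {
    name: ("rare" if is_rare(count, TOTAL_EXOMES) else "not rare")
    for name, count in example_counts.items()
}

for name, label in labels.items():
    print(name, label)
```

With 2,439 exomes, the cutoff falls between 12 and 13 carriers, so under this sketch a variant seen in 12 exomes counts as rare and one seen in 13 does not.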
One approach used by the investigators in GO–ESP to increase efficiency and decrease the need for a very large sample size was “phenotype extremes design,” which focuses on subsets of individuals with very high or very low levels of a particular disease manifestation or trait (phenotype).
Using that approach to analyze samples from patients with cystic fibrosis, investigators identified variants in a gene that were associated with the age at which a first Pseudomonas infection was experienced.5 This discovery could enhance understanding of susceptibility to infection and ultimately enable identification of subsets of patients likely to benefit from aggressive preventive efforts.
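The idea behind a phenotype extremes design can be sketched as follows: rank individuals by a measured trait and keep only those in the lowest and highest tails for sequencing. This is an illustrative assumption about how such a selection might be coded, not the GO–ESP investigators' actual procedure; the function name, the 10 percent tail fraction, and the sample identifiers are all hypothetical.

```python
# Minimal sketch of selecting phenotype extremes for sequencing.
# The tail fraction and sample data are hypothetical.

def select_extremes(trait_by_id: dict[str, float], tail_fraction: float = 0.1):
    """Return (low_ids, high_ids): individuals in the bottom and top tails
    of the trait distribution, each tail holding `tail_fraction` of the cohort."""
    if not 0 < tail_fraction <= 0.5:
        raise ValueError("tail_fraction must be in (0, 0.5]")
    if len(trait_by_id) < 2:
        raise ValueError("need at least two individuals")
    # Rank individual IDs from lowest to highest trait value.
    ranked = sorted(trait_by_id, key=trait_by_id.get)
    # Keep at least one individual per tail.
    k = max(1, int(len(ranked) * tail_fraction))
    return ranked[:k], ranked[-k:]


# Hypothetical cohort of 20 individuals with a trait value equal to their index.
cohort = {f"p{i:02d}": float(i) for i in range(20)}
low, high = select_extremes(cohort, tail_fraction=0.1)
print("low tail:", low)
print("high tail:", high)
```

Sequencing only the two tails concentrates effort on the individuals most likely to carry variants with large effects on the trait, which is why the design can reduce the sample size needed.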
The GO–ESP effort was one of the largest medical sequencing studies ever undertaken, and an urgent imperative was the development of efficient methods and automated systems to facilitate the interpretation of millions of variants from thousands of samples.
Investigators developed a new method to discover a specific subset of rare genetic variants and their loci from exome sequencing data.6 The ability to generate these discoveries directly from such data magnifies the potential utility of existing and future exome sequencing efforts.
Researchers also developed an approach to testing the association between rare variants and phenotypes in sequencing association studies.7 The new method was evaluated with simulation studies and applied to GO–ESP exome sequencing data from individuals with acute lung injury. This test provides a new tool that should save time and money in future studies.
The exome sequencing of selected families through GO–ESP and availability of GO–ESP data as a resource have facilitated the identification of rare variants for familial cardiovascular diseases.
The exome sequencing approach enabled the evaluation of rare variants for idiopathic dilated cardiomyopathy (a disease of unknown cause that weakens the left ventricle of the heart).8
Small exome studies helped to identify rare mutations that may cause familial thoracic aortic aneurysms.9,10
The unique resource generated by GO–ESP can be accessed by researchers through several avenues:
The GO–ESP sequencing data and associated phenotypes are available through the database of Genotypes and Phenotypes (dbGaP) maintained by the National Library of Medicine.
The data on genetic variants known as single nucleotide polymorphisms (SNPs) have been submitted to dbSNP.
GO–ESP established the NHLBI Exome Variant Server (EVS) at the Northwest Genomic Center to provide web-based access to its aggregate SNP data.
The GO–ESP investigators collaborated with leaders of smaller exome sequencing projects to identify and select variants to be included in the design of a new Exome Chip array. Although assessment of the very rare variants will still require exome sequencing, use of the Exome Chip has allowed about 250,000 variants from exome sequencing to be genotyped in whole cohorts at relatively modest cost.
Contributing NIH Institutes & Centers
- National Heart, Lung, and Blood Institute (NHLBI)
- NIH Office of the Director (OD)