Instrumenting the health care enterprise for discovery research in the genomic era

  1. Shawn Murphy1,2,
  2. Susanne Churchill2,
  3. Lynn Bry3,
  4. Henry Chueh4,
  5. Scott Weiss5,
  6. Ross Lazarus5,
  7. Qing Zeng6,
  8. Anil Dubey1,
  9. Vivian Gainer1,
  10. Michael Mendis1,
  11. John Glaser2,7,8 and
  12. Isaac Kohane2,8,9,10,11
  1. Informatics, Partners Healthcare Systems, Boston, Massachusetts 02115, USA;
  2. i2b2 National Center for Biomedical Computing, Boston, Massachusetts 02115, USA;
  3. Department of Pathology, Brigham and Women's Hospital, Boston, Massachusetts 02115, USA;
  4. Massachusetts General Hospital Laboratory for Computer Science, Boston, Massachusetts 02114, USA;
  5. Channing Laboratory, Brigham and Women's Hospital, Boston, Massachusetts 02115, USA;
  6. Decision Systems Group, Brigham and Women's Hospital, Boston, Massachusetts 02115, USA;
  7. Information Systems, Partners Healthcare Systems, Boston, Massachusetts 02115, USA;
  8. Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts 02115, USA;
  9. Children's Hospital Informatics Program at the Harvard–Massachusetts Institute of Technology Division of Health Sciences and Technology, Boston, Massachusetts 02115, USA;
  10. Center for Biomedical Informatics, Harvard Medical School, Boston, Massachusetts 02115, USA

    Abstract

    Tens of thousands of subjects may be required to obtain reliable evidence relating disease characteristics to the weak effects typically reported for common genetic variants. The costs of assembling, phenotyping, and studying these large populations are substantial, recently estimated at three billion dollars for 500,000 individuals, and such efforts take a decade. We hypothesized that automation and analytic tools can repurpose the informational byproducts of routine clinical care, bringing sample acquisition and phenotyping to the same high-throughput pace and commodity price-point that genome-wide genotyping has already reached. Here we demonstrate the capability to acquire samples and data from densely phenotyped and genotyped individuals in the tens of thousands for common diseases (e.g., in a 1-yr period: N = 15,798 for rheumatoid arthritis; N = 42,238 for asthma; N = 34,535 for major depressive disorder) in one academic health center at an order of magnitude lower cost. These capabilities are also of interest for rare diseases caused by rare, highly penetrant mutations, such as Huntington disease (N = 102) and autism (N = 756).

    Footnotes

    • 11 Corresponding author.

      E-mail isaac_kohane{at}hms.harvard.edu; fax (206) 333-1182.

    • Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.094615.109.

      • Received April 5, 2009.
      • Accepted July 13, 2009.
    OPEN ACCESS ARTICLE
