
Genomics Through The Lens Of Precompetitive Data Sharing

Executive Summary

With low-cost whole-genome sequencing on the horizon, issues around acquiring and assessing large data sets of sequence information are front and center. The value proposition is also shifting, from the means of acquiring data to their interpretation. But finding a sustainable business model around data analysis and clinical services is far from straightforward.

  • A group of world-class geneticists has submitted a grant proposal to develop a curated public database of known, clinically important gene variants.
  • The proposal reflects the increasing ease of acquiring large genomic data sets on individual patients in lieu of tests that target only one or a few gene variants.
  • The interest also suggests how the value proposition around gene-based molecular diagnostics is shifting from the means of acquiring data to their interpretation.
  • But finding a sustainable business model around data analysis and clinical services is far from straightforward. The issue is potentially relevant to several sets of stakeholders: clinical labs, instrument providers looking downstream, and other companies with enabling workflow tools.

A little over a year ago, Harvard Medical School professors Isaac Kohane, MD, PhD, and David Margulies, MD, hosted a meeting to advocate that the collective experience of researchers and clinicians with gene variants be moved to the precompetitive space. According to Margulies, universal open access to all data from sequence-derived assays is a safety matter. “It’s no less critical an issue,” he says, “than airlines’ sharing of aircraft safety data with each other for the public good and voluntarily deciding not to compete based on innuendo about unproven safety issues.”

The upshot of this sentiment has been a ratcheting up of discussions around potential venues for genomic data storage and sharing, the degree of cooperation to anticipate from clinical labs, and what the sustainability model might be both for labs that participate in such open-access efforts and for companies that offer enabling workflow tools. The discussion also highlights issues around commercial data silos, the most prominent example being Myriad Genetics Inc.’s monopoly on testing for mutations in the BRCA breast cancer genes.

Advances in sequencing technology and the availability of increasingly complex data sets from genome-wide association studies have ripened the discussion. “Over the last 10 years, genomics was about our ability to produce data,” says Jorge Conde, founder and chief strategy officer of personal genomics company Knome Inc. “Over the next 10 years it’ll be about our ability to absorb data.” That said, a major hurdle to enabling the clinical use of genomic data is having sufficient numbers of well-documented cases for analysis. (See “The How And When Of Applying Sequencing To Clinical Diagnostics,” In Vivo, September 2010.) Large data sets are essential for the successful development of validated tests that regulators can sign off on. Even with the costs of acquiring the raw data declining rapidly, however, the sheer numbers of patients required make it an expensive proposition. To build that knowledge base, it makes sense to document and share data as they emerge to the greatest extent possible.

Still Early Days

Understandably, geneticists today are just beginning to get their arms around the whole-genome problem, says Peter Kolchinsky, PhD, general partner at RA Capital Management. They may specialize in tests for a dozen or so genes around specific conditions, and are generally limited to a small number of known mutations in each gene. “Look at the effort involved in just figuring out the significance of a variant in one gene,” he says. “When you look at a thousand clinically relevant genes, you start discovering all the different rare ways that the genome can have a variant, and you must determine whether it’s a typo that matters or one that doesn’t. You can’t possibly have a world where you do carrier testing a thousand genes at a time using old-school methods.” Figuring out which mutations, and which combinations of mutations, are deleterious rather than benign is daunting for the physician. And coming to grips with the implications of carrying a variant of unknown significance is harder still for patients.

“For a long time, we’ve all had our niches for diagnostic testing,” adds Harvard geneticist Heidi Rehm, PhD – in her case, hearing loss and cardiomyopathy. “But it’s clear to us all that the technical approaches with which we look at our genome are going to evolve to whole-genome approaches.” To Rehm, the goal in the whole-genome era is to approach molecular diagnostic testing both to answer a primary question and to gather information on the rest of the genome. “I think many of us see that what we have to add to the service goes beyond the individual understanding of a single variant,” she says. “It’s really the collective analysis of many variants and their contributions to not only Mendelian disorders but complex disorders.” Indeed, it is recognized that the only way to understand low-penetrance variants, which make up the vast majority of those associated with complex diseases, is with large data sets.

Implicit in that observation is an acknowledgment that most genomic data are not well understood. That is a problem, because the time is fast approaching when the raw cost of sequencing an entire genome will be low enough to enable widespread use of that information over the course of a patient’s lifetime, making it less expensive than one-off sequencing for each query. “When that cost model makes sense, it’s really about re-interpreting that genome as symptoms arise and indications arise,” says Rehm.

Ultimately, variants will have to be viewed in an individual’s overall genomic context. In early genome-wide association studies, researchers sought to understand disease through common variants they could put on a chip, screening many people in hopes of validating the clinical use of those variants. But that approach turned out to be only weakly informative, says Nathan Pearson, PhD, director of research at Knome. “Even common diseases as well as rare ones trend toward individually rare variants – a different variant in each family, essentially. There’s going to be a not-inexhaustible but great supply of them out there that make people sick.”

An Ambitious Proposal

In September, a group led by Rehm and co-principal investigators Christa Lese Martin, PhD, of Emory University School of Medicine, and Robert Nussbaum, MD, of the University of California, San Francisco, submitted a grant proposal to the National Human Genome Research Institute (NHGRI) of the National Institutes of Health. It identified eight gene-disease areas around which to build out a universal human genomic variant database, to be curated by a team of experts in each disease area. By creating an infrastructure to facilitate communication among those experts, the database would enable consensus-driven evaluation of the evidence for variants and expert-level curation, yielding a clinical-grade database. The proposal includes commitments by academic hospitals and commercial companies to submit their existing data, amounting to more than 160,000 samples – a baseline reflecting only what was identifiable and had been screened at the time the grant proposal was submitted, with the expectation that the numbers will increase significantly. (See Exhibit 1.)

The Rehm/Martin/Nussbaum proposal (the “U41 grant”) builds on a similar precompetitive model, the International Standards for Cytogenomic Arrays (ISCA) consortium, a repository for information on structural variations measured using array-CGH (comparative genomic hybridization) technology. ISCA is identifying common chromosomal rearrangements or deletions and whether they represent a clinical phenotype. But the larger body of curated sequence-level variation, according to the U41 grant, “still remains in fragmented and poorly annotated environments or is simply inaccessible with data held in proprietary clinical laboratory databases.”

Exhibit 1.

Samples Promised For Proposed Clinical Variant Database (as of Sept. 2011)

| Disease Area | Contributing Labs (minimum) | Number Of Samples (minimum) |
| --- | --- | --- |
| Hypertrophic cardiomyopathy | 4 | 8,918 |
| Noonan syndrome, cardio-facio-cutaneous syndrome, LEOPARD syndrome, Costello syndrome | 13 | 7,672 |
| Hereditary colorectal cancer | 7 | 34,205 |
| Inborn errors of metabolism | 6 | 3,242 |
| Developmental delay (Rett/Angelman/early infantile epileptic encephalopathy) | 17 | 23,911 |
| Congenital muscular dystrophy | 1, plus a registry drawing on several labs | 728 |
| PTEN gene-related (hamartoma tumor syndrome, Cowden syndrome, Bannayan-Riley-Ruvalcaba syndrome, macrocephaly/autism) | 5 | 4,732 |
| Mowat-Wilson syndrome | 4 | 234 |
| Other | - | 77,208 |
| TOTAL | 36 | 160,850 |

Source: Rehm/Martin/Nussbaum U41 Grant Proposal to NHGRI

Issues of quality control and data interpretation exist even within a single lab, Rehm says. The ISCA process illuminated this problem: a lab might say one thing about a variant and then, upon finding the same variant later, say something different about it. As would be expected, such inconsistencies also existed across labs. Thus, mirroring the ISCA submission process, the U41 proposal gives participating labs the ability to know whether a variant overlaps with anything described before and, if it does, to see how others – including the submitting lab itself, in prior submissions – classified it. The quality of assessment of copy number changes has improved through the ISCA infrastructure, which was created to support the sharing of data, Rehm says. She expects the same will hold for the proposed new database.
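
In outline, the submission check Rehm describes is a lookup keyed on the variant, returning prior classifications for comparison. The Python sketch below is purely illustrative: the class names, the string variant identifiers, and the flat in-memory store are assumptions for exposition, not the proposed database’s actual schema or interface.

```python
# Illustrative sketch only: a minimal in-memory model of the overlap check the
# U41 proposal describes. Names and identifiers are invented placeholders.
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Submission:
    lab: str             # submitting laboratory
    variant: str         # e.g., an HGVS-style identifier (hypothetical here)
    classification: str  # "pathogenic", "benign", "uncertain significance", ...

class VariantDB:
    def __init__(self) -> None:
        self._by_variant: dict[str, list[Submission]] = defaultdict(list)

    def submit(self, sub: Submission) -> list[Submission]:
        """Record a submission; return prior calls that disagree with it."""
        conflicts = [p for p in self._by_variant[sub.variant]
                     if p.classification != sub.classification]
        self._by_variant[sub.variant].append(sub)
        return conflicts

db = VariantDB()
db.submit(Submission("Lab A", "GENE1:c.100A>G", "benign"))
disagreements = db.submit(Submission("Lab B", "GENE1:c.100A>G", "pathogenic"))
# 'disagreements' now surfaces Lab A's earlier "benign" call for expert review.
print(disagreements)
```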

In addition, unlike previous database efforts, ISCA set a fairly low bar for the level of clinical information submitted. Rehm and colleagues are following the ISCA model in that respect as well, recognizing that data from clinical labs are by and large not well populated with phenotypic information and not wanting that to be a barrier.

To maximize interest in the project and facilitate data sharing, the investigators focused on disease areas where several labs were already offering testing. “We absolutely made the decision to focus on areas where the barriers for knowledge sharing were less and the potential to gain from knowledge sharing was greatest,” Rehm says. Some labs have volunteered their entire data sets, including her own Partners Laboratory for Molecular Medicine, Emory, and even Laboratory Corp. of America Holdings (LabCorp). Others, including Medical Genetics Laboratories at Baylor College of Medicine, Quest Diagnostics Inc., and the Mayo Clinic, have agreed on a disease-specific basis only, according to Rehm.

“My hope is that when they see the benefit of the interaction in data sharing and expert curation, the labs will find it useful and it will stimulate them to continue to expand that participation,” she says. Once the model is established, it can also be extended into other diseases with a genetic component. To induce participation, the U41 provides participating labs with FTE support to help gather up submissions and get them into a structure that fits with the database.

Minimizing – And Moving Beyond – The Silo Problem

The aspirations of the U41 proposal, which NHGRI should decide on in the next month or two, are much broader than addressing the problems of existing data silos. But they do throw the pros and cons of such operations into relief.

Myriad has done a fantastic job, RA Capital’s Peter Kolchinsky acknowledges, marshaling its resources to characterize thousands of variants in the BRCA1 and 2 genes, moving more aggressively than academics would have to characterize those genes, and bringing the percentage of variants of unknown significance down to around 3%.

But along with that, the company has built up a proprietary, trade secret protected database of variants. That’s where, as a matter of public policy, Kolchinsky and others take issue. “I’d argue it would be great if each company would claim one gene, profit for some period of time while they really ramp up the world’s knowledge about that gene and its association with disease,” he says, arguing that at some point, that information should become public.

That clearly has not happened with BRCA mutations. After Myriad stopped contributing to the Breast Cancer Information Core (BIC), an open access online breast cancer mutation database hosted by NHGRI, the growth in the number of BRCA-related variants being documented dropped precipitously. “It flatlined,” Kolchinsky says: because of Myriad’s presence, oncologists and geneticists had grown out of practice in submitting their reports to the BIC, even as Myriad continued to identify new variants.

According to Myriad, the company stopped contributing to the BIC database because the information was supposed to be for research use only and “the database was not validated for providing test results to patients in a clinical setting, which posed regulatory and quality system concerns.” Moreover, the company believes that next-generation sequencing platforms do not currently have the coverage necessary for whole-genome sequencing to be used with sufficient clinical accuracy, and says it is not aware of any clinical labs using whole-genome sequencing for clinical purposes to provide BRCA status. In a written response to questions for this article, Myriad stated that it “continues to believe that patients are best served through an integrated service including both sequencing and data interpretation,” adding that its research to classify variants “requires a significant ongoing relationship with health care providers, patients, and their families which were established through our upfront sequencing activities.”

The company’s arguments about undocumented workflow and the current lack of clinical validation of whole-genome sequencing in general ring true. But the siloing of BRCA data nonetheless has potential economic consequences, because it discourages the development of easy “rule-out” tests around well-characterized BRCA variants. With such tests in hand, physicians would turn to Myriad only for variants not readily discernible.

“Put Myriad to work on those cases where they have differentiated knowledge,” Kolchinsky argues. That could even expand the market for BRCA testing, which is currently limited to a well-defined patient population with certain risk factors, such as family history or early-onset cancer. “If you were to have a cheap, $100 rule-out test to eliminate those hard-to-discern variants only Myriad would know, you could drop the restrictions on who gets BRCA sequencing and potentially open it up to everyone,” Kolchinsky says, and in so doing identify many more patients carrying BRCA variants who would not otherwise have known it – young women without affected close relatives, for example.
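
To make the logic of such a rule-out test concrete, the sketch below expresses Kolchinsky’s triage idea in Python. The variant identifiers and the two lookup sets are hypothetical stand-ins; a real test would draw on a curated, clinically validated catalog.

```python
# Hypothetical triage logic for a cheap BRCA "rule-out" test, per Kolchinsky's
# argument. Variant identifiers and lookup sets are invented placeholders.
KNOWN_PATHOGENIC = {"BRCA1:var_p1", "BRCA2:var_p2"}   # well characterized, harmful
KNOWN_BENIGN = {"BRCA1:var_b1", "BRCA2:var_b2"}       # well characterized, harmless

def triage(observed: set[str]) -> str:
    """Resolve what public knowledge can; refer the rest to a specialist lab."""
    hits = observed & KNOWN_PATHOGENIC
    if hits:
        return f"positive: {sorted(hits)}"
    unresolved = observed - KNOWN_BENIGN
    if unresolved:
        return f"refer to specialist lab: {sorted(unresolved)}"
    return "ruled out: all observed variants are well-characterized benign"

print(triage({"BRCA1:var_b1"}))                  # ruled out cheaply
print(triage({"BRCA1:var_b1", "BRCA1:var_u9"}))  # referred for interpretation
```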

On the other hand, if Myriad didn’t have a lock on that channel, it might not have the resources to continue to provide better interpretations. And the notion of distributing variants, and in so doing potentially creating knock-off tests that could interpret a genome at lower cost and in compliance with regulatory standards, “is something we should pressure test,” says geneticist Dietrich Stephan, PhD, founder of Navigenics Inc., a personal genomics firm. “It’s not a given we would be saving lives by doing that.”

Stephan envisions an alternative to the Kolchinsky-style rule-out test: allow broad use of whole-genome data in exchange for a royalty to Myriad or another rights holder. Indeed, CEOs of several molecular diagnostics companies have alluded to similar, behind-the-scenes discussions occurring around this cross-licensing notion. “The writing is on the wall that the technology that powers molecular diagnostics will collapse onto a one-shot assay once it gets cheap enough,” says Stephan – a genome or exome sequence. It will be less expensive from a workflow perspective than managing thousands of primer sets and capillary-based sequencers; an exome will actually cost less than sequencing one gene on a capillary machine today. “We’re seeing it get close already,” he says, with the cost of an exome now around $900.

When that happens, people will be looking at things they don’t have a license to look at. And if everyone’s running the same assay and suddenly has a computer program where they can click right into it and say, “Oh, you have x variant,” then the activation energy will be in paying the royalty, not in setting up the assay. “I think we are coalescing on a model where there’s a huge cross-licensing, switchboarding opportunity for people to get paid for their IP,” says Stephan. “Patients could still get access to information, and it would open up the volume around the channel.”
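
A crude rendering of that switchboard idea, in Python: each interpreted gene triggers a fee to whoever holds the interpretive IP. The gene-to-licensor table, the per-report fee structure, and the dollar amounts are invented for illustration; nothing here reflects actual licensing terms.

```python
# Speculative sketch of a cross-licensing "switchboard": reporting an
# interpretation for a gene triggers a royalty to the interpretive-IP holder.
# Genes, licensors, and fees are invented; no actual licensing terms implied.
ROYALTY_TABLE = {           # gene -> (licensor, fee per reported interpretation)
    "BRCA1": ("LicensorA", 25.00),
    "BRCA2": ("LicensorA", 25.00),
    "MLH1":  ("LicensorB", 10.00),
}

def royalties_due(reported_genes: list[str]) -> dict[str, float]:
    """Aggregate per-licensor fees for the genes interpreted in one report."""
    due: dict[str, float] = {}
    for gene in reported_genes:
        if gene in ROYALTY_TABLE:
            licensor, fee = ROYALTY_TABLE[gene]
            due[licensor] = due.get(licensor, 0.0) + fee
    return due

print(royalties_due(["BRCA1", "BRCA2", "TP53"]))   # {'LicensorA': 50.0}
```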

Thinking Commercially

To the extent that data on clinical variants are freely available, the commercial opportunity will be downstream, toward the development of workflow tools that deliver data to clinicians in an accessible form. Kolchinsky, whose investment in the prenatal molecular diagnostics company Sequenom Inc. is well known, uses Sequenom as an example of how companies with genomics-based tests may develop an efficient workflow system around them. In so doing, they will be in a good position to attract the interest of labs like Quest or LabCorp, which will need to acquire those tools.

“Collecting a sample of blood, processing it as Sequenom does to enrich it with fetal DNA, running it through a sequencer, analyzing the data looking not so much at sequence as at dosage, and providing a high-quality report in a CLIA-regulated environment is not an easy process to replicate,” says Kolchinsky. “Being the first to create this capital-intensive workflow will create value.” And over time, he says, it will become “a somewhat distributed model,” where some of the value will come in the form of licenses to other labs. Indeed, LabCorp demonstrated the value of workflow solutions through its January 2010 acquisition of Correlagen Diagnostics Inc., a provider of a wide variety of genetic tests. Correlagen’s core expertise was in process automation and software, which gave it the ability to operate at that breadth.
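
The “dosage” analysis Kolchinsky alludes to boils down to counting reads: in a trisomic pregnancy, slightly more cell-free DNA maps to the affected chromosome than euploid reference samples would predict. The Python sketch below shows the idea with an invented reference distribution and threshold; real assays use far more sophisticated normalization.

```python
# Toy version of read-count "dosage" analysis for noninvasive prenatal testing:
# compare a sample's chromosome 21 read fraction with a euploid reference set.
# Reference values and the z > 3 threshold are invented for illustration.
import statistics

EUPLOID_CHR21_FRACTIONS = [0.0131, 0.0129, 0.0132, 0.0130, 0.0131, 0.0128]

def chr21_zscore(chr21_reads: int, total_reads: int) -> float:
    """Z-score of the sample's chr21 read fraction against the reference set."""
    frac = chr21_reads / total_reads
    mu = statistics.mean(EUPLOID_CHR21_FRACTIONS)
    sd = statistics.stdev(EUPLOID_CHR21_FRACTIONS)
    return (frac - mu) / sd

z = chr21_zscore(chr21_reads=138_000, total_reads=10_000_000)
print(f"z = {z:.1f}: {'flag for review' if z > 3 else 'within expected range'}")
```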

In the future, a handful of large companies will perform common genome or exome prep – offering up different views or fields of interest, and integrating the various instrument feeds, complex databases, and rule-out algorithms to produce good technical data sets, which clinicians will then need to interpret, says David Margulies. Those companies will presumably emerge from tool providers such as current sequencing market leaders Illumina Inc. and Life Technologies Corp. and newer entrants like Complete Genomics Inc. and Pacific Biosciences of California Inc. Such firms may simply aim to do sequencing as cheaply as possible, competing on accuracy, turnaround time, and data presentation. Companies such as LabCorp could then access those capabilities and bundle services together, forming big contracts with high-throughput health care provider facilities. “And they will query a public database,” Kolchinsky suggests – a different model from silos like Myriad or, for that matter, from many developers of complex, high-value diagnostics run in designated centralized labs. (See “Delivering Cancer Diagnostics Tools,” In Vivo, June 2011.)

The open question, says Kolchinsky, is whether companies can thrive in an environment where they don’t have the capital costs of doing their own sequencing and are focusing on the analysis.

“The sustainable model is in the higher interpretation [of variants] and the clinical services,” says Margulies, who founded Correlagen and is now director of The Gene Partnership, a genomics infrastructure initiative at Children’s Hospital Boston, “not in the curation of variants.” Those services include the issuance of a patient report, the assembly of sequence, expression, and phenotype by software, and interpretation. “That’s competitive and proprietary,” he says, and very different from asking, “Have you ever seen this rare variant and what did you call it?” He also sees an opportunity for value capture by large centers that can provide counseling, medical genetics, systems biology, and biomedical systems integration. Providing such “composite views” is a new subspecialty of genomics now being formed. “The analogy is very strictly to radiology,” Margulies says, “as it emerged from the experimental work of physicists and tinkerers to become a clinical discipline of interpreting images.”

Stephan, who after leaving Navigenics led the design of The Gene Partnership at Children’s as well as several other genomics programs within health care provider systems, agrees. “The high-value, high-margin business will be interpreting that genome in the context of a clinical encounter.” And whereas for some well-characterized gene variants it may be simple to correlate a variant with disease risk, there is enough complexity in the relationship between genotype and phenotype, he says, that a highly specialized interpretive shop will offer value.

“People who are looking at one gene across many people will say there’s no limit to the number of new mutations we see,” says Stephan, because there is an infinite number of ways you can kill or modify a gene – we just haven’t seen them all. Plus, given how post-translational modifications of proteins affect phenotype, understanding the nuances of how variants affect given individuals will require either highly specialized tracking of a large cohort of people or highly specialized interpretation around each variant. Following Kolchinsky’s logic, that’s what Myriad will become – a specialist in analyzing rare variants and variants of unknown significance.

Moreover, viewing the measurement of clinical variants generally as precompetitive would inure to the benefit of large laboratories like Quest and LabCorp, whose businesses center on aggregating broad menus of tests and optimizing the workflow processes by which hospitals can order them and get them done, while leaving cutting-edge analysis and the development of new technologies to someone else. Indeed, one reason underlying LabCorp’s support of the U41 grant is to encourage labs around the world to submit their data, which would in turn make it easier for LabCorp to launch whole genome/exome services.

A Tie-in To Health Economics?

As with biomarker development, pharma may provide a fertile proving ground for the power of genomics in clinical practice, through tools for patient selection and enrichment in clinical trials and for getting a better handle on the reasons underlying adverse events as they arise. The Food and Drug Administration already wants drug developers to focus on the metabolic impact of two dozen genes, and that number is sure to rise.

A handful of companies is collaborating with pharma – and aiming at payors and health care provider systems as well – in attempts to use genomics data to improve outcomes. It’s part of Knome’s current mission and the core capability of PanGenX, an angel-backed genomics IT start-up in Newton, MA. PanGenX assists in the design of clinical trials by gathering genetic variations across potential patient populations, using the information to predict outcomes, and organizing it so that customers can make economic projections. “We can design the models and lenses pharma will use” to look at the data, says chief technology officer Eric Neumann, PhD. These are not just separate measurements of phenotypes, he says – the data are grouped and form patterns and correlations between the genetics of groups of people and the drugs they take. “You need large statistical numbers, which we can now begin to capture in hospitals and clinics because of commoditization and technology advances,” he says. “Now the question is, will that ability make a difference?”
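
As a toy illustration of the enrichment arithmetic such modeling supports, consider selecting trial patients by a single response-associated variant. Everything in the Python sketch below is invented: the carrier frequency, the responder rates, and the single-variant framing are assumptions for exposition, not PanGenX’s actual models.

```python
# Invented numbers illustrating genotype-based trial enrichment: carriers of a
# hypothetical response-associated variant respond at 60%, non-carriers at 20%.
RESPONSE_RATE = {True: 0.60, False: 0.20}   # assumed responder rate by genotype

def expected_response(carriers: list[bool]) -> float:
    """Average expected responder rate across an enrolled cohort."""
    return sum(RESPONSE_RATE[c] for c in carriers) / len(carriers)

screened = [True] * 30 + [False] * 70        # assume 30% carrier frequency
enriched = [c for c in screened if c]        # enroll carriers only
print(f"all-comers trial: {expected_response(screened):.0%}")   # 32%
print(f"carrier-enriched trial: {expected_response(enriched):.0%}")  # 60%
```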
