Cancer Moonshot: Unleashing the power of data

Is open data the cure for cancer?

Less than 5 percent of American adults living with cancer will be part of a clinical trial, according to the American Cancer Society.

Michael Balint, a presidential innovation fellow working with the National Cancer Institute, says that low percentage is caused by the perception of being a guinea pig, negative marketing and misinformation sharing.

“But there’s also an issue … it’s difficult to match up patients to trials, and there’s a very data intensive element or component to clinical trials, where if you don’t have someone, if you don’t have maybe a family friend, or your physician isn’t as aware of the clinical trials that are available, it’s very difficult to find trials that you might be eligible for,” Balint said, during the White House Open Data Innovation Summit.

That’s why the National Cancer Institute recently launched a beta application programming interface (API), Balint said, “to expose more information in clinical trials.”

“And we just realized the beta of that API is being used by a bunch of different start ups in the space, and advocacy groups,” Balint added.

The API is part of the overarching Cancer Moonshot initiative, a $1 billion effort spearheaded by the Vice President to speed up cancer research — and ultimately eliminate the disease.

The Moonshot has five goals. The first four are to foster scientific breakthroughs, bring new therapies to patients faster, strengthen prevention and improve patient care.

The fifth is “unleashing the power of data.”

Anabella Aspiras, director for patient engagement for the Cancer Moonshot task force, said data is essential in patient care.

“I think of data in the healthcare space three ways: the art of medicine is treating the person beyond the patient, the person beyond the data, and it takes years to hone that art,” Aspiras said. “The science of medicine is based on data: what a patient reports to me about their symptoms, their family history of illness. The science of medicine is based on data driving a diagnosis of that data and treating the patient. The third piece is the technology of medicine.”

That third piece, Aspiras said, can be the most frustrating. She offered an anecdote of her own experience, when she moved from Brooklyn, New York to Washington, D.C.

It took multiple requests to her provider and subsequently delayed medical appointments before her hard copy health records were in her hands. And that was for an annual mammogram and ultrasound.

“Imagine the delay that I’m describing for a patient that’s just been diagnosed with a very aggressive form of cancer,” Aspiras said.

Balint said there are technical challenges around data in the healthcare space.

“It wasn’t that people hadn’t tried to structure healthcare information and specifically clinical trials information … the challenge there though, there really isn’t one size fits all for that kind of eligibility information,” Balint said. “And there were previous efforts that were undertaken to try to come up with some sort of comprehensive solution, but they all inevitably stalled or failed. The real advance was instead preventing perfect being the enemy of good. Just taking a small step to take one piece of eligibility criteria at a time, really structure it, maybe even hard code it. That’s fine, because it’s so valuable to have that information out there. It might not be comprehensive, it might not be this elegant, perfect solution, but it needs to be done.”
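To make Balint's point concrete, here is a minimal sketch of what structuring a single eligibility criterion might look like. It is an illustration only, not the National Cancer Institute's API: the `Trial` class, its `min_age` and `max_age` fields, and the trial IDs are all hypothetical, and age is simply assumed to be the one criterion being structured.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Trial:
    """Hypothetical trial record with exactly one structured criterion: age."""
    trial_id: str                   # made-up identifier, for illustration only
    min_age: Optional[int] = None   # None means the trial sets no lower bound
    max_age: Optional[int] = None   # None means the trial sets no upper bound


def age_eligible(trial: Trial, patient_age: int) -> bool:
    """Check the single hard-coded criterion; every other criterion is ignored."""
    if trial.min_age is not None and patient_age < trial.min_age:
        return False
    if trial.max_age is not None and patient_age > trial.max_age:
        return False
    return True


def matching_trials(trials: List[Trial], patient_age: int) -> List[str]:
    """Return the IDs of trials whose age range admits the patient."""
    return [t.trial_id for t in trials if age_eligible(t, patient_age)]


# Hypothetical sample data: an adult trial, a pediatric trial, and one with no age limits.
SAMPLE_TRIALS = [
    Trial("T-1", min_age=18, max_age=65),
    Trial("T-2", max_age=17),
    Trial("T-3"),
]
```

A filter like this is far from comprehensive, which is the trade-off Balint describes: one criterion, fully structured, already narrows the list a patient or physician has to read through.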

The Defense Department is also leaning away from a one size fits all approach to open data sharing. Under the Applied Proteogenomics Organizational Learning and Outcomes (APOLLO) consortium, patients’ genes are studied at the protein level to try to learn why they developed the disease, and possible treatments for their cancer.

“All the [personally identifiable information] is removed before it’s open data released,” said Col. Craig Shriver, director of the John P. Murtha Cancer Center at Walter Reed National Military Medical Center. “That data then finds its way with all PII removed, into the data commons run by the [cancer institute].”

Vice President Joe Biden recently opened the Genomic Data Commons, and the protein and imaging data commons are coming soon, Shriver said.

“All of this research data will be then open sourced into the community, so that new analytic tools can be developed by all of you out there and the wider community of programmers in an open data system,” Shriver said.

Open data done right

Pushing data out into the community comes with an obligation to protect personal information.

Office of Management and Budget Director Shaun Donovan said during the summit that privacy is a focus for everyone because as technology evolves and the government stores more and more sensitive information, “there are real concerns about privacy including releasing personally identifiable information, because that data could be re-identified.”

In mid-September the Office of Management and Budget released guidance on the role and designation of senior agency officials for privacy.

“We know when done right, open data and privacy complement, rather than conflict, with one another,” Donovan said. “In fact, it is only if we can show that we are doing everything we can to protect Americans’ data, we are going to be able to move forward with the most robust and promising uses of open data.”