Oct 11

I woke up this morning and found that my Twitter feed had exploded with the news that GSK is promising to start making individual-level data from clinical trials “open” to independent research.

There’s not a lot of detail to go on. But there is enough to do some inferential analysis, and a little bit of historical context as well.

I’m not surprised to see this. Perhaps that it happened now - I’d have bet on 2013 - but not that it’s happening. Sanofi is doing DataSphere (toll access, sorry for sucking), Eli Lilly is running a Clinical Open Innovation portal, etc. Pharma has already figured out that the creation of precompetitive space will be part of whatever new business models wind up supporting drug discovery.

As for the swipes against academia…well, I’ll only speak from my own experience: I have had pretty much all of the top 20 global pharmas call me at some point over the last decade to explore data sharing. I have had maybe three academics call me. There are some good - indeed great - shared academic projects, but the relentless funding requirements and sharp elbows in grantmaking mean that data sharing is hard. There is a business case against sharing, from an economic perspective, in academia - someone else finds a golden ticket in your data, and you lose.

That business case does not exist for biology data or clinical data if you’re a pharmaceutical company. There, the business is about getting compounds into patients and getting reimbursed.

OK, on to the open part of all this.

First, based on what the press release and news reports are saying, we don’t know if the data will be “open” or not - because the press reports don’t mention the intellectual property status of the data.

So, to the first point - is it open? We don’t know. Honestly, it doesn’t matter very much for a company like GSK, so I’d expect that they will indeed use a CC0 public domain waiver - they did so previously with their Tres Cantos malaria compound set, which is now underpinning open source drug discovery work. I helped, just a little bit, with that project while I was still at CC and I can tell you the hard part was getting GSK to agree to make the data public, not getting a good license onto it.

Second, frankly, open from a copyright or patent perspective probably doesn’t matter very much compared to the other mechanisms of control that are explicitly referenced in the reports.

The release intimates that a select panel of judges will review a select panel of applicants and grant access. In my opinion, this points out the failure of “open” definitions to adequately grapple with data. It’s easy to meet an open definition with this kind of data but only allow an elite group of scientists in to touch it, using strong norms that say “if you share it, you’re out of the pool forever.” Those norms don’t violate the various definitions, because the definitions are obsessed with intellectual property as the source of openness.

Who will be the judges? Under what terms will they allow access? Will it be a liberal policy - one that says, by default, any reasonable request will be granted, and here’s what reasonable means? Or one that says, by default the answer is no?

If GSK wants this to work, they need to go all the way and make it not just legally open, but accessible and usable.

I’d like to see the data deposited in Synapse at Sage Bionetworks, as well as at a publicly funded repository like NCBI or EBI. I’d like to see it available to any researcher willing to comply with certain terms - not attempt to reidentify, not attempt to harm, perhaps agree not to use the data to bring class action lawsuits (hey, for me, it’d be worth it if we could actually build a map of tox and ADME that worked - sort of a truth and reconciliation commission approach). And I’d like to see the world take a whack at it, not have to apply to have their research ideas judged.

Anything less than this will be “half” open. And being half open is like half learning how to break a board with your hand in karate. You don’t break the board. You break your hand. It’d be a shame to have the first big pharma that tries this fail because they didn’t have the guts to go the distance.

