Just a quick note to put out a thought. I’ve done a lot of work with informed consent, and I’m one of the people pushing the idea that in its current form it isn’t serving our needs as a society, as patients, as scientists, as people. That’s pretty much the whole point of my TED talk. And I’m a supporter of projects like Reg4All, which embrace fair information practices as their basis. There’s a real role for that approach, especially with aggregate-level data, or in recruiting people for studies that will in turn require informed consent.
But I am noticing a disturbing trend in the conversation around consent, one that I intend to combat fiercely, which is the idea that informed consent itself is the problem.
I don’t think it is. I think it’s the way we do it that’s a problem. When it’s used as a shield against patient involvement, or as an excuse to deny people their data, that’s a problem. When people don’t understand the forms and are rushed through the process, that’s a problem. When data can’t be integrated into larger and larger studies because of consent, that’s a problem.
But the idea that we should be informed about the risks and benefits to ourselves and society before we decide to share our data? I don’t think that’s a problem that needs disrupting. We need to embrace better design for informed consent. We need to bring it into the 21st century, to make it compatible with distributed computing, with large-scale mathematics, with the internet. That’s clear.
None of these problems, individually or collectively, means we should treat consenting people into research as a simple transaction cost. It’s not about reducing the “friction” of “participation” to a few clicks here or there. If we treat informed consent as a transaction cost to be minimized or eliminated, we may well solve the problems of data liquidity.
But data liquidity, as we’ve seen in national security, social media, and more, doesn’t always come alongside beneficence, justice, and respect for persons. Those are the point of informed consent, not liquidity of data. Health data is one of the only places where informed consent - not just simple clicking - is part of our cultural heritage.
I think that’s a feature, and not a bug. If we can’t find a way to balance informed consent, and its ethical heritage, with data liquidity, that failure is on us.
Today is the White House’s Champions of Change event celebrating Open Science. You can watch and join in at 1PM US Eastern time today.
But on this day, I wanted to remind everyone that “open” is a word that comes with history and, more importantly, with meaning. It’s not a word that someone can take and sprinkle on their work and claim without understanding the history and the meaning. If someone does that, they are going to get bullshit called on them - like the Yale-Medtronic supposedly open data agreement, which is anything but open.
So let’s get this clear. Just because you’re making something available that wasn’t previously available doesn’t qualify as open. Just because you’re reducing the transaction costs of access to something doesn’t qualify as open. Just because you’re involving more people than before doesn’t qualify as open.
Open means that there is no prejudice against any kind of user, anywhere, any time. Open means that commercial use is allowed, in advance. Open means that new works, new research, new products are allowed, in advance. Open means following the Open Definition.
Here’s the Definition. It’s short and sweet.
“A piece of data or content is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share-alike.”
The reason for a definition is so that the word open actually means something. Open is a great word. But it has to mean something, or it means nothing. And it’s hard to meet this definition. It means you can’t be granular, or differential, or non-commercial, or academics-only, or open for some kinds of uses in science but not for others. It’s a bright-line test that a project either meets or doesn’t.
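One way to see the bright-line idea: openness is a single all-or-nothing predicate, not a score. A work either satisfies every clause or it is not open. Here’s a minimal sketch in Python; the field names are my own illustration, not part of the Definition itself.

```python
from dataclasses import dataclass

@dataclass
class Terms:
    """Illustrative license terms; field names are invented for this sketch."""
    free_to_use: bool
    free_to_reuse: bool
    free_to_redistribute: bool
    commercial_use_allowed: bool
    restricted_to_some_users: bool   # e.g. "academics only"
    restricted_to_some_uses: bool    # e.g. "research only"

def is_open(t: Terms) -> bool:
    # One boolean conjunction: fail any clause and the work is not open.
    # There is no partial credit, no "mostly open."
    return (t.free_to_use
            and t.free_to_reuse
            and t.free_to_redistribute
            and t.commercial_use_allowed
            and not t.restricted_to_some_users
            and not t.restricted_to_some_uses)

# "Available but non-commercial" fails the test outright:
nc = Terms(True, True, True, False, False, False)
assert not is_open(nc)
```

The point of the sketch is the shape of the function: a conjunction with no weights and no thresholds, which is exactly what makes “open” hard to claim and easy to check.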
The Yale-Medtronic example is a classic case of misappropriation of the word open. It’s anything but. Yet there goes the science establishment trumpeting it as such.
If “open” is going to mean something in science the way it means something in culture and in software, those of us who’ve spent years toiling in the trenches have to collectively examine, carefully, every claim to open science. And we have to speak truth to power, to hold those who would claim openness as their mantle to the definitions and call them out when they fail to meet them.
Being open - really open - is hard to do. It doesn’t give the people what they want. As Bob Young noted in his keynote at this year’s Sage Congress, what people wanted out of Red Hat software was for it to run Microsoft Office.
He didn’t give them that. He gave them what they needed, which was the right and the power to control their own operating systems. And people thought he was crazy for not giving them what they wanted, for not compromising on open, right up until Red Hat was so successful that everyone agreed it had been obvious all along that open was the smart choice.
It wasn’t easy being open in software then. It isn’t easy now in science. But the value comes, over the long term, from a small set of people being unreasonably committed to making true public goods. Today’s event is a small but pivotal step in recognizing those people, and in carrying those true public goods forward.
Big news today: the Supreme Court has ruled that naturally occurring genes are not patentable.
This is a Big Fucking Deal, as Joe Biden might say. But it’s not as complete a victory as it may seem on the surface. The ruling explicitly excludes cDNA, and notes that the case didn’t address methods, applications of knowledge, synthetic DNA, or alterations of gene sequences that do occur in nature*. Lots of room to negotiate privatization there - indeed, that sentence could be the next "by means of a computer program" in patent law, though it doesn’t have to be so.
This is a ruling that resets the default to unpatentable, as Mike Eisen rightly pointed out on twitter. That’s why it’s big. But it doesn’t close the door, at all, on lots of patents in and around DNA*. That’s why the biotech stock index is up today*.
I’m actually not nearly as interested in those parts of the ruling, though. What this means for me is that the biggest barrier to building a commons of mutations with diagnostic potential is gone: the inability of DTC sequencing companies like 23andme, because of patents, to reveal their customers’ mutation status to those customers. The companies that rely on these patents now have to move to trade-secret approaches, as Myriad already has done, and the thing about trade secrets is…we can compete with them.
What the patents did was make it impossible, illegal, for us to build commons-based competition. They were enforcers on trade secrecy. Those enforcers are gone. We can now go straight to the citizen and say, get yourself genotyped, and donate your data to science.
At Sage Bionetworks, my non-profit employer, we’ve built a system that allows precisely that. It’s called Portable Consent, and we’ve got a study called the Self-Contributed Cohort for Common Genomics Research. You can enroll and donate your data in less than ten minutes, start to finish.
So go get yourself genotyped. Download your data. And donate it to science. Let’s stop fighting companies that privatize, and start competing with them.
Note: edited post at 12:15PM EST for clarification of a few points. Those sentences carry asterisks at the end.
My notes for the panel today at Health Datapalooza. I’ll come back later and add links, fix typos, and so forth…
Why consumer access to data is important
- because data is a digital representation of us. and it’s increasingly being used to affect our real-world services, their costs, their benefits.
- we’re increasingly able to generate data that used to be the exclusive space of the clinical system. genomes are just part of it. health data is in everything. Facebook cups, keyboards, iPhones, google’s next phone.
- but it’s faulty at worst, and incomplete at best. if we can’t access it, we can’t tell how and where it’s incomplete or wrong.
Use cases and potential applications for consumer access to data
- send to app provider to interpret data and help me make better choices about my care.
- send to app provider to interpret data and help me make better lifestyle choices (ideally not a panopticon of health provider, ATT, and government - but a decentralized, competitive market)
that’s what this event is mainly concerned with. but there’s more.
- extract the “real” clinical data (compared to the “dry” data from phones etc) from the record. needs a lot of normalization etc but it starts to paint a longitudinal phenotype of me and my life. mapped to my genome, at large sample sizes, we can start to correlate lifestyle and medical treatment outcomes to individual genomic variation.
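That last use case is, at its core, an association test repeated across variant sites: given a phenotype score and genotypes for many people, ask which sites correlate. A toy sketch with synthetic arrays standing in for real cohort data (all the numbers here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy cohort: 500 people, 3 variant sites.
# Each genotype is 0, 1, or 2 copies of the minor allele.
genotypes = rng.integers(0, 3, size=(500, 3))

# Toy longitudinal phenotype (e.g. a treatment-outcome score).
# Site 0 is constructed to have a real effect; the others are noise.
outcome = 0.8 * genotypes[:, 0] + rng.normal(size=500)

# Per-site association: Pearson correlation between genotype and outcome.
for site in range(genotypes.shape[1]):
    r = np.corrcoef(genotypes[:, site], outcome)[0, 1]
    print(f"site {site}: r = {r:.2f}")
```

Real analyses use regression with covariates and multiple-testing correction rather than raw correlations, but the sketch shows why sample size matters: the signal at site 0 only separates cleanly from noise when the cohort is large.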
Current state of consumer access to data
- raw in every sense of the word. most folks aren’t exactly aware of the things we here are aware of.
- most of the data collection is happening in zones like mobile and social where we have no positive rights to privacy, unlike health, and where the entire system depends on designing awareness of the data (and its ownership) away from us as citizens.
- in health, access to data is hamstrung by a combination of rapidly advancing technology and slowly (that’s a nice way of putting it) adapting law.
- we just moved here from Oakland. when i wanted my son’s immunization records to get him into his new child care center here in DC, the provider couldn’t fax them to me because they were afraid it violated HIPAA. but it would have been legal to hire a TaskRabbit - a total stranger - to go pick them up, take them to Kinkos, and fax them to me.
- that’s insane. we’re not being protected. we’re not getting access at the rate or in the form that we need yet. it’s getting better, but it’s still slow. and we’re less willing to tolerate it, because we’ve been trained to expect more from our institutions by good technology.
Where we have to go next / role of Blue Button+
- the hard part is that most of us, when handed a file of data, don’t know what to do with it. i downloaded my genotype and my fitbit data - no idea what to do with them. compared to my “medical record” in PDF, which is computationally useless but human readable.
- the investment in BB will pay off best when I have the right to direct my file to someone who can do something with it, for whatever reasons i choose. whether to run an app that makes sense of the data or to donate the data to research.
- BB+ is a great example of this. it allows both for market solutions (banks!) to emerge and for pre-competitive or public-private solutions where we can donate data, or share it conditionally.
- i’m looking forward to pushing Sage Bionetworks, the non-profit where I work, to be one of the first certified recipients of BB+, precisely to enable the non-market reuse of health records data.