Dec 11

Well, that didn’t take long. Less than seven hours after I posted a request and reward for a 23andme app, I had two submissions. Beau Gunderson and Eric Jain (who turn out to be friends) both submitted short programs that fulfilled my request. The winning version is at Beau’s site.

First, a little context. The point of this wasn’t to create a consumer experience, so don’t go to the link and expect one. It’s a download tool, no more and no less. But it didn’t exist before and now it does. What you will get if you click the link is the ability to store local copies of your genotype and health results.

You will have to accept conditions to execute the app. Read them carefully before you do so. I am not sure if the 20 user limit is concurrent or total, and the tool may break at some point if it’s the latter until it goes through account review. This is alpha software, caveat emptor, you should probably wait to use it until lots of other people do, and so forth. I generated a zip file and downloaded it, for reference’s sake.

Second, let’s thank 23andme for supplying the API and enabling the health GET commands. They didn’t have to do that at all, much less make it so easy to do. This doesn’t get done in seven hours without clean API design. Credit where it is deserved. It’s one more reason to hope they can get data to the FDA that wins approval, and that they keep DTC genomics economically accessible.

Third, we now have the challenge of integrating the output into a Blue Button format, so it can benefit from the ecosystem of applications emerging there. I’m not sure I can pull that off in a twitter challenge but we’ll see - at least we’ve moved to the next step of the process.

More in a bit. I’m going to wire this into a few places over the next week and think about it some more. This was both wonderful and unexpected in the best way. Thanks to everyone who joined the conversation and to 23andme for making the API available.

23andme has an API. They announced it in a talk called “Own Your Data” - which I thoroughly recommend. And they back it up in terms of your actual genotype on their website. You can download your SNPs, and upload them (as I have in multiple places).

They also provide on their site interpretations of the genotype. Those interpretations are what make the data useful to us as patients. They’re at the center of the FDA-23andme battle, which isn’t the point of this post.

I can’t download and export my interpretations the way I can download and export my genotype, and that seems a missed opportunity. It’s directly supported in the API, which features the elements at the bottom of the post.

So, I hereby pledge to give $500 to the first person who makes a web application that works in Firefox, and lets me log into 23andme and download my genotype and health data into a structured file (CSV at least - no text files). The app must have its source code on github, and be licensed under an FSF-approved free software license. The app must work not just for me, but for four more testers. It must pass a security review to make sure it’s not leaking passwords. The file should contain my genomic and health data, not my user data or my ancestry data.
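To make the file requirement concrete, here is a rough Python sketch of the last step only: turning already-downloaded results into CSV while leaving user data out. The record shape and the field names (`report_type`, `description`, `result`, `email`) are assumptions made up for illustration, not the actual 23andme API response, and the sample row is invented.

```python
import csv
import io

# Hypothetical column names chosen for this sketch; the real API
# response may be shaped quite differently.
FIELDS = ["report_type", "description", "result"]


def results_to_csv(records):
    """Serialize a list of result dicts to CSV text, keeping only FIELDS."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    for rec in records:
        # Copy only the whitelisted fields so user data never reaches the file.
        writer.writerow({name: rec.get(name, "") for name in FIELDS})
    return buf.getvalue()


# Invented sample record, including a user field that must be dropped.
sample = [
    {
        "report_type": "risk",
        "description": "example condition",
        "result": "typical",
        "email": "someone@example.org",
    }
]
csv_text = results_to_csv(sample)
print(csv_text)
```

Whitelisting columns, rather than deleting known-bad ones, is one simple way to honor the "not my user data" condition even if the upstream records grow new fields.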

"First" will be judged by twitter timestamp - send me a message if you have a submission.

I know $500 isn’t a lot. If I could code, I would. But I suck at it.

This is out of my own pocket. I’ll also promise to give just incredible amounts of attribution, acclaim, and more. It’s a chance to get known too. It’s part of my “stop complaining and do useful stuff” resolution.

If no one bites, I’ll post this as a claimable job somewhere else.

Nov 30

I’ve been following the 23andme saga with dread and fascination and frustration since it hit earlier in the week. If you’re not up to date, David Dobbs has the canonical collection of stories to get you there.

This is a post about business models and culture clashes. It’s long. I should also start by disclosing that my former nonprofit project received funding from Anne Wojcicki’s philanthropic foundation, that I hold some shares in small startups that could benefit from the success of direct-to-consumer genomics (though not in 23andme) and that I now work at a nonprofit that is premised on the idea that personal access to data will fundamentally change research. My 23andme data is online for all to download.

So I could personally benefit if 23andme survives and thrives, and I personally admire Anne and her work, even when I criticize some of the choices she makes. It’s hard as hell to be a CEO and she deserves credit for icebreaking, vision, and perseverance.

That said, I am deeply frustrated by the simplistic narrative of OMG FDA BIG GUBBERMINT SILENCING DARING ENTREPRENEUR. It’s not that simple.

23andme has been trying to bring a technology business model, and a Bayesian data culture, to the genetics world for six years. Both of those were gambles. And I think the problem they’re in right now is a direct consequence of those gambles not paying off yet.

First, by a technology business model, I mean the idea that you go get all the users you can get by giving your service away for free and figure out a way to make money later. This is no different from social network business models. Facebook, Twitter, Instagram, Snapchat - all of them started with this concept of acquiring the most users and then working out a model for revenue. And all of them converged on advertising, because for all the promises of the new network, selling people sweaters worked better than anything else.

As a business model for genetics though, it’s different, because your cost isn’t just distributing an app and paying Amazon cloud fees. 23andme has been selling their direct-to-consumer genotyping kits for six years now, always at a deep discount to their actual cost. They made that financial picture worse when they dropped the monthly subscription requirement last year. If they can’t find a way to make money soon, they’re going to wind up in the box - you can’t sell a dollar for fifty cents and expect to make money on volume.

I would guess that the hope was that population genetics would yield a business model that wasn’t regulated. If you could get enough people in the door, then your internal analysis could inform efforts at pharma, at insurance companies, and more - without ever having to market your spit kits as medical devices in the regular business model. Their push to hit a million users indicates this remains a goal.

But since they’re not doing that yet, it seems they’re not at the magic number. That means going the traditional route, which is to develop and rigorously validate a test, get it approved by the FDA, and then market it at an incredible markup. That’s what 23andme tried to do in this run. They invited the FDA in the front door, and at least so far, haven’t been very good about serving them tea. We’ll find out more about why later.

But I posit the data culture that permeates technology companies is at the root of the frosty relationship between the FDA and 23andme. It makes it very hard to pivot from a model based on loss leadership and data mining to one based on regulatory submission.

That “traditional” submission to the FDA would be of a very specific kind of analysis based on randomized controlled trials. It is designed to keep bad things from happening to people, not to make sure good things happen to people. As one of my favorite papers lays out, parachutes would not receive FDA approval as a gravity-resisting device.

Modern tech culture doesn’t work that way. Bayes’ rule is about probability. It’s a different way of knowing that you know something, and it’s one in which there is far more tolerance for uncertainty than the FDA is accustomed to. And there have been statements by FDA officials that they are deeply uncomfortable with that. Note in particular Janet Woodcock’s statement that causal inference needs a “level of rigor” - and the date in late April. It’s right before 23andme cut off communication with the FDA.
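For anyone who hasn’t met Bayes’ rule, a small Python sketch may help show what "about probability" means here. The numbers are made up purely for illustration (a 1% prior, a 90% true positive rate, a 5% false positive rate) and describe no real test or product; the function name is likewise just a label for this example.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Bayes' rule: P(condition | positive result).

    prior               -- P(condition) before seeing the result
    sensitivity         -- P(positive | condition)
    false_positive_rate -- P(positive | no condition)
    """
    true_pos = sensitivity * prior
    false_pos = false_positive_rate * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


# Hypothetical numbers, chosen only to illustrate the arithmetic.
p = posterior(prior=0.01, sensitivity=0.90, false_positive_rate=0.05)
print(f"Probability of the condition given a positive result: {p:.3f}")
```

With these assumed inputs the answer comes out near 15%: the positive result moves the estimate up a lot from 1%, but it is a revised degree of belief, not a yes or no. That is the kind of uncertainty this way of reasoning tolerates.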

The FDA isn’t saying that 23andme needs to shut down, or stop giving people data, or stop providing spit kits. It’s saying their causal inference doesn’t hit the level of rigor that allows it to pursue a technology business model in the health regulatory space. That’s a gamble 23andme was always taking, that the FDA wouldn’t know how to regulate it in time for it to make money. It’s not an overreach, or a screw job on patients. It’s just business.

And I hope this isn’t the end of it. We need DTC screening. It helped me. It’ll help many others. But until the FDA learns how to deal with Bayes’ rule and its discomforts - and until DTC companies figure out a business model that isn’t based on massive loss leadership - we’re going to keep coming back to this clash of culture and business models. Both sides need to make some changes if we’re going to avoid doing this over, and over, and over.

Oct 30

Just a quick note to put out a thought. I’ve done a lot of work with informed consent, and am one of the people pushing the idea that in its current form it isn’t serving our needs as a society, as patients, as scientists, as people. That’s pretty much the whole point of my TED talk. And I’m a supporter of projects like Reg4All, which embrace fair information services practices as their basis. There’s a real role to play for that approach, especially in aggregate level data, or in recruitment of people for studies that will in turn require informed consent.

But I am noticing a disturbing trend in the conversation around consent, one that I intend to combat fiercely, which is the idea that informed consent itself is the problem.

I don’t think it is. I think it’s the way we do it that’s a problem. When it’s used as a shield against patient involvement, or as an excuse to deny people their data, that’s a problem. When people don’t understand the forms and are rushed through the process, that’s a problem. When data can’t be integrated into larger and larger studies because of consent, that’s a problem.

But the idea that we should be informed about the risks and benefits to ourselves and society before we decide to share our data? I don’t think that’s a problem that needs disrupting. We need to embrace better design for informed consent. We need to bring it into the 21st century, to make it compatible with distributed computing, with large-scale mathematics, with the internet. That’s clear.

None of these problems, individually or collectively, mean we should treat the issue of consenting people into research as a simple transaction cost. It’s not about reducing the “friction” of “participation” to a few clicks here or there. If we treat informed consent as a transaction cost to be minimized or eliminated, we may solve the problems of data liquidity.

But data liquidity, as we’ve seen in national security, social media, and more, doesn’t always come alongside beneficence, justice, and respect for people. That’s the point of informed consent, not liquidity of data. Health data is one of the only places where informed consent - not just simple clicking - is part of our cultural heritage.

I think that’s a feature, and not a bug. If we can’t find a way to balance informed consent, and its ethical heritage, with data liquidity, that failure is on us.

Sep 27

Lots of stuff trumping blogging right now. TCGA Pan-Cancer analysis published. BRIDGE funded. ImpactStory funded. Giving a ton of talks, and a few big publications coming in the next few months. Be right back.