Why cancers are the least — and most — genetic of diseases

The plot thickens. The crab […] only runs backwards.
– Arsenio Rodríguez, Cangrejo Fue a Estudiar

In time, finer data and statistical models will test the specific claims of a widely discussed recent paper, by Bert Vogelstein and colleagues, on the prospect of genomic risk prediction. Though the paper’s meta-analysis of twin studies has taken some heat for repackaging longstanding knowledge about heritability, the fuss over it has usefully underscored real complexities of disease and healthcare. And thoughtful responses to the paper have made clearer, to the public, that few geneticists are the zealous determinists of caricature. Rather, we tend to grasp the causal basis of disease (and other phenotypes) with inclusive nuance.

But while the dialog may have enlightened many, both the paper and its critiques have largely missed two pertinent points:

  1. In the world of genetic risk, cancers are big exceptions.
  2. As such, they prove some underlying rules.

And in the context of the paper itself, these points nestle into one nutshell:

In research, twins with tumors are no longer really twins.

To see why, and fold this point into the wider discourse on genomically personalized healthcare, let’s peer briefly through the looking-glass of cancer genetics.

Gemini, Cancer, and genomic horoscopes

As noted, the new paper aimed to summarize what studies of twins say about how well our genomes, alone, may predict what diseases we’ll get. The premise, of course, is that twins from the same fertilized egg resemble live runs of a telling thought experiment: if you and your genome lived twice, would you get the same diseases?

The basic answer has long been clear: twins don’t always get sick the same way. But Vogelstein and colleagues reasonably asked, for particular common adult diseases, how often they do.

Sensibly, they reviewed available twin data for several complex (and interrelated) epidemic killers, such as heart disease, diabetes, and stroke. To boot, they also combed the literature on pregnancy complications (an intriguing evolutionary nexus of health), some nerve and autoimmune diseases, and more mysterious ailments like chronic fatigue and irritable bowel syndromes.

But nine of the twenty-four diseases they surveyed — the bulk, by class — were cancers. And this choice sapped any suspense from their findings. For while cancers indeed kill many people (so demand study), they are long known to be far less heritable — that is, to show a smaller portion of cases running in families — than many other grave diseases. As Vogelstein (a renowned cancer researcher) surely knows, loading a heritability survey with cancers is like padding a Russian presidential ballot with token dissidents. Conclusion: foregone.

To be fair, the authors likely didn’t set out to mislead. After all, they could only survey available studies of twins. And a big study of cancers was ripe to include (in the end, it supplied all their data on the question). While that study did find some evidence of genetic heritability in cancers, it concluded that such heritability plays little role in most cases, bolstering the well-established bottom line: in the grand scheme, cancers rarely run in families.

Where the authors shouldn’t be excused so easily is in a) actively hyping their findings as novel, while b) burying this key grain of salt, far from headlines and press releases, in a brief aside near the end of the paper:

For diseases with a lower heritable component, such as most forms of cancer, whole-genome based genetic tests will be even less informative.

Which any clinical geneticist could have told you, thirty years ago.

And yet…

Unsurprisingly, all nine cancers in Vogelstein’s survey were deemed less genetically predictable (heritable) than the other diseases studied. Nonetheless, any oncologist will tell you that cancers are quintessentially genetic diseases. As we’ll see, they require, and are even defined by, genomes gone awry.

Indeed, some of the most widely screened-for genetic risk variants underlie rare familial forms of breast and colon cancer. And whole genome interpretation is finding its first vital clinical use in treating tumors, along with sick children.

So what gives? How can cancers be intrinsically genetic, yet so hard to predict from our genomes?

The answer, it turns out, is that it matters which genomes we mean.

The hive within

To understand cancer’s peculiar nature as a genetic disease, first picture your body…as a colony of bees.

Bear with me here: both entities — you and the swarm — are, ultimately, teeming masses of individuals (cells or bees), most of whom work hard to help just a few of their kind (eggs or sperm, or the hive’s queen and few males) breed on their behalf.

In this view, the two basic kinds of cells in your body — gametes (which biologists also call the germline) and somatic cells — respectively resemble a bee colony’s two main castes, breeders and workers.

In a sexual many-celled organism, like you, somatic cells vary kaleidoscopically in form, embodying the many specialized tissues that help a body thrive from moment to moment. Gametes, by contrast, specialize mainly in storing DNA for coming generations and (as big, slow, costly eggs, or small, fast, cheap sperm) in finding a partner gamete to merge with.

In most animals, the cells that become the germline split from other cells very early in embryonic development. As with bee royalty, it’s nearly impossible for a somatic commoner to infiltrate their reproductively privileged ranks later.

While details prompt debate[1], natural selection is thought to have driven the emergence of the soma-germline and worker-queen distinctions, and in both cases to have brokered a lasting social compact between the two kinds of individuals.

In this contract, a somatic cell in your hand, like a dutiful worker bee, effectively says to your gametes:

Cousins, if you spread our shared genes, I’ll give my life to help you. Count on me to build sturdy shelter, find good food, fend off attackers, write honeyed words to entice a mate, and care tenderly for the children that come after. You do the rest, and send our line forth.

Seen in this Hobbesian light, a tumor is, effectively, a mutiny of somatic cells, who break that evolutionary covenant with their germline cousins. A budding tumor cell effectively says ‘Hell no, I won’t work and die for other cells — I’ll reproduce, myself, instead,’ and starts proliferating unchecked.[2]

A disease of genomes

Crucially, the cellular treachery of each cancer case typically traces to one or more sudden changes — mutations — in the genome of a somatic cell. The mutations in question may be slight — say, the miscopying of one DNA letter from the parent cell’s genome. Or they may be drastic, as when a rogue gamma ray shatters a whole chromosome. In the latter case, the cell may, in mending the resulting fragments of DNA, inadvertently scramble them.

Whether slightly or severely altering DNA sequence, the mutations that turn cells into tumors tend to do so in particular ways. Typically, they either throw, or freeze stuck, one or more functional switches in the budding tumor cell’s genome: switches that had, until then, tightly governed the reproduction of that cell’s immediate ancestors, yoking their proliferation to the overall best interests of the developing body.

The switches in question are often protein-coding genes that directly govern a) whether a cell divides or, instead, takes a moribund, tissue-specific form; and/or b) how the cell exchanges signals with other cells, e.g., halting growth at the touch of a neighboring cell, or telling that neighbor to build more blood vessels, to bring more food and oxygen. In some cases, the key switch may govern how well the cell corrects DNA copying errors; one mutation in such a gene may thus spark many more, some of which eventually knock out other switches that more directly reined in the cell’s growth.

In the end, a tumor grows unchecked thanks directly to one or more newly arisen genetic variants that distinguish it from other cells in the same body.[3] Importantly, such cellular mutinies are (with some exceptions[4]) typically doomed: in their greed for resources, the rebelling cells weaken the body overall, and, with no way to escape, sink with the ship that they’ve commandeered.

Sui genetis

All this switch-throwing ultimately means that tumors grow, spread, and kill specifically because their genomes differ from other genomes. As such, cancers are not just genetic in origin, but break a key assumption underlying Vogelstein’s paper: that monozygotic twins are genetically identical.

That is, as soon as a person is diagnosed with a tumor, all bets premised on her genetic identity (or even near identity) to a twin are off.[5] Not only do billions of her cells now differ genetically from those of her twin, but they differ in ways that are, by definition, biologically important: the distinctive genetic variants in question drastically change the way that cells work, letting them divide unchecked.

In this sense, a tumor genetically distinguishes its host — of whom it is intrinsically part — from other people de facto, much as inherited genetic variants may distinguish someone with a strongly heritable disease from other people. Notably, this insight tempers any expectation that the person’s twin should get the same disease (a cancer driven by genetic variants that the twin likely doesn’t carry). And it underscores the uniformitarian rule that cancers, in their poor heritability, seem at first to violate: in diseases of all stripes, genomic differences matter.

Tumor genomes: noisy, mixed, changing

The first active clinical uses of whole genome sequencing have been in pediatrics and oncology. And this makes sense. In bluntly formal terms, sick kids and tumors are both masses of cells — one beloved, the other loathed — that are growing awry. And they’re both growing, as such, too fast for us to wait for sequencing to get cheaper, or for medical knowledge to get deeper. We’ll sequence now, if we can afford to, in order to gain some foothold into the medical mystery at hand.

For tumors in particular, we hope that sequencing the tumor (and, importantly, healthy tissue for comparison) will reveal key genetic clues to how it arose, grows, spreads, and might be slowed or killed. Alas, that turns out to be hard to do, for three key reasons.

First, as noted, tumor genomes are pocked by mutation. Small spelling changes often abound, hiding a few functionally important ones (called drivers) in a cacophony of incidental noise (passengers). And, at bigger scales, long segments of chromosomes are often repeated, missing, or scrambled. Such rampant genetic variation is not just tough to functionally interpret, but also makes it hard to draw an accurate picture of the genome in the first place. Why? It turns out that the computer algorithms used in modern sequencing work poorly for genomes that differ greatly from the standard reference genome, because snippets of raw sequence data that don’t match up well to that genome (like puzzle pieces that don’t closely match the picture on the box) are hard to correctly place. Moreover, tracts of DNA letters that appear in multiple spots in a genome (like uniform fenceposts in a farmscape puzzle) are especially hard to accurately sequence — an acute challenge in tumors, where rampant mutational copying, cutting, and pasting turns the genome into a bewildering house of broken mirrors.
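For the computationally inclined, the puzzle analogy can be sketched in a few lines of Python. This is a toy, not how production aligners work: it places a read only where it matches the reference exactly, whereas real tools tolerate some mismatches and score the alternatives. The reference string, the reads, and the function name `exact_match_positions` are all made up for illustration.

```python
def exact_match_positions(reference: str, read: str) -> list[int]:
    """Return every start position where the read matches the reference exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        # Resume one base later so overlapping matches are also found.
        start = reference.find(read, start + 1)
    return positions


# Hypothetical toy reference: unique flanks around a tract that appears twice.
REPEAT = "GATTACA"
REFERENCE = "TTGACCGT" + REPEAT + "CCATGG" + REPEAT + "AGCTTC"

unique_read = "ACCGTG"   # occurs once in the reference
repeat_read = REPEAT     # occurs twice (the "fencepost" problem)
mutated_read = "ACCTTG"  # unique_read with one letter changed

print(exact_match_positions(REFERENCE, unique_read))   # [3]
print(exact_match_positions(REFERENCE, repeat_read))   # [8, 21]
print(exact_match_positions(REFERENCE, mutated_read))  # []
```

The repeated tract gets two equally good placements, and the mutated read gets none. Tumor genomes multiply both problems at once.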

Second, each tumor actually harbors not one, but a mix, of such noisy genomes — often with non-tumor cell genomes inadvertently mixed in. While a tumor is indeed a clump of closely related cells that distinctively share particular variants, it’s also a population of genetically varied cell lines, each effectively striving to grow faster than the others, thanks to its own secondary stock of functionally relevant variants. But because sequencing today requires pooling many cells, genetic variation in the tumor tends to get homogenized, as if in a blender. An important variant carried only in a few cells may not be prominent enough to show up in the final reckoning of a singular tumor genome sequence.
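The blender effect can be made concrete with a minimal Python sketch. It rests on simplifying assumptions that are mine, not drawn from any particular sequencing pipeline: every cell contributes equally to the pool, carrier cells hold the variant on one of two chromosome copies, reads sample alleles at random (a binomial model), and a variant counts as "called" only if at least `min_alt_reads` reads support it. The depth and threshold values are hypothetical.

```python
from math import comb


def expected_allele_fraction(carrier_fraction: float,
                             copies_per_carrier: int = 1,
                             ploidy: int = 2) -> float:
    """Expected share of reads showing the variant in a pooled sample."""
    return carrier_fraction * copies_per_carrier / ploidy


def detection_probability(allele_fraction: float, depth: int,
                          min_alt_reads: int) -> float:
    """P(at least min_alt_reads variant reads) under binomial read sampling."""
    p_too_few = sum(
        comb(depth, k) * allele_fraction**k * (1 - allele_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_too_few


if __name__ == "__main__":
    depth, min_alt_reads = 30, 3  # hypothetical coverage and calling threshold
    for carrier_fraction in (0.5, 0.1, 0.02):
        af = expected_allele_fraction(carrier_fraction)
        p = detection_probability(af, depth, min_alt_reads)
        print(f"carriers={carrier_fraction:.0%}  "
              f"allele fraction={af:.3f}  P(called)={p:.3f}")
```

Under these assumptions, the rarer the carrier cells, the smaller the share of reads that show the variant, and the more likely it is to be missed or mistaken for sequencing error.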

Third, the mix of genomes in the tumor evolves, partly in response to treatment. Thus we may want to track how treating the tumor with a particular regimen kills off some cell lines in the tumor, while letting other lines, by chance resistant to the treatment, spread quickly. To best characterize and treat a tumor, we might want to see a movie, rather than just a snapshot, of its mix of genomes, letting us watch how they change in response to treatment. But doing so is, for the foregoing reasons, tough, and will be until we can sequence fewer cells at a time, for less money, and with longer snippets of raw sequence (analogous to bigger puzzle pieces that can be more reliably pieced together to get the whole moving picture).


On a late summer afternoon when I was six, my mom came to my room, sat next to me, and showed me fresh bruises on her arms and legs. Speaking with determined nonchalance, she trained a young boy’s restless attention to a moment of revelation.

Each squall-blush in her skin was, I learned, real bloodshed from a war below. In the marrow of her bones, delinquent cells were teeming, wrecking the nurseries of sticky platelets that she needed to heal small, everyday blood vessel leaks. The bruises were collateral damage from that mutiny — her own cells betraying her, and those she loved.

She sought treatment, but leukemia wore her down quickly. On Halloween night, muted by breathing tubes in intensive care, she could welcome my visit only with a tiny nod and a waiting cup of candy. She died a few days later, at 34. My first-grade classmates, struggling to comprehend from afar, sketched colorful cards of condolence that I still keep.

Writing today, on her birthday, I’m older than she got to be. Childhood reading, long before Vogelstein’s paper, taught me that leukemia shows little heritability. Yet I still watch for bruises…and admit to a tinge of affirmed relief that, among the diseases that the paper assayed for genetic predictability, leukemia came in last.

But the comfort is cold. Cancers remain an especially vexing kind of plague: sprung from our own selves, tumors are, in the paper’s geminal terms, something like evil conjoined twins. Growing relentlessly, ever changing, they cloak genomic secrets in genomic smoke, and evade our harshest treatments. They are the last horcrux.

At a recent conference, I listened to Washington University’s Elaine Mardis explain how she and her colleagues are systematically characterizing the genomes of thousands of tumors. Their yeoman work is building a broadly useful critical mass of detailed knowledge about how tumors arise, grow, and spread. But their findings are also helping real doctors and patients, today, choose treatments that lengthen lives and lessen suffering.

After Mardis’s talk, I told her how touched I am, not just by her team’s work itself, but by knowing that Wash U (where my mom earned her PhD) and Barnes Hospital (where she gave birth to me, and died) now spearhead a data-driven fight against leukemia and other cancers.

And I’m proud that my own work at Knome supports such efforts. By thoroughly characterizing tumor genomes, and developing algorithms to do so better, we’re helping clinical researchers spot genetic variants that directly drive tumor growth or, more rarely, predispose some families to recurrent cancers. That work, like Mardis’s, is already leveraging individuated genome data to help people live longer.

Looking ahead, those of us lucky enough to afford good healthcare today will likely be talking a lot about tumor genomes, with our families and friends, in coming decades. Relevant insights will likely guide vital choices for many of us and those we love.

We’ll see how soon other major adult diseases — think of those surveyed by Vogelstein et al., but also of liver and kidney diseases, mental illnesses, breathing problems (asthma, respiratory tract infections), bone diseases, etc. — likewise become more amenable to personal genomic insights, first for diagnosis, and, in the long run, for prognosis too. Here’s to those twin prospects.

[1] In bees (as well as ants and wasps), this compact is thought to be reinforced by the fact that a queen and her worker sister may be especially closely related, often more so than either would be to her own daughter. This odd twist of kinship follows from male bees having just one copy of each chromosome (having hatched from unfertilized eggs), while females have two (hatching from fertilized eggs). As a result, bee full sisters share, on average, three-fourths of their DNA with each other, while mothers and daughters share just half.

In a real hive, the numbers are complicated by many pairs of workers being just half-sisters (with different dads). But that turns out not to matter much; the strong social contract of eusociality can be mathematically understood even without the extra-close-knit kinship that bee sisters share.
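The arithmetic behind those fractions can be written out as a short Python sketch. It assumes textbook haplodiploidy: a father passes his single chromosome set intact to every daughter, a mother passes a random half of hers, each parent supplies half of a daughter's genome, and the parents are unrelated to each other. The function names are illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def full_sister_relatedness() -> Fraction:
    # Paternal half of the genome: the father has only one chromosome set
    # to give, so two full sisters receive identical paternal halves.
    paternal = HALF * 1
    # Maternal half: each sister gets a random half of the mother's two
    # sets, so their maternal halves match half the time on average.
    maternal = HALF * HALF
    return paternal + maternal


def mother_daughter_relatedness() -> Fraction:
    # A daughter's maternal half comes wholly from her mother; her paternal
    # half comes from an (assumed unrelated) father and contributes nothing.
    return HALF * 1 + HALF * 0


def half_sister_relatedness() -> Fraction:
    # Different fathers: only the maternal halves can match.
    return HALF * HALF + HALF * 0


print(full_sister_relatedness())      # 3/4
print(mother_daughter_relatedness())  # 1/2
print(half_sister_relatedness())      # 1/4
```

The last function shows why mixed paternity complicates the picture: half-sisters fall to one-fourth, below the mother-daughter figure.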

[2] As in many personified examples of rivalry between organisms, the tumor cell’s effective strategy is not conscious, but rather a mathematical truism — that is, cells in which a chance mutation tears the web of reproductive constraint that natural selection has woven will, at least in the short term, tend to outgrow neighboring cells.

[3] Importantly, a person may be born with at least one such genetic switch already thrown; this scenario underlies rare cases of strong familial cancer risk. Cancer can then strike as soon as a second switch — a copy of the same gene, or another gene — is randomly thrown by a new mutation.

People born with one broken copy of the cell growth-suppressing RB1 gene, for example, tend to eventually get tumors in both eyes. In such cases, their working second copy of RB1 tends to eventually mutate in one of the fast-dividing, light-bombarded cells of the retina of one eye, leading to a first tumor. Later, a second such mutation may strike a retinal cell in the other eye, spawning a tumor there too.

Thus even in rare forms of cancers that run in families, getting a tumor requires some new mutation in a somatic cell. And that mutation may, in turn, be driven by some environmental factor (such as radiation, toxins, or even infectious germs), highlighting that both genetic and environmental factors are key to understanding cancers — just as in other diseases.

[4] One tumor that beat the odds, living on beyond its original victim, belonged to American Henrietta Lacks. As detailed in Rebecca Skloot’s compelling personal and cellular biography, in 1951, Lacks’s doctor took cells, without her consent, from the cervical tumor that was killing her, and grew them in dishes. Though born in suffering and scientific misconduct, those cells nonetheless proved remarkably resilient in laboratory culture, and have since spread worldwide as a key resource for six decades of biomedical research.

Another tumor that slipped the mortal coil of its origins is the contagious facial cancer that plagues Tasmanian devils — the subject of a fascinating recent paper. And some rare cancers of germline (rather than somatic) cells may manage to make gametes healthy enough to transmit themselves — rampant cell growth and all — to coming generations.

[5] Of course, even monozygotic twins without tumors aren’t really genetically identical. The genome’s great size, and the many generations of mutation-prone cell division needed to build the myriad cells in their bodies, nearly assure that no single cell in either one is genetically identical to the fertilized egg from which they came — much less to a typical cell pulled from the other twin.

But tumors flout the genetic identity assumption of twin studies even more severely. The many cells that make up a tumor tend to be especially closely related to each other, thanks to their recent growth spurt. Thus the newly arisen genetic variants that the tumor’s cells share, and that distinguish them from other cells in the body (and even more so from cells of the other twin, thanks to the extra rounds of cell division that separate the two cell lineages in question), are especially common among the cells of the person overall. As such, if we think of a person’s genome as comprising, at each site, a weighted mix of the genotypes of all her cells, then a tumor makes her particularly genetically distinctive, even beyond the prospect that those differences may be more functionally important in tumors than in other kinds of cells.