Things to Think About From 2016


A word cloud of theScinder’s output for 2016, made with


This subject includes throwbacks to 2015, when I did most of my writing about CRISPR/Cas9. That’s not to say 2016 didn’t contain any major genetic engineering news. In particular, scientists continue to move ahead with the genetic modification of human embryos.

If you feel like I did before I engaged in some deeper background reading, you can catch up with my notes on the basics. I used protein structures to highlight the differences between the old-school gene-editing techniques and editing with Cas9. I also compared the effort it takes to modify a genome with Cas9 to how difficult it was using zinc-finger nucleases, the previous state-of-the-art (spoiler: it amounts to days of difference).

TLDR: The advantage of genetic engineering with Cas9 over previous methods is the difference between writing out a sequence of letters and solving complex molecular binding problems.

aLIGO and the detection of gravitational waves

In one of the most impressive scientific breakthroughs of the previous hundred years or so, a bunch of clever people with very sensitive machines announced they’ve detected the squidge-squodging of space. A lot of the LIGO data is available from the LIGO Open Science Center, and working with it is a great way to learn signal processing techniques in Python. I synchronized the sound of gravitational wave chirp GW150914 to a simulated visualization (from SXS) of a corresponding black hole inspiral and the result is the following video. You can read my notes about the process here. I also modified the chirp to play the first few notes of the “Super Mario Brothers” theme.

Machine Learning

I’ve just started an intensive study of the subject, but machine learning continues to dip its toes into everything to do with modern human life. We have a lot of experience with meat-based learning programs, which should give us some insight into how to avoid common pitfalls. The related renewed interest in artificial intelligence should make the next few years interesting. If we do end up with a “hard” general artificial intelligence sometime soon, it might make competition a bit tough, if you could call it competition at all.

Devote a few seconds of thought to the twin issues of privacy and data ownership.


2016 also marked a renewed interest in manned space exploration, largely because of the announcement from space enthusiast Elon Musk that he’s really stoked to send a few people to Mars. NASA is still interested in Mars as well, and might be a good partner to temper Musk’s enthusiasm. In the Q&A at about 1:21 in the video below, Musk seems to suggest a willingness to die as the primary prerequisite for his first batch of settlers. There are some known unavoidable and unknown unknowable dangers in the venture, but de-prioritizing survivability as a mission constraint runs a better chance of delaying manned exploration as long as it remains as expensive as Musk optimistically expects.

Here’s some stuff that’s a lot less serious about living on Mars.

It doesn’t grab the headlines with such vigor, but Jeff Bezos’s Blue Origin had an impressive year: retiring their first rocket after five flights and exceeding the mission design in a final test of a launch escape system.
Blue Origin is also working on an orbital launch system called New Glenn, in honor of the first astronaut from the USA to orbit the Earth.

In that case, where are we headed?

The previous year provided some exciting moments to really trip the synapses, but we had some worrying turns as well. The biggest challenges of the next few decades will all have technical components, and understanding them doesn’t come for free. Humanity is learning more about biology at more fundamental levels, and medicine won’t look the same in ten years. A lot of people seem unconcerned that we probably won’t make the 2 degrees Celsius threshold for limiting climate change, although not worrying about something doesn’t mean it won’t kill anyone. Scientists and engineers have been clever enough to develop machine learners to assist our curiosity, and it’s exciting to think that resurgent interest in AI might give us someone to talk to soon. Hopefully they’ll be better conversationalists than the currently available chatbots, and a second opinion on the nature of the universe could be useful. It’s not going to be easy to keep up with improving automation, and humans will have to think about what working means to them.

Take some time to really dig into these subjects. You probably already have some thoughts and opinions on some of them, so try to read a contrary take. If you can’t think of evidence that might change your mind, you don’t deserve your conclusions.

Remember that science, technological development, and innovation have a much larger long-term effect on humans and our place in the universe than the petty machinations of human fractionation. So keep learning, figure out something new, and remember that if you possess general intelligence you can approach any subject. On the other hand, autogenous annihilation is one of the most plausible solutions to the Fermi Paradox. This is no time to get Kehoed.

The structure behind the simplicity of CRISPR/Cas9


The International Summit on Human Gene Editing took place in Washington D.C. a few weeks ago, underlining the critical attention continuing to follow CRISPR/Cas9 and its applications to genome editing. Recently I compared published protocols for CRISPR/Cas9 and a competing technique based on Zn-finger nucleases. That comparison suggested editing with CRISPR/Cas9 is vaguely simpler than using Zn-fingers, but it didn’t discuss the biomolecular mechanisms underlying the increased ease of use. Here I’ll illustrate the fundamental difference between genome editing with Cas9 and with the older techniques in simple terms, using relevant protein structures from the Protein Data Bank.

Each of the techniques I’ll mention here has the same end-goal: break double-stranded DNA in a specific location. Once a DNA strand undergoes this type of damage, a cell’s own repair mechanisms take over to put it back together. It is possible to introduce a replacement strand and encourage the cell to incorporate this DNA into the break, instead of the original sequence.

The only fundamental difference in the main techniques used for genome editing is the way they are targeted. Cas9, Zn-finger, and Transcription Activator Like (TAL) nucleases all aim to make a targeted break in DNA. Other challenges, such as getting the system into cells in the first place, are shared alike by all three systems.


Zinc Fingers (red) bound to target DNA (orange). A sufficient number of fingers like these could be combined with a nuclease to specifically cut a target DNA sequence.


Transcription Activator Like (TAL) region bound to target DNA. Combined with a nuclease, TAL regions can also effect a break in a specific DNA location.


Cas9 protein (grey) with guide RNA (gRNA, red) and target DNA sequence (orange). The guide RNA is the component of this machine that does the targeting. This makes the guide RNA the only part that needs to be designed to target a specific sequence in an organism. The same Cas9 protein, combined with different gRNA strands, can target different locations on a genome.

Targeting a DNA sequence with an RNA sequence is simple. RNA and DNA are both chains of nucleotides, and the rules for binding are the same as for reading out or copying DNA: A binds with T, U binds with A, C binds with G, and G binds with C [1]. Targeting a DNA sequence with protein motifs is much more complicated. Unlike with nucleotide-nucleotide pairing, I can’t fully explain how these residues are targeted, let alone in a single sentence. This has consequences in the initial design of the gRNA as well as the efficacy of the system and the overall success rate.
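The pairing rules really are simple enough to fit in a lookup table. Here is a minimal Python sketch, assuming nothing beyond the rules stated above; the function name `rna_complement` and the example sequence are made up for illustration and do not come from any real gene or protocol.

```python
# Pairing rules for building an RNA strand against a DNA sequence.
# DNA base -> RNA partner (RNA uses uracil, U, where DNA would use thymine, T).
DNA_TO_RNA_PARTNER = {"A": "U", "T": "A", "C": "G", "G": "C"}


def rna_complement(dna: str, antiparallel: bool = False) -> str:
    """Return the RNA bases that pair with each base of a DNA sequence.

    With antiparallel=False the result lines up position-by-position with
    the input, as when writing out the pairings by hand. With
    antiparallel=True the result is reversed, because paired strands run
    in opposite directions when each is read 5' to 3'.
    """
    dna = dna.upper()
    unknown = set(dna) - set(DNA_TO_RNA_PARTNER)
    if unknown:
        raise ValueError(f"not a DNA base: {sorted(unknown)}")
    paired = "".join(DNA_TO_RNA_PARTNER[base] for base in dna)
    return paired[::-1] if antiparallel else paired


# Hypothetical 20-nucleotide target, invented for this example only.
example_target = "ATGCCGTAAGCTTGACCTGA"
print(rna_complement(example_target))
```

There is no comparably short table for protein-DNA recognition, which is the whole point of the comparison.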

So the comparative ease-of-application stems from the differences in protein engineering vs. sequence design. Protein engineering is hard, but designing a gRNA sequence is easy.

How easy is it really?

Say that New Year’s Eve is coming up, and we want to replace an under-functioning Acetaldehyde Dehydrogenase [2] with a functional version. First we would need a ~20 nucleotide sequence from the target DNA, like this one from just upstream of the ALDH1B gene:


You can write out the base-pairings by hand or use an online calculator to determine the complementary RNA sequence:


To associate the guide RNA to the Cas9 nuclease, the targeting sequence has to be combined with a scaffold RNA which the protein recognises.

Scaffold RNA:

Target Complement:

Target complement + scaffold = guide RNA:

With that sequence we could target the Cas9 nuclease to the acetaldehyde dehydrogenase (ALDH1B) gene, inducing a break and leaving it open to replacement. The scaffold sequence above turns back on itself at the end, sinking into the proper pocket in Cas9, while the target complement sequence coordinates the DNA target, bringing it close to the cutting parts of Cas9. If we introduce a fully functional version of the acetaldehyde dehydrogenase gene at the same time, then we surely deserve a toast as the target organism no longer suffers from an abnormal build-up of toxic acetaldehyde. Practical points remain to actually prepare the gRNA, make the Cas9 protein, and introduce the replacement sequence, but from an informatic design point of view that is, indeed, the gist.
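From the informatic point of view, assembling the guide RNA is string concatenation: targeting sequence first, scaffold second. A minimal Python sketch of that step follows. The helper name `build_guide_rna` is hypothetical, and `PLACEHOLDER_SCAFFOLD` is a short made-up stand-in, not the real scaffold sequence that Cas9 recognises.

```python
RNA_BASES = set("ACGU")

# Made-up stand-in for illustration only; NOT a real Cas9 scaffold sequence.
PLACEHOLDER_SCAFFOLD = "GGGAAACCCUUU"


def build_guide_rna(target_complement: str, scaffold: str) -> str:
    """Join a targeting sequence to a scaffold, targeting sequence first."""
    target_complement = target_complement.upper()
    scaffold = scaffold.upper()
    # Both parts must be written in the RNA alphabet (U, never T).
    for name, seq in (("target complement", target_complement),
                      ("scaffold", scaffold)):
        unknown = set(seq) - RNA_BASES
        if unknown:
            raise ValueError(f"{name} contains non-RNA bases: {sorted(unknown)}")
    return target_complement + scaffold


# Hypothetical 20-nucleotide targeting sequence, invented for this example.
guide = build_guide_rna("UACGGCAUUCGAACUGGACU", PLACEHOLDER_SCAFFOLD)
print(guide)
```

Retargeting the nuclease means changing only the first argument; the scaffold, like the Cas9 protein itself, stays the same.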

That’s the basics of targeting Cas9 in 1,063 words. I invite you to try and explain the intricacies of TAL effector nuclease protein engineering with fewer words.


[1] That’s C for cytosine, G for guanine, U for uracil, and A for adenine. In DNA, the uracil is replaced with thymine (T).

[2] Acetaldehyde is an intermediate produced during alcohol metabolism, thought to be largely responsible for hangovers. A mutation in one or both copies of the gene can lead to the so-called “Asian Flush”.

Sources for structures:

I rendered all of the structures using PyMOL. The data come from the following publications:

PDB structure: 3VEK (Zn-finger)

Wilkinson-White, L.E., Ripin, N., Jacques, D.A., Guss, J.M., Matthews, J.M. DNA recognition by GATA1 double finger. To be published.

PDB structure: 3ugm (TAL)

Mak, A.N., Bradley, P., Cernadas, R.A., Bogdanove, A.J., Stoddard, B.L. The Crystal Structure of TAL Effector PthXo1 Bound to Its DNA Target. (2012) Science 335: 716-719

PDB structure: 4oo8 (Cas9)
Nishimasu, H., Ran, F.A., Hsu, P.D., Konermann, S., Shehata, S.I., Dohmae, N., Ishitani, R., Zhang, F., Nureki, O. Crystal structure of Cas9 in complex with guide RNA and target DNA. (2014) Cell (Cambridge, Mass.) 156: 935-949

Comic cover original source:
“Amazing Stories Annual 1927” by Frank R. Paul – Scanned cover of pulp magazine. Licensed under Public Domain via Wikimedia Commons –

What’s the big deal with CRISPR/Cas9?


Cas9 (grey) in complex with yellow guide RNA and red target DNA. PDB structure 4oo8 manipulated in PyMOL by yours truly. Cas9, like competing genome editing technologies (TALENs and ZFNs), is a nuclease.

Summary: Eliminate hereditary diseases. Re-program pathological tissue. Design babies. Bring back the T. rex. The peril and promise of genetic engineering have been a long time coming. Generally speaking, none of the wonders we began collectively imagining with the deduction of DNA structure in the 1950s have come to fruition. At the turn of the millennium, with the completion of the human genome project(s), we expected personalized medicine to eradicate inefficacies and side effects in modern medicine. Current development based on bacterial immune systems promises to either revolutionise the treatment of genetic disease or fill the world with ten-foot tall babies shooting lasers out of their perfect blue eyes while playing professional basketball and winning Nobel Prizes.

My first foray into a wet lab consisted of a project straight out of the astounding futures your favourite sci-fis promised you, or warned you about: incorporating functional genetic elements from humans into fungal cells. After a summer spent pushing the limits of what is possible and blurring the lines of what it means to be human, I created a terrible organism neither man nor yeast. Unable to find acceptance among people and no longer satisfied by nature’s intentions, these fungal colonies, the bizarre offspring of one man’s twisted mind and leavening products, found the cruel world to be too much and jumped into an autoclave while reciting Macbeth.

Despite the hyperbolic passage above, the monsters yet live. The strain ended up in a laboratory-grade freezer at negative eighty degrees (Celsius, of course, the lab being free of both astrologers and barbarians). The little yeasties are probably still chilling in the small cardboard box where I left them, covered in frost and enjoying a nice bath of glycerol cryo-protectant, traveling through time in suspended animation until the world is ready for them.

The human genes and their counterparts in baker’s yeast are similar enough that in this case one could substitute for the other (at least in one direction). The function of these metabolic keystones known as ATP synthases is an ancient one: churning the potential energy of a proton gradient to make the cellular energy storage molecule adenosine triphosphate (ATP). They are primeval enough that the human version acts as a suitable stand-in for a strain of Saccharomyces cerevisiae otherwise incapable of aerobic respiration. I had precisely engineered a genetic vector that inserted directly into the location of the yeast’s genome where the native version had been removed. And by “precisely engineered” I mean that it was so easy, an undergrad could do it, as I did.

Recently a technique based on CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and CRISPR-Associated Proteins (such as Cas9) has garnered a lot of attention in the press as well as the scientific community. The word-sequences up-regulating all the excitement highlight the ease and effectiveness of CRISPR/Cas9 over previous methods. The technique’s critical reception has run the full range from drooling anticipation to worried alarm to bad puns.

Since my early days in the lab playing as a god with design of human-yeast splices, I’ve continued down the rabbit-hole of biological scale to the point that I now work more often with the single molecule(s) of biomolecular machinery than with cells directly. So I’m certainly out of the loop and out of a practical grasp of the rationale underlying CRISPR/Cas9 genome editing. After all, spider silk proteins have been produced in mammalian cells since before 2002, and are regularly produced in goat’s milk. Does CRISPR/Cas9 change the game to a degree that warrants the flood of interest?


The interest surrounding CRISPR

I’ll skip over the high-level technical overviews that you’ve probably read before, but for those with the time and interest I can recommend Jennifer Doudna’s Breakthrough Prize lecture. Instead I’ll compare two protocols, the first based on CRISPR/Cas9 and the second based on an older technique using another type of engineered nuclease known as zinc-finger nucleases (ZFNs). I scraped both protocols from the same publication, so apparent differences due to style should be small. To get a sense of the complexity of each technique, here are the two protocols as wordle word-clouds, displaying the size of the 256 most frequently used words in each protocol according to their relative usage.


ZFN protocol: word frequency word cloud


CRISPR/Cas9 protocol: word frequency word cloud
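The counting behind a word cloud like these can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how Wordle itself works: the function names `top_words` and `relative_sizes` are hypothetical, the sample text is invented, and no common-word filtering is applied.

```python
import re
from collections import Counter


def top_words(text: str, n: int = 256) -> list[tuple[str, int]]:
    """Return the n most frequent words in text with their counts."""
    # Lower-case the text, then keep runs of letters/digits that start
    # with a letter, so "Cas9" survives as one word.
    words = re.findall(r"[a-z][a-z0-9'-]*", text.lower())
    return Counter(words).most_common(n)


def relative_sizes(counts: list[tuple[str, int]]) -> dict[str, float]:
    """Scale each count against the most frequent word (which gets 1.0)."""
    if not counts:
        return {}
    largest = counts[0][1]
    return {word: count / largest for word, count in counts}


# Invented sample text standing in for a protocol.
sample = "Digest the DNA. Purify the DNA, then ligate the DNA into the vector."
for word, size in relative_sizes(top_words(sample, n=3)).items():
    print(f"{word}: {size:.2f}")
```

Without a stop-word list, filler words such as "the" dominate the output, so a real word cloud would presumably drop them before sizing the rest.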

The table below compares the complexity and length of each protocol. The reading complexity measures were generated with this tool, and in short the first measure decreases with increased complexity while the second two increase with added complexity.


At first glance we see that the CRISPR/Cas9 protocol is much longer and more complicated, but the Zn-finger nuclease protocol describes only the steps up to in vitro validation. We can make a much more equivalent comparison by truncating the CRISPR/Cas9 protocol to the first 13 steps. The resulting comparison:


The associated Wordle even looks a bit friendlier.
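For readers who want to reproduce reading-complexity scores like those mentioned above, here is a rough Python sketch. The post does not name the measures used, so this assumes two standard ones that behave as described: Flesch Reading Ease (falls as text gets harder) and Flesch-Kincaid Grade Level (rises as text gets harder). The syllable counter is a crude vowel-group heuristic and the function names are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Crude estimate: one syllable per run of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def _text_stats(text: str) -> tuple[float, float]:
    """Return (words per sentence, syllables per word) for text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(words) / len(sentences), syllables / len(words)


def flesch_reading_ease(text: str) -> float:
    """Higher scores mean easier reading."""
    words_per_sentence, syllables_per_word = _text_stats(text)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def flesch_kincaid_grade(text: str) -> float:
    """Higher scores mean harder reading (roughly a US school grade)."""
    words_per_sentence, syllables_per_word = _text_stats(text)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Invented sample sentences, not taken from either protocol.
easy = "The cat sat. The dog ran."
hard = "Electroporation facilitates recombination of oligonucleotides."
print(flesch_reading_ease(easy), flesch_reading_ease(hard))
print(flesch_kincaid_grade(easy), flesch_kincaid_grade(hard))
```

Scores from a heuristic syllable counter will not match a dedicated tool exactly, so they are only useful for comparing texts scored the same way.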


So suffice it to say that it’s not easy to see the underpinnings of the excitement surrounding major developments such as CRISPR/Cas9. Essentially, the advantages of the CRISPR-based approach stem from how much easier it is to engineer guide RNAs than to engineer the DNA-binding domains, built from amino acid residues, that the competing techniques (ZFNs and TALENs, the latter not compared here) require. In the brewer’s yeast I modified “back in the day,” targeting the desired genes to the desired location was as simple as including a sequence from the target location on the DNA to be inserted; there are sufficient double-stranded breaks in a flask of yeast culture to allow the gene to find its target a few times. With the specifically targetable nucleases such as Cas9, Zinc-finger nucleases and TALENs, one doesn’t have to count on such an easy model organism to precisely manipulate a small number of cells for a desired change to the genome.

The increased interest alone is sure to drum up funding, public intrigue, and private investment, driving the impact forward as a self-fulfilling prophecy. The more interested and excited people are for CRISPR/Cas9, particularly those people with the deep pockets to fill out scientists’ salaries, the more the technique will be subjected to use and refinement. More people using the tool drives the potential for meaningful breakthroughs. On the other hand, we have been promised and warned of this same onrushing biopunk dystopia before, and as they say: if this is the future, where are my gene-driven superpowers?


Published protocols referenced in this post:
[1] Carroll, D., Morton, J. J., Beumer, K. J., & Segal, D. J. (2006). Design, construction and in vitro testing of zinc finger nucleases. Nature Protocols, 1, 1329–1341.

[2] Ran, F. A., Hsu, P. D., Wright, J., Agarwala, V., Scott, D. A., & Zhang, F. (2013). Genome engineering using the CRISPR-Cas9 system. Nature Protocols, 8(11), 2281–2308.

[2015/12/14 EDIT – copyediting]