While reading press coverage on the UW-Madison primate caloric restriction study for my essay, I kept getting déjà vu as I came across the same language over and over. Much of this was due to early coverage relying heavily on the press release from the University of Wisconsin-Madison, and to sites buying stories from each other, so I decided it might be informative to make a phylogenetic tree of the coverage. To do so I took the text from the first two pages of Google News results for “wisconsin monkey caloric restriction”, converted the English text to DNA sequences, and built a phylogenetic tree based on a multiple sequence alignment. I found a total of 27 articles on the CR study, and included one unrelated outgroup for a total of 28.
I used DNA Writer by Lensyl Urbano (CC BY NC SA) to convert the text of each article into a DNA sequence. This algorithm associates each character with a three-nucleotide sequence, just like our own genome defines amino acids with a three-letter code. Unlike our own genetic code, Urbano’s tool is not degenerate (each letter has only one corresponding three-letter code). With four bases (adenine, thymine, guanine, and cytosine) and three positions per code, there is room for 4 × 4 × 4 = 64 unique codes. For example “I want to ride my bicycle” becomes
The translation table and tool are available at http://earthsciweb.org/js/bio/dna-writer/.
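To make the encoding idea concrete, here is a minimal Python sketch of a non-degenerate three-nucleotide code. The codon assignments below are my own arbitrary, hypothetical choice and will not match Urbano's actual translation table; the names `ALPHABET`, `text_to_dna`, and `dna_to_text` are also just for illustration.

```python
from itertools import product

BASES = "ATGC"

# All 4**3 = 64 possible three-nucleotide codes, in a fixed order.
CODONS = ["".join(p) for p in product(BASES, repeat=3)]

# Hypothetical character set (NOT DNA Writer's real table): 48 symbols,
# which fits comfortably inside the 64 available codes.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 .,;:!?'\"-()"

# One code per character and one character per code (non-degenerate).
ENCODE = dict(zip(ALPHABET, CODONS))
DECODE = {codon: char for char, codon in ENCODE.items()}


def text_to_dna(text):
    """Encode text as DNA; case is folded and unknown characters are dropped."""
    return "".join(ENCODE[c] for c in text.lower() if c in ENCODE)


def dna_to_text(dna):
    """Decode a DNA string produced by text_to_dna, three bases at a time."""
    codes = (dna[i:i + 3] for i in range(0, len(dna), 3))
    return "".join(DECODE[code] for code in codes)


sequence = text_to_dna("I want to ride my bicycle")
print(sequence)
print(dna_to_text(sequence))
```

Because every character maps to exactly one code, two articles that share a sentence end up sharing an identical stretch of sequence, which is what lets an alignment tool pick up copied text.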
To build the trees and alignments I used MAFFT. The sequences derived from each article can be relatively long, and MAFFT can handle longer sequences due to its use of the Fast Fourier Transform. MAFFT is available for download or use through a web interface here. I used the web interface, checking the Accurate and Minimum Linkage run options.
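MAFFT takes its input sequences in FASTA format, so the converted articles need to be bundled into one FASTA text before pasting or uploading. A minimal sketch of that step follows, assuming the sequences are already held in a dictionary keyed by article label; the function name `to_fasta` and the labels in the example are hypothetical.

```python
def to_fasta(records, width=60):
    """Format {label: sequence} pairs as FASTA text.

    Each record gets a '>' header line followed by the sequence,
    wrapped to `width` characters per line.
    """
    lines = []
    for label, seq in records.items():
        lines.append(">" + label)
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


# Hypothetical labels and toy sequences, for illustration only.
example = {
    "press_release": "ATGC" * 20,
    "outgroup": "GGCATT" * 5,
}
print(to_fasta(example))
```

Short labels without spaces are the safer choice here, since they are what ends up on the tips of the tree.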
Once I had copied the tree in Nexus format, I ran FigTree by Andrew Rambaut to generate a useful graphical tree. I had included an unrelated article at Scientific American as an outgroup, and I chose the branch between that article and the group composed of press coverage of the UW macaque caloric restriction study as the root. This would correspond to a last common ancestor on a real phylogenetic tree.
The resulting tree produces some interesting clades. For example, ScienceDaily, esciencenews, and News-Medical, which all essentially just reproduced the UW-Madison press release, are grouped together. Another obvious group is the Tampa Bay Times and the Herald Tribune, which sourced the article from the New York Times and pared it down for their readers.
Here is the tree in Nexus format:
. . .and this is a list of all the addresses for the articles I used and their labels on the tree: https://thescinder.com/pages/key-to-uwm-mac…logenetic-tree/