Why is there no confidence in science journalism?

[Image: comic cover graphic, after a cover design by Norman Saunders]

For those of us living in the so-called Anthropocene, meaningful participation in humanity’s trajectory requires scientific literacy. This requirement holds at the population level: it is not enough for a small proportion of select individuals to develop this expertise and apply it only to the avenues of their own interest. Rather, a general understanding and use of the scientific method in forming actionable ideas for modern problems is a requisite for a public capable of steering policy along a survivable route. As an added benefit, scientific literacy produces the rarely avoided side effect of knowing one or two things for certain, and of touching upon the numinous in the universe.

Statistical literacy is a necessary foundation for building scientific literacy. Confusion about the meaning of such terms as “statistical significance” (compounded by non-standard usage of the term “significance” on its own) abounds, so little of the import of these concepts survives when scientific results are described in mainstream publications. What’s worse, this results in a jaded public knowing just enough to twist the jargon of science to support their own predetermined, potentially dangerous, conclusions (e.g. because scientific theories can be refuted by evidence to the contrary, a given theory, no matter the level of support by existing data, can be ignored when forming personal and policy decisions).

I posit that a fair amount of the responsibility for improving the state of non-specialist scientific literacy lies with science journalists at all scales. The most popular science-branded media do little to nothing to impart a sense of the scientific method, the context and contribution of published experiments, or the meaning of the statistics underlying the claims. I suggest that a standardisation of language for describing scientific results is warranted, so that results and concepts can be communicated intuitively, without condescension, while still conveying the quantitative, comparable values used to form scientific conclusions.

A good place to start (though certainly not perfect) is the uncertainty guidance put out by the Intergovernmental Panel on Climate Change (IPCC). The IPCC reports benefit from translating statistical concepts of confidence and likelihood into intuitive terms without sacrificing the underlying quantitative meaning (mostly). In the IPCC AR5 report guidance on addressing uncertainty [pdf], likelihood statements of probability are standardised as follows:

[Table image: IPCC AR5 likelihood scale]
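For readers who prefer code to tables, the calibrated likelihood language can be sketched as a simple lookup. The terms and probability ranges below are the ones given in Table 1 of the AR5 guidance note. The function names and the choice to report the narrowest matching term are my own assumptions for illustration, not part of the guidance.

```python
# Likelihood terms and probability ranges from Table 1 of the AR5 guidance note.
# The ranges are nested on purpose: "very likely" outcomes are also "likely".
AR5_LIKELIHOOD = [
    ("virtually certain", 0.99, 1.00),
    ("very likely", 0.90, 1.00),
    ("likely", 0.66, 1.00),
    ("about as likely as not", 0.33, 0.66),
    ("unlikely", 0.00, 0.33),
    ("very unlikely", 0.00, 0.10),
    ("exceptionally unlikely", 0.00, 0.01),
]


def likelihood_terms(p):
    """Return every AR5 term whose range contains the probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    return [term for term, low, high in AR5_LIKELIHOOD if low <= p <= high]


def narrowest_term(p):
    """Return the most specific matching term (an illustrative convention)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    matches = [(high - low, term)
               for term, low, high in AR5_LIKELIHOOD if low <= p <= high]
    return min(matches)[1]


if __name__ == "__main__":
    for p in (0.995, 0.95, 0.5, 0.05):
        print(p, "->", narrowest_term(p))
```

Because the ranges overlap, a journalist following this scheme still has to decide how specific to be; picking the narrowest range is only one reasonable convention.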

In the fourth assessment report (AR4), the guidance [pdf] roughly calibrated confidence statements to a chance of being correct. I’ve written the guidance here in terms of p-values, read loosely as the chance that a result at least this strong would turn up by coincidence alone (p = 0.10 = 10% chance), but statistical tests producing other measurements of confidence were also covered.

[Table image: IPCC AR4 confidence scale, expressed as p-values]
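The same calibration can be written as a short function. The AR4 note anchors its terms to a chance of being correct: at least 9 in 10 for very high confidence, about 8 in 10 for high, about 5 in 10 for medium, about 2 in 10 for low, and less than 1 in 10 for very low. The sketch below makes two assumptions of mine: it treats "chance of being wrong" as interchangeable with a p-value (the loose reading used in this post, not a strict statistical definition), and it treats each anchor as a cut-off, which the note itself does not spell out.

```python
# AR4 anchors rewritten as ceilings on the "chance of being wrong".
# Treating the anchors as hard cut-offs is an assumption for illustration.
AR4_CONFIDENCE = [
    (0.1, "very high confidence"),  # at least 9 in 10 chance of being correct
    (0.2, "high confidence"),       # about 8 in 10
    (0.5, "medium confidence"),     # about 5 in 10
    (0.8, "low confidence"),        # about 2 in 10
]


def ar4_confidence(p_value):
    """Map a p-value (loosely, the chance of being wrong) to an AR4 term."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    for ceiling, term in AR4_CONFIDENCE:
        if p_value <= ceiling:
            return term
    return "very low confidence"


if __name__ == "__main__":
    for p in (0.05, 0.10, 0.5, 0.9):
        print("p =", p, "->", ar4_confidence(p))
```

Note that p = 0.05 and p = 0.10 land in the same bin, and a coin flip still earns medium confidence.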

Describing results by their confidence, rather than by the statistical significance that is normally used, is probably more intuitive to most people. Few general readers readily distinguish between statistical significance, i.e. the results are unlikely to be due to chance, and meaningful significance, i.e. the results matter in some way. Moreover, conventions for stating statistical significance are not well established even in the scientific literature, and they vary widely by field. That being said, the IPCC’s AR4 guidance threshold for very high confidence is quite low. Many scientific results are only considered reportable at a p-value of less than 0.05, loosely a 5% chance that coincidence alone would produce such a result, whereas the AR4 guidance links a statement of very high confidence to anything with less than a 10% chance of being wrong. Likewise, a 5-in-10 chance of being correct hardly merits a statement of medium confidence in my opinion. Despite these limitations, I think the guidance should merely have been updated to better reflect the statistical reality of confidence. It was a mistake for the AR5 guidance to switch to purely qualitative standards for conveying confidence, based on the table below, with highest confidence in the top right and lowest confidence in the bottom left.

[Table image: IPCC AR5 qualitative confidence matrix]

Adoption (and adaptation) of standards like these in regular usage by journalists could do a lot to improve the communication of science to a general readership. This would normalise field-variable technical jargon (e.g. sigma significance values in particle physics, p-values in biology) and reduce the need for daft analogies. Results described in this way would be amenable to meaningful comparison by generally interested but non-specialist audiences, while those with a little practice in statistics would not be left less informed by a dumbed-down description.

Edited 2016/06/25 for a better title, added comic graphic. Source for file of cover design by Norman Saunders (Public Domain)
23 Aug. 2014: typo in first paragraph corrected:

. . . meaningful participation in participating in humanity’s trajectory. . .

References:

Michael D. Mastrandrea et al. Guidance Note for Lead Authors of the IPCC Fifth Assessment Report on Consistent Treatment of Uncertainties. IPCC Cross-Working Group Meeting on Consistent Treatment of Uncertainties. Jasper Ridge, CA, USA 6-7 July 2010. <http://www.ipcc.ch/pdf/supporting-material/uncertainty-guidance-note.pdf>

IPCC. Guidance Notes for Lead Authors of the IPCC Fourth Assessment Report on Addressing Uncertainties. July 2005. <https://www.ipcc-wg1.unibe.ch/publications/supportingmaterial/uncertainty-guidance-note.pdf>

A Phylogeny of Internet Journalism

While reading press coverage on the UW-Madison primate caloric restriction study for my essay, I kept getting déjà vu as I noticed I was coming across the same language over and over. Much of this was due to the heavy reliance of early coverage on the press release from the University of Wisconsin-Madison, and to sites buying stories from each other, so I decided it might be informative to make a phylogenetic tree of the coverage. To do so I used the text from the first two pages of Google News results for “wisconsin monkey caloric restriction” and built a phylogenetic tree based on multiple sequence alignment after converting the English text to DNA sequences. I found a total of 27 articles on the CR study, and included one unrelated outgroup for a total of 28.

I used DNA Writer by Lensyl Urbano (CC BY NC SA) to convert the text of each article into a DNA sequence. This algorithm associates each character with a three-nucleotide sequence, just as our own genome specifies amino acids with a three-letter code. Unlike our own genetic code, Urbano’s tool is not degenerate (each character has only one corresponding three-letter code). With four bases (adenine, thymine, guanine, and cytosine), there is room for 4^3 (64) unique codes. For example, “I want to ride my bicycle” becomes

CTGAGCATGACTCTCTAGAGCTAGTGTAGCCACCTGTACCTAAGCACAGACAGCCATCTGTCAGACTCAATCCTA

The translation table and tool are available at http://earthsciweb.org/js/bio/dna-writer/.
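To make the encoding concrete, here is a minimal sketch in Python. The codon table is only a partial one, read off the example sentence above by splitting its sequence into triplets; it is not the full DNA Writer table, and the function names are my own. Since “I” and “i” share a codon in the example, the sketch assumes the encoding ignores case.

```python
# Partial codon table recovered from the example sentence in this post.
# The rest of the DNA Writer table is not reproduced here, so any character
# outside this set raises a KeyError.
CODONS = {
    " ": "AGC", "a": "ACT", "b": "CAT", "c": "TCA", "d": "TAC",
    "e": "CTA", "i": "CTG", "l": "ATC", "m": "ACA", "n": "CTC",
    "o": "TGT", "r": "CAC", "t": "TAG", "w": "ATG", "y": "GAC",
}

# The code is not degenerate, so the table can simply be inverted.
CHARACTERS = {codon: char for char, codon in CODONS.items()}


def text_to_dna(text):
    """Encode text as DNA, one three-nucleotide codon per character."""
    return "".join(CODONS[char] for char in text.lower())


def dna_to_text(dna):
    """Decode a DNA string back to lower-case text, three bases at a time."""
    return "".join(CHARACTERS[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    encoded = text_to_dna("I want to ride my bicycle")
    print(encoded)
    print(dna_to_text(encoded))
```

Because every character maps to exactly one codon and back, two articles sharing a sentence will share the corresponding stretch of sequence, which is what lets an alignment tool pick up copied text.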

To build the trees and alignments I used MAFFT. The sequences derived from each article can be relatively long, and MAFFT can handle longer sequences due to its use of the Fast Fourier Transform. MAFFT is available for download or for use through a web interface. I used the web interface, checking the Accurate and Minimum Linkage run options.
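MAFFT takes its input as FASTA, so the converted articles need to be bundled into one labelled file before they are pasted into the web interface. A minimal sketch of that step follows; the function name is my own, and the two short sequences in the example are placeholders rather than real article text.

```python
def to_fasta(sequences, width=60):
    """Format a {label: sequence} mapping as FASTA text.

    Each record is a '>' header line followed by the sequence wrapped to
    `width` characters per line.
    """
    lines = []
    for label, sequence in sequences.items():
        lines.append(">" + label)
        for start in range(0, len(sequence), width):
            lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Labels follow the tree in this post; the sequences are placeholders.
    example = {
        "2_UWMPressRelease": "CTGAGCATGACT",
        "3_NYTimes": "CTGAGCTAGTGT",
    }
    print(to_fasta(example), end="")
```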

Once I had copied the tree in Nexus format, I ran FigTree by Andrew Rambaut to generate a useful graphical tree. I had included an unrelated article at Scientific American as an outgroup, and I chose the branch between that article and the group composed of press coverage of the UW macaque caloric restriction study as the root. This would correspond to a last common ancestor on a real phylogenetic tree.

The resulting tree produces some interesting clades. For example, ScienceDaily, esciencenews, and News-Medical, which all essentially just reproduced the UW-Madison press release, are grouped together. Another obvious group is the Tampa Bay Times and the Herald Tribune, which sourced the article from the New York Times and pared it down for their readers.

[Figure: phylogenetic tree of press coverage of the UW macaque caloric restriction study]

Here is the tree string (parenthetical, Newick-style notation, as used inside a Nexus tree block):

(((1_theScinder-:0.845,(((((((((((((((2_UWMPressRelease:0.0085,((4_escienceNews_UWM_:5.0E-4,5_ScienceDaily_UWPressRelease:5.0E-4):0.0,15_news-medical_UWM:5.0E-4):0.008):0.3115,26_aniNews:0.32):0.392,(14_natureWorldNews:0.7055,16_techTimes:0.7055):0.0065):0.006,25_expressUK:0.718):0.0025,20_hngn:0.7205):0.0195,(8_MedicalNewsToday:0.0,18_bayouBuzz_medicalNewsToday:0.0):0.74):0.0025,27_newsTonightAfrica:0.7425):0.047,(17_perezHilton:0.7805,(19_theVerge:0.6905,24_cbsLocalAtlanta:0.6905):0.09):0.009):0.0075,7_IFLS:0.797):0.007,21_seattlepi:0.804):0.006,12_nature:0.81):0.021,(6_yahooNews:0.0285,10_livescience:0.0285):0.8025):5.0E-4,((3_NYTimes:0.1875,11_HeraldTribune_NYT:0.1875):0.344,13_tampaBayTimes_NYT:0.5315):0.3):0.008,22_iol_dailyMail:0.8395):5.0E-4,9_healthDay/Philly_com:0.84):0.005):0.004,23_bbc:0.849):0.0245,28_OUTGROUPSciAmYeastyBeasties:0.8735);
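For anyone who wants to work with the tree string directly, the leaf labels and their terminal branch lengths can be pulled out with a regular expression. This is only a sketch: the function name is my own, and it relies on every leaf in this particular tree being written as `label:length`. The example uses the New York Times clade copied from the string above.

```python
import re

# A leaf is a label that directly follows "(" or "," and is followed by
# ":" and a branch length. Internal nodes follow ")" and are skipped.
LEAF_PATTERN = re.compile(r"[(,]\s*([^():,;\s]+):([0-9.eE+-]+)")


def leaf_branch_lengths(newick):
    """Return {leaf label: terminal branch length} for a Newick string."""
    return {label: float(length)
            for label, length in LEAF_PATTERN.findall(newick)}


if __name__ == "__main__":
    # The New York Times clade, copied from the full tree string.
    nyt_clade = ("((3_NYTimes:0.1875,11_HeraldTribune_NYT:0.1875):0.344,"
                 "13_tampaBayTimes_NYT:0.5315);")
    for label, length in leaf_branch_lengths(nyt_clade).items():
        print(label, length)
```

Run on the full string, the same function should return one entry per article, which is a quick way to check that all 28 labels made it through the alignment.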

. . .and this is a list of all the addresses for the articles I used and their labels on the tree: https://thescinder.com/pages/key-to-uwm-mac…logenetic-tree/

Come on you monkeys, do you want to live forever?

[Image: an overweight macaque]

Members of the control group for the Wisconsin National Primate Research Center caloric restriction study were fed an ad libitum diet of processed food.

The infinite monkey theorem, perhaps first invoked by French mathematician Émile Borel, posits that a monkey condemned to randomly punch keys on a typewriter for an infinite period of time would eventually produce the complete works of Shakespeare. The thought experiment may also be a good metaphor for encapsulating the experience of writing amateur science journalism.

Now consider the same experiment, replacing the generic monkey with members of the species Macaca mulatta, rhesus macaques, and the typewriter with as much processed food as the macaques can stuff into their furry little faces. Modestly pare down the timescale of the experiment from infinite time to about 25 years, increase the number of macaques from one lonely typist to about 38 individuals, and you have a pretty good first approximation of the control group for the University of Wisconsin-Madison Energy Metabolism and Chronic Disease study. You’ll be more familiar with the name used in the popular press, something including the words “caloric restriction,” “longevity” or “lifespan,” and “monkey.”

Caloric restriction (CR) has a long history of increasing longevity in yeast, nematodes, and mice. YouTube is full of mini-documentaries detailing the lives of the voluntarily emaciated, and many a blogger describes their day-to-day struggle to minimize caloric intake. The human caloric restriction community may have breathed a combined sigh of frustration and relief in 2012 when de facto rivals at the National Institute on Aging (NIA), led by Dr. Rafael de Cabo, published an article contradicting the 2009 claim that it works in monkeys, too.

The most recent foray into macaque CR, published in Nature Communications by Dr. Ricki Colman et al. from Wisconsin, claims that the NIA study’s control monkeys were actually on a CR diet as well, albeit one less extreme than the 30% reduction of the experimental diet. They compared the mean weight of control monkeys in both studies to a national database of research macaque mass, the internet Primate Ageing Database or iPAD. The NIA controls were indeed as much as 15% lighter than the averages in the database, as would be expected if the animals were on a restricted diet. However, the UW controls were 5-10% heavier than average, blurring the line between normal feeding and overeating. iPAD does not distinguish between solitary and group housing of macaques, while both the NIA and the Wisconsin studies house each individual separately.

The difference ultimately comes down to a discrepancy in what is considered a normal diet. Colman et al. fed controls as much of a fortified, low-fat diet, relatively rich in sugar content, as they wanted. This ad libitum feeding was meant to mirror the eating habits of humans. At the NIA, controls were given a diet based on estimated nutritional need, rather than appetite, and the food was less processed.

Since the goal of using primates in this research is to translate the results to humans, the differing diet choices for controls represent a meaningful philosophical difference: should we compare experiments to how we are or how we should be? Granted, the industrialized world is now more overweight than not, and the control group studied by UW researchers may be a more realistic mirror of the human condition. But the survival benefits seen in the CR group may boil down to the benefits of eating a reasonable diet, avoiding excessive sugar and getting out of the cage once in a while. In short, the UW study was designed in a way that would err on the side of confirming their hypothesis, while the NIA study left much more room for the null hypothesis.

The controversy underlines the difficulty of taking promising results in “lower” animals and common model organisms and applying them to humans. The idea of putting 76 humans into controlled conditions for 25 years to test a radical diet or any other intervention is limited to the realm of the horror subtype of science fiction. This is why many of the health reports that trickle down into the popular press are based on “survey science,” in which respondents answer questionnaires regarding their diet and lifestyle, with varying degrees of quantitative oversight. This is in large part what leads to the impression that every other week the things that kill you are healthy again and vice versa. It pays in terms of publicity for a university press office to encourage journalists to parrot a warning that eating meat is as deadly as smoking, even if human self-reporting is notoriously bad, and the underlying data may be a bit more subtle.

The climate for ethical considerations in even non-human primate research is evolving. In early 2013, the National Institutes of Health announced that they would begin retiring active chimpanzees from research with no intent to replace them. It is unlikely that the experimental conditions of either the NIA or the Wisconsin study will be reproduced in the near future, so there won’t be any mulligans for CR in monkeys. This increases the scrutiny and standard of evidence for the results from these experiments, and makes it all the more important for the scientific community and popular press to come to cohesive conclusions.

The “need for consensus” may be overstated, as the studies are very different experiments. It is likely that those both scientifically literate and with the time and inclination to read the literature wouldn’t be misled in their conclusions, but this group will not include most people who may be affected by the outcome. After all, everyone gets old eventually, if they are lucky enough. The responsibility to avoid painting the situation as a sensational controversy and accurately convey the results of these experiments belongs to science journalists and academics in combination.

Relevant articles (appended 2016/01/06):
Ricki J. Colman, T. Mark Beasley, Joseph W. Kemnitz, Sterling C. Johnson, Richard Weindruch & Rozalyn M. Anderson. Caloric restriction reduces age-related and all-cause mortality in rhesus monkeys. Nature Communications 5, Article number: 3557 doi:10.1038/ncomms4557
Received 12 October 2013 Accepted 05 March 2014 Published 01 April 2014

Evi M. Mercken, Bethany A. Carboneau, Susan M. Krzysik-Walker, and Rafael de Cabo. Of Mice and Men: The Benefits of Caloric Restriction, Exercise, and Mimetics. Ageing Res Rev. 2012 Jul; 11(3): 390–398. Published online 2011 Dec 20. doi: 10.1016/j.arr.2011.11.005