Why is there no confidence in science journalism?

[Image: "Uncertain Science Stories" cover, after a 1939 pulp design by Norman Saunders]

Living in the so-called Anthropocene, meaningful participation in humanity's trajectory requires scientific literacy. This is a necessity at the population level: it is not enough for a small proportion of select individuals to develop this expertise and apply it only to the avenues of their own interest. Rather, a general understanding and use of the scientific method in forming actionable ideas about modern problems is a requisite for a public capable of steering policy along a survivable route. As an added benefit, scientific literacy produces a rarely avoided side-effect: knowing one or two things for certain, and touching upon the numinous in the universe.

Statistical literacy is a necessary foundation for scientific literacy. Confusion about the meaning of terms such as "statistical significance" (compounded by non-standard usage of "significance" on its own) abounds, so little of what these concepts convey survives when scientific results are described in mainstream publications. Worse, this produces a jaded public knowing just enough to twist the jargon of science to support predetermined, potentially dangerous, conclusions (e.g. because scientific theories can be refuted by contrary evidence, a given theory, no matter how well supported by existing data, can be ignored when forming personal and policy decisions).

I posit that a fair amount of the responsibility for improving non-specialist scientific literacy lies with science journalists at all scales. The most popular science-branded media does little to nothing to impart a sense of the scientific method, the context and contribution of published experiments, or the meaning of the statistics underlying the claims. I suggest that a standardisation of the language for describing scientific results is warranted, so that results and concepts can be communicated intuitively, without resorting to condescension, while still conveying the quantitative, comparable values used to form scientific conclusions.

A good place to start (though certainly not perfect) is the uncertainty guidance put out by the Intergovernmental Panel on Climate Change (IPCC). The IPCC reports benefit from translating the statistical concepts of confidence and likelihood into intuitive terms without (mostly) sacrificing the underlying quantitative meaning. In the IPCC AR5 guidance on addressing uncertainty [pdf], likelihood statements of probability are standardised as follows:

[Table: IPCC AR5 likelihood scale, mapping calibrated terms ("virtually certain", "very likely", etc.) onto probability ranges]
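The mapping is mechanical enough to express directly. Here is a minimal Python sketch of the AR5 likelihood scale; the function name and the decision to return the narrowest matching term are mine, the thresholds come from the guidance note, and I omit the optional intermediate terms ("extremely likely", "more likely than not", "extremely unlikely"):

```python
def likelihood_term(p):
    """Translate a probability of occurrence (0-1) into IPCC AR5
    likelihood language. The AR5 ranges nest (e.g. "very likely",
    90-100%, sits inside "likely", 66-100%), so we check from the
    most certain term downward to return the narrowest match."""
    if p > 0.99:
        return "virtually certain"
    if p > 0.90:
        return "very likely"
    if p > 0.66:
        return "likely"
    if p >= 0.33:
        return "about as likely as not"
    if p >= 0.10:
        return "unlikely"
    if p >= 0.01:
        return "very unlikely"
    return "exceptionally unlikely"

print(likelihood_term(0.95))  # "very likely"
print(likelihood_term(0.05))  # "very unlikely"
```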

In the fourth assessment report (AR4), the guidance [pdf] roughly calibrated confidence statements to a chance of being correct. I've written the guidance here in terms of p-values, roughly the chance that the observed results are due to coincidence alone (p = 0.10 = 10% chance), though statistical tests producing other measures of confidence were also covered.

[Table: AR4 confidence scale rewritten in terms of p-values]

Describing results by their confidence, rather than by the customary statistical significance, is probably more intuitive to most people. Few general readers readily distinguish between statistical significance, i.e. the results are unlikely to be due to chance, and meaningful significance, i.e. the results matter in some way. Nor are statistical significance conventions well standardised in the scientific literature; they vary widely by field. That said, the IPCC's AR4 guidance threshold for very high confidence is quite low. Many scientific results are only considered reportable at a p-value below 0.05, i.e. less than a 5% chance that the pattern in the data is a coincidence, whereas the AR4 guidance grants a statement of very high confidence to anything with less than a 10% chance of being wrong. Likewise, a 5-in-10 chance of being correct hardly merits a statement of medium confidence in my opinion. Despite these limitations, I think the guidance should merely have been updated to better reflect the statistical reality of confidence; it was a mistake for the AR5 guidance to switch to purely qualitative standards for conveying confidence, based on the table below, with highest confidence in the top right and lowest in the bottom left.

[Table: AR5 qualitative confidence matrix, crossing level of agreement against strength of evidence]
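To make the complaint about AR4's thresholds concrete, here is a minimal sketch of that calibration as I've described it above, taking confidence as roughly 1 − p; the function name and the exact cut-offs are my own reading of the deliberately rough calibration in the guidance:

```python
def ar4_confidence(p_value):
    """Map a p-value (chance the result is a coincidence) onto the
    AR4-style confidence scale, where confidence ~ chance of being
    correct ~ 1 - p."""
    chance_correct = 1.0 - p_value
    if chance_correct >= 0.9:
        return "very high confidence"
    if chance_correct >= 0.8:
        return "high confidence"
    if chance_correct >= 0.5:
        return "medium confidence"
    if chance_correct >= 0.2:
        return "low confidence"
    return "very low confidence"

# The complaint in one line: a conventional p < 0.05 result and a
# marginal p = 0.10 result both earn "very high confidence".
print(ar4_confidence(0.04), "/", ar4_confidence(0.10))
```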

Adoption (and adaptation) of standards like these in regular usage by journalists could do a lot to better the communication of science to a general readership. It would normalise field-variable technical jargon (e.g. sigma significance values in particle physics, p-values in biology) and reduce the need for daft analogies. Results described in this way would be amenable to meaningful comparison by an interested but non-specialist audience, while readers with a little practice in statistics would be no less informed, because the meaning has not been dumbed down.

Edited 2016/06/25 for a better title and to add the comic graphic. Cover design by Norman Saunders (Public Domain).
23 Aug. 2014: typo in first paragraph corrected:

. . . meaningful participation in participating in humanity’s trajectory. . .

References:

Michael D. Mastrandrea et al. Guidance Note for Lead Authors of the IPCC Fifth Assessment Report on Consistent Treatment of Uncertainties. IPCC Cross-Working Group Meeting on Consistent Treatment of Uncertainties. Jasper Ridge, CA, USA, 6–7 July 2010. <http://www.ipcc.ch/pdf/supporting-material/uncertainty-guidance-note.pdf>

IPCC. Guidance Notes for Lead Authors of the IPCC Fourth Assessment Report on Addressing Uncertainties. July 2005. <https://www.ipcc-wg1.unibe.ch/publications/supportingmaterial/uncertainty-guidance-note.pdf>


Much ado about sitting

A few years ago, athletic shoe companies began to cash in on a study or two suggesting that running in shoes was dangerous, guaranteed to ruin your joints and your life, make you less attractive and confident, etc. (at least, that’s how it was translated to press coverage). The only viable answer, vested marketing implied, was to buy a new pair of shoes with less shoe in them.

Despite the obvious irony, consumers flocked to purchase sweet new kicks and rectify their embarrassing running habits. As with any other fitness craze, popular active-lifestyle magazines ran articles spinning a small amount of scientific research into definitive conclusions, right next to advertisements for the shoes themselves. Fast forward to 2014, wherein the maker of arguably the most notorious shoes in the minimalist sector, the Vibram FiveFingers line, has moved to settle a lawsuit alleging that the claimed health benefits of the shoes were not based on evidence. The market frenzy for minimalist footwear appears to have sharply abated. There are even blatant examples of market backlash in the introduction of what could be described as "marshmallow shoes," such as the Hoka line, with even more padding than runners were used to before the barefoot revolution.

An eerily similar phenomenon, market capitalisation on nascent scientific evidence, has appeared around the latest demon threatening our health: sitting. At the bottom of it lies an orogenic marketplace for accessories designed to get workers a bit less semi-recumbent in the workplace. This market was virtually non-existent only a few years ago, yet is now substantial enough to have spawned an entire genre of internet article.

There is even a new term gaining traction for the condition: "sitting disease." I sure hope it's not catching. For now, at least, the term seems to remain quarantined in quotation marks most places it is used.

Many of the underlying articles in science journals are what is euphemistically referred to as survey science. Long generation time, a lack of uniform cultivation standards, and ethical considerations make Homo sapiens a rather poor model organism. Even if survey data were considered reliable (a dubious assumption), they reveal only associations. Even accelerometer studies, like those at the Mayo Clinic, measure activity for only a few weeks. The results cannot tell you that sitting alone causes obesity. An equally fair hypothesis would be that obesity increases the likelihood of staying seated, but that's just called inertia.
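A toy simulation makes the directionality problem plain. In the sketch below (all numbers invented for illustration), one simulated world has sitting driving weight gain and the other has weight driving sitting, yet a one-shot cross-sectional survey of either world yields essentially the same correlation:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# World A: sitting drives weight gain.
sit_a = rng.normal(8, 2, n)                      # hours seated per day
bmi_a = 22 + 0.5 * (sit_a - 8) + rng.normal(0, 1, n)

# World B: higher BMI drives more sitting (reverse causation).
bmi_b = rng.normal(22, 2, n)
sit_b = 8 + 0.5 * (bmi_b - 22) + rng.normal(0, 1, n)

# A cross-sectional survey sees the same association either way.
print(np.corrcoef(sit_a, bmi_a)[0, 1])  # ~0.7
print(np.corrcoef(sit_b, bmi_b)[0, 1])  # ~0.7
```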

Although the studies and their press coverage motivate a burgeoning marketplace for NEAT (non-exercise activity thermogenesis) accessories, they don't actually tell us much in the way of new information: a sedentary lifestyle is unhealthy. Attempts to increase the amount of low-intensity activity throughout the day, such as using a walking desk, are likely to stimulate appetite. Without considering diet (and downplaying the importance of exercise), a standing desk, sitting ball, or occasional walking meeting is unlikely to have tremendous health benefits on its own. And despite the rhetoric, maintaining a smoking habit to break up your sit-time with walks outdoors is probably not an equivalent trade-off. Presenting health management in such an unbalanced, single-variable way seems motivated by trendiness for some, revenue for others, and both for the press. It is not that sitting is actually good for you; it's just myopic to focus solely on that one health factor. As part of a sedentary lifestyle gestalt, yes, it does play a role in promoting ill health. Then again, if you think about it, you probably already knew that before it was cool.


Avoid sensationalist science journalism; consider the sources:
Ford, E.S., and Caspersen, C.J. (2012). Sedentary behaviour and cardiovascular disease: a review of prospective studies. Int J Epidemiol 41, 1338–1353.
Hamilton, M.T., Hamilton, D.G., and Zderic, T.W. (2007). Role of low energy expenditure and sitting in obesity, metabolic syndrome, type 2 diabetes, and cardiovascular disease. Diabetes 56, 2655–2667.
Katzmarzyk, P.T., Church, T.S., Craig, C.L., and Bouchard, C. (2009). Sitting time and mortality from all causes, cardiovascular disease, and cancer. Med Sci Sports Exerc 41, 998–1005.
Rosenkranz, R.R., Duncan, M.J., Rosenkranz, S.K., and Kolt, G.S. (2013). Active lifestyles related to excellent self-rated health and quality of life: cross sectional findings from 194,545 participants in The 45 and Up Study. BMC Public Health 13, 1071.
Rovniak, L.S., Denlinger, L., Duveneck, E., Sciamanna, C.N., Kong, L., Freivalds, A., and Ray, C.A. (2014). Feasibility of using a compact elliptical device to increase energy expenditure during sedentary activities. Journal of Science and Medicine in Sport 17, 376–380.
Schmid, D., and Leitzmann, M.F. (2014). Television Viewing and Time Spent Sedentary in Relation to Cancer Risk: A Meta-analysis. JNCI J Natl Cancer Inst 106, dju098.
Young, D.R., Reynolds, K., Sidell, M., Brar, S., Ghai, N.R., Sternfeld, B., Jacobsen, S.J., Slezak, J.M., Caan, B., and Quinn, V.P. (2014). Effects of Physical Activity and Sedentary Time on the Risk of Heart Failure. Circ Heart Fail 7, 21–27.

“Where is everybody?”


Don’t get too excited about finding E.T. just yet. Get excited about the engineering.

A few days ago NASA held a press conference moderated by NASA Chief Scientist Ellen Stofan. The filtered headline that eventually made its way into the popular consciousness of the internet was that the discovery of extraterrestrial life is a paltry couple of decades away. The ways the conference was parsed into news form ranged from the relatively guarded "NASA scientists say they're closer than ever to finding life beyond Earth" at the LA Times to the more sensational "NASA: ALIENS and NEW EARTHS will be ours inside 20 years" at The Register. As statements go, the former headline is almost unavoidably true given the assumption that humans eventually stumble upon life off-planet, and the latter is only one more over-capitalised word away from being wholly fantastic. Neither actually touches on the content of the press conference.

The conference was partially prompted by the April announcement of the Kepler program's discovery of the Earth-similar Kepler-186f, which happens to reside in the habitable zone of its similarly named parent star. Although Kepler-186f definitely might be sort of a bit more Earth-like, its discovery was only the latest in a list of over 1800 exoplanets posited to exist to date. Although the principal technique for exoplanet discovery, stellar dimming attributable to a planetary transit, is not infallible [paywalled primary source], the continued refinement of modern signal processing for unearthing (heh) exoplanet signatures makes this an exciting time to look skyward.
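For a sense of why transit hunting leans so hard on signal processing, the expected signal is simple geometry: the fractional drop in starlight equals the square of the planet-to-star radius ratio. A back-of-envelope sketch:

```python
# Fractional dimming during a transit is the ratio of disc areas:
# depth = (R_planet / R_star) ** 2
R_SUN = 6.957e5       # km
R_EARTH = 6.371e3     # km
R_JUPITER = 6.9911e4  # km

earth_depth = (R_EARTH / R_SUN) ** 2
jupiter_depth = (R_JUPITER / R_SUN) ** 2

print(f"Earth transiting the Sun:   {earth_depth:.1e}")   # ~8.4e-05, or 84 ppm
print(f"Jupiter transiting the Sun: {jupiter_depth:.1e}")  # ~1.0e-02, or ~1%
```

An Earth analogue dims its star by less than a hundredth of a percent, which is why teasing these signatures out of noisy photometry is the hard part.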

The speakers took a broad view of the progress toward answering the question "are we alone?" John Grunsfeld, Hubble mechanic extraordinaire, emphasised the approach of looking for spectral signals corresponding to bio-signatures with the upcoming James Webb telescope. Of course, the terracentric focus shared by the panel means that NASA plans to look for signals associated with Earth life: water, methane, oxygen, etc. Carl Sagan et al. considered the analogous task of finding such biosignatures on Earth itself. Looking for signs we know to be associated with our own experience of life is our best current guess at what we should be looking for, but there is no guarantee it is the right one. We are no longer too enthralled by the idea of trading arsenate for phosphate, but our own planet hosts enough examples of strange metabolism that we should expect life off-planet to span even more peculiar possibilities. Imagine our chagrin if we spend a few centuries looking for spectral signatures of water before stumbling across hydrophobic biochemistry on Titan.

Many of us may remember the nanobe-laden Martian meteorite ALH84001, which touched off a burst of interest and a flurry of Mars probes in the latter half of the 1990s. Like the coverage of the 100–200 nm fossilised "bacteria" in that meteorite, the tone suggesting imminent discovery of extraterrestrial life (particularly in the sensationalist lay press) serves as little more than hyperbolic rhetoric. If the effect carries over to those with a hand on the purse-strings, so much the better, but don't get too caught up as a member of the scientifically literate and generally curious public. The likelihood of finding life beyond our own planet in a given time span is essentially impossible to predict with no priors, hence the famous Fermi paradox that graces the title of this post. The actual content of the video is much more important than the wanton speculation that fuels its press coverage.

A major advantage of placing the Hubble Space Telescope above the atmosphere was avoiding the optical aberrations generated by atmospheric turbulence. The present state of the art in adaptive optics and signal processing essentially obviates this need, as ground-based telescopes such as Magellan II in Chile can now outperform Hubble in terms of resolution. The James Webb will offer fundamentally novel capabilities in what it can see, with a 6.5 m primary mirror and sensors covering wavelengths from 600 nanometre red light to the mid-infrared at 28 microns.
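For a rough sense of the numbers, diffraction-limited angular resolution scales as wavelength over aperture (the Rayleigh criterion). A quick sketch comparing Hubble's 2.4 m mirror in visible light with the Webb's 6.5 m across its stated range, treating both as ideal diffraction-limited apertures:

```python
import math

def rayleigh_arcsec(wavelength_m, aperture_m):
    """Diffraction-limited angular resolution via the Rayleigh
    criterion, theta = 1.22 * lambda / D, converted to arcseconds."""
    theta_rad = 1.22 * wavelength_m / aperture_m
    return math.degrees(theta_rad) * 3600

print(rayleigh_arcsec(550e-9, 2.4))  # Hubble, visible:      ~0.06 arcsec
print(rayleigh_arcsec(600e-9, 6.5))  # JWST, red end:        ~0.02 arcsec
print(rayleigh_arcsec(28e-6, 6.5))   # JWST, mid-infrared:   ~1.1 arcsec
```

The larger mirror wins at any fixed wavelength, but observing far into the infrared costs resolution, which is part of why these instruments complement rather than replace one another.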

The upcoming TESS survey, described by MacArthur Fellow Sara Seager, will use the same basic technique as the Kepler mission, observing planetary transits, to look for exoplanets. TESS will launch in 2017, slightly in advance of the main attraction of the JWST. Looking for planetary transits has served us well in the past, but direct imaging is the holy grail. To further that goal, Seager described a starshade for occluding bright, planet-hosting stars as part of the New Worlds mission. The design resembles a sunflower rather than a circular shade; the latter would introduce Airy rings from diffraction around its edges. Desert tests of the prototypes have been encouraging so far. The precision engineering of the shade's unfolding is another masterpiece: due to its size, deployment cannot be tested in a terrestrial vacuum chamber, so the engineering must be all the more precise. I could see scale versions of the design as parasols doing quite well in the gift shop.

[Image: artist's concept of the New Worlds Observatory]

Image from NASA via Wikipedia

The natural philosophy that we now call science has roots in the same fundamental questions as "regular" philosophy. "Are we alone?" is really just a proxy for "Where are we, how does it work, and why are we here?" Without any definitive answers on the horizon, I think we can safely say that building the machines that let us explore these questions, and conditioning our minds to think about our universe, is a pretty good way to spend our time. It will be a lonely universe if we find ourselves to be a truly unique example of biogenesis, but not so lonely in the looking.

As for yours truly, I’m looking forward to the “Two Months of Terror” (to quote Grunsfeld), October-December 2018, as the James Webb telescope makes its way to the L2 Lagrange point to unfold and cool in preparation for a working life of precipitous discovery.

Link to video

Panel:
Ellen Stofan - Chief Scientist, NASA
John Grunsfeld - Astrophysicist, former astronaut, Hubble mechanic
Matt Mountain - Director, Space Telescope Science Institute
John Mather - Project Scientist, James Webb telescope; 2006 Nobel laureate in Physics
Sara Seager - Astrophysicist, MIT Principal Investigator; 2013 MacArthur Fellow
Dave Gallagher - Electrical Engineer; Director for Astronomy and Physics, Jet Propulsion Laboratory

Also read up on ESA projects: the Herschel Space Observatory, observing at 60 to 500 microns, and Gaia, a satellite set to use parallax to generate a precise galactic census.
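Gaia's method is refreshingly simple at heart: a star's distance in parsecs is the reciprocal of its annual parallax in arcseconds. A toy calculation (Proxima Centauri's parallax quoted from memory, so treat it as approximate):

```python
def parallax_distance_pc(parallax_arcsec):
    """Distance in parsecs from an annual parallax angle: d = 1 / p."""
    return 1.0 / parallax_arcsec

# Proxima Centauri's parallax is roughly 0.768 arcseconds:
print(parallax_distance_pc(0.768))  # ~1.3 parsecs
# A 1 milliarcsecond parallax puts a star at a kiloparsec:
print(parallax_distance_pc(1e-3))   # 1000 parsecs
```

The census-building challenge is entirely in measuring those tiny angles precisely for a billion stars at once.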

Top image by the author

TANSTAAFL!

[Image: the optimisation triangle]

Ever find yourself wishing for the last microscope you will ever need to buy, the instrument that can view anything at any scale and any speed? It's very tempting to imagine an optical microscope with the diffraction-unlimited resolution of STED, the volumetric imaging speed of light-sheet illumination, and the deep-tissue penetration of multiphoton microscopy, able to do it all in phase and scattering modes without invoking a need for exogenous fluorophores or dyes. Perhaps the gamma-ray microscope of Heisenberg's thought experiment or the tricorder from Star Trek would come close, but unfortunately we are still waiting for the underlying technologies of the latter to mature. In microscopy as in life, optimisation of one capability comes at a trade-off cost in another. Put more plainly: TANSTAAFL (there ain't no such thing as a free lunch).

Earlier in the summer I attended a biophotonics summer school at the University of Illinois at Urbana-Champaign's Beckman Institute (link). At the end of a combination of lab tours, seminar-style lectures, and poster sessions, we were treated to an hour-long presentation by the president of Carl Zeiss, James Sharp. Perhaps you have heard of Zeiss, the company named for its founder, who teamed up with Ernst Abbe in the late 1800s to invent and commercialise modern microscopy. After painting a stark contrast between the present job market and that of days past, with a story of being stuck in his interviewer's office for a day by a locked filing cabinet and errant bell-bottom trousers, Sharp went on to give what essentially amounted to a 45-minute advertisement for Zeiss (spoiler: they are not best friends with Leica) as a company to work for or buy things from. It was an insightful set of slides that emphasised how far I have to go in my own career before I could fathom spending half a million dollars on a microscope. The one insight that will stick with me for the foreseeable future is the imaging optimisation triangle.

Sharp described the triangle as a trade-off among resolution, speed, and depth, but the concept is fairly common, and the third trait is often defined instead by signal-to-noise ratio, or sensitivity. The moral of the story is that all three corners of the triangle can't be optimised simultaneously. All else being equal, STED can't be as fast as wide-field or light-sheet imaging, and nothing can penetrate tissue like 4-photon imaging. Step changes in the underlying technology can raise the watershed for performance across microscope modalities, e.g. new sensor paradigms can improve signal-to-noise regardless of the technique used. Even with marked leaps in innovation, however, you can't have it all at once.

The microscopy triangle is typically invoked as a qualitative illustration of trade-offs. However, the three traits have measurable performance figures, and three corners map easily enough onto three axes. Why not populate a quantitative volume to show the pros and cons of various imaging modalities? Here are a few flagship microscope techniques placed on the quantitative microscopy TANSTAAFL pyramid.

[Figure: flagship microscopy techniques plotted on axes of resolution, speed, and depth penetration]

These are all vastly different techniques, so the minutiae of their strengths are somewhat lost. Given a known volume of desirable specifications for testing a hypothesis, the graph could be populated with the techniques at your disposal and used to inform a decision on which to utilise. More realistically, axes can be added as needed (e.g. for photobleaching or axial resolution), and a single technique, or a set of similar ones, could be compared across different settings, e.g. laser power or sensor used, rather than comparing these vastly different modalities.

Values are approximate and from the following sources:
Multiphoton depth penetration estimated from a talk by Chris Xu of Cornell University, and http://www.jneurosci.org/content/30/28/9341.short
Wide-field: personal estimates
http://www.nature.com/nrm/journal/v15/n5/fig_tab/nrm3786_T2.html
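For anyone who wants to reproduce or extend the plot, here is a minimal matplotlib sketch; the technique specs below are placeholder orders of magnitude of my own choosing, not measured values, so substitute your own numbers:

```python
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (needed on older matplotlib)
import matplotlib.pyplot as plt

# Illustrative placeholder specs only:
# (lateral resolution in nm, imaging speed in frames/s, depth in um)
techniques = {
    "STED":        (50,    1,   50),
    "Light sheet": (300, 100,  500),
    "Multiphoton": (500,  10, 1000),
    "Wide-field":  (250, 100,  100),
}

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
for name, (res, speed, depth) in techniques.items():
    ax.scatter(res, speed, depth)          # one point per modality
    ax.text(res, speed, depth, name)       # label it in the volume
ax.set_xlabel("Lateral resolution (nm)")
ax.set_ylabel("Speed (frames/s)")
ax.set_zlabel("Depth penetration (um)")
plt.show()
```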

The triangle is known by various names, including the imaging triangle, (somewhat ominously) the eternal triangle, or even the "triangle of frustration."

Update 2014/07/09: Typo: “Start Trek” corrected to “Star Trek”