[Comic: guessesAndGivens no. 3]


Transhelminthism:

If you want to find out if a digital nematode is alive, try asking it.

Fancy living in a computer? Contributors to the OpenWorm project aim to make life inside a computer a (virtual) reality. In recent years, various brain projects have focused funding on moonshot science initiatives to map, model, and ultimately understand the human brain: the computer that helps humans to cogito that they sum. These are similar in feel to the Human Genome Project of the late 1990s and early 2000s. Despite the inherent contradictions of the oft-trotted trope that the human brain is the “most complex thing in the universe,” it is indeed quite a complicated machine, decidedly more complex than the human genome. Understanding how it works will take more than mapping every connection, which is akin to knowing every node in a circuit but having no idea what each component is. A multivalent approach at the levels of cells, circuits, connections, and mind offers the most complete picture. OpenWorm coordinator Stephen Larson et al. aim to start by understanding something a little bit simpler: the determinate 302-neuron brain and accompanying body of Caenorhabditis elegans, a soil-dwelling nematode worm that has served as a workhorse in biology for decades.

Genome, Brain

The connectome, a neural wiring diagram of the worm’s brain, has already been mapped. Simulating the worm at the cellular level is the ongoing goal of an open-source software project. The first human genome was sequenced only three years after the first C. elegans genome; a similar pace for full biological simulation in silico would mean that digital humans, or a reasonable facsimile, are possible within our lifetimes. When these simulations of people are able to fool observers, will these entities be alive and conscious? Have rights? Pay taxes? If a digital person claims the validity of their own consciousness, should we take their word for it, or instead ascertain the consciousness of a simulated person by some metric of our own inspection? For answers to questions of existence and sapience we can turn to our own experience (believing as we do that we are conscious entities), and to the venerable history of these questions in science fiction.

Conversation with the chatbot CleverBot (a conversational precursor to intelligent software), 24 December 2014.

In the so-called golden age of science fiction, characters tended to be smart, talented, and capable. Aside from an unnerving lack of faults and weaknesses, the protagonists were fundamentally human. The main difference between the audience and the actors in these stories was access to better technology. But it may be that this vision of a human future is comically (tragically?) myopic. Even our biology has been changing more quickly as civilisation and technologies develop. Add a rate of technological advance that challenges the best-educated humans to keep pace, a speed-up in the rate of change of average meteorological variables, and human-driven selective pressure, and the next century should be interesting, to say the least. When those unobtainyl transferase pills for longevity finally kick in, generational turnover can no longer be counted on to ease adaptation to a step-change in civilisation.

Greg Egan (who may or may not be a computer program) has been writing about software-based people for over two decades. When a human mind is no longer limited to running on a single instance of its native hardware, new concepts such as “local death” and travel by transmission emerge intrinsically. Most characters in novels by writers such as Egan waste little time questioning whether they will still exist if they have to resort to a backup copy of themselves. As in flesh-and-blood humans, persistence of memory plays a key role in the sense of self, but it is not nearly so limited. If a software person splits themselves to pursue two avenues of interest, they may combine their experiences upon reunion, rejoining as a single instance with a transiently bifurcated path. If the two instances of a single person disagree as to their sameness, they may decide to go on as two different people. These simulated people would be unlikely to care (beyond their inevitable battle for civil rights) whether you consider them alive and sapient, any more than the reader is likely to disbelieve their own sapience.

Many of the thought experiments associated with software-based personhood are prompted by a human perception of dubiousness in duplicity: two instances of a person existing at the same time, but not sharing a single experience, don’t feel like the same person. Perhaps as the OpenWorm project develops we can watch carefully for signs of animosity and existential crises among a population of digital C. elegans twinned from the same starting material. We (or our impostorous digital doppelgängers, depending on your perspective) may find out for ourselves what this feels like sooner than we think.

2014-12-29 – Leading comic edited for improved comedic effect

Why it always pays (95% C.I.) to think twice about your statistics

[Photo: IMG_20141208_191145]

The northern hemisphere has just about reached its maximum tilt away from the sun, which means many academics will soon get a few days or weeks off to . . . revise statistics! Winter holidays are the perfect time to sit back, relax, take a fresh, introspective look at the research you may have been doing (and that which you haven’t), and catch up on all that work you were too distracted by work to do. It is a great time to think about the statistical methods in common use in your field and what they actually mean for the claims being made. Perhaps an unusual dedication to statistical rigour will help you become a stellar researcher, a beacon to others in your discipline. Perhaps it will just turn you into a vengefully cynical reviewer. At the least, it should help you make a fool of yourself ever-so-slightly less often.

First, test your humour (a description follows in case you prefer a mundane account to a hilarious webcomic): http://xkcd.com/882/

In the piece linked above, Randall Munroe highlights the low threshold for reporting significant results in much of science (particularly biomedical research), and specifically the way these uncertain results are over-reported and misreported in the lay press. The premise is that researchers perform experiments to determine whether jelly beans of 20 different colours have anything to do with acne. After setting their p-value threshold at 0.05, they find in one of the 20 experiments a statistically significant association between green jelly beans and acne. If I were a PI interviewing applicants for new students or post-docs, I would consider the humour response to this webcomic a good first-hurdle metric.

In Munroe’s comic, the assumption is that jelly beans never have anything to do with acne, so 100% of the statistically significant results are due to chance. Assuming that all of the other results were also reported somewhere in the literature (though they are less likely to be picked up by the sensationalist press), the proportion of reported results that fail to reflect reality would sit at an intuitive and moderately acceptable 0.05, or 5%.
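To make the multiple-comparisons arithmetic concrete, here is a minimal simulation sketch in Python (assuming, as in the comic, that no colour truly has an effect, so every p-value is uniform under the null):

    import random

    random.seed(882)  # arbitrary seed (a nod to xkcd 882)

    n_colours = 20    # jelly bean colours tested
    alpha = 0.05      # significance threshold
    n_runs = 10000    # simulated replications of the whole 20-test study

    hits_per_run = []
    for _ in range(n_runs):
        # Under the null hypothesis each p-value is uniform on [0, 1],
        # so each test is a false positive with probability alpha.
        p_values = [random.random() for _ in range(n_colours)]
        hits_per_run.append(sum(p < alpha for p in p_values))

    mean_hits = sum(hits_per_run) / n_runs
    any_hit = sum(1 for h in hits_per_run if h > 0) / n_runs
    print(f"mean 'significant' colours per study: {mean_hits:.2f}")  # ~1.0
    print(f"fraction of studies with at least one: {any_hit:.2f}")   # ~0.64, i.e. 1 - 0.95**20

So roughly two out of three such studies hand the press a green-jelly-bean headline despite there being nothing to find.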
Let us instead consider a slightly more lab-relevant version:

Consider a situation where some jelly beans do have some relationship to the medical condition of interest: say, 1 in 100 jelly bean variants is actually associated in some way with acne. Let us also swap small molecules for jelly beans and cancer for acne, and keep the same p-value threshold of 0.05. We are unlikely to report negative results where a small molecule has no relationship to the condition. We test 10000 different compounds for some change in a cancer phenotype in vitro.

Physicists may generally wait for 3–6 sigmas of significance before scheduling a press release, but for biologists publishing papers the typical p-value threshold is 0.05. If we use this threshold, perform our experiment, and go directly to press with the statistically significant results, 83.9% of our reported positive findings will be wrong. In the press, a 0.05 p-value is often interpreted as “only a 5% chance of being wrong.” That is certainly not what we see here, but after some thought the error rate is expected and fairly intuitive. Allow me to illustrate, first in symbols and then with numbers.
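A generic sketch of the false-discovery bookkeeping (not Colquhoun’s exact notation): with N compounds tested, a fraction π having a real effect, a p-value threshold α, and power 1 − β, the fraction of reported positives that are false is

    FDR = α(1 − π)N / [ α(1 − π)N + (1 − β)πN ]
        = (0.05 × 0.99) / (0.05 × 0.99 + 0.95 × 0.01)
        ≈ 0.839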

As expected from the conditions of the thought experiment, 1% of these compounds, or 100, have a real effect. Setting our p-value threshold at the widely accepted 0.05, we will also uncover, purely by chance, non-existent relationships between 495 of the compounds (0.05 × 9900 compounds with no effect) and our cancer phenotype of interest. If we assume that the probability of failing to detect a real effect is complementary to that of detecting a fake one (i.e. the power is 0.95), we will pick up 95 of the 100 actual cases we are interested in. Our total positive results will be 495 + 95 = 590, but only 95 of those reflect a real association. The remaining 495/590, or about 83.9%, will be false positives.
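The same bookkeeping as a runnable sketch (Python, using the prevalence, threshold, and power figures assumed above):

    # False discovery rate for the hypothetical compound screen above.
    n_compounds = 10000
    prevalence = 0.01   # 1 in 100 compounds truly affects the phenotype
    alpha = 0.05        # p-value threshold (false positive rate)
    power = 0.95        # assumed complementary to the false positive rate

    real = n_compounds * prevalence           # 100 compounds with a real effect
    null = n_compounds - real                 # 9900 compounds with no effect
    false_pos = alpha * null                  # 495 spurious hits
    true_pos = power * real                   # 95 genuine hits

    fdr = false_pos / (false_pos + true_pos)  # 495 / 590
    print(f"false discovery rate: {fdr:.1%}") # 83.9%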

Such is the premise of a short and interesting write-up by David Colquhoun on false discovery rates [2]. The emphasis is on biological research because that is where the problem is most visible, but the considerations discussed should be of interest to anyone conducting research. On the other hand, let us remember that confidence from technical replicates does not generally translate into confidence in a description of reality: the statistical confidence in the data behind the now-infamous faster-than-light neutrinos from the OPERA detector (http://arxiv.org/pdf/1109.4897v4.pdf) was very high, but the source of the anomaly was instrumentation, and two top figures from the project eventually resigned after overzealous press coverage pushed the experiment into the limelight. Paul Blainey et al. discuss the importance of considering the effect of technical and biological (or, more generally, experimentally relevant) replicates in a recent Nature Methods commentary [3].

I hope the above illustrates my thought that a conscientious awareness of the common pitfalls in one’s own field, as well as in those with which one closely interacts, is important both for slogging through the avalanche of results published every day and for producing brilliant work of one’s own. This requires continued effort beyond an early general study of statistics, but I would suggest it is worth it. To quote [2]: “In order to avoid making a fool of yourself you need to know how often you are right when you declare a result to be significant, and how often you are wrong.”

Reading:

[1] Munroe, Randall. Significant. xkcd. http://xkcd.com/882/

[2] Colquhoun, David. An investigation of the false discovery rate and the misinterpretation of p-values. Royal Society Open Science 1: 140216 (19 November 2014). DOI: 10.1098/rsos.140216. http://rsos.royalsocietypublishing.org/content/1/3/140216

[3] Blainey, Paul, Krzywinski, Martin, and Altman, Naomi. Points of Significance: Replication. Nature Methods 11(9): 879–880 (2014). http://dx.doi.org/10.1038/nmeth.3091

Ant Farm

Some photos from last summer: entomological agriculture on a nice plot of thistle. No Lieberkühn reflector here, just strong summer sunlight, an influence I sorely miss as we near the shortest day in the northern hemisphere and I continue to adjust to a home base in Scotland.

Click through for full resolution.

[Photos: DSC_0776, DSC_0808, DSC_0793, DSC_0649]