LAST MINUTE DIY Pinhole Viewer for Eclipse (4 steps)

Nobody invited you to their Great American Eclipse 2017 party? This is the first you’ve heard of it? Maybe you just forgot to prepare, what with all that procrastination you’ve had to do since the last total eclipse in 1979. Don’t worry! You still have time. You can impress your friends and delight your coworkers with this quick DIY pinhole viewer.

Step 1: Get a hand

Maybe you have one of these lying around the house, or you can borrow one from a friend. Pretty much any model will do, so don’t waste time being too picky.

Step 2: Make hand into pinhole shape

OK here’s the tricky part. Hopefully you’ve managed to keep track of that hand since step 1. Make a pinhole with it. Anticipate. The Great Eclipse is going to be portenting real epic-like any minute now.

Step 3: Arrange the pinhole in between the sun and a flat, light surface.

Alright, here’s the second tricky part. You’ll want to get this dialed in before the eclipse starts. Put the hand between the sun and a viewing surface; I used a sheet of paper. Make sure the hand is still in a pinhole shape. Change the angle and position of the hand until an image of the sun forms on the paper; it might take a few minutes to get everything lined up. The farther the pinhole is from the surface, the larger the projected image, but the harder it is to maintain alignment, and eventually the image is too dim to compete with ambient scattered light. Eclipses happen pretty often if you’re willing to travel, but the next one crossing the lower 48 won’t happen until 2024.
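As an aside, the projection geometry is simple enough to sketch: the image diameter is just the pinhole-to-surface distance times the sun’s angular size (~0.53 degrees), which is the only physical input assumed here.

```python
import math

SUN_ANGULAR_DIAMETER_RAD = math.radians(0.53)  # the sun subtends ~0.53 degrees

def projected_sun_diameter_mm(pinhole_to_surface_mm: float) -> float:
    """Diameter of the pinhole-projected solar image for a given throw distance."""
    return pinhole_to_surface_mm * SUN_ANGULAR_DIAMETER_RAD

# At arm's length (~600 mm) the image is only ~5.5 mm across;
# doubling the distance doubles the image size (but dims it fourfold).
for d in (300, 600, 1200):
    print(f"{d} mm -> {projected_sun_diameter_mm(d):.1f} mm image")
```

That fourfold dimming is why the image eventually loses out to ambient scattered light.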

Step 4: Realize you are not in the continental United States right now.


What if they had put off the LIGO upgrades?

If a neutron star falls into a black hole but no one has upgraded the gravitational observatory to the required sensitivity, does it fail completely to change our view of the universe?

The Advanced Laser Interferometer Gravitational-Wave Observatory (aLIGO) consists of a pair of Fabry–Pérot interferometers spaced about 3000 km apart, each sporting two cavities about 4 km long and sensitive to length changes smaller than the width of a proton. The tubes containing the optics operate at a vacuum with about 10 times lower pressure than that experienced by the International Space Station in low earth orbit. The lasers inject on the order of a hundred watts, and the circulating power in the cavities builds to hundreds of kilowatts as each photon reflects off of the test masses and back several hundred times. Each 40 kg test mass is balanced precariously on threads of glass thinner than things that are really rather thin already. In other words, it’s a huge friggin’ laser powerful enough to burn a burrito, with components precariously balanced in an inside-out space ship.

On the 14th of September 2015, these instruments recorded measurements that would support the idea that spacetime changes size when masses accelerate. We usually refer to the instruments and all aspects of the research program supporting it by the same acronym: LIGO. Perhaps you’ve heard of it?

Although the colloquial story is that LIGO recorded the historic GW150914 gravitational wave event during an engineering run even before beginning formal scientific data collection, this isn’t strictly true. In fact LIGO had been performing science runs at the Hanford and Livingston sites since 2002. In 2005, LIGO reached its original design sensitivity: strain detection on the order of one part in 10^21. Another way to think about, and the common way to report, the sensitivity of the instruments is the distance at which a typical binary neutron star inspiral could nominally be detected. One part in 10^21 strain sensitivity corresponds to a search distance of about 8 million parsecs (about 26 million light years). This was the sort of sensitivity LIGO was capable of up until the latter part of 2010. As impressive as that is, there were no gravitational wave detections during operation of LIGO from 2002 to 2010.

The now famous GW150914 and subsequent detections GW151226 and GW170104 came after a comprehensive suite of upgrades that boosted sensitivity to a search distance of 80 million parsecs (about 260 million light years). Four years of shutdown beginning in 2010 marked the transition from “initial LIGO” to “advanced LIGO” (aLIGO). Four years sounds like quite a while in human time, and an especially conservative experimenter might be tempted to keep collecting data until proof-of-concept is established. As long as the machine is working in some rudimentary fashion, pushing to eke out just one detection before shutting down for risky upgrades might sound like it makes sense. So what would LIGO have found if it had put off the upgrades and continued with science runs instead? Not much, as it turns out.

Our best guess for the frequency of observable events is based on what aLIGO picked up in the first science run. The first advanced run had about 1100 hours of uptime, time when both instruments were locked in and active. During this run aLIGO picked up 2 confirmed events (and one candidate event, yet unconfirmed), giving us a rate of 2 events per 1100 hours in a volume of about 2.1 million cubic megaparsecs (the search volume for an 80 megaparsec detection distance). This leads us to expect 1 detection for every 22.9 days of run time, or about 16 detections per year, not considering instrument downtime.

Had we prioritized data collection at the cost of forgoing upgrades, we would probably still be waiting on the big announcement. Operating at the pre-upgrade sensitivity of 8 megaparsecs, we could expect a detection on average once every 62 years. Assuming a Poisson distribution (events arrive independently at random), the chances of one or more detections in 4 years of data collection at pre-aLIGO sensitivity would be just a tick over 6%. For a 50/50 chance of making a detection, we’d have to wait 44 years. Chances are, funding bodies would lose interest in that time, and we certainly would not have seen the international enthusiasm for gravitational wave research that followed the GW150914 announcement.
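The arithmetic behind those numbers is short enough to sketch directly (all inputs come from the figures above; the only modeling assumption is the Poisson one, i.e. events arrive independently at a constant rate):

```python
import math

# From the text: 2 confirmed events in ~1100 hours of double-coincident
# uptime during aLIGO's first observing run, at an 80 Mpc search distance.
rate_per_day_80mpc = 2 / 1100 * 24          # ~1 detection per 22.9 days

# Search volume scales as distance cubed, so initial LIGO's 8 Mpc reach
# means a (80/8)**3 = 1000-fold lower event rate.
rate_per_year_8mpc = rate_per_day_80mpc * 365 / (80 / 8) ** 3

mean_wait_years = 1 / rate_per_year_8mpc              # ~63 years between events
p_detect_4yr = 1 - math.exp(-rate_per_year_8mpc * 4)  # Poisson P(at least one)
t_5050_years = math.log(2) / rate_per_year_8mpc       # median wait, ~44 years

print(f"mean wait: {mean_wait_years:.0f} yr")
print(f"P(detection in 4 yr): {p_detect_4yr:.1%}")
print(f"50/50 wait: {t_5050_years:.0f} yr")
```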

The moral of the story? The difference between being “productive” and creating something great lies in the old “work smarter, not harder” paradigm. Blind diligence and the perseverance to keep on plugging away have little chance of pushing the boundaries of what is known to be possible.

Curious about any of the calculations discussed above? Tinker with my notes in this Jupyter notebook.

I too built a rather decent deep learning rig for 900 quid

Skip to the components list
Skip to the benchmarks

Robert Heinlein’s 1957 The Door into Summer returns throughout to a theme of knowing when it’s “time to railroad.” Loosely speaking, this is the idea that one’s success comes as much from historical context as it does from innate ability, hard work, and luck (though much of the latter can be attributed to historical context).

Many of the concepts driving our modern AI renaissance are decades old at least, but the field lost steam: computers were too slow, and the Amazookles of the world had yet to use neural networks to power their recommendation engines and so on. In the meantime computers have gotten much faster and much better at beating humans at very fancy games. Modern computers are now fast enough to make deep learning feasible, and it works for many problems, as well as providing insight into how our own minds might work.

I too have seen the writing on the wall in recent years. I can say with some confidence that now is the time to railroad, and by “railroad” I mean revolutionise the world with artificial intelligence. A lot of things changed in big ways during the original “time to railroad,” the industrial revolution. For some this meant fortune and progress; for others, ruin. I would like to think that we are all a bit brighter than our old-timey counterparts were back then, and we have the benefit of our history to learn from, so I’m rooting for an egalitarian utopia rather than an AI apocalypse. In any case, collective stewardship of the sea changes underway is important, and that means the more people who learn about AI, the less likely it is that the future will be decided solely by the technocratic elites of today.

I’ve completed a few MOOCs on machine learning in general and neural networks in particular, coded up some of the basic functions from scratch and I’m beginning to use some of the major libraries to investigate more interesting ideas. As I moved on from toy examples like MNIST and housing price prediction one thing became increasingly clear:

It took me a week of work to realize I was totally on the wrong track training a vision model (meant to mimic cuttlefish perception) on my laptop. This sort of wasted time really adds up, so I decided to go deeper and build my own GPU-enhanced deep learning rig.

Luckily there are lots of great guides out there as everyone and their grandmother is building their own DL rig at the moment. Most of the build guides have something along the lines of “. . . for xxxx monies” in the title, which makes it easier to match budgets. Build budgets run the gamut from the surprisingly capable $800 machine by Nick Condo to the serious $1700 machine by Slav Ivanov all the way up to the low low price of “under $5000” by Kunal Jain. I did not even read the last one because I am not made of money.

I am currently living in the UK, so that means I have to buy everything in pounds. The prices for components in pounds sterling are. . . pretty much the same as they are in greenbacks. The exchange rate to the British pound can be a bit misleading, even now that Brexit has crushed the pound sterling as well as our hopes and dreams. In my experience it seems like you can buy about the same for a pound at the store as for a dollar in the US or a euro on the continent. It seems like the only thing they use the exchange rate for is calculating salaries.

I’d recommend first visiting Tim Dettmers’ guide to choosing the right GPU for you. I’m in a stage of life where buying the “second cheapest” appropriate option is usually best. With a little additional background reading and following Tim’s guide, I selected the Nvidia GTX 1060 GPU with 6GB of memory. This was from Tim’s “I have little money” category, one up from the “I have almost no money” category, and in keeping with my life philosophy of the second-cheapest. Going to the next tier up is often close to twice as costly, but not close to twice as good. This holds for my choice of GPUs as well: a single 1070 is about twice the cost and around 50% or so faster than a 1060. However, two 1060s do get you pretty close to twice the performance, and that’s what I went with. As we’ll see in the benchmarks, Tensorflow does make it pretty easy to take advantage of both GPUs, but doubling the capacity of my DLR by doubling the GPUs in the future won’t be possible.
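To put rough numbers on the second-cheapest philosophy, here is a quick value sketch. Prices are rounded from the component list below; the relative speeds are assumptions based on the figures quoted above (a 1070 at ~1.5x a 1060, two 1060s at ~1.9x), not measurements:

```python
# Back-of-envelope price-performance for the GPU options discussed above.
# Speeds are rough assumptions from the text, normalized to one GTX 1060.
options = {
    "one 1060":  {"cost_gbp": 206, "rel_speed": 1.0},
    "one 1070":  {"cost_gbp": 412, "rel_speed": 1.5},  # ~2x cost, ~50% faster
    "two 1060s": {"cost_gbp": 412, "rel_speed": 1.9},  # near-linear scaling
}

value = {name: o["rel_speed"] / o["cost_gbp"] for name, o in options.items()}
for name, v in sorted(value.items(), key=lambda kv: -kv[1]):
    print(f"{name:>10}: {v * 1000:.1f} relative speed per £1000")
```

Under these assumptions the 1070 is the worst value per pound, while two 1060s nearly preserve the 1060’s value while doubling absolute throughput.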

My upgradeability is somewhat limited by the number of threads (4) and PCIe lanes (16) of the modest i3 CPU I chose; if a near-term upgrade were a higher priority, I would have left out the second 1060 GPU and diverted that part of the budget to a better CPU (e.g. the Intel Xeon E5-1620 V4 recommended by Slav Ivanov). But if you’re shelling out that much for a higher-end system you’ll probably want a bigger GPU to start with, and it’s easy to see how one can go from a budget of $800 to $1700 rather quickly.

The rest of the computer’s job is to quickly dump data into the GPU memory without messing things up. I ended up using almost all the same components as those in Nick’s guide because, again, my physical makeup is meat rather than monetary in nature.

Here’s the full list of components. I sourced what I could from Amazon Warehouse Deals to try and keep the cost down.

GPU (x2): Gigabyte Nvidia GTX 1060 6GB (£205.78 each)
Motherboard: MSI Intel Z170 KRAIT-GAMING (£99.95)
CPU: Intel Core i3 6100 Skylake Dual-Core 3.7 GHz Processor (£94.58)
Memory: Corsair CMK16GX4M2A2400C14 Vengeance 2x8GB (£105.78)
PSU: Corsair CP-9020078-UK Builder Series 750W CS750M ATX/EPS Semi-Modular 80 Plus Gold Power Supply Unit (£77.25)
Storage: SanDisk Ultra II SSD 240 GB SATA III (£72.18)
Case: Thermaltake Versa H23 (£27.10)

Total: £888.40

I had never built a PC before and didn’t have any idea what I was doing. Luckily, YouTube did, and I didn’t even break anything when I slotted all the pieces together. I had an install thumb drive for Ubuntu 16.04 hanging around ready to go, so I was up and running relatively quickly.

The next step was installing the drivers and CUDA developer’s toolkit for the GPUs. I’ve been working mainly with Tensorflow lately, so I followed their guide to get everything ready to take advantage of the new setup. I am using Anaconda to manage Python environments for now, so I made one with tensorflow and another with tensorflow_gpu packages.

I decided to train on the CIFAR10 image classification dataset using this tutorial to test out the GPUs. I also wanted to see how fast training progresses on a project of mine, a two-category classifier for quantitative phase microscope images.

The CIFAR10 image classification tutorial from Tensorflow was perfect because you can flag for the training to take place on one or two GPUs, or train on the CPU alone. It takes ~1.25 hours to train the first 10000 steps on the CPU, but only 4 minutes for the same training on one 1060. That’s a weeks-to-days, days-to-hours, hours-to-minutes level of speedup.

# CPU 10000 steps
2017-06-18 12:56:38.151978: step 0, loss = 4.68 (274.9 examples/sec; 0.466 sec/batch)
2017-06-18 12:56:42.815268: step 10, loss = 4.60 (274.5 examples/sec; 0.466 sec/batch)

2017-06-18 14:12:50.121319: step 9980, loss = 0.80 (283.0 examples/sec; 0.452 sec/batch)
2017-06-18 14:12:54.652866: step 9990, loss = 1.03 (282.5 examples/sec; 0.453 sec/batch)

# One GPU
2017-06-18 15:50:16.810051: step 0, loss = 4.67 (2.3 examples/sec; 56.496 sec/batch)
2017-06-18 15:50:17.678610: step 10, loss = 4.62 (6139.0 examples/sec; 0.021 sec/batch)
2017-06-18 15:50:17.886419: step 20, loss = 4.54 (6197.2 examples/sec; 0.021 sec/batch)

2017-06-18 15:54:00.386815: step 10000, loss = 0.96 (5823.0 examples/sec; 0.022 sec/batch)

# Two GPUs
2017-06-25 14:48:43.918359: step 0, loss = 4.68 (4.7 examples/sec; 27.362 sec/batch)
2017-06-25 14:48:45.058762: step 10, loss = 4.61 (10065.4 examples/sec; 0.013 sec/batch)

2017-06-25 14:52:28.510590: step 6000, loss = 0.91 (8172.5 examples/sec; 0.016 sec/batch)

2017-06-25 14:54:56.087587: step 9990, loss = 0.90 (6167.8 examples/sec; 0.021 sec/batch)

That’s roughly a 22x speedup on one GPU and up to ~36x on two. It’s not quite double the speed on two GPUs because the model isn’t big enough to fully utilize both, as we can see in the output from nvidia-smi.
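For the record, here is the arithmetic behind those speedup figures, using peak examples/sec values pulled from the logs above (sustained rates drift a bit over a run):

```python
# Throughput figures from the training logs above (examples/sec)
cpu = 283.0        # CPU-only, late in the run
one_gpu = 6197.2   # single GTX 1060, peak
two_gpu = 10065.4  # dual GTX 1060s, peak

print(f"one GPU:  {one_gpu / cpu:.1f}x over CPU")
print(f"two GPUs: {two_gpu / cpu:.1f}x over CPU")
print(f"scaling:  {two_gpu / one_gpu:.2f}x from adding the second card")
```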

# Training on one GPU

# Training on two GPUs

My own model had a similar speedup, going from training about one 79-image minibatch per second to training more than 30 per second. Trying to train this model on my laptop, a Microsoft Surface Book, I was getting about 0.75 steps a second. [Aside: the laptop does have a discrete GPU, a variant of the GeForce 940M, but no Linux driver that I’m aware of :/].

# Training on CPU only
INFO:tensorflow:global_step/sec: 0.981465
INFO:tensorflow:loss = 0.673449, step = 173 (101.889 sec)
INFO:tensorflow:global_step/sec: 0.994314
INFO:tensorflow:loss = 0.64968, step = 273 (100.572 sec)

# Dual GPUs
INFO:tensorflow:global_step/sec: 30.3432
INFO:tensorflow:loss = 0.317435, step = 90801 (3.296 sec)
INFO:tensorflow:global_step/sec: 30.6238
INFO:tensorflow:loss = 0.272398, step = 90901 (3.265 sec)
INFO:tensorflow:global_step/sec: 30.5632
INFO:tensorflow:loss = 0.327474, step = 91001 (3.272 sec)
INFO:tensorflow:global_step/sec: 30.5643
INFO:tensorflow:loss = 0.43074, step = 91101 (3.272 sec)
INFO:tensorflow:global_step/sec: 30.6085

Overall I’m pretty satisfied with the setup, and I’ve got a lot of cool projects to try out on it. Getting the basics for machine learning is pretty easy with all the great MOOCs and tutorials out there, but the learning curve slopes sharply upward after that. Working directly on real projects with a machine that can train big models before the heat-death of the universe is essential for gaining intuition and tackling cool problems.

A Skeptic Over Coffee: Young Blood Part Duh

Does this cloudy liquid hold the secret to vitality in your first 100 years and beyond? I can’t say for sure that it doesn’t. What I can say is that I would happily sell it to you for $8,000.

Next time someone tries to charge you a premium to intravenously imbibe someone else’s blood plasma, you have my permission to tell them no thanks. Unless there’s a chance that it is fake, then it might be worth doing.

Californian company Ambrosia LLC has been making the rounds in publications like the New Scientist hype-machine, promoting claims that their plasma transfusions show efficacy at treating symptomatic biomarkers of aging. Set up primarily to profit from rich people by exploiting younger, poorer people on the off chance that the Precious Bodily Fluids of the latter will invigorate the former, the small biotech firm performed a tiny study of over-35s receiving blood plasma transfusions from younger people. It’s listed on and everything.

First of all, to determine the efficacy of a treatment it’s important that both the doctors and the patients are blinded to whether they are administering/being administered the active therapeutic. That goes all the way up the line from the responsible physician to the phlebotomist to the statistician analyzing the data. But to blind patients and researchers the study must include a control group receiving a placebo treatment, which in this case there was not. So it’s got that going for it.

To be fair, this isn’t actually bad science. For that to be true, it would have to be actual science. Not only does a study like this require a control to account for any placebo effect*, but the changes reported for the various biomarkers may be well within common fluctuations.

Finally, remember that if you assess 20 biomarkers with the common significance cutoff of p = 0.05, on average one of the twenty will show a statistically significant difference from baseline by chance alone. That’s what a cutoff at that level means: under the null hypothesis, each test has a 1 in 20 chance of crossing it through random chance. Quartz reports the Ambrosia study looked at about 100 different biomarkers and mentions positive changes in 3 of them. I don’t know if they performed statistical tests at a cutoff level of 0.05, but if so you should expect on average 5 of 100 biomarkers in a screen to show a statistical difference. This isn’t the first case of questionable statistics selling fountain of youth concepts.
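The multiple-comparisons arithmetic is worth spelling out. Assuming 100 independent tests at a 0.05 cutoff (an idealization; real biomarkers are correlated), the count of false positives is binomial, and 3 “hits” is thoroughly unimpressive:

```python
from math import comb

n, alpha = 100, 0.05
expected_false_positives = n * alpha  # 5 spurious "significant" results expected

# Probability of seeing at least 3 positives from noise alone:
# 1 minus the binomial probability of 0, 1, or 2 false positives.
p_at_least_3 = 1 - sum(
    comb(n, k) * alpha**k * (1 - alpha) ** (n - k) for k in range(3)
)
print(f"expected false positives: {expected_false_positives:.0f}")
print(f"P(3 or more by chance):   {p_at_least_3:.2f}")
```

So even with zero real effect, a 100-biomarker screen would produce 3 or more “positive changes” the vast majority of the time.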

All of this is not to say that the experiments disprove the positive effects of shooting up teenage PBFs. It also generated zero conclusive evidence against the presence of a large population of English teapots in erratic orbits around Saturn.

You could conclude by saying “more scientific investigation is warranted” but that would imply the work so far was science.

* The placebo effect can even apply to as seemingly objective a treatment as surgery. Take this 2013 study that found no statistical difference in the outcomes of patients with knee problems treated with either arthroscopic surgery or a surgeon pretending to perform the surgery.


​What the cornerstone of any futuristic transportation mix should be.

The future has always promised exciting new forms of transport for the bustling hither and thither of our undoubtedly jumpsuit-wearing, cyborg selves. From the outlandish (flying cars) to the decidedly practical (electric cars), a better way of getting about is always just around the corner. Workers in the United States spend about 26 minutes twice a day on their commutes, and for most people this means driving. What’s worse, the negative effect of a long commute on life satisfaction is consistently underestimated. Premature deaths in the United States due to automobile accidents and air pollution from vehicles number about 33,000 and an estimated 58,000 yearly, respectively. Add in all the costs associated with car ownership and road maintenance (not to mention the incalculable cost of automobiles’ contribution to the potentially existential threat of climate change) and the picture becomes clear: cars aren’t so much a convenient means of conveyance serving the humans they carry as a demanding taskmaster that may be the doom of us all. There must be something better awaiting us in the transportation wonders of tomorrow.

What if we came up with a transportation mode that is faster than taking the bus, costs less than driving, and improves lifespan? What if it also happened to be the most efficient means of transport known? Anything offering up that long list of pros should be a centerpiece of any transportation blend. What wonder of future technology could I possibly be talking about?

I’m writing, of course, about the humble bicycle.

Prioritizing exotic transportation projects like Elon Musk’s hyperloop is like inventing a new type of ladder to reach the highest branches, all the while surrounded on all sides by drooping boughs laden with low-hanging fruit. In a great example of working harder, not smarter, city planners in the U.S. strive tirelessly to please our automobile overlords. Everyone needs a car to get to work and the supermarket, because everything is far apart, and everything is so far apart because everyone drives everywhere anyway. All the parking spaces and wide lanes push everything even further apart in a commuting nightmare feedback loop.

It doesn’t have to be that way, and it’s not too late to change. Consider the famously bikeable infrastructure of the Netherlands, where the bicycle is king. Many people take the purpose-built bike lanes for granted and assume they’ve always been there, but in fact they are the result of deliberate activism leading to a broad change in transportation policy beginning in the seventies. Likewise, the servile relationship many U.S. cities maintain with cars is not set in stone, and, contrary to popular belief, fuel taxes and registration fees don’t come close to covering the costs of the roads.

Even if every conventional automobile were replaced tomorrow with a self-driving electric car, a bicycle would still be the more efficient choice. The reason comes down to simple physics: a typical bike’s ~10 kg is a fraction of the mass of the average rider, so most of the energy delivered to the pedals goes toward moving the human cargo. A car (even a Tesla) has to waste most of its energy moving the car itself. The only vehicle that has a chance of besting the bicycle in terms of efficiency is an electric-assist bicycle, once you factor in the total energy costs of producing and shipping the human fuel (food), but even that depends on where you buy your groceries [pdf].
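The physics argument fits in two lines of arithmetic. The masses here are illustrative assumptions (a ~70 kg rider, the ~10 kg bike from the text, and roughly 1850 kg for a large electric sedan):

```python
def payload_fraction(vehicle_kg: float, rider_kg: float = 70.0) -> float:
    """Fraction of the total moved mass that is the human being transported."""
    return rider_kg / (vehicle_kg + rider_kg)

# ~88% of a bicycle+rider's mass is payload; for a car it's a few percent,
# so most of the car's energy goes into hauling the car.
print(f"bicycle: {payload_fraction(10):.0%}")
print(f"car:     {payload_fraction(1850):.0%}")
```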

Bicycles have been around in more or less modern form for over a hundred years, but the right tool isn’t necessarily the newest. The law of parsimony posits that the simplest solution that suffices is generally the best, and for many of our basic transport needs that means a bicycle. It’s about time we started affording cycling the respect it deserves as a central piece of our future cities and towns. Your future transportation experience may mean you’ll go to the office in virtual reality, meet important clients by hybrid dirigible, and ship supplies to Mars by electric rocket, but you’ll pick up the groceries by bicycle on the way home from the station.

Image sources used for illustrations:

Fat bike CC SA BY Sylenius

Public Domain:

Tire tracks

Lunar lander module

Apollo footprint

Trolling a Neural Network to Learn About Color Cues

Neural networks are breaking into new fields and refining roles in old ones on a day-to-day basis. The main enabling breakthrough in recent years is the ability to efficiently train networks consisting of many stacked layers of artificial neurons. These deep learning networks have been used for everything from tomographic phase microscopy to learning to generate speech from scratch.

A particularly fun example of a deep neural net comes in the form of @ColorizeBot, a twitter bot that generates color images from black and white photographs. For landscapes, portraits, and street photography the results are reasonably realistic, even if they do fall into an uncanny valley that is eerie, striking, and often quite beautiful. I decided to try to trick @ColorizeBot to learn something about how it was trained and regularized, and maybe gain some insights into general color cues. First, a little background on how @ColorizeBot might be put together.

According to the description on @ColorizeBot’s Twitter page:

I hallucinate colors into any monochrome image. I consist of several ConvNets and have been trained on millions of images to recognize things and color them.

This tells us that CB is indeed an artificial neural network with many layers, some of which are convolutional. Convolutional layers share weights and give deep learning the ability to discover features from images, rather than relying on the conventional machine vision approach of manually extracting image features to train an algorithm. This gives CB the ability to discover important indicators of color that its handler wouldn’t necessarily have thought of in the first place. I expect CB was trained as a special type of autoencoder. Normally, an autoencoding neural network has the same data on both the input and output side and iteratively tries to reproduce the input at the output in an efficient manner. In this case, instead of producing a single grayscale image at the output, the network would need to produce three versions, one image each for the red, green, and blue color channels. Of course, it doesn’t make sense to totally throw away the structure of the black and white image, and the way the authors include this a priori knowledge to inform the output must have been important for getting the technique to work well and fast. CB’s twitter bio claims it trained on millions of photos, and I tried to trick it into making mistakes and revealing something about its inner workings and training data. To do this, I took some photos I thought might yield interesting results, converted them to grayscale, and sent them to @ColorizeBot.

The first thing I wanted to try is a classic teaching example from black and white photography. If you have ever thought about dusting off a vintage medium format rangefinder and turning your closet into a darkroom, you probably know that a vibrant sun-kissed tomato on a bed of crisp greens looks decidedly bland on black and white film. If one wishes to pursue the glamorous life of a hipster salad photographer, it’s important to invest in a few color filters to distinguish red and green. In general, red tomatoes and green salad leaves have approximately the same luminance (i.e. brightness) values. I wrote about how this example might look through the unique eyes of cephalopods, which can perceive color with only one type of photoreceptor. Our own visual system distinguishes the two types of object mainly by their color, but if a human viewer looks at a salad in a dark room (what? midnight is a perfectly reasonable time for salad), they can still tell what is and is not a tomato without distinguishing the colors. @ColorizeBot interprets a B&W photo of cherry tomatoes on spinach leaves as follows:


This scene is vaguely plausible. After all, some people may prefer salads with unripe tomatoes. Perhaps meal-time photos from these people’s social media feeds made it into the training data for @ColorizeBot. What is potentially more interesting is that this test image revealed a spatial dependence: the tomatoes in the corner were correctly filled in with a reddish hue, while those in the center remain green. Maybe this has something to do with how salad images used to train the bot were framed. Alternatively, it could be that the abundance of leaves surrounding the central tomatoes provides a confusing context, and CB is used to recognizing more isolated round objects as tomatoes. In any case it does know enough to guess that spinach is green and some cherry tomatoes are reddish.

Next I decided to try and deliberately evoke evidence of overfitting with an Ishihara test. These are the mosaic images of dots with colored numbers written in the pattern. If @ColorizeBot scraped public images from the internet for some of its training images, it probably came across Ishihara tests. If the colorizer expects to see some sort of numbers (or any patterned color variation) in a circle of dots that looks like a color-blindness test, it’s probably overfitting; the black and white image by design doesn’t give any clues about color variation.


That one’s a pass. The bot filled in the flyer with a bland brown coloration, but didn’t overfit by dreaming up color variation in the Ishihara test. This tells us that even though there’s a fair chance the neural net has seen an image like this before, it doesn’t expect one every time it sees a flat pattern of circles. CB has also learned to hedge its bets when looking at a box of colored pencils, which could conceivably be a box of brown sketching pencils.


What about a more typical type of photograph? Here’s an old truck in some snow:


CB managed to correctly interpret the high-albedo snow as white (except where it was confused by shadows), and, although it made the day out to be a bit sunnier than it actually was, most of the winter grass was correctly interpreted as brown. But have a look at the right hand side of the photo, where apparently CB decided the seasons changed to a green spring in the time it takes to scan your eyes across the image. This is the sort of surreal, uncanny effect that CB is capable of. It’s more pronounced, and sometimes much more aesthetic, in some of the fancier photos on CB’s Twitter feed. The seasonal transformation from one side of the photo to the other tells us something about the limits of CB’s interpretation of context.

In a convolutional neural network, each part of an input image is convolved with kernels of a limited size, and the influence of one part of the image on its neighbors is limited to some degree by the size of the largest kernels. You can think of these convolutional kernels as smaller sub-images that are applied to the full image as a moving filter, and they are a foundational component of the ability of deep neural networks to discover features, like edges and orientations, without being explicitly told what to look for. The results of these convolutional layers propagate deeper through the network, where the algorithm can make increasingly complex connections between aspects of the image.
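To make the moving-filter idea concrete, here is a toy pure-Python 2D convolution (a stand-in for the optimized kernels a real framework uses). A vertical-edge kernel responds only where the image changes, which illustrates how each output value “sees” just a small local neighborhood:

```python
# Toy 2D convolution: a kernel slides over the image, so each output pixel
# depends only on a small neighborhood -- one reason a convnet's sense of
# context is local unless deeper layers stitch the pieces together.
def convolve2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# A half-dark image and a vertical-edge (Sobel-style) kernel: the response
# is zero in the flat region and strong right where the 0s meet the 1s.
image = [[0, 0, 0, 1, 1]] * 4
sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
print(convolve2d(image, sobel_x))  # [[0, 4, 4], [0, 4, 4]]
```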

In the snowy truck and in the tomato/spinach salad examples, we were able to observe @ColorizeBot’s ability to change its interpretation of the same sort of objects across a single field of view. If you, fellow human, or myself see an image that looks like it was taken in winter, we include in our expectations “This photo looks like it was taken in winter, so it is likely the whole scene takes place in winter, because that’s how photographs and time tend to work.” Likewise, we might find it strange for someone to have a preference for unripe tomatoes, but we’d find it even stranger for someone to prefer a mixture of ripe-ish and unripe tomatoes on the same salad. Maybe the salad maker was an impatient type suffering from a tomato shortage, but given a black and white photo that wouldn’t be my first guess about how it came to be, based on the way most of the salads I’ve seen throughout my life have been constructed. In general we don’t see deep neural networks like @ColorizeBot generalizing that far quite yet, and the resulting sense of context can be limited. This is different from generative networks like Google’s “Inception” or style transfer systems like, which perfuse an entire scene with a cohesive theme (even if that theme is “everything is made of duck’s eyes”).

Finally, what does CB think of theScinder’s logo image? It’s a miniature magnetoplasmadynamic thruster built out of a camera flash and magnet wire. Does CB have any prior experience with esoteric desktop plasma generators?


That’ll do CB, that’ll do.

Can’t get enough machine learning? Check out my other essays on the topic

@ColorizeBot’s Twitter feed

@CtheScinder’s Twitter feed

All the photographs used in this essay were taken by yours truly, (at, and all images were colorized by @ColorizeBot.

And finally, here’s the color-to-B&W-to-color transformation for the tomato spinach photo:


Journalistic Phylogeny of the Silicon Valley Apocalypse

For some reason, doomsday mania is totally in this season.

In 2014 I talked about the tendency of internet writers to regurgitate the press release for trendy science news. The direct lineage from press release to press coverage makes it easy for writers to phone it in: university press offices essentially hand out pre-written sensationalist versions of recent publications. It’s not surprising that with so much of the resulting material in circulation taking text verbatim from the same origin, it is possible to visualize the similarities as genetic sequences in a phylogenetic tree.

Recently the same sort of journalistic laziness reared its head in stories about the luxury doomsday prepper market. Evan Osnos at The New Yorker wrote an article describing the trend in Silicon Valley of buying up bunkers, bullets, and body armor: they think we’ll all soon rise up against them following the advent of A.I. Without a press release to serve as a ready-made template, other outlets turned to reporting on the New Yorker story itself as if it were a primary source. This is a bit different from copying down the press release as your own, and the inheritance is not as direct. If anything, this practice is even more hackneyed. At least a press office puts out its releases with the intention that the text serves as material for coverage, so that the topic gets as much circulation as possible. Covering another story as a primary source, rather than writing an original commentary or rebuttal, is just a way to skim traffic off a trend.

In any case, I decided to subject this batch of articles to my previous workflow: converting the text to a DNA sequence with DNA writer by Lensyl Urbano, aligning the sequences with MAFFT and/or T-Coffee Expresso, and using the distances from the alignment to make a tree in Here’s the result:


Heredity isn’t as clear-cut as it was when I looked at science articles: there’s more remixing in this case and we see that in increased branch distances from the New Yorker article to most of the others. Interestingly, there are a few articles that are quite close to each other, much more so than they are to the New Yorker article. Perhaps this rabbit hole of quasi-plagiarism is even deeper than it first appears, with one article covering another article about an article about an article. . .

In any case, now that I’ve gone through this workflow twice, the next time I’ll be obligated to automate the whole thing in Python.
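A minimal sketch of what that automation might look like, skipping the DNA detour and comparing texts directly with Python’s standard library. The article snippets below are hypothetical stand-ins for the scraped texts; the printed matrix is the sort of thing a neighbor-joining tree builder could consume:

```python
# Compare article texts pairwise and print a distance matrix.
from difflib import SequenceMatcher

articles = {
    "new_yorker": "silicon valley billionaires are buying bunkers and body armor",
    "copycat_a": "silicon valley billionaires buying up bunkers and body armor",
    "copycat_b": "tech billionaires are buying bunkers ahead of the apocalypse",
}

def distance(a: str, b: str) -> float:
    """1 - similarity ratio: 0.0 for identical texts, approaching 1.0 for unrelated."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()

names = list(articles)
for x in names:
    row = "  ".join(f"{distance(articles[x], articles[y]):.2f}" for y in names)
    print(f"{x:>10}: {row}")
```

Close copies land near zero, and genuine rewrites drift toward one, mirroring the branch lengths in the tree above.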

You can tinker with the MAFFT alignment, at least for a while, here:

My tree:
