Dan Macnish wants to help people overcome fears about deep learning and artificial intelligence by using a camera that turns photos into cartoons.
We all know how cameras work. You line up your shot, fiddle with the focus, set the shutter speed, and – flash! – you get a perfect image of whatever is in front of you. With Dan Macnish’s camera, however, the process and results are a little different. Instead of outputting a faithful photograph, it attempts to turn what it sees into a cartoon doodle, instantly spitting it out on to thermal paper like a quirky Polaroid picture.
This article first appeared in The MagPi 73 and was written by David Crookes
To do this, it makes use of two key components: a neural network and the dataset produced by Google’s online game Quick, Draw! (quickdraw.withgoogle.com). The game is similar to Pictionary in that it suggests an object to sketch, with the twist being that it then uses AI to predict what you’re attempting to scribble. The result has been more than 50 million doodles, and Dan’s device seeks to match one of them with whatever the neural network reckons has just been snapped.
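The article doesn't show how the doodles are stored, but Google's simplified Quick, Draw! data releases record each drawing as a list of pen strokes, with each stroke holding parallel x- and y-coordinate arrays. As a rough illustration (the function name and the sample two-stroke drawing here are invented, not taken from Dan's code), a drawing in that layout can be rendered as a minimal SVG string:

```python
# Render a Quick, Draw!-style drawing (a list of strokes, each stroke a
# pair of parallel x/y coordinate lists) as a minimal SVG string.

def strokes_to_svg(drawing, size=256, stroke_width=3):
    paths = []
    for xs, ys in drawing:
        # Build an SVG path: move to the first point, then line-to the rest.
        points = zip(xs, ys)
        x0, y0 = next(points)
        d = f"M {x0} {y0} " + " ".join(f"L {x} {y}" for x, y in points)
        paths.append(
            f'<path d="{d}" fill="none" stroke="black" '
            f'stroke-width="{stroke_width}"/>'
        )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{size}" height="{size}">' + "".join(paths) + "</svg>"
    )

# A made-up two-stroke doodle in the simplified dataset's layout.
example = [
    [[10, 120, 120, 10, 10], [10, 10, 120, 120, 10]],  # a square
    [[40, 90], [60, 60]],                               # a short line
]
svg = strokes_to_svg(example)
```

Anything that can display SVG, from a browser to a thermal printer driver after rasterising, could then draw the result.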
It’s all rather eye-catching. “I got the idea while experimenting with some of the amazing open source research into neural networks,” Dan tells us. “There are some great projects coming out of this research, but many of them are focused on neural networks themselves, rather than simple applications for these networks. So my project originated as a super-fun, simple application for this research, to be enjoyed by all sorts of different people – not just engineers and developers.”
As such, Dan was keen to keep things simple, fun, and whimsical, and he soon got down to work. He began with some basic sketches and ideas focusing mainly on what the user experience would be. “Through these initial sketches, I settled on the idea of a Polaroid camera that draws cartoons, without ever showing the original image,” he says. “I particularly focused on the emotions and feelings that people might have: their surprise when seeing a cartoon and the excitement of not knowing what it would look like.”

At the same time, he began prototyping the software using Python and Jupyter Notebook on his laptop. He ran a pre-trained network over a few photographs before manually browsing the Quick, Draw! dataset and hand-selecting images which he then copied and pasted together in Paint. “These quick prototypes led me to flesh out the software and I moved on to an integrated development environment (IDE). Once I was getting good results on the laptop, I transitioned the code to the Raspberry Pi.”
At the heart of the device is an internet-connected Raspberry Pi 3 running Raspbian Stretch on a 16GB card, with a v2 Camera Module, outputting to a thermal printer, all housed in a simple cardboard casing. When an attached shutter button is pressed, the software interprets what it sees as one of 345 doodle categories and prints a matching sketch.
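Dan's code isn't reproduced here, but the flow he describes (button press, capture, classify into one of the 345 categories, print a matching doodle) could be sketched roughly as below. Every name in this sketch is an assumption; the camera, classifier, and printer are passed in as callables so the hardware-specific parts (the Camera Module and thermal printer drivers) stay out of the core logic and the flow can be exercised without a Raspberry Pi:

```python
import random

def snap_and_doodle(capture, classify, doodles_by_label, print_image):
    """One shutter press: photograph, classify, print a matching doodle.

    capture:          () -> image            (e.g. wraps the Camera Module)
    classify:         image -> label string  (the pre-trained network)
    doodles_by_label: dict mapping a label to a list of candidate doodles
    print_image:      doodle -> None         (e.g. wraps the thermal printer)
    """
    photo = capture()
    label = classify(photo)
    # Pick one of the hand-drawn doodles filed under the predicted label.
    doodle = random.choice(doodles_by_label[label])
    print_image(doodle)
    return label

# Stubbed run, standing in for the real hardware:
printed = []
label = snap_and_doodle(
    capture=lambda: "raw-pixels",
    classify=lambda img: "bicycle",
    doodles_by_label={"bicycle": ["doodle-a", "doodle-b"]},
    print_image=printed.append,
)
```

On the real device the stubs would be replaced with camera capture, a forward pass through the network, and a call to the printer driver, but the control flow stays the same.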
“The neural network I used can only recognise a small fraction of the categories in the Quick, Draw! dataset,” Dan explains. “So there are still many possibilities for using this dataset that have not yet been imagined.”
Even so, the response has been hugely positive. “People love the concept and it’s great seeing their reactions to some of the funny cartoons that appear,” he says. “One of the most surprising things is that even when the camera prints something completely incorrect, people often interpret the cartoon in a way that makes it seem correct after all. It’s quite entertaining.”