Liz: here’s the second and final part of David Plowman’s walk through the development of the Raspberry Pi camera board, which will be available to purchase in April. Before you go ahead and read this, check out David’s first post.
The Eye of the Beholder
That’s where beauty lies, so the saying goes. And for all the test charts, metrics and objective measurements that imaging engineers like to throw at their pictures, it’s perhaps sobering that the human eye – what people actually like – is the final arbiter of Image Quality (IQ). There has been much discussion, and no little research, on what makes for “good IQ”, but the consensus probably has it that while the micro aspects of IQ, such as sharpness, noise and detail, are very important, your eye turns first to the macro (in the sense of large scale) image features – exposure and contrast, colours and colour balance.
We live in a grey world…
All camera modules respond differently to red, green and blue stimuli. Of itself this isn’t so problematic as the behaviour can be measured, calibrated and transformations applied to map the camera’s RGB response (which you saw in the final ugly image of my previous post!) onto our canonical (or standard) notion of RGB. It’s in coping with the different kinds of illumination that things get a little tricky. Let me explain.
Imagine you’re looking at a sheet of white paper. That’s just the thing – it’s always white. If you’re outside on a sunny day, it’s white, and if you’re indoors in gloomy artificial lighting, it’s still white. Yet if you were objectively to measure the colour of the paper with your handy spectrometer, you’d find it wasn’t the same at all. In the first case your spectrometer will tell you the paper is quite blue, and in the second, that it’s very orange. The Human Visual System has adapted itself brilliantly over millions of years simply not to notice any difference, a phenomenon known as colour constancy.
No such luck with digital images, though. Here we have to correct for the ambient illumination to make the colours look “right”. Take a look at the two images below. (You’ll find it easier to judge the “right”-ness if you scroll so only one image is on the screen at a time.)
It’s a scene taken in the Science Park in Cambridge, outside the Broadcom offices. The top one looks fine, but the bottom one has a strong blue cast. This is precisely because the top one has been (in the jargon) white-balanced for an outdoor illuminant and the bottom one for an indoor illuminant. But how do we find the right white balance?
The simplest assumption that camera systems can make is that every scene is, on average, grey, and it works surprisingly well. It has some clear limitations too, of course. With the scene above, a “grey world” white balance would actually give a noticeable yellow cast because of the preponderance of blue sky skewing the average. So in reality more sophisticated algorithms are generally employed, constraining the candidate illuminants to a known set (predominantly those a physicist would describe as being radiated by a black body, which includes sunlight and incandescent bulbs) and keying on colours other than merely grey (often specific memory colours, such as blue sky or skin tones).
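If you’re curious what the naive grey-world idea looks like in practice, here’s a minimal sketch in Python with NumPy. It’s purely illustrative – not the algorithm running in the Pi’s firmware – and the function names are my own:

```python
import numpy as np

def grey_world_gains(rgb):
    """Grey-world AWB: assume the scene averages to grey, so scale the
    red and blue channels to match the mean green level.
    `rgb` is an HxWx3 array of linear pixel values normalised to [0, 1]."""
    means = rgb.reshape(-1, 3).mean(axis=0)
    return means[1] / means[0], means[1] / means[2]  # (red gain, blue gain)

def apply_white_balance(rgb, r_gain, b_gain):
    balanced = rgb.astype(float).copy()
    balanced[..., 0] *= r_gain
    balanced[..., 2] *= b_gain
    return np.clip(balanced, 0.0, 1.0)
```

A scene dominated by blue sky skews those channel means towards blue, which is exactly why the naive version would produce the yellow cast described above.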
The devil is in the details…
With our colours sorted out, we need to look at the micro aspects of our image tuning. On the Pi, fortunately, we don’t have to worry about focusing, which leaves the noise and sharpening filters within the ISP. Note that some amount of sharpening is essential, really, because of the inherent softening effect of the Bayer mosaic that we saw last time.
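To give a flavour of what a sharpening filter does, here’s a basic “unsharp mask” sketch in Python – a common textbook technique, shown only to illustrate the idea; the ISP’s own hardware filters are far more elaborate and are tuned quite differently:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(img, sigma=1.5, amount=0.5):
    """Sharpen a single-channel image (values normalised to [0, 1]) by adding
    back a scaled copy of the high-frequency detail:
    detail = original - blurred; output = original + amount * detail."""
    blurred = gaussian_filter(img.astype(float), sigma)
    return np.clip(img + amount * (img - blurred), 0.0, 1.0)
```

Push `amount` too high and the noise gets amplified along with the detail – which is exactly the trade-off described next.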
When it comes to tuning noise and detail, there are generally two camps. The first camp regards noise as ugly and tries very hard to eliminate it. The second camp thinks a certain amount of noise is tolerable (it can look a bit like film “grain”) in return for better details and a more natural (less processed) look to the image.
To see what I mean, take a look at the following three images. It’s a small crop from a picture of some objects on a mantelpiece, taken in very gloomy orange lighting, and the walls are even a murky pinkish colour too. Pretty challenging for any digital camera!
The top one has had practically no noise filtering applied to it at all. Actually it shows bags of detail, but I think most people would regard the noise as pretty heinous. The second image demonstrates the opposite approach. The noise has been exterminated with extreme prejudice, but out with the bathwater goes the baby – detail and a “natural” looking result. Though my examples are deliberately extreme, you can find the influence of both camps at work in mobile imaging devices today!
The final image shows where we’ve settled with the Pi – a happy medium, I hope, but it does remain, ultimately, a matter of taste. And de gustibus non est disputandum, after all!
I’ve only scratched the surface of this subject – there are many more niggles and wrinkles that an imaging system has to iron out – but I’m hoping I’ve given you some sense of why a proper camera integration represents a significant commitment of time and effort. Whilst you’re all waiting for the boards finally to become available I’ll stick around on this website to answer any questions that I can.
My deep thanks, as ever, are due to those clever engineers at Broadcom who actually make this stuff work.
David Plowman, March 2013
Liz: We’re very close to being able to release the $25 add-on camera board for the Raspberry Pi now. David Plowman has been doing a lot of the work on imaging and tuning. He’s very kindly agreed to write a couple of guest posts for us explaining some more for the uninitiated about the process of engineering the camera module. Here’s the first – I hope you’ll find it as fascinating as I did. Thanks David!
Lights! Camera! … Action?
So you’ve probably all been wondering how it can take quite so long to get a little camera board working with the Raspberry Pi. Shouldn’t it be like plugging in a USB webcam, all plug’n’play? Alas, it’s not as straightforward as you might think. Bear with me for this – and a subsequent – blog posting and I’ll try and explain all.
The Nature of the Beast
The camera we’re attaching to the Raspberry Pi is a 5MP (2592×1944 pixels) OmniVision OV5647 sensor in a fixed-focus module. This is very typical of the kinds of units you’d see in some mid-range camera phones (you might argue the lack of autofocus is a touch low-end, but it does mean less work for us and you get your camera boards sooner!). Besides power, clock signals and so forth, we have two principal connections (or data buses in electronics parlance) between our processor (the BCM2835 on the Pi) and the camera.
The first is the I2C (“eye-squared-cee”) bus which is a relatively low bandwidth link that carries commands and configuration information from the processor to the image sensor. This is used to do things like start and stop the sensor, change the resolution it will output, and, crucially, to adjust the exposure time and gain applied to the image that the sensor is producing.
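To make the I2C side a little more concrete, here’s a rough sketch of what “sending a command to the sensor” boils down to, using Python’s smbus2 library. The I2C address and register numbers below are placeholders for illustration only – the real driver lives in the GPU firmware, and the actual register map comes from the sensor’s datasheet:

```python
from smbus2 import SMBus, i2c_msg

SENSOR_ADDR = 0x36   # placeholder 7-bit I2C address - check the sensor datasheet

def write_register(bus, reg, value):
    # The sensor uses 16-bit register addresses: send the high byte,
    # then the low byte, then the value to write.
    bus.i2c_rdwr(i2c_msg.write(SENSOR_ADDR, [reg >> 8, reg & 0xFF, value]))

with SMBus(1) as bus:                  # I2C bus 1 on the Pi
    write_register(bus, 0x0100, 0x01)  # hypothetical "start streaming" register
```

Starting and stopping the sensor, changing resolution and setting exposure and gain are all, at bottom, sequences of small register writes like this one.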
The second connection is the CSI bus, a much higher bandwidth link which carries pixel data from the camera back to the processor. Both of these buses travel along the ribbon cable that attaches the camera board to your Pi. The astute amongst you will notice that there aren’t all that many lines in the ribbon cable – and indeed both I2C and CSI are serial protocols for just this reason.
The pixels produced are 10 bits wide rather than the 8 bits you’re more used to seeing in your JPEGs. That’s because we’re ultimately going to adjust some parts of the dynamic range and we don’t want “gaps” (which would become visible as “banding”) to open up where the pixel values are stretched out. At 15fps (frames per second) that’s a maximum of 2592x1944x10x15 bits per second (approximately 750Mbps). Actually many higher-end cameras will give you frames larger than this at up to 30fps, but still, this is no slouch!
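The arithmetic is easy enough to check for yourself:

```python
width, height = 2592, 1944     # full-resolution 5MP frame
bits_per_pixel, fps = 10, 15   # 10-bit raw pixels at 15 frames per second
print(width * height * bits_per_pixel * fps / 1e6)  # ~755.8 Mbit/s
```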
Show me some pictures!
So, armed with our camera modules and adapter board, the next job we have is to write a device driver to translate our camera stack’s view of the camera (“use this resolution”, “start the camera” and so forth) into I2C commands that are meaningful to the image sensor itself. The driver has to play nicely with the camera stack’s AEC/AGC (auto-exposure/auto-gain) algorithm whose job it is to drive the exposure of the image to the “Goldilocks” level – not too dark, not too bright. Perhaps some of you remember seeing one of Dom’s early camera videos where there were clear “winks” and “wobbles” in brightness. These were caused by the driver not synchronising the requested exposure changes correctly with the firmware algorithms… you’ll be glad to hear this is pretty much the first thing we fixed!
With a working driver, we can now capture pixels from the camera. These pixels, however, do not constitute a beautiful picture postcard image. We get a raw pixel stream, even more raw, in fact, than in a DSLR’s so-called raw image where certain processing has often already been applied. Here’s a tiny crop from a raw image, greatly magnified to show the individual pixels.
Surprised? To make sense of this vast amount of strange pixel data the Broadcom GPU contains a special purpose Image Signal Processor (ISP), a very deep hardware pipeline tasked with the job of turning these raw numbers into something that actually looks nice. To accomplish this, the ISP will crunch tens of billions of calculations every second.
What do you mean, two-thirds of my pixels are made up?
Yes, it is imaging’s inconvenient truth that fully two-thirds of the colour values in an RGB image have been, well, we engineers prefer to use the word interpolated. An image sensor is a two dimensional array of photosites, and each photosite can sample only one number – either a red, a green or a blue value, but not all three. It was the idea of Bryce Bayer, working for Kodak back in 1976, to add an array of tiny colour filters over the top so that neighbouring photosites measure different colour channels. The arrangement of reds, greens and blues that you see in the crop above is now referred to as a “Bayer pattern”, and a special algorithm, often called a “demosaic algorithm”, is used to create the fully-sampled RGB image. Notice how there are twice as many greens as reds or blues, because our eyes are far more sensitive to green light than to red or blue.
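As a rough illustration of what a demosaic algorithm does, here’s a deliberately naive bilinear version in Python/NumPy, assuming an RGGB layout purely for the sake of the example – real ISPs use far more sophisticated, edge-aware methods:

```python
import numpy as np
from scipy.ndimage import convolve

def demosaic_bilinear_rggb(raw):
    """Naive bilinear demosaic of a Bayer image with an RGGB layout.
    `raw` is an HxW float array; returns an HxWx3 RGB image."""
    h, w = raw.shape
    r_mask = np.zeros((h, w)); r_mask[0::2, 0::2] = 1.0   # red photosites
    b_mask = np.zeros((h, w)); b_mask[1::2, 1::2] = 1.0   # blue photosites
    g_mask = 1.0 - r_mask - b_mask                        # the two greens

    # Bilinear interpolation kernels: averages of the nearest same-colour samples.
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], float) / 4.0
    k_g  = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]], float) / 4.0

    # Each channel is filled in from its own sparse samples.
    r = convolve(raw * r_mask, k_rb)
    g = convolve(raw * g_mask, k_g)
    b = convolve(raw * b_mask, k_rb)
    return np.dstack([r, g, b])
```

Simple interpolation like this smears fine detail and produces colour fringing on sharp edges – one reason the ISP’s real demosaic is so much more elaborate.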
The Bayer pattern is not without its problems, of course. Most obviously, a large part of the incoming light is simply filtered out, which means these sensors perform poorly in dark conditions. Mathematicians may also mutter darkly about “aliasing”, which makes a faithful reconstruction of the original colours very difficult where there is fine detail. Nonetheless, the absolutely overwhelming majority of sensors in use today are of the Bayer variety.
Now finally for today’s posting, here’s what the whole image looks like once it has been “demosaicked”.
It’s recognisable – that’s about the kindest thing you can say – but hardly lovely; we would still seem to have some way to go. In my next posting, then, I’ll initiate you into the arcane world of camera tuning…
Oleg Romashin has already got a long way with a port of Firefox OS, Mozilla’s open-platform OS, to the Raspberry Pi. Firefox OS is scheduled for release some time in 2013; we’re excited to see it’ll be an option for the Raspberry Pi, and it’s fascinating watching work like this progress. In the video below you’ll see the Gecko runtime doing its thing, along with some WebGL animations at 60fps.
Oleg’s a Principal Engineer at Nokia. He’s made a download of the work-in-progress port to Raspberry Pi available (this is a direct download link to the tarball; there isn’t currently any other content on his site) if you want to join in the fun.