Drivers?


14 posts
by pej02 » Sun May 27, 2012 4:02 pm
I'd like to know what driver/software support there is, or will be, for the CSI camera on the RPi.

Whilst I'm waiting for my RPi to arrive, I've been looking at various other forums to find out how the CSI camera is handled on other devices such as the Beagleboard (BB). It appears many people use the camera boards from Leopard Imaging on the BB/BB-xM. It seems like Video for Linux v2 (V4L2) is the most common approach, but there appears to be disagreement (due to deprecation) about how to implement the lower-level device drivers, depending on whether v2.6.x or v3.2.y of the Linux kernel is used. I haven't started to look at how the camera is handled in Android, as that appears to be a whole different ballgame.

If anyone with up-to-date experience of using CSI cameras for both still image and video capture can share their expertise and best practices on this RPi CSI camera thread, it would be well received and would form a good source of information for future visitors.
Posts: 1
Joined: Sun May 27, 2012 3:46 pm
by PaulBuxton » Sun Jun 17, 2012 12:22 pm
The approach that has been taken with the more recent V4L scheme has separate drivers for the camera sensor and the ISP (image signal processor). There are quite a few camera sensor drivers available already; the Pi will need an ISP driver to handle the conversion from raw camera data to something you can use (typically YUV or RGB data), and this is probably done in the VideoCore processor.
As James has explained, this will need some tweaking for the specific sensor in use.

Android uses a CameraHal component, which is a user-mode driver. It is typically built on top of either a V4L2 driver or an OMX driver, but it could be completely proprietary if you really wanted (though that would make it awkward to debug). I know that Broadcom are quite active in the OpenMAX working groups, so I suspect they have an OMX driver for their ISP.
Posts: 57
Joined: Tue Jan 10, 2012 11:38 am
by PaulBuxton » Mon Jun 18, 2012 12:06 pm
For anyone interested, Laurent Pinchart (one of the main guys working on V4L2) did this presentation describing the architecture and decisions that have gone into V4L2.

http://free-electrons.com/pub/video/201 ... ia-n9.webm
Posts: 57
Joined: Tue Jan 10, 2012 11:38 am
by jamesh » Mon Jun 18, 2012 12:17 pm
PaulBuxton wrote:The approach that has been taken with the more recent V4L scheme has separate drivers for the camera sensor and the ISP (image signal processor). There are quite a few camera sensor drivers available already; the Pi will need an ISP driver to handle the conversion from raw camera data to something you can use (typically YUV or RGB data), and this is probably done in the VideoCore processor.
As James has explained, this will need some tweaking for the specific sensor in use.

Android uses a CameraHal component, which is a user-mode driver. It is typically built on top of either a V4L2 driver or an OMX driver, but it could be completely proprietary if you really wanted (though that would make it awkward to debug). I know that Broadcom are quite active in the OpenMAX working groups, so I suspect they have an OMX driver for their ISP.


Correct - each different camera requires a driver (although they are CSI-2, that's simply an electrical spec, not a spec that allows a single driver to cope with multiple cameras). On top of that, you then need to create a 'tuning' for that camera, which controls what is done to the data as it passes through the VideoCore 4 ISP. These can take months for a competent team, as it's a long-winded process.

Which is why we want to choose a camera that's already tuned, or that is close enough that only small changes are necessary.

Yes, we do have OpenMAX components to handle the camera (and encoding etc.), plus some ARM-side library code to provide the OpenMAX IL interface.

James
Soon to be employed engineer - Hurrah! Volunteer at the Raspberry Pi Foundation, helper at PiAcademy September 2014.
Raspberry Pi Engineer & Forum Moderator
Posts: 11899
Joined: Sat Jul 30, 2011 7:41 pm
by rew » Tue Jun 19, 2012 5:29 am
James, these "adjustments" you're talking about, can't they be handled with a 4x4 matrix multiplication?

I took Professor Blinn's computer graphics course back in '85, and what I learned is that everything can be done with 4x4 matrix multiplications.

Whatever the sensor delivers - call it X1, X2, X3 - make it into a vector (X1, X2, X3, 1), multiply it by a 4x4 matrix, normalise by dividing by the last element of the vector, and out comes a vector (R, G, B, 1).

Want YUV? Different matrix, same math. Want black-and-white? Different matrix, same math. So, for this part, it would be sufficient if the VideoCore could do this math and we could provide the matrix from the Linux side. All adjustments for the camera can be done "later". If someone finds a better matrix, it's just a configuration file that needs updating. Provide 32-bit and 64-bit output modes and done!
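
To make that concrete, here's a rough numpy sketch of the homogeneous-matrix idea. The standard BT.601 RGB-to-YUV coefficients are used purely as an example matrix; a real sensor-specific matrix would come from the tuning data.

[code]
import numpy as np

# Example 4x4 matrix: full-range RGB -> YUV (BT.601 coefficients), written
# in homogeneous form. Any purely linear correction plus an offset can be
# folded into a matrix of this shape.
RGB_TO_YUV = np.array([
    [ 0.299,     0.587,     0.114,    0.0],
    [-0.168736, -0.331264,  0.5,      0.5],   # offset U into the 0..1 range
    [ 0.5,      -0.418688, -0.081312, 0.5],   # offset V into the 0..1 range
    [ 0.0,       0.0,       0.0,      1.0],
])

def apply_matrix(pixels, matrix):
    """pixels: (N, 3) array of sensor values in 0..1. Returns (N, 3)."""
    homog = np.hstack([pixels, np.ones((len(pixels), 1))])   # (X1, X2, X3, 1)
    out = homog @ matrix.T                                   # 4x4 multiply
    return out[:, :3] / out[:, 3:4]                          # normalise by w

# Mid-grey in, mid-grey out: Y ~ 0.5, U and V ~ 0.5 (no chroma).
print(apply_matrix(np.array([[0.5, 0.5, 0.5]]), RGB_TO_YUV))
[/code]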

And I'm guessing the VideoCore is pretty good at those 4x4 matrix multiplications. That's what counts when you do 3D stuff - that's actually what Professor Blinn taught in his course...

And I'm guessing the VideoCore can do this with floats. A 24-bit mantissa is accurate enough, even for 16-bit-per-channel output.

So, IMHO, the tuning can be left to a "configuration file" on the Linux side.

Then there is some configuration programming that needs to be done. I'm guessing you have to set some registers (min row, max row, min column, max column, bits per pixel, etc.) before the camera module starts sending proper data. That's done through I2C, right? Again, pass that through to Linux and let userspace handle it.
Check out our raspberry pi addons: http://www.bitwizard.nl/catalog/
Posts: 396
Joined: Fri Aug 26, 2011 3:25 pm
by rurwin » Tue Jun 19, 2012 6:26 am
If it takes months for a competent team, then I'm guessing that it isn't as easy as one 4x4 matrix; maybe each pixel needs its own matrix (theoretically). For example, maybe a camera has a green tint in the top left-hand corner and a lack of contrast in the centre, both effects fading out gradually. Or maybe it needs a different matrix (or set of matrices) for each variation in light level.

But I'd be fascinated if James H would tell us more about the process?
Forum Moderator
Posts: 2932
Joined: Mon Jan 09, 2012 3:16 pm
by PaulBuxton » Tue Jun 19, 2012 10:49 am
Processing data from a camera sensor requires a number of steps.

Camera sensors have a filter on each pixel so that it only captures one colour channel.

Typically they use a 2x2 square of RGB, with two green pixels for each red and blue pixel (as the human visual system is more sensitive to green, this gives better perceived quality), although there are sensors coming out with more complicated patterns, and also some which use an unfiltered white pixel as well.
The process of converting this array of single-channel pixels into tri-stimulus (RGB or YUV) pixels is known as demosaicing, and it is one of the key steps in the conversion process. It will typically be done using some form of weighted interpolation.
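
As a very rough sketch of that weighted interpolation, here is plain bilinear demosaicing of an RGGB mosaic (the pattern is assumed; real ISPs use edge-aware methods and other layouts exist):

[code]
import numpy as np
from scipy.signal import convolve2d

def demosaic_rggb(raw):
    """Textbook bilinear demosaic of an RGGB mosaic (2D float array).
    Only a sketch - real ISPs use edge-aware interpolation."""
    h, w = raw.shape
    r_mask = np.zeros((h, w)); r_mask[0::2, 0::2] = 1
    b_mask = np.zeros((h, w)); b_mask[1::2, 1::2] = 1
    g_mask = 1 - r_mask - b_mask

    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], float)  # bilinear kernel
    k_g  = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]], float)

    def interp(mask, kernel):
        # Weighted average of the known samples of this colour around each pixel.
        num = convolve2d(raw * mask, kernel, mode='same', boundary='symm')
        den = convolve2d(mask,       kernel, mode='same', boundary='symm')
        return num / den

    return np.dstack([interp(r_mask, k_rb),   # R plane
                      interp(g_mask, k_g),    # G plane
                      interp(b_mask, k_rb)])  # B plane
[/code]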

In addition to demosaicing, you need to do black level correction (camera sensors often have a row or a border of completely covered pixels so that you can subtract their values from the other pixels to adjust for true black).
There is also colour correction, white balance, lens shading (removal of vignetting artifacts), denoising and sharpening to be considered.
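
As one example of those, a toy sketch of lens-shading (vignette) correction using a simple radial gain; the falloff strength here is invented and would really be measured per lens and sensor:

[code]
import numpy as np

def correct_vignetting(rgb, strength=0.35):
    """Boost pixels by a gain that grows towards the corners, compensating a
    parabolic vignette model. rgb: (H, W, 3) float image in 0..1; 'strength'
    is a made-up example value - real corrections are calibrated per lens."""
    h, w, _ = rgb.shape
    y, x = np.mgrid[0:h, 0:w]
    r2 = ((x - (w - 1) / 2) / (w / 2)) ** 2 + ((y - (h - 1) / 2) / (h / 2)) ** 2
    gain = 1.0 + strength * r2            # unity in the centre, larger at edges
    return np.clip(rgb * gain[..., None], 0.0, 1.0)
[/code]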

Things like denoising are best set up knowing the physical properties of the sensor (how big and deep the pixels are, what gain may be being applied).

Removal of chromatic aberration can also be done in the ISP. This involves a differential scaling of the red and blue components relative to the green (i.e. scale up the red channel a bit and scale down the blue), which needs to be adjusted based on the lens used.
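
A sketch of that differential scaling (the two scale factors below are invented; in practice they would come from lens calibration):

[code]
import numpy as np
from scipy.ndimage import map_coordinates

def correct_lateral_ca(rgb, red_scale=1.002, blue_scale=0.998):
    """Resample the R and B channels about the image centre relative to G.
    rgb: (H, W, 3) float image; the scale factors here are just examples."""
    h, w, _ = rgb.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    y, x = np.mgrid[0:h, 0:w].astype(float)

    def rescale(channel, s):
        # Sample the channel at coordinates scaled about the centre.
        coords = [(y - cy) / s + cy, (x - cx) / s + cx]
        return map_coordinates(channel, coords, order=1, mode='nearest')

    return np.dstack([rescale(rgb[..., 0], red_scale),
                      rgb[..., 1],
                      rescale(rgb[..., 2], blue_scale)])
[/code]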

Certainly not something that is easily left to the novice, and if I were in Broadcom's shoes I would rather wait and release a properly set-up camera than rush one out and have people use it to evaluate their ISP and come away with a bad impression.

Paul
Posts: 57
Joined: Tue Jan 10, 2012 11:38 am
by Burngate » Wed Jun 20, 2012 9:25 am
Yeah But ...
(Devil's advocate)
Isn't the whole thing about learning?
Demosaicing ... I can see the problems, not sure of the solutions, but at least a userland front end to choose a pattern would be useful.
Denoising ... what a horrible idea! Unless you mean fixed-pattern noise, in which case that's unique to that sensor.
For the rest: overall black and gain? A couple of constants to be fed back into the GPU. Shading? Parabola constants. Then, if you want, third- and fourth-order variables.
Gamma? (You didn't mention that.) Vignetting? Portholing? (You didn't mention those.)
Back in the '70s Marconi Mk7 cameras had most of those - dozens of pots on the front panel to feed in the appropriate numbers, plus others - beam current, focus volts and current, and so on that aren't needed now - to get a decent picture. Before that we also had tilt and bend, ahh those were the days.
But a modern camera has all the same things - why not bring them out to the front panel?
Lens problems - chromatic aberration, even portholing and so on - I would agree are less interesting, but the ability to really screw up the picture is part of the learning curve!
On a side note, we needed a new lens - 40-1 zoom, for horse racing. The Japanese one looked reasonable, until we compared it to the British one. Then we realised just how soft the Japanese one was. So we bought British.
Turned out there were a few problems. Zoomed out, the barrel distortion was horrendous. But what was worse was what was going on with the focal plane.
You have to remember those were the days of three separate tubes. Each could be moved forward or back to bring it into optical focus. The scan sizes were individually controllable. Which was fine, but on that lens the three colours were focusing on different planes, moving independently through the zoom range. Perfect pictures zoomed in meant awful pictures zoomed out, and vice versa.
When solid-state sensors came in, with the three sensors glued to the prism and therefore immovable, it was discovered that almost all tube-era lenses had the same problem, just allowed for by mechanical adjustment of the tubes. So new lenses had to be bought.
Wyszkowski's Second Law: Anything can be made to work if you fiddle with it long enough.
Brain surgery is easier than psychoanalysis
Posts: 2900
Joined: Thu Sep 29, 2011 4:34 pm
Location: Berkshire UK
by PaulBuxton » Wed Jun 20, 2012 2:29 pm
Didn't think it was worth trying to detail every little thing, as I was just trying to convey an idea of the [i]potential[/i] complexity (i.e. it isn't going to be just a matrix multiply), and there are many, many possible features that may or may not be implemented, depending on the complexity of the ISP.

It would certainly be something that people might find fun to experiment with, and it is worthwhile for people to learn about, but for people who just want a good camera for use in something like OpenCV, I expect they would be quite happy with Broadcom supplying a firmware update to use with the chosen camera. I think that the bits people would specifically like to play with themselves (i.e. the GPU functions involved in the pipeline) will need to be supplied as firmware blobs, as they are quite likely to contain IP that Broadcom wants to keep private (similar to the encode/decode functionality).

What might be nice is if they could provide a raw dump of the sensor values. Some webcams provide this functionality through V4L2, and as a starting point OpenCV has functions to handle demosaicing, so people could develop their own user-mode algorithms to handle the entire conversion process (check out the FrankenCamera for this kind of thing! http://graphics.stanford.edu/projects/camera-2.0/).
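
For example, OpenCV can demosaic a raw dump in a single call (the file name, resolution and Bayer order below are all hypothetical):

[code]
import numpy as np
import cv2

# Hypothetical raw dump: 16-bit values, 1944 x 2592, RGGB order - the file
# name, dimensions and Bayer layout are all made up for this example.
raw = np.fromfile('sensor_dump.raw', dtype=np.uint16).reshape(1944, 2592)

# OpenCV's built-in demosaicing; the exact COLOR_Bayer* constant depends on
# how the sensor's mosaic is laid out.
bgr = cv2.cvtColor(raw, cv2.COLOR_BayerBG2BGR)

cv2.imwrite('demosaiced.png', bgr)
[/code]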

The 3 lens system sounds fun! I can imagine it producing some really wacky images.

In the end I expect that whatever gets released for the camera will be tuned properly, but will expose most of the knobs and dials for people to play around with.

Paul.
Posts: 57
Joined: Tue Jan 10, 2012 11:38 am
by jamesh » Wed Jun 20, 2012 4:13 pm
ISP... here are some of the stages that need tuning (not all of them are included). It's all done with a big file of parameters - about 2000 lines' worth on a Nokia N8, which is fairly typical for a 12MP sensor. We could probably make that file accessible to the host CPU, but there may be some issues, as the contents are probably confidential.

Debayer
Black level
Defective pixel
Denoise
Vignette correction
White balance
Gain Control
Defective pixel
Sharpen
Colour correction
Gamma correction
Geometric distortion
Resize

Remember, all these stages are run in real time (30fps) for things like 1080p video.
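
A toy, software-only illustration of how a parameter file drives a few of those stages in order - every value here is invented, and the real pipeline is fixed-function hardware behind a much larger, confidential tuning file:

[code]
import numpy as np

# Stand-in for the tuning file: a handful of invented values, where the
# real file runs to ~2000 lines of parameters per sensor.
TUNING = {
    'black_level': 64 / 1023,
    'wb_gains':    (1.9, 1.0, 1.6),
    'ccm':         np.array([[ 1.6, -0.4, -0.2],      # colour correction matrix
                             [-0.3,  1.5, -0.2],
                             [-0.1, -0.5,  1.6]]),
    'gamma':       2.2,
}

def toy_pipeline(rgb, t=TUNING):
    """Apply a few of the listed stages, in order, to an (H, W, 3) float image."""
    x = np.clip(rgb - t['black_level'], 0.0, None)   # black level
    x = x * np.array(t['wb_gains'])                   # white balance / gain
    x = np.clip(x @ t['ccm'].T, 0.0, 1.0)             # colour correction
    return x ** (1.0 / t['gamma'])                    # gamma correction
[/code]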


Here are some examples of how good the ISP is.....including denoise, Burngate!

http://www.gsmarena.com/pureview_blind_ ... ew-773.php
Soon to be employed engineer - Hurrah! Volunteer at the Raspberry Pi Foundation, helper at PiAcademy September 2014.
Raspberry Pi Engineer & Forum Moderator
Posts: 11899
Joined: Sat Jul 30, 2011 7:41 pm
by Burngate » Wed Jun 20, 2012 6:18 pm
jamesh wrote:Here are some examples of how good the ISP is.....including denoise, Burngate!

http://www.gsmarena.com/pureview_blind_ ... ew-773.php

Quite impressive. I'm not expecting quite so much from a Pi-cam. In fact if I cared that much I'd update my Nikon. A bit more expensive than a Pi but ...

Noise ... information you don't want.
Lots of ways to remove noise, but they all depend on what information you think you want.
Easiest way is oversampling - precisely what multi-mega-pixel cameras do. But all you're doing is throwing away information.
Sharpening. Again all sorts of ways to do it, but it comes down to tweaking the frequency response. Amplify the hf, you also amplify any noise in the hf.
Look at those pictures again. Sure, you can zoom in and see lots of detail. But you could (with planning) use an optical zoom, to only shoot the bit you're interested in. Then you could use a smaller sensor, and still get a better picture of the bit you're interested in.
Most phone cameras, and point n' shoots, are for happy-snaps. I see the Pi-cam in the same light. You want better, you should pay for it.
Me? I've spent a fortune on high(ish)-end cameras over the years, and my pictures are still no better than happy-snaps - not through lack of technology, merely talent!
I'm going to buy the Pi-cam, whichever one you choose, just to play with. I expect to be impressed. I'll use the Nikon for proper photography!
Wyszkowski's Second Law: Anything can be made to work if you fiddle with it long enough.
Brain surgery is easier than psychoanalysis
Posts: 2900
Joined: Thu Sep 29, 2011 4:34 pm
Location: Berkshire UK
by PaulBuxton » Wed Jun 20, 2012 10:49 pm
Pretty much the feature list I was expecting. I look forward to being able to get a camera with this properly set up.

Burngate

Both noise reduction and sharpening can (and should) be aware of the fact that the sensors have a Bayer pattern (i.e. need demosaicing), which can be taken into account for both operations. These operations, when done on a DSP or custom hardware, can be edge-aware and can give much better results than simple oversampling.

I think James's examples speak for themselves.

Paul
Posts: 57
Joined: Tue Jan 10, 2012 11:38 am
by jamesh » Thu Jun 21, 2012 8:24 am
I've done a bit of denoise and sharpen tuning using the ISP (I'm not an image quality expert, but have had to do it for some fairly poor VGA cameras on phones), and the effects are quite good. All sensors produce noise, which gets worse at higher gains, so you tune the denoise against gain. We do all tuning in a full ISP lab with all the proper lighting etc. The denoise is actually quite a sophisticated HW block with a number of parameters which vary according to both gain and other variables. We do analysis on the results over different shades of grey to give us the right graphs to plug into the HW block, as the noise also varies with the luma. Sharpening is more of a black art!

All cameras do some level of denoise and sharpening, even high-end ones.
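
The kind of analysis described above might look roughly like this in software: measure the noise standard deviation on flat grey patches at each gain and luma level (all data below is synthetic, just to show the shape of the result):

[code]
import numpy as np

def noise_profile(patches_by_gain):
    """patches_by_gain: {analogue gain: [flat grey patch arrays]}.
    Returns, per gain, sorted (mean luma, noise sigma) pairs - the sort of
    curve that could be fed to a denoise stage. Purely a sketch."""
    return {gain: sorted((float(p.mean()), float(p.std())) for p in patches)
            for gain, patches in patches_by_gain.items()}

# Synthetic data: noise grows with both gain and luma.
rng = np.random.default_rng(0)
data = {g: [lvl + rng.normal(0, 0.01 * g * (0.3 + lvl), (64, 64))
            for lvl in (0.1, 0.3, 0.5, 0.7, 0.9)]
        for g in (1.0, 2.0, 4.0, 8.0)}

for gain, curve in noise_profile(data).items():
    print(gain, [(round(m, 2), round(s, 4)) for m, s in curve])
[/code]
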
Soon to be employed engineer - Hurrah! Volunteer at the Raspberry Pi Foundation, helper at PiAcademy September 2014.
Raspberry Pi Engineer & Forum Moderator
Posts: 11899
Joined: Sat Jul 30, 2011 7:41 pm
by rew » Sat Jun 30, 2012 10:43 am
jamesh wrote:Debayer
Do this the "obvious" way; leave colour correction to the matrix.
Black level
matrix
Defective pixel
This is different per camera, so it will need user configuration. IMHO, for a cheap device like the 'Pi and a cheap camera add-on, we can skip defective-pixel processing, allowing a "learning experience" for the public in how cameras work and how to fix defective pixels.
Denoise
Convolution filter? Size and actual values user configurable?
Vignette correction
Lens dependent. Nice if the GPU can handle this. In essence it's just the scaling of the blue and red components compared to the green. Two parameters. But they change according to focus position. Will the camera be able to focus?
White balance
matrix
Gain Control
matrix
Defective pixel
had that one before.
Sharpen
Does that really happen? After low-pass filtering the raw image you put the high frequencies through a gain stage again? Hmm.
Colour correction
matrix
Gamma correction
I quite strongly think this can be handled by a nonlinear component in the 4x4 matrix, though I'm not entirely sure here (see the sketch after these points).
Geometric distortion
I own a Sony DSC505 - an older camera (it does close-ups really well). If you move in close, it has enormous barrel distortion. So if Sony can ship an expensive product with big geometric distortion, why can't the RPF ship a cheap product? Oh, it'd be nice to "showcase" that the Broadcom GPU can do the de-barrel conversion in real time, but is it necessary to have this working BEFORE you release the camera to the public? Anyway, if you want a real-time, corrected image, the 3D pipeline of the GPU will also allow you to do this. It's just a texture map on a sphere-like object. In fact, that's probably what actually happens.
Resize

You're using this as an argument that it's difficult to tune. Even those with a fancy more-than-full-HD screen will have to scale down an 8MP shot for display on the screen. Of course the GPU can do that in real time. But the scale factor must be user-adjustable to allow people with smaller screens to watch their camera output too, right?
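
Coming back to the gamma point above: since gamma is a per-channel non-linear curve, it would more likely be applied as a lookup table after the linear (matrix) stages than inside them. A minimal sketch, using 2.2 purely as the usual example display gamma:

[code]
import numpy as np

def gamma_lut(gamma=2.2, bits=8):
    # Build a per-channel lookup table mapping linear values to gamma-encoded ones.
    levels = np.arange(2 ** bits) / (2 ** bits - 1)
    return np.round((levels ** (1.0 / gamma)) * (2 ** bits - 1)).astype(np.uint8)

lut = gamma_lut()
image = np.full((4, 4), 64, dtype=np.uint8)   # dummy linear 8-bit image
print(lut[image])                             # curve applied per pixel by indexing
[/code]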

I agree: worst case, people end up giving it a whole lot of publicity - "That RPi camera is shitty" - while in fact it is a good camera with not all of the processing steps finished/tuned/released yet.

But this is the same with the Raspberry Pi itself. Can you imagine releasing a Linux computer with a maximum of 4Mb/sec throughput to its main storage device, while the hardware is able to do at least 20Mb/sec? Come on, you've got to be kidding me? The RPF did it.
Check out our raspberry pi addons: http://www.bitwizard.nl/catalog/
Posts: 396
Joined: Fri Aug 26, 2011 3:25 pm