I'm a video software developer on Linux/Windows,
but am ignorant of the Pi's SoC model for video compression.
Typically on the systems I work on, raw video [YUV] is dumped into memory [i.e., via DMA], and then
sent to whatever H.264 or other codec by somehow queuing a pointer to the uncompressed buffer.
Am I right in understanding that the CSI-2 data goes straight to the SoC GPU for H.264 compression,
and does not pass through system memory first?
And does that, as a result, make it much harder to develop a generic H.264 encoder to be used with multiple different camera drivers or front-ends?
The datapath is (usually) CSI2 straight into the GPU via the ISP. This results in an uncompressed YUV frame in GPU memory. This is then passed to the encoder, the output of which can stay on the GPU or be passed into Arm memory space (usually done using OpenMAX components and buffers). Note that there is only one chunk of memory: at boot it's all allocated to the GPU, which then hands over a load to the Arm memory manager. To the GPU, the memory is a flat address space; to the Arm, it is managed via the memory manager. This disparate view of memory can mean that getting data to and from the Arm may need a copy.
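To give a feel for how much data such a copy moves, here is a rough sketch of the size of one uncompressed frame. It assumes planar YUV 4:2:0 (I420) and assumes the buffer is padded to a multiple of 32 pixels wide and 16 pixels high; neither the format nor the padding is stated above, so treat both as illustrative defaults, and the function names are made up for this example.

```python
def align_up(value, multiple):
    """Round value up to the next multiple of `multiple`."""
    return (value + multiple - 1) // multiple * multiple


def i420_frame_size(width, height, width_align=32, height_align=16):
    """Bytes in one planar YUV 4:2:0 frame, including padding.

    The 32/16 alignment defaults are an assumption for illustration,
    not a documented property of the encoder discussed here.
    """
    stride = align_up(width, width_align)   # padded row length in pixels
    rows = align_up(height, height_align)   # padded number of rows
    luma = stride * rows                    # full-resolution Y plane
    chroma = (stride // 2) * (rows // 2)    # each of U and V is quarter size
    return luma + 2 * chroma


frame_bytes = i420_frame_size(1920, 1080)
print(frame_bytes)        # 3133440 bytes per 1080p frame
print(frame_bytes * 30)   # bytes per second at 30 fps
```

At roughly 3 MB per 1080p frame, an extra copy between the Arm and GPU views of memory is not free, which is why it is worth keeping frames on the GPU side where possible.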
However, the encoder is in fact completely generic: it can take whatever data is thrown at it from wherever in the GPU, so it can be used with anything producing YUV (maybe some other formats too) frames. So you can have some Arm-side code running that passes buffers to the encoder via the encoder component; OpenMAX (and MMAL - a Broadcom wrapper over OpenMAX) handles getting the data to the GPU and back. The same process applies to the decode components.
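As a rough illustration of the Arm-side half of that, the sketch below chops a raw byte stream into fixed-size frames and hands each one to a callback. It does not call OpenMAX or MMAL: `submit` is a hypothetical stand-in for whatever sends a buffer to the encoder's input port, and `feed_frames` is a name invented for this example.

```python
import io


def feed_frames(source, frame_size, submit):
    """Read fixed-size raw frames from `source` and pass each to `submit`.

    `source` is any binary file-like object (a file, a pipe, a capture
    device wrapper). `submit` is a hypothetical placeholder for the call
    that queues one buffer on the encoder. Returns the number of whole
    frames submitted; a trailing partial frame is dropped.
    """
    count = 0
    while True:
        frame = source.read(frame_size)
        if len(frame) < frame_size:
            break  # end of stream, or an incomplete final frame
        submit(frame)
        count += 1
    return count


# Demo with a tiny 4x4 I420 frame (4*4*3/2 = 24 bytes) and an in-memory stream.
demo_frame_size = 4 * 4 * 3 // 2
demo_stream = io.BytesIO(bytes(demo_frame_size * 3 + 5))  # 3 frames + 5 stray bytes
received = []
submitted = feed_frames(demo_stream, demo_frame_size, received.append)
print(submitted, len(received[0]))  # 3 24
```

The point is only that the producer of the frames is interchangeable: anything that can fill a buffer with YUV data in the expected layout can sit in front of the same encoder.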
The difficulty with something like an HDMI input is writing a driver for the HDMI chip and for however it is attached. Everything else (bar some tweaks) should come out in the wash.