Uncompressed HD video is over 1 Gbit/s. USB 2.0, even running at its theoretical max of 480 Mbit/s, is not fast enough. Unless there is some way to get the uncompressed HD straight into the GPU, I think a 700 MHz CPU is going to be pretty busy trying to handle a 125 Mbyte/s input.
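As a rough sanity check of those figures, here is a small sketch of the arithmetic. The 1920x1080 at 30 frames/s format and the three pixel formats are my assumptions (the post does not say which format is meant), and `raw_video_bitrate` is just an illustrative helper name.

```python
def raw_video_bitrate(width, height, fps, bits_per_pixel):
    """Return the uncompressed video bit rate in bits per second."""
    return width * height * fps * bits_per_pixel


# USB 2.0 high-speed signalling rate; real-world throughput is lower.
USB2_MAX_BPS = 480_000_000

# Assumed format: 1080p at 30 frames/s, at three common pixel depths.
for name, bpp in [("YUV 4:2:0", 12), ("YUV 4:2:2", 16), ("RGB 8:8:8", 24)]:
    bps = raw_video_bitrate(1920, 1080, 30, bpp)
    print(f"{name}: {bps / 1e6:.0f} Mbit/s = {bps / 8e6:.0f} Mbyte/s, "
          f"fits in USB 2.0: {bps <= USB2_MAX_BPS}")
```

Under these assumptions the 4:2:2 case comes out at roughly 1 Gbit/s, which is where the 125 Mbyte/s figure lands, and none of the three formats fits in USB 2.0.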
The following is all theoretical without the GPU SW support but it is interesting anyway:
You can use the CSI port, which goes up to ~4 Gbit/s. That should go a long way toward getting data into the board.
There is a limit to the speed at which the GPU H264 core can encode (or decode). But nothing prevents you from splitting up the picture and letting multiple Pis each handle a section.
Splitting up reduces compression efficiency, as it makes it difficult or impossible to use motion-compensated prediction from the other parts. There are also problems "stitching" the compressed bit streams together. Not impossible, but it's certainly more complex than the suggestion to split the picture across multiple Pis might make it appear.
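To make the splitting idea concrete, here is a minimal sketch of carving a frame into horizontal strips, one per Pi. It assumes strip boundaries should fall on 16-pixel macroblock rows so each encoder sees whole macroblocks; `split_into_strips` is a hypothetical helper, and it does nothing about the prediction or stitching problems described above.

```python
MB_SIZE = 16  # H.264 macroblock height in pixels


def split_into_strips(height, n_parts, mb=MB_SIZE):
    """Split `height` pixel rows into `n_parts` horizontal strips.

    Every boundary between strips falls on a macroblock row; only the
    last strip may end on a partial macroblock (e.g. 1080 is not a
    multiple of 16). Returns a list of (first_row, row_count) tuples.
    """
    mb_rows = -(-height // mb)          # ceiling division
    base, extra = divmod(mb_rows, n_parts)
    strips = []
    row = 0
    for i in range(n_parts):
        rows_mb = base + (1 if i < extra else 0)
        end = min(row + rows_mb * mb, height)
        strips.append((row, end - row))
        row = end
    return strips


# Assumed example: a 1080-line frame shared between four boards.
for first_row, count in split_into_strips(1080, 4):
    print(f"rows {first_row}..{first_row + count - 1} ({count} rows)")
```

Each strip would then be encoded independently, which is exactly why motion vectors cannot reach across strip boundaries.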
For the OP's original application, I think that using the H.264 encoder already in many HD camcorders would be a more fruitful approach.
The problem is that you wouldn't have hardware/software access to these encoders "already in many HD camcorders". Say I'd like to take the raw H.264 NALs and remux/republish them to a live streaming system using HTTP or RTMP: that may prove problematic. Some camcorders are RTSP-enabled, and we could transmux the outgoing stream, but they are far from the majority.
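For what "taking the raw NALs" would involve if you did get at the stream, here is a minimal sketch that splits an H.264 byte stream into NAL units. It assumes the encoder emits Annex B format (start-code-delimited), which is an assumption on my part and not something any particular camcorder is known to expose; `split_annexb` and `nal_type` are illustrative names.

```python
START_CODE = b"\x00\x00\x01"  # Annex B start code (may be preceded by extra zero bytes)


def split_annexb(stream):
    """Return the list of NAL units found in an Annex B byte stream."""
    nals = []
    i = stream.find(START_CODE)
    while i != -1:
        begin = i + len(START_CODE)
        nxt = stream.find(START_CODE, begin)
        end = len(stream) if nxt == -1 else nxt
        # A NAL unit never ends in 0x00, so trailing zeros belong to the
        # next (4-byte) start code or to padding and can be dropped.
        nal = stream[begin:end].rstrip(b"\x00")
        if nal:
            nals.append(nal)
        i = nxt
    return nals


def nal_type(nal):
    """Return nal_unit_type: the low 5 bits of the first NAL byte."""
    return nal[0] & 0x1F
```

A remuxer would typically look for types 7 (SPS), 8 (PPS) and 5 (IDR slice) before repackaging the units for the target container or protocol.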
Of course there are other HD H.264 compression chips out there, but not so many: MG2850/3500 from Mobilygen (now Maxim), DM368/6xxx from TI (the DaVinci platform), some media processors from Samsung, and that's almost the end of it.
The problem is that they cannot be sourced by the "common people" (like hobbyists and enthusiasts), and that software support is often scarce, sometimes scary and most of the time outdated.
Take the LeopardBoard 368 for instance (DaVinci DM368, https://www.leopardimaging.com/Leopardboard_368.html): it's not that expensive, but still 5 times the price of the R.Pi, and you "only" get a 460 MHz JZ ARM core and a media processor nowhere near the VC IV in terms of OpenGL-iness. Plus the TI SDK is kinda outdated (old Linux kernel) and the compressor API is... well, complicated (to say the least).
The same applies to the Maxim chips: you may try to reverse-engineer the Elgato Turbo 264HD USB stick (http://www.elgato.com/elgato/n.....t1.en.html), like the crusher264 project started to, but it would take many man-days just to understand the communication protocol and the intricacies of the H.264 compression settings, without any documentation from the manufacturer of course.
I think the R.Pi is a really interesting product because of its very low TCO (initially targeted at education, but also a real bargain for experimenters/hobbyists). If Broadcom were willing to give access to all those GPU bells and whistles in a quite simple and standardized way (say OpenMAX), that would just be terrific!
Fingers crossed, and keep up the good work!