So I got a batch of Model As in the post. "Low Power", they said. "Suitable for battery power", they said.
Well we'll just see about that.
First some theory. CMOS devices have two main kinds of power consumption - static and dynamic. Static is, as expected, the power the silicon draws when no clock cycles are being fed to it. This is mainly made up of subthreshold leakage (transistors leaking when off), which absolutely plagued desktop PCs until recently. This value is (roughly) a function of the square of the applied voltage VDD.
Dynamic consumption occurs when clock cycles change the state of the transistors on the silicon. CMOS transistor gates are essentially small capacitances that must be charged or discharged to turn them on or off - this gives rise to power consumption proportional to the square of the applied voltage and proportional to the clock frequency.
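To put a number on that, here is a rough Python sketch of the usual first-order model, P_dyn ~ C * V^2 * f. The 700MHz / 1.2V starting point and the 0.025V per overvolt step are my assumptions about the Pi's defaults, not something I measured, and the function name is made up for illustration.

```python
def dynamic_power_ratio(v_old, f_old, v_new, f_new):
    """Ratio of new to old dynamic power under P ~ C * V^2 * f.

    Activity factor and switched capacitance are assumed unchanged,
    so they cancel out of the ratio.
    """
    return (v_new / v_old) ** 2 * (f_new / f_old)


# Assumed defaults (not from any measurement): 700 MHz ARM clock, 1.2 V core.
V_DEFAULT = 1.2
F_DEFAULT = 700e6
STEP_V = 0.025  # assumed volts per overvolt step

half_clock = dynamic_power_ratio(V_DEFAULT, F_DEFAULT, V_DEFAULT, 350e6)
half_clock_undervolt = dynamic_power_ratio(
    V_DEFAULT, F_DEFAULT, V_DEFAULT - 8 * STEP_V, 350e6
)

print(f"350 MHz, same voltage: {half_clock:.2f}x dynamic power")
print(f"350 MHz, 8 steps down: {half_clock_undervolt:.2f}x dynamic power")
```

This only covers the dynamic part. Static consumption doesn't care about the clock, so the saving on the whole board will be smaller than these ratios suggest.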
There are three main silicon groups that can be poked/played with on the Pi. The DDR2 SDRAM chip has its own clock speed and voltage regulator. The GPU has its own set of clock speeds and the ARM has its own clock speed. The GPU and ARM share a voltage regulator.
To test the power consumption, I went with the reliable and simple current shunt resistor - I picked four 0.75 ohm power resistors out of my bucket of parts and soldered them in parallel, then precisely measured the resistance by putting 1 amp through them with a bench power supply - 0.185 ohms exactly. This went into the +ve lead of a supply from a small wall wart that then powered the Pi directly via the GPIO. Thus I could measure the current and voltage very precisely.
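For anyone wanting to repeat this, the arithmetic behind the shunt is just Ohm's law. A minimal Python sketch follows; the two voltage readings in it are hypothetical examples, not values from my logbook, and the function names are my own.

```python
def parallel_resistance(resistors):
    """Equivalent resistance of resistors wired in parallel (ohms)."""
    return 1.0 / sum(1.0 / r for r in resistors)


def shunt_current(v_shunt, r_shunt):
    """Current through the shunt from the voltage drop across it (Ohm's law)."""
    return v_shunt / r_shunt


def load_power(v_load, current):
    """Power delivered to the load, using the voltage on the Pi side of the shunt."""
    return v_load * current


R_NOMINAL = parallel_resistance([0.75] * 4)  # 0.1875 ohm on paper
R_MEASURED = 0.185                           # value measured at 1 A

# Hypothetical readings, for illustration only.
v_drop = 0.0209   # volts across the shunt
v_pi = 5.19       # volts at the Pi's 5V pin

i = shunt_current(v_drop, R_MEASURED)
print(f"nominal shunt: {R_NOMINAL:.4f} ohm")
print(f"current: {i:.3f} A, power: {load_power(v_pi, i):.3f} W")
```

Measuring the voltage on the Pi side of the shunt matters: the drop across the shunt itself shouldn't be counted as power used by the Pi.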
Results:
For comparison, the Model B 256M (default clocks):
Idle (console): 0.372A 1.89W
Idle (lightdm running): 0.375A 1.91W
Running scp from SD card: 0.413A 2.09W
Model A (default clocks):
Idle (console): 0.113A 0.587W
cat /dev/urandom | gzip: 0.153A 0.795W
Model A (ARM=350MHz overvolt=0):
Idle (console): 0.111A 0.576W
cat /dev/urandom | gzip: 0.133A 0.691W
Model A (ARM=350MHz overvolt=-8):
Idle (console): 0.106A 0.554W
cat /dev/urandom | gzip: 0.125A 0.652W
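The Model A figures above are easier to compare once the idle baseline is subtracted out. A quick Python sketch that does the sums; the numbers are copied straight from the three Model A result sets, while the helper name and the "implied supply voltage" (watts divided by amps) are my own additions.

```python
# (idle amps, idle watts, load amps, load watts) copied from the results above
readings = {
    "default clocks": (0.113, 0.587, 0.153, 0.795),
    "350MHz overvolt=0": (0.111, 0.576, 0.133, 0.691),
    "350MHz overvolt=-8": (0.106, 0.554, 0.125, 0.652),
}


def summarise(idle_a, idle_w, load_a, load_w):
    """Return (implied supply volts, load-minus-idle watts)."""
    supply_v = idle_w / idle_a
    return supply_v, load_w - idle_w


base_idle_w = readings["default clocks"][1]
base_load_w = readings["default clocks"][3]

for name, row in readings.items():
    supply_v, delta_w = summarise(*row)
    # Savings relative to the default-clock row, in percent.
    idle_saving = 100 * (1 - row[1] / base_idle_w)
    load_saving = 100 * (1 - row[3] / base_load_w)
    print(f"{name:20s} {supply_v:.2f} V  delta {delta_w:.3f} W  "
          f"idle -{idle_saving:.0f}%  load -{load_saving:.0f}%")
```

By this arithmetic the gap between idle and load roughly halves when the clock is halved, while the idle figure itself barely moves.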
At this point I started playing with the SDRAM clocks and voltages. Long story short: this does nothing to improve power consumption and most often the Pi simply wouldn't boot at reduced voltage settings. DDR2 inherently incorporates various power saving features into the silicon - there is nothing to be gained from underclocking it.
Then I turned my attention to the GPU - again a story of instability and in most cases ridiculous slowdown. It seems that the most to be gained from underclocking is about 20% reduction in idle power consumption using the cpufreq scaling.
Model A (ARM_min=350M overvolt_min=-6):
Idle (no X): 0.106A 0.551W
yes > yes.txt (SD write): 0.174A 0.902W
cat /dev/mmcblk0 > /dev/null (SD read): 0.154A 0.804W
My gut feeling is that the CPU/GPU isn't sleeping when idle. The rather small difference between idle and load power consumption points that way: my Core i5 desktop machine shows a much bigger gap between idle and load for CPU-bound activity.
I also note that the Model A's biggest single power user is RG1, the 5V to 3.3V regulator. It is a linear regulator, so the difference between input and output voltage is simply burned off as heat. The percentage of total consumption will vary depending on input voltage, but by calculation the device will account for between 30% and 37% of the Model A's total power usage. Replace this with a switching regulator >90% efficient and we could end up with around 0.35W idle power consumption on the Model A.
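The 30% to 37% range falls out of a one-line formula. Here is a Python sketch of the calculation, assuming (as a simplification) that all of the board's current passes through RG1 and ignoring the regulator's own quiescent current. The 4.75V and 5.25V limits are my assumed tolerance for a nominal 5V supply, and the function names are hypothetical.

```python
def linear_regulator_loss_fraction(v_in, v_out=3.3):
    """Fraction of input power burned in a linear regulator.

    Assumes all current flows through the regulator and ignores its
    quiescent current, so loss = (v_in - v_out) / v_in.
    """
    return (v_in - v_out) / v_in


def power_with_switcher(p_total, v_in, efficiency, v_out=3.3):
    """Estimated input power if the linear regulator were swapped for a
    switching regulator of the given efficiency (0..1)."""
    p_delivered = p_total * (1 - linear_regulator_loss_fraction(v_in, v_out))
    return p_delivered / efficiency


# 4.75 V and 5.25 V are the assumed limits of a nominal 5 V supply.
for v_in in (4.75, 5.0, 5.25):
    print(f"{v_in:.2f} V in: {100 * linear_regulator_loss_fraction(v_in):.1f}% lost")

# 0.551 W idle at roughly 5.2 V in, taken from the measurements above.
for eff in (0.90, 1.00):
    print(f"{eff:.0%} efficient: {power_with_switcher(0.551, 5.2, eff):.2f} W")
```

Treat the output as a ballpark: anything on the board fed straight from 5V doesn't go through RG1, so the real saving depends on how the current actually splits.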
tl;dr version: At idle with default settings, the Model A uses <33% of a Model B's power (0.59W vs 1.89W). Underclocking and undervolting give negligible benefit for a ridiculously slow Pi. The low-hanging fruit for power optimisation is and always will be RG1, the 5V to 3.3V regulator on the Pi, which will account for 30-37% of total power consumption regardless of version of Pi.