Fri Feb 03, 2012 3:17 pm
Burngate said:
One of the more common failures we had was electrolytic capacitors dying.
These days, the things I worry about most are:
1. Electrolytic capacitors, as Burngate mentions. If a chip runs hot, the electrolyte in nearby caps evaporates over time. This is especially true of cheap aluminum can caps. There are more expensive versions for high-temp applications, but manufacturers often use the cheap ones, preferring that consumers buy a new product every few years instead of every few decades. Failure mode is that voltage is no longer well regulated and non-reproducible errors occur, followed by hard failure. Mitigation is to not design products that run hot. Use ARM instead of x86, for example.
2. Ball grid arrays and other lead-less packages ("lead" as in "Leeds"). These are nasty packages from a reliability standpoint. All packages used to have pins that went through the PC board or "gull wings" where one end connected to the chip and the other was soldered onto the PC board. Leads are a great way to absorb the thermal expansion difference between the IC package and the PC board: if there is a mismatch, the lead flexes and the solder joints remain intact.
With a lead-less package, you're hoping the solder flexes if there is a thermal mismatch. The problem is, solder doesn't flex: it cracks. After enough thermal cycles, a few of those hundreds of balls start to have a poor connection. Ah, but which ones?
Solder ball fractures are of great concern for industrial temp products that live in the desert where the daily thermal cycle can be pretty bad. Outer space can be even worse: you may have a satellite that's behind Earth part of the time and fully exposed to the Sun part of the time.
To mitigate, leave your device on all the time to avoid thermal cycling. However, then your electrolytic caps boil away. Sigh. I've heard of hackers having success reflowing the balls on video game CPUs using a heat gun. Only recommended as an act of desperation.
3. Tin whiskers: this failure mode has been known for decades but is newly relevant and still not well understood. Some materials, particularly tin, sometimes engage in strange crystal growth in the form of tiny filaments -- called whiskers -- which can create shorts between PC board traces or IC pins. Until recently, tin whiskers were rare because the lead (Pb, pronounced "led") in the solder made the tin behave itself. However, nowadays you have RoHS requirements that restrict the use of lead, and tin whiskers are starting to become a serious problem. For example, they were recently implicated as a possible cause of unexpected Toyota acceleration. So would you prefer to die of lead poisoning from poorly-disposed-of electronics or in a car crash? Would you prefer to have a moderate amount of somewhat toxic lead-based electronics in a landfill or huge quantities of less toxic new electronics in a landfill?
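To put a rough number on point 1: a common rule of thumb for aluminum electrolytics is that expected life roughly doubles for every 10 C you run below the rated temperature. Here is a minimal Python sketch of that arithmetic. The function name and the 2000-hour / 85 C rating are my own example figures, not from any particular datasheet, and real parts also depend on ripple current and applied voltage, so treat it as a ballpark only.

```python
# Rule-of-thumb life estimate for an aluminum electrolytic capacitor.
# Assumption: life doubles for every 10 C below the rated temperature.
# The ratings used below are hypothetical example values.

HOURS_PER_YEAR = 24 * 365


def estimated_life_hours(rated_life_hours, rated_temp_c, operating_temp_c):
    """Scale the rated life by 2 for every 10 C of temperature headroom."""
    headroom_c = rated_temp_c - operating_temp_c
    return rated_life_hours * 2 ** (headroom_c / 10.0)


if __name__ == "__main__":
    # Hypothetical cheap cap rated 2000 hours at 85 C.
    for temp_c in (85, 65, 45):
        hours = estimated_life_hours(2000, 85, temp_c)
        years = hours / HOURS_PER_YEAR
        print(f"{temp_c} C: {hours:.0f} h (~{years:.1f} years powered on)")
```

With those example numbers, running the cap at 45 C instead of 85 C takes the estimate from 2000 hours to 32000 hours, which is why "don't run hot" matters so much.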