jasper77
Posts: 1
Joined: Thu Jan 03, 2019 10:05 pm

Changes do not survive power cycle

Thu Jan 03, 2019 10:54 pm

I work with an embedded system powered by a RasPi, and some units deployed in the field spontaneously go into a mode where new changes stay volatile. I say "spontaneously" because we're not there to see what causes it, and by the time we find out, any log data is gone. I suspect an unclean shutdown caused by power loss, but I can't be sure. At minimum I'm looking for a way to detect and report when it happens. Ideally, if someone here knows what causes this and how to avoid it, please tell me!

These systems presently use a Pi 2 with a microSDHC card (consumer-grade Samsung 16 GB), so there is no lock switch on the card.

One surefire way to know when this happens:
- touch a new file, confirm it exists
- reboot
- the new file doesn't exist

Another symptom is that a rotating log file will have a start date of a long time ago, the newest entries are current, and there's a large date skip somewhere in the middle. The newest entries are from memory and a power cycle loses them.
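The log symptom can be checked mechanically. Here is a minimal sketch, assuming the log timestamps have already been converted to epoch seconds, one per line, oldest first (the log format isn't given here, so that conversion is left out). `find_log_gap` is a hypothetical helper name.

```shell
#!/bin/sh
# find_log_gap THRESHOLD_SECONDS
# Reads epoch timestamps (one per line, oldest first) on stdin and prints
# "PREV NEXT DELTA" for every jump larger than the threshold.
# Exit status is 0 if at least one gap was found, 1 otherwise.
find_log_gap() {
    awk -v max="$1" '
        NR > 1 && ($1 - prev) > max { print prev, $1, $1 - prev; found = 1 }
        { prev = $1 }
        END { exit(found ? 0 : 1) }
    '
}
```

For example, `find_log_gap 86400 < timestamps.txt` would flag any jump of more than a day, where `timestamps.txt` is an assumed file produced from the rotating log.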

I'm trying to find a way to tell when it happens without rebooting, and without having to wait days for the log timestamp side effect to be detectable. Something scriptable.

The file system hasn't exactly gone read-only. I can create files. /proc/mounts shows everything as rw, so looking there isn't informative.

I don't think the SD card has bad sectors, because if I run "badblocks -v /dev/mmcblk0" it returns 0 errors. Also, I can write a new image to an afflicted SD card and it works fine after that (changes persist again).

I tried using dd with the understanding that it writes directly to disk, skipping caching, but it all seems normal:

Code:

$ sudo dd bs=4096 count=1 skip=200000 if=/dev/mmcblk0 of=xyz.bin
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00329162 s, 1.2 MB/s
$ sudo dd bs=4096 count=1 seek=200000 if=/dev/urandom of=/dev/mmcblk0
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00170464 s, 2.4 MB/s
$ sudo dd bs=4096 count=1 skip=200000 if=/dev/mmcblk0 of=abc.bin
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00352715 s, 1.2 MB/s
$ sudo dd bs=4096 count=1 seek=200000 if=xyz.bin of=/dev/mmcblk0
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.003651 s, 1.1 MB/s
$ cmp xyz.bin abc.bin
xyz.bin abc.bin differ: byte 1, line 1
... though after a reboot, abc.bin and xyz.bin are gone.
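One caveat about this test: plain dd on a block device or a file still goes through the Linux page cache, so the read-back can be served from RAM. GNU dd only bypasses the cache when asked to with `oflag=direct` and `iflag=direct`. Below is a minimal sketch of a round-trip using those flags on a scratch file. The helper name `direct_roundtrip`, its arguments, and the scratch path in the usage line are assumptions for illustration. It only takes the kernel cache out of the picture. The card's own controller may still answer from its internal cache, so a pass does not prove the data reached flash.

```shell
#!/bin/sh
# direct_roundtrip TARGET [OFLAG] [IFLAG]
# Writes one random 4 KiB block to TARGET (truncating it) and reads it back,
# asking GNU dd to bypass the page cache in both directions by default.
# Returns 0 if the block read back matches, 1 if it differs, 2 on I/O error.
# OFLAG/IFLAG can be overridden because some filesystems reject O_DIRECT.
# Note: POSIX sh has no "local", so these variables are global.
direct_roundtrip() {
    target=$1
    oflag=${2:-direct}
    iflag=${3:-direct}
    work=$(mktemp -d) || return 2
    status=2
    if dd if=/dev/urandom of="$work/want.bin" bs=4096 count=1 2>/dev/null &&
       dd if="$work/want.bin" of="$target" bs=4096 count=1 \
          oflag="$oflag" conv=fsync 2>/dev/null &&
       dd if="$target" of="$work/got.bin" bs=4096 count=1 \
          iflag="$iflag" 2>/dev/null
    then
        if cmp -s "$work/want.bin" "$work/got.bin"; then
            status=0
        else
            status=1
        fi
    fi
    rm -rf "$work"
    return "$status"
}
```

Usage might look like `direct_roundtrip /home/pi/scratch.bin && echo "read-back matches"`. Pointing it at a scratch file rather than at /dev/mmcblk0 avoids overwriting live filesystem blocks.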

On one afflicted card, df says /root is 27% used, and /boot is 33% used. Everything else is either 0, 1, or 2% used. This is typical.

Simply making the SD card read-only isn't an option on its own: the application uses a SQLite DB, and the application log files are also needed. I'm considering making the SD card read-only and moving the writable data to a mounted USB drive, but without knowing the root cause I can't be sure that would really help.

Can anyone shed some light on what might be going on, or suggest other things I could try?

Joel_Mckay
Posts: 289
Joined: Mon Nov 12, 2012 10:22 pm

Re: Changes do not survive power cycle

Fri Jan 04, 2019 5:21 pm

For our club machinery, we use a ramdrive and overlayfs:
https://sourceforge.net/projects/microm ... -pi/files/
We also map /home to a writable partition for everything that needs to persist.

Professionally, the logger equipment I designed used SLC flash, large journals (150 MB on the above), and a large circular buffer for data.
This ensured that each 4 KiB write would likely be wear-leveled across the entire disk over a month, and the OS was read-only so the system would boot even if the memory was failing.

SLC flash is still manufactured, and is good for over 100k writes.
MLC flash ("Sandisk ultra", and most others) is good for about 10k writes.
TLC flash is only good for under 5k writes, and in my experience can fail in under 1.4k writes.

Cards like the "Sandisk extreme pro" or "Sandisk endurance" usually have internal wear-leveling strategies in their controllers to try to stretch the wear-out time. However, a power outage during a write can still damage the card, and only certain full-sized PC SSDs have built-in power-loss protection (some use supercapacitors). We actually added a miniature UPS to our design, as it was less costly than constantly replacing memory that gave no indication when a write was unsuccessful. Note that verifying writes often does not work, because of sector caching in the card's controller.

Newer SD cards fall back to a read-only mode when they start to fail, and I suspect that is what you are experiencing.

Sometimes I get flak on this forum, as people rarely fully understand what I am trying to describe...
But I have collected data on roughly 50 types/models of cards deployed into several hundred units since around 2013.

You may want to look at how the SQLite library you link against handles its write commits, and at its temporary and in-memory database modes.
;-)
There are also dozens of other OS tweaks you can do to get things to behave.

Best of luck,
J

dominic03
Posts: 81
Joined: Fri Dec 21, 2018 1:50 am

Re: Changes do not survive power cycle

Tue Jan 08, 2019 12:57 pm

I had something slightly similar, where my config.txt was reverting changes after every reboot.

What happened was the partitions were mapped wrong so a USB drive was mapped to /boot after booting. This could be your problem, but maybe not.

I fixed it with a reinstall. I feel like it had something to do with fstab drive mapping, so check your fstab.
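A quick way to do that check is to compare what fstab asks for with what is actually mounted. Both /etc/fstab and /proc/mounts keep the device in the first field, the mount point in the second, and the options in the fourth, so one small helper can read either. This is a minimal sketch, and `mount_entry` is a hypothetical helper name. Note that fstab may name the device by PARTUUID or UUID while /proc/mounts shows a /dev node, so the two may need resolving (for instance with blkid) rather than comparing as strings.

```shell
#!/bin/sh
# mount_entry MOUNTPOINT [FILE]
# Prints "DEVICE OPTIONS" for MOUNTPOINT from an fstab-style FILE
# (default: /proc/mounts). Comment lines and short lines are skipped.
# If several entries match, the last one wins, since a later mount
# shadows an earlier one on the same mount point.
# Exit status is 0 if an entry was found, 1 otherwise.
mount_entry() {
    awk -v mp="$1" '
        /^[ \t]*#/ || NF < 4 { next }
        $2 == mp { dev = $1; opts = $4; found = 1 }
        END { if (found) print dev, opts; exit(found ? 0 : 1) }
    ' "${2:-/proc/mounts}"
}
```

Running `mount_entry /boot /etc/fstab` and then `mount_entry /boot` side by side would show whether something other than the SD card's boot partition ended up on /boot.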
My setup:
Raspberry Pi 3b+ running Raspbian Buster, firmware date July 9 2019 (via PINN)
Kingston 32 GB class 10 (U1) SDHC card
Lexar 16GB flash drive for backups
pi-top 3 with drivers and pi-topHUB 2.0

dominic03
Posts: 81
Joined: Fri Dec 21, 2018 1:50 am

Re: Changes do not survive power cycle

Tue Jan 08, 2019 12:58 pm

Also try SysRq + S (emergency sync) before shutdown.

pws
Posts: 91
Joined: Mon Apr 11, 2016 4:16 pm

Re: Changes do not survive power cycle

Tue Jan 08, 2019 4:08 pm

I always issue a "sudo sync" before any reboot or shutdown. A little insurance....

rpdom
Posts: 14992
Joined: Sun May 06, 2012 5:17 am
Location: Chelmsford, Essex, UK

Re: Changes do not survive power cycle

Tue Jan 08, 2019 7:17 pm

pws wrote:
Tue Jan 08, 2019 4:08 pm
I always issue a "sudo sync" before any reboot or shutdown. A little insurance....
sudo is not required for sync. Any user can run it as it can do no harm.

Also, sync happens during shutdown anyway. It has been decades since running it manually was necessary.

Also, sync will not affect the SD card's internal cache and controller. Waiting until the shutdown sequence is completed (the ten flashes of the ACT LED) will do that.

timrowledge
Posts: 1275
Joined: Mon Oct 29, 2012 8:12 pm
Location: Vancouver Island

Re: Changes do not survive power cycle

Tue Jan 08, 2019 8:37 pm

I’ve seen similar problems that ended up being the SD card having ‘died’. Even to the extent of installing fairly large packages, running for hours, and then on a reboot - boom, back to the state before installing.
Making Smalltalk on ARM since 1986; making your Scratch better since 2012

dominic03
Posts: 81
Joined: Fri Dec 21, 2018 1:50 am

Re: Changes do not survive power cycle

Tue Jan 08, 2019 9:51 pm

Do SYSRQ + U to remount read-only, sync, and try to create a file. Does it give an error?
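On a headless unit without a keyboard, the same SysRq actions can be requested by writing the key letter to /proc/sysrq-trigger as root. This is a minimal sketch. The `sysrq` wrapper name and its optional second argument (there so the function can be pointed at an ordinary file for a dry run) are assumptions. Be aware that `u` remounts every filesystem read-only, so use it only on a unit you are prepared to reboot.

```shell
#!/bin/sh
# sysrq KEY [TRIGGER_FILE]
# Sends a SysRq action by writing its letter to the kernel trigger file.
# Only the two actions discussed in this thread are accepted:
#   s = emergency sync, u = emergency remount read-only.
# Returns 2 for any other key.
sysrq() {
    case $1 in
        s|u) ;;
        *) echo "sysrq: unsupported key '$1'" >&2; return 2 ;;
    esac
    printf '%s\n' "$1" > "${2:-/proc/sysrq-trigger}"
}
```

From a root shell, `sysrq u` followed by `touch /home/pi/probe` (an assumed path) should fail with "Read-only file system" if the remount took effect. If the touch still succeeds, the writes are going somewhere other than where the mount table suggests.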
