geekman92
Posts: 14
Joined: Wed Sep 12, 2012 10:18 am

Block size & lots of small files

Wed Jan 20, 2016 1:50 pm

I am using an 8GB card running Raspbian Jessie with the filesystem expanded using raspi-config. I have a program that may create lots (~1 million) of small files (~800 bytes each). Let's round that up to 1kB per file, which by my calculations is about 1GB for all 1 million files (1,024 × 1,000,000 = 1.024GB).

One of the tests for my program is to try and create all 1M files. After around 350,000 had been created, the test failed because the filesystem was full.

I had a look at the files using du and noticed that an 841B file seemed to be taking up 4096B on disk:

Code: Select all

$ du -B1 1.bin
4096    1.bin

$ du -b 1.bin
841     1.bin
I guess this has something to do with the block size, which according to stat is 4096B (I believe this is also the default ext4 block size, which adds up):

Code: Select all

$ stat 1.bin
  File: ‘1.bin’
  Size: 841             Blocks: 8          IO Block: 4096   regular file
Device: b302h/45826d    Inode: 260860      Links: 1
So my questions are:
  1. Is it the fact that the block size is 4096B that is causing the filesystem to fill up faster than I expect?
  2. Can I change the block size with Raspbian installed, or do I need to recreate the image? If so, how?
  3. Will decreasing the block size from 4096B to, say, 256B or 512B have a detrimental effect on the rest of the OS?
Thanks in advance for any help.
:)

paulvha
Posts: 26
Joined: Wed Jan 06, 2016 3:28 pm

Re: Block size & lots of small files

Wed Jan 20, 2016 2:13 pm

Use df to check the space available. One could reduce the block size (4096 is said to be the maximum), but that needs to be done when the filesystem is created, with something like: mkfs.ext4 -j -b 1024 /dev/xxxx.

Your issue could also be a lack of inodes. An inode is an entry in a table on each filesystem that holds a file's metadata and the location of its data within the filesystem. Check with df -i. The number of inodes is normally determined by the size of the filesystem. With the -i option of mkfs.ext4 you can influence that calculation (the bytes-per-inode ratio), while with -I (capital i) you can set the size of each inode. (See man mkfs.ext4.)
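
If your program wants to check that headroom itself before it writes, Python's os.statvfs exposes the same counters that df -i reads (a minimal sketch; the path is illustrative):

Code: Select all

import os

st = os.statvfs("/")                  # filesystem holding /
print("inodes total:", st.f_files)    # the df -i "Inodes" column
print("inodes free: ", st.f_ffree)    # the df -i "IFree" column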

jojopi
Posts: 3078
Joined: Tue Oct 11, 2011 8:38 pm

Re: Block size & lots of small files

Wed Jan 20, 2016 2:27 pm

The allowed block sizes are 1024, 2048, and 4096 bytes, but nobody has used anything other than 4096 for many years. You cannot change the block size without reformatting the filesystem.

It will be easier to fix your application not to use such tiny files.

The page size of the MMU is 4096 bytes, so using smaller files will probably waste RAM in the buffer cache. You will also waste a lot of space on directories and inode tables, and everything will be slower.

Also bear in mind that the native block size of the SD card will be no smaller than 4096 bytes, so writes smaller than that will require read-modify-write cycles and increase wear.
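
To put rough numbers on the waste (a back-of-the-envelope sketch, ignoring the inode tables and directories mentioned above), every file is rounded up to a whole number of 4096-byte blocks:

Code: Select all

import math

BLOCK = 4096        # ext4 block size
FILE_SIZE = 841     # bytes of real data per file
N_FILES = 1000000

per_file = math.ceil(FILE_SIZE / BLOCK) * BLOCK    # 4096 bytes allocated each
print("data:      %.2f GB" % (N_FILES * FILE_SIZE / 1e9))   # ~0.84 GB
print("allocated: %.2f GB" % (N_FILES * per_file / 1e9))    # ~4.10 GB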

experix
Posts: 204
Joined: Mon Nov 10, 2014 7:39 pm
Location: Coquille OR
Contact: Website

Re: Block size & lots of small files

Wed Jan 20, 2016 5:01 pm

Maybe it is better to use a USB stick or hard drive, with a filesystem constructed in the best way for lots of little files. Putting them there will save wear on your SD card and make the files much easier to archive and to access from another machine (by moving the drive there), should you want to do that.

geekman92
Posts: 14
Joined: Wed Sep 12, 2012 10:18 am

Re: Block size & lots of small files

Fri Jan 22, 2016 3:24 pm

Thank you for all your replies, they were useful. I will rethink the design of the system. :)

Heater
Posts: 12731
Joined: Tue Jul 17, 2012 3:02 pm

Re: Block size & lots of small files

Fri Jan 22, 2016 4:05 pm

What is in all the small files? How fast are they generated? How fast do they need to be read? Is this some kind of data logging system? In time series order?

I'm thinking it may be worth storing this data as records in a database.

MySQL is the obvious first choice.

If you don't want to mess with SQL there are many other databases, some designed for storage of time series data.

Basically, a database will keep all your little data records in one file, or perhaps a few, all neatly indexed for fast reading.

pksato
Posts: 295
Joined: Fri Aug 03, 2012 5:25 pm
Location: Brazil

Re: Block size & lots of small files

Fri Jan 22, 2016 5:16 pm

What kind of information is stored in these files?
Could each one be a record in a database?
I suggest using SQLite: there is no need to run a daemon, and all the main programming languages can use SQLite.
Another option is to create a large file, format it as a filesystem with a small block size (with compression), and mount it to store these small files.
Or use a compressed container like zip or rar, which supports adding and removing files.
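
A minimal sketch of the SQLite route (table and column names are illustrative, not from any particular schema):

Code: Select all

import sqlite3

conn = sqlite3.connect("packets.db")
conn.execute("PRAGMA synchronous = FULL")   # fsync on every commit
conn.execute("CREATE TABLE IF NOT EXISTS packets "
             "(id INTEGER PRIMARY KEY, payload BLOB)")

# queue one packet
conn.execute("INSERT INTO packets (payload) VALUES (?)", (b"\x00" * 841,))
conn.commit()

# drain the queue oldest-first, deleting each row once it has been sent
for rowid, payload in conn.execute(
        "SELECT id, payload FROM packets ORDER BY id").fetchall():
    # send(payload) to the server here ...
    conn.execute("DELETE FROM packets WHERE id = ?", (rowid,))
    conn.commit()   # one commit per packet: a power cut loses at most one delete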

jahboater
Posts: 4465
Joined: Wed Feb 04, 2015 6:38 pm

Re: Block size & lots of small files

Fri Jan 22, 2016 7:01 pm

I think a design involving a million small files is inherently flawed.
Modern file systems such as ext4 can cope, I am sure, but it's bound to hit some limit somewhere. What if the number of inodes has been restricted, or the user tries something daft like FAT16? There may also be smaller limits on the number of files in a directory.
If you have done this to avoid using a database, you could instead just have a single 1GB file with a million fixed-length records. Use lseek(), read() and write() to access individual records. Very simple.
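
A sketch of that in Python (record size, count and file name are illustrative):

Code: Select all

import os

RECORD = 1024       # fixed record length in bytes
N = 1000000

# pre-allocate the full-size file once
if not os.path.exists("records.bin"):
    with open("records.bin", "wb") as f:
        f.truncate(RECORD * N)          # 1 GB, created sparse

def write_record(f, n, data):
    assert len(data) <= RECORD
    f.seek(n * RECORD)                  # jump straight to record n
    f.write(data.ljust(RECORD, b"\x00"))

def read_record(f, n):
    f.seek(n * RECORD)
    return f.read(RECORD)

with open("records.bin", "r+b") as f:
    write_record(f, 42, b"one packet")
    print(read_record(f, 42)[:10])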

jackokring
Posts: 816
Joined: Tue Jul 31, 2012 8:27 am
Location: London, UK
Contact: ICQ

Re: Block size & lots of small files

Fri Jan 22, 2016 7:09 pm

You could also tar.xz your files in groups of about 100, making 10,000 archives of roughly 100kB each (or less), and flush the 'cache' of decompressed files to keep it as small as need be.
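
For what it's worth, Python's tarfile module can do that bundling directly (a sketch; the file names are illustrative):

Code: Select all

import tarfile

# bundle ~100 small files into one compressed archive
with tarfile.open("bundle_0000.tar.xz", "w:xz") as tar:
    for i in range(100):
        tar.add("%d.bin" % i)

# later, pull one file back out of the bundle
with tarfile.open("bundle_0000.tar.xz", "r:xz") as tar:
    data = tar.extractfile("0.bin").read()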
Pi[NFA]=B256R0USB CL4SD8GB Raspbian Stock.
Pi[Work]=A+256 CL4SD8GB Raspbian Stock.
My favourite constant 1.65056745028

DougieLawson
Posts: 35381
Joined: Sun Jun 16, 2013 11:19 pm
Location: Basingstoke, UK
Contact: Website Twitter

Re: Block size & lots of small files

Sat Jan 23, 2016 7:59 am

jahboater wrote:I think a design involving a million small files is inherently flawed.
That just smacks of something where a million rows in an SQL database is going to be a better design.
Note: Having anything remotely humorous in your signature is completely banned on this forum.

Any DMs sent on Twitter will be answered next month.

This is a doctor free zone.

geekman92
Posts: 14
Joined: Wed Sep 12, 2012 10:18 am

Re: Block size & lots of small files

Mon Jan 25, 2016 9:58 am

Hey sorry for the late reply, busy weekend.

Okay so let me describe what I'm trying to achieve and why I used individual files.

My program sends packets of information to a server throughout its operation. The RPi is in a remote location and is connected to the internet via a 3G dongle so its connection may bounce up and down. The power to the RPi may be lost every now and again.

Now I don't want to lose any of these packets, so when the server is unavailable I cache them to individual files as Python pickles.
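
Something like this, simplified (the fsync and atomic rename are the sort of care I'm trying to take against power loss; paths and names are illustrative):

Code: Select all

import os
import pickle

CACHE_DIR = "/var/cache/packets"        # illustrative location

def cache_packet(seq, packet):
    # write to a temp name first, fsync, then rename, so a power cut
    # can never leave a half-written pickle under the final name
    tmp = os.path.join(CACHE_DIR, ".%d.tmp" % seq)
    dst = os.path.join(CACHE_DIR, "%d.pkl" % seq)
    with open(tmp, "wb") as f:
        pickle.dump(packet, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, dst)                # atomic on the same filesystem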
jahboater wrote: you could instead just have a single 1GB file with a million fixed-length records. Use lseek(), read() and write() to access individual records. Very simple.
The reason I used individual files rather than one large file is that the packets need to be read out of the file one by one, sent to the server and then deleted. As I understand it, doing this requires you to read the whole file into memory, read the first line, remove it and then write the whole file back again, which would be pretty slow.

I had thought about using a database, and I even did a few tests with an SQLite database. However, doesn't a database store operations in memory before writing them to disk? If the RPi happens to lose power in that time, won't the last few operations that have been stored in memory but not yet written to disk be lost?
jahboater wrote: What if the number of inodes has been restricted, or the user tries something daft like FAT16?
The "user" doesn't get to change anything on the RPi, it is configured and installed by me so this isn't an issue as I have total control of the hardware and software and how they operate together.

I'm fully aware that there are gaps in my knowledge and that I have made assumptions when designing this program. Please correct me if anything I have said is incorrect; that way I can keep learning. Thanks :)

rpdom
Posts: 14483
Joined: Sun May 06, 2012 5:17 am
Location: Chelmsford, Essex, UK

Re: Block size & lots of small files

Mon Jan 25, 2016 10:07 am

If the power is lost while the Pi is running, all bets are off. You can't guarantee that any part of a file has actually been written to the card or that the card itself isn't doing any internal operation that might wreck it if power is lost.

You'd be better off getting some form of battery back-up for the Pi and doing a controlled shutdown after power is lost.

jahboater
Posts: 4465
Joined: Wed Feb 04, 2015 6:38 pm

Re: Block size & lots of small files

Mon Jan 25, 2016 10:45 am

geekman92 wrote: The reason I used individual files rather than one large file is that the packets need to be read out of the file one by one, sent to the server and then deleted. As I understand it, doing this requires you to read the whole file into memory, read the first line, remove it and then write the whole file back again, which would be pretty slow.
No, I presumed you would need some sort of simple index anyway, and that could mark a line as unused. There would never be any reason to read the entire file. See "man 2 lseek".

For files in general, see "man 2 open"; you could always open a file with O_SYNC:
    O_SYNC  The file is opened for synchronous I/O. Any write(2)s on the resulting file descriptor will block the calling process until the data has been physically written to the underlying hardware.
See also "man 2 fsync" or fdatasync.
I am sure you can do this with a database as well.
These are not a complete solution for power cuts (see previous post) but they might help a little.
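
The same flag is reachable from Python too, for what it's worth (a minimal sketch):

Code: Select all

import os

# every write blocks until the data has reached the device
fd = os.open("record.bin", os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o644)
os.write(fd, b"one packet of data")
os.close(fd)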

Edit:
Another simple idea: use a ring buffer. No need for any index, nor any need to delete lines.
Have a "headptr" and a "tailptr" (simple integers). Each new line is written at the headptr offset; when the write is complete and successful, increment the headptr (add 1024 to it). When the network is available, read the line pointed to by the tailptr, send it over the network to the server and, when the ACK is received, increment the tailptr. The acknowledgement should only be sent by the server once it has successfully committed the new data. When a pointer reaches the end of the file (the 1,000,000th record), reset it to 0 to make the ring. If the tailptr == the headptr, the buffer is empty (or possibly full).
Pre-allocate the file with "dd if=/dev/zero of=bigfile bs=1024 count=1000000" and you will never have any running-out-of-disk-space issues.
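
A sketch of that ring buffer in Python (persisting headptr/tailptr across restarts is left out; they could live in a small header block at the front of the file):

Code: Select all

import os

RECORD = 1024
N = 1000000             # records in the pre-allocated file

class Ring(object):
    def __init__(self, path):
        self.f = open(path, "r+b")      # file made in advance with dd
        self.head = 0                   # next free slot
        self.tail = 0                   # oldest unsent record

    def push(self, data):
        assert len(data) <= RECORD
        self.f.seek(self.head * RECORD)
        self.f.write(data.ljust(RECORD, b"\x00"))
        self.f.flush()
        os.fsync(self.f.fileno())       # on disk before the pointer moves
        self.head = (self.head + 1) % N

    def pop(self):
        if self.tail == self.head:      # empty (or full, as noted above)
            return None
        self.f.seek(self.tail * RECORD)
        return self.f.read(RECORD)

    def ack(self):                      # call once the server has committed
        self.tail = (self.tail + 1) % N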

r3d4
Posts: 967
Joined: Sat Jul 30, 2011 8:21 am
Location: ./

Re: Block size & lots of small files

Mon Jan 25, 2016 4:13 pm

Using some kind of FUSE binding could be another option.
I have messed about a bit using FUSE bindings to mount a JSON file (e.g. file.json) as a filesystem, so it is possible!
This was set up to write/save on every file change, to try to avoid losing data if power is lost.
That said, I mostly combined examples and the odd library to get something that worked, with a few simple tests on a few files/dirs and some symlinks (I had to write/add the symlink handler :shock: and a few other missing pieces).
Anyway, my tests/experiments were nothing close to 1M files!

cpc464
Posts: 206
Joined: Tue Jul 08, 2014 5:10 pm
Contact: Website

Re: Block size & lots of small files

Mon Jan 25, 2016 10:28 pm

I just checked the Raspbian Jessie (Lite) image, and the number of inodes in / is set to 247,296, so you can't create a million files there.

$ df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/root 247296 68136 179160 28% /
Unix engineer since 1989

asandford
Posts: 1997
Joined: Mon Dec 31, 2012 12:54 pm
Location: Waterlooville

Re: Block size & lots of small files

Mon Jan 25, 2016 11:05 pm

cpc464 wrote:I just checked the Raspbian Jessie (Lite) image, and the number of inodes in / is set to 247,296, so you can't create a million files there.

$ df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/root 247296 68136 179160 28% /
For testing backup solutions I've often had to create a million-plus files. The easiest way I've found is to create a filesystem with a thousand files in it, copy those a thousand times, and then copy the result as many times as required.
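
In Python the same trick looks something like this (a sketch; paths are illustrative):

Code: Select all

import os
import shutil

os.makedirs("seed")
for i in range(1000):                   # a thousand small files
    with open("seed/%d.bin" % i, "wb") as f:
        f.write(b"\x00" * 841)

for j in range(1000):                   # a thousand copies -> a million files
    shutil.copytree("seed", "bulk/%d" % j)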

malicious
Posts: 96
Joined: Thu Jul 24, 2014 10:07 pm
Location: USA

Re: Block size & lots of small files

Tue Jan 26, 2016 3:09 am

cpc464 wrote:I just checked the Raspbian Jessie (Lite) image, and the number of inodes in / is set to 247,296, so you can't create a million files there.

$ df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/root 247296 68136 179160 28% /
Is that with a 4GB SD card? ext4 defaults to one inode per 16 kilobytes so larger partitions have more inodes. For comparison, Raspbian Jessie's root file system on a 64GB card has about 3.7M inodes.
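
That default ratio is easy to sanity-check (roughly, ignoring the journal and reserved blocks):

Code: Select all

bytes_per_inode = 16 * 1024        # ext4 default
usable = 62 * 10**9                # a "64GB" card is roughly 62 GB usable
print(usable // bytes_per_inode)   # ~3.8 million inodes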

jahboater
Posts: 4465
Joined: Wed Feb 04, 2015 6:38 pm

Re: Block size & lots of small files

Tue Jan 26, 2016 5:01 am

malicious wrote: Is that with a 4GB SD card? ext4 defaults to one inode per 16 kilobytes so larger partitions have more inodes. For comparison, Raspbian Jessie's root file system on a 64GB card has about 3.7M inodes.
Or the user may set any value when formatting (mkfs.ext4):
    -N number-of-inodes
        Overrides the default calculation of the number of inodes that should be reserved for the filesystem (which is based on the number of blocks and the bytes-per-inode ratio). This allows the user to specify the number of desired inodes directly.

cpc464
Posts: 206
Joined: Tue Jul 08, 2014 5:10 pm
Contact: Website

Re: Block size & lots of small files

Tue Jan 26, 2016 11:56 am

As I say: it was an image, so the size of my SD card and/or mkfs options are not relevant. That was all decided by whoever built the image. It is the latest (Nov 2015) image from the raspberrypi.org download pages.

Creating a million files should not be difficult, but you'll need to choose a file system with at least a million free inodes. Slightly off topic, but I wrote an article about having 3 million files in one folder. It doesn't cover creating the files though, just listing and eventually deleting them. Sorry to advertise my own blog, but here it is.

http://unixetc.co.uk/2012/05/20/large-d ... s-to-hang/
Unix engineer since 1989

jojopi
Posts: 3078
Joined: Tue Oct 11, 2011 8:38 pm

Re: Block size & lots of small files

Tue Jan 26, 2016 12:09 pm

cpc464 wrote:As I say: it was an image, so the size of my SD card and/or fsck options are not relevant. That was all decided by whoever built the image. It is the latest (Nov 2015) image from the raspberrypi.org download pages.
No, the Foundation's 2015-11-21 "lite" image has 85008 inodes as distributed, because it is only about 1.4GB in size. The number you quoted must have been after you expanded the filesystem to fit a 4GB card.

The bigger the card, the more groups are added to the filesystem by resize2fs, and the more inodes you get.

cpc464
Posts: 206
Joined: Tue Jul 08, 2014 5:10 pm
Contact: Website

Re: Block size & lots of small files

Tue Jan 26, 2016 3:46 pm

Whoops. Yes, that image was expanded when the server was built last year. :oops: So the inodes increased from 85,008 to 247,296. It was expanded to 4GB manually (the card is actually considerably bigger).
Unix engineer since 1989
