thekeywordgeek
Posts: 105
Joined: Fri May 18, 2012 1:48 pm
Contact: Website

Most efficient filesystem for huge file trees

Sat Aug 11, 2012 12:33 pm

I have a couple of questions for any elite filesystem geeks who may inhabit this forum; this is the first one.

I have a project that crunches huge amounts of text into keyword phrases and stores collocate and frequency data as a huge tree of tens of thousands of directories containing small JSON files. My sample data set is a fraction of my whole corpus and produced just under 100MB of files.
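
In Python terms the writing side looks something like this (a minimal sketch only; the paths and names are invented for illustration):

Code: Select all

import json
from pathlib import Path

ROOT = Path("corpus-tree")  # hypothetical root of the precomputed tree

def store_phrase(phrase, collocates):
    # one directory level per word: 'new york' -> corpus-tree/new/york/data.json
    node = ROOT.joinpath(*phrase.lower().split())
    node.mkdir(parents=True, exist_ok=True)
    (node / "data.json").write_text(json.dumps(collocates))

store_phrase("new york", {"city": 42, "times": 17})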

On a FAT-formatted disk, that 100MB took up about 350MB of disk space due to filesystem overhead. I expect to end up with many gigabytes of data, so inefficiency on this scale will be a very significant issue.

I'm guessing FAT isn't the most efficient filesystem; this was on a disk that started life in Windows-land.

So, what's the most efficient filesystem for this kind of file tree, that is readily accessible to Debian on a Raspberry Pi?
I make and sell radio kits for the Raspberry Pi and more.
http://shop.languagespy.com/

bashhacker
Posts: 5
Joined: Fri Aug 10, 2012 1:08 am

Re: Most efficient filesystem for huge file trees

Sun Aug 12, 2012 12:24 am

What about decorating the file names instead of using directories? Or maybe saving a list of the file names to a text file and grepping it? I hate not answering a question, but it sounds like maybe a proper database is in order? I'm not sure; these are just off the top of my head... sounds like an exciting problem.
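
For the flat-name idea, something like this, off the top of my head (all names invented, just a sketch):

Code: Select all

import json
from pathlib import Path

FLAT = Path("corpus-flat")          # one flat directory instead of a tree
FLAT.mkdir(exist_ok=True)

def flat_name(phrase):
    # encode the whole phrase in the file name: 'new york' -> 'new_york.json'
    return "_".join(phrase.lower().split()) + ".json"

(FLAT / flat_name("new york")).write_text(json.dumps({"city": 42}))

# keep a grep-able index of every file name
with open(FLAT / "index.txt", "a") as idx:
    idx.write(flat_name("new york") + "\n")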

jackokring
Posts: 816
Joined: Tue Jul 31, 2012 8:27 am
Location: London, UK
Contact: ICQ

Re: Most efficient filesystem for huge file trees

Sun Aug 12, 2012 12:52 am

For ease of use, a single-table database is maybe best.
A custom filesystem is not easy to write; squashfs may be worth a look, but I think it needs a pre-built structure to squash.

Cheers Jacko
Pi[NFA]=B256R0USB CL4SD8GB Raspbian Stock.
Pi[Work]=A+256 CL4SD8GB Raspbian Stock.
My favourite constant 1.65056745028

thekeywordgeek
Posts: 105
Joined: Fri May 18, 2012 1:48 pm
Contact: Website

Re: Most efficient filesystem for huge file trees

Sun Aug 12, 2012 8:18 am

Sadly a database is not the solution. I arrived at the filesystem approach after even an uber-optimised MySQL schema and installation ran out of puff due to the size of the dataset involved. It had reached the point at which queries took several minutes even on fairly cutting-edge x86 hardware.

Precomputing my queries into the filesystem is much more efficient in time at the expense of disk space. But disk space is cheap these days and processor time isn't, so I'm prepared to put up with that. The huge space overhead of FAT, though, seemed like a bit much. Hence the question: would another FS be more space-efficient?

More background here:
http://thekeywordgeek.blogspot.co.uk/20 ... rofit.html

And something to play with that uses the same technique:
http://thekeywordgeek.blogspot.co.uk/20 ... ature.html
I make and sell radio kits for the Raspberry Pi and more.
http://shop.languagespy.com/

rurwin
Forum Moderator
Posts: 4257
Joined: Mon Jan 09, 2012 3:16 pm
Contact: Website

Re: Most efficient filesystem for huge file trees

Sun Aug 12, 2012 8:36 am

I am among those saying "surely there is a better way", but if you are set on using a filesystem, then just about anything will be better than FAT, which uses larger blocks as the disk size increases. Or use FAT with as small a disk as will hold the data.

It may be worth checking the Linux man pages for the various filesystems' mkfs commands to see if you can set the block size for any of them. Then set the block size to the smallest that will hold all but the largest of the JSON files. For example, see "man mkfs.ext3" and read about the -b and -T flags.
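
Before reformatting anything, you could estimate what a smaller block size would save with a rough sketch like this (it counts only file data rounded up to whole blocks and ignores inode and directory overhead; "corpus-tree" is a stand-in for the real tree):

Code: Select all

import os

def allocated(path, block):
    # every file occupies a whole number of blocks, so round each size up
    total = 0
    for dirpath, _dirs, files in os.walk(path):
        for name in files:
            size = os.path.getsize(os.path.join(dirpath, name))
            total += ((size + block - 1) // block) * block
    return total

for block in (1024, 2048, 4096, 32768):
    print("%6d-byte blocks: %d bytes" % (block, allocated("corpus-tree", block)))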

Me, I'd build the entire thing in a single memory buffer as an AVL-balanced binary tree and load it into memory with mmap.

jackokring
Posts: 816
Joined: Tue Jul 31, 2012 8:27 am
Location: London, UK
Contact: ICQ

Re: Most efficient filesystem for huge file trees

Sun Aug 12, 2012 8:41 am

Can you give 1 example query which takes the longest?
Pi[NFA]=B256R0USB CL4SD8GB Raspbian Stock.
Pi[Work]=A+256 CL4SD8GB Raspbian Stock.
My favourite constant 1.65056745028

thekeywordgeek
Posts: 105
Joined: Fri May 18, 2012 1:48 pm
Contact: Website

Re: Most efficient filesystem for huge file trees

Sun Aug 12, 2012 8:55 am

It does seem counter-intuitive, doesn't it? But I've been running this particular project for quite a few years now and have tried them all; with a data set this large, all the 'normal' rules don't work any more.

Back when I started coding, disk space was very expensive and processor time wasn't. This was best demonstrated by those DOS disk compression systems: it was cheaper to have your 286 or 386 decompressing files all the time than it was to buy an 80MB HDD to replace your 40MB one. But now disk space is stupidly cheap - gigabytes for pennies - and even on higher-spec hardware than a Pi, processor time can be a scarce resource.

My light-bulb moment came when I realised the filesystem is more than just a place to put files; it can also be seen as a very simple and highly optimised database for storing and finding them. Combined with precomputing queries by keyword, that means getting the data I want is a matter of seconds rather than minutes, as it was using the more traditional DB.
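
The reading side is correspondingly trivial. As a sketch (same invented layout as the example in my first post):

Code: Select all

import json
from pathlib import Path

ROOT = Path("corpus-tree")  # hypothetical root of the precomputed tree

def lookup(phrase):
    # one stat() and one small file read per query, no joins at query time
    node = ROOT.joinpath(*phrase.lower().split()) / "data.json"
    return json.loads(node.read_text()) if node.exists() else None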

Even the FS inefficiency can be managed - just buy a bigger disk - but the FAT inefficiency was just so over the top I felt something had to be done.

I will run some tests on block sizes, thanks.
I make and sell radio kits for the Raspberry Pi and more.
http://shop.languagespy.com/

thekeywordgeek
Posts: 105
Joined: Fri May 18, 2012 1:48 pm
Contact: Website

Re: Most efficient filesystem for huge file trees

Sun Aug 12, 2012 8:59 am

jackokring wrote:Can you give 1 example query which takes the longest?
Not easily, if you mean one of the original SQL queries. But speed is no longer my issue; using the filesystem has solved that one, such that I can do the job on modest hardware like a Pi. The space inefficiency prompted this question.
I make and sell radio kits for the Raspberry Pi and more.
http://shop.languagespy.com/

thekeywordgeek
Posts: 105
Joined: Fri May 18, 2012 1:48 pm
Contact: Website

Re: Most efficient filesystem for huge file trees

Sun Aug 12, 2012 9:03 am

rurwin wrote: Me, I'd build the entire thing in a single memory buffer as an AVL-balanced binary tree and load it into memory with mmap.
Sadly no computer I have yet encountered has the necessary half-terabyte of memory :)
I make and sell radio kits for the Raspberry Pi and more.
http://shop.languagespy.com/

jackokring
Posts: 816
Joined: Tue Jul 31, 2012 8:27 am
Location: London, UK
Contact: ICQ

Re: Most efficient filesystem for huge file trees

Sun Aug 12, 2012 9:08 am

I can't help but wonder about word entropy and other maths. Seeing the query is not so I can convince you to go database, but so I can see the data construction patterns...
Pi[NFA]=B256R0USB CL4SD8GB Raspbian Stock.
Pi[Work]=A+256 CL4SD8GB Raspbian Stock.
My favourite constant 1.65056745028

jackokring
Posts: 816
Joined: Tue Jul 31, 2012 8:27 am
Location: London, UK
Contact: ICQ

Re: Most efficient filesystem for huge file trees

Sun Aug 12, 2012 9:25 am

The reason I ask is that it relates to a database search system I made, where fast searches needed longer and more numerous words, and short words took the longest time to order. Later, working out how to divide server time between limited single-entry customers and pedantic speedy hits was not easy. In fact the most reasonable solution is to select random rare words and add them to the search behind the scenes. This boosts the search speed. It just needs the balance between rarity and commonality to track the result page number, shifting as more results are requested.

Cheers Jacko
Pi[NFA]=B256R0USB CL4SD8GB Raspbian Stock.
Pi[Work]=A+256 CL4SD8GB Raspbian Stock.
My favourite constant 1.65056745028

rurwin
Forum Moderator
Posts: 4257
Joined: Mon Jan 09, 2012 3:16 pm
Contact: Website

Re: Most efficient filesystem for huge file trees

Sun Aug 12, 2012 9:49 am

thekeywordgeek wrote:
rurwin wrote: Me, I'd build the entire thing in a single memory buffer as an AVL-balanced binary tree and load it into memory with mmap.
Sadly no computer I have yet encountered has the necessary half-terabyte of memory :)
It doesn't need it. The file will be sparse, so you can create it as half a terabyte or more and it will only occupy the space that it needs to hold the data. Its blocks will be paged into memory only when they are accessed, and when memory is full they will be discarded automatically based on some heuristic that you don't need to bother about. So long as your memory management is 64 bits, it will work, even on a Pi.

Writing the file will result in a fair amount of disk activity, but reading the data, which I assume is the most frequent operation, will not even bother to do that. Pages will be read into memory and discarded efficiently, and you will magically have in memory only those pages that you use most often, plus the ones necessary for the most recent query. Your entire tree structure for a million words should fit into 24MB (two 8-byte pointers plus an average of 8 characters per word), so all of it could be paged in at any one time; but even if it isn't, you only need one file access for each letter of the word, absolute maximum, and that could be optimised.
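
A minimal sketch of the sparse-file half of that idea in Python (the tree layout itself is left out, and the size and file name are invented):

Code: Select all

import mmap, os

SIZE = 512 * 2**30                     # half a terabyte of address space
with open("tree.bin", "wb") as f:
    f.seek(SIZE - 1)                   # seek past the end and write one byte:
    f.write(b"\0")                     # a sparse file; only blocks actually
                                       # written ever consume disk space

fd = os.open("tree.bin", os.O_RDWR)
mm = mmap.mmap(fd, SIZE)               # kernel pages blocks in on access and
mm[0:8] = (0).to_bytes(8, "little")    # evicts them under memory pressure;
mm.close()                             # here we touch just the first 8 bytes,
os.close(fd)                           # e.g. a root-node offset for the tree

As above, the mapping needs 64-bit memory management; a 32-bit process cannot map anywhere near 512GB of address space.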

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 27021
Joined: Sat Jul 30, 2011 7:41 pm

Re: Most efficient filesystem for huge file trees

Sun Aug 12, 2012 10:28 am

I have to say I'm intrigued as to why a database is slower/worse than a filesystem. If that were generally the case, mainframes would use filesystems and Oracle would be out of business. And since this doesn't seem to be of a size to compare with bank DBs etc., I'm a bit confused. What's the total database size? I'm assuming it's some sort of book repository for the moment!
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Contrary to popular belief, humorous signatures are allowed.
I've been saying "Mucho" to my Spanish friend a lot more lately. It means a lot to him.

thekeywordgeek
Posts: 105
Joined: Fri May 18, 2012 1:48 pm
Contact: Website

Re: Most efficient filesystem for huge file trees

Sun Aug 12, 2012 11:08 am

jamesh wrote:I have to say I'm intrigued as to why a database is slower/worse than a filesystem. If that were generally the case, mainframes would use filesystems and Oracle would be out of business. And since this doesn't seem to be of a size to compare with bank DBs etc., I'm a bit confused. What's the total database size? I'm assuming it's some sort of book repository for the moment!
How long have you got for me to explain corpus linguistics? :)

In short, I am using a computer to try to understand English Language usage through statistical analysis. If I have a word or phrase - any word or phrase - I want to be able to see at a glance how it is used in the language by computing its relationship to other words and phrases. This process requires a huge collection of written text - a corpus - which if big enough should tend towards a statistically significant sample of usage of any English word or phrase.

If you look at the links in my sig you'll understand the kind of use this technique is put to professionally. My employer uses some extremely cool commercial software and a multi-billion-word corpus to research new words for dictionary inclusion, my own more modest effort dates from my days working in the search engine marketing business and has moved on from its original purpose of keyword analysis to become a rather neat language analysis tool for real-time and historical news data.

As you might imagine, many millions of words of English text make for some pretty large database tables. Millions of sources, many millions of words and phrases and hundreds of millions of incidences of those words and phrases in those sources. MySQL was a capable tool when my corpus wasn't very big but even with every optimisation of schema, database and indexes it has not been able to cope with joins on this scale. It always delivers a result, but very slowly indeed. I am not a MySQL newbie by any means and I have not been afraid to call in favours from MySQL gurus in my optimisation.

So what I've done is make a *much* faster system by cumulatively precomputing my queries and saving them as JSON files in the filesystem. This comes at the expense of extra disk space, but disk space is cheap. It also makes it trivial to use a browser as a client, it just asks for the JSON files.

So yes, using a FS over a DB would be crazy in most database use cases. But in this one the scale of the task has outgrown the database as a tool.

Which returns me to my original question: having solved my speed issue using the filesystem, what's the most efficient filesystem space-wise for this scale of tree? :)
I make and sell radio kits for the Raspberry Pi and more.
http://shop.languagespy.com/

jackokring
Posts: 816
Joined: Tue Jul 31, 2012 8:27 am
Location: London, UK
Contact: ICQ

Re: Most efficient filesystem for huge file trees

Sun Aug 12, 2012 11:38 am

So your directories are the words in the phrase, and the contents of the files are different results stored under funny names (not words), such as the longest collocated phrase. That would involve a reflexive search, or joining all sub-phrases from the found container, where the sub-phrase occurrence count > the number of found containers in which the input phrase and record id are equal.

Cheers Jacko
Pi[NFA]=B256R0USB CL4SD8GB Raspbian Stock.
Pi[Work]=A+256 CL4SD8GB Raspbian Stock.
My favourite constant 1.65056745028

jojopi
Posts: 3316
Joined: Tue Oct 11, 2011 8:38 pm

Re: Most efficient filesystem for huge file trees

Sun Aug 12, 2012 11:54 am

jamesh wrote:I have to say I'm intrigued as to why a database is slower/worse than a filesystem. If that were generally the case, mainframes would use filesystems and Oracle would be out of business.
Generic RDBMSs are not used because they are efficient. They are always inefficient. They can be orders of magnitude slower than a custom solution for a particular application. Generic databases are used because they are flexible and well tested and everyone knows how to use them.

jackokring
Posts: 816
Joined: Tue Jul 31, 2012 8:27 am
Location: London, UK
Contact: ICQ

Re: Most efficient filesystem for huge file trees

Sun Aug 12, 2012 12:07 pm

jojopi wrote:
jamesh wrote:I have to say I'm intrigued as to why a database is slower/worse than a filesystem. If that were generally the case, mainframes would use filesystems and Oracle would be out of business.
Generic RDBMSs are not used because they are efficient. They are always inefficient. They can be orders of magnitude slower than a custom solution for a particular application. Generic databases are used because they are flexible and well tested and everyone knows how to use them.
The particular example I chose above indicates a problem when new text is added, as every collocate potentially has to be recalculated and the JSON file re-written. And the > should have been >=, or a subtle bug occurs sometimes. I think much of the inflexibility of SQL causes the problems. The use of tables is not the issue, as the file system is a set of tables containing tables.
Pi[NFA]=B256R0USB CL4SD8GB Raspbian Stock.
Pi[Work]=A+256 CL4SD8GB Raspbian Stock.
My favourite constant 1.65056745028

ado24
Posts: 4
Joined: Mon Jul 23, 2012 7:22 pm

Re: Most efficient filesystem for huge file trees

Sun Aug 12, 2012 12:58 pm

If you don't have moral problems with that, look at ReiserFS. Also look at Berkeley DB or graph databases.

thekeywordgeek
Posts: 105
Joined: Fri May 18, 2012 1:48 pm
Contact: Website

Re: Most efficient filesystem for huge file trees

Mon Aug 13, 2012 8:59 am

Berkeley DB would be a possible solution, and I've considered it; it's one I use professionally for not dissimilar tasks. But the FS solution does have the advantage of the lowest barrier to entry. Writing a little jQuery client in a browser to access data directly from disk or HTTP on any platform without any server code takes some beating.

I have no problems with ReiserFS; it's the software and not the originator that you're using.
I make and sell radio kits for the Raspberry Pi and more.
http://shop.languagespy.com/

thekeywordgeek
Posts: 105
Joined: Fri May 18, 2012 1:48 pm
Contact: Website

Re: Most efficient filesystem for huge file trees

Mon Aug 13, 2012 9:01 am

jojopi wrote:Generic RDBMSs are not used because they are efficient. They are always inefficient. They can be orders of magnitude slower than a custom solution for a particular application. Generic databases are used because they are flexible and well tested and everyone knows how to use them.
Agreed. I enthusiastically use MySQL for all sorts of things, but it took me a long time to realise that when the only tool you have is a hammer, you try to make every task into a nail.
I make and sell radio kits for the Raspberry Pi and more.
http://shop.languagespy.com/

thekeywordgeek
Posts: 105
Joined: Fri May 18, 2012 1:48 pm
Contact: Website

Re: Most efficient filesystem for huge file trees

Mon Aug 13, 2012 9:05 am

jackokring wrote: The particular example I chose above indicates a problem when new text is added, as every collocate potentially has to be recalculated and the JSON file re-written. And the > should have been >=, or a subtle bug occurs sometimes. I think much of the inflexibility of SQL causes the problems. The use of tables is not the issue, as the file system is a set of tables containing tables.
That's about it. Though the idea is that once you've processed enough text, the number of new collocates you add goes down significantly, as most collocates recur time and time again. The common collocates rise to the top through their higher frequencies, so you can disregard those with frequencies tending to zero.
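
In sketch form (the threshold here is invented; the real cut-off would be tuned to the corpus):

Code: Select all

from collections import Counter

counts = Counter()                  # collocate -> frequency at one tree node

def merge_batch(new_collocates, min_freq=2):
    counts.update(new_collocates)
    # drop collocates whose frequency tends towards zero before re-writing
    # the JSON file, so the node stops growing with every batch of text
    return {w: n for w, n in counts.items() if n >= min_freq}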
I make and sell radio kits for the Raspberry Pi and more.
http://shop.languagespy.com/

vmp32k
Posts: 14
Joined: Fri Jul 27, 2012 3:05 pm

Re: Most efficient filesystem for huge file trees

Tue Aug 14, 2012 11:30 am

@OP: Have you looked at NoSQL databases? MongoDB, for example, is basically a very fast JSON key-value store with nifty features to maximize access performance (MapReduce). Though I have no idea if it would run (at all/well enough) on Pis, since it requires quite a lot of memory. ;)
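
For flavour, storing the same per-phrase JSON documents might look like this with the current pymongo driver (a sketch only; the database and collection names are invented, and I haven't tried it on a Pi):

Code: Select all

from pymongo import MongoClient

client = MongoClient("localhost", 27017)
phrases = client.corpus.phrases      # database 'corpus', collection 'phrases'
phrases.create_index("phrase")       # fast key-value style lookups

phrases.replace_one({"phrase": "new york"},
                    {"phrase": "new york", "collocates": {"city": 42}},
                    upsert=True)
doc = phrases.find_one({"phrase": "new york"})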

scruss
Posts: 3391
Joined: Sat Jun 09, 2012 12:25 pm
Location: Toronto, ON
Contact: Website

Re: Most efficient filesystem for huge file trees

Tue Aug 14, 2012 11:42 am

rurwin wrote: It doesn't need it. The file will be sparse, so you can create it as half a terabyte or more and it will only occupy the space that it needs to hold the data.
That's pretty much how we did it for the Collins COBUILD corpus in the late 1990s. About a billion words, and all running on a Sun Ultra-1, which had probably roughly the same processing power as a Raspberry Pi :(
‘Remember the Golden Rule of Selling: “Do not resort to violence.”’ — McGlashan.
Pronouns: he/him

thekeywordgeek
Posts: 105
Joined: Fri May 18, 2012 1:48 pm
Contact: Website

Re: Most efficient filesystem for huge file trees

Tue Aug 14, 2012 1:21 pm

I suspect I work with one or two of your ex-colleagues then :)

It's true, we sometimes forget how powerful even modest computers are today. Back in the '90s I was optimising website code to run on Pentium servers with ~128MB of memory. I sometimes wonder whether using all the same tricks today, on machines with 100x the power, is worth the effort I put into it.
I make and sell radio kits for the Raspberry Pi and more.
http://shop.languagespy.com/

thekeywordgeek
Posts: 105
Joined: Fri May 18, 2012 1:48 pm
Contact: Website

Re: Most efficient filesystem for huge file trees

Tue Aug 14, 2012 1:29 pm

vmp32k wrote:@OP: Have you looked at NoSQL databases? MongoDB, for example, is basically a very fast JSON key-value store with nifty features to maximize access performance (MapReduce). Though I have no idea if it would run (at all/well enough) on Pis, since it requires quite a lot of memory. ;)
No, I haven't. Work uses NoSQL to serve huge XML data sets very effectively; I guess I was following the KISS principle when I tried the filesystem.

The advantage of all this is that the amount of work relating to the storage medium is fairly trivial. The hard work is in the natural language processing, and that's well handled by both the NLTK library and my own libraries honed over many years of language hacking.

So I can continue with the filesystem and move to another medium if I think its performance would be better, without any inconvenience beyond writing another storage library and reprocessing my data.
I make and sell radio kits for the Raspberry Pi and more.
http://shop.languagespy.com/
