Page 1 of 1

Database v text file

Posted: Mon Dec 02, 2013 10:41 pm
by CodeCruncher
Scenerio :
An application has a value come into it. Assuming a list of 1000 'constants' that the application 'knows' about what is the best solution for it performance wise to result a query against the list to say wether the value exists. The response to the query needs to be pretty quick but the number of queries per hour is low ie 5 per hour.

A text file of some sort ? If yes what type of file ? plan text, csv xml ?
A database ? any hints for suitable candidates
Memory array. I've kinda ruled that out since the memory is constrainted on the Pi

Any thoughts. I'm awaiting my deliver of my Pi to actually play with it but would appreciate some insight from people that already been there and got the tshirt.

Re: Database v text file

Posted: Mon Dec 02, 2013 10:55 pm
by DougieLawson

Install it with sudo apt-get install python-sqlite sqlite3

But, I may be biased in favour of databases.

Re: Database v text file

Posted: Mon Dec 02, 2013 11:02 pm
by CodeCruncher
Was thinking the response from a query against db to be quicker as opposed to other methods.Thanks for the response.

Re: Database v text file

Posted: Mon Dec 02, 2013 11:58 pm
by ame
A text file is easier to set up than a database, and with only 1000 entries it will be faster than your eye can blink. I recommend CSV.

Depending on how large your file actually is you might be able to import the whole thing into a Python dictionary in memory, then lookup will be virtually instantaneous.

Re: Database v text file

Posted: Tue Dec 03, 2013 12:05 am
by DougieLawson
SQLite takes five minutes and with the python bindings you can start writing SQL straightaway. It's also future proof, today it may be 1000 records tomorrow it could by 20million rows. In my line of business we have some of the largest users in the World running petabyte SQL databases (and heading towards yottabytes) with sub-second response times on their mainframes. As I said, I may be biased.

Re: Database v text file

Posted: Tue Dec 03, 2013 12:07 am
by ame
And another thing, memory is not constrained in the Pi. It's true it's not huge compared to a modern desktop system, but compared to human comprehension it is in fact gargantuan. For example, the English ASCII text version of the Bible is about 4Mb. How much data do you have?

Re: Database v text file

Posted: Tue Dec 03, 2013 12:29 am
by DougieLawson
The other argument for an SQL database is that the original poster will learn a transferable skill.

He may have 1000 small records that would be easy (in your view) to do with a CSV. But writing the CSV code is specific to this project. The next project won't get code re-use.

Learning SQL may appear overkill for this trivial project, but the next project will be more complex and will warrant using a database.

SQL dialects do have some local syntactic quirks but the basic CREATE, INSERT, SELECT, UPDATE, DELETE, DROP doesn't differ too wildly between SQLite, MySQL (aka MariaDB) and IBM's DB2 [I can't comment on Oracle, PostgreSQL, Microsoft SQL Server or other SQL engines as I've never used them].

Re: Database v text file

Posted: Tue Dec 03, 2013 12:44 am
by ame
I know SQL and I don't disagree, however OP asked for advice and I don't think using a text file (aka flat file) is a bad idea. In fact it's often overlooked. Learning how to manipulate text files and extract meaningful data is also a useful and transferable skill. Perhaps more useful, since we are not always presented with a nice SQL command line to sift through some random chunk of data.

Re: Database v text file

Posted: Tue Dec 03, 2013 12:48 am
by DougieLawson
Your argument holds water. We need to let the original poster decide.

I've very rarely done text processing (on Linux I've mostly done it with Perl, because I like writing programs that I can't understand six months later).

Re: Database v text file

Posted: Tue Dec 03, 2013 2:22 am
by JustThisGuy
A friend showed me this example of a command line phone contact list using a plaintext file as database.

File format:

Code: Select all

name: street address: city, state zip: extra fields...
One used your favorite editor to create the file. use the script or alias named 'phone' SEARCHTERM can be content of any field:

Code: Select all

grep SEARCHTERM plaintextfilename | tr : \n
The output, for example 'phone name', gives

Code: Select all

 street address
 city, state zip
 extra fields
Simple, elegant. I used that script for years on many a unix system. An appropriate filebased db, imho.

A thousand variables really isn't that much memory. Unless your 'constants' are larger than 1k a piece (that's a lot of text or a gigantic number) you'll use less than 1 megabyte of memory. You want speed, this is the way. I'd definitely code a simple hash or array from constants in a header file as a first try.

Second thing to consider is: Are these really constants, or do you want to change them, however rarely? If this data changes slowly and never changes format a file is what I would suggest.

If this database is in anyway a moving target which changes at moderate to fast rates you need a db product which has already done the heavy lifting when it comes to sessions and transactions and indexes and such. The leanest and simplest I've used is sqlite. There is no server to set up. The database resides in a single file and your application talks to it using a library linked to your app. It uses a subset of SQL, the lingua franca of relational databases and the basic stuff of creating and checking entries is fairly simple.

Db servers like MySQL, Postgresql and the rest are heavy duty workhorses. Don't go there unless you need to or just want to learn. It is my opinion that they would be overkill for your stated requirements.

Re: Database v text file

Posted: Tue Dec 03, 2013 6:19 am
by W. H. Heydt
Sigh... Kind of depends on what sort of values you're dealing with.

If they are--say--integers within a range that isn't to big, set up a file with one byte for each integer in the range with 0 for invalid values and 1 for valid values. The do a direct file access to see if the byte that corresponds to your value is a 0 or a 1.

To set up an example. Input data is in the range of 1000 to 1999. Create a 1000 byte file. Read byte #(<input value> - 1000) in the file. Is it a 1? Yes, valid value.