sudo-i
Posts: 6
Joined: Thu Jan 08, 2015 11:37 pm

sorting data

Thu Jan 08, 2015 11:55 pm

hello there,

I've been using the RPi for over a month now and have already replaced my Arduino hub with it. I've expanded my sensor network to 500 nodes and am having a big problem trying to sort the data.

Here is a preview of the incoming data.log file on the RPi:

09.01.2015 12:54:40 AM 5 2025
09.01.2015 12:55:01 AM 42 2118
09.01.2015 12:55:54 AM 32 1537
09.01.2015 12:55:58 AM 36 2012
09.01.2015 12:56:35 AM 6 2356
09.01.2015 12:56:37 AM 10 1243
09.01.2015 12:57:40 AM 38 1700
09.01.2015 12:57:43 AM 3 2306
09.01.2015 12:58:04 AM 4 1775
09.01.2015 12:58:41 AM 31 1806
09.01.2015 12:58:42 AM 7 1943
09.01.2015 12:59:02 AM 11 2300
09.01.2015 12:59:15 AM 2 2493
09.01.2015 12:59:25 AM 5 2025
09.01.2015 12:59:29 AM 33 2356
09.01.2015 12:59:30 AM 34 1793
09.01.2015 12:59:51 AM 12 2743
09.01.2015 12:59:56 AM 40 1450
09.01.2015 01:00:36 AM 42 2125
09.01.2015 01:00:48 AM 32 1525
09.01.2015 01:01:49 AM 6 2362
09.01.2015 01:01:55 AM 36 2006
09.01.2015 01:02:38 AM 10 1243
09.01.2015 01:02:39 AM 37 1975
09.01.2015 01:03:11 AM 4 1775
09.01.2015 01:03:18 AM 38 1700
09.01.2015 01:03:38 AM 3 2312
09.01.2015 01:03:52 AM 7 1950
09.01.2015 01:04:11 AM 5 2025
09.01.2015 01:04:18 AM 11 2293
09.01.2015 01:04:28 AM 34 1806
09.01.2015 01:04:43 AM 31 1812
09.01.2015 01:04:52 AM 2 2531
09.01.2015 01:05:08 AM 12 2737
09.01.2015 01:05:10 AM 40 1450
09.01.2015 01:05:14 AM 33 2350
09.01.2015 01:05:42 AM 32 1543
09.01.2015 01:06:11 AM 42 2118
09.01.2015 01:07:04 AM 6 2356
09.01.2015 01:07:16 AM 9 2006
09.01.2015 01:07:43 AM 37 1968
09.01.2015 01:07:52 AM 36 2006
09.01.2015 01:08:18 AM 4 1775
09.01.2015 01:08:40 AM 10 1237
09.01.2015 01:08:55 AM 38 1706
09.01.2015 01:08:57 AM 5 2018
09.01.2015 01:09:03 AM 7 1937
09.01.2015 01:09:26 AM 34 1793
09.01.2015 01:09:32 AM 3 2306
09.01.2015 01:09:36 AM 11 2300
09.01.2015 01:10:26 AM 12 2712
09.01.2015 01:10:27 AM 40 1450
09.01.2015 01:10:28 AM 2 2493
09.01.2015 01:10:36 AM 32 1543
09.01.2015 01:10:45 AM 31 1806
09.01.2015 01:10:59 AM 33 2356
09.01.2015 01:11:46 AM 42 2112
09.01.2015 01:12:18 AM 6 2356
09.01.2015 01:12:47 AM 37 1956
09.01.2015 01:13:25 AM 4 1768
09.01.2015 01:13:42 AM 5 2018
09.01.2015 01:13:48 AM 36 2018
09.01.2015 01:14:14 AM 7 1937
09.01.2015 01:14:24 AM 34 1793
09.01.2015 01:14:33 AM 38 1706
09.01.2015 01:14:41 AM 10 1237
09.01.2015 01:14:52 AM 11 2306
09.01.2015 01:15:27 AM 3 2312
09.01.2015 01:15:30 AM 32 1537
09.01.2015 01:15:43 AM 12 2718
09.01.2015 01:15:44 AM 40 1450
09.01.2015 01:16:05 AM 2 2506
09.01.2015 01:16:44 AM 33 2350
09.01.2015 01:17:20 AM 42 2118
09.01.2015 01:17:32 AM 6 2362
09.01.2015 01:17:52 AM 37 1950
09.01.2015 01:18:28 AM 5 2012
09.01.2015 01:18:32 AM 4 1775
09.01.2015 01:19:23 AM 34 1800
09.01.2015 01:19:41 AM 9 2006
09.01.2015 01:19:45 AM 36 2012
09.01.2015 01:20:08 AM 11 2300
09.01.2015 01:20:11 AM 38 1712
09.01.2015 01:20:24 AM 32 1543
09.01.2015 01:21:01 AM 12 2700
09.01.2015 01:21:01 AM 40 1450
09.01.2015 01:21:21 AM 3 2306
09.01.2015 01:21:41 AM 2 2487
09.01.2015 01:22:28 AM 33 2350
09.01.2015 01:22:47 AM 6 2356
09.01.2015 01:22:49 AM 31 1806
09.01.2015 01:22:55 AM 42 2112
09.01.2015 01:22:56 AM 37 1956
09.01.2015 01:23:13 AM 5 2000
09.01.2015 01:23:38 AM 4 1775
09.01.2015 01:24:21 AM 34 1812
09.01.2015 01:24:35 AM 7 1931
09.01.2015 01:25:18 AM 32 1537
09.01.2015 01:25:25 AM 11 2287

As you can see, not every second has an entry. The small number is the node ID and the large number is the temperature in °C × 100. What I need is to sort that data automatically (no matter the size; my data.log currently holds about 500,000 entries), like this:

No  Date        TIME      /M  ID  Temp  |  No  Date        TIME      /M  ID  Temp
13  09.01.2015  12:59:15  AM  2   2493  |  8   09.01.2015  12:57:43  AM  3   2306
33  09.01.2015  01:04:52  AM  2   2531  |  27  09.01.2015  01:03:38  AM  3   2312
53  09.01.2015  01:10:28  AM  2   2493  |  49  09.01.2015  01:09:32  AM  3   2306
72  09.01.2015  01:16:05  AM  2   2506  |  68  09.01.2015  01:15:27  AM  3   2312
88  09.01.2015  01:21:41  AM  2   2487  |  87  09.01.2015  01:21:21  AM  3   2306

This would make my life easier when I graph temperature changes or import the data somewhere else. I've tried Excel, Access, SQL and MATLAB, but I can't sort it because of the time differences and the node names.

any help will be appreciated!
trying to make the best outta my time on this pale blue dot.

scruss
Posts: 3342
Joined: Sat Jun 09, 2012 12:25 pm
Location: Toronto, ON
Contact: Website

Re: sorting data

Fri Jan 09, 2015 1:22 am

You're making it much harder on yourself by using a non-standard date format. If you output the timestamp in an ISO 8601 format (like 2015-01-09T01:14:55+0000) at the start of the line, and use UTC to get rid of timezone changes, the standard sort command will get your data in order. You've got a fairly complex problem in date manipulation here if you want to get these data in order.
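For a log that already exists in the dd.mm.yyyy 12-hour form, the conversion could be sketched in Python roughly as follows. The function name to_iso is made up for illustration, and the sketch assumes every line has exactly the five whitespace-separated fields shown in the first post and that AM/PM is parsed under an English or C locale.

```python
from datetime import datetime

def to_iso(line):
    """Rewrite 'dd.mm.yyyy hh:mm:ss AM id temp' with an ISO 8601 timestamp first."""
    date, clock, ampm, node, temp = line.split()
    # %I is the 12-hour clock and %p the AM/PM marker, so 12:54 AM becomes 00:54.
    stamp = datetime.strptime(f"{date} {clock} {ampm}", "%d.%m.%Y %I:%M:%S %p")
    return f"{stamp.isoformat()} {node} {temp}"

lines = [
    "09.01.2015 01:00:36 AM 42 2125",
    "09.01.2015 12:54:40 AM 5 2025",
]
# Once converted, plain string sorting is also chronological sorting.
converted = sorted(to_iso(l) for l in lines)
print("\n".join(converted))
```

The timestamps here carry no UTC offset because the original log does not record one; adding it is only safe if you know which zone the logger's clock was in.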

You're also getting into territory that databases are built for. Even a simple lightweight system like SQLite can create tables with a timestamp as a key.
‘Remember the Golden Rule of Selling: “Do not resort to violence.”’ — McGlashan.
Pronouns: he/him

DougieLawson
Posts: 39626
Joined: Sun Jun 16, 2013 11:19 pm
Location: A small cave in deepest darkest Basingstoke, UK
Contact: Website Twitter

Re: sorting data

Fri Jan 09, 2015 1:31 am

scruss wrote:You're making it much harder on yourself by using a non-standard date format. If you output the timestamp in an ISO 8601 format (like 2015-01-09T01:14:55+0000) at the start of the line, and use UTC to get rid of timezone changes, the standard sort command will get your data in order. You've got a fairly complex problem in date manipulation here if you want to get these data in order.

You're also getting into territory that databases are built for. Even a simple lightweight system like SQLite can create tables with a timestamp as a key.
+1

As soon as you use a yyyy-mm-dd format for the date and an hh:mm 24-hour (military) format for the time, records will naturally sort in the right order (numeric or character sort) without using a database.
Note: Any requirement to use a crystal ball or mind reading will result in me ignoring your question.

Criticising any questions is banned on this forum.

Any DMs sent on Twitter will be answered next month.
All fake doctors are on my foes list.

Richard-TX
Posts: 1549
Joined: Tue May 28, 2013 3:24 pm
Location: North Texas

Re: sorting data

Fri Jan 09, 2015 5:36 am

When I do daily log files I use a format like
yyyy-mm-dd

That way ls always sorts things correctly.

In your case I would use
yyyy-mm-dd-hh-mm-ss

where hours are in 24 hour format.

If you wish to use a different separator, feel free.

Note that the month, hours, etc are always 2 digits long even if they are 00.
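A quick way to see why the padding matters is to sort a few date strings with and without leading zeros. This is only a small Python illustration with made-up dates:

```python
# The same three dates: 9 Jan, 10 Jan and 1 Oct 2015.
unpadded = ["2015-1-9", "2015-1-10", "2015-10-1"]
padded = ["2015-01-09", "2015-01-10", "2015-10-01"]

# Without padding, "2015-1-10" sorts before "2015-1-9" because '1' < '9'.
print(sorted(unpadded))

# With two-digit fields, text order and date order agree.
print(sorted(padded))
```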
Richard
Doing Unix since 1985.
The 9-25-2013 image of Wheezy can be found at:
http://downloads.raspberrypi.org/raspbian/images/raspbian-2013-09-27/2013-09-25-wheezy-raspbian.zip

iinnovations
Posts: 621
Joined: Thu Jun 06, 2013 5:17 pm

Re: sorting data

Fri Jan 09, 2015 6:36 am

Database!
CuPID Controls :: Open Source browser-based sensor and device control
interfaceinnovations.org/cupidcontrols.html
cupidcontrols.com

ame
Posts: 3172
Joined: Sat Aug 18, 2012 1:21 am
Location: New Zealand

Re: sorting data

Fri Jan 09, 2015 7:30 am

Richard-TX wrote:When I do daily log files I use a format like
yyyy-mm-dd

That way ls always sorts things correctly.

In your case I would use
yyyy-mm-dd-hh-mm-ss

where hours are in 24 hour format.

If you wish to use a different separator, feel free.

Note that the month, hours, etc are always 2 digits long even if they are 00.
Congratulations! You've just invented ISO8601.

sudo-i
Posts: 6
Joined: Thu Jan 08, 2015 11:37 pm

Re: sorting data

Fri Jan 09, 2015 9:39 am

I see. I used a single line of code, taken from https://docs.python.org/2/howto/logging.html, to log the data. Maybe I need to push the data directly to another server to sort and classify it. Thanks for helping!
trying to make the best outta my time on this pale blue dot.

ame
Posts: 3172
Joined: Sat Aug 18, 2012 1:21 am
Location: New Zealand

Re: sorting data

Fri Jan 09, 2015 9:44 am

sudo-i wrote:I see. I used a single line of code, taken from https://docs.python.org/2/howto/logging.html, to log the data. Maybe I need to push the data directly to another server to sort and classify it. Thanks for helping!
I recently played with SQLite on the Pi, to store temperature sensor data. It was trivially easy.

Although flat files are fine for certain collections of data, once you have your data in an SQL database you really avoid a lot of problems.

I use cron to run a python script every minute. This collects the data from the sensors and adds it to the database. Other cron jobs run periodically to read the data and make graphs. There is only one 'writer' so SQLite runs quite well for my application.
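As a rough illustration of this kind of setup (the table and column names are invented for the example and are not taken from any actual script), storing and reading back readings with Python's built-in sqlite3 module might look like this:

```python
import sqlite3

# In-memory database for the example; a real logger would pass a file path.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE IF NOT EXISTS readings (
           ts     TEXT    NOT NULL,  -- ISO 8601 timestamp
           node   INTEGER NOT NULL,
           temp_c REAL    NOT NULL,
           PRIMARY KEY (ts, node)
       )"""
)

samples = [
    ("2015-01-09T01:04:52", 2, 25.31),
    ("2015-01-09T00:57:43", 3, 23.06),
    ("2015-01-09T00:59:15", 2, 24.93),
]
with conn:  # commits on success, rolls back on error
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", samples)

# One node's history in time order, ready for graphing.
rows = conn.execute(
    "SELECT ts, temp_c FROM readings WHERE node = ? ORDER BY ts", (2,)
).fetchall()
for ts, temp_c in rows:
    print(ts, temp_c)
```

The insertion order no longer matters; the ORDER BY clause does the sorting at query time.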

gordon77
Posts: 5134
Joined: Sun Aug 05, 2012 3:12 pm

Re: sorting data

Fri Jan 09, 2015 10:36 am

sudo-i wrote:I see. I used a single line of code, taken from https://docs.python.org/2/howto/logging.html, to log the data. Maybe I need to push the data directly to another server to sort and classify it. Thanks for helping!
Can't you define the format you want?


import logging
logging.basicConfig(format='%(asctime)s %(message)s', datefmt='%m/%d/%Y %I:%M:%S %p')
logging.warning('is when this event was logged.')
which would display something like this:

12/12/2010 11:46:36 AM is when this event was logged.
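If sortable output is the goal, the same mechanism should work with an ISO-style datefmt. This is a sketch rather than a drop-in replacement: the logger name "nodes" is arbitrary, and a StringIO stream stands in for the log file so the example is self-contained.

```python
import io
import logging
import time

stream = io.StringIO()  # stands in for the log file
handler = logging.StreamHandler(stream)
formatter = logging.Formatter("%(asctime)s %(message)s", datefmt="%Y-%m-%dT%H:%M:%SZ")
formatter.converter = time.gmtime  # render asctime in UTC instead of local time
handler.setFormatter(formatter)

log = logging.getLogger("nodes")
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False  # keep the root logger from printing a second copy

log.info("%d %d", 31, 2056)  # node id, temperature in hundredths of a degree
print(stream.getvalue(), end="")
```

Each line then starts with a timestamp such as 2015-01-09T22:18:23Z, which sorts correctly as plain text.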

sudo-i
Posts: 6
Joined: Thu Jan 08, 2015 11:37 pm

Re: sorting data

Fri Jan 09, 2015 12:02 pm

ame wrote: ...There is only one 'writer' so SQLite runs quite well for my application.
Well, that is the problem I'm experiencing: I'll have 5 more writers when I complete the sensor network. Right now I'm trying to add SciPy code to the existing code. I'm using the RF24 library with lots of modifications, and I might add 2 Arduino Dues for encryption on both ends.
trying to make the best outta my time on this pale blue dot.

joan
Posts: 15006
Joined: Thu Jul 05, 2012 5:09 pm
Location: UK

Re: sorting data

Fri Jan 09, 2015 1:12 pm

The following awk script may be used to reformat your existing data into a sortable form.

conv.awk

Code:

# Convert "dd.mm.yyyy" to "yyyy.mm.dd" so that dates sort as plain text.
function ndate(s)
{
   return substr(s,7,4) "." substr(s,4,2) "." substr(s,1,2)
}

# Convert a 12-hour "hh:mm:ss" time plus its AM/PM marker to 24-hour time.
function ntime(t, am,    h)
{
   h = substr(t, 1, 2) % 12      # 12 AM becomes 00; 12 PM becomes 12 below
   if (am == "PM") h += 12

   return sprintf("%02d:%s", h, substr(t,4))
}

{ print ndate($1), ntime($2,$3), $4, $5 }
Run as

awk -f conv.awk old_log_file >new_log_file

I'm not sure about the AM/PM values. Your log entries look odd to me.

The script produces output like

Code:

2015.01.09 00:54:40 5 2025
2015.01.09 00:55:01 42 2118
2015.01.09 00:55:54 32 1537
2015.01.09 00:55:58 36 2012
2015.01.09 00:56:35 6 2356
2015.01.09 00:56:37 10 1243
2015.01.09 00:57:40 38 1700
2015.01.09 00:57:43 3 2306
2015.01.09 00:58:04 4 1775
2015.01.09 00:58:41 31 1806
2015.01.09 00:58:42 7 1943
2015.01.09 00:59:02 11 2300
2015.01.09 00:59:15 2 2493
2015.01.09 00:59:25 5 2025
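
Once the log is in a yyyy.mm.dd form with a 24-hour clock, the per-node grouping asked for in the first post is a short step further. A minimal Python sketch, assuming four whitespace-separated fields per line and that the hour after midnight is written as 00 (the variable names are illustrative only):

```python
from collections import defaultdict

# Lines in the converted form: date, 24-hour time, node id, temperature x100.
log_lines = [
    "2015.01.09 01:03:38 3 2312",
    "2015.01.09 00:59:15 2 2493",
    "2015.01.09 00:57:43 3 2306",
    "2015.01.09 01:04:52 2 2531",
]

by_node = defaultdict(list)
for line in sorted(log_lines):  # text order is time order in this format
    date, clock, node, temp = line.split()
    by_node[int(node)].append((f"{date} {clock}", int(temp) / 100))

for node in sorted(by_node):
    print(f"node {node}")
    for stamp, temp_c in by_node[node]:
        print(f"  {stamp}  {temp_c:.2f}")
```

For a file with hundreds of thousands of lines it may be better to read it line by line from disk than to hold a list literal, but the grouping logic stays the same.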

ame
Posts: 3172
Joined: Sat Aug 18, 2012 1:21 am
Location: New Zealand

Re: sorting data

Fri Jan 09, 2015 1:41 pm

sudo-i wrote:
ame wrote: ...There is only one 'writer' so SQLite runs quite well for my application.
Well, that is the problem I'm experiencing: I'll have 5 more writers when I complete the sensor network. Right now I'm trying to add SciPy code to the existing code. I'm using the RF24 library with lots of modifications, and I might add 2 Arduino Dues for encryption on both ends.
All right, so make one program that collects data from all sources and then writes to the database once (per sampling period). Or use a 'proper' database engine that supports many simultaneous write operations.
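The first option could be sketched like this in Python. Everything here is an assumption made for illustration: the collector function is a stand-in for whatever reads each hub, the second timestamp is made up, and the table layout is invented.

```python
import queue
import sqlite3
import threading

readings = queue.Queue()  # thread-safe hand-off from collectors to the one writer

def collector(hub, samples):
    """Stand-in for one hub's reader; real code would read the radio or serial port."""
    for ts, node, temp in samples:
        readings.put((ts, hub, node, temp))

def flush(conn):
    """Drain everything queued so far and write it in a single transaction."""
    batch = []
    while True:
        try:
            batch.append(readings.get_nowait())
        except queue.Empty:
            break
    with conn:
        conn.executemany("INSERT INTO readings VALUES (?, ?, ?, ?)", batch)
    return len(batch)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (ts TEXT, hub INTEGER, node INTEGER, temp INTEGER)")

threads = [
    threading.Thread(target=collector, args=(1, [("2015-01-09T22:18:23", 31, 2056)])),
    threading.Thread(target=collector, args=(2, [("2015-01-09T22:18:24", 5, 2025)])),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

written = flush(conn)  # in practice, called once per sampling period
print(written, "rows written")
```

Only flush() ever touches the database, so there is still a single writer no matter how many collectors feed the queue.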

sudo-i
Posts: 6
Joined: Thu Jan 08, 2015 11:37 pm

Re: sorting data

Fri Jan 09, 2015 8:25 pm

joan wrote: ...I'm not sure about the AM/PM values. Your log entries look odd to me.
ame wrote: All right, so make one program that collects data from all sources and then writes to the database once (per sampling period). Or use a 'proper' database engine that supports many simultaneous write operations.
*The date format is like that in our country, so... (it more or less has to be like that)
**Maybe I need to create 2 separate SQL servers on the PC, gathering data from the last 2 Raspberry Pi hubs. I'll report the results; my other Pi will arrive in 2 days. So far the only result I have is watching the data flow over serial, connecting to the Raspberry Pi over SSH, and sharing data to other PCs over Samba (\\10.0.0.2\pi\Desktop\node.log etc.).
***The Python code is now this: I replaced %I and %p with %H, and the data seems more beautiful now.

Code:

logging.basicConfig(filename='/home/pi/Desktop/node.log',level=logging.INFO,format='%(asctime)s 	 %(message)s	',datefmt='%Y/%m/%d	%H:%M:%S')
09.01.2015 22:18:23 31 2056
trying to make the best outta my time on this pale blue dot.

scruss
Posts: 3342
Joined: Sat Jun 09, 2012 12:25 pm
Location: Toronto, ON
Contact: Website

Re: sorting data

Sat Jan 10, 2015 1:56 am

sudo-i wrote:*The date format is like that in our country, so... (it more or less has to be like that)
But ISO dates are everywhere … the ‘I’ is International.

You could (should, probably) store it in one format, but present it in another. Dealing with timestamps without having to correct for clock changes is a very solved problem. Don't waste your own time reinventing solutions.
‘Remember the Golden Rule of Selling: “Do not resort to violence.”’ — McGlashan.
Pronouns: he/him

ame
Posts: 3172
Joined: Sat Aug 18, 2012 1:21 am
Location: New Zealand

Re: sorting data

Sat Jan 10, 2015 3:40 am

sudo-i wrote: *The date format is like that in our country, so... (it more or less has to be like that)
Classic rookie mistake.

Store your dates and times internally as UTC in ISO8601 format. Convert back and forth to local time formats as required, for user input and display output.
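
A minimal Python sketch of that round trip, assuming a fixed UTC+2 display zone purely for illustration (a real program would use the actual local zone) and an English or C locale for the AM/PM marker:

```python
from datetime import datetime, timedelta, timezone

# Stored form: ISO 8601 in UTC.
stored = "2015-01-09T22:18:23+00:00"
moment = datetime.fromisoformat(stored)

# Display form: dd.mm.yyyy with a 12-hour clock, in the viewer's zone.
local_zone = timezone(timedelta(hours=2))  # assumed offset, for illustration only
shown = moment.astimezone(local_zone).strftime("%d.%m.%Y %I:%M:%S %p")
print(shown)

# Going the other way: parse the local form and convert back to UTC for storage.
entered = datetime.strptime(shown, "%d.%m.%Y %I:%M:%S %p").replace(tzinfo=local_zone)
round_trip = entered.astimezone(timezone.utc).isoformat()
print(round_trip)
```

The stored string never changes with the viewer's settings; only the strftime pattern and the zone used for display do.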

Mark_T
Posts: 149
Joined: Sat Dec 27, 2014 10:54 am

Re: sorting data

Sat Jan 10, 2015 12:25 pm

One trick I've used is to directly output data as SQL statements, like:

Code:

insert into temp_log values ('2015-01-10 12:22:04', 5, 2025) ;
insert into temp_log values ('2015-01-10 12:22:04', 42, 2118) ;
Then you can just pipe into an appropriate MySQL command (or whatever). Arrange things so that duplicates are ignored if you can so the system is robust. Archive the raw files as a precautionary backup.
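
Generating such statements from a logger could look roughly like this in Python. The helper name to_insert is hypothetical, and the sketch assumes MySQL with a unique key on the timestamp and node columns of temp_log, so that INSERT IGNORE skips rows that are already present.

```python
def to_insert(ts, node, temp):
    """Format one reading as an SQL statement for the temp_log table."""
    # INSERT IGNORE is MySQL syntax: rows that collide with a unique key are skipped.
    # int() keeps anything non-numeric out of the statement; ts is assumed to come
    # from our own logger in a fixed 'YYYY-MM-DD hh:mm:ss' form.
    return f"INSERT IGNORE INTO temp_log VALUES ('{ts}', {int(node)}, {int(temp)});"

readings = [
    ("2015-01-10 12:22:04", 5, 2025),
    ("2015-01-10 12:22:04", 42, 2118),
]
statements = [to_insert(*r) for r in readings]
print("\n".join(statements))
```

Because the statements are plain text, the same file can be replayed after a crash without creating duplicate rows, provided the unique key exists.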

sudo-i
Posts: 6
Joined: Thu Jan 08, 2015 11:37 pm

Re: sorting data

Sat Jan 10, 2015 10:04 pm

scruss wrote: But ISO dates are everywhere … the ‘I’ is International...
thanks, I did that.
ame wrote: Classic rookie mistake...
not planning to stay rookie forever, but I'll be curious till the end of my time :)
Mark_T wrote:One trick I've used is to directly output data as SQL statements, like:

Code:

insert into temp_log values ('2015-01-10 12:22:04', 5, 2025) ;
insert into temp_log values ('2015-01-10 12:22:04', 42, 2118) ;
Then you can just pipe into an appropriate MySQL command (or whatever). Arrange things so that duplicates are ignored if you can so the system is robust. Archive the raw files as a precautionary backup.
That's exactly what I did now, in Access though, linked directly to the Raspberry Pi's internal log over the network. It seems solid with a VPN between the Raspberry Pi and the PC.
Attachment: access.png (auto sorting data on access)
trying to make the best outta my time on this pale blue dot.
