AlexKoe
Posts: 25
Joined: Tue Jan 20, 2015 3:26 pm
Contact: Website

How to create a sensor database with IBM Informix - Part 1

Mon Jan 26, 2015 9:43 pm

Preface:
This post is part 1 in a little how-to series on how to use the IBM Informix database on the Raspberry Pi to build an efficient sensor data data database for local sensor data analytics/processing/caching.

The other posts can be found here:

viewtopic.php?f=37&t=97199&p=674959 (Informix Intro and Installation)
viewtopic.php?f=37&t=100029&p=693935 (An Informix Sensor DB - Part 2)
viewtopic.php?f=37&t=137392&p=912405 (The Informix REST API - Part 3)
viewtopic.php?uid=131887&f=37&t=140398 (Round robin sensor data storage with IBM Informix - Part 4)

Hi,

since I posted a quick How-To on how to create a simple Informix database instance on Raspian a few days ago, I wanted to show you how easy it is to create an Informix based sensor / time series database.

Before I start, just a little bit of technology background on how Informix is handling sensor data:
Informix is an (object-) relational database and hence its usage and its SQL language very similar to other relational databases which are available for the RPi. BUT, it has built-in enhancements to very efficiently handle time stamped (e.g. sensor-) data plus JSON documents.

Let's have a look on how one would typically store sensor data in any relational database. One would create a table, similar to the following:

Code: Select all

create table sensor_data
  (
    sensor_id char(40) not null ,
    sensor_type char(20),
    sensor_unit char(6),
    timestamp datetime year to fraction(5),
    value decimal(15,2),
    primary key (sensor_id)
  );
In addition it would be common to add one or more indexes to speed up the data lookup for a specific timestamp/sensor_id combination (e.g. for aggregations):

Code: Select all

create unique index sensor_data_ix1 on sensor_data(timestamp, sensor_id);
That is a very valid approach and works quite a well as long as the amount of data is quite small and as long as you are planning to write e.g. your own aggregation and validation functions. One drawback really is that for each sensor value you are storing redundant data (e.g. sensor_id, sensor_unit, sensor_type). Ok, you could reduce some of that redundancy by introducing another relation (table) which holds that meta information for each sensor_id. But then you also still have redundant data in your index(es). Especially in a (likely) space restricted environment like a Raspberry Pi, a space efficient storage would be very desirable. Also less space usage means less I/O operations and hence more efficient usage of the RPi's resources (e.g. CPU and RAM) over all. ;)

The IBM Informix approach on how to handle sensor / time series data
Informix supports an extended SQL data type called 'TIMESERIES'. An Informix database table can have one or more TIMESERIES columns. A TIMESERIES column can store time series data with regular or irregular time stamp intervals.

An example for an 'irregular time series' would be some kind of event log or a sensor which generates its data in changing measurement intervals. A 'regular time series' could be a sensor which either produces data in equal time intervals (e.g. every 60 seconds) or is being read out in equal intervals (e.g. once every 15 minutes).

Informix is storing each time series type in a database object called 'container'. The storage behavior is optimized for each type ('regular', 'irregular').

In the case of a regular time series Informix only stores the actual 'payload' of a time series (e.g. a sensor's measurement value). Time stamp information are ignored and meta data are only stored once for each unique sensor id. For a regular time series time stamp information can be ignored, since Informix knows when a time series started ('origin') and in what kind of interval the data is provided (through a so called 'calendar' and 'calendar pattern' concept).

For an irregular time series the time stamps are stored together with the payload, but the meta information is again only stored once.

Compared to a traditional relational approach you will notice a significant (e.g. up to 60- 80%) less storage usage by using the Informix TIMESERIES data type. In addition all of the time series data will be nicely clustered on disk. The combination of optimal clustering and less disk usage leads to a much better I/O performance compared to a pure relational approach.

To work with the time series data, Informix provides a built-in set of more than 100 SQL time series focused functions which support an application developer in accessing the data on the database w/o the requirement to retrieve them into a client application first. So instead of e.g. reading thousands of records into an application for validation or aggregation, one simply does those steps in the database and only returns the results back to the client. If there is a requirement for a specialized function, one can write a server side function either in C or in Java (that Informix functionality had been inherited from Illustra, an early commercial version of PostgreSQL).

I don't want to spend too much time now on the additional technical details, but I should at least mention that Informix has a built-in time series data high performance loader SQL API which can achieve e.g. on a current Intel CPU (I guess I need to run a benchmark on an RPi soon) about 60k sensor values per CPU(core) per second with a near linear scalability.

Ok, enough theory, let's get started.

In my How-To guide I will build upon the the relational example above, but by using the Informix TIMESERIES data type instead.

Create a basic sensor database with Informix
In my example I will create a sensor database with with a sensor data table to hold the data for sensors which are producing measurements in 1-minute intervals. So we will be dealing with regular time series.

In the very first step let's create an empty Informix database which will become the 'playground' for all my future how-tos. I am using the Informix instance which I created in my other How-To guide on creating an Informix instance.

Code: Select all

echo "create database sensor_db with log" | dbaccess - -
Now we need to describe/define the actual payload for the timeseries column in our sensor data table. In Informix that is done by creating a so called ANSI SQL 'row type' (similar to e.g. a C struct). That row type can contain either 'traditional' data types like e.g. CHAR or DECIMAL or INTEGER etc. or it can contain an element of type BSON (binary JSON) or a mixture of both approaches. Let's start with a 'traditional' row type first and let's call it 'sensor_t':

Code: Select all

create row type sensor_t
(
        timestamp       datetime year to fraction(5),
        value           decimal(15,2)
);
IMPORTANT: Every row type which should be used as the payload for an Informix TIMESERIES column needs to have one datetime column defined as 'datetime year to fraction(5)'!
The 'value' element will later store our measurement value.

In the next step we need to create a 'container' which will store the actual time series data. Each container is being optimized for a specific time series type. Let's create a container named 'sensor_cont' for the time series row type 'sensor_t' in the Informix storage area 'rootdbs' with an initial container extend size of 2048 kB and a next extend size of 2048 kB:

Code: Select all

execute procedure TSContainerCreate (
        'sensor_cont',
        'rootdbs',
        'sensor_t',
        2048,
        2048);
In the following step we then create the actual time series base table with the TIMESERIES column 'sensor_values', referencing the time series type 'sensor_t':

Code: Select all

create table sensor_ts (
        sensor_id char(40),
        sensor_type char(20),
        sensor_unit char(6),
        sensor_values timeseries(sensor_t),
        primary key (sensor_id)
) lock mode row;
Now we do have an empty Informix time series table ready to be used for sensor data.

To make it easy for us to use that table in a way that the most of you are used to it, I am now creating a virtual, relational view/table on the time series data. Those virtual tables are called 'Virtual Table Interface' tables or simply VTI tables. Those VTI tables allow an easy relational-like handling of Informix time series data.
I am creating a VTI table with the name 'sensor_data' (like the relational table in the beginning of my post) based on the time series base table 'sensor_ts'.

Code: Select all

execute procedure TSCreateVirtualTab (
        'sensor_data',
        'sensor_ts',
        'origin(2015-01-26 00:00:00.00000),calendar(ts_1min),container(sensor_cont),threshold(0),regular',0,'sensor_values');
The 3rd parameter of the 'TSCreateVirtualTab' stored procedure, that long string, describes a) the default origin or the default start time point of each newly created time series, b) its interval ('calendar') - here its a 1 minute interval, c) which container to be used ('sensor_cont') as the default storage, d) that it is a 'regular' time series and e) it contains a reference to actual time series column ('sensor_values'). Finally 'threshold(0)' tells the database to store all of the time series values in the designated container.
After executing that stored procedure, you will have a virtual table called 'sensor_data' with the following schema:

Code: Select all

create table sensor_data
  (
    sensor_id char(40) not null ,
    sensor_type char(20),
    sensor_unit char(6),
    timestamp datetime year to fraction(5),
    value decimal(15,2)
  );
Now let's test the 'sensor_data' table by inserting the following 6 demo records. Remember, we defined an 1 minute interval for our time series column, so we are just providing time stamps down to a minute granularity:

Code: Select all

insert into sensor_data values ("Sensor01", "Temp", "C", "2015-01-26 08:00"::datetime year to minute, 21.5);
insert into sensor_data values ("Sensor01", "Temp", "C", "2015-01-26 08:01"::datetime year to minute, 21.6);
insert into sensor_data values ("Sensor02", "Temp", "C", "2015-01-26 15:45"::datetime year to minute, 35.9);
insert into sensor_data values ("Sensor01", "Temp", "C", "2015-01-26 08:02"::datetime year to minute, 22.1);
insert into sensor_data values ("Sensor02", "Temp", "C", "2015-01-26 15:46"::datetime year to minute, 35.2);
insert into sensor_data values ("Sensor02", "Temp", "C", "2015-01-26 15:47"::datetime year to minute, 33.5);
Each time you insert a row with a new 'sensor_id' through the VTI table 'sensor_data' a new record is being inserted into the time series base table 'sensor_ts' and the first time series value is stored in the timeseries column 'sensor_values'.
Each time you insert a row with an existing 'sensor_id' into the VTI table 'sensor_data' the provided value is either added or updated (depending on its time stamp) in the 'sensor_values' column.

With those few example rows inserted into the database, now let's work bit with them. Do a 'SELECT *' on the sensor_data table first:

Code: Select all

echo "select * from sensor_data" | dbaccess sensor_db -

Database selected.

sensor_id    Sensor01
sensor_type  Temp
sensor_unit  C
timestamp    2015-01-26 08:00:00.00000
value        21.50

sensor_id    Sensor01
sensor_type  Temp
sensor_unit  C
timestamp    2015-01-26 08:01:00.00000
value        21.60

sensor_id    Sensor01
sensor_type  Temp
sensor_unit  C
timestamp    2015-01-26 08:02:00.00000
value        22.10

sensor_id    Sensor02
sensor_type  Temp
sensor_unit  C
timestamp    2015-01-26 15:45:00.00000
value        35.90

sensor_id    Sensor02
sensor_type  Temp
sensor_unit  C
timestamp    2015-01-26 15:46:00.00000
value        35.20

sensor_id    Sensor02
sensor_type  Temp
sensor_unit  C
timestamp    2015-01-26 15:47:00.00000
value        33.50

6 row(s) retrieved.

Database closed.
Now let's try to run the following SQL query against the time series base table: 'SELECT sensor_id, sensor_type, sensor_unit FROM sensor_ts':

Code: Select all

echo "SELECT sensor_id, sensor_type, sensor_unit FROM sensor_ts" | dbaccess sensor_db -

Database selected.

sensor_id                                sensor_type          sensor_unit

Sensor01                                 Temp                 C
Sensor02                                 Temp                 C

2 row(s) retrieved.

Database closed.
Notice there are only two rows (one for each unique sensor id) vs. the six rows for each measurement in the virtual table view.

Before I finish that first and very basic part on how to use the Informix time series data type, I would like to give you a glimpse on what you can do with the stored data. Before I do that let's add a few additional example rows through our 'sensor_data' VTI table:

Code: Select all

insert into sensor_data values ("Sensor01", "Temp", "C", "2015-01-26 07:58"::datetime year to minute, 20.2);
insert into sensor_data values ("Sensor01", "Temp", "C", "2015-01-26 07:59"::datetime year to minute, 20.7);
insert into sensor_data values ("Sensor01", "Temp", "C", "2015-01-27 10:11"::datetime year to minute, 26.3);
insert into sensor_data values ("Sensor01", "Temp", "C", "2015-01-27 10:45"::datetime year to minute, 26.9);
A typical operation on sensor data is the aggregation to a different time granularity, e.g. from minutes to hours or to days, weeks or whatever.
Such an aggregation can be done in Informix either on the VTI table level and by using traditional SQL statements, or more efficiently by using some of the built-in time series functions on the time series base table.

In the following SQL example we are aggregating the minute interval based data for 'sensor_id' 'Sensor01' to an hourly granularity:

Code: Select all

SELECT AggregateBy(
        'SUM($value)',
        'ts_1hour',
        sensor_values,
        0,
        '2015-01-26 00:00'::datetime year to minute,
        '2015-01-31 23:59'::datetime year to minute)
FROM sensor_ts
WHERE sensor_id = "Sensor01";
The 'AggregateBy()' function is returning a TIMESERIES object with the aggregated values based on the supplied 'ts_1hour' calendar for the date/time range from/to provided. The result looks like this:

Code: Select all

(expression)  origin(2015-01-26 00:00:00.00000), calendar(ts_1hour), container(sensor_cont), threshold(0), regular, [NULL, NULL, NULL, NULL, NULL, NULL, NULL, (40.90), (65.20), NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, (53.20)]

1 row(s) retrieved.
Each element of the returned time series contains hourly aggregated values. If there are no measurements in the base time series the aggregated value is NULL. BTW for these exercises I am using already pre-defined calendar definitions. They are stored in the 'calendartable' table.
If you need a different aggregation granularity, simply supply a different calendar to the 'AggregateBy()' function. In the following example I am using exactly the same SELECT statement like before, except for a different calendar ('ts_1day'):

Code: Select all

SELECT AggregateBy(
        'SUM($value)',
        'ts_1day',
        sensor_values,
        0,
        '2015-01-26 00:00'::datetime year to minute,
        '2015-01-31 23:59'::datetime year to minute)
FROM sensor_ts
WHERE sensor_id = "Sensor01";
...with the following result:

Code: Select all

(expression)  origin(2015-01-26 00:00:00.00000), calendar(ts_1day), container(sensor_cont), threshold(0), regular, [(106.10), (53.20)]

1 row(s) retrieved.
Since your application probably doesn't know how to handle an Informix time series object directly, let's convert those objects on the fly into a tabular format for a more classic relational processing. We are using a combination of the 'Transpose()' function and the 'Table()' constructor to create the derived table 'sensor':

Code: Select all

SELECT sensor.timestamp::datetime year to hour, sensor.value FROM TABLE (
TRANSPOSE ((
SELECT AggregateBy(
        'SUM($value)',
        'ts_1hour',
        sensor_values,
        0,
        '2015-01-26 00:00'::datetime year to minute,
        '2015-01-31 23:59'::datetime year to minute)
FROM sensor_ts
WHERE sensor_id = "Sensor01"
))::sensor_t
) AS TAB (sensor);
...with the following result set:

Code: Select all

timestamp                value

2015-01-26 07            40.90
2015-01-26 08            65.20
2015-01-27 10            53.20

3 row(s) retrieved.
Note that I also adjusted the format of the timestamp column to an hourly format.
Finally re-run the daily aggregation in a similar way:

Code: Select all

SELECT sensor.timestamp::datetime year to day timestamp, sensor.value FROM TABLE (
TRANSPOSE ((
SELECT AggregateBy(
        'SUM($value)',
        'ts_1day',
        sensor_values,
        0,
        '2015-01-26 00:00'::datetime year to minute,
        '2015-01-31 23:59'::datetime year to minute)
FROM sensor_ts
WHERE sensor_id = "Sensor01"
))::sensor_t
) AS TAB (sensor);
with the following results:

Code: Select all

timestamp             value

2015-01-26           106.10
2015-01-27            53.20

2 row(s) retrieved.
Increase your application's flexibility with a JSON based sensor data table!
Now that you have learned the basics on how to create an Informix sensor data table based on structured data types so far, let's have a brief look on how to create a JSON based sensor data table.
The following steps are more or less the same like before, except that the required Informix row type looks differently:

Code: Select all

create row type sensor_json_t
(
        timestamp       datetime year to fraction(5),
        values           bson
);
BSON is the binary representation of a JSON document. Having a JSON document in binary format makes it easier for the database to work with it.
In the following SQL script we are creating a new time series container for the JSON time series, a new base table with the TIMESERIES column and a new VTI table to easily access the JSON based sensor data:

Code: Select all

execute procedure TSContainerCreate (
        'sensor_json_cont',
        'rootdbs',
        'sensor_json_t',
        2048,
        2048);

create table sensor_json_ts (
        sensor_id char(40),
        sensor_type char(20),
        sensor_unit char(6),
        sensor_values timeseries(sensor_json_t),
        primary key (sensor_id)
) lock mode row;

execute procedure TSCreateVirtualTab (
        'sensor_json_data',
        'sensor_json_ts',
        'origin(2015-01-26 00:00:00.00000),calendar(ts_1min),container(sensor_json_cont),threshold(0),regular',0,'sensor_values');
As soon as you have the VTI table 'sensor_json_data' created, you can start to insert new rows with JSON formatted sensor data. Notice that 'Sensor03' and 'Sensor04' have different number of measurement values ('temp1' and 'temp' respectively):

Code: Select all

insert into sensor_json_data values ("Sensor03", "Temp", "C", "2015-01-26 08:00"::datetime year to minute, '{ "temp1":21.5 }'::json);
insert into sensor_json_data values ("Sensor03", "Temp", "C", "2015-01-26 08:01"::datetime year to minute, '{ "temp1":23.1 }'::json);
insert into sensor_json_data values ("Sensor03", "Temp", "C", "2015-01-27 10:45"::datetime year to minute, '{ "temp1":22.8 }'::json);
insert into sensor_json_data values ("Sensor04", "Temp", "C", "2015-01-26 12:02"::datetime year to minute, '{ "temp1":30.2, "temp2":10.3}'::json);
insert into sensor_json_data values ("Sensor04", "Temp", "C", "2015-01-27 15:46"::datetime year to minute, '{ "temp1":34.0, "temp2":9.7}'::json);
insert into sensor_json_data values ("Sensor04", "Temp", "C", "2015-01-27 15:47"::datetime year to minute, '{ "temp1":33.9, "temp2":9.2}'::json);
Now let's select the data which we just inserted:

Code: Select all

select sensor_id, sensor_type, sensor_unit, timestamp, values::json values from sensor_json_data;
...with the following results:

Code: Select all

sensor_id    Sensor03
sensor_type  Temp
sensor_unit  C
timestamp    2015-01-26 08:00:00.00000
values       {"temp1":21.500000}

sensor_id    Sensor03
sensor_type  Temp
sensor_unit  C
timestamp    2015-01-26 08:01:00.00000
values       {"temp1":23.100000}

sensor_id    Sensor03
sensor_type  Temp
sensor_unit  C
timestamp    2015-01-27 10:45:00.00000
values       {"temp1":22.800000}

sensor_id    Sensor04
sensor_type  Temp
sensor_unit  C
timestamp    2015-01-26 12:02:00.00000
values       {"temp1":30.200000,"temp2":10.300000}

sensor_id    Sensor04
sensor_type  Temp
sensor_unit  C
timestamp    2015-01-27 15:46:00.00000
values       {"temp1":34.000000,"temp2":9.700000}

sensor_id    Sensor04
sensor_type  Temp
sensor_unit  C
timestamp    2015-01-27 15:47:00.00000
values       {"temp1":33.900000,"temp2":9.200000}

6 row(s) retrieved.
Now that's really cool: although we are now using JSON documents to store the sensor data, we can still use the same time series functions to do the aggregations:

Code: Select all

SELECT sensor.timestamp::datetime year to hour timestamp, sensor.value temp1 FROM TABLE (
TRANSPOSE ((
SELECT AggregateBy(
        'SUM($temp1)',
        'ts_1hour',
        sensor_values,
        0,
        '2015-01-26 00:00'::datetime year to minute,
        '2015-01-31 23:59'::datetime year to minute)::TimeSeries(sensor_t)
FROM sensor_json_ts
WHERE sensor_id = "Sensor03"
))
) AS TAB (sensor);
Notice that there is an explicit cast to create a timeseries object based on the 'sensor_t' row type which contains one 'value' element. The results of that query:

Code: Select all

timestamp                temp1

2015-01-26 08            44.60
2015-01-27 10            22.80

2 row(s) retrieved.
If you would like to do an aggregation on two JSON fields, you need to a create a new row type with two numeric elements, e.g. value1 and value2:

Code: Select all

create row type sensor_twovals_t
(
        timestamp       datetime year to fraction(5),
        value1           decimal(15,2),
        value2           decimal(15,2)
);
The associated query looks like:

Code: Select all

SELECT sensor.timestamp::datetime year to hour timestamp,
        sensor.value1 temp1,  sensor.value2 temp2 FROM TABLE (
TRANSPOSE ((
SELECT AggregateBy(
        'SUM($temp1),SUM($temp2)',
        'ts_1hour',
        sensor_values,
        0,
        '2015-01-26 00:00'::datetime year to minute,
        '2015-01-31 23:59'::datetime year to minute)::TimeSeries(sensor_twovals_t)
FROM sensor_json_ts
WHERE sensor_id = "Sensor04"
))
) AS TAB (sensor);
...with the following results:

Code: Select all

timestamp                temp1            temp2

2015-01-26 12            30.20            10.30
2015-01-27 15            67.90            18.90

2 row(s) retrieved.
That concludes part 1 on how to create a sensor database with IBM Informix. Have fun while exploring the Informix time series technology for handling sensor data on the Raspberry Pi.

- Alexander
Last edited by AlexKoe on Wed Mar 16, 2016 12:06 pm, edited 2 times in total.

AlexKoe
Posts: 25
Joined: Tue Jan 20, 2015 3:26 pm
Contact: Website

Re: How to create a sensor database with IBM Informix - Part

Wed Jan 28, 2015 1:14 pm

A simple use case: read out the internal temperature of an Raspberry Pi every 60 secs and store it in the Informix 'sensor_data' table


Ok, since we now have a working Informix instance and a database to store some sensor data in, let's have some fun and use that setup to store some real sensor data. In my small example I going to use only the Informix database to automatically read out the Raspberry Pi's internal temperature every 60 seconds and store it in the 'sensor_data' table which I did create in the how-to above.
The RPi's internal temperature can be accessed through the following file:

Code: Select all

/sys/class/thermal/thermal_zone0/temp
The current RPi temperature is stored there in milli-degrees centigrade as an ASCII string.

To access that value from within Informix we are going to create a so called EXTERNAL TABLE. EXTERNAL TABLES allow easy access from and to external files and hence are great e.g. for loading and/or exporting data. Let's create an EXTERNAL TABLE called 'rpi_temp':

Code: Select all

create external table rpi_temp
(
        temp1 integer external char(6)
)
using
(
        FORMAT "FIXED",
        DATAFILES
        (
                "DISK:/sys/class/thermal/thermal_zone0/temp"
        )
);
So each time we do e.g. a 'SELECT temp1/1000 from rpi_temp' we will get the current RPi temperature in centigrades. Note that we need to divide by 1000 in order to convert the raw value from milli-degrees centigrade to centigrade:

Code: Select all

echo "select temp1/1000 from rpi_temp" | dbaccess sensor_db -

Database selected

    (expression)

48.6920000000000

1 row(s) retrieved.
Now we need to find a way to read out that table every 60 seconds and store it in the sensor_data table. There might be different ways of doing it, e.g. by running an external script or program every 60 seconds. For my example I am going to use Informix's built-in scheduler. Its kind like a database crontab.
All so called 'tasks' are stored in the 'ph_task' table in the 'sysadmin' database. So all what we need to do is to write a task which is going to execute the following INSERT statement every 60 seconds:

Code: Select all

insert into sensor_data
        select "SensorRPiTemp", "Temperature", "C",  current::datetime year to minute,
                (temp1/1000)::DECIMAL(15,2) from rpi_temp;
Now create a new task by inserting a new row into the 'sysadmin:ph_task' table as user 'informix':

Code: Select all

INSERT INTO ph_task
(
        tk_name,
        tk_description,
        tk_type,
        tk_group,
        tk_execute,
        tk_start_time,
        tk_stop_time,
        tk_frequency,
        tk_dbs
)
VALUES
(
        "Read Out RPi Temp",
        "Reads out the RPi internal temp every 60 secs",
        "TASK",
        "MISC",
        "insert into sensor_db:sensor_data select 'SensorRPiTemp', 'Temperature', 'C', current::datetime year to minute, (temp1/1000)::DECIMAL(15,2) from sensor_db:rpi_temp",
        NULL,
        NULL,
        interval(1) minute to minute,
        "sensor_db"
)
Since the Informix scheduler is caching the content of the 'ph_task' table, you should stop and re-start the scheduler with the following SQL statements as user 'informix':

Code: Select all

echo "EXECUTE FUNCTION task('scheduler shutdown')" | dbaccess sysadmin -
echo "EXECUTE FUNCTION task('scheduler start')" | dbaccess sysadmin -
Now the task will run indefinitely (since we set 'tk_stop_time' to NULL) and will fill the 'sensor_data' table with the current Raspberry Pi internal temperature.
With the following SELECT statement you can read out the last five entries (temperature entries) from the 'sensor_data' table in descending order:

Code: Select all

echo "select first 5 * from sensor_data where sensor_id = 'SensorRPiTemp' order by timestamp desc" | dbaccess sensor_db -
Run some simple aggregations on the stored RPi temperature values in the 'sensor_data' table
Since we are now continuously collecting the RPi temperature data, let's run some simple statistics on those temperature values. How about some hourly min, max, avg values? For that query we do need an additional row type with three values:

Code: Select all

create row type sensor_threevals_t
(
        timestamp       datetime year to fraction(5),
        value1           decimal(15,2),
        value2           decimal(15,2),        
        value3           decimal(15,2)
);
And now let's do the hourly aggregation directly on the temperature data:

Code: Select all

SELECT sensor.timestamp::datetime year to hour timestamp,
        sensor.value1 min, sensor.value2 max, sensor.value3 avg
FROM TABLE (
TRANSPOSE ((
SELECT AggregateBy(
        'min($value), max($value), avg($value)',
        'ts_1hour',
        sensor_values,
        0,
        '2015-01-01 00:00'::datetime year to minute,
        '2015-12-31 23:59'::datetime year to minute)::TimeSeries(sensor_threevals_t)
FROM sensor_ts
WHERE sensor_id = "SensorRPiTemp"
))
) AS TAB (sensor);
...with the following results (example):

Code: Select all

timestamp                  min              max              avg

2015-01-28 11            48.15            48.69            48.42
2015-01-28 12            48.15            49.23            48.75
2015-01-28 13            48.69            49.23            48.72

3 row(s) retrieved.
An now you can change the aggregation granularity by simply providing a different calendar. E.g. using a daily aggregation:

Code: Select all

SELECT sensor.timestamp::datetime year to day timestamp,
        sensor.value1 min, sensor.value2 max, sensor.value3 avg
FROM TABLE (
TRANSPOSE ((
SELECT AggregateBy(
        'min($value), max($value), avg($value)',
        'ts_1day',
        sensor_values,
        0,
        '2015-01-01 00:00'::datetime year to minute,
        '2015-12-31 23:59'::datetime year to minute)::TimeSeries(sensor_threevals_t)
FROM sensor_ts
WHERE sensor_id = "SensorRPiTemp"
))
) AS TAB (sensor);
...with the following result (example):

Code: Select all

timestamp               min              max              avg

2015-01-28            48.15            49.23            48.73

1 row(s) retrieved.
...or a weekly aggregation:

Code: Select all

SELECT sensor.timestamp::datetime year to day timestamp,
        sensor.value1 min, sensor.value2 max, sensor.value3 avg
FROM TABLE (
TRANSPOSE ((
SELECT AggregateBy(
        'min($value), max($value), avg($value)',
        'ts_1week',
        sensor_values,
        0,
        '2015-01-01 00:00'::datetime year to minute,
        '2015-12-31 23:59'::datetime year to minute)::TimeSeries(sensor_threevals_t)
FROM sensor_ts
WHERE sensor_id = "SensorRPiTemp"
))
) AS TAB (sensor);
...with the following result (example):

Code: Select all

timestamp               min              max              avg

2015-01-25            48.15            49.23            48.73

1 row(s) retrieved.
Notice: the default 'ts_1week' calendar is based on a U.S. week which starts on a Sunday. To use a weekly calendar which starts on a Monday, simply INSERT a new calendar into the 'calendartable' in the 'sensor_db' database:

Code: Select all

echo 'INSERT INTO calendartable (c_name, c_calendar) values ("ts_1week_m", "startdate(2011-01-03 00:00:00.00000),pattstart(2011-01-03 00:00:00.00000),pattern({1 on},week)")' | dbaccess sensor_db -
And then re-run the last SELECT statement with that new calendar:

Code: Select all

SELECT sensor.timestamp::datetime year to day timestamp,
        sensor.value1 min, sensor.value2 max, sensor.value3 avg
FROM TABLE (
TRANSPOSE ((
SELECT AggregateBy(
        'min($value), max($value), avg($value)',
        'ts_1week_m',
        sensor_values,
        0,
        '2015-01-01 00:00'::datetime year to minute,
        '2015-12-31 23:59'::datetime year to minute)::TimeSeries(sensor_threevals_t)
FROM sensor_ts
WHERE sensor_id = "SensorRPiTemp"
))
) AS TAB (sensor);
...with the following result (example):

Code: Select all

timestamp               min              max              avg

2015-01-26            48.15            49.23            48.73

1 row(s) retrieved.
Notice the different day ('timestamp') in the result set. ;-)

Cool Feature: Expression VTI tables
Before I finish that posting, let me quickly introduce another cool Informix time series feature: the 'Expression Virtual Table' or 'Expression VTI Table'. An (read-only!) expression virtual table combines the tabular view onto an Informix time series with a dynamic expression which is executed each time you select from that table.
Example: if you would like to display the hourly aggregated Raspberry Pi temperature values it would be probably very easy for your application, if you could select those values, dynamically calculated from a simple 'relational-like' table. That's where an expression virtual table 'kicks' in.
So let's quickly create an expression virtual table which provides min, max and avg values for the RPi temperatures on-the-fly:

Code: Select all

EXECUTE PROCEDURE TSCreateExpressionVirtualTab(
       'sensor_data_min_max_avg', 'sensor_ts',
       'AggregateBy("min($value),max($value),avg($value)", "ts_1hour", sensor_values, 0, $ts_begin_time, $ts_end_time)',
       'sensor_threevals_t', 0, 'sensor_values');
Now you can simply query that new virtual table 'sensor_data_min_max_avg' and each time you will the hourly aggregated values:

Code: Select all

select sensor_id, sensor_type, sensor_unit,
        timestamp::datetime year to hour timestamp,
        value1 min, value2 max, value3 avg
from sensor_data_min_max_avg
where sensor_id = "SensorRPiTemp"
This concludes that how-to section.

Have fun while trying the examples above.

- Alexander

msd
Posts: 6
Joined: Sun Feb 21, 2016 8:29 am

Re: How to create a sensor database with IBM Informix - Part

Tue Feb 23, 2016 7:57 am

Hi Alexander,

selecting data on a RPi1 from sensor_data table is very slow, even when reading a single value, do you know the reason why is it like that?

BR
Marco

AlexKoe
Posts: 25
Joined: Tue Jan 20, 2015 3:26 pm
Contact: Website

Re: How to create a sensor database with IBM Informix - Part

Tue Feb 23, 2016 9:38 am

msd wrote:Hi Alexander,

selecting data on a RPi1 from sensor_data table is very slow, even when reading a single value, do you know the reason why is it like that?

BR
Marco
Hi Marco,

it would be great, if you could share how your (intended) query looks like, how do you execute your query (e.g. via dbaccess or the REST API) and what would you like to achieve with your query.

Also do you run your tests on a RPi or on a RPi2?

There could be many different reasons why a query might run slow including network timeouts while connecting to the database, a non-optimal query execution path to access the time series data, some HW/OS issue or some kind of Informix configuration issue.

- Alexander

msd
Posts: 6
Joined: Sun Feb 21, 2016 8:29 am

Re: How to create a sensor database with IBM Informix - Part

Tue Feb 23, 2016 4:00 pm

AlexKoe wrote: it would be great, if you could share how your (intended) query looks like, how do you execute your query (e.g. via dbaccess or the REST API) and what would you like to achieve with your query.

Also do you run your tests on a RPi or on a RPi2?
Hi, yes sorry for not giving a better explanation:

I have a RPi (1) and have installed informix 1210UC6DE using your guide, which worked fine.

Now I set up my sensor network with 9 sensors. Data is sent over radio, I have a small app (TCL script) on the RPi that polls from the serial interface and fills the DB using console commands. Inserting one value needs around 1s, but this is not a problem because data comes every minute.

Running on the same RPi I have nginx with PHP-FPM and the PDO_INFORMIX (howto here btw: http://stackoverflow.com/a/19932259/2248541)

Now I want to get latest data from all sensors:

I didn't figure a better way in SQL a to do multiple single queries:

Code: Select all

SELECT value, timestamp::datetime year to minute as time FROM sensor_data WHERE sensor_id "Sensor03" ORDER BY timestamp DESC LIMIT 1
Probably there are better ways, but if I do this in the console:

Code: Select all

time echo "SELECT value, timestamp::datetime year to minute as time FROM sensor_data WHERE sensor_id = "Sensor03" ORDER BY timestamp DESC LIMIT 1" | dbaccess sensor_db -     

Database selected.

           value time
            1.37 2016-02-23 16:35

1 row(s) retrieved.

Database closed.

real    0m0.637s
user    0m0.020s
sys     0m0.050s
I think 0.6s I way to long to retrieve a single value.

I selected data the same way on MYSQL before and it was much faster, so I don't think it lies in the SD card speed...

On the other hand selecting from sensor_ts is much faster (not a VTI?):

Code: Select all

time echo "select sensor_id from sensor_ts" | dbaccess sensor_db -

Database selected.

sensor_id

Sensor01
Sensor02
Sensor03
Sensor04
Sensor05
Sensor06
Sensor07
Sensor08
Sensor09

9 row(s) retrieved.

Database closed.

real    0m0.148s
user    0m0.040s
sys     0m0.040s
What am I missing?

AlexKoe
Posts: 25
Joined: Tue Jan 20, 2015 3:26 pm
Contact: Website

Re: How to create a sensor database with IBM Informix - Part

Tue Feb 23, 2016 8:19 pm

Hi Marco,

although the VTI table provides you with a handy relational view of your time series data, its unfortunately not optimal for all SQL operations (like in your case the 'ORDER BY timestamp DESC LIMIT 1').

In that case its much more efficient to use one of the built-in time series SQL functions and to operate directly on the time series base table ('sensor_ts'). The function you are looking for is in your case 'GetLastElem()'.

Please try the following SELECT statement and let me know how it performs:

Code: Select all

SELECT GetLastElem(sensor_values).value from sensor_ts where sensor_id = 'Sensor03'
For more information on the built-in Informix time series functions, please refer to the online documentation:
http://www-01.ibm.com/support/knowledge ... tm?lang=en

If you have any additional questions, just let me know. ;)

- Alexander

msd
Posts: 6
Joined: Sun Feb 21, 2016 8:29 am

Re: How to create a sensor database with IBM Informix - Part

Wed Feb 24, 2016 8:14 am

AlexKoe wrote:Please try the following SELECT statement and let me know how it performs:

Code: Select all

SELECT GetLastElem(sensor_values).value from sensor_ts where sensor_id = 'Sensor03'
Hi Alexander, this works great, exactly as expected. Thank you!

Is there any function to get e.g. last 24 hours of sensor data, other than just this following?

Code: Select all

SELECT ...  WHERE timestamp > CURRENT - 24 UNITS HOUR;
BR
Marco

AlexKoe
Posts: 25
Joined: Tue Jan 20, 2015 3:26 pm
Contact: Website

Re: How to create a sensor database with IBM Informix - Part

Wed Feb 24, 2016 11:13 am

msd wrote: Hi Alexander, this works great, exactly as expected. Thank you!

Is there any function to get e.g. last 24 hours of sensor data, other than just this following?

Code: Select all

SELECT ...  WHERE timestamp > CURRENT - 24 UNITS HOUR;
BR
Marco
Hi Marco,

depending on your requirements, that approach would be the easiest way to handle such a query. You might want to add the primary key column(s) to your WHERE condition.

Code: Select all

SELECT ...  WHERE timestamp > CURRENT - 24 UNITS HOUR AND sensor_id = 'Sensor03';
The VTI table actually pushes both, the primary key values and the time stamp(s) down to the actual time series and hence the performance should be quite good.
Alternatively you could also access the time series base table 'sensor_ts' in combination with the time series 'Clip()' function:

Code: Select all

SELECT Clip(sensor_values, CURRENT - 24 UNITS HOUR, CURRENT) from sensor_ts where sensor_id = 'Sensor03';
But that query would return the time series values in its internal format which likely needs some further client side processing, or you create an in-memory table based on the result set for easier client processing:

Code: Select all

SELECT sensor.timestamp, sensor.value
FROM TABLE (
TRANSPOSE ((
SELECT Clip(sensor_values, CURRENT - 24 UNITS HOUR, CURRENT)::TimeSeries(sensor_t) FROM sensor_ts
WHERE sensor_id = "Sensor03"
))
) AS TAB (sensor);
- Alexander

msd
Posts: 6
Joined: Sun Feb 21, 2016 8:29 am

Re: How to create a sensor database with IBM Informix - Part

Wed Feb 24, 2016 4:56 pm

Hi Alexander, thank you for your patience.

Regarding adding a primary key: it does not make it faster, but it's ok, it's a RPi!

At the end I wanted to get data from all sensors at once, and I used a VTI to do that aggregating data. I averaged over 1h or 1d for longer time ranges (1 week or more).

It looks like this: http://i.imgur.com/3eGOXpJ.png

AlexKoe
Posts: 25
Joined: Tue Jan 20, 2015 3:26 pm
Contact: Website

Re: How to create a sensor database with IBM Informix - Part

Wed Feb 24, 2016 8:39 pm

msd wrote:Hi Alexander, thank you for your patience.

Regarding adding a primary key: it does not make it faster, but it's ok, it's a RPi!

At the end I wanted to get data from all sensors at once, and I used a VTI to do that aggregating data. I averaged over 1h or 1d for longer time ranges (1 week or more).

It looks like this: http://i.imgur.com/3eGOXpJ.png
Hi Marco,

you are very welcome!

I guess that your questions and the related answers are also very valuable for others who might be reading those posts and interested in that topic. ;)

Just curious: how are you doing your aggregations?

Also: your UI looks nice. What are you using for displaying your data?

BTW, I am currently working on an (initially) simple Informix plugin for Grafana 2.6 (http://grafana.org/).

- Alexander

msd
Posts: 6
Joined: Sun Feb 21, 2016 8:29 am

Re: How to create a sensor database with IBM Informix - Part

Thu Feb 25, 2016 8:32 am

By aggregation I mean using a VTI to be able to select all sensor data at once and using the AggregateBy function:

Code: Select all

EXECUTE PROCEDURE TSCreateExpressionVirtualTab(
       'sensor_data_1h_avg', 'sensor_ts',
       'AggregateBy("avg($value)", "ts_1hour", sensor_values, 0, $ts_begin_time, $ts_end_time)',
       'sensor_t', 0, 'sensor_values');
and then simply

Code: Select all

SELECT * from sensor_data_1h_avg WHERE timestamp > CURRENT - 7 UNITS DAY;
To display data I use the HIGHSTOCK chart from HIGHCHARTS
A nice demo similar to my use: http://www.highcharts.com/stock/demo/compare

BTW Grafana looks cool too!

AlexKoe
Posts: 25
Joined: Tue Jan 20, 2015 3:26 pm
Contact: Website

Re: How to create a sensor database with IBM Informix - Part

Fri Feb 26, 2016 10:46 am

msd wrote:By aggregation I mean using a VTI to be able to select all sensor data at once and using the AggregateBy function:

Code: Select all

EXECUTE PROCEDURE TSCreateExpressionVirtualTab(
       'sensor_data_1h_avg', 'sensor_ts',
       'AggregateBy("avg($value)", "ts_1hour", sensor_values, 0, $ts_begin_time, $ts_end_time)',
       'sensor_t', 0, 'sensor_values');
Hi Marco,

that is a very good approach! ;)

Alternatively you could also use the following approach and supply one of the different pr-defined calendars programmatically on the fly (eg. replace 'ts_1hour' with eg. 'ts_1day' or 'ts_1week'), or you can use your own custom calendar:

Code: Select all

SELECT sensor.timestamp::datetime year to hour, sensor.value FROM TABLE (
TRANSPOSE ((
SELECT AggregateBy(
        'AVG($value)',
        'ts_1hour',
        sensor_values,
        0,
        'CURRENT - 7 UNITS DAY'::datetime year to second,
        'CURRENT'::datetime year to second)
FROM sensor_ts
WHERE sensor_id = "Sensor01"
))::sensor_t
) AS TAB (sensor);
- Alexander

AlexKoe
Posts: 25
Joined: Tue Jan 20, 2015 3:26 pm
Contact: Website

Re: How to create a sensor database with IBM Informix - Part

Fri Feb 26, 2016 5:25 pm

msd wrote:
It looks like this: http://i.imgur.com/3eGOXpJ.png
BTW, some time ago I also setup a backyard weather station at home (based on a Raspberry Pi 2 w/ Informix) and cooked up the following very basic UI (JavaScript based): http://i.imgur.com/HM5QnUr.png ;)

- Alexander

Return to “Automation, sensing and robotics”