Graphing CSV data with matplotlib
Loading the csv data.
Python is capable of handling csv data using the csv module.
Create a new Python file and call it
First you'll need to import the reader function from the csv module.
from csv import reader
Next you can open the csv sheet and store the data in a list.
with open('filename.csv', 'r') as f:
    data = list(reader(f))
Replace filename.csv with the name of the file you are using.
- Run the file and then switch over to the interpreter, where you can have a look at the contents of the data list that you have created. Typing data[0] will show you the headers of the csv sheet.
- If you want to have a look at the first set of values in the data, then you can use data[1].
- You can see the last item in the list by typing data[-1].
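As a quick illustration of this kind of indexing, here is a minimal sketch that reads a made-up CSV from a string rather than a file, so it runs without any data file. The shortened set of columns and the values are invented for the example; only reader and ordinary list indexing come from the steps above.

```python
from csv import reader
from io import StringIO

# Made-up sample standing in for the real csv file (an assumption for illustration).
sample = StringIO(
    "ROW_ID,temp_p,time_stamp\n"
    "1,27.5,2015-08-24 08:59:54\n"
    "2,27.6,2015-08-24 09:00:04\n"
)

data = list(reader(sample))

header = data[0]      # the first row holds the column names
first_row = data[1]   # the first row of actual values
last_row = data[-1]   # a negative index counts back from the end of the list

print(header)
print(first_row)
print(last_row)
```

Note that every value comes back as a string, including the numbers.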
Getting specific data sets.
At the moment the values are stored as a list of lists. To graph the data, you'll want specific data sets. For instance, you might want to get the temperature.
Looking at the headers of the csv sheet, there is a choice of temperature measurements to choose from.
'ROW_ID', 'temp_cpu', 'temp_h', 'temp_p', 'humidity', 'pressure', 'pitch', 'roll', 'yaw', 'mag_x', 'mag_y', 'mag_z', 'accel_x', 'accel_y', 'accel_z', 'gyro_x', 'gyro_y', 'gyro_z', 'reset', 'time_stamp'
temp_p, which is the temperature recorded by the pressure sensor, is probably the best one to use. It is at index 3 in each row (the fourth item), as lists are indexed from 0.
You can use a list comprehension to extract the temperatures from the list of lists. This line of code takes the value at index 3 from every item in data.
temp = [i[3] for i in data]
You can look at the contents of some of the temp data by typing into the interpreter. Typing temp[0] will show you the start of the list, which is the header.
Typing temp[:10] will show you the first 10 items in the list.
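You don't actually want the header, so you can alter the list comprehension so that the first item of the data list is ignored. The values read from the file are strings, so it is also worth converting each one to a number with float() so that it is plotted on a numeric axis.
temp = [float(i[3]) for i in data[1:]]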
You can do the same again to get the date-time stamps from the time_stamp column, which is the last item in each row.
time = [i[-1] for i in data[1:]]
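To see how the two list comprehensions behave, here is a small sketch using made-up rows in the same shape that list(reader(f)) produces. The shortened header and the values are invented for the example, and the float() conversion is an addition that assumes every temperature cell contains a valid number.

```python
# Made-up rows in the same shape as the data list:
# every value, including the numbers, is a string.
data = [
    ['ROW_ID', 'temp_cpu', 'temp_h', 'temp_p', 'time_stamp'],
    ['1', '31.8', '27.5', '25.1', '2015-08-24 08:59:54'],
    ['2', '31.9', '27.4', '25.2', '2015-08-24 09:00:04'],
]

# data[1:] skips the header row; i[3] picks the temp_p column.
temp = [float(i[3]) for i in data[1:]]

# i[-1] picks the last column, which holds the time stamp.
time = [i[-1] for i in data[1:]]

print(temp)
print(time)
```

Using i[-1] for the time stamp means you don't have to count the columns, as long as time_stamp stays the last header.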
Your first graph
The Matplotlib module (along with numpy and scipy) is one of the reasons so many mathematicians and scientists use Python. It's an excellent way of drawing graphs.
- You'll need to import the module, to begin with:
from matplotlib import pyplot
- You need only two lines to graph your data:
pyplot.plot(range(len(temp)), temp)
pyplot.show()
pyplot.plot() needs to be given two iterables. In the above, you have given it range(len(temp)), which is all the whole numbers from 0 up to, but not including, the length of the temp list. You've also given it temp, which is the set of temperatures.
pyplot.show() draws the graph. Save and run your file.
Adding dates and times
At the moment, the date-time isn't being used in the graph. You can't really use it yet because each time stamp is stored as a string, which is not a format that matplotlib can recognise as a date.
Each string needs changing to a datetime object, which is fortunately quite easy.
You need to import the parser module from the dateutil package to begin with.
from dateutil import parser
To convert a date in string format to a datetime object, the syntax is fairly simple: you pass the string to parser.parse(), which returns a datetime object.
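As an illustration, here is a minimal sketch of converting a single string. The time stamp below is made up, and the exact format in your own file may differ; parser.parse() recognises many common date formats without being told which one to expect.

```python
from dateutil import parser

# A made-up time stamp for illustration; your file's format may differ.
stamp = '2015-08-24 08:59:54'

# parser.parse() works out the format and returns a datetime object.
moment = parser.parse(stamp)

print(type(moment))
print(moment.year, moment.hour)
```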
You want to convert each date that is added to the time list, so you can edit your list comprehension to read:
time = [parser.parse(i[-1]) for i in data[1:]]
Now you can change your pyplot.plot() call so it looks like this:
pyplot.plot(time, temp)
pyplot.show()
Your complete code should look like:
from matplotlib import pyplot, dates
from csv import reader
from dateutil import parser

with open('../data/astro_pi_data_20150824_085954.csv', 'r') as f:
    data = list(reader(f))

# Skip the header row; convert temperatures to numbers and time stamps to datetimes.
temp = [float(i[3]) for i in data[1:]]
time = [parser.parse(i[-1]) for i in data[1:]]

pyplot.plot(time, temp)
pyplot.show()
Adding Titles and Axis Labels
Graphs should always have a title and labelled axes. Again, this is simple with matplotlib.
First you can add a title:
pyplot.title('Temperature changes over Time')
Then the x axis:
And lastly the y axis:
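A minimal sketch of a fully labelled graph, assuming pyplot.xlabel() and pyplot.ylabel() are the calls used for the two axes. The label text and the sample readings are made up for illustration, and the Agg backend is selected only so that the sketch runs on a machine without a display.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so no window is needed
from matplotlib import pyplot

# Made-up readings, for illustration only.
temp = [25.1, 25.2, 25.4]

pyplot.plot(range(len(temp)), temp)
pyplot.title('Temperature changes over Time')
pyplot.xlabel('Reading number')   # label text is an assumption
pyplot.ylabel('Temperature')      # label text is an assumption
```

In your own file, leave out the matplotlib.use('Agg') line, add these calls before pyplot.show(), and choose labels that match your data and its units.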