How to group rows by time in PostgreSQL (and in Django)

Sometimes you need to group rows by a time interval, say, one value per minute or per hour. In this post, we will look at how to do that with plain PostgreSQL and with the Django ORM.

When do you need it? Imagine you have a table that stores weather temperature measurements, and a script that takes a temperature snapshot every 10 seconds. In one hour it collects 360 measurements, which adds up to a lot if the script runs for several months. Now you want to render the measurements as data points on a line chart, and you also want your users to be able to reduce the data set to one point per hour, day, or week. That is where this snippet comes in!

Setting up the database

Before we start experimenting with the SQL query, let’s set up a database table and fill it with a sample data set. I assume you have already created a database.
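A minimal sketch of such a table, assuming it is called `measurements` and has just two columns, `created_at` and `temperature` (all three names are illustrative assumptions). The helper takes any DB-API 2.0 connection, for example one opened with psycopg2.

```python
# Hypothetical schema: the table and column names are assumptions for illustration.
CREATE_TABLE_SQL = """
CREATE TABLE measurements (
    created_at  timestamptz   NOT NULL,
    temperature numeric(4, 1) NOT NULL
);
"""


def create_table(conn):
    """Create the sample table through a DB-API 2.0 connection (e.g. psycopg2)."""
    cur = conn.cursor()
    try:
        cur.execute(CREATE_TABLE_SQL)
    finally:
        cur.close()
    conn.commit()
```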

I omit indexes and the primary key here to keep the example simple.

So we have a table. Let’s fill it with sample temperature measurements.
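One way to generate the sample data is PostgreSQL's `generate_series()` with a 10-second step. The sketch below assumes the hypothetical `measurements` table with `created_at` and `temperature` columns, an arbitrary 31-day period (January 2019), and random temperatures between 0 and 30 degrees. The small `expected_row_count` helper is only there to sanity-check how many rows such a range produces.

```python
from datetime import datetime, timedelta

# Assumed sample period: one 31-day month, one reading every 10 seconds.
START = datetime(2019, 1, 1, 0, 0, 0)
END = datetime(2019, 1, 31, 23, 59, 50)
STEP = timedelta(seconds=10)

# generate_series() yields the timestamps (both bounds inclusive);
# random() fakes a temperature. Table and column names are hypothetical.
FILL_SQL = """
INSERT INTO measurements (created_at, temperature)
SELECT ts, round((random() * 30)::numeric, 1)
FROM generate_series(
    timestamptz '2019-01-01 00:00:00',
    timestamptz '2019-01-31 23:59:50',
    interval '10 seconds'
) AS ts;
"""


def expected_row_count(start, end, step):
    """Number of timestamps in an inclusive range walked in fixed steps."""
    return (end - start) // step + 1


def fill_table(conn):
    """Insert the sample rows through a DB-API 2.0 connection (e.g. psycopg2)."""
    cur = conn.cursor()
    try:
        cur.execute(FILL_SQL)
    finally:
        cur.close()
    conn.commit()
```

With these bounds, `expected_row_count(START, END, STEP)` evaluates to 267840, that is 31 days × 24 hours × 360 readings per hour.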

This query will create 267,840 measurements, which is 31 days of data at one reading every 10 seconds. That is more than enough for our example.

Let’s check what we have in the table:
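A quick way to peek at the data is to select the first few rows ordered by time. As before, the `measurements` table and its column names are assumptions, and `preview` accepts any DB-API 2.0 connection.

```python
# Hypothetical table and column names; LIMIT keeps the output short.
PREVIEW_SQL = """
SELECT created_at, temperature
FROM measurements
ORDER BY created_at
LIMIT 5;
"""


def preview(conn):
    """Return the five earliest measurements as a list of rows."""
    cur = conn.cursor()
    try:
        cur.execute(PREVIEW_SQL)
        return cur.fetchall()
    finally:
        cur.close()
```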

SQL to group by time

Now we want to select only one measurement per hour. We can do this by truncating the timestamp to the hour with date_trunc and using the result in a DISTINCT ON expression.
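Here is a sketch of such a query, again assuming the hypothetical `measurements` table with `created_at` and `temperature` columns. PostgreSQL requires the `DISTINCT ON` expression to match the leading `ORDER BY` expression. The extra `created_at` sort key makes the earliest reading of each hour the one that is kept. The `first_per_hour` function is a pure-Python equivalent, included only to show what the query computes.

```python
from datetime import datetime

# Hypothetical table and column names.
HOURLY_SQL = """
SELECT DISTINCT ON (date_trunc('hour', created_at))
       date_trunc('hour', created_at) AS hour,
       temperature
FROM measurements
ORDER BY date_trunc('hour', created_at), created_at;
"""


def first_per_hour(rows):
    """Emulate the query for an iterable of (created_at, temperature) tuples."""
    result = {}
    for created_at, temperature in sorted(rows, key=lambda row: row[0]):
        # Same idea as date_trunc('hour', created_at).
        hour = created_at.replace(minute=0, second=0, microsecond=0)
        # Rows arrive sorted by time, so the first one seen per hour wins.
        result.setdefault(hour, temperature)
    return list(result.items())
```

Replacing `'hour'` with `'day'`, `'week'`, or `'month'` in both places changes the bucket size.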

You see that PostgreSQL returned only one value per hour. That is what we want! In the same way, you can group values by week, month, or any other period supported by the database.

Using the Django ORM to group rows by time

We can do the same with the Django ORM. Assuming we have this model for our table:
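A minimal sketch of such a model, meant to live in an app's `models.py`. The class name `Measurement` and the field names are assumptions chosen for illustration.

```python
from django.db import models


class Measurement(models.Model):
    """One temperature reading (hypothetical model and field names)."""

    # Moment the reading was taken.
    created_at = models.DateTimeField()
    # Temperature with one decimal place, e.g. 21.5.
    temperature = models.DecimalField(max_digits=4, decimal_places=1)
```

Note that Django adds an auto-incrementing `id` primary key to the model unless you declare one yourself.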

you can write the following query to fetch values grouped by the hour (or any other period you wish):
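A sketch of how this might look with `TruncHour` and `distinct()`. It assumes a PostgreSQL database (field arguments to `distinct()` are supported only there), a reasonably recent Django version, and a model with `created_at` and `temperature` fields. The function name `one_per_hour` is hypothetical.

```python
from django.db.models.functions import TruncHour


def one_per_hour(queryset):
    """Return one {hour, temperature} row per hour from a measurements queryset.

    Assumes PostgreSQL and a model with `created_at` and `temperature` fields.
    """
    return (
        queryset
        # Truncate each timestamp to the start of its hour.
        .annotate(hour=TruncHour("created_at"))
        # DISTINCT ON needs matching leading ORDER BY; earliest reading first.
        .order_by("hour", "created_at")
        # Keep only the first row of every hour.
        .distinct("hour")
        .values("hour", "temperature")
    )
```

You would call it as `one_per_hour(Measurement.objects.all())`. Swapping `TruncHour` for `TruncDay`, `TruncWeek`, or `TruncMonth` from the same module changes the period.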

That’s it!

Originally posted on my blog.