Sometimes you need to group rows by a time interval, say, one value per minute or per hour. In this post, we will look at how to do that with plain PostgreSQL and with the Django ORM.
When do you need it? Imagine you have a table that stores temperature measurements, with a new reading taken every 10 seconds. In one hour it collects 360 measurements, which adds up quickly if your script runs for several months. Now you want to render the measurements as data points on a line chart, and you also want your users to be able to thin the data set out to one point per hour, day, or week. This is where the snippet comes in.
Setting up the database
Before we start experimenting with the SQL query, let’s set up a database table and fill it with a sample data set. I assume you have already created a database.
create table temperatures (
    value double precision,
    measured_at timestamp with time zone
);
I omit indexes and the primary key here to keep the example simple.
So we have a table. Let’s fill it with sample temperature measurements.
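insert into temperatures (value, measured_at)
select random() * 40 + 1, ts
-- assumed range: any 31-day month sampled every 10 seconds gives 267840 rows
from generate_series('2019-01-01 00:00:00+00'::timestamptz, '2019-01-31 23:59:50+00'::timestamptz, interval '10 seconds') as ts;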
This query will create 267840 measurements. It is a pretty large data set for our example.
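As a quick sanity check on that number, here is a small Python sketch of the arithmetic. It assumes the sample data covers one 31-day month at one reading every 10 seconds. The concrete dates are only an illustration, and any 31-day range gives the same count.

```python
from datetime import datetime, timedelta

# Assumed range: one 31-day month, one reading every 10 seconds.
start = datetime(2019, 1, 1)
end = datetime(2019, 2, 1)  # exclusive upper bound
step = timedelta(seconds=10)

# Dividing one timedelta by another with // gives a whole number of steps.
row_count = (end - start) // step
print(row_count)  # 267840
```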
Let’s check what we have in the table:
select * from temperatures limit 10
SQL to group by time
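select distinct on (key) *, date_trunc('hour', measured_at) as key
from temperatures order by key, measured_at; -- assumed completion: keeps the earliest row of each hour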
You see that PostgreSQL returned only one value per hour. That is what we want! In the same way, you can group values by day, week, month, or any other period that date_trunc supports.
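If it helps to see the idea outside the database, here is a minimal pure-Python sketch of the same logic: truncate each timestamp to the hour and keep one row per truncated key. The helper names `truncate_to_hour` and `first_per_hour` are made up for this illustration, and the sketch assumes the earliest row in each hour is the one you want to keep.

```python
from datetime import datetime, timedelta

def truncate_to_hour(ts):
    # Rough equivalent of date_trunc('hour', ts) for a Python datetime.
    return ts.replace(minute=0, second=0, microsecond=0)

def first_per_hour(rows):
    # rows is an iterable of (value, measured_at) tuples.
    # Returns a dict mapping each hour to the earliest row in that hour.
    picked = {}
    for value, measured_at in sorted(rows, key=lambda row: row[1]):
        key = truncate_to_hour(measured_at)
        picked.setdefault(key, (value, measured_at))
    return picked

# Two hours of fake readings, one every 10 seconds (720 rows in total).
start = datetime(2019, 1, 1)
rows = [(float(i), start + timedelta(seconds=10 * i)) for i in range(720)]

result = first_per_hour(rows)
print(len(result))  # 2
```

On a real table you would still let PostgreSQL do this work, since it avoids pulling every row into the application.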
Use the Django ORM to group rows by time
We can do the same with the Django ORM. Assuming we have this model for our table:
from django.db import models

class Temperature(models.Model):  # the model name is an assumption
    value = models.FloatField()
    measured_at = models.DateTimeField()
you can write the following query to fetch values grouped by the hour (or any other period you wish):
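from django.db.models.functions import Trunc
Temperature.objects.annotate(key=Trunc('measured_at', 'hour')).order_by('key', 'measured_at').distinct('key')  # assumed completion; distinct(*fields) is PostgreSQL only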
Originally posted in my blog.