...


Metric          Method 1     Method 2     Method 3
Planning time   0.269 ms     0.259 ms     0.145 ms
Execution time  3069.059 ms  6029.105 ms  12008.801 ms


Caching design

...

If we store the count and sum of the values in addition to average, it becomes easy to update the bin with a new datapoint.
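As a rough illustration of why count and sum make the update cheap, here is a minimal Python sketch. The Bin class and its attribute names are hypothetical (total stands in for the sum column); the point is only that adding a datapoint touches two numbers and the average is derived from them.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    """One cached bin (hypothetical helper, not part of Clowder)."""
    count: int = 0
    total: float = 0.0  # corresponds to the "sum" column

    @property
    def average(self) -> float:
        # Derived from count and sum; NaN for an empty bin.
        return self.total / self.count if self.count else float("nan")

    def add(self, value: float) -> None:
        # Constant-time update: no need to re-read the raw datapoints.
        self.count += 1
        self.total += value


# Start from a bin holding 10 readings that sum to 600, then add one reading.
month_bin = Bin(count=10, total=600.0)
month_bin.add(71.0)
print(month_bin.count, month_bin.total, month_bin.average)
```

If only the average were stored, the count would still be needed to weight the new value correctly, which is why the sketch keeps both.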

bins_year

sensor_id  year  field        count  sum   average
12345      2003  temperature  120    8400  60.0
12345      2003  pH           120    240   2.0

bins_month

sensor_id  month  year  field        count  sum  average
12345      6      2003  temperature  10     600  60.0
12345      6      2003  pH           10     20   2.0

bins_day

sensor_id  day  month  year  field        count  sum  average
12345      13   6      2003  temperature  1      60   60.0
12345      13   6      2003  pH           1      2    2.0

bins_hour

sensor_id  hour  day  month  year  field        count  sum  average
12345      19    13   6      2003  temperature  1      60   60.0
12345      19    13   6      2003  pH           1      2    2.0

Other possible tables:

bins_season - do we need to cache this, or calculate from monthly bins?

bins_total - do we need to cache this, or is it fast enough to calculate from yearly bins?

I don't think we want a cache table for water_year, for example, because that is specific to GLM/GLTG and not generic to Clowder. We could use the monthly bins to calculate it quickly on the fly.
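A sketch of how a water-year average might be computed on the fly from monthly bins. Everything here is hypothetical: the MonthBin tuple, the function names, and in particular the water-year definition (assumed to run from 1 October of the previous year through 30 September of the named year; adjust if GLM/GLTG uses a different convention). Note that the bins are combined by adding their sums and counts, not by averaging the monthly averages, which would weight sparse months too heavily.

```python
from typing import Iterable, NamedTuple, Optional


class MonthBin(NamedTuple):
    """One row of bins_month (hypothetical in-memory form)."""
    sensor_id: int
    month: int
    year: int
    field: str
    count: int
    total: float  # the "sum" column


def water_year(month: int, year: int) -> int:
    # ASSUMPTION: water year N covers Oct of N-1 through Sep of N.
    return year + 1 if month >= 10 else year


def water_year_average(bins: Iterable[MonthBin], sensor_id: int,
                       field: str, wy: int) -> Optional[float]:
    count = 0
    total = 0.0
    for b in bins:
        if (b.sensor_id == sensor_id and b.field == field
                and water_year(b.month, b.year) == wy):
            # Combine bins via their counts and sums.
            count += b.count
            total += b.total
    return total / count if count else None


bins = [
    MonthBin(12345, 11, 2002, "temperature", 5, 200.0),   # water year 2003
    MonthBin(12345, 6, 2003, "temperature", 10, 600.0),   # water year 2003
    MonthBin(12345, 10, 2003, "temperature", 4, 100.0),   # water year 2004
]
wy2003 = water_year_average(bins, 12345, "temperature", 2003)
wy2004 = water_year_average(bins, 12345, "temperature", 2004)
print(wy2003, wy2004)
```

The same combine-by-sum-and-count approach would apply to a season or an all-time total computed from monthly or yearly bins.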

When do we update cache tables?

  • cron job (hourly? every 5 minutes?)
  • whenever a new datapoint is added (at most 1 bin per table would need to be created or updated), using an upsert
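The upsert option could look roughly like the following. This is only a sketch: it uses an in-memory SQLite database (3.24 or newer) as a stand-in for the real database, and the schema, primary keys, and function names are assumptions based on the example tables above. PostgreSQL offers a similar INSERT ... ON CONFLICT ... DO UPDATE statement, but the details would need checking against the actual schema.

```python
import sqlite3
from datetime import datetime

# Hypothetical layout: each table is keyed by sensor, its time columns, and field.
BIN_TABLES = {
    "bins_year": ["year"],
    "bins_month": ["month", "year"],
    "bins_day": ["day", "month", "year"],
    "bins_hour": ["hour", "day", "month", "year"],
}


def create_tables(conn: sqlite3.Connection) -> None:
    for table, time_cols in BIN_TABLES.items():
        cols = ", ".join(f"{c} INTEGER NOT NULL" for c in time_cols)
        key = ", ".join(["sensor_id", *time_cols, "field"])
        conn.execute(
            f'CREATE TABLE {table} (sensor_id INTEGER NOT NULL, {cols}, '
            f'field TEXT NOT NULL, "count" INTEGER NOT NULL, '
            f'"sum" REAL NOT NULL, average REAL NOT NULL, '
            f'PRIMARY KEY ({key}))'
        )


def add_datapoint(conn: sqlite3.Connection, sensor_id: int, field: str,
                  ts: datetime, value: float) -> None:
    parts = {"year": ts.year, "month": ts.month, "day": ts.day, "hour": ts.hour}
    for table, time_cols in BIN_TABLES.items():
        key_cols = ["sensor_id", *time_cols, "field"]
        key_vals = [sensor_id, *(parts[c] for c in time_cols), field]
        cols = ", ".join(key_cols)
        marks = ", ".join("?" for _ in key_cols)
        # Insert a fresh bin, or fold the value into the existing one.
        # Right-hand sides of SET see the row's values before the update.
        sql = (
            f'INSERT INTO {table} ({cols}, "count", "sum", average) '
            f'VALUES ({marks}, 1, ?, ?) '
            f'ON CONFLICT ({cols}) DO UPDATE SET '
            f'"count" = "count" + 1, '
            f'"sum" = "sum" + excluded."sum", '
            f'average = ("sum" + excluded."sum") / ("count" + 1)'
        )
        conn.execute(sql, [*key_vals, value, value])


conn = sqlite3.connect(":memory:")
create_tables(conn)
add_datapoint(conn, 12345, "temperature", datetime(2003, 6, 13, 19, 5), 60.0)
add_datapoint(conn, 12345, "temperature", datetime(2003, 6, 13, 20, 15), 70.0)
day_row = conn.execute('SELECT "count", "sum", average FROM bins_day').fetchone()
hour_rows = conn.execute("SELECT COUNT(*) FROM bins_hour").fetchone()[0]
print(day_row, hour_rows)
```

Each datapoint results in exactly one statement per bin table, so the write cost stays constant regardless of how much history a sensor has. A cron-based refresh would instead need to track which bins have gone stale since the last run.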