TimescaleDB: Option to define and enable Continuous Aggregation policy

Started by fbu, November 27, 2020, 12:34:08 PM

Previous topic - Next topic

fbu

Hi NetXMS team,

recently, I was investigating the usage of TimescaleDB for the underlying NetXMS database. The system works fine, however when taking a look at the implementation, I noticed that it only uses the feature of Hypertables and does not use Continuous Aggregation options.

I tried to set up Continuous Aggregation on my local instance of NetXMS by adding VIEWs to the DB schema, using the TimescaleDB tutorials as a guide (https://docs.timescale.com/latest/using-timescaledb/continuous-aggregates).

The SQL code for my tests is added in the file continuous_aggregation_for_netxms_tsdb.sql.

For testing, I only used the idata_sc_default table and left all DCIs in the default storage class. The SQL code basically creates five "layers" of continuous aggregation, each with decreasing granularity compared to the previous one:


  • Older than 1 hour: Calc average over 3 minutes intervals (idata_continuous_3minutes_gt_1hour)
  • Older than 1 day: Clac average over 5 minutes intervals (idata_continuous_5minutes_gt_1day)
  • Older than 1 week: Clac average over 10 minutes intervals (idata_continuous_10minutes_gt_1week)
  • Older than 1 month: Clac average over 30 minutes intervals (idata_continuous_30minutes_gt_1month)
  • Older than 3 months: Clac average over 1 hour intervals (idata_continuous_1hour_gt_3months)

All these VIEWs are then combined in one single VIEW idata_continuous, including the original data for the most recent 60 minutes.

The VIEW itself works as expected, however, making NetXMS actually read the data through this VIEW it not possible without changing NetXMS sourcecode, as by default when using TSDB syntax, NetXMS reads the data directly from the storage class (idata_sc_XYZ) tables and not from the combined idata VIEW.

-----------------------------------------------

Now for my actual feature request (sorry btw. for the verbose introduction  ;D)

Would it be possible to introduce this as an optional feature? As a first implementation, I thought about a globally definable policy for multiple levels of Continuous Aggregation (possibly even define individual aggregation functions?), which could be configured via the NXMC GUI. As this feature does only use VIEWs, which are mostly read-only, they do not alter any materialized data in the DB and thus could even be configured on an existing database without breaking the system.

The most obvious advantage of this feature would be a huge performance boost for read queries that request large amounts of data over long period of time (e.g. 1 year of 60s data points) where it is of greater interest to observe a certain trend of data instead of the actual detailed datapoints. This would especially be useful in combination with Grafana, where it is quite common to increase the size of the displayed timeframe to investigate certain data-trends instead of being interested in exact values of very old data.

One minor downside of this would be increased storage consumption, as those VIEWS are updated periodically by TimescaleDB and thus store additional data next to the original values.


As for the actual implementation, there are two problems which I would consider hard to overcome - How to handle string/text based data values and how to handle table DCIs?

Anyway, this is just meant as an idea and to provide some suggestions for potential ways to implement it.

Hoping for your feedback,
best regards
fbu