#1
Hi NetXMS team,

recently, I was investigating the use of TimescaleDB as the underlying NetXMS database. The system works fine; however, when looking at the implementation, I noticed that it only uses the hypertables feature and does not make use of continuous aggregates.

I tried to set up Continuous Aggregation on my local instance of NetXMS by adding VIEWs to the DB schema, using the TimescaleDB tutorials as a guide (https://docs.timescale.com/latest/using-timescaledb/continuous-aggregates).

The SQL code for my tests is added in the file continuous_aggregation_for_netxms_tsdb.sql.

For testing, I only used the idata_sc_default table and left all DCIs in the default storage class. The SQL code basically creates five "layers" of continuous aggregation, each with decreasing granularity compared to the previous one:


  • Older than 1 hour: calculate the average over 3-minute intervals (idata_continuous_3minutes_gt_1hour)
  • Older than 1 day: calculate the average over 5-minute intervals (idata_continuous_5minutes_gt_1day)
  • Older than 1 week: calculate the average over 10-minute intervals (idata_continuous_10minutes_gt_1week)
  • Older than 1 month: calculate the average over 30-minute intervals (idata_continuous_30minutes_gt_1month)
  • Older than 3 months: calculate the average over 1-hour intervals (idata_continuous_1hour_gt_3months)

All these VIEWs are then combined into a single VIEW, idata_continuous, which also includes the original data for the most recent 60 minutes.
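As a rough conceptual illustration of what one such aggregation layer does (plain Java, not the actual TimescaleDB implementation; the sample timestamps and values are made up), averaging raw samples into fixed-size buckets looks like this:

```java
import java.util.*;

public class BucketAverage {
    // Conceptual sketch: average raw (timestamp, value) samples into
    // fixed-size buckets, similar in spirit to time_bucket() + avg().
    static SortedMap<Long, Double> bucketAvg(SortedMap<Long, Double> samples, long bucketSeconds) {
        SortedMap<Long, List<Double>> buckets = new TreeMap<>();
        for (Map.Entry<Long, Double> e : samples.entrySet()) {
            long bucketStart = e.getKey() - (e.getKey() % bucketSeconds);
            buckets.computeIfAbsent(bucketStart, k -> new ArrayList<>()).add(e.getValue());
        }
        SortedMap<Long, Double> result = new TreeMap<>();
        buckets.forEach((start, vals) ->
            result.put(start, vals.stream().mapToDouble(Double::doubleValue).average().orElse(0)));
        return result;
    }

    public static void main(String[] args) {
        SortedMap<Long, Double> raw = new TreeMap<>();
        raw.put(0L, 10.0);   // t = 0s
        raw.put(60L, 20.0);  // t = 60s
        raw.put(120L, 30.0); // t = 120s
        raw.put(180L, 40.0); // falls into the next 3-minute bucket
        System.out.println(bucketAvg(raw, 180)); // {0=20.0, 180=40.0}
    }
}
```

Each coarser layer in the list above is just this operation with a larger bucket size, applied only to data older than the layer's threshold.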

The VIEW itself works as expected. However, making NetXMS actually read the data through this VIEW is not possible without changing the NetXMS source code: by default, when using TSDB syntax, NetXMS reads the data directly from the storage class tables (idata_sc_XYZ) and not from the combined idata VIEW.

-----------------------------------------------

Now for my actual feature request (sorry btw. for the verbose introduction  ;D)

Would it be possible to introduce this as an optional feature? As a first implementation, I thought about a globally definable policy for multiple levels of Continuous Aggregation (possibly even with individually definable aggregation functions?), which could be configured via the NXMC GUI. As this feature only uses VIEWs on top of the raw data, it does not alter the original data in the DB and thus could even be configured on an existing database without breaking the system.

The most obvious advantage of this feature would be a huge performance boost for read queries that request large amounts of data over long periods of time (e.g. 1 year of 60s data points), where it is of greater interest to observe a trend in the data than the detailed individual data points. This would be especially useful in combination with Grafana, where it is quite common to increase the size of the displayed timeframe to investigate data trends rather than the exact values of very old data.

One minor downside would be increased storage consumption, as these VIEWs are updated periodically by TimescaleDB and thus store additional data alongside the original values.


As for the actual implementation, there are two problems that I would consider hard to overcome: how to handle string/text-based data values, and how to handle table DCIs?

Anyway, this is just meant as an idea and to provide some suggestions for potential ways to implement it.

Hoping for your feedback,
best regards
fbu
#2
While this approach is certainly an alternative to mine, it does not help with the core problem, which is the DCI Data Recalculation:

Unfortunately, you cannot pass a timestamp parameter to the PushDCIData function, which could define where to place the new value in the DCI table. Therefore, it always puts the new value at the actual current timestamp.

Looking at the source code, I see the following locations where this problem could be addressed:

In line 114 of src/server/core/dci_recalc.cpp, the recalculateValue(value) function is called, which in turn calls transform(value, elapsedTime) in line 2151 of src/server/core/dcitem.cpp.
In this function, in line 999 of the same file (dcitem.cpp), a ScriptVMHandle object is created using CreateServerScriptVM(...). This function could take an optional timestamp parameter, telling the script VM to simulate an execution at the point in time defined by that parameter.

The value for this timestamp parameter could be extracted from the ItemValue &value (which is the current DCI value to be recalculated) of the calling function, using value.getTimeStamp().

This approach, however, would require refactoring the implementation of NXSL functions like time(), GetDCIValue(...) or GetDCIRawValue(...), as their behavior would then need to be adapted accordingly.

(I'm sorry, if I have some weird ideas, which might sound reasonable to me but not for others  ;D)
#3
Dear NetXMS team,

I seem to have found a scenario related to DCI Data Recalculation, during which the server (Version 3.2-400) does not behave as expected. The scenario is as follows:

For some of my DCIs I use secondary DCIs, which feature a longer retention period and a bigger polling interval, to avoid storing data with fine granularity over a long period of time.

These secondary DCIs use different aggregation scripts, based on what kind of data the primary DCI returns. The most common script is a simple average calculation, like this:

return GetAvgDCIValue($node, FindDCIByDescription($node, "gateway.RTT"), time()-3600, time());

You can see that (with a polling interval of 1 hour) the secondary DCI simply returns the average value of the primary DCI for the last hour.

This method works perfectly fine, as long as the server is running.

And this is where the problem comes in:

If, for some reason, the server was down and the secondary DCIs could not poll any data (the primary DCIs still gather data, as it comes from a NetXMS Agent, which stores the data in its local DB during server downtime), I want to use the DCI Data Recalculation feature to recalculate the secondary DCI data based on the data that the primary DCI could still gather during the downtime via the Agent.

However, it seems like the NXSL function time() always returns the actual current timestamp, instead of the timestamp of the value to recalculate.
Just to make it a little clearer: Imagine the current timestamp is X and the server should recalculate a DCI value at the timestamp Y (with Y < X). The transformation script shown above is supposed to calculate the average of the primary DCI for the period of one hour before timestamp Y and not X.

This behavior leads to the situation that for every recalculated value of the secondary DCI, the time() function returns timestamp X, meaning the recalculated average value will always be the one for the latest one-hour period (relative to the actual present), which is not correct.


One way I could see to resolve this problem is to have an option for DCI Data Recalculation that makes the server (temporarily) use the timestamp of the value currently being recalculated as the current timestamp, which would then be returned by the time() function, just for the duration of the recalculation.
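To make this concrete, here is a minimal plain-Java sketch (not NetXMS code; the sample data and the one-hour window are illustrative assumptions) of averaging over the hour before the value's own timestamp Y instead of the current time X:

```java
import java.util.*;

public class RecalcWindow {
    // Sketch: average of samples in (end - windowSeconds, end], where "end"
    // is the timestamp of the value being recalculated, NOT the current time.
    static double avgBefore(SortedMap<Long, Double> samples, long end, long windowSeconds) {
        return samples.subMap(end - windowSeconds + 1, end + 1) // keys in (end-window, end]
                      .values().stream()
                      .mapToDouble(Double::doubleValue)
                      .average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        SortedMap<Long, Double> samples = new TreeMap<>();
        samples.put(1000L, 1.0);
        samples.put(4000L, 3.0);
        samples.put(4600L, 5.0);
        // Recalculating the value at Y = 4600: only samples within the
        // hour before Y count, regardless of what "now" is.
        System.out.println(avgBefore(samples, 4600L, 3600L)); // 4.0
    }
}
```

In the desired behavior, time() during recalculation would play the role of `end` here, i.e. return Y instead of X.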

Additionally, I could also imagine an option that allows "filling gaps" in the data (e.g. caused by server downtime), maybe by checking the polling interval and generating the missing values. Even if no RAW values are present, the transformation script could still be executed; for a script like the one above, this would still add values to the final data.


Hopefully, these thoughts will be helpful to you.

Thank you and kind regards,
fbu
#4
Now, for my final update:

After some hours of digging, I found out that NGINX was the reason for the entire problem: It was not configured to pass the POST data to the proxied API server.

Thus, there is no problem with the original API code. I guess that the scenario, as stated in my second update, may be related to the entity's content being empty once it has already been read (somewhere I read about it being a stream rather than a storage variable, but I'm not 100% sure about that).
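For what it's worth, the stream idea can be demonstrated with plain JDK code (no Restlet involved): an InputStream yields its content only on the first read, and a second read finds it exhausted.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class StreamReadOnce {
    public static void main(String[] args) throws IOException {
        InputStream body = new ByteArrayInputStream(
                "{\"login\":\"user\"}".getBytes(StandardCharsets.UTF_8));

        // First read consumes the stream and returns the full content
        String first = new String(body.readAllBytes(), StandardCharsets.UTF_8);
        // Second read finds the stream already exhausted and returns nothing
        String second = new String(body.readAllBytes(), StandardCharsets.UTF_8);

        System.out.println(first);            // {"login":"user"}
        System.out.println(second.isEmpty()); // true
    }
}
```

If the entity works the same way, anything that consumed it before getJsonObject() was called would explain the NULL result.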

Anyway, I apologize for causing this false alarm and thank you for your support.

Keep up the good work on this tool

Best regards,
fbu
#5
Update #2:

Based on my previous update, I tested a workaround by simply instantiating the JSONObject based on the text value returned by the entity:

JSONObject data = new JSONObject(entity.getText());

This works perfectly fine for me, when bypassing the NGINX.

I still don't know what the actual problem is, but it works like this for me - I only need to fix the problem related to NGINX.

Thanks for your help, Victor. Hopefully the root cause of this problem can be determined.

Kind regards,
fbu
#6
Quick update:

Previously, I was using an NGINX in front of Tomcat9 for hostname and SSL/TLS handling.

When bypassing that NGINX and going directly to Tomcat, entity.getText() actually returns the provided data as a string. However, new JsonRepresentation(entity).getJsonObject() and [...].getText() still return NULL and cause the same NullPointerException as before. (see attachment for debug messages)

I am pretty much out of ideas right now. I'll still try to at least get a similar result with NGINX in place, as I will need it in the future, but this might just be a configuration error.

Hope this helps.

Kind regards,
fbu
#7
Hi Victor,

I managed to build the changed source code for the API and deploy it on Tomcat. You can find the changes in the attachment "sourcecode_change.png" (lines 103 - 108).

The debug messages are shown in the screenshot "debug_messages.png". It shows that entity.getText() returns NULL, which causes the StringReader called by getJsonObject() (now in line 107) to throw a NullPointerException (as seen in "nullpointer_cause.png").

However, I still don't know why getText() returns NULL in the first place ... I will keep searching for the cause, but I'm not feeling too optimistic about my chances.

Hope I could help so far.

Best regards,
fbu
#8
Okay, I will try my best. Is there a way to only build the war file for the API, or do I need to rebuild the whole project?

Kind regards,
fbu
#9
Hi Victor,

I added a screenshot of the stack trace for the NullPointerException. (Note that the date does not match, because I simply issued the request again. But it is the same as for the previous requests.)

There are other debug outputs as well, but they are all related to other requests; this error message and stack trace are the first log lines that appear when performing the request.

Best regards,
fbu
#10
Dear NetXMS,

I am currently trying to receive data from our NetXMS Server through the REST API.

Logging in via the Basic Authentication header works fine, and GET requests to the /objects.*, /alarms.*, /info or /predefinedgraphs endpoints return proper data on valid requests and error messages on invalid ones.

However, POST requests to the /sessions and /summaryTable/adHoc endpoints result in {"error":46} responses when a JSON object is added to the request.

Omitting the JSON object results in an "Invalid argument" message.

Digging through the source code of the latest version (3.0.2329) and cross-checking with the Tomcat9 logs, I found that when no JSON object is added to the request, the log message in line 100 of AbstractHandler.java ("No POST data in call") is printed and the method returns the expected error object.

But: when a JSON object is added to the request, the debug message in line 105 ("POST: data = [...]") is not printed, although it should be and the log level is set to "debug" (other debug messages of the API are visible).

The Tomcat logs then state that the internal error is caused by a NullPointerException, so my assumption is that either new JsonRepresentation(entity) or .getJsonObject() in line 104 of AbstractHandler.java returns NULL, causing this exception.

Is there anything special that I did not take into account or did I stumble upon a bug? (Stacktrace is attached as screenshot "tomcat9_stacktrace.png")

Just as additional information: I am using Postman for testing (note the missing "description" field in the response, as seen in the attached screenshot "postman_error_46.png"), but I also tried cURL, which gives the same error.
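For reference, the shape of the failing request can be sketched with the JDK's own HTTP client (the host and the JSON body fields here are assumptions for illustration, not the documented API contract):

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class SessionRequest {
    public static void main(String[] args) {
        // Hypothetical host and body fields - adjust to your deployment
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://netxms.example.com/api/sessions"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        "{\"login\":\"user\",\"password\":\"secret\"}"))
                .build();

        System.out.println(request.method()); // POST
        System.out.println(request.uri());
    }
}
```

Sending exactly this kind of request (via Postman or cURL) is what produces the {"error":46} response described above.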

I would be glad, if you could help me with this  :)

Kind regards,
fbu
#11
Feature Requests / Dashboard Templates
August 07, 2019, 05:40:10 PM
Hi NetXMS team,

I found a few instances of Dashboard Templates being mentioned on this forum and that they might be considered for development.

I think this would be a great addition to the tool, especially in a network with numerous similar nodes that should all have individual dashboards.

Currently, we need to create the dashboards manually, which is very time-consuming; dashboard templates would really help with this.


Is there any update on this topic?


Kind regards,
fbu
#12
To add another point:
This could also be implemented for the geographical map (World Map view). There, if you want to access the object details of a device you selected on the map, you need to navigate to "Service dependency", open its dropdown menu there and hit "Show Object details".

I think it would be nice to have the dropdown menu entry "Show Object Details" available by default. Maybe not only in dashboards and geo-maps, but everywhere you can open a dropdown menu for a specific object.

Kind regards,
fbu
#13
Dear NetXMS team,

as requested by Victor in my previous post under 'General Support' (https://www.netxms.org/forum/general-support/multiple-values-assigned-to-same-timestamp-with-nxapush/), I am now posting this topic as a feature request.

To sum up my point: I am trying to use nxapush to update DCI values at given timestamps, which results in multiple values being assigned to the same timestamp, as seen in the two screenshots attached to this post.

My idea would be to have an option (maybe -u) indicating that any value currently assigned to the given timestamp should be deleted and replaced by the new value, allowing a more flexible use of both nxpush and nxapush.

Best regards,
fbu
#14
Okay, I will do so.

Thank you very much.

Kind regards,
fbu
#15
Hi,

currently, if you want a dashboard to display how the values of a certain DCI developed during a certain period of time, you use a line chart.

I think it would be cool if it were possible to do something similar with bar and tube charts as well.

Maybe even give the possibility to define intervals into which the period should be split, and how the data in these intervals should be aggregated.

E.g.: Something like - "For the period of the last 8 months, display the sum of all data of each individual month", resulting in a chart similar to this:
https://preply.com/wp-content/uploads/2018/08/representation-bar-graph.png.

Kind regards,
fbu