That's complicated issue. Current implementation is a bit messy with $n variables after matching. Consider the following code:
v = "error 43";
if (v ~= "(error|warning) ([0-9]+)")
println("Step 1: $1=".$1." $2=".$2);
if (v ~= "(error .*|info ([0-9]+))")
println("Step 2: $1=".$1." $2=".$2);
In NetXMS 2.2 $2 will be equal "43" on step 2, which is kind of misleading. It was because implementation in 2.2 silently skipping unmatched capture groups. Implementation in 3.0 resets all unused $n variables, which has side effect of clearing $1 if there were no capture groups at all. Looks like it was bad decision to use same names for capture groups and script arguments.
I want to gradually clean up inconsistencies in NXSL, and that could lead to break up of some things. I think that for now I can introduce server configuration variable (off by default) that will prevent resetting of unused capture group variables. Also I will introduce separated naming for script arguments and capture groups so scripts could be migrated over time to non-overlapping variable naming. Also for transformation scripts I think it is worth passing raw value not only as first argument to the script but also as global variable, like $rawValue.
Best regards,
Victor
The problem here is that there are likely thousands of scripts all around the Forums, Wiki and the internet that this change breaks.
Every single one would have to be updated - which is unrealistic.
For new users who find some regex-based scripts on the internet, they will not work, and they don't know why.
Existing implementations (scripts) will start working differently upon upgrade, without any knowledge of the system operators
This will be VERY difficult to track down, as this will not produce script errors, rather things will just work differently.
This means there can be a threshold script which will "pretend" to still work, but in reality will not work.
It puts burden on system operators to review every script they have upon upgrade to check if it still works properly
What I personally would propose:
There are far less scripts using regex capture groups than just regex.
So if a change is needed, only regex capture group should be changed, without affecting regex operations without capture groups (as is the case now).
What about capturing regex returning arrays with captures?
matchArrray = "test" ~= "t(e)(s)t";
println(matchArrray [0] . matchArrray [1]);
for (match : matchArrray) {
print(match);
}
This would clean up the reuse of $1, $2, etc. without breaking ALL regex handling.
The change would only affect a smaller part of community, so less impact.