Windows ping subagent time resolution

Started by kmradke, May 24, 2010, 08:08:43 PM


kmradke

I set up a Windows XP SP3 client with the ping.nsm subagent and added a *PING section targeting two IP addresses.  The timing resolution appears to be too coarse: one IP address reports average and last values of 0, but other tools show it should be around 3-5ms.  It seems to be rounding anything less than 20ms down to zero.

The other IP address which is much farther away seems to be reporting valid values in the 260ms range.

Is this a known limitation of the ping subagent on windows?

Victor Kirhenshtein

Just checked this: the ping subagent uses GetSystemTimeAsFileTime to measure time intervals, which has a resolution of 15-20ms. I'll try to change it to a high-resolution timer and post a test subagent here.
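
To illustrate why a 3-5ms round trip can read as 0, here is a minimal sketch of what a coarse clock does to a short interval. The helper names are hypothetical and the 15.625ms tick is an assumption picked from the 15-20ms range above; this is not the subagent's code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: round a timestamp down to the last clock tick,
// the way a coarse clock only advances once per tick.
// All values are in microseconds to avoid floating point.
std::int64_t QuantizeToTick(std::int64_t timeUs, std::int64_t tickUs)
{
    assert(tickUs > 0);
    return (timeUs / tickUs) * tickUs;
}

// Interval in whole milliseconds as seen through the coarse clock.
std::int64_t MeasuredIntervalMs(std::int64_t startUs, std::int64_t endUs,
                                std::int64_t tickUs)
{
    return (QuantizeToTick(endUs, tickUs) - QuantizeToTick(startUs, tickUs)) / 1000;
}
```

With a 15625us tick, a ping sent at 1000us and answered at 5000us falls inside a single tick and measures as 0, while the same 4ms ping sent at 14000us straddles a tick boundary and measures as 15. A 260ms round trip is long enough that the error is small relative to the value.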

Best regards,
Victor

kmradke

Might be easiest just to use the multimedia timers and crank up the resolution to 1ms.
Could possibly call timeBeginPeriod in some startup routine.  Not sure of the overhead or if it hurts to repeatedly call it like this.
Might also need to include mmsystem.h if not already present through windows.h

I don't have a functional netxms build system or else I would just try this myself.

Something like this in tools.cpp:


INT64 LIBNETXMS_EXPORTABLE GetCurrentTimeMs(void)
{
#ifdef _WIN32
   INT64 t;

   // Request 1ms timer resolution for the duration of the call
   timeBeginPeriod(1);
   t = (INT64)timeGetTime();
   timeEndPeriod(1);
#else
   struct timeval tv;
   INT64 t;

   gettimeofday(&tv, NULL);
   // tv_usec is in microseconds, so divide by 1000 to get milliseconds
   t = (INT64)tv.tv_sec * 1000 + (INT64)(tv.tv_usec / 1000);
#endif

   return t;
}

Victor Kirhenshtein

We cannot change GetCurrentTimeMs that way, because it is supposed to return the current time since the epoch in milliseconds, while timeGetTime returns the time since system start. We need to replace the call to GetCurrentTimeMs with timeGetTime or QueryPerformanceCounter inside the IcmpPing function. I'll create a patched libnetxms.dll this evening and post it for testing.
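
The split between "time since epoch" and "time for measuring intervals" can be sketched roughly as follows. This uses std::chrono as a portable stand-in rather than the Windows calls named above, and both function names are hypothetical; it is not the actual IcmpPing change.

```cpp
#include <cassert>
#include <chrono>

using SteadyClock = std::chrono::steady_clock;

// Hypothetical helper (not NetXMS code): interval between two monotonic
// readings, in fractional milliseconds.
double IntervalMs(SteadyClock::time_point start, SteadyClock::time_point end)
{
    assert(end >= start);  // a monotonic clock never goes backwards
    return std::chrono::duration<double, std::milli>(end - start).count();
}

// Wall-clock time since the system clock's epoch in whole milliseconds.
// Suitable for timestamps, but it can jump when the clock is adjusted,
// so it is a poor choice for measuring short intervals.
long long EpochMs()
{
    using namespace std::chrono;
    return duration_cast<milliseconds>(system_clock::now().time_since_epoch()).count();
}
```

A caller would take SteadyClock::now() before sending the echo request and again after the reply, then pass both readings to IntervalMs, leaving EpochMs for anything that needs a real timestamp.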

Best regards,
Victor

Victor Kirhenshtein

It took a bit more time :) The updated libnetxms.dll can be downloaded from here: https://www.netxms.org/download/patches/006/libnetxms.dll
It now uses QueryPerformanceCounter for time interval measurement. You can replace the existing libnetxms.dll with the new one and test the agent.

Best regards,
Victor

kmradke

Haven't had time to test it yet, but QueryPerformanceCounter can have issues on machines with SpeedStep enabled (since the frequency varies dynamically).  I believe some AMD multi-core systems have problems too.  I suspect it will work fine for most people.  I hadn't noticed that the original function returns ms since a set epoch.  That could easily be simulated with an initial call that saves the offset from the system start time.  However, my example would still roll over after 49 days anyway, which could cause problems as well.  Windows just sucks at telling high resolution time...
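
For reference, the 49-day rollover comes from a 32-bit millisecond counter (2^32 ms is about 49.7 days). A rough sketch of a wrap-safe interval and the offset trick mentioned above, with hypothetical names and plain integers standing in for real timeGetTime readings:

```cpp
#include <cassert>
#include <cstdint>

// Elapsed milliseconds between two readings of a 32-bit millisecond tick
// counter. Unsigned subtraction is modulo 2^32, so the result stays
// correct across a single rollover of the counter.
std::uint32_t ElapsedTicksMs(std::uint32_t start, std::uint32_t end)
{
    return static_cast<std::uint32_t>(end - start);
}

// Hypothetical epoch simulation: remember the epoch time and the tick
// count at startup, then add the elapsed ticks. Only valid while less
// than 2^32 ms (about 49.7 days) have passed since the base reading.
std::int64_t EpochMsFromTicks(std::int64_t epochAtBaseMs,
                              std::uint32_t baseTicks,
                              std::uint32_t nowTicks)
{
    assert(epochAtBaseMs >= 0);
    return epochAtBaseMs + ElapsedTicksMs(baseTicks, nowTicks);
}
```

A single ping interval survives the rollover this way, but a simulated epoch time still needs its base refreshed periodically, otherwise it jumps back by 2^32 ms once the counter has wrapped all the way past the base reading.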

Victor Kirhenshtein

Microsoft promises that the frequency should not change between reboots :) (http://msdn.microsoft.com/en-us/library/ms644905%28VS.85%29.aspx). I chose the performance counter because timeGetTime has a much higher call cost, but I'm not sure that this really matters in our case. I could also build libnetxms.dll with the timeGetTime variant to compare. Interestingly, on my Windows 7 machine even GetSystemTimeAsFileTime, which was used before, gives 1ms accuracy.
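
As a sketch of how counter readings turn into a time interval: on Windows the tick delta would come from two QueryPerformanceCounter calls and the frequency from QueryPerformanceFrequency, but the helper below is hypothetical and takes plain integers, and it assumes the frequency really is constant between the two readings.

```cpp
#include <cassert>
#include <cstdint>

// Convert a performance counter tick delta to microseconds, given the
// counter frequency in ticks per second. Whole seconds and the remainder
// are converted separately so that ticks * 1000000 cannot overflow for
// large deltas.
std::int64_t TicksToMicroseconds(std::int64_t ticks, std::int64_t frequency)
{
    assert(frequency > 0);
    std::int64_t seconds = ticks / frequency;
    std::int64_t remainder = ticks % frequency;
    return seconds * 1000000 + remainder * 1000000 / frequency;
}
```

For example, with an assumed frequency of 10,000,000 ticks per second, a delta of 35,000 ticks is 3500 microseconds, i.e. a 3.5ms ping that the old 15-20ms clock could not resolve.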

Best regards,
Victor

P.S. The new libnetxms.dll is included in the 1.0.3 build, so if you upgrade the agent, you will not need to download libnetxms.dll from here.

kmradke

Microsoft can promise all they want, but reality is sometimes different. :)

I couldn't find the msdn blog I was remembering, but I did find this kb article.  It appears they may have fixed a few things since this is considered retired now.
http://support.microsoft.com/default.aspx?scid=KB;EN-US;Q274323&

I don't think they use the rdtsc counter in multi-core systems anymore, but they do still recommend locking the timing thread to one processor, since not all HAL implementations will give consistent readings between cores.  The closest thing my quick search returned is: http://msdn.microsoft.com/en-us/library/ee417693%28v=VS.85%29.aspx

BTW, I DO appreciate your timing changes and keep up the good work!!!  I just wanted to warn you that some odd hardware/os combos will probably fail to measure time correctly...