Just for your information...
Was upgrading from 1.2.17 to 2.0.3, but the server segfaulted shortly after startup. Running under gdb showed that the crashes occurred in different parts of the code from time to time, which was a bit confusing, but all of them seemed to be related to either loading the DCI cache or applying templates.
In some template auto-apply scripts we use DCI values to determine whether a template should be applied or not. As we know, it takes some time before all DCI values are populated. In the old server this could cause nodes to be thrown out of templates at server start and then included again on the next template-apply round. The problem we had with 2.0.3 was likely that the server tried to reload the cache for DCIs that had just vanished because auto-apply removed them from their nodes, and this caused all kinds of null-pointer crashes. Yes, this approach to applying templates may not be the recommended way, but in our case it is a neat way to distinguish between similar nodes of different versions that require different DCIs.
Our quick and dirty solution was to create the apply-template thread at a later point and delay it a bit before it starts working. The server then started as expected.
Hope this does not have other unwanted side effects.
~/netxms-2.0.3/src/server/core$ diff -u objects.cpp_orig objects.cpp
--- objects.cpp_orig 2016-04-10 12:46:07.272845568 +0200
+++ objects.cpp 2016-04-12 08:41:45.788328287 +0200
@@ -69,7 +69,12 @@
*/
static THREAD_RESULT THREAD_CALL ApplyTemplateThread(void *pArg)
{
- DbgPrintf(1, _T("Apply template thread started"));
+ DbgPrintf(1, _T("Apply template thread started"));
+
+ DbgPrintf(2, _T("Delaying start of ApplyTemplateThread for 5min..."));
+ sleep(300);
+ DbgPrintf(2, _T("ApplyTemplateThread continuing."));
+
while(1)
{
TEMPLATE_UPDATE_INFO *pInfo = (TEMPLATE_UPDATE_INFO *)g_pTemplateUpdateQueue->getOrBlock();
@@ -241,8 +246,8 @@
// Initialize service checks
SlmCheck::init();
- // Start template update applying thread
- ThreadCreate(ApplyTemplateThread, 0, NULL);
+ // Start template update applying thread, moved to end of LoadObjects
+ //ThreadCreate(ApplyTemplateThread, 0, NULL);
}
/**
@@ -1773,6 +1778,10 @@
// Start map update thread
ThreadCreate(MapUpdateThread, 0, NULL);
+ // Start template update applying thread. Moved from ObjectsInit
+ ThreadCreate(ApplyTemplateThread, 0, NULL);
+
+
return TRUE;
}