system watchdog
Dear Developers,
For a while now I have been struggling with the system watchdog: it keeps occurring randomly during runtime. I am logging all the places in my algorithm, but I cannot find the one spot where the watchdog is triggered; it seems to be triggered at a different spot every time. Is there any documentation available on the system watchdog? It would really help me find the problem. Most important of all, what triggers the watchdog? Is it a Linux watchdog, or is it triggered by the firmware? Anything would help.
Kind regards, Nick
Comments
Hi Nick, you are correct that there is not much documentation available on the System Watchdog at the moment. This is on the to-do list for our documentation team, so something will appear in the Info Center in the future.
My understanding is that (possibly among other things) the Watchdog component configures the Linux watchdog daemon, so we can look at the documentation for the Linux watchdog to see how it behaves:
https://linux.die.net/man/8/watchdog
The PLCnext Runtime starts the watchdog daemon using the configuration settings in the following file:
/opt/plcnext/data/System/Watchdog/WatchdogDaemon.config
You can see that there are three processes monitored by the watchdog daemon:
- the main PLCnext Runtime process.
- the Local IO process.
- the External IO process.
If the watchdog daemon thinks that any of these processes are not running, then the hardware watchdog will reboot the system.
However that information doesn't really help to identify the cause of the problem that you are seeing.
We have found from past experience that system watchdog events are often caused when code that was written in C++ is executed in a real-time ESM task, for example if a C++ program instance is used in PLCnext Engineer. The ESM is very time-sensitive, and in C++ it is easy to do something that will cause problems for the ESM and cause a system watchdog. For example:
- The Execute method in a C++ program should never dynamically allocate memory, e.g. using new or malloc.
- The Execute method in a C++ program should never call methods from open-source libraries, because it is 100% guaranteed that those libraries are not designed to run in a deterministic real-time environment like the ESM in PLCnext Control devices.
If you are using C++ code in your project, and if you would like someone to review that code (or the overall project), then please let us know and someone will contact you by email.
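As an illustration of the first rule, here is a minimal sketch of a program that allocates its working buffer once at construction and only reuses it in the cyclic method. The class MovingAverageProgram and its members input and average are hypothetical stand-ins; the real PLCnext base class, port declarations and registration code are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for a PLCnext C++ program class. The real base class
// and port registration are omitted; only the allocation pattern is shown.
class MovingAverageProgram
{
public:
    double input = 0.0;    // stand-in for an IN port
    double average = 0.0;  // stand-in for an OUT port

    explicit MovingAverageProgram(std::size_t windowSize)
        : size_(windowSize == 0 ? 1 : windowSize),
          samples_(new double[size_]())  // the only allocation, done once, zero-initialized
    {
        assert(samples_ != nullptr);
    }

    // Called cyclically by the real-time task: no new, no malloc, no resizing.
    void Execute()
    {
        sum_ -= samples_[next_];          // drop the oldest sample from the sum
        samples_[next_] = input;          // overwrite it with the newest sample
        sum_ += input;
        next_ = (next_ + 1) % size_;      // advance the ring buffer index
        if (count_ < size_) { ++count_; } // number of valid samples so far
        average = sum_ / static_cast<double>(count_);
    }

private:
    std::size_t size_;                    // declared before samples_, so it is initialized first
    std::unique_ptr<double[]> samples_;
    std::size_t next_ = 0;
    std::size_t count_ = 0;
    double sum_ = 0.0;
};
```

The buffer size is fixed for the lifetime of the program instance, which is the same restriction that an array in Structured Text would have.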
Hi Martin,
Thanks for the information so far. Allocating using 'new' could be the problem I am facing.
For my understanding:
There seems to be no way to allocate memory using 'new' or 'malloc' during runtime, since all of the programming is done in the Execute method. Is this correct?
If I would like to dynamically allocate memory using 'new' or 'malloc', should I do it from the program class constructor? Is it allowed to use vectors to create customized 'arrays' and change e.g. the length during runtime (from the Execute method)?
Kind regards, Nick
If I would like to dynamically allocate memory using 'new' or 'malloc', should I do it from the program class constructor?
Yes, that's right.
Is it allowed to use vectors to create customized 'arrays' and change e.g. the length during runtime (from the Execute method)?
It's best to avoid doing anything that would resize an array or vector - or a String - in the Execute method, because that would potentially (re-)allocate memory.
For std::string variables I would use the reserve method in the constructor to reserve the required memory, and then check whenever the string variable is assigned in the Execute method, to make sure the new string is no larger than the reserved size.
For std::vector variables I would use the resize method in the constructor to reserve the required memory, and then check whenever data is assigned to the vector in the Execute method, to make sure the capacity is not exceeded. The clear method can be used in the Execute method to erase all the vector entries without changing the allocated memory.
Strict rules like these are almost certainly broken in any open-source library you care to use, which is why it's not a good idea to call methods in open-source libraries from the Execute method. There are other ways to use open-source libraries from C++ Programs - e.g. moving the calls from the Program to the (non-real time) Component.
These types of rules also help to explain why IEC 61131 languages (e.g. Structured Text) don't include all the features that C++ programmers take for granted, like being able to dynamically create new objects, or dynamically resize arrays. If you are trying to do something in a C++ Execute method that is not possible in Structured Text, then that is probably a good indicator that it should not be done in a real-time ESM task.
I hope this helps. Please let us know if you have any other questions.
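The reserve-then-check approach for strings and vectors described above could look roughly like the following sketch. The class MessageProgram and all of its method names are hypothetical and not part of any PLCnext API. Note that this sketch uses reserve (not resize) for the vector so that it starts empty, and it relies on the usual behaviour of common standard library implementations that assigning within the existing capacity does not reallocate.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical program class: memory is reserved once in the constructor,
// and every later write is checked against that reserved capacity.
class MessageProgram
{
public:
    MessageProgram(std::size_t maxTextLength, std::size_t maxValues)
    {
        text_.reserve(maxTextLength);  // allocate string storage up front
        values_.reserve(maxValues);    // allocate vector storage up front
    }

    // Intended to be called from the cyclic method. Takes pointer + length so
    // that no temporary std::string has to be constructed. Rejects the new
    // text instead of growing the string.
    bool SetText(const char* newText, std::size_t length)
    {
        if (length > text_.capacity()) { return false; }
        text_.assign(newText, length);
        return true;
    }

    // Rejects the new value instead of growing the vector.
    bool AddValue(int value)
    {
        if (values_.size() >= values_.capacity()) { return false; }
        values_.push_back(value);
        return true;
    }

    // Erases all entries; the reserved capacity is kept.
    void ClearValues() { values_.clear(); }

    const std::string& Text() const { return text_; }
    std::size_t TextCapacity() const { return text_.capacity(); }
    std::size_t ValueCount() const { return values_.size(); }
    std::size_t ValueCapacity() const { return values_.capacity(); }

private:
    std::string text_;
    std::vector<int> values_;
};
```

Returning false on overflow leaves it to the caller to decide what to do, for example setting a diagnostic flag, without ever touching the heap in the real-time task.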
Hi Martin,
Since getting rid of 'new' in the Execute method, it had been quiet with no system watchdogs, but now for some reason it has popped up again. Has there been any update since last time to the documentation on how the system watchdog can be debugged? Is there, for example, a way to figure out which of the three processes triggers the watchdog? There seems to be a log file in /opt/plcnext/logs/watchdogdeamon, but unfortunately there is nothing in it. I need a solution with some urgency and hope to get some advice on how to debug this.
Sincerely, Nick
Hello Nick,
Is it okay for you if I contact you via your email address? Maybe we can analyze this problem together?
BR Eduard
Hello Eduard,
For some reason I missed your comment. I have been in touch with the PLCnext team in the Netherlands to solve the problem, but I am certainly interested in a fresh view on it. I would be happy to continue the conversation by email.
Yours sincerely, Nick
Good afternoon. I'm facing exactly the same problem. My AXC F 2152 is constantly rebooting at random points in time. The FAIL + RUN 2 Ghz error is lit. After analyzing the error, I found that it was a "system watchdog". What is it, and why does the problem occur on only one PLC? I have a feeling that this error is tied to this specific PLC. Is it a manufacturing defect?
Hello,
the "system watchdog" can be caused by different errors and must be analyzed in detail. The following information is required for analysis:
1. PLC type and FW version used.
2. Is the external SD card used? If yes, please try replacing or removing the SD card to check whether the error was caused by a defective SD card.
3. Please check the file “/opt/plcnext/logs/Output.log”. Are there any error messages before the system watchdog occurs?
4. Did the controller previously run without a "system watchdog" occurrence? If so, was something changed, such as a firmware update, the system configuration or settings? Was an open source project or a specific app installed?
5. Please check the voltage on the 24V power supply.
6. If you cannot find the error, please contact the Phoenix Contact support department in your country to create and handle this issue in the ticket system.
I hope it helps.
BR Eduard
Hello,
please perform a factory reset of type 1 (see Factory Reset in the PLCnext Info Center) and upload/start your application program.
If the error persists, please perform a factory reset of type 2 (see Factory Reset; a firmware update is needed).
If the error persists after the type 2 reset, please upload and start an empty PLCnext Engineer project on the PLC.
If the error persists with the empty project, please contact the Phoenix Contact "after sales department" in your country to have the PLC repaired or exchanged.
If the error does not occur when starting an empty PLCnext Engineer project, please contact the Phoenix Contact support department in your country to create and handle this issue in the ticket system.
Thanks & BR
Eduard
Good afternoon. I reset the controller several times, first with the type 1 reset and then with the type 2 reset. Then I uploaded an empty project to it, as you said. This was unsuccessful: after a while, the controller goes into the error state. I also uploaded an empty project and left it running all night, and when I came in the morning I saw the error. Not a single module was connected to the controller; it was running with just an empty project.
Hello,
-> If the error persists with an empty project, please contact the Phoenix Contact support/after sales department in your country to have the PLC repaired or exchanged.
BR Eduard