I'm sure most people have experienced a situation where the computer they use every day "hangs up" or "freezes", i.e., stops working for some reason (an unspecified operation is endlessly repeated, or the computer stops dead and won't accept any command), and must be rebooted. In most cases, the user has some knowledge of what the computer is doing and is able to recognize that a malfunction has occurred and take appropriate measures (rebooting).
What about applications that include embedded microcontrollers? The user is usually not an expert. In fact, the user may not even be aware that a microcontroller is being used. If the user realizes that there is a problem, they may take action such as pulling the plug from the wall socket. However, the user cannot monitor the application all the time, and it is unreasonable to expect the user to make the right decision and take the appropriate action.
At the same time, application programs are getting larger and more complex. This makes it difficult even for designers to understand every aspect of a program. Consequently, it is almost impossible to include countermeasures for every possible program malfunction in the program itself (for starters, the quality of the program must first be raised before thinking about countermeasures). If countermeasures cannot be implemented by the program alone, some kind of help must be obtained from the hardware.
The hardware function typically used to monitor normal operation of the program is the watchdog timer.
(2) What is a watchdog timer?
A watchdog timer (WDT for short) is a simple timer used to monitor whether the program is operating normally or not. The timer itself can be cleared from the program it is monitoring, but it cannot be stopped. When the watchdog timer overflows, the WDT interrupt is generated and a reset occurs.
Normally, the watchdog timer is cleared regularly to prevent overflow and avoid WDT interrupt generation and reset. The diagram below clarifies the operation of the watchdog timer. As shown here, unlike other functions, correct use of the WDT means it should not operate (the watchdog should not bark).
(3) WDT configuration and operation (outline)
The watchdog timer can be cleared but not stopped (i.e., it is a free-running timer). Once started, it simply continues counting the specified clock. The diagram below gives an image of the watchdog timer operation. The time until overflow is determined by the frequency of water droplets (count clock frequency). When the water builds up (clock is counted), if the bottom of the water container is opened, releasing the water (clearing the timer) before the water (timer) overflows, there is no problem. However, if someone forgets to release the water, or it is not released in time, and an overflow occurs, the switch to detonate the bomb will be pushed (a reset will occur). This procedure enables detection of a program malfunction.
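The behavior described above can be sketched as a small software model. This is only an illustration under stated assumptions: the names wdt_model, wdt_start, wdt_tick, wdt_clear and wdt_simulate are hypothetical and do not correspond to the registers of any real microcontroller, and the "reset" is reduced to a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of a WDT; not any real device's registers. */
typedef struct {
    uint32_t count;     /* current counter value (the water level) */
    uint32_t overflow;  /* count at which the timer overflows */
    bool     reset;     /* set once an overflow has occurred */
} wdt_model;

/* Start the timer. Note that the model deliberately has no "stop" function. */
static void wdt_start(wdt_model *w, uint32_t overflow_count)
{
    assert(overflow_count > 0);
    w->count = 0;
    w->overflow = overflow_count;
    w->reset = false;
}

/* One pulse of the count clock (one water droplet). */
static void wdt_tick(wdt_model *w)
{
    if (w->reset) return;
    if (++w->count >= w->overflow) w->reset = true;  /* overflow: reset */
}

/* Clear the counter (open the bottom of the container). */
static void wdt_clear(wdt_model *w)
{
    if (!w->reset) w->count = 0;
}

/* Run total_ticks clock pulses, clearing every clear_period ticks.
   A clear_period of 0 means the program never clears the timer.
   Returns true if a reset occurred. */
static bool wdt_simulate(uint32_t overflow_count, uint32_t clear_period,
                         uint32_t total_ticks)
{
    wdt_model w;
    wdt_start(&w, overflow_count);
    for (uint32_t t = 1; t <= total_ticks; t++) {
        wdt_tick(&w);
        if (clear_period != 0 && t % clear_period == 0) wdt_clear(&w);
    }
    return w.reset;
}
```

With an overflow count of 100, clearing every 50 ticks never causes a reset in this model, while never clearing, or clearing only every 150 ticks, does.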
(4) How to use WDT (outline)
The program is designed to clear the WDT regularly. As long as the program is operating normally, the WDT therefore never overflows and never takes action.
(5) Limits of WDT
Once the program has hung up or some other malfunction has occurred, the WDT is normally no longer cleared regularly. This causes generation of the WDT interrupt and/or reset (the dog barks), which enables detection of the malfunction. However, if the WDT continues to be cleared regularly even after a program malfunction has occurred, that malfunction cannot be detected.
This occurs because the WDT does not care where in the program it is cleared, much the same as a naughty watchdog who is fed meat by a burglar and thus wags his tail instead of barking. The WDT is by no means fail-safe.
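This limitation can be illustrated with a rough simulation. The sketch assumes that the main loop hangs at a given tick while a periodic timer interrupt keeps being serviced; the overflow count of 100, the period of 10 and the names clear_site and hang_is_detected are all hypothetical values chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Where the program clears the WDT (hypothetical names). */
typedef enum { CLEAR_IN_MAIN_LOOP, CLEAR_IN_TIMER_ISR } clear_site;

/* Returns true if the modelled WDT overflows, i.e. the hang is detected,
   within total_ticks. The main loop is assumed to hang from hang_at_tick
   onward, while the periodic timer interrupt keeps firing. */
static bool hang_is_detected(clear_site site, uint32_t hang_at_tick,
                             uint32_t total_ticks)
{
    const uint32_t overflow = 100;  /* WDT overflow count (assumed) */
    const uint32_t period = 10;     /* loop pass length and ISR period (assumed) */
    uint32_t count = 0;

    assert(period < overflow);
    for (uint32_t t = 1; t <= total_ticks; t++) {
        if (++count >= overflow) return true;   /* the dog barks */
        if (t % period != 0) continue;

        bool main_loop_alive = (t < hang_at_tick);
        if (site == CLEAR_IN_TIMER_ISR) {
            count = 0;  /* the interrupt still fires during the hang */
        } else if (main_loop_alive) {
            count = 0;  /* only a running main loop reaches this clear */
        }
    }
    return false;
}
```

In this model a hang at tick 500 is detected when the clear sits in the main loop, but goes unnoticed when the clear sits in the timer interrupt, because the interrupt keeps feeding the dog.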
Moreover, there are some microcontrollers whose on-chip WDT is stopped in its initial state (dog is sleeping) and won't function (won't protect the property) unless it is activated (woken up).
(6) Current watchdog timers
The watchdog timers in recent microcontrollers have overcome the limitations mentioned above and are much more convenient. The main improvements are as follows.
- The watchdog timer cannot be cleared unless a specific data pattern is written; if a different value is written, a reset occurs.
(The dog will bite unless fed its usual food.)
- The timing at which the WDT can be cleared is restricted and a reset occurs if the timer is cleared at any other timing.
(The dog will bite if fed when it is not hungry.)
- The WDT is activated from the start using a dedicated clock that cannot be stopped.
(The dog is always awake and protecting the property.)
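The first two improvements can be sketched in a software model as follows. This is not the interface of any particular device: the key value 0xA5, the window limits and the names win_wdt, win_wdt_write and write_causes_reset are placeholders invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WDT_KEY 0xA5u  /* placeholder clear pattern, not a real device's value */

typedef struct {
    uint32_t count;        /* current counter value */
    uint32_t window_open;  /* clears are accepted only at or after this count */
    uint32_t overflow;     /* count at which the timer overflows */
    bool     reset;        /* set once a reset has been triggered */
} win_wdt;

/* The timer counts from the start; the model offers no way to stop it. */
static void win_wdt_start(win_wdt *w, uint32_t window_open, uint32_t overflow)
{
    assert(window_open < overflow);
    w->count = 0;
    w->window_open = window_open;
    w->overflow = overflow;
    w->reset = false;
}

/* One pulse of the count clock. */
static void win_wdt_tick(win_wdt *w)
{
    if (!w->reset && ++w->count >= w->overflow) w->reset = true;
}

/* A write clears the timer only with the right key inside the open window. */
static void win_wdt_write(win_wdt *w, uint8_t value)
{
    if (w->reset) return;
    if (value != WDT_KEY || w->count < w->window_open) {
        w->reset = true;  /* wrong food, or fed when not hungry */
        return;
    }
    w->count = 0;
}

/* Helper for experiments: wait `ticks`, write `value` once, and report
   whether a reset resulted. Window opens at 50, overflow at 100 (assumed). */
static bool write_causes_reset(uint32_t ticks, uint8_t value)
{
    win_wdt w;
    win_wdt_start(&w, 50, 100);
    for (uint32_t i = 0; i < ticks; i++) win_wdt_tick(&w);
    win_wdt_write(&w, value);
    return w.reset;
}
```

In this model, writing the key after 75 ticks clears the timer, whereas writing a wrong value, writing the key after only 10 ticks, or writing nothing until after the overflow at tick 100 each end in a reset.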
Further improvements can also be expected in the future.
(7) Cautions on use
The main points that require care when using a WDT are how much time to allow until overflow and the timing of clearing the timer. If the time until overflow is short, the timer must be cleared frequently, which increases constraints on the system (the dog must be fed as soon as it appears hungry, so the owner cannot leave the house).
If the clearance timing is set too close to overflow, however, the timer may not be cleared in time due to hardware variations such as variations in the count clock, or software variations such as variations in the execution time according to the program processing contents or the asynchronous generation of an interrupt. It is therefore necessary to take these points into consideration when designing the system to ensure that the WDT does not overflow, even in the worst case.
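One way to check this worst case is to compare the shortest possible time until overflow (count clock running fast) with the longest possible interval between clears (longest program path plus interrupt time). The following is a minimal sketch of that comparison; the structure wdt_timing, its fields and the function names are assumptions made for the example, and a real design would take the figures from the device's data sheet and from measurement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical timing budget for one WDT configuration. */
typedef struct {
    uint32_t overflow_count;     /* count clocks until overflow */
    uint32_t clock_hz;           /* nominal count clock frequency */
    uint32_t clock_tol_percent;  /* +/- tolerance of the count clock */
    uint32_t max_loop_us;        /* longest pass between clears, no interrupts */
    uint32_t max_isr_us;         /* worst-case interrupt time within one pass */
} wdt_timing;

/* Shortest time until overflow in microseconds: the clock runs at its
   fastest. Integer division rounds down, which errs on the safe side. */
static uint64_t wdt_min_overflow_us(const wdt_timing *t)
{
    assert(t->clock_hz > 0);
    uint64_t fastest_hz_x100 =
        (uint64_t)t->clock_hz * (100u + t->clock_tol_percent);
    return (uint64_t)t->overflow_count * 1000000u * 100u / fastest_hz_x100;
}

/* Longest interval between two clears in microseconds. */
static uint64_t wdt_max_clear_interval_us(const wdt_timing *t)
{
    return (uint64_t)t->max_loop_us + t->max_isr_us;
}

/* True if the timer is always cleared before it can overflow. */
static bool wdt_timing_is_safe(const wdt_timing *t)
{
    return wdt_max_clear_interval_us(t) < wdt_min_overflow_us(t);
}
```

For example, with an assumed 4096-count timer on a 15 kHz clock with a tolerance of 15%, the shortest time until overflow is about 237 ms rather than the nominal 273 ms. A worst-case clear interval of 240 ms would therefore look safe against the nominal figure but fails this check.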
It is also necessary to consider what to do about the WDT when the CPU is stopped, such as in standby mode.
A common mistake is to give absolute priority to clearing the WDT. This is too simplistic and leads to designs such as clearing the WDT from within a periodic timer interrupt (feeding the dog only when required is a pain, so an automatic feeding machine is installed so that the dog never gets hungry). This is definitely a case of the tail wagging the dog.
The WDT function is not perfect and an understanding of the whole program is required in order to operate it effectively. However, it can be used as a means of detecting program malfunctions if the user understands how it operates and its limitations.