James Thornton, Internet Business Consultant
A Hardware watchdog and shutdown button
Abstract: The LCD control panel article explained how to build a small microcontroller-based LCD panel with enormous possibilities. Sometimes you don't need all those features. The hardware we design in this article is a lot cheaper (the LCD panel was already a bargain) and includes just two important features from the LCD panel: a hardware watchdog and a shutdown button.

What is a watchdog?

A watchdog in computer terms is a very reliable piece of hardware which ensures that the computer is always running. You find such devices in the Mars Pathfinder (who wants to send a person to Mars to press the reset button?) or in some extra expensive servers.

The idea behind such a watchdog is very simple: the computer has to "say hello" to the watchdog hardware from time to time to let it know that it is still alive. If it fails to do that, it gets a hardware reset.

Note that a normal Linux server should be able to run uninterrupted for several months, on average probably 1-2 years, without locking up. If you have a machine that locks up every week then something else is wrong and a watchdog is not the solution. You should check for defective RAM (see memtest86.com), overheated CPUs, IDE cables that are too long ...

If Linux is so reliable that it will run for a year without any problems, then why do you need a watchdog? The answer is simple too: to make it even more reliable. There is also a human problem. A server that made no trouble for a year is basically unknown to the service personnel. If it fails, nobody knows where it is. It might also lock up just before Christmas when everybody is at home. In all such cases a watchdog can be very useful.

A watchdog does not, however, solve all problems. It is no protection against defective hardware. If you include a watchdog in your server then you should also make sure that the hardware is well dimensioned: properly cooled, and probably not the very latest parts with their BIOS and chipset bugs.

How to use the watchdog?

The watchdog we design here only ensures that user space programs are still executing. To have a truly reliable system you still have to monitor your applications (web servers, databases) and your system resources (disk space, perhaps CPU temperature). You can do this with other user space programs started from crontab. All this is already described in the LCD control panel article, so I will not go into further details here.

Examples? A small script can monitor networking, swap usage and disk usage.
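
A minimal sketch of such a check might look like the following. It is written in Python purely as an illustration and is not the script the article refers to. It covers disk and swap usage only (no network test), and the function names and the limits of 90% and 50% are hypothetical. The swap figures are read in the format Linux uses in /proc/meminfo.

```python
import shutil


def disk_used_percent(path="/"):
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def swap_used_percent(meminfo_text):
    """Percentage of swap in use, parsed from /proc/meminfo style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("SwapTotal", "SwapFree"):
            fields[key] = int(rest.split()[0])  # value in kB
    total = fields.get("SwapTotal", 0)
    if total == 0:
        return 0.0  # no swap configured
    return 100.0 * (total - fields.get("SwapFree", 0)) / total


def report(disk_limit=90.0, swap_limit=50.0, path="/", meminfo="/proc/meminfo"):
    """Return a list of problem descriptions; an empty list means all is well."""
    problems = []
    disk = disk_used_percent(path)
    if disk > disk_limit:
        problems.append(f"disk {path} is {disk:.0f}% full")
    with open(meminfo) as handle:
        swap = swap_used_percent(handle.read())
    if swap > swap_limit:
        problems.append(f"swap is {swap:.0f}% used")
    return problems
```

Printing each entry of `report()` is enough when the script runs from cron, since cron mails any output of a job to its owner.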
To run such a script from cron four times an hour, use a crontab entry like this:

1,15,30,45 * * * * /where/the/script/is

The watchdog hardware

There is no standard relay. Every manufacturer has its own design. For our circuit the resistance of the coil matters very much. Therefore you find below 2 circuits, one for a 5V, 500 Ohm relay and one for a 5V, 120 Ohm relay. Ask for the coil resistance of the relay, or measure it with an ohmmeter, before you buy it.

[Schematic: circuit for the 120 Ohm relay]

[Schematic: circuit for the 500 Ohm relay]

The shutdown button is a push button that connects RTS and CD when pressed. It looks a bit strange in the schematic because Eagle does not have a better symbol.

I don't include a parts list in this article. You can see what you need in the schematic above (don't forget the DB9 connector for the serial line). For the diodes you can use any diode, e.g. 1N4148. Personally I believe that the circuit with the 500 Ohm relay is better because you do not need R4 and do not need a 2000uF (or 2200uF) capacitor. You can use a smaller 1000uF capacitor for C1.

Note that for the 120 Ohm circuit you need a red LED and for the 500 Ohm relay a green LED. This is no joke: the voltage drop over a green LED is higher than over a red LED.

Board layout, Eagle files and PostScript files for etching the board are included in the software package which you can download at the end of the article. The Eagle CAD software for Linux is available from cadsoftusa.com.

How the circuit works

The watchdog circuit is built around the NE555 timer chip. This chip includes 2 comparators, a flip-flop and 3 resistors of 5K Ohm each that provide the reference voltages for the comparators. Whenever the pin named threshold (6) goes above 2/3 of the supply voltage, the flip-flop is set (state on).

Now look at the schematic of our circuit: we use the RTS pin from the serial line as supply voltage. The voltages on the RS232 serial line interface are +/- 10V, therefore we need a diode before capacitor C1.
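
Because the 555 trips when its threshold pin passes 2/3 of the supply voltage, the watchdog timeout is simply the time the timing capacitor needs to charge that far. As a rough sketch, assuming an ideal RC charge from 0V (no leakage, no threshold input current, constant supply), the Python below works out that time and the capacitance it implies. The 4.7M resistor and the timeout of about 40 seconds are the figures this circuit uses; the resulting capacitance is only what the ideal model gives and is not a value read from the schematic.

```python
import math


def charge_time(r_ohm, c_farad, fraction=2 / 3):
    """Seconds for an ideal RC stage to charge from 0 V to `fraction` of the supply."""
    return -r_ohm * c_farad * math.log(1.0 - fraction)


def capacitance_for(timeout_s, r_ohm, fraction=2 / 3):
    """Capacitance (farads) that reaches `fraction` of the supply after `timeout_s`."""
    return timeout_s / (-r_ohm * math.log(1.0 - fraction))


# For the 2/3 threshold the formula reduces to t = R * C * ln(3), about 1.1 * R * C.
c_timing = capacitance_for(40.0, 4.7e6)
print(f"{c_timing * 1e6:.1f} uF")  # prints 7.7 uF
```

A real part will behave somewhat differently, so treat the result as an order-of-magnitude check and confirm the timeout by measurement.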
Capacitor C1 charges very quickly and serves as energy storage so that the relay can be switched on for a moment. Capacitor C2 charges very slowly through the 4.7M resistor. The transistor T1 discharges capacitor C2 whenever it gets a short pulse via the RS232 DTR pin. If the pulses stop coming (because the computer has locked up), capacitor C2 will eventually (after about 40 seconds) be charged above 2/3 of the supply voltage and the flip-flop goes to "on".

Capacitor C1, resistor R2, the LED and the relay have to be dimensioned such that the energy in capacitor C1 switches the relay on briefly but there is not enough current to keep the relay on all the time. We want the "reset button" to be "pressed" just for a second or two. The LED will stay on until the server comes up again after a reset.

As you can see in the schematic, there is also a shutdown button connected to pin CD. If you press it for a short while (15 sec) then the driver software will run "shutdown -h now" and shut down the server. This is for normal maintenance operations and has nothing to do with the watchdog.

The driver software

The driver software is a small C program that can be started from the /etc/init.d/ scripts. It permanently switches on the RS232 pin RTS and then sends a pulse to DTR every 12 seconds (the timeout of our watchdog is 40 seconds). If you shut down your computer normally then the program switches off RTS and gives a last pulse to DTR. The effect is that the supply voltage capacitor (C1) is already discharged before the timeout expires. Therefore the watchdog will not trigger during a normal shutdown.

To install the software, get the linuxwd-0.3.tar.gz file from the download page, unpack it and run make to compile. Copy the resulting linuxwd executable to /usr/sbin/linuxwd.
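
The behaviour described in this section (RTS held on, a DTR pulse every 12 seconds, shutdown once the button on CD has been held for 15 seconds) can be sketched in a few lines. The following Python is not the linuxwd source, which is a C program; it is an illustration that uses the standard Linux modem-control ioctls. The one-second polling loop, the pulse polarity, the pulse width and all function names are assumptions.

```python
import fcntl
import os
import struct
import subprocess
import termios
import time

PULSE_INTERVAL = 12  # seconds between DTR pulses (figure from the article)
BUTTON_HOLD = 15     # seconds the button must be held (figure from the article)
PULSE_WIDTH = 0.1    # seconds DTR stays high; an assumed value


def set_modem_line(fd, bit, on):
    """Raise or drop one modem-control output such as termios.TIOCM_RTS."""
    request = termios.TIOCMBIS if on else termios.TIOCMBIC
    fcntl.ioctl(fd, request, struct.pack("I", bit))


def carrier_detect(fd):
    """True while the CD input is asserted, i.e. while the button is pressed."""
    raw = fcntl.ioctl(fd, termios.TIOCMGET, struct.pack("I", 0))
    return bool(struct.unpack("I", raw)[0] & termios.TIOCM_CD)


def pulse_dtr(set_line, sleep):
    set_line("DTR", True)  # assumed polarity: a positive pulse drives T1
    sleep(PULSE_WIDTH)
    set_line("DTR", False)


def keepalive(set_line, button_pressed, sleep, max_seconds=None):
    """Feed the watchdog; return "shutdown" once the button was held long enough."""
    set_line("RTS", True)  # RTS is the supply voltage of the circuit
    held = 0
    elapsed = 0
    while max_seconds is None or elapsed < max_seconds:
        if elapsed % PULSE_INTERVAL == 0:
            pulse_dtr(set_line, sleep)
        held = held + 1 if button_pressed() else 0
        if held >= BUTTON_HOLD:
            return "shutdown"
        sleep(1)
        elapsed += 1
    return "stopped"


def main(device="/dev/ttyS0"):
    """Wire the loop to a real serial port (not called automatically)."""
    fd = os.open(device, os.O_RDWR | os.O_NOCTTY | os.O_NONBLOCK)
    bits = {"RTS": termios.TIOCM_RTS, "DTR": termios.TIOCM_DTR}

    def set_line(name, on):
        set_modem_line(fd, bits[name], on)

    result = keepalive(set_line, lambda: carrier_detect(fd), time.sleep)
    set_line("RTS", False)           # clean stop: remove the supply ...
    pulse_dtr(set_line, time.sleep)  # ... and give one last pulse
    if result == "shutdown":
        subprocess.call(["shutdown", "-h", "now"])
```

Passing the line-setting, button-reading and sleep functions in as arguments keeps the timing logic separate from the serial port, so the loop can be exercised without the hardware attached.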
Edit the provided linuxwd_rc script (for Red Hat/Mandrake, or linuxwd_rc_anydist for any other distribution) and enter the serial port to which the hardware is connected (ttyS1=COM2 or ttyS0=COM1). Then copy the rc script to /etc/rc3.d/S21linuxwd and /etc/rc5.d/S21linuxwd. That's it.

Testing

When you have soldered everything together you should first test the circuit before connecting it to the computer. Connect the pin that will later go to the RTS line of the serial port to a 9-10V DC power supply and wait 40-50 seconds. You should hear a little click when the relay switches on, and the LED should go on. The relay should not stay on permanently. The LED will stay on until you also connect the line that will later go to DTR to +10V.

When you have verified that this works you can connect the circuit properly to the computer. The linuxwd program has a test mode in which it produces some printouts and, after some time, stops sending pulses over DTR to simulate a locked-up system. Run the command linuxwd -t /dev/ttyS0 to run linuxwd in test mode (use /dev/ttyS1 if you have the hardware on COM2).

Hardware installation

The RS232 interface has the following pinout:
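9-pin D-SUB male at the computer:

Pin 1: CD (carrier detect)
Pin 2: RxD (receive data)
Pin 3: TxD (transmit data)
Pin 4: DTR (data terminal ready)
Pin 5: GND (signal ground)
Pin 6: DSR (data set ready)
Pin 7: RTS (request to send)
Pin 8: CTS (clear to send)
Pin 9: RI (ring indicator)
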
Connecting the circuit to the RS232 port should be straightforward. To connect the CPU reset line to the relay you need to locate the wires that go to the reset button on your computer. Connect the relay from our circuit in parallel to the reset button.

Conclusion

A watchdog is certainly not a 100% guarantee of a reliable system, but it adds another level of security. One remaining problem is a file system check that does not complete after a hardware reset. The new journaling filesystems might help here, but I have not tried them out yet. The watchdog presented here is inexpensive, not too complex to build and almost as good as most commercial products.
2002-06-22, generated by lfparser version 2.28