Email is one of those services that should never be down. Unfortunately, that is almost impossible to guarantee, because something can and will go wrong. If a single server handles your mail, chances are your mail will go down with it at some point. Running multiple mail servers reduces that risk, but there is one issue you have to resolve first: the mail spool directory, where your mail resides. Because of file locking issues, this directory can only live in one place, and the sendmail process must always have fast write access to it. There are a few different ways of setting up redundant mail.
One way is to have mail gateways that spool incoming mail temporarily and then transfer it to the mail spool server. If the mail spool server is down, mail is still received by the gateway servers. The problem with this scenario is that people cannot read their email while the spool server is down.
Another scenario is to have multiple mail spool servers and divide the users among them, so that users whose names start with a-m are on one mail spool server and users with n-z are on the other. This only limits the problem: when one server fails, its group of users is still down while the other group keeps working. This scenario is very good, however, if you need to lighten the load on the servers. In fact, it can be combined with the next scenario.
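The a-m / n-z split can be sketched in a few lines of Python. This is only an illustration of the idea: the host names spool1.mycompany.com and spool2.mycompany.com and the function name are made up for the example and are not part of the setup described here.

```python
def spool_server_for(username: str) -> str:
    """Pick a spool server from the first letter of the user name.

    The two host names are hypothetical placeholders.
    """
    first = username.strip().lower()[:1]
    if not ("a" <= first <= "z"):
        raise ValueError("user name must start with a letter: %r" % username)
    # a-m go to the first spool server, n-z to the second
    if first <= "m":
        return "spool1.mycompany.com"
    return "spool2.mycompany.com"


# Example lookups
for user in ("alice", "Mallory", "nina"):
    print(user, "->", spool_server_for(user))
```

A real deployment would more likely keep this mapping in the mail routing tables than in code, but the rule is the same.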
A very inexpensive solution is to have 2 computers connected to each other through a shared SCSI disk that sits between them and holds the spool directory.
Here's an illustration:
Each server will have 2 SCSI cards, and each SCSI card will connect to one of the spool disks. There are two disks because we also want to add redundancy to the spool disks. The spool disks are combined in a software mirror (RAID 1). That way, even if a disk goes bad, your mail will still work.
The failover is handled by a heartbeat setup. The software can be found at Linux-HA. The computers are connected to each other by separate NICs, e.g. mail1 - 10.0.0.1 and mail2 - 10.0.0.2. They are also connected by a serial cable (for redundancy). Every second the software queries the other machine; if the primary machine is down, the second machine becomes the mail server. This means you will be using 3 IP addresses: 2 will be the direct IP addresses of the machines, and the third will be for the virtual server (this is the one users will be connecting to).
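To make the takeover rule concrete, here is a minimal Python sketch of the decision the standby machine makes. It is not heartbeat's own code. The class and method names are invented for illustration, and it assumes the peer is declared dead after 5 seconds of silence, which matches the deadtime value used in the ha.cf for this setup.

```python
class PeerMonitor:
    """Track when the other node was last heard from (illustrative only)."""

    def __init__(self, deadtime: float = 5.0):
        self.deadtime = deadtime   # seconds of silence before takeover
        self.last_seen = None      # time of the last heartbeat received

    def heartbeat(self, now: float) -> None:
        """Record that a heartbeat arrived at time `now`."""
        self.last_seen = now

    def peer_is_dead(self, now: float) -> bool:
        """True once the peer has been silent for longer than deadtime."""
        if self.last_seen is None:
            return False  # nothing heard yet; startup handling is left out
        return (now - self.last_seen) > self.deadtime


monitor = PeerMonitor(deadtime=5.0)
monitor.heartbeat(now=100.0)
print(monitor.peer_is_dead(now=103.0))  # still within the dead time
print(monitor.peer_is_dead(now=106.0))  # silent for more than 5 seconds
```

When peer_is_dead turns true on the standby machine, that is the point at which it would take over the virtual IP address and the services.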
You will install the software on the first computer and then sync it to the second. Installing the heartbeat software is painless if you use the RPMs (Red Hat 8.0). The configuration is also fairly simple. The config files are in /etc/ha.d. In ha.cf, the following lines should be changed:
keepalive 1
deadtime 5
initdead 10
serial /dev/ttyS0
baud 230400
udpport 694
udp eth1
watchdog /dev/watchdog
node mail1.mycompany.com
node mail2.mycompany.com
The watchdog line enables the watchdog kernel module, which will try to reboot your machine if it detects that the computer has frozen. You can find more information in the watchdog documentation in the kernel sources. In the authkeys file put the following 2 lines:
auth 2
2 crc
This file controls heartbeat authentication between the mail servers. Since they communicate over their own private network cards, there is no point in having any real authentication. The haresources file contains a single line, which tells the software which services it will regulate (start and stop) when a node goes up. In other words, your sendmail service will not be started by the rc scripts, nor will your spool disk be mounted by the rc scripts. Heartbeat will do this for you. Here are the contents of the file:
mail1.mycompany.com 192.168.0.100 Raid1::/etc/raidtab.md0::/dev/md0 Filesystem::/dev/md0::/mail::ext3 sendmail httpd
The IP address is the virtual IP address of the virtual mail server, e.g. mail.mycompany.com. Your spool disk will be mounted on /mail. The sendmail and httpd services will be controlled by heartbeat.
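The haresources line packs a lot into one row, so it can help to take it apart. The Python sketch below is only a reading aid and is not how heartbeat itself parses the file. The function name is made up, and it simply assumes that fields are separated by whitespace and that resource arguments are separated by "::", as in the line above.

```python
def parse_haresources(line: str) -> dict:
    """Split a haresources line into the preferred node and its resources."""
    fields = line.split()
    node, resources = fields[0], fields[1:]
    parsed = []
    for resource in resources:
        # "Filesystem::/dev/md0::/mail::ext3" -> name plus argument list
        name, *args = resource.split("::")
        parsed.append((name, args))
    return {"node": node, "resources": parsed}


line = ("mail1.mycompany.com 192.168.0.100 "
        "Raid1::/etc/raidtab.md0::/dev/md0 "
        "Filesystem::/dev/md0::/mail::ext3 sendmail httpd")

result = parse_haresources(line)
print("preferred node:", result["node"])
for name, args in result["resources"]:
    print(" resource:", name, args)
```

Read this way, the line names mail1 as the preferred node and lists five resources: the virtual IP address, the RAID device, the filesystem mount, sendmail and httpd.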
After the software has been installed, install the SCSI cards; make sure you use the same model of SCSI card all around. The SCSI ID of the card will be 7 on the first computer and 8 on the second computer. Disable the INIT command in the SCSI BIOS. Note that sometimes one of the disks will stop working while one of the computers is rebooting. This is nothing to panic about, as the disk will come back within a few minutes.
Create your RAID by writing this to /etc/raidtab:
raiddev /dev/md0
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
chunk-size 4
persistent-superblock 1
device /dev/sdb1
raid-disk 0
device /dev/sdc1
raid-disk 1
Run the mkraid command: mkraid /dev/md0. You can check the status of the RAID by reading /proc/mdstat. Then run mkfs.ext3 -j -m0 /dev/md0. Once the new filesystem is mounted on /mail, move your /etc/mail directory to /mail/etc/mail and put a symlink back at /etc/mail. Move and relink /var/spool/mqueue to /mail/mqueue. Move and relink /var/spool/mail to /mail/spool. In other words, move anything and everything that deals with sendmail (don't forget to symlink the original location). We also have a web server on our mail server for webmail, which means all the httpd files are moved to /mail and symlinked from their original locations.
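The move-and-relink step is the same for every directory, so it can be sketched as one small helper. This Python version is only an illustration. The function name is invented, and it assumes the shared disk is already mounted on /mail and that sendmail and httpd are stopped while the files are moved.

```python
import os
import shutil


def relocate_with_symlink(original: str, new_home: str) -> None:
    """Move `original` to `new_home` and leave a symlink at the old path."""
    if os.path.islink(original):
        return  # already relocated on an earlier run, nothing to do
    parent = os.path.dirname(new_home)
    if parent:
        os.makedirs(parent, exist_ok=True)  # e.g. create /mail/etc first
    shutil.move(original, new_home)
    os.symlink(new_home, original)


# The moves described in the text would then be (run as root):
#   relocate_with_symlink("/etc/mail", "/mail/etc/mail")
#   relocate_with_symlink("/var/spool/mqueue", "/mail/mqueue")
#   relocate_with_symlink("/var/spool/mail", "/mail/spool")
```

Because the helper returns early when the old path is already a symlink, running it a second time does no harm.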
In our custom setup we also run a simple C daemon that monitors the services managed by heartbeat (sendmail and httpd). It is installed in /etc/ha.d/proc_check/proc_check. This program will try to restart sendmail if it is down. After three tries it will attempt to switch mail to the second node.
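The retry-then-switch logic of such a monitor might look like the sketch below. This is not the proc_check source (which is written in C and not shown here); it is a Python outline of the rule just described. The three callables are hypothetical stand-ins for checking the process, restarting the service, and handing the resources over to the other node.

```python
from typing import Callable


def check_service(is_running: Callable[[], bool],
                  restart: Callable[[], None],
                  fail_over: Callable[[], None],
                  max_tries: int = 3) -> str:
    """Restart a dead service up to max_tries times, then fail over."""
    if is_running():
        return "ok"
    for _ in range(max_tries):
        restart()
        if is_running():
            return "restarted"
    # Every restart attempt failed: hand the service to the other node
    fail_over()
    return "failed_over"


# Simulated service that never comes back, to show the failover path
attempts = []
outcome = check_service(is_running=lambda: False,
                        restart=lambda: attempts.append("restart"),
                        fail_over=lambda: attempts.append("fail_over"))
print(outcome, attempts)
```

A daemon would call a function like this in a loop with a short sleep between checks.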
We also have a Perl sync program that keeps the 2 servers in sync. From the first machine you run mail.sync, located in /usr/local/bin; its config files are in /usr/local/etc/. It uses rsync to do the syncing.
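The mail.sync script itself is not reproduced here, so the following Python sketch only shows the general shape of an rsync-based sync. The function name and the list of paths are assumptions made for the example; the only rsync options used are -a (archive mode) and --delete (remove files on the peer that no longer exist locally).

```python
def build_rsync_commands(paths, peer):
    """Build one rsync command per directory to mirror it onto the peer."""
    commands = []
    for path in paths:
        directory = path.rstrip("/") + "/"  # trailing slash: copy contents
        commands.append(
            ["rsync", "-a", "--delete", directory, "%s:%s" % (peer, directory)]
        )
    return commands


# Hypothetical list of directories to keep identical on both nodes
for cmd in build_rsync_commands(["/etc/ha.d", "/usr/local/etc"],
                                "mail2.mycompany.com"):
    print(" ".join(cmd))
```

The commands are only built and printed here. Running them, for instance with subprocess.run, would need working remote-shell access from mail1 to mail2.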
In heartbeat there is one active node, and the other is passive. The filesystem, sendmail and httpd run on the active node. When the passive node becomes active, these services run on the new node, and so on.