Building fault tolerant email storage for $30 a month

Ethereal.email is not big news anymore. It’s a simple service that lets you generate email accounts with an API call from Nodemailer or by clicking a button on the Ethereal homepage. Any mail you send through such an account is caught and stored in the account’s INBOX, where you can read it through a web interface or with your favorite IMAP client.

What might still be interesting, though, is that one of the main reasons I created it was to test out some of my newer email projects. The mail store is based on Wild Duck Mail Server, which unlike regular mail servers is designed to be fault tolerant by default. Having lots of people send all kinds of messages to that system lets me see how it works in practice.

Whenever you connect to an Ethereal.email URL or service, for example the Ethereal homepage or the MX server, or send mail through the Ethereal MSA, you actually hit an HAProxy instance first. This HAProxy instance then forwards your request to one of three available application servers. Wild Duck is stateless, so all requests are balanced with the simplest possible round-robin algorithm.

All three servers make up a MongoDB replica set and also a Redis Sentinel set. Applications (IMAP, POP3, MSA, MX, WWW) connect to these replica sets to get their data. MongoDB is used as the mail store and Redis is for caching, counters, pubsub and such.
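To make the replica set connection a bit more concrete, here is a minimal sketch that builds a MongoDB replica set connection string listing all three servers. The host names, database name and replica set name are made up for illustration and are not the real Ethereal configuration. A MongoDB driver given a URI like this discovers the current primary on its own, which is why the applications do not need to know which server holds that role.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, database, replica_set, port=27017):
    """Build a MongoDB connection string that lists every replica set member."""
    host_list = ",".join(f"{host}:{port}" for host in hosts)
    query = urlencode({"replicaSet": replica_set})
    return f"mongodb://{host_list}/{database}?{query}"


# Hypothetical host, database and replica set names, for illustration only.
uri = replica_set_uri(
    ["app1.example.com", "app2.example.com", "app3.example.com"],
    database="wildduck",
    replica_set="rs0",
)
print(uri)
```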

Now to the fault tolerance part. If one of the application servers dies for whatever reason, the service should keep working unaffected. HAProxy detects that the backend is down and removes it from the backend list, so no requests are routed to that instance. Because Redis and MongoDB both run in a replicated setup, applications either keep using the old primary instance of MongoDB and Redis or wait until the MongoDB replica set and Redis Sentinel have elected a new primary and then switch over seamlessly.
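The routing behaviour described here can be illustrated with a toy round-robin balancer that skips backends marked as down. This is only a sketch of the idea, not how HAProxy is implemented; the class, its methods and the backend names are all hypothetical.

```python
from itertools import count


class RoundRobinBalancer:
    """Toy round-robin balancer that skips backends marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._counter = count()

    def mark_down(self, backend):
        # Called when a health check fails.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        # Called when a known backend passes its health check again.
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Walk the rotation, trying each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = self.backends[next(self._counter) % len(self.backends)]
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


balancer = RoundRobinBalancer(["app1", "app2", "app3"])
first_round = [balancer.pick() for _ in range(3)]

balancer.mark_down("app2")  # simulate one application server dying
after_failure = [balancer.pick() for _ in range(4)]
```

With all three backends healthy, requests rotate through them in order. Once `app2` is marked down, the rotation simply skips it and traffic alternates between the two remaining servers.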

If your long-lived IMAP connection happens to target the instance that goes down, then the connection is obviously lost. IMAP clients usually reconnect immediately, and this time HAProxy routes the connection to a healthy instance, so as a user you probably don’t even notice that your mail client reconnected.

Each application server has 50GB of storage, which is not much, but Wild Duck is a lot more efficient than normal mail stores when storing messages to disk, so in practice it should hold a lot more than 50GB worth of messages. All messages expire in 7 days, so even if the storage fills up, it’s a temporary problem.

Image: on the left, the combined size of maildir folders for ~2000 users; on the right, the same messages imported to Wild Duck using all available optimization options.

“Normal” email servers usually dedicate a specific folder on a specific disk to each email user, so your IMAP connections must always end up on the same instance. Otherwise your mailbox becomes unavailable. With Wild Duck it does not matter which host you connect to, as all data is stored in replicated document storage.

And now to the costs. I host the application at OVH, with each of the four instances (the HAProxy instance plus the three application servers) using a €2.99 VPS (2GB RAM, 1 CPU). The application instances each have an additional 50GB disk attached (€5 each). TLS certificates come from Let’s Encrypt (via acme.sh) and are free. So the total cost is 4*2.99 + 3*5 = 26.96€, which roughly translates to $30.
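For anyone who wants to check the arithmetic, here is the same calculation as a short sketch. It works in euro cents to avoid floating point rounding, and it assumes the four VPS instances are the HAProxy instance plus the three application servers, which is how the 4*2.99 figure appears to be made up. The dollar conversion is left out because the exchange rate used is not stated.

```python
# Monthly prices in euro cents, taken from the figures quoted in the post.
VPS_PRICE_CENTS = 299   # €2.99 per VPS
DISK_PRICE_CENTS = 500  # €5 per additional 50GB disk

VPS_COUNT = 4   # assumed: 1 HAProxy instance + 3 application servers
DISK_COUNT = 3  # one extra disk per application server

total_cents = VPS_COUNT * VPS_PRICE_CENTS + DISK_COUNT * DISK_PRICE_CENTS
total_eur = f"{total_cents // 100}.{total_cents % 100:02d}"

print(f"Monthly total: {total_eur}€")
```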

My benchmarks show that this $30 cluster is able to process 30 messages per second (10Mb/s). For testing I used the messages that have accumulated over the years in my main email address, so the processed messages should reflect everyday usage (different messages, different kinds of attachments etc). Not quite enough to run large-scale email hosting, but good enough for the money.
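To put those benchmark numbers in perspective, here is a rough back-of-the-envelope sketch. The daily total assumes the 30 messages per second rate could be sustained around the clock, which the benchmark does not claim. The "10Mb/s" figure is ambiguous, so the average message size is computed under both readings (megabits and megabytes); neither reading is confirmed.

```python
MESSAGES_PER_SECOND = 30
SECONDS_PER_DAY = 24 * 60 * 60

# Upper bound, assuming the benchmark rate is sustained for a full day.
messages_per_day = MESSAGES_PER_SECOND * SECONDS_PER_DAY


def avg_message_bytes(bytes_per_second, messages_per_second):
    """Average message size implied by a throughput and a message rate."""
    return bytes_per_second / messages_per_second


# Reading "10Mb/s" as 10 megabits per second (assumption).
avg_if_megabits = avg_message_bytes(10_000_000 / 8, MESSAGES_PER_SECOND)

# Reading "10Mb/s" as 10 megabytes per second (assumption).
avg_if_megabytes = avg_message_bytes(10_000_000, MESSAGES_PER_SECOND)

print(f"{messages_per_day:,} messages per day")
print(f"~{avg_if_megabits / 1000:.0f} kB per message if megabits")
print(f"~{avg_if_megabytes / 1000:.0f} kB per message if megabytes")
```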
