Installing a Mail Server for a Small Business

A sceptical guide

= Forewarning =

This page is still under construction.

I am neither an actual sysadmin nor a mail server expert, so take my advice with a big grain of salt. By the way, I am not a lawyer either, so you will find no valid legal advice here.

Your feedback is welcome, just drop me a line.

Once again, for the hard of understanding: if admittedly unprofessional advice, spiced with cynicism and maybe even some unjustified criticism, deeply disturbs or angers you, stop reading now!

= Big Decision Time =

So I hear that you need an e-mail service for your small business, volunteer-driven association or charity. Choose your lesser evil:


 * Outsource the whole thing:
   * Find a reasonable IT service provider. This is easier said than done.
   * Lose quite a lot of control. You cannot afford to annoy your IT service provider anymore.
   * Probably pay too much in consultancy fees.


 * Use a basic offering from your Internet Service Provider or some other mail service provider:
   * Lose control of your data. Your e-mails may end up in the wrong hands if your provider gets hacked. Start worrying about data protection regulation.
   * Risk being taken hostage. For example, your provider can go bankrupt or turn evil. At the very least, you need to do backups yourself, just in case.
   * Any free or bundled service from your mail or hosting provider will probably be too basic. For example, many such mail services have no shared calendar. And mailbox size limits are often too small.
   * If you choose a separate mail server provider with more e-mail and collaboration features, probably pay too much per mailbox. Some people consider $5/user/month to be cheap for a good mail and groupware service. But say you are in a volunteer-driven association or charity, with 20 users that only work a few hours a week (like in a small library). That's 20 users x $5 x 12 months = $1200 per year. Most volunteers would probably prefer to invest the money somewhere else.
   * If the Internet connection goes down, your e-mail stops working internally too.


 * Use a fully-fledged cloud service like business-grade Gmail or Microsoft Office 365:
   * Lose all control of your data to a company specialised in customer data snooping. You cannot stop paying your subscription, or you lose access to your e-mails. Start worrying about data protection regulations.
   * Definitely pay too much. Do you really need all the features you are paying for?
   * If the Internet connection is down, your e-mail stops working internally too.


 * Use a hosting provider and set up your own e-mail server:
   * You need many skills.
   * The server needs to be constantly kept up to date, for security reasons. Even if updates are automatic, you cannot really leave them unattended for long: you need to monitor at least whether they are still updating themselves.
   * Lose control of your data. Your provider has full access to your virtual machine. I heard that there are special encryption solutions, but if you can go that far, this guide is definitely not for you.
   * If the Internet connection goes down, your e-mail stops working internally too.


 * Install a complete open-source software distribution on premises specifically designed for small business management in general, or at least for e-mail or groupware in particular:
   * Examples are: Univention Corporate Server, Kolab, Kopano, ClearOS, Zentyal, Zimbra, Zarafa, EGroupware, Open-Xchange, Sovereign, Mail-in-a-Box, iRedMail, Koozali SME Server ... and many more, including some local offers in your country ...
   * This kind of complete solution tends to be complicated and bloated. Some features are nice to have indeed, but the IT costs often outweigh the benefits in the long run.
   * You risk getting locked into a proprietary system. These small-business distributions come and go, or change drastically. If they change too much, upgrading an extensively-configured system may be a huge pain. If it is very special, migrating to a different system is no fun.
   * Once the system is configured, there is no reason why you should pay monthly fees. But vendors do not usually supply automatic or easy-to-install security upgrades for free. And these systems are designed to be exposed on the Internet, so having unpatched systems is risky. It's kind of contradictory if you think about it: the software is free, but you cannot really use it for free. But there are exceptions:
     * Univention does support a multidrop mode of operation and documents how to set fetchmail up, so the system need not be exposed on the Internet, lowering the risk of running outdated software.
     * Koozali SME Server supports multidrop too.
     * iRedMail, Mail-in-a-Box and Sovereign configure existing operating system distributions, so bug fixes and updates are available for free. But these systems are designed to be exposed to the Internet, so you cannot really leave them unattended for long: you need to monitor at least whether they are still updating themselves.


 * Install and maintain your own mail server on premises. This is an impossible task for most people. You need not just general sysadmin skills, but specific mail server know-how. Just the research part is overkill. That is the reason why I wrote this guide.

The rest of this article is for do-it-yourself kind of people who want to install the mail server components on premises.

= Design your E-mail and Systems Operations Policy =

Use a Virtual Machine
You should install your mail server in a virtual machine. You could use a Docker container, but a virtual machine is probably easier to manage.

You are going to invest a significant amount of time setting the mail server up. If you make a mistake along the way, you can revert to an earlier snapshot, instead of starting from scratch.

Sooner or later you will have to upgrade the mail server software, or the operating system it is running upon. You can copy the VM to another PC and test the upgrade procedure before modifying your main server. If upgrading the main server does go wrong in the end, you can just restore to the latest snapshot to avoid long service downtimes.

Backing up your mail server then becomes just a matter of copying your VM as a big file. I have written a simple script to automate libvirt-based VM backups, but there are more optimised solutions available. After all, backing up a VM is a standard operation.

Avoid enterprise-level servers. They are expensive, special, and often do not live up to their promises. Take 2 standard PCs (or entry-level servers), and if the primary fails, start the VM on the secondary.

Offline Backup
You need a strategy for an offline and off-site backup. The classic 3-2-1 backup rule is probably your best bet.

An automatic cloud backup is more convenient, but has many drawbacks:
 * Monthly fees are due just to keep the data at rest on the cloud.
 * You need a very fast Internet connection to do a restore test of a complete backup in a reasonable amount of time.
 * If the cloud provider goes bankrupt, you no longer have a backup. So you need a secondary, offline backup anyway. The secondary backup needs to be offline to protect against encrypting ransomware. If you are going to be implementing it anyway, you may as well drop the cloud.

You need to encrypt your offline and off-site backups. You can use an encrypted container like VeraCrypt or LUKS, but even password-protecting with a simple compressor like 7z will probably be enough.

You can add data redundancy with a tool like par2 for extra robustness. I have written some scripts for 7z + par2 and for simple mirroring backups, but there is really no shortage of file backup solutions.
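
As a rough illustration of the 7z + par2 combination, the following Python sketch only builds the two command lines. It is not one of the scripts mentioned above: the function name, the example paths and the 10 % redundancy figure are assumptions made for this example.

```python
def build_backup_commands(source_dir, archive, redundancy_percent=10):
    """Return the 7z and par2 command lines for one encrypted backup.

    Hypothetical helper: the name and the defaults are illustrative only.
    """
    seven_zip = [
        "7z", "a",      # add files to an archive
        "-p",           # ask for the password interactively
        "-mhe=on",      # also encrypt the file names (7z format only)
        str(archive),
        str(source_dir),
    ]
    par2 = [
        "par2", "create",
        f"-r{redundancy_percent}",  # amount of recovery data, in percent
        f"{archive}.par2",          # recovery files are written next to the archive
        str(archive),
    ]
    return seven_zip, par2
```

Each returned list could then be passed to subprocess.run(cmd, check=True), so that a failing step aborts the backup instead of going unnoticed.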

Archiving Policy
An e-mail archiving policy is often a must. The problem is that most users will never bother deleting big attachments from incoming or outgoing e-mails. Their mailboxes will forever grow, and your mail server will become unmanageable: backups will take forever, performance will drop, and disks will become full. Besides, when connecting a new e-mail client like Thunderbird or a smartphone, the new client may choose to download or sync a huge inbox or sent folder, unduly overloading the network connection.

No amount of convincing will do. You need to set a hard limit on the mailbox sizes. When the mailbox limit is reached, the user will need to clean old e-mails. Which is also not realistic.

Therefore, when a mailbox becomes full, your only practical options are to delete old e-mails or to archive them (move them to offline storage). An automatic, unforgiving system is often not appropriate in a small business, where you do not want to annoy your colleagues too much. But those 2 options do not have to be automatic: when the mailbox hits the limit, because the user has "forgotten" to take preventive action, you can then discuss with the user whether you will be deleting or archiving, and which e-mails will be affected.

But you need to be prepared. If a mailbox is full and cannot send or receive any more e-mails, that is not the time to start discussing or designing an archiving method.
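
Being prepared can start with a simple early warning. The sketch below assumes that each mailbox is stored as one directory on disk (as with the Maildir format), which may not match your server. The function names and the 80 % warning threshold are made up for this example.

```python
import os


def directory_size_bytes(path):
    """Sum the sizes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def quota_status(used_bytes, limit_bytes, warn_fraction=0.8):
    """Classify a mailbox as 'ok', 'warn' or 'full'."""
    if used_bytes >= limit_bytes:
        return "full"
    if used_bytes >= warn_fraction * limit_bytes:
        return "warn"   # time to talk to the user, before mail starts bouncing
    return "ok"
```

Run daily, a check like this leaves some time to discuss deleting or archiving with the user before the hard limit is hit.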

Encryption at Rest
Decide whether you want encryption at rest on your server. Very few people implement this, because it is difficult. But if someone physically steals your server, he/she will then have access to all data on disk.

You can of course assume that your premises are physically secure, or that any robbers will only be interested in the hardware. There may be a legal requirement arising from the data protection regulation, so you should probably seek legal advice.

Options for encryption at rest are:
 * Use an e-mail server that encrypts each mailbox. This is non-standard and will probably make your life harder.
 * Use a special server solution that detects whether the server is not inside your premises anymore. I have heard of such a thing, but it sounds complicated. Your feedback is welcome.
 * Use a manual boot procedure for your server, so that somebody has to type the disk encryption password manually if the server restarts. If it is a virtual machine, you could do that remotely over a VPN or SSH. But it is still a manual procedure, so that your servers will not reboot automatically after an upgrade, or will not start automatically after a power loss.

Data Retention
You should find out whether you are legally required to enforce an e-mail retention policy at company level. If the answer is not clear, you probably need to talk to a lawyer.

This should not be hard to implement: just copy all incoming and outgoing e-mails to a separate mailbox. There are several methods to achieve this, but I haven't investigated them yet. Your feedback is welcome.

As a bonus, such a solution could serve as a last-resort backup. If a user manages to completely delete an e-mail, even from the trash folder, perhaps on purpose, you can always restore it from the data retention pool.

= How E-Mail Delivery and Bounces Work =

Say Alice wants to send an e-mail to Bob:


 * Alice presses the "send e-mail" button.
 * Her computer opens an SMTP connection to Bob's computer. That is what IPv6 was for, wasn't it?
 * Bob's computer replies on the same SMTP connection with "Bob is awake and is reading your e-mail".

If only life were so simple. So let's add some servers and some trouble:


 * Alice presses the "send e-mail" button.
 * Her computer relays the e-mail to her SMTP server.
 * Alice's laptop suddenly runs out of battery, and she calls it a day.
 * In the meantime, Alice's SMTP server looks up Bob's MX DNS record and tries to contact his SMTP server.
 * Bob's SMTP server is down, so Alice's SMTP server will generate a "mail delivery delayed" bounce e-mail for Alice.
 * The apprentice at the web hosting company that Bob is hiring finally figures out how to restore the SMTP server.
 * Alice's SMTP server opens an SMTP connection to Bob's public (external) SMTP server.
 * Unfortunately, Alice misspelt Bob's e-mail address, so the SMTP connection will terminate with an "invalid recipient address" error.
 * Alice's SMTP server will generate an "invalid recipient address" bounce e-mail for Alice.
 * Next day, Alice got the spelling right. So this time Bob's public (external) SMTP server accepts the e-mail.
 * Unfortunately, Bob's Internet connection is down, so the e-mail waits at the web hosting company that Bob is hiring. Bob's Internet connection goes down too often. And he is worried that his web hosting company may lose all his e-mail one day, because he has met the new apprentice. So Bob is thinking about switching to another provider. For those reasons, Bob has installed an internal mail server, so that all mail is stored on premises, and at least internal mail works without Internet.
 * A week later, the Internet connection is restored, and Bob's internal mail server receives the e-mail from the web hosting company, either over SMTP, or by fetching over IMAP with a tool like getmail.
 * However, Bob is now on holiday, so the internal mailbox server (the one internally serving mails over IMAP) generates an "on vacation" autoresponse.
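
The difference between "delayed" and "bounced" in the story above comes down to the class of the SMTP reply code: 4xx replies signal a temporary failure, 5xx replies a permanent one. The following Python sketch shows that decision from the point of view of Alice's server. The function name and the returned labels are invented for this illustration, and a server that cannot be reached at all (as in the story) would typically be handled like a temporary failure.

```python
def classify_smtp_reply(code):
    """Map a final SMTP reply code to what the sending server does next."""
    if 200 <= code < 300:
        return "delivered"    # accepted: the receiving server is now responsible
    if 400 <= code < 500:
        return "retry-later"  # temporary failure: keep the mail queued, maybe warn the sender
    if 500 <= code < 600:
        return "bounce"       # permanent failure: report to the sender and give up
    raise ValueError(f"not a final SMTP reply code: {code}")
```

For example, 451 (a local error on the receiving side) leads to a retry, whereas 550 (mailbox unavailable) leads to an immediate bounce.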

Be Careful when Generating Bounce E-Mails
The Internet is full of hackers, spammers, and people who cannot properly configure a mail server, so Bob has to be careful when receiving mails and automatically bouncing them.

There is one kind of problem called Backscatter that usually involves forging the sender address (e-mail spoofing). Because it is hard to harvest valid e-mail addresses, spammers sometimes try a few popular ones, like postmaster@example.com, info@example.com or sales@example.com. If those addresses do not exist, Bob's server will probably generate bounce e-mails to the forged sender addresses. That may annoy, or even flood, some random innocent person on the Internet. So Bob's mail server may land on some "known spammer sources" list (a bad reputation problem), which would then have a negative impact on Bob's mail service (other servers will stop trusting Bob's mail server).

This is what Alice and Bob can do to fight such spammers:


 * Alice should enable SPF, so that she cannot be impersonated so easily. There are other protection methods she could enable too. These techniques are just an attempt: there is no definitive way to tell whether an e-mail is spam.
 * Bob's SMTP server should reject mail as early as possible during the first incoming SMTP connection, so his server is not going to be the one generating the corresponding bounce e-mails. For example, if the recipient address does not exist, there is no need to accept the e-mail during the SMTP session: it can be rejected straight away with an SMTP error code. Such an early rejection saves processing power on Bob's mail server, because spoofed mails do not go further down in the processing chain.
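
As a minimal sketch of such an early rejection, this is roughly the decision Bob's server takes for each RCPT TO command. The recipient list and the function name are hypothetical; a real mail server would look the address up in its user database.

```python
# Hypothetical list of valid addresses on Bob's server.
KNOWN_RECIPIENTS = {"bob@example.com", "info@example.com"}


def rcpt_to_reply(address, known=KNOWN_RECIPIENTS):
    """Return the SMTP reply (code, text) for one RCPT TO command."""
    # Most servers compare addresses case-insensitively, so this sketch does too.
    if address.strip().lower() in known:
        return 250, "2.1.5 OK"
    # Rejecting here means the sending server has to deal with the failure,
    # so Bob's server never generates a bounce e-mail for this message.
    return 550, "5.1.1 No such user here"
```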

Unfortunately, e-mails cannot always be immediately rejected during the first incoming SMTP connection. It may take too long to do a virus check, and the SMTP connection may time out. Or the SMTP server does not know whether Bob is on holiday at the moment. The corresponding bounce e-mails can only be generated later, after the SMTP delivery has reported success.

Bob needs to be careful then, because his mail server will be the one generating the bounce e-mails:
 * Bob should not generate automatic bounce e-mails if the sender is suspect. For example, if it fails the SPF check.
 * Bob should not generate too many automatic bounce e-mails in a short time, because that can quickly tarnish his sender reputation. For example, "on vacation" responses are usually sent only once a day per sender. Other such automatic responses should be throttled. There is no need to inform 1,000 senders per minute that their e-mail has bounced because it contained a virus. Some sources mention an industry-standard bounce rate of 2 % at most. If your mail server is getting so many such e-mails per minute, it is probably under attack, and should start silently dropping those e-mails.
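
Both rules can be combined in a few lines. The following Python sketch allows at most one automatic reply per sender and per day, and never replies to a sender that failed the SPF check. The class name is invented, and the SPF result is assumed to have been determined elsewhere and passed in as a boolean.

```python
import time


class AutoReplyThrottle:
    """Allow at most one automatic reply per sender within a time window."""

    def __init__(self, window_seconds=24 * 60 * 60):
        self.window_seconds = window_seconds
        self._last_reply = {}  # sender address -> time of the last automatic reply

    def should_reply(self, sender, spf_passed, now=None):
        """Return True if an automatic reply to this sender is acceptable."""
        if not spf_passed:
            return False  # suspect sender: never reply automatically
        now = time.time() if now is None else now
        key = sender.strip().lower()
        last = self._last_reply.get(key)
        if last is not None and now - last < self.window_seconds:
            return False  # this sender already got a reply recently
        self._last_reply[key] = now
        return True
```

This sketch keeps its state in memory only; a real implementation would have to persist it across restarts and expire old entries.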

The Importance of Instant E-Mail

 * In this day and age, people simply expect e-mails to arrive instantly. It is convenient. And there is no point in further losing market share to closed messaging or chat services, just because mail servers are slow to deliver e-mails. E-mail is a superior communication mechanism; it is a shame to let it degrade, or to prevent it from reaching its full potential, for no good reason.


 * Say that you are on the phone, spelling out your e-mail address to an acquaintance or to a new customer. The natural thing to do is to send a test e-mail straight away, in order to check whether the counterpart has got the spelling right. If e-mails are always delayed, it is a waste of time, and spelling mistakes will live longer than necessary.


 * Some authentication systems, like banking websites, send one-time codes by e-mail that are only valid for a short time. Rapid reception is then a must. Such automated e-mails may be misclassified as spam, so quick reception should include the spam folder too.

= Internal vs External (Exposed) Mail Server =

You certainly cannot afford to expose your mail server on the Internet. That is just crazy. The Internet is full of spammers and automated hackers just waiting for people like you, who do not have the resources to keep their mail servers patched and secure. Professional hosters have fallen in the past.

Even if software updates are automatic, you cannot really leave such systems unattended for long: you need to monitor at least whether they are still updating themselves. I would never install a mail server that craves attention in any way. In a small business, most people cannot keep up with the latest software releases. The most you can hope for is not to lag too far behind. You have probably heard of Windows XP and Windows 7 staying in operation years after their end of life. Big public and private organisations have been known to miss such deadlines too.

Besides, you cannot match the service level of a dedicated mail hoster in terms of high availability.

You also do not want to worry about:
 * getting a static IP address, or investigating some DynDNS workaround
 * setting up DNS records
 * configuring and testing SPF or DKIM
 * obtaining SSL certificates
 * dealing with spam and viruses, maybe filtering by geographical location
 * fighting e-mail reputation issues
 * defending against denial of service attacks that use excessive bandwidth or server resources with measures like fail2ban
 * working around port 25 (SMTP) restrictions from your Internet or hosting provider

Unfortunately, most of the mail server guides you will find on the Internet assume that you are crazy enough to expose your server on the Internet, and do not feature any prominent warning about such security and operational aspects. My guess is that the people who write these guides are all evil sadists. 8-)

Therefore, the only sane configuration for your mail server is to be a secondary/satellite server to your web hoster's mail server.

There is one drawback though: your provider can read all incoming and outgoing mail. This is the price we will have to pay in this configuration.

Most Internet Service Providers provide very cheap but very basic mail services with reasonable virus and spam filtering. E-mail is a commodity item nowadays. If you are dissatisfied, changing providers is relatively easy. Your provider's mail server will be the one accepting external SMTP connections and doing virus and spam filtering. Standard e-mail security features like SPF will also be handled by your provider.

Your mail server will live inside your LAN and not be accessible from the outside. It will download all e-mails from your provider's mail server at regular intervals (or maybe even immediately upon e-mail reception), and delete them from the provider, either straight away or maybe after a few days. Such collective e-mail downloads are usually performed by a specialised tool like fetchmail or getmail. It is no longer so important that your mail server is not hardened and completely up to date, because your server sits behind your firewall and behind the provider's mail server.

This recommendation applies even if you run a hosted mail server on the Internet. Keeping your server protected (not exposed at all) is the only sensible approach if you are not a mail or hosting professional.

As a bonus, if your internal mail server goes down, e-mails will not immediately bounce back to their senders. Internal mail server outages will largely go unnoticed outside your LAN.

This constellation [external mail server from hoster + internal mail server] is actually pretty common. Microsoft used to sell a rather popular product called Small Business Server, which included a mail server called Microsoft Exchange Server. Exchange had a plug-in called POP3 Connector which was designed to collect e-mails from the hoster's mail server. Many comparable e-mail download products for internal mail servers still exist today. So do not listen to any scaremongers spreading fear, uncertainty and doubt about this kind of mail server setup.

With a mail server on premises, should the Internet connection go down, internal e-mails will continue to work.

If a sysadmin needs access to your internal mail server from the outside, you can set up an SSH server. If general users need access to e-mail, and maybe remote desktop and file server access too, you can set up a VPN, which is not very difficult. Incidentally, I have written a guide for installing OpenVPN, but there are many more on the Internet.

Nothing else, other than SSH and VPN, should be accessible from the outside.

Well, if you really want to push your luck, you could expose the IMAP port on the Internet, so that users have convenient and instant e-mail while on the road. But then you should also implement some traffic limiting or shaping, because synchronising a big mailbox over the Internet can eat most of your upload bandwidth and impact office users for a while. And you will need to start monitoring your mail server more closely.

Mailbox Strategy on your Provider's Server
There are several strategies for the mailboxes on your provider's mail server:


 * Each user has a separate mailbox on your provider's mail server. Assuming that your internal server does not delete the e-mails from the provider immediately after downloading them, your users have access to their newest e-mails when outside the office even without a VPN. Sending e-mails from outside the office always works, either with a mail client, or with the web interface the hosting company normally provides. Drawbacks are:
   * You need to create a mailbox per user on the provider too, which means more administration.
   * Most users will find it somewhat confusing. For example, there will be 2 "sent" folders: one of them is used when on the road, and another one when inside the office. Even if you automate everything, 2 copies of the same e-mail may live for some time.


 * Your hoster provides a single "catch-all" e-mail mailbox for your domain. This is usually called a "multidrop" or "domain mailbox". A single internal tool can download all e-mails for all users at once, making server management easier. There is no reason why the standard virus and spam protection from the provider should not apply to this single mailbox too. The main problem is that your hoster can no longer immediately reject (inside the SMTP connection) e-mails to unknown recipient addresses, which can make your setup vulnerable to backscatter.


 * Your hoster provides e-mail aliases that operate like a multidrop mailbox. This feature is standard nowadays. You need to create a single mailbox, and then one alias per internal user. A setup with e-mail aliases is the right compromise, so it is the configuration that this guide will focus on. Before you go any further, you need to check whether your provider limits the number of aliases per mailbox. If there is a lowish limit, you can always use more than one mailbox, but that would make your configuration more complicated.

= Enable SPF at your Mail Provider =

Do not forget to enable SPF on your provider's mail service. SPF is your first line of defence against hackers and spammers forging e-mails that pretend to have been written by you. Nowadays, SPF is considered to be a basic e-mail protection measure.

If there is no easy "enable SPF" option at your provider, enabling it actually boils down to adding a DNS TXT record like this:

your.domain.com TXT  v=spf1 mx a include:outgoing-mail-server.your-hoster.com -all

There are many guides on the Internet about enabling SPF, and you can use one of several online services to check whether the DNS SPF entry is correctly configured.

If you publish an SPF record, it does not really make sense to say "well, I think so, but I am not really sure". So I would use "-all" and not "~all", in order to provide the watertight protection you would expect from a scheme like SPF.
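
To check which of the two variants a published record actually uses, the text of the record can be inspected with a few lines of Python. This sketch only looks at the "all" mechanism and is no substitute for a full SPF validator; the function name is made up, and fetching the TXT record from DNS is left out.

```python
def spf_all_qualifier(record):
    """Return the qualifier of the 'all' mechanism: '+', '-', '~' or '?'.

    Returns None if the text is not an SPF record or has no 'all' mechanism.
    """
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return None
    for term in terms[1:]:
        lowered = term.lower()
        if lowered == "all":
            return "+"  # a missing qualifier means "pass"
        if lowered[1:] == "all" and lowered[0] in "+-~?":
            return lowered[0]
    return None
```

For the example record shown earlier, the function returns "-" (hard fail), whereas a record ending in "~all" yields "~" (soft fail).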

= Find Out How Your Provider's Mail Server Works =

Go to your web hoster and create a "catch all" mailbox like multidrop@example.com, or a regular mailbox with a couple of aliases. Then connect to that mailbox with either your provider's webmail interface, or with an e-mail client like Thunderbird.

Send a few e-mails to addresses like test1@example.com and test2@example.com. They should all land in the same multidrop@example.com mailbox. If you are using aliases instead of a multidrop mailbox, send the e-mails to different, existing aliases. Now look at the full e-mail headers in each e-mail, and find out which header has the original recipient, like test1@example.com.

That header records the envelope recipient, and is what we will be using later to deliver the e-mails to the separate, internal user mailboxes. If the provider uses Postfix, then the header name will probably be X-Original-To.

You should do a further test, just to make sure you will not be having trouble later on. Send an e-mail to test1@example.com, and blind-copy (BCC) test2@example.com. You should get the same e-mail twice, each with a different envelope recipient header. If not, your provider's mail server is not configured properly, which is not a good sign in this day and age.
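
If you save the test e-mails to files, a short Python script can list the headers that are likely to carry the envelope recipient. The candidate list below is an assumption: X-Original-To is the Postfix header mentioned above, while Delivered-To and Envelope-To are names used by some other setups. Your provider may use something else entirely, so the full headers still need to be checked by eye.

```python
from email import message_from_string

# Header names that commonly carry the envelope recipient (an assumption,
# to be verified against your provider's actual headers).
CANDIDATE_HEADERS = ("X-Original-To", "Delivered-To", "Envelope-To")


def envelope_recipients(raw_message, headers=CANDIDATE_HEADERS):
    """Return {header name: [values]} for the candidate headers that are present."""
    msg = message_from_string(raw_message)
    found = {}
    for name in headers:
        values = msg.get_all(name)  # None if the header is absent
        if values:
            found[name] = [value.strip() for value in values]
    return found
```

In the BCC test, the two copies should differ in exactly this header: one showing test1@example.com and the other test2@example.com.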

= What Your Internal Mail Server Needs =

A mail server is an easy concept to describe. It is a piece of software that:


 * Receives e-mail.
 * Sends e-mail.
 * Stores e-mail.
 * Serves stored e-mails to mail clients.

It is like a virtual post office. Of course, it does get rather complicated, but a real post office is complicated too.

If you want to succeed in the small organisation market, your best bet is to make it as simple as possible. That is why Microsoft Exchange has been very successful. But a mail server is a boring, standard product, that stops making money at some point in time. So Microsoft has started abusing this product, making it more expensive, and pushing cloud subscriptions instead.

E-mail is actually a critical part of your communications infrastructure, so you should not depend on the whims of a big, private company that can do shenanigans like spy on you, or declare that your mail server licence is no longer valid. I'll give you an example: if you exceed the maximum mail database size for the licence you bought, the Microsoft Exchange Server will randomly stop working, and you need to restart it manually when that happens. I could not believe that Microsoft would do such a thing to their customers, but the related system log event actually tells you: "If the physical size of this database minus its logical free space exceeds the limit of 18 GB, the database will be dismounted on a regular basis."

Software Components
Unfortunately, the free software landscape is no good for small organisations. All free mail software I have seen seems to follow these 2 Unix philosophy principles:


 * Write programs that do one thing and do it well.
 * Write programs to work together.

So people started writing separate software tools for sending e-mail, for receiving e-mails, for storing e-mail, and so on. For reception there are even 2 kinds: Mail Submission Agent (MSA) and Mail Delivery Agent (MDA). This separation cannot work well for such a simple, coherent idea of a post office. So you end up with a mess of tools with many integration problems. But even though the Unix philosophy breaks down, there is no other choice at the moment.

You will need the following components in your local mail server:


 * A Mail Storage that understands the IMAP protocol. This guide will focus on Dovecot.
 * A Mail Transfer Agent (MTA) for sending mail that understands the SMTP protocol.
 * A tool to move mail from the external to the internal mail server (the e-mail reception part). This guide will use getmail.

These are examples of integration breakage caused by those 2 Unix philosophy principles:


 * When an SMTP server receives an e-mail, it does not know directly whether the recipient is out of office, whether the destination mailbox is full, or even if the recipient address exists. So it often cannot tell the sender straight away in the same SMTP connection. Instead, it needs to generate a bounce message (or an automatic reply) later on, which then increases vulnerability to backscatter. If the SMTP server were integrated with the mail storage, handling these situations would be less problematic.


 * Your mail client has to upload an outgoing e-mail twice: once to the SMTP server, which sends it, and then to your mail storage, to place a copy in the "sent items" folder. Upload speeds are often much lower than download speeds, so sending big documents can take a long time. To top it all, if the sending succeeds, but the storing fails, you will have no record of the e-mail you sent.


 * You do not want to give your users 2 sets of credentials: one for sending e-mails (SMTP), and another one for receiving e-mails (IMAP). So your SMTP server needs to jump through hoops in order to get the user's IMAP credentials for authentication purposes.

Choosing an IMAP Server
There is no shortage of mail storage servers around:

 * Dovecot
 * Cyrus IMAP
 * Courier-IMAP
 * Archiveopteryx
 * DBMail
 * ... and many more ...
There are some common traits among many of these servers, which this document discusses in the next sections.

Wrong Focus
The most popular IMAP servers are highly-complex software, written in C or C++, designed for enterprise deployment, and optimised for throughput and low resource usage. My guess is that they are all geared towards high-volume hosting providers.

This focus is completely wrong for a small organisation. Performance does not really matter. Ease of use, simple foolproof configuration, and helpful troubleshooting are paramount. What we need is a mail server implemented in a script language, perhaps at most in Java, with fewer opportunities for security holes, and easy to tinker with.

All the popular servers listed above only receive e-mail (but not over the usual SMTP), and then serve it to the users via POP3 and/or IMAP4. In fact, they are conceptually just a message storage, like a database for e-mails.

They do not send e-mail at all. This is not immediately obvious, and goes against common intuition regarding what a mail server should be. See the discussion above about the Unix philosophy principles for more on this subject.

For example, you could argue that Microsoft Exchange is a complete mail server, and Dovecot etc. are only one half of it, or even less than half, because you need a separate server to receive e-mail over SMTP (or a tool to collect the e-mails from your mail provider), and yet another server to send e-mails over SMTP.

Over the years, I have noticed that most people soon feel at home with Microsoft Outlook, and many system administrators come to terms rather quickly with Microsoft Exchange. They seem to have the right focus (more or less), and that is what I miss in the free software counterparts I have seen.

The IMAP Protocol
Nowadays most people use the standard IMAP protocol in order to access a mailbox. The older POP3 protocol has some quirks and should be avoided. There are of course other proprietary protocols like Microsoft Exchange ActiveSync.

However, after some time, I have come to realise that there is something wrong with IMAP.

I am no IMAP expert, and I haven't found any document yet that clearly outlines its deficiencies. The official protocol documentation, RFC 3501, is hard to read and does not seem to have any "caveats" section. It is the same lack of honesty that we all somehow have come to accept in virtually all software documentation.

For starters, IMAP defines the term "mailbox" as an e-mail folder, which is very confusing because everybody else uses mailbox to refer to an e-mail account.

The next thing I noticed in RFC 3501 is the definition of a server status field. A specific example with the text "[ALERT] System shutdown in 10 minutes" is given. Such information does not really belong in a mailbox protocol, so no mail client I know of makes use of it. Imagine for a moment that you are designing a mail client: how are you supposed to display such information (which may result from every command) to your users, and what are the users going to do with it? And the standard makes that behaviour compulsory: "The human-readable text contains a special alert that MUST be presented to the user in a fashion that calls the user's attention to the message". This is a mail protocol, so if the provider wants to notify me, they could send me an e-mail (which is what they normally do). And if the provider wants to spam me with compulsory notifications, they can send me, you guessed it, spam e-mails! But there is an upside: this kind of protocol definition is a good forewarning about what you can expect from the IMAP protocol and its authors.

Yet another quirk is that the IMAP4rev1 specification (RFC 3501) has a COPY command, but no MOVE command. Unbelievable. The MOVE command is a later addition with RFC 6851. Therefore, if an e-mail client wants to move an e-mail to the "deleted" folder, it must be prepared to copy it first (in case the server does not support MOVE). This is in fact what Thunderbird seems to do. But the copy operation may fail if the mailbox is full. Yes, deleting an e-mail may fail if the mailbox is full. How cool is that. This was brought to my attention by the excellent FAQ "When is What ... Deleted, Expired, Expunged or Purged?" from Cyrus IMAP.
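The fallback that a client has to carry around can be sketched with Python's standard imaplib module. This is only a minimal sketch, not a description of what Thunderbird actually does: the function name move_message is made up for illustration, conn is assumed to be an already logged-in imaplib.IMAP4 connection with the source folder selected, and it assumes that the installed imaplib accepts MOVE as a UID command.

```python
import imaplib


def move_message(conn, uid, target_folder):
    """Move one message by UID, falling back to COPY + \\Deleted + EXPUNGE.

    'conn' is assumed to be a logged-in imaplib.IMAP4 object with the
    source folder already selected. Returns the strategy that was used.
    """
    if "MOVE" in conn.capabilities:
        # RFC 6851: the server moves the message in a single step.
        typ, _ = conn.uid("MOVE", uid, target_folder)
        if typ != "OK":
            raise imaplib.IMAP4.error("MOVE failed")
        return "move"

    # No MOVE support: copy first. This is the step that can fail when
    # the mailbox is full, even though the user is "deleting" something.
    typ, _ = conn.uid("COPY", uid, target_folder)
    if typ != "OK":
        raise imaplib.IMAP4.error("COPY failed, message not moved")
    conn.uid("STORE", uid, "+FLAGS", r"(\Deleted)")
    # A plain EXPUNGE removes every message flagged \Deleted in the folder.
    conn.expunge()
    return "copy"
```

Note that the fallback path is not atomic: if the connection drops after the COPY, the message exists in both folders.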

The New Mail Notification Problem
IMAP's RFC 3501 recommends that mail clients poll for new e-mails with command NOOP. Polling over a network connection is of course undesirable, so most clients like Thunderbird allow a minimum of 1 minute, which is still too long in this day and age.

I hear you thinking: no problem, I have a fast Internet connection, so I'll hack Thunderbird, or use another tool, to poll every second. But frequent polling consumes network and computing resources, so most mail providers implement a limit to the polling interval.

IMAP is a protocol specifically designed for mailbox access, so this shortcoming is not really justifiable.

The first attempt at fixing this issue was a new command called IDLE, specified in the separate RFC 2177. The main IMAP RFC 3501 does not mention the existence of this command; that would be far too user-friendly.

The trouble with IDLE is that it is only valid for one folder. That would normally be the "inbox". But this is bonkers. Say you receive an e-mail that the server misclassifies as spam. Your mail client will not see the new e-mail immediately because the message lands in the "spam" folder, and not in the "inbox" folder. How stupid is that? Just imagine the standard user, who is accustomed to instant reception on the "inbox" folder. The bank sends him/her a short-lived, 2-factor authentication code, with the hint "if you do not receive it, look in your spam folder". However, the user has the "spam" folder open, and the expected e-mail is not there either. But it would be, if IDLE worked for that folder too.
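One workaround a client could use is to keep IDLE on the inbox and poll the remaining folders with the STATUS command. The following is a minimal sketch with Python's imaplib, not a recommendation: the function name and the folder names are only illustrative, and conn is assumed to be a logged-in imaplib.IMAP4 connection.

```python
import re

_UNSEEN = re.compile(rb"UNSEEN (\d+)")


def unseen_counts(conn, folders):
    """Return {folder: unseen count}, using one STATUS command per folder.

    The count is None when the server refuses the STATUS command,
    for example because the folder does not exist.
    """
    counts = {}
    for folder in folders:
        # Quote the name so that folders containing spaces work too.
        typ, data = conn.status('"%s"' % folder, "(UNSEEN)")
        match = None
        if typ == "OK" and data and data[0]:
            match = _UNSEEN.search(data[0])
        counts[folder] = int(match.group(1)) if match else None
    return counts
```

A connection that is sitting in IDLE cannot issue other commands, so this polling would need a second connection, which is exactly the kind of extra complexity that IDLE was supposed to avoid.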

A few years later (the clock runs slowly in the mail server world), someone came up with the second attempt, the IMAP NOTIFY command described in RFC 5465. Unfortunately, Thunderbird does not implement support for IMAP NOTIFY yet, and neither does K-9 Mail.

The Expunge Antifeature
Deleting an e-mail is a 2-step operation in IMAP: first, you mark the e-mail with the \Deleted flag. At a later point in time, you issue an EXPUNGE command, which actually deletes the e-mail. But mail clients tend to delay the EXPUNGE command, opening a window for interesting effects.

Some clients like Thunderbird do not show e-mails marked as deleted by default, but others, like Roundcube, do show them with a strikethrough effect by default. So you delete an e-mail with Thunderbird, but it is still there, only you do not see it anymore. This is annoying, but more annoying is that the Thunderbird developers do not want to improve the situation. If performance were a concern when deleting many e-mails, Thunderbird could issue an EXPUNGE at the end, or after a short delay. With the current state of affairs, automated tools like getmail may then pick such deleted e-mails up, even a long time after the e-mails have been apparently deleted.
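To see whether a folder still holds such "invisible" messages, you can search for the \Deleted flag yourself. This is a minimal sketch with Python's imaplib; the function names are made up for illustration, and conn is assumed to be a logged-in imaplib.IMAP4 connection with the folder selected in read-write mode.

```python
def lingering_deleted_uids(conn):
    """Return the UIDs of messages flagged \\Deleted but not yet expunged."""
    # The None argument is the optional charset, which imaplib skips.
    typ, data = conn.uid("SEARCH", None, "DELETED")
    if typ != "OK" or not data or not data[0]:
        return []
    return [int(uid) for uid in data[0].split()]


def purge_deleted(conn):
    """Expunge the selected folder and report how many messages were flagged."""
    uids = lingering_deleted_uids(conn)
    if uids:
        conn.expunge()
    return len(uids)
```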

The following excellent article describes the EXPUNGE behaviour with more detail:


 * Thunderbird, IMAP, Expunge or How to Really Delete Emails from the Server
 * IMAP plays hide-and-seek
 * https://news.softpedia.com/news/thunderbird-imap-expunge-or-how-to-really-delete-emails-from-the-server-492525.shtml

The Naming of the Special Folders
Something is wrong with the naming of special IMAP folders like "inbox" or "deleted". Sometimes, e-mail clients do not identify them properly. The cause is often different languages, like an English IMAP server and a German mail client. But I have seen in my inbox 2 folders named "Draft" and "Drafts", so it is not only a language issue.

I never found the reason behind this. Some e-mail clients allow you to manually specify the server names of such special folders, but this should not be necessary.

No Concurrent Operation
IMAP only supports one command at a time. If you are downloading a big attachment, you cannot do anything else on that connection. So mail clients like Thunderbird have a setting to initiate multiple connections. This makes connection handling more complicated.

No Mail Filters
Mail filters are a standard feature nowadays, but the IMAP specification does not say a word about them.

IMAP defines message search features, but no message filter capabilities. Not even the standard "out of office / on vacation" autoresponse is covered. I am guessing that the IMAP protocol is not easily extensible, because people have resorted to implementing a new protocol called ManageSieve that runs on a separate TCP connection. Needless to say, this creates interoperability issues. See also the Sieve mail filtering language.

The mbox Format
The mbox format is the worst idea ever: all your e-mails are stored in a single text file, one after another.

The list of drawbacks is rather long:


 * If this one file gets corrupted, you risk losing everything.
 * Backing your e-mails up means scanning the whole file in case some byte in the middle has changed since the last backup.
 * Many usual mail storage features, like message indexing, have to be implemented in separate files that point to the main mbox. And these separate files can of course easily get out of sync.
 * Deleting e-mails leaves holes, so you need to "compact" the mbox file every now and then.
 * There is no single standard, but a family of very similar but slightly incompatible formats.

I could go on, but it is not worth it. Because mbox is the worst possible choice, it is only natural that my favourite mail client Thunderbird uses it. This way, you can experience all those drawbacks yourself. The Thunderbird project is of course aware of the problem, and a message store based on maildir is on the way. It has been for many years.

In theory, a Mail Delivery Agent (MDA) should be able to deliver mails to an mbox, no matter what e-mail client or server is using it. It is obvious that this cannot work, so it does not in practice. Using an independent MDA is mostly discouraged, due to concurrency issues (file locking) and small file format differences, so each mail server brings its own tool to deliver mail locally.
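The single-file layout and the "holes" problem described above can be observed with Python's standard mailbox module. This is only a small experiment under assumed names (the file name inbox.mbox and the example addresses are made up), and it shows how Python's own mbox implementation behaves, not how any particular mail server does it.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def make_message(subject):
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "account1@example.com"
    msg["Subject"] = subject
    msg.set_content("Body of " + subject)
    return msg


def count_separators(path):
    """Count the 'From ' lines that mark the start of each message."""
    with open(path, "rb") as f:
        return sum(1 for line in f if line.startswith(b"From "))


def mbox_demo(directory):
    """Add three messages, delete the middle one, report what is on disk."""
    path = os.path.join(directory, "inbox.mbox")
    box = mailbox.mbox(path)  # file locking is omitted for brevity
    keys = [box.add(make_message("message %d" % i)) for i in range(3)]
    box.flush()
    before = count_separators(path)
    # Removing a message in the middle makes Python's mailbox module
    # rewrite the whole file on the next flush.
    box.remove(keys[1])
    box.flush()
    after = count_separators(path)
    box.close()
    return before, after, os.listdir(directory)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(mbox_demo(tmp))
```

All messages end up in the one file, separated only by lines starting with "From ", which is why a backup tool sees the whole mailbox as changed after every delivery or deletion.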

The Maildir Format
What do you do when you realise that mbox is a bad idea? You follow the Unix philosophy and design a second-worst disk format called Maildir, where every e-mail is stored in its own file.

Because, you know, filesystems, backup applications and even users love directories that contain many many thousands of files.

As with mbox, the same theory applies here: a Mail Delivery Agent (MDA) should be able to deliver mails to a Maildir, no matter what e-mail client or server is using it. It is obvious that this cannot work, so it does not in practice. Using an independent MDA is mostly discouraged, due to concurrency issues (file locking) and small file format differences, so each mail server brings its own tool to deliver mail locally.
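The one-file-per-message layout can be seen with the same Python mailbox module. Again, this is just a small experiment with made-up names, showing what Python's Maildir implementation writes to disk; real mail servers add their own extensions on top.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def maildir_demo(directory, how_many=3):
    """Deliver a few messages and return (top-level layout, files in new/)."""
    path = os.path.join(directory, "Maildir")
    box = mailbox.Maildir(path, create=True)
    for i in range(how_many):
        msg = EmailMessage()
        msg["From"] = "sender@example.com"
        msg["To"] = "account1@example.com"
        msg["Subject"] = "message %d" % i
        msg.set_content("Body %d" % i)
        box.add(msg)  # written to tmp/ first, then moved into new/
    box.close()
    layout = sorted(os.listdir(path))
    delivered = os.listdir(os.path.join(path, "new"))
    return layout, len(delivered)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(maildir_demo(tmp))
```

Each delivered message becomes a separate file under new/, so a mailbox with 50,000 e-mails means a directory tree with 50,000 files.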

A Custom Disk Format Optimised for Your Specific Mail Server
At some point in time, even the most hard-core Unix philosophers realised that mbox and Maildir are not really the way to go.

Therefore, a new, custom disk format for mail servers must be designed. It must be specific for mail servers, because the type of work they do is very special, so that unique optimisations can squeeze a sizeable amount of performance. Besides, mail server maintainers have vast amounts of development resources at hand, and are experts in complex data structures, so they are the perfect engineers for this work. These are ideal conditions for this kind of advanced not-invented-here technology.

This is how Dovecot's multi-dbox format must have been born. I do not really know much about it, partly because it is not well documented, but my guess is that it performs really well in the relevant benchmarks, and probably in real life too. If some corner case breaks, it's somebody else's e-mails after all.

SQL Database
Let's get serious for a moment: if you need reliability, safe concurrency, good overall performance, and advanced features like online backup or high availability clusters, you should probably choose a tool specifically designed for and tested in such scenarios. I am talking about a standard SQL database that implements ACID, something like Postgres.

Anything else is not worth considering in general, unless you think that losing the odd e-mail (or a bunch of them) is acceptable. A tried-and-true SQL server also comes with tools to deal with partially-corrupted databases. And you (or any third-party tool) can access the data inside with standard SQL techniques, should you need something special at some point in time.
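To illustrate what "ACID" buys you for mail storage, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for a real server like PostgreSQL, so that the example is self-contained. The table layout and function names are hypothetical and do not correspond to the schema of DBMail, Archiveopteryx or any other server.

```python
import sqlite3


def open_store(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " id INTEGER PRIMARY KEY,"
        " account TEXT NOT NULL,"
        " folder TEXT NOT NULL,"
        " raw BLOB NOT NULL)"
    )
    return db


def deliver_to_all(db, accounts, raw):
    """Store one message for several accounts, all or nothing."""
    # The connection context manager commits on success and rolls back
    # if any insert raises, so no account gets a partial delivery.
    with db:
        for account in accounts:
            db.execute(
                "INSERT INTO messages (account, folder, raw) VALUES (?, ?, ?)",
                (account, "INBOX", raw),
            )


def count_messages(db, account):
    row = db.execute(
        "SELECT COUNT(*) FROM messages WHERE account = ?", (account,)
    ).fetchone()
    return row[0]
```

The last function also shows the other advantage mentioned above: any third-party tool can inspect the mail store with ordinary SQL.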

Let's look at some of the IMAP server contenders:


 * Dovecot cannot use a standard database, so it fails in this subject.


 * Courier IMAP only supports an (extended) Maildir format, which is even worse.


 * Cyrus IMAP can use an SQL backend, but it's not its primary choice, so it does not really inspire much confidence.


 * Archiveopteryx uses PostgreSQL, which is fine.


 * DBMail supports several popular database backends and looks like the sanest choice in this regard.

= Set up Dovecot =

Yes, I know, Dovecot may not be the ideal IMAP server. But it is the only one I have (partly) learned. I am sorry!

Install Dovecot
On Ubuntu/Debian:

sudo apt-get install dovecot-imapd  dovecot-managesieved  dovecot-lmtpd  dovecot-submissiond

Just by installing the packages, the dovecot service starts automatically. That does not make any sense, because we have not configured the server yet, so stop and disable it now:

sudo systemctl disable --now dovecot.service

The Old Local Mail
Many many years ago, there was a central, very expensive Unix computer with many user terminals. Each user had their own mailbox and was often greeted with "you have new mail" when logging on via a text console. There was no Internet at the time.

Times have changed since then, and e-mail now means something completely different. Nobody reads such local mails anymore, and the old greeter message is even turned off by default as far as I can tell.

However, old traditions die hard. A standard Ubuntu installation still has a mail user account with /var/mail as home directory. There you will find a big text file for each user in mbox format, which is a text-based format. If you install a Mail Transfer Agent like Postfix for whatever reason (maybe automatically as a result of a package dependency), things will automatically start sending e-mails to local users, and those mailbox files will start to grow.

For example, I once added a cron job and forgot to append the usual ">/dev/null 2>&1" to the command. The cron service then sent a local mail to the root user with the command's output every time it ran. After a few months, I had "70102 messages 70102 unread", and the mailbox file had grown to 60 MiB.

Somebody should tell the Ubuntu/Debian developers that we now live in year 2020. Linux should actually run forever without automatically filling the disk with rubbish. The usual excuse for this kind of misbehaviour is that many system tools or processes have no other way to send notifications. But no amount of convincing will get users to start reading local mail nowadays.

Because I expect that this local mail rubbish will sooner or later go away, I will avoid storing any modern mailboxes under /var/mail, so that a mail server like Dovecot has no chance of interacting or colliding with the old local mail system.

The default Dovecot installation on Ubuntu 20.04 does, however, integrate with the local mail through the following configuration settings:

mail_location = mbox:~/mail:INBOX=/var/mail/%u
# Typically this is set to "mail" to give access to /var/mail.
mail_privileged_group = mail

Dovecot will then litter your existing user home directories with subdirectories named ~/mail. At this point, I distrust the default Dovecot configuration, so we will create a complete new configuration from scratch.

About Virtual Users
You normally do not want to associate e-mail accounts with local user accounts, so we will be using the "virtual users" mode, which means that e-mail accounts will be independent from any other local system user accounts.

You could use some fancy centralised server, like LDAP or Samba's Active Directory, to link e-mail accounts to computer user accounts. But then there would be a dependency between servers, and upgrading the LDAP server would impact the mail server. In a small business, I recommend keeping the servers separate, even if it means creating more than one account per user (one file server account, and one e-mail account).

Dovecot Configuration
If you think that this article is somewhat funny, I will mention that Dovecot's manual starts with the claim that it is "simple to set up". After a few days' worth of investigation and tests, I can confirm that Dovecot's documentation is funny too.

We will start by deleting the default Dovecot configuration, because as I stated above, I do not trust it. Just rename file /etc/dovecot/dovecot.conf away and start a new one.

After making any changes to the configuration, you need to restart Dovecot like this (but remember that we have disabled it altogether until the first configuration is complete):

sudo systemctl restart dovecot.service

Choosing the Mail Storage Location
Dovecot is very flexible: you can place your mailboxes wherever you want. For security reasons, we will create a separate local user with permission to access the mailbox files for all virtual users, so I will place the mailboxes in that user's standard home directory. But you may choose some other mailbox storage location of your liking, or even change the new user's home directory to somewhere outside /home.

Most people name this new user vmail, probably because we will be using "virtual users". But it's still a bad username, so I will use dovecot-mailboxes instead. Create it like this:

sudo useradd --create-home  --user-group  --system  --shell /usr/sbin/nologin  dovecot-mailboxes

Option --system is only a convention so that the new user does not come up in the logon screen.

Later on, we will specify in the Dovecot configuration file where exactly under /home/dovecot-mailboxes the mailboxes will be placed. You do not need to create any files or subdirectories there, as Dovecot will create the mailboxes on first touch. Dovecot sets the top-level subdirectory permissions to drwx------ upon creation, so it seems safe.

Create the First E-Mail Accounts
Now you are ready to define your mailboxes. First of all, install a fully-fledged SQL database server, and then learn enough SQL to be able to send user account queries, and then...

I am kidding you. You just need to store a few usernames and passwords, so a simple text file will do. Many mail server guides want you to install PostgreSQL, or some other monster database server, or maybe even integrate with an LDAP directory server. But you cannot realistically ban people from ever writing mail server guides for the unwary, can you?

So create file /home/dovecot-mailboxes/dovecot-passwd with the following contents:

account1@example.com:{PLAIN}pass::::::
account2@example.com:{PLAIN}pass::::::

You do not even need to change the example addresses above to your real e-mail addresses, you can actually use account1@example.com for your first IMAP connection test.
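If you want to sanity-check such a file before pointing Dovecot at it, a few lines of Python are enough. This is a minimal sketch based on my understanding that each line follows the passwd-like layout user:password:uid:gid:gecos:home:shell:extra, with an optional {SCHEME} prefix on the password; the class and function names are made up for illustration.

```python
from typing import List, NamedTuple, Optional


class PasswdEntry(NamedTuple):
    user: str
    scheme: Optional[str]
    password: str
    rest: List[str]  # uid, gid, gecos, home, shell, extra fields


def parse_passwd_line(line):
    """Parse one line of the password file, or return None if it is blank."""
    line = line.strip()
    if not line:
        return None
    fields = line.split(":", 7)
    if len(fields) < 2:
        raise ValueError("missing password field: %r" % line)
    user, secret = fields[0], fields[1]
    scheme = None
    if secret.startswith("{") and "}" in secret:
        # "{PLAIN}pass" becomes scheme "PLAIN" and password "pass".
        scheme, secret = secret[1:].split("}", 1)
    return PasswdEntry(user, scheme, secret, fields[2:])


def plaintext_users(lines):
    """Return the users whose password is stored with the PLAIN scheme."""
    entries = (parse_passwd_line(line) for line in lines)
    return [e.user for e in entries
            if e is not None and (e.scheme or "").upper() == "PLAIN"]
```

A check like plaintext_users() is handy later on, when the test passwords are supposed to have been replaced with hashed ones.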

The Dovecot service, which runs with user 'dovecot' and group 'dovecot', needs permission to read the password file. Therefore, containing directory /home/dovecot-mailboxes must have 'x' permission for 'other' users, which is the default under Ubuntu/Debian. The password file itself could have the following permissions, owner and group:

-rw-r----- dovecot-mailboxes  dovecot [...] dovecot-passwd

Such password files should actually use hashed or encrypted passwords, instead of the plain text password in the test content above, so it would not matter much that the file is readable by everyone. But you should always tighten file access permissions, just in case.
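A quick way to verify that the password file is not readable by everyone is to look at its mode bits. The following is a small sketch using only the Python standard library; the function name is illustrative.

```python
import os
import stat


def describe_permissions(path):
    """Return (ls-style mode string, True if 'other' users can read)."""
    mode = os.stat(path).st_mode
    return stat.filemode(mode), bool(mode & stat.S_IROTH)
```

For the password file discussed here, the second value should be False.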

You can of course place the password file somewhere else on the filesystem if you like.

Create and Test the First Configuration
Place the following contents into file /etc/dovecot/dovecot.conf :

TODO

Now that Dovecot has a first complete configuration, you can enable and start it like this:

sudo systemctl enable --now dovecot.service

Check if Dovecot knows the first test e-mail account:

sudo doveadm user "account1@example.com"

You can now start a standard e-mail client like Thunderbird and connect over IMAP. The connection parameters are:

 * Server name: localhost
 * Protocol: IMAP
 * Port: 143
 * Connection security: none
 * Authentication: password, transmitted insecurely (normal password)
 * User name: account1@example.com
 * Password: pass

Note that sending (SMTP) will not work yet, so it does not really matter what you configure.

Your e-mail client should see the inbox, and you should be able to create an e-mail draft (but not send it).
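If you prefer a scripted check over clicking through a mail client, the same test can be sketched with Python's imaplib. The function name is made up, and the connection object is passed in so that the function itself does not hard-code any server details.

```python
import imaplib


def check_account(conn, user, password):
    """Log in, open the inbox read-only and return its message count."""
    conn.login(user, password)
    typ, data = conn.select("INBOX", readonly=True)
    if typ != "OK":
        raise imaplib.IMAP4.error("cannot open INBOX: %r" % (data,))
    count = int(data[0])
    conn.logout()
    return count
```

With Dovecot running as configured above, a call like check_account(imaplib.IMAP4("localhost", 143), "account1@example.com", "pass") should log in over plain IMAP and return the number of messages in the inbox.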

E-Mail Passwords in Plain Text over the Network
The mail server we are installing will have no SSL/TLS enabled, so all passwords will be sent in clear text over the network. That is clearly not secure.

But you will not be connecting to the mail server over the Internet, only over Ethernet switches in your private LAN, so the lack of transport security is not really a critical issue in this scenario. Skipping the security aspects saves us quite some work.

You can of course generate and distribute your own security certificates. Drop me a line if you come up with an easy guide that I could embed here.

= Downloading New E-Mails From Your Provider =

We are striving to build a secondary, internal mail server which is going to download all e-mails from your external mail provider.

If you expect some smart, easy, fast collaboration between mail servers in this respect, you haven't been paying attention. This is the mail software world after all. The best you can hope for is a quirky, external tool to do the job. You basically have 2 options: Fetchmail and getmail.

Fetchmail
Fetchmail is a bloated tool full of options that supports many outdated protocols or bad ideas like POP2 and SMTP ETRN that nobody should have ever used.

Some of the supported mail transfer methods don't have much in common. Only IMAP makes sense anyway.

After collecting the e-mails, Fetchmail wants to deliver them locally with SMTP by default. This is strange, as passing such e-mails to a local SMTP server should not be necessary and carries the risk of creating e-mail loops. You would normally want to deliver locally with a local Mail Delivery Agent (MDA) or with the Local Mail Transfer Protocol (LMTP).

IMAP Issues
It turns out that Fetchmail does not implement IMAP properly. From its documentation: "The IMAP code uses the presence or absence of the server flag \Seen to decide whether or not a message is new. This isn’t the right thing to do, fetchmail should check the UIDVALIDITY and use UID, but it doesn’t do that yet." This alone is a good reason to avoid fetchmail. Say there is a problem and you connect to your external mailbox with Thunderbird. Just by letting Thunderbird automatically mark an e-mail as "read", it will no longer be delivered to your local mail server.

Using UIDs to track read messages in the inbox would be the wrong take anyway. The smart thing to do would be to move processed messages to another folder, perhaps named "Retrieved by Fetchmail" or "Moved to the Internal Mail Server". This way, the current state of affairs would be obvious to anybody using the external mailbox directly. Messages can then be deleted based on their age, like the archiving function in mail clients. Perhaps you do need to keep track of them with their UIDs, but on that separate folder.

And of course, the one feature you do want, IMAP NOTIFY, is not supported.

Operational Issues
At least the documentation is reasonable, if not really helpful to the newcomer. An overview for the IMAP use case is missing. I could not find much information on the operational issues, like for example, how to deal with e-mails that permanently fail to be delivered (get "stuck"). There is no valuable guidance about how a sysadmin should implement this kind of mail retrieval service in a robust way.

There is an interesting hint in the manual page about such operational issues though: "If a given connection receives too many timeouts in succession, fetchmail will consider it wedged and stop retrying. The calling user will be notified by email if this happens." There is no way to adjust that number of timeouts. Another wedged condition is failed authentication. I could not find any indication about where Fetchmail stores those "wedged" flags, and whether those flags are only used in daemon mode.

The fact that Fetchmail wants to report operational errors by e-mail is also problematic. After all, Fetchmail is part of the mail infrastructure, so error reporting should use a separate channel.

One such error condition is when an e-mail is oversized. The smart thing to do would be to move problematic e-mails to a different folder, so that they stay out of the way in the next poll. The admin can then easily find and review those problematic e-mails. But Fetchmail seems to leave them in the inbox. I wonder whether those failed e-mails are marked as "read", to be skipped the next time around.
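The folder-based approach suggested above, for both retrieved and problematic e-mails, could look roughly like the following. This is only a sketch of the idea, not something Fetchmail does: the function name and the folder names "Retrieved" and "Failed" are hypothetical, both folders are assumed to exist already, conn is assumed to be a logged-in imaplib.IMAP4 connection with the inbox selected, and the server is assumed to support MOVE (RFC 6851).

```python
def drain_inbox(conn, deliver, done_folder="Retrieved", failed_folder="Failed"):
    """Deliver every message in the selected folder, then file it away.

    'deliver' is a callable that takes the raw message bytes and raises
    an exception when local delivery fails.
    """
    # Skip messages that a mail client has already flagged as \Deleted.
    typ, data = conn.uid("SEARCH", None, "UNDELETED")
    uids = data[0].split() if typ == "OK" and data and data[0] else []
    counts = {"done": 0, "failed": 0}
    for raw_uid in uids:
        uid = raw_uid.decode()
        try:
            # BODY.PEEK[] fetches the message without setting \Seen.
            typ, parts = conn.uid("FETCH", uid, "(BODY.PEEK[])")
            if typ != "OK" or not isinstance(parts[0], tuple):
                raise ValueError("could not fetch message %s" % uid)
            deliver(parts[0][1])
            outcome, target = "done", done_folder
        except Exception:
            # A bad message must not block the ones queued behind it.
            outcome, target = "failed", failed_folder
        conn.uid("MOVE", uid, target)
        counts[outcome] += 1
    return counts
```

With this design the inbox itself is the queue, so no separate state file is needed, and anybody looking at the external mailbox can see at a glance what has been processed and what got stuck.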

It is not properly documented, but if you read the documentation for "set no bouncemail" and other related options, you can piece together that there are 2 ways to generate such error e-mails: bounce messages and mail to the local postmaster with SMTP. I already mentioned that using the same mail infrastructure to report operational mail problems is perhaps not the best idea. You could argue that, if you configure your local SMTP server to just deliver local mail, then such e-mails are no longer part of the normal mail infrastructure, but that forces the admin to regularly check their local mail on the server, which is inconvenient.

The alternative, generating bounce messages, is risky because of backscatter. There seems to be no option to limit the number of bounces per sender or globally.

Single-Shot Operation
After some wrestling with its documentation, it dawned on me that Fetchmail's approach is fundamentally wrong. I guess that is the reason why there is no overview for the newcomer, because a good overview would make the problem too obvious. We are looking for a permanent link between mail servers, but Fetchmail is actually a single-shot tool: it goes through the configuration file, in a single-threaded fashion, and collects mail from each source mailbox. If one mailbox has a huge e-mail to download, or one server is slow to respond, delivery from any other mailboxes is unduly delayed.

It's like you have several computers and are looking for a cloud-based file sync solution, and they give you rsync. Or like you need a bridge between 2 islands, and all you get is a ferry. Yes, you can establish a permanent link between 2 islands with a ferry, but the service level is not the same, is it? And you need a captain constantly looking after it.

You may object now that Fetchmail has a daemon mode, but this mode is deceptive. It is just a loop around the single-shot logic. Fetchmail's own manual page states the following: "if you break the ~/.fetchmailrc file’s syntax, the new instance will softly and silently vanish away on startup". So you need to wrap Fetchmail with some systemd service and configure some sort of notification or alarm in case the service repeatedly fails. systemd services can already trigger at regular intervals, so Fetchmail's daemon mode is basically useless.

IMAP IDLE support is an afterthought that was no doubt implemented because it is useful. But it does not fit into the single-shot architecture. The manual page states: "Note that this works with only one account and one folder at a given time, other folders or accounts will not be polled when idle is in effect!" Therefore, by using IMAP IDLE, you have effectively turned Fetchmail into a single-mailbox, single-folder retrieval tool.

Therefore, if you want instant e-mail delivery with IMAP IDLE, and each user has a separate mailbox on your provider's (external) mail server, you need one Fetchmail instance per user. This setup is inconvenient for a small organisation, but still workable. Even if you are polling (instead of using IMAP IDLE), you may also want to run separate Fetchmail instances to prevent a single misbehaving external server, or a single problematic external mailbox, from affecting your entire mail pipeline.
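For comparison, this is roughly what independent per-mailbox retrieval looks like when the tool itself handles it. It is a generic sketch with Python's concurrent.futures, not a feature of Fetchmail: retrieve_all and fetch_one are hypothetical names, and fetch_one stands for whatever retrieves a single external mailbox.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def retrieve_all(mailboxes, fetch_one, max_workers=8):
    """Run fetch_one(mailbox) for every mailbox in its own thread.

    Returns {mailbox: result, or the exception that was raised}, so a
    slow or broken mailbox does not hold up or abort the others.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_one, box): box for box in mailboxes}
        for future in as_completed(futures):
            box = futures[future]
            try:
                results[box] = future.result()
            except Exception as exc:
                results[box] = exc
    return results
```

Running one Fetchmail instance per mailbox under systemd achieves the same isolation, just with more moving parts.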

So much for a tool praising itself (see its FAQ) as being "a one-stop solution to the remote mail retrieval problem". It goes on with "Fetchmail is not a toy or a coder's learning exercise, but an industrial-strength tool capable of transparently handling every retrieval demand from those of a simple single-user ISP connection up to mail retrieval and rerouting for an entire client domain. Fetchmail is easy to configure, unobtrusive in operation, powerful, feature-rich, and well documented." If you think about it, the mere existence of such delusional self-praise is already a red flag on the project.

To be fair, the Fetchmail project is not being completely dishonest. If you look at the Design Notes On Fetchmail page, as of December 2020, under section "Concurrent queries/concurrent fetchmail instances", you will find the following statement: "ESR refused to make fetchmail query multiple hosts or accounts concurrently, on the grounds that finer-grained locks would be hard to implement portably." So somebody is aware of the problem.

getmail
getmail is newer, and according to its documentation, "is designed to replace other mail retrievers such as fetchmail".

But it is a single-threaded, one-shot tool like Fetchmail, so it shares most of the limitations. IMAP IDLE support is an afterthought too, and using it effectively turns getmail into a single-mailbox, single-folder retrieval tool as well. There is no support for IMAP NOTIFY either.

For the end user, it is a small improvement overall, but nothing groundbreaking.

Little Practical Advice
getmail does not waste precious electronic space giving you an overview of how things are supposed to work from the operations point of view. So you are forced to search on the Internet and experiment.

For example, the documentation states "To retrieve messages from all mailboxes, you would use: mailboxes = ALL". First of all, the term "mailbox" is overloaded. Here we are talking about folders inside a mailbox. And then, retrieving all of them makes no sense, because getmail then looks inside folders like "Sent" and "Deleted Items", which is most probably not what you want or need. In fact, it would probably be better to implement a negative list of folders to skip.

Another thing that the documentation fails to mention is that you normally have 2 folders where new e-mails land: "Inbox" and "Spam/Junk". If your mail provider misclassifies an e-mail as spam, and you do not tell getmail to download the "Spam" folder too, you will never see the e-mail in your local mailbox. I certainly miss this kind of practical advice.

I guess that getmail expects you to do your own spam filtering, which is not really advisable in a small organisation with few resources. Your mail provider will probably do a better job. I would not filter e-mails inside getmail anyway; I would set up an incoming Sieve filter in Dovecot instead.

The IMAP IDLE support is somewhat confusing. The documentation for option --idle states "getmail should wait on the server to notify getmail of new mail in the specified mailbox after getmail is finished retrieving mail". You could then configure getmail to scan all IMAP folders and then wait for a single one. But then you would not get any new mails on the others, would you?

Furthermore, IMAP sessions tend to time out after 30 minutes of inactivity. The IMAP IDLE specification RFC 2177 states: "clients using IDLE are advised to terminate the IDLE and re-issue it at least every 29 minutes to avoid being logged off". However, getmail does not document what IDLE timeout it is using, if any, and what happens when getmail times out. Does it issue an IDLE again, perhaps rescanning all other configured folders first, or does it exit? If it exits, does the exit code depend on whether the server timed out an inactive connection?

Delivering Directly to Foreign File Formats
The documentation says things like "native safe and reliable delivery support for maildirs and mboxrd files", as if that were possible. Each mail server version is free to introduce small disk format changes that effectively render the different variants incompatible.

For example, all servers support some way of limiting a mailbox size, but there is no standard for that, so directly inserting an e-mail will at least bypass that check. And every mail server locks differently. It is not like getmail is very actively developed and will closely track any file format change in all mail servers. Therefore, the only safe (and usually recommended) way to deliver a message to a mail server is by using the mail server's own tools. In practice, this means that you should only use getmail's MDA_external method.
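Handing a message to the mail server's own delivery tool boils down to piping it into an external command and looking at the exit code. The following is a minimal sketch of that idea in Python, not of getmail's actual implementation: the function name is made up, and the assumption that exit code 75 (EX_TEMPFAIL from sysexits.h) means "try again later" should be checked against the documentation of your delivery tool.

```python
import subprocess

EX_TEMPFAIL = 75  # sysexits.h code commonly used for "try again later"


def deliver_with_mda(raw_message, command):
    """Pipe one raw message (bytes) into an external delivery command.

    Returns "delivered", "retry" or "failed" based on the exit code.
    """
    proc = subprocess.run(
        command,
        input=raw_message,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if proc.returncode == 0:
        return "delivered"
    if proc.returncode == EX_TEMPFAIL:
        return "retry"
    return "failed"
```

For Dovecot, the command would be something like ["/usr/lib/dovecot/dovecot-lda", "-d", "account1@example.com"]; the path is an assumption based on Debian/Ubuntu packaging and may differ on your system.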

Removing the other local delivery methods would actually make getmail smaller and better.

Smart Message UID Tracking
getmail implements "smart" UID tracking in order to know which e-mails have already been delivered. From the documentation:


 * What are these oldmail* files? Can I delete or trim them?


 * getmail stores its state - its "memory" of what it has seen in your POP/IMAP account - in the oldmail files.


 * Do NOT delete or edit these files. You'll make getmail re-retrieve all your old mail, or even prevent getmail from running.

The smart thing to do would have been to move processed messages to another folder, perhaps named "Retrieved by getmail" or "Moved to the Internal Mail Server". This way, the current state of affairs would be obvious to anybody using the external mailbox directly. And losing the oldmail* files would not cause duplicate delivery.
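
The folder-based approach could be sketched roughly like this with Python's standard imaplib. It is only an illustration under a few assumptions: the archive folder already exists on the server, deliver is a hypothetical callback that returns True on success, and the function name fetch_and_archive is made up. It is not how getmail works.

```python
import imaplib  # conn is expected to behave like an imaplib.IMAP4_SSL object

# Folder names containing spaces must be quoted by the caller.
RETRIEVED_FOLDER = '"Retrieved by getmail"'


def fetch_and_archive(conn, deliver, source="INBOX", archive=RETRIEVED_FOLDER):
    """Deliver each message in `source`, then move it to `archive`.

    No local state file is needed: whatever is still in `source`
    has not been delivered yet.
    """
    conn.select(source)
    # UNDELETED skips messages that a user has already marked as deleted.
    status, data = conn.uid("SEARCH", None, "UNDELETED")
    moved = 0
    for uid in (data[0] or b"").split():
        # BODY.PEEK[] fetches the whole message without setting \Seen.
        status, parts = conn.uid("FETCH", uid, "(BODY.PEEK[])")
        raw = parts[0][1]
        if not deliver(raw):
            continue  # leave it in place, it will be retried on the next run
        status, _ = conn.uid("COPY", uid, archive)
        if status != "OK":
            continue  # never remove a message that was not copied
        conn.uid("STORE", uid, "+FLAGS", "(\\Deleted)")
        moved += 1
    # Caution: EXPUNGE removes every message flagged \Deleted in this folder,
    # not only the ones flagged above.
    conn.expunge()
    return moved
```

With a real server, conn would be an imaplib.IMAP4_SSL object after a successful login. A message that is delivered but then fails to be copied would be delivered again on the next run, so the sketch trades lost mail for possible duplicates.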

E-Mails Marked as Deleted Are Still Processed
getmail, as of version 5.13, does not ignore e-mails marked as "deleted". Say that you delete a problematic e-mail with Thunderbird. Because of the IMAP EXPUNGE antifeature, the e-mail will remain on the server for a while, but it will not be visible by default in Thunderbird (and possibly other mail clients). However, getmail will continue to pick it up and try to process it. This issue was mentioned on the mailing list on 05.11.2020, but was dismissed with the argument that, according to the IMAP standard, you must issue an EXPUNGE for an e-mail to be really deleted. I consider this reasoning short-sighted. After all, the EXPUNGE command may be delayed for performance purposes, and there is no good reason why a tool like getmail should deliver messages that have already been marked as "deleted".
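
Checking the flag is not hard. As a small sketch, Python's standard imaplib can parse the FLAGS part of a FETCH response, so a retrieval tool could skip anything carrying \Deleted. The helper name is_marked_deleted is made up for illustration and has nothing to do with getmail's code.

```python
import imaplib


def is_marked_deleted(flags_response):
    """Return True if an IMAP FETCH response line carries the \\Deleted flag."""
    # ParseFlags returns a tuple of flags as bytes, or an empty tuple
    # if the response contains no FLAGS list.
    flags = {flag.lower() for flag in imaplib.ParseFlags(flags_response)}
    return b"\\deleted" in flags
```

Alternatively, the client can let the server do the filtering by searching for UNDELETED messages instead of ALL.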

Brittle Operation
The implementation is rather brittle from an operational point of view. It is not properly documented what happens when an e-mail fails to be retrieved or delivered. After some discussion on the mailing list, it seems that getmail returns a non-zero exit code at the end if anything fails.

But some failures are considered minor, and others major. For example, if you tell getmail that the "envelope_recipient" is "X-Original-To", and it fails to find that header in one e-mail, that is considered a critical error and processing of any other e-mails stops. So a single bad e-mail will stop all e-mail flow.

There was a discussion on the mailing list about this, and the author considered that behaviour OK, because it is a symptom that the external server configuration is wrong. I do not think that this would normally be the case: whether the envelope header is correct is the first thing you test when you set getmail up. If you are a small organisation using an external mail server, you do not have full control of that mail server, and in that scenario, there are a few things that can create an e-mail with that header missing.

For example, the provider may inject an advertising e-mail, or some notice like an upcoming server downtime or a quota warning, that was not received as a normal e-mail. Or maybe you have connected with Thunderbird and did the wrong thing. There is another common scenario: if you set up a catch-all mail address, or a group address with an alias per user, most e-mails will have the "X-Original-To" header, but the ones specifically addressed to the catch-all or to the group address will probably not.

You can probably work around such scenarios, if you know about them in advance. I have yet to see a mail provider that documents their behaviour at such a level of detail. Even if your provider did, the behaviour can change at any point in time, and I doubt you would get advance notice.

The smart thing to do would have been for getmail to move such e-mails to a "Failed E-Mails" folder, so that the administrator can inspect them at leisure. At the moment, you have to find the problematic e-mails manually by looking at the log.

Processing of any other valid e-mails should not be affected.
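
A rough sketch of that behaviour in Python follows. All the names are hypothetical: deliver and quarantine are callbacks you would have to supply, for example one that hands the message to the mail server and one that moves it to a "Failed E-Mails" folder. The point is only that one bad message is set aside while the rest of the batch carries on.

```python
from email import message_from_bytes


def envelope_recipient(raw, header="X-Original-To"):
    """Return the envelope recipient recorded in `header`, or None if absent."""
    value = message_from_bytes(raw).get(header)
    return value.strip() if value else None


def process_batch(messages, deliver, quarantine):
    """Deliver what can be delivered and set the rest aside.

    `messages` is an iterable of (uid, raw_bytes) pairs.
    Returns the tuple (delivered_count, failed_count).
    """
    delivered = failed = 0
    for uid, raw in messages:
        try:
            recipient = envelope_recipient(raw)
            if recipient is None:
                raise ValueError("missing X-Original-To header")
            deliver(recipient, raw)
            delivered += 1
        except Exception as exc:
            # Deliberately broad: one bad message must not stop the others.
            quarantine(uid, str(exc))
            failed += 1
    return delivered, failed
```

The caller can still exit with a non-zero code if the failed count is not zero, so monitoring keeps working, but the valid e-mails have already gone through.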

And this lack of robustness is not restricted to failed e-mails: as with Fetchmail, if one mailbox has a huge e-mail to download, or one server is slow to respond, delivery from all other mailboxes is unduly delayed.

Outdated Python
Perhaps the biggest worry with this software is that it runs on Python 2.x, which officially reached end of life in January 2020, after a long forewarning period.

There are still some companies providing extended support, but in any case you need to watch out, because it is not a good idea to run software facing the Internet on a possibly unpatched or even unmaintained platform. The fact that getmail's documentation does not mention this matter, as of December 2020, is a big red flag.

There is a fork that supports Python 3.x, but it does not seem to be an official fork. It is a strange thing that an unofficial project carries the same name, only with a higher version number.