GNU Make and Autotools

= GNU make =

.DELETE_ON_ERROR is a must
Say that an error occurs when building a file. Now your sandbox may be corrupt and stay corrupt until you do a "make clean" (assuming that the makefile implements such a target). It does not happen often, but it does happen every now and then.

The reason is that many makefile recipes and external tools write directly to the target files, so that they will leave incomplete output files behind when abruptly stopped. If you then restart the build process without cleaning your sandbox beforehand, any such incomplete files will be considered to be up to date (according to their timestamp), so they will not be rebuilt. An incomplete file will either make the build fail, or corrupt the final outcome, and it is often hard to know what went wrong.

Adding the following pseudo-target closes this window of opportunity for most scenarios:

.DELETE_ON_ERROR:

There is no reason to leave .DELETE_ON_ERROR out. It does not cost any performance. It may have no effect with other make implementations, but it will not break compatibility.
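Here is a minimal, self-contained demonstration of the effect. The makefile and the deliberate "false" failure are made up for the example; .RECIPEPREFIX is used only to avoid literal tab characters in the script, real makefiles normally use tabs:

```shell
#!/bin/bash
# Sketch: a recipe writes part of its target file and then fails. With
# .DELETE_ON_ERROR, GNU make deletes the partial target, so the next run
# rebuilds it instead of considering it up to date.
set -o errexit -o nounset -o pipefail

workdir="$(mktemp -d)"
trap 'rm -rf -- "$workdir"' EXIT
cd -- "$workdir"

# .RECIPEPREFIX avoids literal tabs in this heredoc (GNU make >= 3.82).
cat >Makefile <<'EOF'
.RECIPEPREFIX := >
.DELETE_ON_ERROR:

out.txt:
>echo "partial output" > $@
>false  # simulate a tool that fails after writing part of the target
EOF

make out.txt || true  # the build fails, as intended

if [ -e out.txt ]; then
  echo "out.txt survived"
else
  echo "out.txt was deleted"  # this is what gets printed
fi
```

Comment out the .DELETE_ON_ERROR line and the partial out.txt survives, and a second "make out.txt" reports that it is already up to date.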

As a developer, you may think that your small project takes a short time to build, and you always do a clean beforehand anyway. But you should not ship risky makefiles to your end-users, as they are not generally aware of your 'healthy' habits.

GNU make's manual does state: "This is almost always what you want make to do, but it is not historical practice; so for compatibility, you must explicitly request it". I find this reasoning rather lame. I would just turn .DELETE_ON_ERROR on by default.

Automake users should add the .DELETE_ON_ERROR pseudo-target to their top-level Makefile.am file. In my opinion, this is something that Automake should be doing itself by default.

If some special makefile wants to find out whether a file is complete, and if not, carry on from its current state, that makefile should be modified to be compatible with .DELETE_ON_ERROR. Do not use .PRECIOUS and .IGNORE to that purpose.

Keep in mind that .DELETE_ON_ERROR is not completely reliable. If your PC crashes or loses power in the middle of a build process, GNU make will not get a chance to clean up. But such corruption should not surprise too many people. After all, filesystems themselves are not completely reliable in the face of such eventualities, often due to a configuration that improves performance at the expense of increased possibility for data corruption. For more information on such issues, search for options data=journal and data=ordered in the ext4 filesystem. In short: after a system crash, you should probably run "make clean" on your sandbox before restarting the build process.

There are, however, other problematic scenarios, such as killing GNU Make with SIGKILL, either manually or through some automated process like the Linux out-of-memory killer. There is also the question of whether GNU Make's automatic cleaning logic works properly if the disk is full (or the disk quota has been reached). You may be redirecting GNU Make's output to a log file that resides on the same full disk, maybe through a tee process, so that multiple system calls start failing and generating cascading SIGPIPE signals, including during the attempts to print log messages about automatically cleaning files after the first error. Not many tools are so robust and get tested to that extent.

Properly fixing the problem
The real fix would be to use a transactional filesystem for the whole build process, but that does not look feasible as of January 2018.

The second-best solution is to write all makefile recipes so that they write their output to temporary files, and rename or move those to their final target filenames only at the end, if everything else succeeded. The GNU Make manual calls such recipes "defensive recipes".

Moving and renaming files are usually implemented as atomic operations at the filesystem level, so this practice is much more reliable (but see fsync, in case of a power loss or complete system crash). The trouble is that those extra recipe steps cost a little performance and make the recipes more complicated. Furthermore, rewriting all existing makefile recipes is often not a realistic enterprise.
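A sketch of such a defensive recipe follows. The file names are made up for the example, and a real recipe would run an external tool instead of printf; .RECIPEPREFIX only avoids literal tabs in the heredoc:

```shell
#!/bin/bash
# Sketch of a "defensive" recipe: write to a temporary file, and rename it
# to the final target name only on success. An interrupted build leaves at
# most a stale *.tmp file behind, never a half-written target file.
set -o errexit -o nounset -o pipefail

workdir="$(mktemp -d)"
trap 'rm -rf -- "$workdir"' EXIT
cd -- "$workdir"

cat >Makefile <<'EOF'
.RECIPEPREFIX := >
.DELETE_ON_ERROR:

output.txt:
>printf 'line 1\nline 2\n' > $@.tmp
>mv -- $@.tmp $@
EOF

make output.txt
test -e output.txt && test ! -e output.txt.tmp && echo "defensive recipe OK"
```

If the first recipe step fails, the target file output.txt is never created at all, so a restarted build simply tries again.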

In general, you should code your makefile recipes defensively rather than hoping that external tools take good care not to leave unfinished output files behind upon failure.

One common source of such errors is hand-crafted autoconfiguration scripts run from within makefiles: they run seldom and are a pain to maintain, so they get neglected more often than not. If they hit an error, they tend to leave half-finished config files behind. Say your user has not installed some dependency, so your configuration script fails. Even after the user installs the missing library and issues a "make" again, the build will keep failing, because the half-finished config file looks up to date. Users will probably remember to run "make clean" before reconfiguring, but will often not do a clean after every failure, as you normally expect that fixing the cause of the error and restarting the build should be enough.

This kind of oversight with unfinished files is also very common among external tools, even in mature ones. Here is one example in par2cmdline:


 * Use temporary names while creating *par2 files
 * https://github.com/Parchive/par2cmdline/issues/84

In such cases, your makefile recipe should make those tools generate their file groups into a temporary directory, and rename or move the directory to its final location only upon successful completion.
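The same idea sketched for whole file groups, with made-up file names standing in for a tool like par2 (a real recipe would run the external tool against the temporary directory):

```shell
#!/bin/bash
# Sketch: generate a group of files into a temporary directory, and move
# the whole directory to its final name only upon success. Renaming a
# directory is atomic on the same filesystem, so the final location never
# contains a half-generated file group.
set -o errexit -o nounset -o pipefail

workdir="$(mktemp -d)"
trap 'rm -rf -- "$workdir"' EXIT
cd -- "$workdir"

cat >Makefile <<'EOF'
.RECIPEPREFIX := >
.DELETE_ON_ERROR:

recovery-files:
>rm -rf -- $@.tmp
>mkdir -- $@.tmp
>echo "data 1" > $@.tmp/file1.par2
>echo "data 2" > $@.tmp/file2.par2
>mv -- $@.tmp $@
EOF

make recovery-files
test -d recovery-files && echo "file group moved into place"
```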

Even if you have carefully implemented all your recipes to be resilient against errors and interruptions, adding .DELETE_ON_ERROR provides an adequate, if not watertight, safety net at virtually no cost.

Specify your shell and enable the shell's error detection
GNU make uses sh by default, so your makefile will probably fail on platforms where sh links to some other shell than the one you are testing against. Therefore, always specify the shell you want to use. In practical terms, that means Bash, unless you want the extra work of making sure that all the rules you are writing are really POSIX compatible.

You should also turn on the nounset and pipefail error-detection options. See Error Handling in General and C++ Exceptions in Particular for more information.

This is what I use in my makefiles:

# This makefile has only been tested with Bash.
# Option 'pipefail' is necessary. Otherwise, piping to 'tee' would mask any
# errors on the left side of the pipeline.
SHELL := bash -o nounset -o pipefail

This will probably not work with Automake, as the generated makefiles expect a POSIX-compatible shell without any such error-detection options enabled.

Don't forget to use .PHONY and .DEFAULT_GOAL
Don't forget to make the special target .PHONY depend on all of your non-file targets. This not only helps make, but human reviewers too. For example:

.PHONY: help

Using special variable .DEFAULT_GOAL helps document makefiles. Without it, it is not easy to locate the first (and therefore default) target in big makefiles. For example:

.DEFAULT_GOAL := help
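Combining both, a minimal skeleton could look as follows (the target names are made up for the example, and .RECIPEPREFIX only avoids literal tabs in the heredoc). Running plain "make" then prints the help text:

```shell
#!/bin/bash
# Minimal sketch combining .PHONY and .DEFAULT_GOAL: running "make" with no
# arguments runs the 'help' target instead of the first real target.
set -o errexit -o nounset -o pipefail

workdir="$(mktemp -d)"
trap 'rm -rf -- "$workdir"' EXIT
cd -- "$workdir"

cat >Makefile <<'EOF'
.RECIPEPREFIX := >
.DELETE_ON_ERROR:
.PHONY: all clean help
.DEFAULT_GOAL := help

all:
>@echo "building everything"

clean:
>@echo "cleaning"

help:
>@echo "Usage: make [all|clean|help]"
EOF

make  # prints: Usage: make [all|clean|help]
```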

Do not use built-in variables or rules, and warn if undefined variables
Always run your makefiles with --no-builtin-variables, which also disables all implicit rules. Automake-generated makefiles need built-in variables, but they seem to run fine with --no-builtin-rules.

If implicit rules are active, GNU make may look for many different possible source file types when trying to find a suitable rule for each target file. That can trigger many stat syscalls per target file, dramatically slowing down makefiles for large projects.

Option --warn-undefined-variables can help a lot when writing and debugging makefiles.

If your makefile is calling other makefiles (either directly or through a script), you probably want to filter out those flags, so that they do not get inherited in environment variable MAKEFLAGS. But you should still pass down all flags related to GNU make's job server. See the section below about the job server for more information on how to do this kind of filtering.

Properly document how to use your makefile
Provide an example with the following hints:


 * Explicitly mentioning -O2 makes the user realise that he has other options. Do not be so arrogant as to state that -O2 is the best level just because you have tested it a few times with some GCC versions and that level produced the best results. GCC is constantly evolving, and sometimes -O3 provides a significant performance boost. For example, on GCC version 7, level -O3 enables the "loop splitting" optimization and the vectorizer, which can significantly boost performance. You should also consider suggesting LTO, if it provides a noticeable performance boost to your software.


 * Option -march=native can provide noticeable performance gains on modern CPUs, at the cost of binary portability.


 * Do not waste your end-user's time by building on just 1 CPU. They should build with all available CPUs, like you do during development.


 * Use --output-sync=recurse in order to avoid the typical interleaving output from parallel builds. Such line mixing makes your life unnecessarily difficult when troubleshooting. This option is only available from GNU Make version 4.0 onwards, but older versions are not common anymore.


 * Mention what the default target is. If the makefile does not use the .DEFAULT_GOAL special variable, it is not easy to locate the first (and therefore default) target in big makefiles. Besides, it should not be necessary to read a makefile in order to use it. The user will need the default target's name if he wants to build the normal stuff plus say one extra target (like the manpage in HTML format).

This is the kind of documentation you could provide:

Run "make" to build the software. For example:

  make --output-sync=recurse \
       -j "$(( $(getconf _NPROCESSORS_ONLN) + 1 ))" \
       CFLAGS="-g -O2" CXXFLAGS="-g -O2" \
       all

Option -march=native in CFLAGS/CXXFLAGS can provide a noticeable performance boost, but the executable will only run on the current CPU type.

Job server considerations
GNU make's job server is a great idea. Say you have ProjectA with 100 source files. If you have a CPU with 4 cores, you want to compile 4 or 5 files in parallel, but not all of them at once. Now say you wish to build ProjectA, ProjectB and ProjectC by calling their makefiles from a single, top-level makefile. You do not want to build 4 * 3 = 12 or more source files at the same time. You still want to build all projects in parallel, but with the same global concurrency limit of 4 or 5. That's what the job server helps you achieve.

Unfortunately, GNU make has no "-j auto" option, so you have to manually pass the right flags among makefile invocations so that there is only one top-level job server that coordinates the whole build process.
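The key point is to invoke sub-makes through the $(MAKE) variable, so that GNU make recognises them as recursive invocations and passes the job-server file descriptors and flags down automatically. A self-contained sketch, with made-up project and target names (.RECIPEPREFIX only avoids literal tabs in the heredocs):

```shell
#!/bin/bash
# Sketch of a top-level makefile that builds two sub-projects in parallel.
# Using $(MAKE) (instead of a literal 'make') lets all the sub-makes share
# a single job server, enforcing one global -j limit.
set -o errexit -o nounset -o pipefail

workdir="$(mktemp -d)"
trap 'rm -rf -- "$workdir"' EXIT
cd -- "$workdir"
mkdir projectA projectB

cat >Makefile <<'EOF'
.RECIPEPREFIX := >
.PHONY: all projectA projectB
all: projectA projectB

projectA:
>@$(MAKE) --no-print-directory -C projectA

projectB:
>@$(MAKE) --no-print-directory -C projectB
EOF

for project in projectA projectB; do
  cat >"$project/Makefile" <<EOF
.RECIPEPREFIX := >
.PHONY: all
all:
>@echo "building $project"
EOF
done

make -j 4 | sort  # both sub-makes run under the top-level job server
```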

If you are writing a build script that runs GNU make, you should not make assumptions about the best number of concurrent jobs. That is something for the end-user to decide. If you want to provide a default for convenience, make it just a default, do not hard-code any logic that starts a new job server if one already exists. Check out routine add_make_parallel_jobs_flag in JtagDueBuilder.sh for an example on how to do that.

If you are writing a makefile that calls other makefiles (either directly or through a script), you probably want to filter out some or all of the GNU make flags inherited in environment variable MAKEFLAGS, but leave the ones related to the job server. Check out variable EXTRACT_SELECTED_FLAGS_FROM_MAKEFLAGS in this makefile for an example on how to do that.

= Autotools =

Turn On Autoconf's Warnings
Everybody agrees that it is a good idea to turn most warnings on when using a C compiler, or Perl, or whatever language you are programming in. However, for some reason, most developers forget to do the same when using Autoconf. Error detection is also frequently neglected in the small configuration scripts that call Autoconf and related tools.

Check out this autogen.sh script for an example on how to do it properly.

Always Specify AC_CONFIG_AUX_DIR([build-aux])
AC_CONFIG_AUX_DIR([build-aux]) places auxiliary files in a subdirectory, which reduces clutter in the top-level project directory.

But more importantly, it also prevents the configuration script from looking for helper scripts outside the project, in ../ and ../../, possibly finding older versions or incompatible tools with the same name.
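In configure.ac, the call must come right after initialisation and before AM_INIT_AUTOMAKE (if you use Automake). A sketch, with a made-up project name and version:

```
AC_INIT([my-project], [1.0])
AC_CONFIG_AUX_DIR([build-aux])
AM_INIT_AUTOMAKE
AC_CONFIG_FILES([Makefile])
AC_OUTPUT
```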

Explicitly Request Bash
If you did not make the necessary effort to write a strictly POSIX-compliant configure.ac script, you need to ensure that the user runs the same shell as you do. And writing a strictly POSIX-compliant configure.ac script is impractical anyway: the POSIX features are annoyingly limited, and not all shells are perfectly POSIX compliant in the first place. ShellCheck, for example, does not support checking Autoconf scripts. And you probably will not have time to test on a battery of interesting shells, whichever they may be.

Therefore, your only sane option is to write for Bash, test on Bash and demand Bash on the user's target system.

You may be fooled into thinking that Autoconf will generate POSIX-compatible scripts for you, but remember that any script code you add to configure.ac that is not strictly calling M4 macros is going to land in the final configure script unmodified.

If you have never had a problem before, maybe you did not know that the generated configure script actually switches shells on startup. Start it with /bin/sh, for example, and it will exec itself to Bash, or even to zsh if no Bash is found. That 'feature' is not actually documented. Most people have Bash around, which is why you seldom see a problem. But if the user has no Bash available, your configuration script will probably break.

I could not find a way to force Autoconf to use a particular shell from within the configure.ac script, so you have to check manually whether your configure script is running on Bash, and otherwise tell the user to start it like this:

CONFIG_SHELL=/bin/bash ./configure

The necessary script code is not hard at all. Just add this to your configure.ac after initialising Autoconf:

if ! test set = "${BASH_VERSION+set}"; then
  AC_MSG_ERROR([Please run this 'configure' script with Bash, as it uses Bash-only features, and has only been tested against Bash. Start it like this: CONFIG_SHELL=/bin/bash ./configure])
fi

Do not give the standard "./configure && make && make install" advice
Most Autoconf projects state in the documentation that you should build like this:

./configure
make
make install

This advice is doing a disservice to your users. Somebody building your software from sources deserves better documentation.

First of all, you should not be encouraging your users to install your software as root. That is something few people should accept straight away. It is not just a question of security risk: installing as root may overwrite global files and break other software. Besides, the next package-manager update may overwrite your changes anyway.

Most users will want to install the software without special privileges inside their home directories. Even when installing software for all users, you do not normally want to overwrite your distro's global files, so specifying your own installation directory is almost always a must.

Besides, the advice about properly documenting how to use your makefile (see the section further above) still applies. Therefore, this is the kind of documentation you should provide:

Suggested steps in order to configure and build the software:

  ./configure --help

  ./configure --prefix="$HOME/some-dir" \
              CFLAGS="-g -O2 -march=native" \
              CXXFLAGS="-g -O2 -march=native"

  make --output-sync=recurse \
       -j "$(( $(getconf _NPROCESSORS_ONLN) + 1 ))"

  make install-strip

Option -march=native in CFLAGS/CXXFLAGS can provide a noticeable performance boost, but the executable will only run on the current CPU type.

Target install-strip can save valuable space on the target drive. Debug information is rarely needed and takes a lot of space. If the user needs it, he will probably know enough to use the install target instead.