Makefiles for Beginners

According to legend, someone was asked how much development effort went into "porting" a rather complex software suite to Linux. The reply was, in total: "We typed 'make'."

This article is intended to help you write your own Makefiles, or to modify those hand-coded by others. If the Makefile in question has been written by another program (traditionally ./configure), please see Portable Software first.

Makefiles for the Minimalist

Most introductions to Makefiles begin by describing how to build rules from scratch, and only later (and often only briefly) mention using make's built-in rules and other powerful magic. I propose instead to do things the other way round, by assuming the gentle Reader knows (or can look up) such matters, and cut straight to the chase.

This, believe it or not, is a complete Makefile, which causes make to compile transform.c with headers for GSL (the GNU Scientific Library), then link it with libgsl, to form an executable called transform.

MACPORTS = /opt/local
CPPFLAGS = -I$(MACPORTS)/include
CFLAGS = -Wall
LDFLAGS = -L$(MACPORTS)/lib
LOADLIBES = -lgsl

transform: transform.o

Now let's unpack that, and explain the magic as we go.

MACPORTS = /opt/local

This sets the make variable $(MACPORTS) to the value /opt/local.

CPPFLAGS = -I$(MACPORTS)/include
CFLAGS = -Wall
LDFLAGS = -L$(MACPORTS)/lib
LOADLIBES = -lgsl

Set some more variables, which will couple into the rules I'll explain next. Note that two of them use the variable $(MACPORTS) we first thought of. (Please see man cc for what these arguments will mean to the preprocessor, to the C compiler proper, and at the link stage.)

transform: transform.o

Here's where the magic really starts kicking in. This is a dependency, which is a partial rule: it tells make that transform depends on transform.o, and assumes that some other rule will say how to do it. Since we haven't specified anything ourselves, make looks in its internal ruleset, finds some appropriate entries, and treats the above as if we'd written:

transform: transform.o
	$(CC) $(LDFLAGS) transform.o \
	    $(LOADLIBES) $(LDLIBS) -o transform

.... where I've had to fold the line at the backslash, and where each "action" line must begin with a tab character (spaces won't do). To obtain transform.o, make notices that we have transform.c in the current directory, riffles again through its internal ruleset, and thinks:

transform.o: transform.c
	$(CC) $(CFLAGS) $(CPPFLAGS) \
	    -c transform.c

.... and we're home: if transform doesn't exist, or is older than transform.c, make builds transform.o, then transform, without you needing to do more than say "make" at the command line to kick off the process.
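One hedged footnote to the above: the GNU Make manual describes LOADLIBES as a deprecated (but still supported) alternative to LDLIBS, and the internal link rule uses both. So, assuming GNU Make, the same minimal Makefile could be sketched with LDLIBS instead; nothing else changes.

```make
MACPORTS = /opt/local
CPPFLAGS = -I$(MACPORTS)/include
CFLAGS   = -Wall
LDFLAGS  = -L$(MACPORTS)/lib
# LDLIBS in place of LOADLIBES: both appear in make's internal link rule,
# but the GNU Make manual lists LOADLIBES as deprecated.
LDLIBS   = -lgsl

# Still just a dependency: the internal rules supply the actions.
transform: transform.o
```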

Changing compilers

In the above, I've expanded some of make's internal variables already, and elided others whose values are normally empty. One I haven't yet mentioned is $(CC), which is defined internally by:

CC = cc

If you want to change that (eg to avoid Apple's clang-based cc), add an entry to your Makefile to override it:

CC = gcc

Those who may wish to write:

CC = gcc-mp-6

.... to ensure they're using the MacPorts build of gcc (the "mp" in the name stands for MacPorts) are presumed to know what they're doing, and which extra link-time libraries they need to use.

If you use Fortran, you'll probably also want to override make's traditional but woefully-outdated notion that f77 is the only Fortran compiler in town:

FC = gfortran

.... or, for devotees of the MacPorts build:

FC = gfortran-mp-6

You can then make use of the internal rules, which look something like:

foo.o: foo.f
	$(FC) $(FFLAGS) -c foo.f

bar.o: bar.F
	$(FC) $(FFLAGS) $(CPPFLAGS) -c bar.F

.... for any and all appropriate values of foo and bar. Note the lowercase f and capital F: this is how make distinguishes between Fortran code which doesn't, or does, need preprocessing to handle #include and #define directives. (This is agreed to be confusing if your filesystem doesn't properly distinguish between upper and lower case in filenames. Happily, GNU Make makes the correct safe assumptions in this, erm, case.)

The reason for using (eg) $(CC) instead of a bare cc in the rules is the usual one for defining and using a variable: if you need to change its value, you need only change it once. Thus, for example, it's easier and less error-prone to alter a single FC=g77 statement in a third-party Makefile than to manually alter several tens or hundreds of hard-coded invocations of g77 to use gfortran instead. (The latter has been observed in real life. The fact that there were so many invocations in the offending Makefile is a separate issue, which need be explored no further here.)
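To make the point concrete, here is a minimal sketch of two hand-written rules that both go through $(FC). The file names solver.f and mesh.f are hypothetical, and in real life you'd leave rules like these to make's internal set; the point is only that switching compilers means editing one line.

```make
# Change the compiler here, once, and every rule below follows.
FC     = gfortran
FFLAGS = -Wall

# Hypothetical source files, written out longhand for illustration.
solver.o: solver.f
	$(FC) $(FFLAGS) -c solver.f

mesh.o: mesh.f
	$(FC) $(FFLAGS) -c mesh.f
```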

Multiple source files

Let's say that the final program transform is built not only on transform.c, but also interface.c and glue.c. We (and make) already know how to produce all the .o files, so we write:

transform: transform.o interface.o glue.o

make treats this as if we'd written:

transform: transform.o interface.o glue.o
	$(CC) $(LDFLAGS) $^ \
	    $(LOADLIBES) $(LDLIBS) -o $@

.... where $@ is make's more compact way of saying "whatever the target is" (anything to the left of the ":"), and $^ means "whatever the target depends on" (everything to the right), whenever they appear in the action part of a rule.
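As a small sketch of the same shorthand at work, here is the compile step for transform.o written out by hand. It uses one automatic variable not mentioned above, $<, which means "the first thing the target depends on". The explicit -o $@ is my addition for clarity, and you'd normally leave all of this to the internal rules.

```make
# Explicit version of a rule make already has built in.
# Here $< expands to transform.c and $@ to transform.o.
transform.o: transform.c
	$(CC) $(CFLAGS) $(CPPFLAGS) -c $< -o $@
```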

Header files

Of course, any self-respecting set of C programs will use header files. Let us suppose that the following appears in interface.c and glue.c, but not in transform.c:

#include "glue.h"

If glue.h is updated, then interface.o and glue.o need to be recompiled (but not transform.o). To inform make about this, we need to write another dependency:

interface.o glue.o: glue.h

.... and similarly for any other local header files named in #include statements in the source.
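For instance, supposing a second local header (params.h is a hypothetical name, not part of the example so far) were included by transform.c and interface.c, the dependencies might be sketched like this. make merges the two lines that mention interface.o, so it is rebuilt when either header changes.

```make
# glue.h is included by interface.c and glue.c (as above).
interface.o glue.o: glue.h

# params.h is hypothetical: assumed to be included by transform.c and interface.c.
transform.o interface.o: params.h
```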

Using gfortran as a linker

Let's suppose we add some Fortran code in grunt.F to do the heavy lifting. make would use $(CPPFLAGS) as before to deal with any library-specific #include statements, and any options to pass to the compiler proper would need to go in $(FFLAGS); but make will still use $(CC) to do the linking, unless instructed otherwise.

One awkward facet of using Fortran code is that object-code produced by the Fortran compiler requires extra libraries, over and above the standard C libraries; annoyingly, these libraries tend to be compiler-specific. Rather than second-guess what (eg) gfortran requires, and how it differs from whatever was put in someone else's Makefile that pertains to g77 (the traditional culprit is libg2c), it's better to persuade the Fortran compiler to do the linking. This isn't as simple as it could be, but is far from impossible.

Probably the best approach on balance is to add an action to the dependency line we already have, based on the internal rule, but with the required twist. Thus we'd write:

transform: transform.o interface.o glue.o grunt.o
	$(FC) $(LDFLAGS) $^ \
	    $(LOADLIBES) $(LDLIBS) -o $@

Now that we have an action attached to the dependency, make will ignore its internal rules, and use our specified action instead. You will, of course, need to check that this does enough in your particular circumstances, and across all environments in which you and your collaborators may be working; but this is a reasonable start, and beats hand-crafting hardcoded library dependencies on specific Fortran compilers hands down.

This isn't, of course, the only way to do it, though it's the most likely to be portable.

One alternative is to put this definition in the Makefile:

CC = $(FC)

This trick relies on the fact that gfortran is part of the GNU Compiler Collection (GCC), alongside the GNU C compiler (gcc), so it knows how to compile C programs. This admittedly replaces a dependency on one specific compiler with one on a particular compiler suite. In particular, it may or may not work if you're using the Intel compilers instead, and is likely to be especially hazardous if you're attempting to mix Intel and GNU compilers in the same build process (which is likely to be seriously Bad News in any case).

Another alternative involves digging into the details of how GNU Make builds up its internal rules using variables (I'll show you how to find this out in a moment). The default definitions of "what to use to do linking" include this, which is used in the rule to generate an executable (transform) from a collection of object-code (.o) files:

LINK.o = $(CC) $(LDFLAGS) $(TARGET_ARCH)

If we rewrite this as:

LINK.o = $(FC) $(LDFLAGS) $(TARGET_ARCH)

.... then gfortran (or whatever's named in $(FC)) does the linking. This almost certainly replaces a dependency on a particular compiler with a dependency on GNU Make: your collaborator, who's expecting to build the same codeset on Berkeley UNIX, may not love you for this.
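Put together, a sketch of this variant might look like the following, assuming GNU Make and gfortran. Note that the dependency line needs no action of its own, because the internal link rule picks up the altered LINK.o.

```make
FC = gfortran

# GNU Make only: make the internal link rule call the Fortran compiler.
LINK.o = $(FC) $(LDFLAGS) $(TARGET_ARCH)

# No action here; the internal rule for linking .o files does the work.
transform: transform.o interface.o glue.o grunt.o
```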

Anybody who writes:

$(CC) [....] -lgfortran

.... is deemed to know what they're doing, and to be able and willing to handle the consequences. It is not yet known whether:

$(CXX) [....] -lgfortran

.... is or is not required to handle whatever g++ (or other C++ compilers) may request and require in the way of extra libraries which gcc cannot reach; enlightenment on this is humbly requested.

Peeking under the bonnet

Now you may be wondering where I got all this information from. Partly it's by poring over the manuals, but mainly make itself told me:

make -n

The -n flag says "don't do anything, but say what you would do". If you're wrestling with make and your Makefile, it's your first port of call. To see more details, including the rules and variables (from make's internal set, as modified by the contents of your Makefile):

make -np |& less (tcsh)
make -np 2>&1 | less (bash)

The extra -p flag says "print your rules and variables as read", and the strange usage around the pipe symbol captures the standard-error output as well. After that, it's a simple matter of searching up and down the (painfully voluminous) output using less's pattern-searching facilities, and knowing that GNU Make uses the "%" character as a wildcard in the dependency part of its pattern rules, corresponding to "$*" in the action lines. Thus, I found, on searching for LINK:

[....]
# default
LINK.o = $(CC) $(LDFLAGS) $(TARGET_ARCH)
[....]
%: %.o
# commands to execute (built-in):
    $(LINK.o) $^ $(LOADLIBES) $(LDLIBS) -o $@
[....]

.... and the rest was copy-and-paste (following COMPILE in the same way is left as an exercise). Welcome to my shop floor.
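If you only want to peek at one variable rather than wade through the whole listing, a pattern rule of your own can help. What follows is a hypothetical helper of my own devising, not something make supplies, and it assumes GNU Make (for the value and origin functions): saying "make print-CC" would then report on $(CC).

```make
# Hypothetical helper, GNU Make only.
# % matches whatever follows "print-", and $* holds that matched text.
print-%:
	@echo '$* is defined as: $(value $*)'
	@echo '$* expands to:    $($*)'
	@echo '$* comes from:    $(origin $*)'
```

The single quotes stop the shell from interfering with the output, so this sketch will misbehave if a variable's value itself contains a single quote.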

Tidying up

I just can't resist this ---

MACPORTS = /opt/local
CPPFLAGS = -I$(MACPORTS)/include
CFLAGS = -Wall
FFLAGS = -Wall
LDFLAGS = -L$(MACPORTS)/lib
LOADLIBES = -lgsl

target = transform
objects = transform.o interface.o glue.o grunt.o
glue_h = glue.o interface.o
grunt_h = transform.o grunt.o interface.o

default: $(target)

$(target): $(objects)
	$(FC) $(LDFLAGS) $^ \
	    $(LOADLIBES) $(LDLIBS) -o $@

$(glue_h): glue.h
$(grunt_h): grunt.h

clean:
	rm -f $(objects)

clobber: clean
	rm -f $(target)

.PHONY: default clean clobber

The extra target lines are traditional, and will be expected to be present if you send your Makefile elsewhere.

  • default: The first target is the one a bare "make" will make. This saves shuffling your Makefile if you have multiple major targets, and you decide that something lower down is of more importance.

    • I often create the target help as a quick "help" facility in my more complex Makefiles, and make default depend on it. This puts a valuable safety catch on saying "make" at the command line without due care and attention.

  • clean: Clean up after yourself. If your dependencies aren't quite right or complete, or to just make absolutely sure, say "make clean" at the shell prompt before you say "make" again.

  • clobber (less often seen): Really clean up after yourself. In some circumstances, this may be pronounced distclean; in others, this may not be a direct translation.

  • The targets install and uninstall in others' Makefiles will offer to put things into (or remove them from) system directories such as /usr/local/bin, /usr/local/lib and so forth. If you're running the code in question from within its build directory, these actions are neither wise nor possible.

    • So don't do that; or if you must do that, work out how to use (eg) /Data/foopackage/bin, /Data/foopackage/lib etc instead.

  • .PHONY is a standard builtin target with magical properties: the targets listed as its dependencies are always treated as out of date, even if files of the same name happen to exist. For example, without this, saying:

    touch clean
    make clean

    .... would yield only a comment that clean is up-to-date.
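The help target mentioned in the list above might be sketched as follows. The wording of the messages is my own invention, and this assumes you want a bare "make" to print the list rather than build anything.

```make
# A bare "make" now prints the menu instead of building.
default: help

help:
	@echo "make transform   build the program"
	@echo "make clean       remove the object files"
	@echo "make clobber     remove the object files and the program"

.PHONY: default help
```

With this in place, building requires an explicit "make transform", which is the safety catch in question.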

Collecting object-code names into variables again applies the "modify once, use twice" principle; also, if these statements go at the head of the file, it saves (eg) having to discover which of three dependency lines scattered through a whopping great Makefile needs to be modified to fix a header dependency problem. Think of this as make's way of saying #define .... but, as with #define, it's all too possible to get carried away.
