Using the m4 macro processor for fun and profit
Learn to write productive m4 scripts in 20 minutes
m4 is a great tool to boost your productivity. Have a lot of fun writing m4 scripts. This page teaches you the most essential parts, enough to start writing great scripts.
Introduction to the m4 macro processor
The m4 macro processor has been in use on Unix systems for a long time. The main purpose of m4 is to generate files. Initially m4 was created as a pre-processor for Fortran program code. This was about 30 years ago. Today m4 is used as a tool to generate configuration files, and is most famous as a generator of the sendmail.cf file.
m4 is still very useful today. I use m4 mostly to generate xhtml files. The combination of awk, m4 and make provides a powerful tool, and together they are a good replacement for a content management system.
This combination builds xhtml-files with correctly working links in the menus, so it works like a kind of templating system. m4 also helps me to separate content from (xhtml-)code.
I keep all the xhtml-code in one file, the content of each page in its own content-file, and the m4 scripts in a few files of their own. A single configuration file contains the list of files to generate, with the page-titles, menu definitions, etc.
Output is generated by running make.
The result of this:
- Changing the xhtml-code requires editing only one file. Running make then builds the new version of all the web pages with a single command.
- Adding a web page to the site is trivial. A line is added to the configuration file, giving the name of the new file, its title, and whether and how it should appear in the different menus of the site. The content of the new page is put in its own file in the content sub-directory. Running make will build the new web page and build new versions of the pages that have altered menus.
- Updating the content of a web page only requires the editing of the specific content file in the sub-directory and running make to build the new version of the web page.
I have built a number of websites this way and have maintained them this way for some years now. And I am still happy with this solution :)
Because all the scripts as well as the content are ordinary text files, it is very easy to keep them in a CVS repository. This allows not only for reversibility of changes but also provides a very good mechanism to keep everything neatly organized. As a bonus the CVS repository simplifies the backup procedures.
Below follows a small introduction to the basic usage of m4.
First steps in m4
The standard syntax of a macro definition in m4 is:
define(`macro_name', `macro_expansion')
Notice the backtick (`) and the single quote ('): these are the default quote delimiters.
Here we use it in an example:
define(`yoo',`Hello World!')
I say this: yoo
Put this into a file called my_first_m4_program and run it with m4:
m4 my_first_m4_program
I say this: Hello World!
Redirect the output of m4 to a file
Because m4 writes its output to stdout, it is very simple to redirect the output of m4 to a file:
m4 my_first_m4_program > test_file
cat test_file
I say this: Hello World!
We simply put "> filename" after the m4 command.
m4 macro definition
A simple macro just replaces some part of the text on the input. Although this is a simple mechanism, it can lead to powerful m4 scripts.
Literal text is placed between the quote delimiters (by default ` and '); variables, which m4 should expand, are not quoted.
The statement can be broken into several lines:
define(`yoo',
`Hello World!'
)
I say this: yoo
This will result in some extra blank lines in the output, though.
The second argument of the statement can also be text that spans several lines:
define(`htmlheader', `
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>my_title</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
')
Parameter list to mimic function-like calls
An m4 definition can be enhanced with a parameter list:
define(`my_value', `$1_file')
my_value(`test')
The last command, my_value(`test'), returns the output test_file.
The first parameter is addressed with $1, the second with $2, etc.
Conditional statements in m4
Conditional statements enhance the usefulness of our scripts. This is the syntax:
ifelse(first_text, second_text, true_action, false_action)
- ifelse: the m4 command
- first_text: this is the first parameter
- second_text: this is the second parameter
- true_action: this is the output if the first and second parameters are equal
- false_action: this is the output if the first and second parameters are not equal
An example from real-life usage:
ifelse(my_filename,`index.html',`Home',`<a href="/index.html" title="To index page">Home</a>')
This is part of an m4 macro that creates the menu in a web page. If the current page has the filename "index.html" (which is fed to the macro in the variable my_filename), the output is a line with just the word "Home"; otherwise the output is a hyperlink to the homepage.
m4 macros can be nested. This means that one macro can use the output of another macro and modify it.
When combined with conditional statements this results in a very powerful mechanism.
Another, more complex example:
define(`my_menu',`<li>ifelse(filename,$1,`<span class="selected">$1_menu</span>',`<a href="$1_file" title="$1_title">$1_menu</a>')</li>')
Feeding the value of a variable
With the -D switch a variable can be given a value on the command line when m4 is invoked:
m4 -D variable=value my_program.m4
An included file is expanded at the location of the include statement.
I use this mechanism to include the content in templates. This way all the content is kept separately in a sub-directory. Parts of the m4-code can also be put in different files, because the m4 processor handles them just as if they had been written in place of the include command. An example of how this could be done:
include(`pagedefinitions')
include(`webmenudefinitions')
include(`xhtmldefinitions')
build_htmlheader(`current_webpage')
insert_content(`current_webpage')
build_htmlfooter(`current_webpage')
- pagedefinitions: definitions of the current page (title, content of meta tags like description, etc.)
- webmenudefinitions: set of m4 macros to build the menus of your web pages
- xhtmldefinitions: set of m4 macros to build the xhtml code (you could think of this like some kind of template mechanism)
- current_webpage: variable which holds the name of the current file to be generated
- build_htmlheader: macro that assembles the xhtml up to the <body> tag
- insert_content: set of macros that generate the content and add this to the partly built xhtml file
- build_htmlfooter: completes the xhtml file with sidebar, bottom menu, etc.
The use of divert in m4 macros
It is very easy to introduce a lot of whitespace in the output of your m4 macros.
The first step to reduce the generated whitespace is the use of the command dnl (dnl: discard everything from here up to and including the next newline).
This approach will still leave a lot of whitespace as a result of macro-definitions. The next step is the use of the command divert (divert: re-route the output to a different stream). When we issue the command divert(-1), the output is sent to stream -1, which does not exist, so the output is discarded (like writing to /dev/null).
After some commands, like macro definitions, we then issue the command divert to reset the output stream to its original.
Example of the use of divert to reduce whitespace
divert(-1)
define(`my_macro1',
`some_macro_expansion'
)
define(`my_macro2',
`another_macro_expansion'
)
divert
This makes it possible to write m4 macros that are easy to read and maintain.
I build most of my websites using this. I have built a small system that uses awk to generate Makefiles, and m4 to build the pages.
m4 is very versatile. It can be used for many jobs. I have used m4 not only for maintaining websites (xhtml-pages) but also for generating large xhtml-forms as well as generating php-code.
Generate code with m4
It might seem a bit strange to use a preprocessor like m4 to
generate code. You have to develop the scripts to generate the code.
This means you have to debug these scripts too. But at some point
the tradeoff can be positive.
For repetitive xhtml-code there are two options: generate static pages before publishing them on a webserver, or let the webserver generate the pages dynamically. When the code doesn't change very often, generating static pages is more efficient.