box.matto.nl
Enjoying Open Source Software

Awk script to convert tsv to recutils for LaTeX

Why I use tsv files

The tsv (tab separated values) format is very robust. It is not sensitive to commas or semicolons inside the text of fields.
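A quick illustration of that point, as a minimal sketch. The sample line and its field contents are made up for this example. Splitting on tabs leaves the comma and the semicolon inside the second field untouched:

```sh
# One made-up tsv line: three fields, the second contains a comma and a semicolon.
line="$(printf 'P-01\tBudget approved, phase 2; on track\tgreen')"

# Split on tabs only; print the field count and the second field.
result="$(printf '%s\n' "$line" | awk -F'\t' '{ print NF ": " $2 }')"
echo "$result"
```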

This is why I use tsv files as the foundation for the automated generation of LaTeX reports. I compile periodic management reports with different levels of detail that are all based on the same tsv file.

For some other LaTeX reports, I fill tables with an awk script that runs on a tsv file.
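Filling a table this way could look roughly like the sketch below. The file name tasks.tsv and the two columns are assumptions made up for the example, not taken from my actual reports. The header line is skipped and every other line becomes one LaTeX table row:

```sh
# Made-up sample data: a header line followed by two records.
printf 'task\tstatus\nWrite report\tin progress\nReview, sign off\tdone\n' > tasks.tsv

# Skip the header line and print every record as one LaTeX table row.
# The four backslashes in the awk string come out as the two that end a row.
rows="$(awk -F'\t' 'NR > 1 { printf "%s & %s \\\\\n", $1, $2 }' tasks.tsv)"
echo "$rows"
```

The rows can be redirected to a file and pulled into a tabular environment. Field text that contains LaTeX special characters such as & or % would still need escaping.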

I usually handcraft the tsv file, with either ed or vim.

Just before generating the new reports, I edit the fields that need to be changed, as in a progress report.

Difficulty with tsv files

When the number of fields grows, or when the content of a field is one or more sentences long, the individual lines of a tsv file tend to get very long.

This makes it difficult to edit these files.

Conversion to recutils

Recently I saw a post by @tomasino@mastodon.sdf.org about GNU recutils; see his post on Mastodon and his web page on GNU recutils.

This got me interested, and I started experimenting.

It seems that recutils is also robust and not sensitive to commas or semicolons inside the text of fields.

With the recfmt utility you can design the output format of recsel. You can write basically any file as a template and use {{fieldname}} as a placeholder for each field.

So the template can also be a file with some LaTeX code, and the formatted result can be included in a different LaTeX document (with \input{filename} ).
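As a rough sketch of how that could fit together: the record type Task, the field names and the file names tasks.rec, row.templ and rows.tex are all made up for this example. The template holds one LaTeX table row, and recfmt applies it to every record that recsel selects:

```sh
# Made-up record set with one record type, Task.
cat > tasks.rec <<'EOF'
%rec: Task

task: Write report
status: in progress

task: Review
status: done
EOF

# Template: one LaTeX table row per record, fields as {{...}} placeholders.
cat > row.templ <<'EOF'
{{task}} & {{status}} \\
EOF

# Needs GNU recutils; skipped quietly where recsel or recfmt is not installed.
if command -v recsel >/dev/null 2>&1 && command -v recfmt >/dev/null 2>&1; then
    # Select all Task records and format each one with the template.
    recsel -t Task tasks.rec | recfmt -f row.templ > rows.tex
fi
```

The resulting rows.tex should then hold one table row per record, ready for \input{rows.tex} inside a tabular environment.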

Awk script to convert from tsv to recutils

Make sure the tsv file has the field names in the first line. This script will convert the tsv file to a file that can be used with recutils.

Spaces inside a field name are replaced by underscores. After conversion, add a line with "%rec:" followed by the name you want to give to your newly created recutils database at the top of the file, and adjust the key, id and mandatory fields to your liking :)

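BEGIN {
    FS = "\t"             # fields are separated by tabs
    header_printed = 0
}
{
    if (NR == 1) {
        # First line: remember the field names, spaces become underscores.
        nfields = NF
        for (i = 1; i <= NF; i++) {
            gsub(" ", "_", $i)
            myfields[i] = $i
        }
    }
    else
    {
        if (header_printed == 0)
        {
            # Print the record descriptor once, before the first record.
            # An indexed loop keeps the field names in column order;
            # "for (var in myfields)" visits them in no defined order.
            printf "%s", "%mandatory:"
            for (i = 1; i <= nfields; i++)
            {
                printf " %s", myfields[i]
            }
            printf "\n%s\n", "%key: id"
            printf "%s\n\n", "%sort: id"
            header_printed = 1
        }
        # NR counts the header line too, so the ids start at 2.
        printf "id: %s\n", NR
        for (i = 1; i <= NF; i++)
        {
            printf "%s: %s\n", myfields[i], $i
        }
        printf "\n"
    }
}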
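
The step after conversion, putting the "%rec:" line at the top, can be sketched like this. The file names converted.rec and project.rec and the database name Project are made up for the example, and the sample content only stands in for what the script prints for a two-column tsv file:

```sh
# Made-up example of what the converter prints for a two-column tsv file.
cat > converted.rec <<'EOF'
%mandatory: task status
%key: id
%sort: id

id: 2
task: Write report
status: in progress
EOF

# Put the %rec: line in front, so it becomes the first line of the record descriptor.
{ printf '%%rec: Project\n'; cat converted.rec; } > project.rec
head -n 1 project.rec
```

After that, the %mandatory:, %key: and %sort: lines can be edited by hand.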

Have fun!

Tags: awk recutils latex