William Matthews


Easily Implanting CSV data into LaTeX documents and Report Generation


Introduction

This short post covers more than just importing CSVs into LaTeX, as I’d run out of things to say very quickly. I briefly cover how to use data from a SQL query to efficiently create a LaTeX report, which was used for the Oxfeud.it project I am working on with Daniel Mroz. (This is all using the assumption you are using MySQL as your SQL server.)

Motivation

Quite often in Engineering we are presented with large sets of data that need to included with a document (quite often in appendicies), and it can be quite a pain to include a large set of data in a writeup if in LaTeX. This guide shows a nifty package that allows comma separated value documents to be placed in a LaTeX document with relative ease.

csvsimple

The package in question is csvsimple, (full guide available here) which achieves the effect we want neatly.

In short, using \csvautotabular{data.csv} does what it says on the tin, placed inside a float you will be able to control the table enough for standard data presentation.

Automatic Generation of Reports

Quite frequently the data we need is stored in a database, so we need a supporting script to get the information out and into a form we can work with.

We begin by creating the Bash script ltxrun.sh, which has the command:

mysql -u username -p password --database database_name -h mydomain.com --port 3306 -ss -e "${SQLCMD}" | sed 's/\t/","/g;s/^/"/;s/$/"/;s/\n//g' > "${filename}.csv"
Which you may notice is a fairly ominous command, so let’s break it down.

mysql -u username -p password --database database_name -h mydomain.com --port 3306 -ss -e "${SQLCMD}$" is a request to use MySQL to log into a remote database with username and password defined and run the query specified by the bash variable ${SQLCMD}. This, run on its own, would dump the results to the screen.

We then pipe the output of the MySQL command (using the pipe symbol ‘|’) to the next command, sed which has some rather funky argument attached to it, again let’s break this down to an understandable level.

sed 's/\t/","/g;s/^/"/;s/$/"/;s/\n//g' transforms the output from the MySQL request into a usable CSV format, by use of the rather dense argument. This argument is made of four parts:

  1. s/\t/","/g; - Replace all tab characters (\t) with "," (include the quotes).
  2. s/^/"/; - Replace all line beginnings (^) with " (double quotes).
  3. s/$/"/; - Replace all line endings ($) with " (double quotes).
  4. s/\n//g - Replace all new lines (\n) with nothing.

Altogether this acts to cleanly produce a CSV compliant file, which is then piped by means of overwriting and creating a new file (using >) to the file with the name specified by the variable ${filename}.

The CSV file will exist on the machine where this command has been run, so from here we simply run pdflatex to generate our PDF document using the CSV file.

We can call pdflatex in the ltxrun.sh script, which will generate our report.

For my system I had a title that changed depending on the query run. This is easy to implement - if we dump the title we want to a file and read it in from the LaTeX document.

A very basic example is included below.

Possible improvments you can easily make include:

  1. You can add an argument to ltxrun.sh to change output directory.
  2. You can go the the lengths of making a .tex script generator for use in environments where you risk multiple people generating reports close together (have timestamped CSV files, and pass the timestamped filename of the CSV file to a custom LaTeX script which will be added through an \input).
  3. Scheduling to avoid conflicting requests.

ltxrun.sh

# Process command flags and option arguments
while getopts s:f:t: option
do
case "${option}"
in
s) SQLCMD=${OPTARG};;
f) filename=${OPTARG};;
t) title=${OPTARG};;
esac
done

# Run the SQL command and put data in a CSV
mysql -u username -p password --database database_name -h mydomain.com --port 3306 -ss -e "${SQLCMD}" | sed 's/\t/","/g;s/^/"/;s/$/"/;s/\n//g' > "${filename}.csv"

# Move the data file to a standardised name and put document title in a format readable by LaTeX
mv "${filename}" ./latex/data/data.csv
echo "${title}" > ./latex/data/title.dat

# Change to latex directory, to keep above dir clean, and run (silenced) renderer
cd ./latex
pdflatex ./template.tex &> /dev/null
pdflatex ./template.tex &> /dev/null
cd ..

#move template.pdf file to a location of your choice and name it appropriately
mv ./latex/template.pdf "/location/of/your/choice/${filename}.pdf"

#clean up after ourselves
rm ./latex/template.aux ./latex/template.out ./latex/data/title.dat ./latex/data/data.csv

template.tex

\nonstopmode
\documentclass[11pt, oneside]{article}
\usepackage{geometry}
\geometry{%
a4paper,
left=20 mm,
right=20 mm,
top=20 mm,
bottom=20 mm
}
\usepackage{lastpage}
\usepackage{csvsimple}
\usepackage{hyperref}
\usepackage{datetime}
\hypersetup{
    colorlinks=true,
    linkcolor=blue,
    filecolor=magenta,
    urlcolor=cyan,
}
\usepackage{fancyhdr}
\headheight 14pt
\pagestyle{fancy}
\rhead{\href{http://example.com}{Goto Reports Page}}
\lhead{Page \thepage~of \pageref{LastPage}}
\chead{\href{http://willmatthews.xyz}{WillMatthews.xyz}}
\lfoot{Compiled at \currenttime, \today ~in \LaTeXe}
\cfoot{}
\rfoot{\href{http://example.com}{Disclaimer \& Legal}}
\renewcommand{\headrulewidth}{0.4pt}
\renewcommand{\footrulewidth}{0.4pt}

\begin{document}
\begin{centering}
\vspace*{1cm}
\Huge\textbf{\input{data/title.dat}}\\
\vspace{1cm}
\large\textsl{A WillMatthews.xyz Report}\\

\vspace{2cm}
\csvautotabular{data/data.csv}

\end{centering}
\end{document}

Final Thoughts

For some of the more proficient MySQL users, you may ask why we can’t use INTO OUTFILE, this is because OUTFILE can only place a file on the server itself, you cannot create a CSV directly onto a remote machine. Of course you can use scp to move the file, but this is horrendously bad practise and I do not reccomend it.

The report generator can be used with a php website to generate the form of your choice, and present the user with the pdf results, or even the CSV if you want to cut out the messy LaTeX middle-man.

Done!

You should now know how to generate a basic report!

Adknowledgements

sed command credited here.