Log requests to Apache 2 using a MySQL database

Not maintained

Please note that this information is only here for reference! I do not maintain mod_log_mysql anymore.

Introduction

mod_log_mysql is a module for the Apache 2 webserver which permits request logging into a MySQL database.

Key features are:

Requirements

The modular mod_log_config.
This replacement for the original mod_log_config allows third-party log writers (like this mod_log_mysql) to hook into log data processing. It's 100% backwards compatible with the original module.
A fixed Apache Portable Runtime (for Apache prior to 2.2)
APR-util, Apache's portable utility collection, has a tiny bug: One of its functions always returns success, even if it failed. Because of this, mod_log_mysql will crash whenever a database connection needs to be established and the database server is not running in that unlucky second. This does not happen very often — but it can.
Due to the nature of the bug it is not possible to insert a work-around into mod_log_mysql, and although this bug has been fixed in APR-util with revision 58942 of apr_reslist.c, the fix did not make it into to the Apache 2.0 releases for some time. If you're running Apache 2.2, everything's fine, though.
The fix is rather easy: Just fetch the current apr_reslist.c from Apache's source repository and replace your existing apr_reslist.c (see Installation below).
Apache webserver 2.0.46 or better
The modular mod_log_config includes code from Apache's original mod_log_config to make it 100% backwards compatible. This (original) code uses a function which was introduced into the Apache core in 2.0.46 — that's why the original mod_log_config and thus also the modular mod_log_config need 2.0.46 or better.
If your Apache complains about a "missing function ap_escape_logitem()" during startup, you should update your webserver to at least 2.0.46.

Installation

Download

Build

In order to build mod_log_mysql you will have to copy mod_log_config.c and mod_log_config.h into the directory modules/loggers of your Apache source. Second, if compiling the Apache 2.0 series, do not forget to replace your old srclib/apr-util/misc/apr_reslist.c (I did tell you this, didn't I)!
Next, build your Apache as usual.
Do not forget that an update of the Apache webserver will restore Apache's original mod_log_config.c, mod_log_config.h and apr_reslist.c!

Installation as shared module is recommended so you can keep and update this third-party module separate from the server (or update the server separate from third-party modules, if you like :-).
Drop mod_log_mysql.c into a directory of your choice and use Apache's apxs tool. Example:

apxs -a -i -c -L/usr/mysql/lib -I/usr/mysql/include -lmysqlclient_r -lz mod_log_mysql.c

In case you're using a threaded Apache, make sure you link with MySQL's thread-safe client library libmysqlclient_r, not libmysqlclient. Although mod_log_mysql itself is thread-safe, the thread-safe MySQL library will give you an extra bit of security.
See How to make a threaded client in the MySQL documentation.

The argument -lz makes apxs link with zlib, a library needed by MySQL for compression. This library is most likely installed on every *nix system as it's used by many other applications (e.g. gzip and Apache's mod_deflate).
If apxs tells you something like "/usr/bin/ld: cannot find -lz", make sure you have zlib's header files zlib.h and zconf.h somewhere among your include files (/usr/include or the like).

Activate

If compiled as shared module, don't forget to add the following line to your webserver configuration (if apxs did not already):

LoadModule log_mysql_module [path to your mod_log_mysql.so, usually "modules/mod_log_mysql.so"]

Use and Configuration

Selecting mod_log_mysql as logging target

mod_log_mysql registers itself automatically with the modular mod_log_config. To select MySQL as target, use a URI of the following form in mod_mod_log_config's CustomLog or TransferLog:

mysql:[user[!password]][@[host][:port]][/database]
or
mysql:[user[!password]][@[host][:socket/path/]][/database]

Items in square brackets can be omitted.

Notice the way a Unix file system socket is handled in combination with a database, this is a bit tricky: The socket path must always end with a slash, even though it's of course not a directory. Second, the database name must start with a slash (but not end with one). However, if you enter both a socket path and a database name, you must not enter double slashes. Confused? Here are some examples:

/a/b/c
/a/b is the socket here, c is the database.
/a/b/c/
The last part of this string ends with a slash, so /a/b/c is the socket and there's no database name.
/a
There's only one part and it does not end with a slash: a is the database name and there's no socket path.
/a/
Again, there's only one part, but this time it does end with a slash: /a is the socket path and there's no database name.
/a/b//c
This is invalid, the MySQL connection will fail. Use /a/b/c (like in the first example above).

That said, mod_log_mysql reads defaults for all items of this URI from the sections [client] and [mod_log_mysql] (in that order) of your MySQL configuration file (usually /etc/my.cnf). If MySQL is set up accordingly, you can ommit at least the host and port/socket parameters. Please see your MySQL documentation to read more about configuration options for MySQL clients.

Logging

It's up to you what to store in your logs and how you do that, the pattern given to mod_mod_log_config's CustomLog or LogFormat can be any SQL command. Both CustomLog and LogFormat work the same way as they always did.
For information about the possible % directives, which log the characteristics of the request, see Custom Log Formats in the original mod_log_config documentation as well as mod_log_io, mod_ssl and possibly other modules.

There is a special argument to all string directives named mysqlname. Its use is to convert the data to insert into a valid MySQL column or table name by simply removing anything that is not a letter or a number, i. e. anything not alphanumeric. This way you can use the server name directive %v to select the table that stores the data for the host in question.

LogFormat "insert into %{mysqlname}v set host=%h,time=%t,method=%m,url=%R,status=%>s" mysql

Please note that no quotes are needed around strings since mod_log_mysql inserts these itself. This is due to the fact that mod_log_mysql may store NULL values if the requested item is not applicable, for example if the HTTP client did not send a referrer.

Configuration

There is only one configuration directive and on a stable system, you do not even need it. Cool, isn't it? :)

LogMySQLFallback [fallback dir] [fallback retry timeout]

In case the connection to the database server is down, mod_log_mysql can write the SQL commands into files, one per log target.
LogMySQLFallback sets the directory in which these files will be created. The second argument - if given - sets the time between two connection retries in seconds. The default is 30 seconds, i. e. mod_log_mysql will only try to reconnect every 30 seconds.

mod_log_mysql will leave a message in the server's ErrorLog whenever the connection fails and when the connection comes back. The module will not automatically send the data it wrote into the fallback files to the database if the connection is up again. This might change in future releases, but for now you can simply parse the files with the mysql command line tool:

mysql -p -D httpd < /var/log/apache2/bitbrook_de

As mentioned above, mod_log_mysql uses one file per log target. The format is simply the log URI given to CustomLog or TransferLog with any unsafe characters replaced by an underscore.

Comments and Questions?

Feedback and comments welcome, please use the contact form or the email address at http://kino-fahrplan.de/intern/kontakt.