Skip to main content

syslog-ng name-value pair naming

I was giving a lot of thought recently to the topic of naming name-value pairs in syslog-ng. Until now the only documented rule is stating somewhat vaguely that whenever you use a parser you should choose a name that has at least one dot in it, and this dot must not be the initial character. This means that names like MSG or .SDATA.meta.sequenceId are reserved for syslog-ng, and APACHE.CLIENT_IP is reserved for users.

However things became more complex with syslog-ng OSE 3.2. Let's see what sources generate name-value pairs:
  • traditional macros (e.g. $DATE); these are not name-value pairs per-se, but behave much like them, except that they are read-only
  • syslog message fields (e.g. $MSG) if the message is coming from a syslog source
  • filters whenever the 'store-matches' flag is set and the regexp contains groups
  • rewrite rules, whenever the rewrite rule specifies a thus far unknown name-value pair, e.g. set("something" value("name-value.pair"));
  • and of course parsers when you tell syslog-ng to parse an input as a CSV, or use db-parser together with the patterns produced by the patterndb project
The latest stuff generating name-value pairs is the support for process accounting logs, in this case even the syslog related fields are missing and only things like "pacct.ac_comm" (to contain the program name) are defined.

So I was thinking whether it should be "pacct.ac_comm" or ".pacct.ac_comm". With the quoted rule it should be simple: it is generated by syslog-ng itself, thus it should be in the syslog-ng namespace and should start with a dot. However in the era of syslog-ng plugins, what consists of syslog-ng at all?

First, I wanted to use "pacct.ac_comm" (e.g. without a dot), because I liked this name better. I was trying to explain myself why it would not violate the rule above. The explanation I had for myself was: I'm going to "register" names such as this in the patterndb SCHEMAS.txt file. With this - not yet published - explanation, I've committed a patch to convert the pacctformat plugin to use a dotless prefix.

Next, I was figuring that it is true that process accounting creates name-value pairs without going through patternization, but I've felt, that nothing ensures that these name-value pairs would be directly usable, when trying to analyse the logs. The patterndb concept uses tags and schemas to convert the incoming unstructured data into a consistent structure. However, pacct may not completely match what the user needs. And, in the future, when SNMP traps or SQL table polling are going to be supported, it is going to be even more true: these name-value pairs may need a conversion: from the SNMP/pacct structure to the patterndb schema described structure in order to handle these message sources consistently with regular syslog (and to make it easy to correllate these).

So at the end, I've committed another patch, this time going back to ".pacct" as a prefix and leaving the original naming rule intact. The "pacct" prefix is up to the users to use, they may want the same information in a "pacct" schema, but that may come from data not directly tied from process accounting (e.g. from syslog messages).

So this post is about doing nothing with regards to the naming policy, but I thought it'd be important to shed a light behind the scenes. Giving such decisions enough thought and coming up a with a long-term plan makes our lives much easier in the future.

This post may be a bit more involved than the others, but feel free to ask me to elaborate, if you are interested.

Comments

Popular posts from this blog

syslog-ng fun with performance

I like christmas for a number of reasons: in addition to the traditional "meet and have fun with your family", eat lots of delicious food and so on, I like it because this is the season of the year when I have some time to do whatever I feel like. This year I felt like doing some syslog-ng performance analysis. After reading Ulrich Deppert's series about stuff "What every programmer should know about memory" on LWN, I thought I'm more than prepared to improve syslog-ng performance. Before going any further, I'd recommend this reading to any programmer, it's a bit long but every second reading it is worth it. As you need to measure performance in order to improve it, I wrote a tool called "loggen". This program generates messages messages at a user-specifyable rate. Apart from the git repository you can get this tool from the latest syslog-ng snapshots. Loggen supports TCP, UDP and UNIX domain sockets, so really almost everything can be me...

syslog-ng OSE 2.1 released

I have just uploaded the first release in the syslog-ng Open Source Edition 2.1 branch to our website. It is currently only available in source format at this location: http://www.balabit.com/downloads/files/syslog-ng/sources/2.1/src This release synchronizes the core of syslog-ng to the latest PE version and adds the SQL destination driver. This is an alpha release and thus might be rough around the edges, but it basically only contains code already tested in the context of the Premium Edition. The SQL functionality requires a patched libdbi package, which is available at the same link. We're going to work on integrating all our libdbi related patches to the upstream package. If you want to know how the SQL logging works, please see the Administrator's Guide or our latest white paper Collecting syslog messages into an SQL database with syslog-ng. The latter describes the Premium Edition, but it applies to the Open Source one equally well.

syslog-ng 3.2 changes

I've just pushed a round of updates to the syslog-ng 3.2 repository, featuring some interesting stuff, such as: SQL reorganization: Patrick Hemmer sent in a patch to implement explicit transaction support instead of the previous auto-commit mode used by syslog-ng. I threw in some fixes and refactored the code somewhat. Configuration parser changes: the syntax errors produced by syslog-ng became much more user-friendly: not only the column is displayed, but also the erroneous line is printed and the error location is also highlighted. Additional plugin modules were created: afsql for the SQL destination, and afstreams for Solaris STREAMS devices. Creating a new plugin from core code takes about 15 minutes. I'm quite satisfied. With the addition of these two modules, it is now possible to use syslog-ng without any kind of runtime dependency except libc. The already existing afsocket module (providing tcp/udp sources & destinations) is compiled twice: once with and once withou...