Skip to main content

syslog-ng name-value pair naming

I was giving a lot of thought recently to the topic of naming name-value pairs in syslog-ng. Until now the only documented rule is stating somewhat vaguely that whenever you use a parser you should choose a name that has at least one dot in it, and this dot must not be the initial character. This means that names like MSG or .SDATA.meta.sequenceId are reserved for syslog-ng, and APACHE.CLIENT_IP is reserved for users.

However things became more complex with syslog-ng OSE 3.2. Let's see what sources generate name-value pairs:
  • traditional macros (e.g. $DATE); these are not name-value pairs per-se, but behave much like them, except that they are read-only
  • syslog message fields (e.g. $MSG) if the message is coming from a syslog source
  • filters whenever the 'store-matches' flag is set and the regexp contains groups
  • rewrite rules, whenever the rewrite rule specifies a thus far unknown name-value pair, e.g. set("something" value("name-value.pair"));
  • and of course parsers when you tell syslog-ng to parse an input as a CSV, or use db-parser together with the patterns produced by the patterndb project
The latest stuff generating name-value pairs is the support for process accounting logs, in this case even the syslog related fields are missing and only things like "pacct.ac_comm" (to contain the program name) are defined.

So I was thinking whether it should be "pacct.ac_comm" or ".pacct.ac_comm". With the quoted rule it should be simple: it is generated by syslog-ng itself, thus it should be in the syslog-ng namespace and should start with a dot. However in the era of syslog-ng plugins, what consists of syslog-ng at all?

First, I wanted to use "pacct.ac_comm" (e.g. without a dot), because I liked this name better. I was trying to explain myself why it would not violate the rule above. The explanation I had for myself was: I'm going to "register" names such as this in the patterndb SCHEMAS.txt file. With this - not yet published - explanation, I've committed a patch to convert the pacctformat plugin to use a dotless prefix.

Next, I was figuring that it is true that process accounting creates name-value pairs without going through patternization, but I've felt, that nothing ensures that these name-value pairs would be directly usable, when trying to analyse the logs. The patterndb concept uses tags and schemas to convert the incoming unstructured data into a consistent structure. However, pacct may not completely match what the user needs. And, in the future, when SNMP traps or SQL table polling are going to be supported, it is going to be even more true: these name-value pairs may need a conversion: from the SNMP/pacct structure to the patterndb schema described structure in order to handle these message sources consistently with regular syslog (and to make it easy to correllate these).

So at the end, I've committed another patch, this time going back to ".pacct" as a prefix and leaving the original naming rule intact. The "pacct" prefix is up to the users to use, they may want the same information in a "pacct" schema, but that may come from data not directly tied from process accounting (e.g. from syslog messages).

So this post is about doing nothing with regards to the naming policy, but I thought it'd be important to shed a light behind the scenes. Giving such decisions enough thought and coming up a with a long-term plan makes our lives much easier in the future.

This post may be a bit more involved than the others, but feel free to ask me to elaborate, if you are interested.

Comments

Popular posts from this blog

syslog-ng fun with performance

I like christmas for a number of reasons: in addition to the traditional "meet and have fun with your family", eat lots of delicious food and so on, I like it because this is the season of the year when I have some time to do whatever I feel like. This year I felt like doing some syslog-ng performance analysis. After reading Ulrich Deppert's series about stuff "What every programmer should know about memory" on LWN, I thought I'm more than prepared to improve syslog-ng performance. Before going any further, I'd recommend this reading to any programmer, it's a bit long but every second reading it is worth it. As you need to measure performance in order to improve it, I wrote a tool called "loggen". This program generates messages messages at a user-specifyable rate. Apart from the git repository you can get this tool from the latest syslog-ng snapshots. Loggen supports TCP, UDP and UNIX domain sockets, so really almost everything can be me

syslog-ng roadmap 2.1 & 2.2

We had a meeting on the syslog-ng roadmap today where we decided some important things, and I thought I'd use this channel to tell you about it. The Open Source Edition will see a 2.1 release incorporating all core changes currently in the Premium Edition and additionally the SQL destination driver. We are going to start development on the 2.2 PE features, but some of those will also be incorporated in the open source version: support for the latest work of IETF syslog protocols unique sequence numbering for messages support for parsing message contents Previously syslog-ng followed the odd/even version numbering to denote development/stable releases. I'm going to abandon this numbering now: the next syslog-ng OSE release is going to have a 2.1 version number and will basically come out with tested code changes only. The current feature set in PE were developed in a closed manner and I don't want to repeat this mistake. The features that were decided to be part of the Open

syslog-ng 3.0 and SNMP traps

Last time I've written about how syslog-ng is able to change message contents. I thought it'd be useful to give you a more practical example, instead of a generic description. It is quite common to convert SNMP traps to syslog messages. The easiest implementation is to run snmptrapd and have it create a log message based on the trap. There's a small issue though: snmptrapd uses the UNIX syslog() API, and as such it is not able to propagate the originating host of the SNMP trap to the hostname portion of the syslog message. This means that all traps are logged as messages coming from the host running snmptrapd, and the hostname information is part of the message payload. Of course it'd be much easier to process syslog messages, if this were not the case. A solution would be to patch snmptrapd to send complete syslog frames, but that would require changing snmptrapd source. The alternative is to use the new parse and rewrite features of syslog-ng 3.0. First, you need to f