small incompatible change for 3.1

I've just committed a small incompatible change for syslog-ng 3.1, even though theoretically I shouldn't have.

The change is not big: the 'store-legacy-msghdr' flag simply became the default for all sources, whereas earlier you had to specify it explicitly.

In order to understand why I did that, a short description of the flag follows below.

syslog-ng processes all incoming messages into fields (things like $PROGRAM and $DATE) and then reconstructs the message based on this parsed information when it has to write the message to a file.

Before syslog-ng 3.0 a message was split into the macros: "$DATE $HOST $MSG", which expanded to the actual log message. "$MSG" above was expanded to a line like:

"program[pid]: message"

With syslog-ng 3.0 and its integrated handling of RFC5424 and RFC3164, this format changed: an $MSGHDR macro was created for the "program[pid]: " part, and I removed that part from $MSG. (Of course, if you are running syslog-ng in compatibility mode, you get the old behaviour.) The reason is simple: RFC5424 has separate fields for program/pid.

The contents of $MSGHDR are constructed programmatically, e.g. the punctuation characters '[' and ']' around the pid, as well as the colon, are added to the format by syslog-ng, based on the information available in $PROGRAM and $PID.
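
To make the reconstruction concrete, here is a minimal Python sketch of how such a header could be assembled from the parsed fields. It is only an illustration: the function name `build_msghdr` is hypothetical, and this is not syslog-ng's actual (C) implementation.

```python
def build_msghdr(program: str, pid: str = "") -> str:
    """Rebuild a 'program[pid]: ' style header from parsed fields.

    Hypothetical helper for illustration only; the punctuation and the
    trailing space are always supplied by the formatter, never by the sender.
    """
    if not program:
        return ""
    if pid:
        return f"{program}[{pid}]: "
    return f"{program}: "


print(repr(build_msghdr("sshd", "4376")))
print(repr(build_msghdr("program")))
```

Note that the trailing space comes from the formatter, not from the original message, which is exactly what matters in the case described next.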

However (and here comes the magic) there are programs that do not adhere to this format and omit the space after the colon character. E.g. if syslog-ng received:

"program:value"

as the syslog message, it added an explicit space character, and you'd get this in your log file:

"program: value"

NOTE the added space. This resulted in the workaround called "store-legacy-msghdr", which made syslog-ng remember the original formatting of the MSGHDR macro. However, this proved to be a performance issue, so it didn't become the default, and I let my users discover this problem and add the flag explicitly if they cared about the extra space.

syslog-ng 3.1, however, solves the performance issue (with the NVTable refactoring), and more and more people who are migrating from 2.1 or earlier run into the very same issue.
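
The difference between the two behaviours can be sketched in a few lines of Python. This is a toy model under stated assumptions, not syslog-ng's real parser: the regular expression, the `parse` and `format_message` functions, and the `LEGACY_MSGHDR` key are all hypothetical names chosen for the illustration.

```python
import re

# Hypothetical pattern: "program", optional "[pid]", a colon, and whatever
# whitespace (possibly none) the sender put after the colon.
HEADER = re.compile(r"^(?P<program>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?:\s*")


def parse(payload: str, store_legacy_msghdr: bool) -> dict:
    """Split the payload into fields, optionally keeping the raw header."""
    m = HEADER.match(payload)
    if m is None:
        return {"PROGRAM": "", "PID": "", "MSG": payload, "LEGACY_MSGHDR": None}
    return {
        "PROGRAM": m["program"],
        "PID": m["pid"] or "",
        "MSG": payload[m.end():],
        # Remember the header exactly as received only when asked to.
        "LEGACY_MSGHDR": m.group(0) if store_legacy_msghdr else None,
    }


def format_message(fields: dict) -> str:
    """Write the message back out, preferring the stored header if present."""
    if fields["LEGACY_MSGHDR"] is not None:
        return fields["LEGACY_MSGHDR"] + fields["MSG"]
    if not fields["PROGRAM"]:
        return fields["MSG"]
    pid = f"[{fields['PID']}]" if fields["PID"] else ""
    return f"{fields['PROGRAM']}{pid}: {fields['MSG']}"


print(format_message(parse("program:value", store_legacy_msghdr=False)))
print(format_message(parse("program:value", store_legacy_msghdr=True)))
```

In this model, a well-formed header such as "sshd[4376]: " comes out the same either way; only messages that deviate from the usual punctuation are affected by the flag.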

Therefore I've decided to make 'store-legacy-msghdr' the default, and added a 'dont-store-legacy-msghdr' flag. My hope is that this works out as follows:
  • people who cared: they already have store-legacy-msghdr set, so nothing changes for them
  • people who didn't notice: they don't have the flag, but should be better off with the original formatting
  • people who changed their parsing scripts: well, those are the ones I address this message to as a heads-up.
I hope this post makes things clearer.

Comments

Warwick Poole said…
Balázs, is there a simple way to expand $MSG into additional fields (split according to a regex possibly), to allow more specific queries on syslog-ng data inserted into a slightly customized MySQL schema?

In other words: is there any way to natively filter $MSG data pre-insertion into MySQL via a fifo?

I have tried to search the mailing list archives but have not found a solution.

Thanks for all your work.
Anonymous said…
@Warwick:

You might want to look at csv-parser (if you have specific delimiters) or db-parser (for any pattern) to get this sorted. You can find more info on those below:

http://bazsi.blogs.balabit.com/2008/10/syslog-ng-message-parsing.html
http://bazsi.blogs.balabit.com/2009/03/as-promised-on-mailing-list-here-comes.html
