
small incompatible change for 3.1

I've just committed a small incompatible change for syslog-ng 3.1, even though theoretically I shouldn't have.

The change is small: the 'store-legacy-msghdr' flag is now the default for all sources, whereas earlier you had to specify it explicitly.

To explain why I did that, here is a short description of the flag.

syslog-ng processes all incoming messages into fields (things like $PROGRAM and $DATE) and then reconstructs the message based on this parsed information when it has to write the message to a file.

Before syslog-ng 3.0 a message was split into the macros: "$DATE $HOST $MSG", which expanded to the actual log message. "$MSG" above was expanded to a line like:

"program[pid]: message"

With syslog-ng 3.0 and its integrated handling of RFC5424 and RFC3164, this format changed: an $MSGHDR macro was created for the "program[pid]: " part, and I removed that part from $MSG. (Of course, if you are running syslog-ng in compatibility mode, you get the old behaviour.) The reason is simple: RFC5424 has separate fields for program and pid.

The contents of $MSGHDR are constructed programmatically: the punctuation characters (the '[' and ']' around the pid, and the colon) are added to the format by syslog-ng, based on the information available in $PROGRAM and $PID.
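To make the reconstruction concrete, here is a minimal Python sketch of the idea. It is not syslog-ng's actual code: the function names `build_msghdr` and `format_line` are hypothetical, and the exact rules syslog-ng applies (for example when the program name is empty) are assumptions made for illustration.

```python
def build_msghdr(program, pid=None):
    """Rebuild a 'program[pid]: ' style header from already-parsed fields.

    Illustrative only: the punctuation is generated here, not copied
    from the original message.
    """
    if not program:
        return ""                      # assumption: no program, no header
    if pid:
        return f"{program}[{pid}]: "   # brackets, colon and space are added
    return f"{program}: "


def format_line(date, host, program, pid, msg):
    # Mirrors a "$DATE $HOST $MSGHDR$MSG" style template.
    return f"{date} {host} {build_msghdr(program, pid)}{msg}"


if __name__ == "__main__":
    print(format_line("Jan  1 00:00:00", "host", "sshd", "1234", "hello"))
```

Because the header is generated from the parsed fields, every message comes out with the same punctuation, whatever the sender originally wrote.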

However (and here comes the magic) there are programs that do not adhere to this format and omit the space after the colon character. E.g. if syslog-ng received:

"program:value"

as the syslog message, it added an explicit space character, and you'd get this in your log file:

"program: value"

Note the added space. This led to the workaround called "store-legacy-msghdr", which makes syslog-ng remember the original formatting of the $MSGHDR macro. However, this proved to be a performance issue, so it didn't become the default, and I let my users discover the problem and add the flag explicitly if they cared about the extra space.

syslog-ng 3.1, however, solves the performance issue (with the NVTable refactoring), and more and more people who are migrating from 2.1 or earlier run into the very same issue.
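The difference between regenerating the header and remembering it can be sketched in a few lines of Python. This is a simplified model, not syslog-ng's implementation: the regular expression, the `LEGACY_MSGHDR` field name and the `parse`/`render` helpers are all hypothetical, and real syslog parsing handles many more cases.

```python
import re

# Simplified header pattern: program, optional [pid], colon, optional space.
HEADER_RE = re.compile(r"^(?P<program>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?:\s?")


def parse(raw, store_legacy_msghdr=False):
    """Split the message part of a log line into fields (illustrative only)."""
    m = HEADER_RE.match(raw)
    if m is None:
        return {"PROGRAM": "", "PID": "", "MSG": raw, "LEGACY_MSGHDR": None}
    return {
        "PROGRAM": m["program"],
        "PID": m["pid"] or "",
        "MSG": raw[m.end():],
        # Keep the header exactly as received only when asked to.
        "LEGACY_MSGHDR": m.group(0) if store_legacy_msghdr else None,
    }


def render(fields):
    """Write the message back out, preferring the stored original header."""
    if fields["LEGACY_MSGHDR"] is not None:
        header = fields["LEGACY_MSGHDR"]
    elif fields["PROGRAM"]:
        pid = f"[{fields['PID']}]" if fields["PID"] else ""
        header = f"{fields['PROGRAM']}{pid}: "   # normalized punctuation
    else:
        header = ""
    return header + fields["MSG"]


if __name__ == "__main__":
    print(render(parse("program:value")))                            # program: value
    print(render(parse("program:value", store_legacy_msghdr=True)))  # program:value
```

Storing the original header means carrying one more string per message, which gives a feel for why the flag had a cost before the storage was reworked.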

Therefore I've decided to make 'store-legacy-msghdr' the default, and added a 'dont-store-legacy-msghdr' flag. My hope is that this works out for everyone:
  • people who cared: they already have store-legacy-msghdr set, so nothing changes for them
  • people who didn't notice: they don't have the flag, but should be better off with the original formatting
  • people who changed their parsing scripts: those are the ones this message is addressed to, as a heads-up.
I hope this post makes things clearer.

Comments

Warwick Poole said…
Balázs, is there a simple way to expand $MSG into additional fields (split according to a regex possibly), to allow more specific queries on syslog-ng data inserted into a slightly customized MySQL schema?

In other words: is there any way to natively filter $MSG data pre-insertion into MySQL via a fifo?

I have tried to search the mailing list archives but have not found a solution.

Thanks for all your work.
Anonymous said…
@Warwick:

You might want to look at csv-parser (if you have specific delimiters) or db-parser (for any pattern) to get this sorted. You can find more info on that below:

http://bazsi.blogs.balabit.com/2008/10/syslog-ng-message-parsing.html
http://bazsi.blogs.balabit.com/2009/03/as-promised-on-mailing-list-here-comes.html
