Friday, December 18, 2009

syslog-ng OSE 3.1beta2 release

I mentioned it briefly in my previous post, but here's a more official announcement: I've released syslog-ng OSE 3.1beta2, which contains some important bugfixes.

The list of changes: http://www.balabit.com/downloads/syslog-ng/open-source-edition/3.1beta2/changelog-en.txt

Thanks to Martin Holste for the feedback he provided. Hopefully we can forget about the "beta" part soon.



Patterndb release for syslog-ng 3.1

You probably know that starting with syslog-ng 3.0, we began poking into the message payload: syslog-ng can extract information from log messages and use it in structured form for message routing, filtering, and storing it as separate fields in a database table.

You may have read about patterndb on this blog or on Marci's blog, and we have also given talks about it at different conferences: NNM 2009 and LSM/RMLL 2009.

The reason I'm raising the topic here again is that we have now released about 8000 patterns covering about 200 applications for patterndb and are now in the process of creating a community site to maintain this database.

You can download the database from www.balabit.com.

It is also important to know that syslog-ng OSE 3.1 features enhanced performance when handling information extracted from the message payload, and it supports the latest patterndb database format. So if you want to try the new database, fetch a copy of the latest 3.1beta2 release.

Thursday, December 03, 2009

syslog-ng OSE 3.1beta1 released

I'm proud to announce that syslog-ng OSE 3.1beta1 has been released and uploaded to our webserver. This version is new in two ways:

1) of course it has new features, see below for the most interesting bits

2) it is a "feature release", which means that once syslog-ng 3.2 or syslog-ng 4.0 is released, support for this release will end. See our new version policy at this link:

https://www.balabit.com/network-security/syslog-ng/opensource-logging-system/roadmap.bbx

Since the documentation is not yet up to date with this beta release, I'll try to include the most crucial information about the new features right here in this announcement.

For those in a hurry, here's a link to the source code:

https://www.balabit.com/downloads/files/syslog-ng/open-source-edition/3.1beta1/source/syslog-ng_3.1beta1.tar.gz

And here are the binaries for Linux/FreeBSD systems:

https://www.balabit.com/network-security/syslog-ng/opensource-logging-system/

Select the Downloads tab, and in the Version selector select 3.1beta1.

Please try this beta version. Any feedback, positive or negative, is appreciated. If you have comments, please post them on the mailing list at: syslog-ng@lists.balabit.hu

And now the new features in this release:

Support for patterndb v3

syslog-ng 3.1 now supports the patterndb v3 format, along with a bunch of new parsers: ANYSTRING, IPv6, IPvANY and FLOAT.

Patterndb (more exactly the db-parser()) is a high performance message classifier and information extraction tool that makes it easy to get away from the unstructured nature of syslog.

Patterndb has evolved since it was first introduced in syslog-ng 3.0. It is at its 3rd iteration, hopefully slowly reaching its final form. syslog-ng OSE 3.0 supported v1, our SSB product supports v2, and syslog-ng OSE 3.1 is now the first version supporting v3.

Patterndb in general and the v1 format database are described in the syslog-ng manual at:

http://www.balabit.com/dl/html/syslog-ng-admin-guide_en.html/ch02s12.html

The XML schemas that describe the different patterndb versions are available in the syslog-ng source tree:

http://git.balabit.hu/?p=bazsi/syslog-ng-3.1.git;a=tree;f=doc/xsd;hb=HEAD

The changes in the patterndb format as they evolved were described in Marton Illes's blog at

http://marci.blogs.balabit.com/2009/06/new-db-parser-format-and-other.html

But see the other related posts as well.

Old patterndb databases can be converted to the new format by putting them in the /opt/syslog-ng/etc/patterns.d directory and running the pdbtool utility with the command:

$ pdbtool merge -p /opt/syslog-ng/var/patterndb.xml \
-D /opt/syslog-ng/etc/patterns.d


This assumes that the installation prefix of syslog-ng is /opt/syslog-ng. The above filenames are also the recommended and default names for patterndb related files.

Some v2 format patterns are distributed by BalaBit itself for its SSB product, download location:

https://www.balabit.com/downloads/files/patterndb/1.0-20081117/patterndb/

You can convert these db files using pdbtool as described above.

Work is ongoing to publish a more comprehensive patterndb, but more on that in a separate post.

pdbtool

A new "pdbtool" utility was added to manage patterndb files: it can convert them from the v1 or v2 format, merge multiple patterndb files into one, and look up the patterns matching a specific message.

See the manpage (by adding /opt/syslog-ng/share/man in the MANPATH) and Marci's post:

http://marci.blogs.balabit.com/2009/08/db-parser-new-utility-pdbtool.html

Message tags

Support for message tags was added: tags can be assigned to log messages as they enter syslog-ng, either by the source driver or via patterndb. Later these tags can be used for efficient filtering.

http://marci.blogs.balabit.com/2009/05/tag-support-in-syslog-ng.html

Rewrite structured data

Earlier, structured data fields in the new RFC5424 style syslog protocol were read-only values that could be referenced in a template, but they couldn't be changed, nor was it possible to add new fields to an already existing syslog message.

All of this is now possible using the same syntax that didn't work earlier, e.g.

rewrite r_sd { set("55555" value(".SDATA.meta.sequenceId")); };

Macro and name-value integration

Macros and name-value pairs got a little tighter integration. syslog-ng 3.0 was limited in the use of macros in the value() option of the match() filter: it could only use name-value pairs, although intuitively it should have supported macros as well. This was changed: starting with 3.1 it is now possible to use macros as well.

The following now works:

match("regexp" value("R_DATE"));

syslog-ng now warns you if you use the '$' prefix in the value syntax, because you can't use the full template syntax when you specify a value to match against.

Name-value pair performance improvements

With the advent of patterndb and the spreading use of name-value pairs in syslog-ng, a strong limitation was the performance penalty of using dynamically created name-value pairs. This is now solved: 3.1 features a new data structure for storing the message payload and name-value pairs, which results in 3 times better performance when looking up a name-value pair.

Patterndb parser enhancements

Some parsers got additional features: NUMBER is now able to parse hexadecimal numbers, ESTRING is now able to search for a sequence of characters as the end of the string. These changes make it easier to describe log messages in patterns.

Information about non-portable facilities

Added non-standard and non-portable facility codes (range 10-15), and decoupled the syslog-ng facility name database from the system used to compile syslog-ng.

Until now, the facility codes as understood by syslog-ng were dependent on the platform syslog-ng was compiled on. This is no longer true: syslog-ng comes with its own "facility" code assignments, based on the RFC, plus some non-standard values found on various UNIX systems.
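
To make the facility numbering concrete, here is a minimal Python sketch (not syslog-ng code) that decodes the PRI field of a syslog message into a facility and a severity, using the facility * 8 + severity encoding from the syslog RFCs. The short names for codes 0-11 and 16-23 follow common syslog.h keywords; the labels for 12-15 are placeholders I made up from the RFC 5424 descriptions, and the names syslog-ng itself uses may differ.

```python
import re

# Index = facility code. Codes 10-15 are the range the post calls non-portable.
# Labels for 12-15 are illustrative placeholders, not syslog-ng's own names.
FACILITY_NAMES = [
    "kern", "user", "mail", "daemon", "auth", "syslog", "lpr", "news",
    "uucp", "cron",                                   # 0-9
    "authpriv", "ftp", "ntp", "log-audit", "log-alert", "clock2",  # 10-15
    "local0", "local1", "local2", "local3",
    "local4", "local5", "local6", "local7",           # 16-23
]

PRI_RE = re.compile(r"<(\d{1,3})>")


def decode_pri(line):
    """Return (facility_name, severity) for a line starting with <PRI>, else None."""
    m = PRI_RE.match(line)
    if m is None:
        return None
    # PRI = facility * 8 + severity, so divmod splits it back apart.
    facility, severity = divmod(int(m.group(1)), 8)
    if facility >= len(FACILITY_NAMES):
        return None
    return FACILITY_NAMES[facility], severity


print(decode_pri("<86>Dec  3 10:00:00 host sshd[1]: test"))
```

A table like this, shipped with the program instead of taken from the build host's headers, is what makes the decoding independent of the platform.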

Monday, September 21, 2009

The neatest syslog-ng hack ever

One of my colleagues probably went a little crazy and implemented a twitter() destination driver for syslog-ng. Although the value is dubious, I think it is the neatest contribution to syslog-ng so far. :)

Sunday, August 09, 2009

syslog-ng 3.1 status

Like I announced in one of my previous posts, towards the syslog-ng OSE 4.0 release I'm going to make smaller, short-term supported releases. The first of these, called syslog-ng 3.1 is nearing completion, and thus a status report is due.

Here's the original plan (quoting the roadmap page here):
  • support tags for syslog messages: each message can be marked with one or more tags, then apply filtering based on tags
  • patterndb: add tag support
  • patterndb: v2 database format support
  • patterndb: add parsers for IPv6 addresses and hex numbers
  • converge macros in templates and name-value pairs even more (right now it is not possible to use any macro in match())

I've just pushed out another set of updates to our git repository, which:
  • adds tag support: a new tags() filter and a tags() option for all sources and a builtin logic to assign the syslog-ng source name as a tag (in the format: .source.)
  • adds support for patterndb v2 and a newly introduced but compatible v3 format
  • adds "pdbtool" a new utility for managing patterndb files (not yet complete)
  • a couple of new parsers (IPv6, ANYSTRING, FLOAT)
The last item in the roadmap is not yet addressed, in fact I haven't even started it yet. I'm thinking about leaving that out altogether in order to have 3.1 released as soon as possible. If you have an opinion about that please don't hesitate to post it here on the mailing list.

If you are experimenting with patterndb you are advised to use the 3.1 branch as development happens here. Of course if we find something that affects our current stable 3.0, I'm backporting the fix, but since 3.0 is stable, I'm only backporting bugfixes and not new functionality.

If you are interested you can get the sources via git, or if you prefer a tarball, just drop me an email.

Tuesday, August 04, 2009

Developer tools

BalaBit has grown quite a lot in the 9 years since it was founded: these days there are about 60 employees, and more than 50% of them work in development (give or take a couple, I lost count some time ago). As we currently work on 4 products and support 5-6 CPU architectures and a host of different operating systems, automation in development is a must.

We try to automate everything and that means a lot. Some examples:
  • preparing the development workstation for development/testing work in 15 minutes for any of our products
  • building source code for tens of CPU/OS combinations by issuing a single command
  • creating bundles of intermediate components when generating setup packages
  • doing releases
  • test automation
  • and a host of other things
Some of these solutions are completely our own development, others are derived from public projects, and as BalaBit tries hard to be a good friend of Free, Libre and Open Source Software (FLOSS) we try to contribute back to projects that we use.

A couple of weeks ago, I published our modified version of dogtail, a test automation framework for AT-SPI based applications. We maintain our own dogtail in-house and since our patches were not accepted, we published our changes in a public git repository.

Earlier, one of our developers contributed to WAF to support building with Microsoft Visual C++; we've been using his work in two of our internal projects.

And this time, we published cccl, a wrapper for MSVC that makes it compatible with the gcc command line, in order to compile autoconf based projects under MSVC.

As you could guess, BalaBit is primarily a UNIX/Linux shop, but we need to support products aimed at Microsoft Windows as well. With some heavy lifting, however, combining the best of both worlds is possible. And we've never been afraid of challenges. :)

Hopefully you can use some of these results, maybe even contribute back.

Thursday, July 16, 2009

patterndb updates pushed in syslog-ng OSE 3.1

According to the plan of my recently published syslog-ng OSE roadmap, I've worked on integrating the various patterndb related fixes/enhancements in the syslog-ng OSE 3.1 tree.

This means that OSE 3.1 is now capable of working with all the version 2 style pattern databases that syslog-ng Store Box is using. Here is a link for the SSB patterns: http://www.balabit.com/downloads/files/patterndb/1.0-20081117/patterndb/

I still need to work on integrating the new tags framework and the integration between tags and patterndb. Once that is done, I only have one item left for the 3.1 feature release.

So with some luck, we can have a new shiny syslog-ng OSE release this summer.

Please note that this is not released code yet and is only available via git, however if there's demand, I'm willing to create an alpha release (with binaries) if you want to try it. Just drop me an email, or simply write a comment to this post, and I'm going to create one for you.

Stay tuned.

Wednesday, July 08, 2009

syslog-ng rewrite use case: dpkg logs

One of my colleagues (Péter Höltzl, he does all our training) has created a nice detailed example of how to use the parser/rewrite framework to pull yet another application into syslog: dpkg, the Debian package manager.

If you are interested in what rewrite/parser can do for you, but didn't have the time to find out, the blog post is worth a read.

Friday, June 19, 2009

syslog-ng pipelines

The other day someone wanted a special syslog-ng macro that would expand into a digit that changes every 5 seconds (e.g. R_UNIXTIME % 5), and although I couldn't give an exact solution to his problem, I came up with this configuration snippet:

rewrite p_date_to_values {
set("$R_DATE", value("rdate"));
};

filter f_get_second_chunk {
match('^... .. [0-9]+:[0-9]+:(?<rdate.second_tens>[0-9])[0-9]$'
type(pcre) value('rdate'));
};

The way it works is as follows:
  • the rewrite statement sets the name-value pair named "rdate" to $R_DATE (the macro)
  • the filter statement uses Perl Compatible Regular Expressions to parse the "rdate" value and uses a named subpattern on the tens of seconds position to store that character in a value named "rdate.second_tens"
  • Later on in the configuration you can use "rdate.second_tens" just like any other macro/value.
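
The named-subpattern trick can be tried outside syslog-ng as well. The following is a minimal Python sketch, not syslog-ng code: it assumes $R_DATE expands to the classic "Mon DD HH:MM:SS" form, and the group is called second_tens because Python spells named groups as (?P<name>...) and does not allow dots in group names.

```python
import re

# Same shape as the PCRE in f_get_second_chunk; only the named-group
# spelling and the group name differ (see the assumptions above).
R_DATE_RE = re.compile(r"^... .. [0-9]+:[0-9]+:(?P<second_tens>[0-9])[0-9]$")


def second_tens(rdate):
    """Return the tens-of-seconds digit of an R_DATE style string, or None."""
    m = R_DATE_RE.match(rdate)
    return m.group("second_tens") if m else None


print(second_tens("Feb 24 11:55:22"))
```

Note that this digit changes every 10 seconds rather than every 5, which is why the configuration above is only an approximation of what was asked for.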
This proves that the current rewrite/parser/filter subsystems are really powerful, however even though this proved to be possible, there are some lessons learned from this example:
  • the macro and name-value spaces should really converge with each other; this would mean that the match() filter could directly match against the macro value $R_DATE without the need for the separate rewrite statement
  • when you are after a given goal, you don't really want to differentiate rewrite/parser/filter rules at all. The current syntax of using separate blocks for separate type of log processing elements is a pain.
So I'm thinking about inventing yet another block, which simply wouldn't care what kind of processing element is added to it, something along the lines:

pipeline rdateseconds {
set("$R_DATE", value("rdate"));
match('^... .. [0-9]+:[0-9]+:(?<rdate.second_tens>[0-9])[0-9]$'
type(pcre) value('rdate'));

};

And then:

log {
source(src);
pipeline(rdateseconds);
destination(dst);
};


Maybe I should even allow the creation of rewrite/parser/filter elements right there in the log statement:

log {
source(src);
filter(facility(mail));
destination(dst);
};


What do you think?

Wednesday, June 03, 2009

Nordic Meet on Nagios 2009

I'm sitting at NMN 2009 right now, and although the event title says it is a Nagios meet, I'm going to give a presentation on syslog-ng and the new features that 3.0 brings and an example on how to integrate syslog-ng and Nagios.

If you are here and have a question just feel free to find me in the "BalaBit" T-Shirt. :) There's also live streaming on the conference website, so you can catch me at 15:50 Central European Time.

Saturday, May 30, 2009

syslog-ng 4.0 roadmap plus release policy changes

I've updated the syslog-ng OSE roadmap on the syslog-ng webpage to include information about the upcoming syslog-ng version:

http://www.balabit.com/network-security/syslog-ng/opensource-logging-system/roadmap/

Also, I'd like to bring the changed release/support policy to your attention, which you can read at the same location above. I'd like to introduce stable track and feature track releases, the first being supported for a long time, whereas feature track releases are only supported until the next feature/stable release is published. When a sufficient number of features have been published via feature track releases, the last one becomes stable and the cycle continues. Note that feature releases are NOT development snapshots, they are releases just like the major versions previously; the only difference is that instead of a large feature list like with syslog-ng 3.0, only a smaller set of changes is included.

This makes it possible to publish features more often, always concentrating on a few of them at a time, instead of doing development for a long time and coming out with a feature packed release. I hope to increase the pace of syslog-ng development with this change and also to cause fewer problems for users who prefer stability over features. Please read the details on the roadmap page.

I've also opened the syslog-ng 3.1 repository and pushed it to our git server. Right now there are no differences (except for the version number) between 3.0 and 3.1, I'm planning to integrate Marton's message tagging and patterndb changes as soon as possible (his git tree is here). Hopefully the 3.1 cycle will be quite short as most of the things on the roadmap are already implemented, although scattered around in various public and private trees.

With the opening of the 3.1 branch, I'm also obsoleting 2.0 (in the new support model two stable track versions are supported at any given time and we have 2.0, 2.1 and 3.0 right now), but that'll go in a separate post/announcement.

Friday, May 08, 2009

syslog-ng OSE 3.0.2 released

After a long time and a lot of accumulated bugfixes, I've pressed the "release" button and syslog-ng OSE 3.0.2 was published on our website. It is the first official version to feature binary packages for Linux and BSD platforms. Since a long time passed between 3.0.1 and 3.0.2, the changelog is quite large; however, most of it is bugfixes, with only some minor enhancements here and there.

Hopefully I didn't miss any important bugs and problems. It should be much better stability/functionality wise than 3.0.1 was.

The diffstat since 3.0.1:
150 files changed, 4332 insertions(+), 3000 deletions(-)

You can also check the patches in our git repository.

If you are using the 3.0 branch, I really recommend checking out this release. If you are using anything earlier than 3.0, I also recommend upgrading: syslog-ng 3.0 is revolutionary compared to previous versions in many ways, especially if you want to do more with your logs than merely store them in a plain text file.

OSDC 2009 slides

I've uploaded my OSDC 2009 presentation slides to
http://people.balabit.hu/bazsi/slides/osdc-2009-syslog-ng-3.0.odp
They include an example of processing iptables logs with db-parser() and putting the results in a customized SQL table.

Sunday, May 03, 2009

Nordic Nagios Meet 2009

I'm going to give a talk on syslog-ng at the upcoming Nordic Nagios Meet 2009. I expect the event to be great fun, just like last year. If you are in the Nordic region and use Nagios, rrdtools or syslog-ng, I recommend paying a visit, as you can meet the primary authors and some active contributors to these projects.

If you are there and have anything to ask/talk about syslog-ng, just feel free to approach me, I'm probably going to wear a badge, so you can recognize me :)

OSDC 2009 and syslog-ng automatic testing

I've spent the last week in the nice city of Nuremberg where Open Source Data Center Conference took place, organized by Netways AG. I really liked the talks about Puppet, DRBD and the description of the booking.com infrastructure which runs MySQL.

Although I really enjoyed the conference I also had some free time to improve the automatic test program for syslog-ng, which now also covers TLS encrypted source and SQL destinations. I've also implemented a small script to collect coverage data of the testcases, thus right now I know that about 63% of syslog-ng is covered by automatic tests. (initially it was 55% but there were some low hanging fruits). I expect to raise this number easily to around 80%, then it'll probably become much more difficult to increase it further as the rest is error processing paths, and unless I come up with something to inject errors from the testcases those are difficult to test.

Of course having a test suite is not a replacement for real-life, field testing, but nevertheless it makes it much easier to do releases as it ensures that no important functionality is broken completely.

Based on this test infrastructure I'm going to release 3.0.2, after which I'll probably change the way I manage releases for syslog-ng, but I'll talk about that in a forthcoming post.

My son is 7 weeks old



The reason I was absent from this blog in the last couple of weeks is my now 7 weeks old son, Dániel. You can find a picture of him right here in the post, and some additional ones in my Picasa albums.

Monday, March 23, 2009

Features that fell off the radar

I have long been pondering the problem that it is quite tricky to enter regexps into the syslog-ng configuration file: if you enclose the string in double quotes (e.g. in ""), the backslash character needs to be escaped.

Since backslash is used in regexps quite often, it can become cumbersome to enter regexps like:

match("[a-z\\-]+");

Note that the backslash is doubled because otherwise the syslog-ng string parser would pass the sequence to the regexp compiler as "[a-z-]+", which certainly differs in meaning from what the above expression says.

I always remembered that syslog-ng also supports single quotes (aka apostrophes), but I remembered they behaved just as if you used normal quotation marks. Therefore I was thinking about a 3rd string format, one that would not require escaping.

However I was reading the related code the other day, and found that apostrophes work exactly the way I planned this 3rd string syntax to behave: not to get in the way when entering regexps. In fact it behaves just like apostrophes in the UNIX shells. It does not care about escaping, it only cares about the terminating apostrophe.
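
The difference between the two quoting styles can be modelled in a few lines. This is a simplified Python sketch of the behaviour described above, not the actual syslog-ng lexer: it assumes that inside double quotes a backslash is dropped and the following character is kept literally (special escapes such as \n are ignored here), while inside apostrophes the text is taken as-is.

```python
def unquote_double(body):
    """Model of double-quoted strings: a backslash escapes the next character."""
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            # Drop the backslash, keep the character that follows it.
            out.append(body[i + 1])
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)


def unquote_single(body):
    """Model of apostrophe-quoted strings: no escaping, the text is kept as-is."""
    return body


# What the regexp compiler would receive in each case:
print(unquote_double(r"[a-z\\-]+"))   # doubled backslash in double quotes
print(unquote_double(r"[a-z\-]+"))    # single backslash in double quotes
print(unquote_single(r"[a-z\-]+"))    # single backslash in apostrophes
```

Under these assumptions, the apostrophe form hands the regexp over untouched, so a pattern can be pasted in exactly as it would be written for the regexp engine.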

I have dealt with a lot of regexp related questions on the mailing list, and the root cause of the problems was most often this escaping stuff. I never realized that the proper answer and behaviour is already in syslog-ng; I had just forgotten about it completely.

And now as I check the documentation for syslog-ng, it does not mention this syntax either, even though it was already present back in the 1.6.x days.

So if you had trouble writing lots of regexps in syslog-ng configuration, and I told you to properly escape your regexps, please forgive me. syslog-ng is better than I've thought :)

Monday, March 16, 2009

Newborn baby

About two weeks past his due date, my son was born yesterday evening at 22:45 CET. He weighs 3270 g and is 56 cm long. Both the mother and the child are fine and I'm a proud new father.

I guess this starts a new chapter in my life, hopefully for the better.

Saturday, March 14, 2009

syslog-ng OSE binary packages

I'm happy to announce that BalaBit has decided to make the binary packages for syslog-ng OSE available for free.

As you may know, BalaBit has various syslog-ng support packages, and as a part of this service it prepared binary installation packages for different platforms. Access to these packages required either a support contract or a separate purchase for a yearly fee.

With syslog-ng 3.0, the binary packages for syslog-ng OSE will become freely accessible.

Since syslog-ng is an open source project, BalaBit planned to finish this task in the Open Source spirit: open and visible to all community members. This also means that the set of packages published with this e-mail is NOT yet release grade; rather, it is more of a development snapshot of the current state of affairs. So please don't ruin your production systems with these packages: it is more advisable to try them in a test environment (chroot or a dedicated test machine).

With all these said, here is the link:

https://www.balabit.com/network-security/syslog-ng/opensource-logging-system/upgrades/

Please pick the release named "3.0HEAD". This contains a source snapshot (effectively git from two days ago), and a set of packages for SUSE 10, RHEL4/5, FreeBSD 6.x, Debian etch, and Linux generic.

The binary packages contain all runtime dependencies needed to run syslog-ng, thus no further packages are required, it is an all-in-one package. The rpm/deb packages are prepared the same, they install syslog-ng in /opt/syslog-ng in order to avoid clashes with a system supplied syslog-ng daemon.

There are two install kits for each platform:
  • one that includes database drivers (dubbed as "server")
  • one that does not include database drivers (dubbed as "client")

Currently there are no other differences between the packages, but later on there might be.

With the current infrastructure in place, I'm confident that with each syslog-ng OSE release, I can publish the source AND binary packages at the same time.

I'd really appreciate success/failure reports and also any kind of comment you may have.

I'd like to release 3.0.2 together with its binary packages, let's hope that I get enough feedback on these packages so that I can do that.

Enjoy!

Wednesday, March 11, 2009

First IETF syslog-protocol related question

I'm happy as I've received the first question about the new IETF specified syslog-protocol support. There's a need for that after all :)

Next event on the horizon

I didn't realize it is already that time of the year, but I was reminded that I'm going to give a talk on syslog-ng 3.0 at the Open Source Data Center conference in Nürnberg, Germany at the end of April. I'm going to talk about the nifty new features of syslog-ng 3.0.

It would be very nice to meet syslog-ng users there. :)

Tuesday, March 03, 2009

An introduction to db-parser()

As promised on the mailing list here comes a short description of the new db-parser functionality of syslog-ng. For an introduction to parsers in general see my previous blog post here.

The aim for db-parser is two-fold:
  • extract interesting information from a log message
  • attach tags to a log message for later classification.
For instance here's a log sample (lines broken for readability):

Feb 24 11:55:22 bzorp sshd[4376]: Accepted password for bazsi \
from 10.50.0.247 port 42156 ssh2


This message states that a user named "bazsi" has logged into the host named "bzorp" using SSH2 from the quoted IP and port. When you read this message as a human, the event that happened is perfectly clear. However if it is not a human, but a piece of software that has to make out the meaning of the message, you need to identify the event (e.g. that a user login has happened) and the additional information associated with the event (e.g. that he used 10.50.0.247 as the client).

If I wanted to express this as name-value pairs, it would be something like this:

event="user login", protocol="ssh2", \
client="10.50.0.247:42156", method="password"

Surely this latter form is easier to analyze than the first. So the first step of all kinds of log analysis is to extract information from messages. At first glance, the easiest way to extract this information is the use of regular expressions. For example:

^\w{3} [ :0-9]{11} [._[:alnum:]-]+ sshd\[[0-9]+\]: Accepted \
(gssapi(-with-mic|-keyex)?|rsa|dsa|password|publickey|keyboard-interactive/pam) \
for [^[:space:]]+ from [^[:space:]]+ port [0-9]+( (ssh|ssh2))?

Once you match with the regular expression above (courtesy of the logcheck project), the parentheses mark the variable part of the information that you can reference as $1, $2 and so on.

The problems with regular expressions are several:
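
As an illustration of the regexp approach, here is a minimal Python sketch that turns the sshd sample line into the name-value pairs listed earlier. The expression is my own simplified variant with named groups, not the logcheck one (Python's re module does not understand POSIX classes such as [[:alnum:]]), and the function and group names are made up for this example.

```python
import re

# Simplified pattern for "Accepted ..." sshd lines; named groups take the
# place of the numbered $1, $2 references.
SSHD_ACCEPTED = re.compile(
    r"^\w{3} [ :0-9]{11} [-._\w]+ sshd\[[0-9]+\]: "
    r"Accepted (?P<method>\S+) for (?P<user>\S+) "
    r"from (?P<addr>\S+) port (?P<port>[0-9]+)(?: (?P<protocol>ssh2?))?$"
)


def to_name_value_pairs(line):
    """Return the extracted name-value pairs as a dict, or None on no match."""
    m = SSHD_ACCEPTED.match(line)
    if m is None:
        return None
    return {
        "event": "user login",
        "protocol": m.group("protocol"),
        "client": "{}:{}".format(m.group("addr"), m.group("port")),
        "method": m.group("method"),
    }


sample = ("Feb 24 11:55:22 bzorp sshd[4376]: Accepted password for bazsi "
          "from 10.50.0.247 port 42156 ssh2")
print(to_name_value_pairs(sample))
```

Even this single message type needs a fairly dense expression, and every further message type needs another one that has to be tried in turn.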
  • they are difficult to write (just look at the example above)
  • they are even more difficult to understand, once written (again, please look at the example)
  • they are slow and they scale poorly with the number of regexps that we need to match against the incoming message stream.
Projects like logcheck use regular expressions, but with the number of patterns increasing, the time needed to analyze logs skyrockets, which makes the whole thing unfeasible. Also, logcheck does not aim at extracting information from messages, it merely classifies them.

Clearly a different approach is needed. And that's what db-parser in syslog-ng is.

The db-parser() functionality of syslog-ng has the following objectives:
  • use a database to match various messages (and not filters embedded in the configuration file)
  • classify events into logcheck-like classes (cracking, violation, ignore, unknown)
  • extract variable information from messages, and place those into name-value pairs
  • be fast, scale to a high number of events/sec and high number of patterns
  • integrate well with the rest of syslog-ng
db-parser() is a generic parser that fits nicely into the parser framework inside syslog-ng. You can use it just like csv-parser():

...
parser p_db { db-parser(); };
...
log { source(src); parser(p_db); destination(d_parsed); };
...

The database used by db-parser is an XML file that is read during syslog-ng startup. Here is an example entry from the db-parser() database:

<patterndb>
<ruleset name='sshd'>
<pattern>sshd</pattern>
<rules>
<rule provider='balabit' id='1' class='system'>
<patterns>
<pattern>Accepted rsa for@QSTRING:username: @from\
@QSTRING:client_addr: @port @NUMBER:port:@ ssh2</pattern>
</patterns>
</rule>
...
</rules>
</ruleset>
</patterndb>



As you can see, the database is structured, and the first selection criterion to apply is the name of the application (e.g. the value for $PROGRAM). Then each rule matches against the message payload (e.g. the value for $MESSAGE) with the syslog header stripped off. The rule specifies the classification (e.g. 'system' in the example above) and lists one or more patterns. If any of the patterns match, the rule is considered a match.

The variable part of the pattern is specified using special sequences, starting and ending with a '@' character. Within the enclosing '@' characters, a colon separated list of parameters is given:
  • the parser to apply (QSTRING and NUMBER in the example above)
  • the name of the value to be extracted from this position
  • additional arguments to be passed to the parser
The available parsers are currently not really documented, but here is a list of them (you can find these in the radix.c source file):
  • IPv4: to parse an IPv4 address
  • NUMBER: to parse a number
  • STRING: to parse a word
  • ESTRING: to parse a sequence of characters ending with a specific character
  • QSTRING: to parse a string enclosed within quotes
Of course further parsers can be added to the code easily. You don't have to specify monstrous regexps to match an IPv4 address anymore. Not to mention IPv6 :)

If a message matches a rule, db-parser() defines the following values for the given message:
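
To show how the '@PARSER:name:argument@' sequences fit together, here is a small Python sketch of a matcher for a single pattern. It is only a model of the idea: the real db-parser() looks patterns up in a radix tree (see radix.c) instead of trying them one by one, and the parser behaviour below (for instance, that ESTRING and QSTRING consume their delimiters, or which characters STRING accepts) is my simplified assumption. The example pattern is the one from the database entry above, with the two lines of the line continuation joined.

```python
import re


def _regex_parser(regex):
    """Build a parser that consumes whatever the regex matches at pos."""
    def parse(msg, pos, arg):
        m = regex.match(msg, pos)
        return (m.group(0), m.end()) if m else None
    return parse


def _estring(msg, pos, arg):
    # Everything up to the terminator; the terminator itself is consumed.
    end = msg.find(arg, pos) if arg else -1
    return (msg[pos:end], end + len(arg)) if end != -1 else None


def _qstring(msg, pos, arg):
    # A string enclosed between two occurrences of the quote character.
    quote = arg[:1]
    if not quote or not msg.startswith(quote, pos):
        return None
    end = msg.find(quote, pos + 1)
    return (msg[pos + 1:end], end + 1) if end != -1 else None


PARSERS = {
    "IPv4": _regex_parser(re.compile(r"(?:\d{1,3}\.){3}\d{1,3}")),
    "NUMBER": _regex_parser(re.compile(r"\d+")),
    "STRING": _regex_parser(re.compile(r"[A-Za-z0-9]+")),
    "ESTRING": _estring,
    "QSTRING": _qstring,
}


def match_pattern(pattern, msg):
    """Return the extracted values as a dict, or None if msg does not match."""
    values = {}
    pos = 0
    # Splitting on '@' gives literal text at even indexes, parser specs at odd ones.
    for index, piece in enumerate(pattern.split("@")):
        if index % 2 == 0:
            if not msg.startswith(piece, pos):
                return None
            pos += len(piece)
            continue
        parser, name, arg = (piece.split(":", 2) + ["", ""])[:3]
        result = PARSERS[parser](msg, pos, arg)
        if result is None:
            return None
        value, pos = result
        if name:
            values[name] = value
    return values if pos == len(msg) else None


PATTERN = ("Accepted rsa for@QSTRING:username: @from"
           "@QSTRING:client_addr: @port @NUMBER:port:@ ssh2")
print(match_pattern(PATTERN, "Accepted rsa for bazsi from 10.50.0.247 port 42156 ssh2"))
```

Note how the pattern uses the space character as the "quote" of QSTRING, so the words between two spaces are picked up without any regexp.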
  • .classifier.class: logcheck-like classification
  • .classifier.rule_id: the ID of the database entry that matched
  • pattern specific values: the variable parts that get extracted from the message by patterns
Each of the values defined previously can be referenced inside syslog-ng using a macro, e.g. you can do things like:

# You can use them in a filter:
filter f_class {
match("system" value(".classifier.class"));
};

# but you can also use them in the names of files:
destination d_parsed {
file("/var/log/messages/${.classifier.class}.log");
};
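
The file name above relies on the values being substituted into a template for every message. A minimal Python sketch of that substitution follows; it only handles the ${name} form and the function name is made up for this example, so it is not how syslog-ng's template engine is implemented.

```python
import re

# Matches ${name} references; names may contain dots, as in .classifier.class.
MACRO_RE = re.compile(r"\$\{([^}]+)\}")


def expand(template, values, default=""):
    """Replace every ${name} in template with values[name] (or default)."""
    return MACRO_RE.sub(lambda m: values.get(m.group(1), default), template)


values = {".classifier.class": "system", ".classifier.rule_id": "1"}
path = expand("/var/log/messages/${.classifier.class}.log", values)
print(path)
```

With the values of the sshd example this yields one file per class, so messages end up sorted by their classification without any extra filters.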

That's a rough skeleton of what db-parser() is. If you are interested, you can find the db-parser() implementation in syslog-ng OSE 3.0:

http://www.balabit.com/network-security/syslog-ng/opensource-logging-system/

You can also find some example pattern databases here:

http://www.balabit.com/downloads/files/patterndb/

We are also thinking about further ideas to enhance db-parser() and make it the foundation of an Open Source log analysis framework. Stay tuned!

Sunday, January 18, 2009

GStaticMutex and AIX

If you use GLib on non-Linux platforms such as AIX and think that G_STATIC_MUTEX_INIT does nothing but zero-initialize the mutex, think twice. Although quite clearly stated in the documentation, I thought I was smarter and used a GStaticMutex embedded in a structure that was zero initialized.

If you look at the definition of G_STATIC_MUTEX_INIT on most platforms (Linux, Solaris, BSDs), it contains nothing but zeroes. This led me to the impression that zero filling a GStaticMutex instance is enough to initialize it.

In reality it isn't. On AIX this renders the mutex entirely useless, without warnings or aborts. The results are of course bugs that are difficult to track down and fix.

This took me an entire day to figure out, as the SQL driver in syslog-ng had this problem. It has since been fixed, but if you are running syslog-ng on AIX with the SQL driver, be sure to have this patch applied.

Thursday, January 15, 2009

syslog-ng OSE 3.0 finally released

Finally I could take the time to actually announce the freshly released syslog-ng OSE 3.0 branch. It was uploaded to our website during the winter holidays, but I had to integrate syslog-ng OSE to our new release infrastructure, which among others has a much nicer web interface.

Here is a summary on what is new in syslog-ng 3.0:

http://www.balabit.com/dl/html/syslog-ng-admin-guide_en.html/ch01s04.html

Enjoy!