Friday, December 22, 2006

syslog-ng 2.0.1 released

I have released syslog-ng 2.0.1, available at the usual places. I have added various missing bits that fell out in the 1.6.x -> 2.0.x change. These include:
  • DNS cache support,
  • overwrite_if_older (used to be called remove_if_older),
  • various fixes.
All in all, 2.0.0 was a successful release; hopefully this one will not be much worse. See the NEWS entry of 2.0.1 for more information.

Thursday, December 14, 2006

Interview with me on linux.com

Just a short note, an interview with me was published on linux.com, more specifically here.

Wednesday, November 08, 2006

2.0.0 experiences

As it turned out, the 2.0.0 release was not so bad after all. At least I have not received show-stopper bug reports, which either means that no one is using it, or everything is fine and dandy :). Hopefully it is the latter; the rc releases were tested by a few people.

In the meantime I started adding a few missing bits that were still present in 1.6.x but that I never got around to implementing in the 2.0.0 tree. Among them, I re-added the remove_if_older() option. By the way, I don't really like the name of this option; does anyone have a better idea? If you do, please put it in a comment here or send me an email. (I was thinking about retention_time(), but I'm afraid it would be more difficult to understand what it does.)

The other bit is the new/shiny DNS cache, which also supports persistent entries. This means that syslog-ng can read your /etc/hosts file, resolve IPs that are present there, and use IP addresses for anything else. This removes the dependency on DNS, and should also improve overall performance.
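The lookup behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not syslog-ng's implementation: the function names `load_hosts` and `resolve` and the sample entries are made up for the example.

```python
import ipaddress

def load_hosts(text):
    """Parse hosts-file text into an {ip: name} map (first name on a line wins)."""
    cache = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            ip = str(ipaddress.ip_address(fields[0]))
        except ValueError:
            continue                            # skip malformed addresses
        cache.setdefault(ip, fields[1])
    return cache

def resolve(cache, ip):
    """Return the persistent entry if present, otherwise the bare IP (no DNS query)."""
    return cache.get(ip, ip)

# Hypothetical hosts-file content, used instead of reading /etc/hosts directly.
SAMPLE_HOSTS = """
127.0.0.1   localhost
192.0.2.10  loghost.example.com loghost  # made-up entry
"""

cache = load_hosts(SAMPLE_HOSTS)
print(resolve(cache, "192.0.2.10"))     # known host: name from the hosts file
print(resolve(cache, "198.51.100.7"))   # unknown host: stays an IP address
```

Since nothing in this path ever calls the resolver, a slow or unreachable DNS server cannot stall message processing.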

So all in all, syslog-ng 2.0 is in good shape; give it a try. Testing the latest snapshots, especially the new DNS cache parts, would be appreciated.

Saturday, November 04, 2006

Syslog-ng 2.0.0 released

You might have already noticed, but I thought I'd write an entry on this blog on this topic: the wait is over, I have released syslog-ng 2.0.0. It took a bit longer than I had anticipated; I needed to prepare 4 rc releases, as each had some bug here or there and I really wanted to release a stable 2.0.0.

I hopefully succeeded, no breakage reported so far. 2.0.0 was released last Saturday, but the announcement went out only on Friday.

Some rarely used functionality is still missing though (a prominent example is spoof_source), and I already committed the remove_if_older() option after the release of 2.0.0. I'm going to concentrate on filling the missing bits.

I'll try to avoid committing to 1.6.x, migrating to 2.0.x is strongly recommended.

Wednesday, October 04, 2006

Catching up with things

It's been a long while since my last post here, but I was really busy in the past months, and after that I left for a two-week vacation in Corsica. I returned on Sunday and started to catch up with work and such, but I still have not been able to read my syslog-ng mailing list folder, which contains almost 100 unread messages. Please be patient if you have a question open, and forgiving in the unfortunate case that I forget to reply.

On the other hand, Corsica is a beautiful island; be sure to visit it if you can. Nature is almost untouched in a couple of places, everything is green and there are a lot of mountains. A pair of hiking shoes is a useful item if you are in Corsica. :)

So the island is beautiful, but we had some minor nuisances with waiters, as neither my wife nor I speak French, and this seems to be a sin in the eyes of Corsican waiters. So in the end we ended up cooking for ourselves; luckily our apartment was nicely equipped with cooking gear.

Back to syslog-ng, I'd really like to release 2.0.0 now. I released 2.0rc3 right before I left; if it had no problems in the last two weeks, it should be a reasonable 2.0.0 release. But I first need to read the 100 unread mails on the topic :).

Tuesday, August 01, 2006

Some kernel hacking

After some time I needed to do some kernel coding again. To seamlessly support dynamically created interfaces in Zorp, we need something I called "interface groups".

Each interface might belong to a single group that basically describes how the interface was created. For instance there's an interface group for each PPP profile, but an interface group can encapsulate interfaces created by PPTPD.

It is quite difficult to match dynamic interfaces by their nature: iptables sports wildcard interface name matching with the '+' character but it only works if interface names have some kind of prefix _AND_ if you don't want to differentiate between two groups.

If you have two sets of PPP devices (like in the example I described above), then you have no way to create a separate ruleset, unless you reload iptables every time a new interface is added to the system.

Adding to the burden, in Zorp we want to be able to bind a service to these dynamically created interfaces, of course without listing the actual IP address in the configuration file.

The idea is simple, I added an "interface group ID" to the net_device struct, and an option to the "ip link" command to set/query this ID. Once an interface is created by some kind of program (for instance pppd), a script is executed in its /etc/ppp/ip-up.d directory and userspace can assign a group ID based on the PPP profile name. Then Zorp gets notified about the change through NETLINK and can react by binding to the IP address of the new interface. The configuration remains static, no reloading needs to be done when such a change happens, and you can create firewall policies for something like: please allow this set of services for everyone using this PPP profile, without entering one specific IP address to the configuration. Neat, eh?
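To make the "static configuration, dynamic interfaces" idea concrete, here is a toy userspace model in Python. It is not Zorp or kernel code: the class name, the group IDs, the service names and the addresses are all invented for illustration, and the NETLINK notification is reduced to a plain method call.

```python
from collections import defaultdict

class InterfaceGroupPolicy:
    """Static policy keyed by interface group ID instead of by IP address."""

    def __init__(self, services_by_group):
        # group ID -> list of service names; this part never changes at runtime
        self.services_by_group = services_by_group
        # service name -> set of addresses the service is currently bound to
        self.bindings = defaultdict(set)

    def interface_added(self, group_id, address):
        """React to a (hypothetical) 'interface appeared' notification."""
        for service in self.services_by_group.get(group_id, ()):
            self.bindings[service].add(address)

    def interface_removed(self, group_id, address):
        """React to a (hypothetical) 'interface went away' notification."""
        for service in self.services_by_group.get(group_id, ()):
            self.bindings[service].discard(address)

# Group 1 and group 2 stand for two made-up PPP profiles.
policy = InterfaceGroupPolicy({1: ["http_proxy", "ssh_proxy"], 2: ["http_proxy"]})
policy.interface_added(1, "10.0.0.5")
policy.interface_added(2, "10.0.1.9")
print(sorted(policy.bindings["http_proxy"]))
print(sorted(policy.bindings["ssh_proxy"]))
```

The policy table mentions only group IDs, so new interfaces can come and go without the configuration being touched or reloaded.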

I posted my work on netdev and netfilter-devel, I'm curious what the kernel maintainers will think about it.

Sunday, July 09, 2006

syslog-ng 2.0rc1 released

After my last requests for testing of the latest 1.9.x code base, I have received a couple of bug reports, which were fixed in recent weeks. Since I have received no reports the past two weeks I decided to name the new release as "2.0rc1" to raise awareness of the new codebase.

I'm planning to create the new branch for 2.1.x. I have some exciting features in mind which I did not want to start before the release of 2.0.0. The old stable series 1.6.x is still supported, but expect less development time to be dedicated to maintaining that release.

Build queues for various architectures are not yet up, so only a Debian sarge binary is available for those with binary maintenance contracts.

Friday, July 07, 2006

Thoughts on the patent system

You might know that there is a standardization effort on the syslog protocol in the IETF. The work started several years ago and the efforts produced RFC3164, the first documentation of the BSD syslog protocol after being in use for over two decades.

This group also produced RFC3195 in 2001, a reliable syslog protocol using the BEEP framework which did not really take off. I personally did not implement this in syslog-ng due to its highly verbose nature and the complexity which BEEP brings in.

A couple of months ago an effort started to create a simpler, but still reliable syslog protocol somewhat similar to what syslog-ng has been using for a couple of years now. First some layering was decided, e.g. to define the syslog protocol in a transport-independent manner and then define various transports, like legacy UDP and TLS-encrypted TCP.

After syslog-transport-udp was written by Rainer Gerhards, work started on the TLS-encrypted transport and someone from Huawei (you know, the Chinese Cisco clone) volunteered to write the draft, which he did with the help of other group members. Basically the contents of the ID represented consensus (a rare event in the syslog group) and were heavily based on the previous years' work.

The ID was published and we were finally approaching a standardized syslog over TCP protocol, everything was nice and dandy.

Except that a few weeks later Huawei published a patent claim on the contents of the published ID; they basically said that they have a not-yet-published patent pending which covers at least parts of the ID. It is yet to be determined which sections of the internet draft are affected, but as far as I know it takes several months until this information becomes available.

So what now? Basically I don't know. Some prior art is certainly available; I personally found articles describing the combination of syslog-ng TCP transport and stunnel. Even if the patent is not granted, the work of the working group is endangered by the patent threat.

Did I mention already that I don't like US style patents? I'm happy to live in Europe, we are still not affected, assuming that syslog-ng is developed and used within Europe. In the US, even end-users can be threatened if they use a product that uses protected technology and which does not license the patent.

I need to make a difficult decision:
  • avoid using the recent work of the working group and fall back to using an updated version of RFC3195, OR
  • don't care about the IPR claim Huawei has published in the hope that the patent will not be granted.
How would you decide?

Tuesday, June 20, 2006

Getting married

This is just a quick note that I'm not yet lost, I'm on holidays for two weeks, as I'm getting married this weekend, or to be more precise 24th June, 2006.

I'm returning to work on 3rd July, 2006. As my internet connection is not perfect, I might not be able to respond to e-mail messages in a timely manner.

Wednesday, May 24, 2006

Syslog-ng 2.0.0 release date

It was just a week or two ago when someone asked me about the planned release date of syslog-ng 2.0.0, the first stable release of the third incarnation of syslog-ng. Probably I did not even respond to the email as I did not know the answer. "When it's ready" is an answer users do not usually receive very well.

It is very difficult to judge when a rewrite of such a critical software package is stable enough for production use: I wrote both functional and unit tests and have used syslog-ng on my laptop for over a year now, but as I currently lack a system where non-production code can be uploaded, syslog-ng was drifting slowly in the stabilization process: whenever someone reported a bug, I fixed it.

So the release date in the current state is determined by the syslog-ng user community and not me. If there's certain confidence that a pile of code runs fine, it can be tagged stable and everyone can be happy. If there is no feedback, an optimist might think that everything is going fine, while a pessimist would say that nobody is using the product.

My point is that positive feedback is _VERY_ important, it is an indication that people are using the code, but have no problems.

syslog-ng 1.9.x is currently in feature freeze, I don't plan to do anything that threatens stability, but this also means that people waiting for things like message rewrite capabilities need to wait until syslog-ng 2.0.0 is out of the door. And the key to that is YOUR participation: download the latest release, try it and report back. Even if it works. Especially if you are not running Debian, which I happen to run on my notebook.

Friday, May 19, 2006

Thinking about rewrite rules

Again the question on Solaris message IDs was raised in an email sent to me in private. For those who don't know what a Solaris msgid looks like, look at this example:

May 14 18:51:57 inbound2 su: [ID 366847 auth.notice] 'su root' succeeded

I was asked to include an MSGNOID macro which excludes this msgid in the final destination. The problem I have with this approach is that it simply does not scale: there are simply too many combinations to cover with various macros, an example using the msgid case:
  • a macro that includes neither the name of the program, nor the msgid
  • a macro that includes program name only
  • a macro that includes msgid but not the program name
  • a macro that includes both the program name and the msgid
As you can imagine this quickly becomes a maintenance nightmare even if one finds out a proper name for all of these combinations, especially if you add that other devices have their own extensions to syslog.

What I am pondering is to renew my old ideas about adding sed-like rewrite rules to syslog-ng, something along the lines of:

rewrite r_msgid { msg("s/\[ID [0-9]+ [a-z]+\.[a-z]+\]//"); };

log { source(s_local); filter(f_noid); rewrite(r_msgid); destination(d_messages); };


Of course similar functionality would be added to manipulate all syslog message parts, like hostname. The results would become part of the message itself, thus macros would use the rewritten message. And by the way backreferences could be used to refer to various parts of the message, matched by regexps.
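To see what such a sed-like rule would do to the Solaris example, here is a small Python sketch using the standard `re` module. It only demonstrates the regular expression, not any syslog-ng syntax; the helper names are made up, and the pattern is extended with an optional trailing space so that no double space is left behind.

```python
import re

# The pattern from the proposed rule, plus " ?" to swallow the following space.
MSGID = re.compile(r"\[ID [0-9]+ [a-z]+\.[a-z]+\] ?")

# Same idea with a capture group, to show what a backreference could do.
MSGID_CAPTURE = re.compile(r"\[ID ([0-9]+) [a-z]+\.[a-z]+\]")

def strip_msgid(message):
    """Remove the first Solaris-style msgid block from a message."""
    return MSGID.sub("", message, count=1)

def shorten_msgid(message):
    """Keep only the numeric ID, referring to it with the backreference \\1."""
    return MSGID_CAPTURE.sub(r"[\1]", message, count=1)

line = "May 14 18:51:57 inbound2 su: [ID 366847 auth.notice] 'su root' succeeded"
print(strip_msgid(line))
print(shorten_msgid(line))
```

A single generic rewrite mechanism like this covers all four macro combinations listed above, and any device-specific variant, without inventing a new macro name for each.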

What do you think?

Saturday, May 06, 2006

syslog-ng 1.6.11 released

I have released syslog-ng 1.6.11 which fixes the problems outlined in the previous post. You can find it at the BalaBit website.

Tuesday, May 02, 2006

syslog-ng 1.6.10 broken

Just a quick one, it turned out that syslog-ng 1.6.10 is broken in several ways, first reading messages from /proc/kmsg is broken, and second the time_sleep() feature that was added in 1.6.10 has missed an important chunk from the parser code which made time_sleep() unconfigurable.

So a feature that cannot be used and an important problem. :(

I'm going to release syslog-ng 1.6.11 soon.

Infosec in London

I spent the last week in London, visiting InfoSec Europe. It was great fun, I liked the exhibition as well as the city itself.

I have not been to London before (except for a single-day business trip two years ago, but that does not count), and I liked the city very much. I walked about 40-50km on these three days, I had my legs completely worn out. British people are quite strange I would say. Everything is completely in the reverse: the cars, the direction the trains arrive from, the way the taps need to be opened, I think even the screws must be unmounted in the reverse direction. I hated these non-mixing taps, one tap for cold another for hot water, no way to mix something tepid. Beside this strangeness I liked the atmosphere of the city, I visited all the important places, I even spent two hours in the British Museum, but it was nothing but a scratch on the surface.

The exhibition was also interesting, I met a couple of interesting people, like the Watchfire guys who invented HTTP request smuggling and some real computer forensics guys. We were talking about the problems with encryption vs. forensics and what possible solutions there are to this problem.

All in all it was an exhausting week.

Sunday, April 23, 2006

Committed IPv6 support for syslog-ng

I have finished IPv6 support for syslog-ng, I'm wondering how this will improve the number of people actually using the new syslog-ng 1.9.x tree.

In the implementation I've created separate udp6() and tcp6() source and destination drivers, because this was somewhat easier to implement. I'm expecting some portability trouble, but otherwise the implementation is nice and simple.

Some smaller fixes went in recently as well, like:
  • avoid chown/chmod files that do not exist as it clobbered error reporting,
  • added close-on-exec flag to file descriptors to prevent child processes from inheriting tcp/udp connection fds,
  • fixed an off-by-one in flush_lines calculation,
  • a possible memory leak, and
  • a fix for non-existing filter references in the internal() message path
Apart from IPv6 support these are mainly bugfixes and I'm confident we can have a 2.0.0 real soon now.
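For readers who have not met the close-on-exec flag mentioned in the list above: it is a per-descriptor flag telling the kernel to close that descriptor when the process calls exec(). The Python sketch below sets it with fcntl() on a pipe; it is an illustration on Unix-like systems only, not the syslog-ng code (which is C), and the helper name `set_cloexec` is made up.

```python
import fcntl
import os

def set_cloexec(fd):
    """Set FD_CLOEXEC on fd while preserving its other descriptor flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)

read_end, write_end = os.pipe()
# Python marks new descriptors close-on-exec by default, so clear the flag
# first to start from the "inheritable" state a plain C program would see.
os.set_inheritable(read_end, True)

set_cloexec(read_end)
is_set = bool(fcntl.fcntl(read_end, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)
print("close-on-exec set:", is_set)

os.close(read_end)
os.close(write_end)
```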

Monday, April 17, 2006

syslog-ng and IPv6

I received an email last week asking about IPv6 support in syslog-ng. The question also referred to an IPv6 application page where most of the system logging applications were listed red showing that they lack IPv6 support.

I started hacking on it but I thought I would ask you how important you think adding IPv6 support to syslog-ng is?

My idea was to add tcp6() and udp6() source and destination drivers, however a lot of applications seem to do IPv6 support with a single listener, e.g. open an AF_INET6 socket and assume that the system shares IPv4 and IPv6 sockets. Which of the two approaches is preferable? The first gives more control; the latter seems to be a bit easier to use, as it works out of the box.
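For comparison, the single-listener approach looks roughly like this in Python. It is a sketch under the assumption that the platform supports IPv6 and the IPV6_V6ONLY socket option; the function name is made up, and whether IPv4 traffic really arrives on such a socket depends on the operating system and its configuration.

```python
import socket

def open_dual_stack_udp(port=0):
    """Open one AF_INET6 UDP socket that may also accept IPv4 (mapped addresses)."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    try:
        # 0 = accept IPv6 and IPv4-mapped traffic, 1 = IPv6 only.
        sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
        sock.bind(("::", port))
    except OSError:
        sock.close()
        raise
    return sock

try:
    listener = open_dual_stack_udp()   # port 0 lets the OS pick a free port
except OSError as exc:
    listener = None
    print("no dual-stack support here:", exc)
else:
    print("listening on port", listener.getsockname()[1])
```

With separate drivers the administrator decides explicitly which address family is served; with the single socket that decision is partly left to system defaults, which is the control versus convenience trade-off raised above.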

Saturday, April 15, 2006

Released syslog-ng 1.9.10 for real this time

Just a quick one this time, I have released syslog-ng 1.9.10 available at the usual places. The release contains mainly bugfixes and an implementation of the previously missing netmask() filter and support for bad_hostname() and check_hostname() options.

Sunday, April 09, 2006

Timezone woes

As I have written in my previous post there was a timezone related problem triggered by one of the unit test programs of syslog-ng. Apart from a minor issue in the test program itself, it turned out that there's a timezone conversion problem at the reception of messages. Syslog-ng 1.9.x has support for messages that use the ISO 8601 timestamp. As an example the current local time here right now in ISO 8601 is: 2006-04-09T22:43:24+02:00. The important part is that it includes an explicit timezone offset. This offset is processed by syslog-ng and it can convert timezones when necessary.
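As an illustration of why the explicit offset matters, the following Python sketch parses the example timestamp and re-expresses it in other timezones. It uses only the standard `datetime` module and says nothing about how syslog-ng does it internally; the -05:00 target offset is an arbitrary choice for the example.

```python
from datetime import datetime, timedelta, timezone

# The example timestamp from the text, including its explicit +02:00 offset.
stamp = datetime.fromisoformat("2006-04-09T22:43:24+02:00")

# Because the offset is part of the value, the conversion to Unix time does
# not depend on the local timezone of the machine running this code.
epoch = int(stamp.timestamp())

as_utc = stamp.astimezone(timezone.utc)
as_minus_five = stamp.astimezone(timezone(timedelta(hours=-5)))

print(epoch)
print(as_utc.isoformat())
print(as_minus_five.isoformat())
```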

I spent about half a day fixing timezone conversion, I even used a pen and a sheet of paper to do some calculations. All I can say is that there's an important building block missing from the POSIX time handling functions, which would have made my job as an application developer way easier: one that converts a broken down time representation to a UNIX time_t value, where the time to be converted is NOT in the local timezone, but in GMT. The other side of this conversion exists: localtime() converts a UNIX timestamp to the local timezone, and gmtime() does the same but instead of using the local timezone and daylight saving settings, it uses GMT as timezone.

The only portable function to convert a human readable timestamp to UNIX timestamp is mktime(3), which assumes that the converted timestamp is in local time. At first glance this can be easily used in place of our imaginary mktimegm() function: mktime() returns a value offset by the local timezone, but we also know the local timezone offset, so we subtract this from the return value of mktime() and we have a stamp in GMT, right? No, not right.

There are cases when mktime() changes its incoming broken down time representation when Daylight Saving kicks in: the value of "2006-03-26 02:00:00 CET" does not exist, it is equal to "2006-03-26 03:00:00 CEST" (CET is +01:00, CEST the daylight saving time is +02:00), and this happens to every value in this time interval, e.g. 2:33 CET becomes 3:33 CEST.

Remember, I have a timestamp with an explicitly specified timezone offset where the daylight saving settings of the syslog-ng process should not count, e.g. the sender sends something like 2006-03-26 02:00:00 +02:00, which is converted to 2006-03-26 03:00:00 +02:00 by the mktime() function, i.e. it is off by one hour. And all this happens only in the transition hour. Good, heh?

The solution was to check this change in the time by mktime() and adjust the returned value, this seems to work reasonably well for the transition hour.

While writing this post I have found that there is a GNU extension defined, a function named timegm(3), which seems to do exactly what I have wanted. The problem is that this function does not seem to be too portable. The notes in the manpage say that to achieve timegm() functionality, the application should change its own environment, set the TZ environment variable, call mktime(), and reset the environment variable. This does not look too clean; I would even call it ugly. IIRC setenv() allocates memory, and I would need to call this kludge for each and every incoming message.
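Python happens to ship the missing building block as `calendar.timegm()`, so the conversion I was after can be sketched there in a few lines. This is an illustration of the arithmetic only, not the C code in syslog-ng, and the helper name `epoch_from_fields` is made up.

```python
import calendar

def epoch_from_fields(year, month, day, hour, minute, second, utc_offset_seconds):
    """Convert broken-down time plus an explicit UTC offset to a Unix timestamp.

    calendar.timegm() treats the fields as GMT, so the local timezone and its
    daylight saving rules never enter the calculation.
    """
    as_if_gmt = calendar.timegm((year, month, day, hour, minute, second))
    return as_if_gmt - utc_offset_seconds

# The two timestamps around the 2006-03-26 transition hour, both given
# with an explicit +01:00 offset.
before = epoch_from_fields(2006, 3, 26, 1, 59, 59, 3600)
after = epoch_from_fields(2006, 3, 26, 2, 0, 0, 3600)
print(after - before)   # the two instants are one second apart
```

Interpreting the fields as GMT first and applying the sender's offset afterwards avoids the transition-hour adjustment that a mktime() based approach needs.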

I think this important hole in the API should be plugged, there are a lot of applications that need to work with various timezones and I have a bet that a lot of those work incorrectly in daylight saving transition hours.

I already have one example: the GNU date program also allows specifying an explicit timezone offset:

bazsi@bzorp:~$ date -d "2006-03-26 01:59:59 +0100"
Sun Mar 26 01:59:59 CET 2006
bazsi@bzorp:~$ date -d "2006-03-26 02:00:00 +0100"
Sun Mar 26 04:00:00 CEST 2006

The second one should only be one second later than the first, i.e. it should be 03:00:00 CEST, and not 04:00:00 CEST. Try it with your favourite application :)

Thursday, April 06, 2006

Almost released syslog-ng 1.9.10

... but in the end I didn't. I prepared the NEWS file, changed the version number etc., but in the end it turned out that one of my unit test programs, which tests macro expansions, failed.

I still have not looked into the issue, hopefully it is only the test program, time related macros seem to use a bad timezone offset. Again I seem to have made a timezone related bug :(

Although timezones and time related functions seem to be simple at first, this has proved to be a problematic area: it has already had a lot of bugs, and here is another one. Not to mention the problem that different platforms have different sets of variables/functions to cover the issue. For instance "timezone" is a global variable on Linux and a function on BSD. Linux has a "tm_gmtoff" member in "struct tm", BSD doesn't.

OK, I quit whining now :) Hopefully I'm going to have some free time to look into this bug in the near future.

I also have two other issues on my radar for syslog-ng 1.9.10, first I've received some reports about missing configuration keywords (namely bad_hostnames and check_hostnames), and second I want to change some currently reserved words to identifiers, so that "kernel" can be used as the name of sources again. And oh yes, I have also received a report on an abort(), although I don't have enough info on this one yet.

One thing is certain: the 1.9.10 release of syslog-ng is coming.

Saturday, April 01, 2006

SSH publickey authentication implemented

I have hacked on our SSH gateway today to add publickey authentication support. By the way I may not have explained this before, so a short introduction is due: Zorp is an application layer gateway with support for 21 protocols, among them an SSH gateway capable of looking into the encrypted SSH stream and restricting the protocol to a subset that you really want to allow to your users. (e.g. you can forbid TCP port forwarding while still allowing terminal access).

The problem with publickey authentication is that the signature covers the so called SSH session_id which is a unique value derived during key exchange. My proxy implements a man-in-the-middle, so the client<->proxy and proxy<->server connections have a different session id, thus simply replaying the authentication packets of the client will not work since the SSH session ids do not match.

The solution is that we are going to replace user keys transparently when crossing the firewall, which means that private keys need to be stored there. This is both a feature and a drawback: a feature since you can control which keys you are allowing to leave your perimeter and a drawback as this requires additional management tasks. It would have been so much nicer if we could do this transparently, but I am afraid this is not possible unless we modify all clients out there or alternatively we manage to find a way to crack the Diffie-Hellman key exchange algorithm.

On the syslog-ng side I have committed a fix to make files over 2GB work again. It should be available in the next snapshot shortly. I'm also thinking about preparing 1.9.10 with the fixes accumulated so far.

Friday, March 31, 2006

OpenSSL problem found

I have finally tracked down the issue I was writing about in my previous blog post. It turned out to be a problem with OpenSSL on Linux 2.6 and NPTL. The default implementation of CRYPTO_thread_id() assumes that getpid() returns a unique value for each thread, however with NPTL each thread has the same pid and only their tid (thread id) values differ.

This made OpenSSL hash its thread-specific error state to the same memory area, thus possibly overwriting memory concurrently freed in another thread. This caused a heap corruption which in turn caused crashes every now and then, of course showing a backtrace completely unrelated to the original problem.

The funny part is that I had a suspicion (see yesterday's post) about this error state allocation, I just had not seen the obvious reason: I had the impression that getpid() returns a different value for threads just like it did with LinuxThreads. Knowing the exact reason makes the whole issue trivial :).
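The collision is easy to reproduce in miniature. The Python sketch below is a toy model, not OpenSSL code: it starts a few threads and counts how many distinct keys a "per-thread" table would end up with, depending on whether the key comes from the process ID or from a real thread identifier. The function name `count_slots` is made up.

```python
import os
import threading

def count_slots(key_func, n_threads=4):
    """Run n_threads concurrently and count the distinct per-thread table keys."""
    keys = set()
    lock = threading.Lock()
    barrier = threading.Barrier(n_threads)

    def worker():
        key = key_func()
        with lock:
            keys.add(key)
        # Keep every thread alive until all have registered, so that a thread
        # identifier cannot be reused by a thread that starts later.
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(keys)

print("keyed by getpid():   ", count_slots(os.getpid))
print("keyed by thread id:  ", count_slots(threading.get_ident))
```

When every thread maps to the same key, they all share one "thread-specific" slot, which is exactly the situation in which one thread can clear or free state that another thread is still using.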

I have posted this as an email on openssl-dev, I'm wondering what the reactions are going to be.
Take care.

Thursday, March 30, 2006

Spending time in gdb...

I have spent the last three days debugging an ugly crash in the upcoming Zorp 3.1. First I had some problems with the core files produced with Linux 2.6.12, as the register values proved to be invalid, thus the backtrace was even more unusable than is usual with heap corruptions.

I could get access to the original register values as Zorp dumps part of its stack when a fatal signal is encountered. Using that information I could locate the stack frame of the signal handler and luckily Linux passes a "struct sigcontext" to each signal handler as parameter which contains register information. But nevertheless it made analyzing the core files difficult.

After a post to the gdb mailing list it turned out to be a kernel problem rather than a gdb problem and with the help of my colleague Krisztián Kovács (of Netfilter ct_sync fame) we could solve the problem by backporting a fix from 2.6.15, so core files are now ok.

The problem however seems to be difficult, I have already studied the libc malloc implementation, disassembled and annotated the _int_malloc and _int_free functions, I'm now able to read hexdumps of heap areas fluently but I still don't have a fix for the problem. Luckily for us, Zorp restarts itself in this situation and the scenario where this problem occurs is not frequently used.

My suspicion is that the SSL error state for threads is the cause of the problem as I have evidence that the freed heap block is overwritten by ERR_clear_state(), which destroys the next and prev pointers in the freed memory block, thus resulting in the crash. The error states are supposedly thread-specific variables, but the way the allocation is done is suspicious.

I hope I can finally find this problem tomorrow.

Tuesday, March 28, 2006

Preparing syslog-ng release

I have started to prepare syslog-ng 1.6.10 for release, the tarball has already been uploaded to the website, but I still have not sent an announcement to the mailing lists. So if you read this here, you might download a still unannounced version :)

Nothing really important in the release: a cleanup of the documentation with several fixes, a migration to DocBook/XML from the SGML flavour, and a new tunable called time_sleep().

The latter was worked out together with John Morrissey who did some profiling and found that on hosts with a lot of syslog connections syslog-ng might become a bottleneck. The option does nothing but sleep() a defined amount of time which makes syslog-ng process incoming messages in batches, this way decreasing the number of poll() loop iterations which was listed high (about 67%) in the profiles generated by John.

Setting time_sleep() to about 50ms decreased the CPU load by 80% which is quite significant I'd say.
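The batching effect can be shown with a small simulation. This is a deliberately simplified Python model, not the syslog-ng main loop: the function name and the traffic pattern (one message every 5 ms for one second) are invented, and the numbers say nothing about real CPU load, only about how many loop iterations are needed.

```python
def count_iterations(arrivals_ms, sleep_ms):
    """Count the loop iterations needed to drain messages arriving at given times."""
    pending = sorted(arrivals_ms)
    iterations = 0
    now = 0
    i = 0
    while i < len(pending):
        now = max(now, pending[i])   # poll() blocks until the next message arrives
        now += sleep_ms              # time_sleep()-style pause before processing
        while i < len(pending) and pending[i] <= now:
            i += 1                   # drain everything that has arrived meanwhile
        iterations += 1
    return iterations

# Hypothetical load: one message every 5 ms for one second (200 messages).
arrivals = list(range(0, 1000, 5))
print("without sleep:", count_iterations(arrivals, 0))
print("with 50 ms sleep:", count_iterations(arrivals, 50))
```

In this model each iteration with the pause handles a batch of messages instead of a single one, at the cost of up to 50 ms of extra latency per message.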

As Rusty Russell would say I have just received a SIGWIFE, so going to bed now :)

Starting my blog

I have considered starting my own blog for some time now and have finally started doing something for it. I first tried to set up blosxom but as I did not want to spend too much time customizing it I finally gave up and tried to find a nice blogger website which does everything for me. This is blogger.com, I like what I see so far.

Oops, I should have started by introducing myself: my name is Balázs Scheidler, I live in Budapest, Hungary and I started this blog because I would have some things to publish about some free software projects I am involved in and it is trendy to have a blog anyway :).

Back to my projects, I am the author of syslog-ng that you might know as an alternative system logging package for UNIX based systems. And also Zorp, an application layer gateway. You can find out more about these at http://www.balabit.com/. I also contribute patches to a couple of others (whenever I encounter something I don't like or which simply bugs me) and I sometimes poke into kernel development as well (generally netfilter related development like transparent proxying support for Linux).

So far, so good. Hopefully I won't give up too soon.

Bazsi