I like Christmas for a number of reasons: in addition to the traditional "meet and have fun with your family", eating lots of delicious food and so on, I like it because this is the time of year when I have some time to do whatever I feel like.
This year I felt like doing some syslog-ng performance analysis. After reading Ulrich Drepper's series "What every programmer should know about memory" on LWN, I thought I was more than prepared to improve syslog-ng performance. Before going any further: I'd recommend this series to any programmer. It's a bit long, but every second spent reading it is worth it.
As you need to measure performance in order to improve it, I wrote a tool called "loggen". This program generates messages at a user-specified rate. Apart from the git repository, you can get this tool from the latest syslog-ng snapshots.
Loggen supports TCP, UDP and UNIX domain sockets, so really almost everything can be measured.
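To give an idea of what such a generator does, here is a minimal sketch in Python. It is not loggen itself (which is written in C); the function names, the "loggen-sketch" program name and the padding scheme are my own assumptions for illustration. It builds fixed-size syslog-style lines and paces them to a target rate over an already connected socket.

```python
import socket
import time

def build_message(seq, size=150, pri=38):
    """Return one syslog-style line of exactly `size` bytes, newline included.

    The header layout and the 'loggen-sketch' tag are illustrative only.
    """
    head = "<%d>%s loggen-sketch: seq=%010d " % (
        pri, time.strftime("%b %d %H:%M:%S"), seq)
    body = head.encode("ascii")
    if len(body) >= size:
        raise ValueError("size is too small to hold the header")
    # Pad with filler characters so every message has the same length.
    return body + b"x" * (size - len(body) - 1) + b"\n"

def send_at_rate(sock, rate, count, size=150):
    """Send `count` messages over `sock`, paced to about `rate` messages/sec.

    Returns the elapsed wall-clock time in seconds.
    """
    interval = 1.0 / rate
    start = time.monotonic()
    for seq in range(count):
        sock.sendall(build_message(seq, size))
        # Sleep until the next scheduled slot, measured from the start,
        # so that small timing errors do not accumulate.
        delay = start + (seq + 1) * interval - time.monotonic()
        if delay > 0:
            time.sleep(delay)
    return time.monotonic() - start
```

With a TCP connection opened via socket.create_connection((host, port)), where host and port are placeholders for your own log server, calling send_at_rate(sock, rate=1000, count=10000) would aim for about a thousand 150-byte messages per second. A Python loop like this will not get anywhere near the rates discussed in this post; it only shows the pacing idea.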
Then I put together a test environment, which consisted of a 4-way Opteron box as the server (two dual-core CPUs at 2.6GHz with 1MB of cache), where the syslog-ng server ran in 64-bit mode, and a venerable P4 Xeon 2.4GHz as the client. I verified that the client was more than capable of saturating the 100MBit link that connected the two boxes. Then I installed syslog-ng on the server, using the simplest configuration possible: fetching messages from the TCP/UDP socket and writing everything to disk into a plain text file without macros in the filename.
Syslog-ng 2.0 OSE performed somewhat better than I had anticipated. When using TCP it could successfully process about 44000 messages/sec without losing a single message. Each message was 150 bytes long (I started with 200, but the 100MBit link proved to be the bottleneck). Some funny findings:
- Enabling flow-control did not really make the results worse.
- Increasing log-fetch-limit to a large number (10000) made the results worse.
- Using the GLib GSlice allocator instead of malloc/free didn't improve the numbers.
I found some things that improved performance even further. The most important bottleneck was the time-related functions in libc (localtime, mktime, strftime, etc.). For some reason they reread /etc/localtime upon every invocation. I'm going to file a ticket in their bugzilla, as it's completely unnecessary to do that, especially if the value of the TZ environment variable does not change.
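One common way to sidestep expensive time conversions is to redo the conversion only when the second changes, since every message arriving within the same second gets the same formatted timestamp. The sketch below illustrates that idea in Python; the class name and the conversion counter are hypothetical, and I am not claiming this is exactly what was changed in syslog-ng.

```python
import time

class CachedTimestamp:
    """Format timestamps, calling localtime/strftime at most once per second."""

    def __init__(self, fmt="%b %d %H:%M:%S"):
        self.fmt = fmt
        self._second = None    # whole second the cached text belongs to
        self._text = None      # cached formatted timestamp
        self.conversions = 0   # how many real conversions were performed

    def format(self, unix_time):
        second = int(unix_time)
        if second != self._second:
            # Cache miss: do the expensive conversion and remember the result.
            self._text = time.strftime(self.fmt, time.localtime(second))
            self._second = second
            self.conversions += 1
        return self._text
```

At tens of thousands of messages per second, nearly every call hits the cache, so the cost of the libc calls is paid roughly once per second instead of once per message.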
At the end of the day I finished with syslog-ng chewing through around 68500 messages/sec, which is roughly a 56% improvement over the initial 44000. I can see some further possibilities, but I doubt I could increase performance beyond 75000 msg/sec. This means that syslog-ng can process messages at close to the wire speed of a 100MBit/sec Ethernet link (68500*150 = 10275000 bytes/sec).
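For anyone who wants to redo the arithmetic behind these figures with their own numbers, a small sketch follows. The function names are mine, and the link utilisation counts message payload only, ignoring TCP/IP and Ethernet framing overhead, so the real utilisation is somewhat higher.

```python
def improvement_percent(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0

def payload_bytes_per_sec(rate, msg_size):
    """Message payload carried per second, in bytes."""
    return rate * msg_size

def link_utilisation(rate, msg_size, link_bits_per_sec=100_000_000):
    """Fraction of the link used by payload alone (protocol headers ignored)."""
    return payload_bytes_per_sec(rate, msg_size) * 8 / link_bits_per_sec

print(payload_bytes_per_sec(68500, 150))            # bytes/sec of payload
print(round(improvement_percent(44000, 68500), 1))  # percent gained
print(round(link_utilisation(68500, 150), 3))       # share of a 100MBit link
```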
I was very satisfied at this point, and even explained my findings to my wife and my elder brother :)
This of course does not apply directly to legacy UDP-based syslog traffic, unless a really large socket buffer is set for syslog-ng. I'd say that you need 3-5 seconds' worth of receive buffer in order to avoid losing messages, which at the above rate would be about 30MB-50MB of non-swappable kernel memory.
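The buffer estimate is simply rate times message size times the number of seconds to cover. The sketch below computes it and shows how a program can request a receive buffer of a given size through the standard SO_RCVBUF socket option. The helper names are my own, and the estimate counts payload only: the kernel also accounts for per-datagram bookkeeping, so treat the result as a lower bound.

```python
import socket

def rcvbuf_bytes(rate, msg_size, seconds):
    """Payload bytes that arrive in `seconds` at `rate` messages/sec."""
    return int(rate * msg_size * seconds)

def udp_socket_with_rcvbuf(nbytes):
    """Create a UDP socket and ask the kernel for an `nbytes` receive buffer.

    The kernel may grant less than requested; on Linux the request is capped
    by the net.core.rmem_max sysctl, and getsockopt reports about twice the
    granted value because bookkeeping overhead is included.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, nbytes)
    return sock

low = rcvbuf_bytes(68500, 150, 3)   # 3 seconds' worth of messages
high = rcvbuf_bytes(68500, 150, 5)  # 5 seconds' worth of messages
print(low, high)
```

Checking sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF) afterwards tells you what the kernel actually granted, which is worth doing before trusting a large value.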
These changes were committed to the Premium Edition of syslog-ng. The loggen program is GPLed, though, so anyone can do performance testing on their own setup/configuration.
Comments
This might be the wrong forum to ask all this, but I'd appreciate your help!
The fixed priority value is <38>, which means auth.info; it was basically chosen at random.
You can change this by editing line 57 in the source code.
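In case it helps when picking another value: the number in angle brackets is facility * 8 + severity, using the facility and severity codes of the BSD syslog protocol. The sketch below decodes and encodes it; the function names are only for illustration, and the less common facility codes 12-15 are reported by number rather than by name.

```python
# Facility codes 0-11 and 16-23 as used by the BSD syslog protocol.
FACILITIES = {
    0: "kern", 1: "user", 2: "mail", 3: "daemon", 4: "auth", 5: "syslog",
    6: "lpr", 7: "news", 8: "uucp", 9: "cron", 10: "authpriv", 11: "ftp",
}
FACILITIES.update({16 + i: "local%d" % i for i in range(8)})

SEVERITIES = ["emerg", "alert", "crit", "err",
              "warning", "notice", "info", "debug"]

def decode_pri(pri):
    """Split a priority value into (facility name, severity name)."""
    facility, severity = divmod(pri, 8)
    # Codes without a name in the table are shown numerically.
    return FACILITIES.get(facility, "facility%d" % facility), SEVERITIES[severity]

def encode_pri(facility_code, severity_code):
    """Combine numeric facility and severity codes into a priority value."""
    return facility_code * 8 + severity_code

print(decode_pri(38))     # ('auth', 'info')
print(encode_pri(16, 5))  # local0.notice
```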
Is there somewhere I can download just the loggen? (I'm on Ubuntu if that helps)
Or you'd like a binary of loggen? I'm afraid we only supply that as syslog-ng binaries, which is a paid-for subscription, or you can compile your own copy.
Compiling is really not that difficult, but you can always ask on the mailing list, if you have trouble.
options { time_reopen(60); dns-cache-hosts('etc/hosts'); dns_cache(yes); use_dns(no); };
source s_local { internal(); unix-dgram("log"); tcp(port(2000) log-fetch-limit(128)); };
destination d_file { file("/home/bazsi/smessages" log_fifo_size(100000)); };
log { source(s_local); destination(d_file); };
I found your blog and all the useful tools/info we can find through it. Thanks!
I've built a syslog-ng central server on Solaris (thanks to a pkg, I didn't compile it) and wanted to benchmark it.
Therefore I tried to compile loggen, but didn't succeed :(
OK, I'm a newbie to compilation, but maybe you can show me the way?
I tried a simple "gcc loggen.c -o loggen" and I was told:
loggen.c:1:20: config.h: No such file or directory
In which package should I find this config.h file? A lot of other tools (like perl) have one.
The make command gives me another error, saying it can't find any "../../configure.in", which is normal I think.
I'm sure it would have been easier to compile under Linux, but I run a security-customized SunOS 10 :S .. so I don't have a lot of libs :(
You might be able to compile loggen without that, but you must be able to resolve compilation errors such as the one you mentioned.
Try removing the config.h inclusion from the .c file and compile loggen using:
gcc -DHAVE_GETOPT -o loggen loggen.c
I haven't tested this, but it might work.
I didn't know this -DHAVE_GETOPT flag.
It doesn't seem to work, and here is the stderr:
(root) # gcc -DHAVE_GETOPT -o loggen loggen.c
Undefined                       first referenced
 symbol                             in file
nanosleep                           /var/tmp//ccGqQRvu.o
socket                              /var/tmp//ccGqQRvu.o
connect                             /var/tmp//ccGqQRvu.o
getaddrinfo                         /var/tmp//ccGqQRvu.o
freeaddrinfo                        /var/tmp//ccGqQRvu.o
ld: fatal: Symbol referencing errors. No output written to loggen
collect2: ld returned 1 exit status
strange ...
My LD_LIBRARY_PATH looks OK
Try adding the "-lrt -lnsl -lnet" command line options. This is just off the top of my head; if some of the functions are still undefined, you need to specify the correct libraries.
Again, this is usually performed by the configure script.
But the compilation is still not complete :(
(root)#gcc -DHAVE_GETOPT -lrt -lnsl -lnet -o loggen loggen.c
ld : fatal: library -lnet : not found
ld : fatal: File processing errors. No output written to loggen
Of course I tried the configure.in script, but it contains too many calls to dnl, which doesn't exist on my system.
Unfortunately, I've got a special OS.
-lsocket -lrt -lnsl
Please post questions like this on the mailing list; it is much easier to post detailed information there than in a small box like the one I'm typing this reply into.
It now works!!!
Great, and many thanks for that :)
I actually sent a mail to syslog-ng@lists.balabit.hu & php-syslog-ng-support@googlegroups.com, but you answered on this page first.
I'll pass your information on to those who couldn't compile it on php-syslog-ng-support@googlegroups.com.
Thanks again!