I have finally tracked down the issue I wrote about in my previous blog post. It turned out to be a problem with OpenSSL on Linux 2.6 with NPTL. The default implementation of CRYPTO_thread_id() assumes that getpid() returns a unique value for each thread; with NPTL, however, all threads of a process share the same pid and only their tid (thread id) values differ.
This made OpenSSL hash the thread-specific error state of every thread to the same memory area, so one thread could overwrite memory that was concurrently being freed in another thread. The result was heap corruption, which in turn caused crashes every now and then, of course with a backtrace completely unrelated to the original problem.
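To make the getpid() behaviour concrete, here is a minimal sketch in plain C with POSIX threads. It does not touch OpenSSL at all; the struct and function names (thread_ids, record_ids, observe_worker and the two checks) are made up for this illustration. It starts one worker thread and compares what that thread sees with what the calling thread sees: under NPTL the pid should be identical, while pthread_self() identifies each thread separately.

```c
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper struct: what a worker thread observes about itself. */
struct thread_ids {
    pid_t pid;       /* value of getpid() inside the worker */
    pthread_t self;  /* value of pthread_self() inside the worker */
};

/* Thread entry point: record the worker's own identifiers. */
static void *record_ids(void *arg)
{
    struct thread_ids *out = arg;
    out->pid = getpid();
    out->self = pthread_self();
    return NULL;
}

/* Start one worker, wait for it, and store what it saw in *out.
 * Returns 0 on success, -1 if the thread could not be created or joined. */
static int observe_worker(struct thread_ids *out)
{
    pthread_t t;
    if (pthread_create(&t, NULL, record_ids, out) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return 0;
}

/* 1 if the worker saw the same pid as the caller, 0 if not, -1 on error. */
int worker_shares_pid(void)
{
    struct thread_ids w;
    if (observe_worker(&w) != 0)
        return -1;
    return w.pid == getpid();
}

/* 1 if the worker's pthread_t differs from the caller's, 0 if not, -1 on error. */
int worker_has_distinct_thread_id(void)
{
    struct thread_ids w;
    if (observe_worker(&w) != 0)
        return -1;
    return !pthread_equal(w.self, pthread_self());
}
```

Built with something like `cc -pthread`, both checks should report 1 on an NPTL system: the pid is shared, the thread identifier is not. That is why anything keyed on getpid() cannot tell threads apart there.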
The funny part is that I had a suspicion (see yesterday's post) about this error state allocation; I just did not see the obvious reason. I was under the impression that getpid() returns a different value for each thread, just like it did with LinuxThreads. Knowing the exact reason makes the whole issue trivial :).
I have posted this as an email to openssl-dev; I wonder what the reactions are going to be.
Take care.