Static class member value destroyed before the instance of the class itself with GCC but not with Clang?

RJVB

Ars Tribunus Militum
2,906
Consider a simple class of which I create only a single instance (global variable), and that has a number of static member variables, one of which (s_lmdbEnv) is an instance of a C++ wrapper class of LMDB]. These members are all declared with initialisation (most to nullptr or false) and set properly in a dedicated init() member function which also toggles a switch indicating that initialisation was succesfull.
The class dtor checks that switch to determine whether the LMDB connection has to be closed and a compression buffer deallocated.

Code:
class LMDBHook 
{
public:
    ~LMDBHook()
    {
      if (s_envExists) {
          s_lmdbEnv.close();
          s_envExists = false;
          delete[] s_lz4CompState;
      } 
    }   
            
    static bool init()
    {           
        if (!s_envExists) {
            s_errorString.clear();
            try {
                s_lmdbEnv = lmdb::env::create();
                s_envExists = true;
                // do some more initialisation like allocating s_lz4CompState
            } catch (const lmdb::error &e) {
                s_errorString = lmdbxx_exception_handler(e, QStringLiteral("database creation"));
                // as per the documentation: the environment must be closed even if creation failed!
                s_lmdbEnv.close();
            }
        }
        return false;
    }
    static lmdb::env s_lmdbEnv;
    static bool s_envExists;
    static char* s_lz4CompState;
    static size_t s_mapSize;
    static QString s_errorString;
};
static LMDBHook LMDB;

lmdb::env LMDBHook::s_lmdbEnv{nullptr};
bool LMDBHook::s_envExists = false;
char *LMDBHook::s_lz4CompState = nullptr;
// set the initial map size to 64Mb
size_t LMDBHook::s_mapSize = 1024UL * 1024UL * 64UL;
QString LMDBHook::s_errorString;

The full code in question: https://github.com/RJVB/kdevelop/bl...ge/duchain/topducontextdynamicdata_p.cpp#L257


The code in question has been working for years on Mac (built with various clang versions) and Linux (idem, and as far as I thought also with GCC 9.4 and now GCC 12).

Out of curiosity I built the code using GCC12 on Mac, evidently -stdlib=libc++ to avoid C++ runtime clashing.
To my surprise I started getting a system report about freeing a bad pointer when closing the LMDB connection in the dtor mentioned above, which is called during the global destruction phase just before the application exists. It turns out that apparently s_lmdbEnv has already been destructed and either that doesn't happen with the other platform/compiler combinations or the double free is handled gracefully there. The same must happen when I use jemalloc or mimalloc. After some more specific testing it turns out that the problem can also occur on Linux.

My question: what's the normal order of events here, and what kind of explanation can you give for the observed compiler dependency? Is my "fix" (not calling s_lmdbEnv.close() since I cannot determine whether the instance is still valid) in fact the preferred way of doing things here?

I asked elsewhere before but got shot down because of a lacking minimal test case which I'd just as rather not write if there's an obvious error in my design. I did get the suggestion that variables are destroyed in reverse order of their creation.

My global instance is indeed created before initialising the static class members variables, so that could explain what I'm seeing with GCC but not why it doesn't happen with clang. It also seems counterintuitive: what if you access one the static members from the dtor for instance?
 

RJVB

Ars Tribunus Militum
2,906
LMDB is a global variable visible only in the file where it's declared and it's not allocated dynamically and there is no public code to destroy or clean it so I can't see how the dtor could be called from other threads than the thread which created the instance (I'd assume the main thread on just about every platform).

Extracting the illustration code above I realised that a demonstrator might not be much more work to create, so I did that, and gave it a version of the lmdb++.h header with trace output.

It's beginning to look that there's indeed a compiler dependency here, but one that looks a lot like a bug. This is the s_lmdbEnv::close() function called in the code above:

Code:
  /**
   * Closes this environment, releasing the memory map.
   *
   * @note this method is idempotent
   * @post `handle() == nullptr`
   */   
  void close() noexcept {
    std::cerr << __PRETTY_FUNCTION__ << " this=" << this << " handle=" << handle() << "\n";
    if (handle()) {
      lmdb::env_close(handle());
      _handle = nullptr;
//    std::cerr << " handle now " << handle();
    }
//     std::cerr << "\n";
  }

The trace output is mine. The order in which the various functions are executed is the same between clang and GCC (and libc++ vs libstdc++) but GCC seems to optimise the _handle = nullptr; assignment away unless I build with -O0. Or unless I uncomment the 2 trace expression (or replace them with something else that cannot be optimised away.
I've seen that with GCC 7.2 and 12.3.0 on Mac for now.

Personally I would have had this function null out _handle before calling the close function on a copy of its value, and I certainly would not use the accessor function here. But AFAIK those changes shouldn't matter for the final behaviour, right??
 
Are you sure that the double free isn't happening on this line?

delete[] s_lz4CompState;

Looking at the LMDB++ code I think it's doing all the right stuff and can't cause a double free. On the other, you're doing a delete[] and not resetting the pointer so there's definitely a possibility of a double free, although I don't know how the destructor is getting called twice.

Personally your wrapper class has some code smell to me. If it were my code I'd design it as a Singleton and get rid of all the static stuff (except for the singleton instance). If after that redesign I still needed the lzvCompState buffer I'd wrap that in a unique_ptr to avoid double frees.
 

RJVB

Ars Tribunus Militum
2,906
Are you sure that the double free isn't happening on this line?

delete[] s_lz4CompState;

In the LMDHook dtor? Do we agree I only create a single instance of that class and that the dtor thus should be called once only? If this were happening we'd probably still have a very likely compiler bug.

I'm positive, and I think that you'd be too if you look at the trace output I added to the stackoverflow post. Besides, the post-mortem backtrace is pretty clear about it, too.

Looking at the LMDB++ code I think it's doing all the right stuff and can't cause a double free.

No, the code itself is clean in that sense, even my own admittedly slightly flaky code which really depends on there being a single instance only. (If I were to use this in multiple projects I'd probably make it more robust.)

I have tried to look at the assembly code of lmdb::env::close() in a debugger, but I've never gotten my head wrapped around the x86 instruction set. Clang generates code I can understand and that has a recognisable reset of _handle. GCC generates different code:
Code:
   0x000000000040166a <+262>:   mov    %rbx,%rdi
   0x000000000040166d <+265>:   call   0x4010c0 <_ZdlPv@plt>

   0x0000000000401672 <_ZN4lmdb3envD2Ev+270>:   add    $0x8,%rsp
   0x0000000000401676 <_ZN4lmdb3envD2Ev+274>:   pop    %rbx
   0x0000000000401677 <_ZN4lmdb3envD2Ev+275>:   pop    %rbp
   0x0000000000401678 <_ZN4lmdb3envD2Ev+276>:   ret

The 1st 2 lines correspond to delete env; (inlined version of lmdb::env_close(handle()); which GCC apparently does even with -O1). The following lines may or may not do something besides returning. I'm beginning to wonder if x86 CPUs don't have a feature to handle instructions just before the return from a function "cleverly". (Can't find the proper term right now but I'm thinking something not unlike tail recursion.) That could explain why the same sequence of instructions doesn't always reset the _handle pointer, and why adding "un-ignorable" code after the reset fixes the issue.
That is, if there's only 1 machine implementation of the function. Setting a breakpoint on it mentions there are 6 locations but I have not yet figured out what those correspond to.