dlopen RTLD_LOCAL or RTLD_GLOBAL?

dark_sylinc
OGRE Team Member

Post by dark_sylinc »

I've been annoyed by a crash in the Sample Browser on Ubuntu 13.10.

After some debugging, I found out that replacing this line in OgreDynLib.h:

#    define DYNLIB_LOAD( a ) dlopen( a, RTLD_LAZY | RTLD_GLOBAL)

with this one:

#    define DYNLIB_LOAD( a ) dlopen( a, RTLD_LAZY | RTLD_LOCAL)

fixes the problem.

This crash is quite silent: it happens after closing the app and no crash dialog appears, so it goes largely unnoticed unless you're running under a debugger or from the command line.

Apparently, the global variable "SamplePlugin* sp;" defined in each sample (AtomicCounters.cpp, BezierPatch.cpp, CameraTrack.cpp, etc.) ends up shared across all loaded .so files (???), so its contents get overwritten with each load/unload.
So when the AtomicCounters sample gets unloaded first, it's actually deleting the last sample that was loaded (usually the Water sample), and the pointer becomes dangling. When BezierPatch then tries to unload, sp is dangling and the app crashes.
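
To make the failure mode concrete, here is a minimal single-process model of it. This is only a sketch: it does not call dlopen at all, and SamplePlugin, startSample and stopSample are hypothetical stand-ins rather than OGRE code. One shared slot plays the role of the "sp" that every sample appears to resolve to under RTLD_GLOBAL, and an "alive" flag replaces the real delete so the model itself never touches freed memory.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Hypothetical stand-in for OGRE's SamplePlugin. "alive" models whether the
// object has been deleted, so the simulation never dereferences freed memory.
struct SamplePlugin {
    std::string name;
    bool alive;
};

// Backing storage for the simulation; deque::push_back keeps references to
// existing elements valid.
static std::deque<SamplePlugin> storage;

// The single slot that all samples end up sharing in this model.
static SamplePlugin* sp = nullptr;

// What each sample's load step does: create its plugin and overwrite sp.
void startSample(const std::string& name) {
    storage.push_back(SamplePlugin{name, true});
    sp = &storage.back();
}

// What each sample's unload step does: destroy whatever sp points at.
// Returns the name of the plugin that was destroyed, or "<dangling>" when
// sp refers to an object that is already gone (the real app would crash).
std::string stopSample() {
    assert(sp != nullptr);
    if (!sp->alive) {
        return "<dangling>";
    }
    sp->alive = false;
    return sp->name;
}
```

Loading "AtomicCounters", "BezierPatch" and "Water" in that order and then unloading twice shows the pattern: the first stopSample() destroys "Water" regardless of which sample asked, and the second one finds the slot already dangling.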

Changing to RTLD_LOCAL fixed the problem. But since I'm not a Linux guru, I'm asking here to see if anyone knows of any other side effects of this change.
Furthermore, this flag is also used on other platforms (e.g. Android), so I have no idea whether the problem affects those platforms too, or whether changing to LOCAL would break them.

Or maybe RTLD_GLOBAL needs to stay, but an extra change is needed to make "SamplePlugin* sp;" self-contained.
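
One possible way to make sp self-contained while keeping RTLD_GLOBAL is to give it internal linkage, so it is never exported from the shared object in the first place. The sketch below assumes entry points named dllStartPlugin and dllStopPlugin and uses a hypothetical stand-in for SamplePlugin so that it compiles on its own; pluginIsLoaded is a helper invented for this example and is not part of OGRE.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for OGRE's SamplePlugin so the sketch is standalone.
struct SamplePlugin {
    explicit SamplePlugin(const std::string& n) : name(n) {}
    std::string name;
};

namespace {
// Unnamed namespace gives sp internal linkage: each sample's shared object
// keeps a private copy, and the dynamic linker has no exported symbol to
// merge across libraries. "static SamplePlugin* sp;" would work as well.
SamplePlugin* sp = nullptr;
}

// Entry points stay exported with C linkage so the host can still look them
// up by name (assumed names, see the note above).
extern "C" void dllStartPlugin() {
    assert(sp == nullptr);  // the sample should not be started twice
    sp = new SamplePlugin("BezierPatch Sample");
}

extern "C" void dllStopPlugin() {
    delete sp;
    sp = nullptr;  // leave no dangling pointer behind
}

// Helper for this sketch only: reports whether the sample is started.
extern "C" bool pluginIsLoaded() {
    return sp != nullptr;
}
```

This only hides the one variable. Any other non-static globals in the samples would presumably need the same treatment, which is part of why switching to RTLD_LOCAL may be the simpler fix.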

Any Linux guru here?
c6burns
Beholder

Re: dlopen RTLD_LOCAL or RTLD_GLOBAL?

Post by c6burns »

As a knee-jerk reaction, I would say RTLD_LOCAL is the behaviour any plugin system would want to enforce, to avoid exactly the kind of problem you describe (accidentally removing or overwriting symbols in the global relocation scope). I've been using APR's dso wrapper for years and I learned that it uses RTLD_GLOBAL (hardcoded) from the school of hard knocks.
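
As a rough illustration of that approach, a plugin loader can open each library with RTLD_LOCAL and fetch its entry points explicitly through dlsym, so nothing from the plugin leaks into the global scope. This is a minimal sketch, not OGRE's DynLib class: LoadedPlugin, loadPluginLocal, findEntry and unloadPlugin are names made up for the example, and only dlopen, dlsym, dlclose and dlerror are real POSIX calls.

```cpp
#include <cassert>
#include <dlfcn.h>
#include <string>

// Result of a load attempt: a handle on success, an error message otherwise.
struct LoadedPlugin {
    void* handle = nullptr;
    std::string error;
};

// Signature assumed for plugin entry points in this sketch.
using PluginEntry = void (*)();

// Open the library with RTLD_LOCAL so its symbols are not made available
// for resolving references in libraries loaded later.
LoadedPlugin loadPluginLocal(const std::string& path) {
    LoadedPlugin result;
    result.handle = dlopen(path.c_str(), RTLD_LAZY | RTLD_LOCAL);
    if (result.handle == nullptr) {
        const char* msg = dlerror();
        result.error = (msg != nullptr) ? msg : "unknown dlopen error";
    }
    return result;
}

// Look up an entry point in this plugin only; returns nullptr if the plugin
// failed to load or the symbol is missing.
PluginEntry findEntry(const LoadedPlugin& plugin, const char* symbol) {
    assert(symbol != nullptr);
    if (plugin.handle == nullptr) {
        return nullptr;
    }
    return reinterpret_cast<PluginEntry>(dlsym(plugin.handle, symbol));
}

// Close the library and clear the handle so it cannot be reused by mistake.
void unloadPlugin(LoadedPlugin& plugin) {
    if (plugin.handle != nullptr) {
        dlclose(plugin.handle);
        plugin.handle = nullptr;
    }
}
```

On older glibc versions this needs to be linked with -ldl. The catch with RTLD_LOCAL is that a plugin can no longer rely on symbols exported by another plugin unless it links against that library directly.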

The Android build forces OGRE_STATIC to TRUE, so I don't think anyone is even using DYNLIB_LOAD there, but for nacl and flashcc I have no idea :)