Multithread in python: root.startRendering()

andrecaldas
Halfling
Posts: 54
Joined: Mon May 06, 2024 1:06 am
x 3

Post by andrecaldas »

I have successfully run a modified multi-threaded version of this sample:
https://github.com/OGRECave/ogre/blob/m ... /sample.py

It was not easy, and the OGRE Python bindings needed to be modified.

First Attempt

Instead of the main() function, I put the code in the run() method of a threading.Thread subclass.

Code:

import Ogre
import Ogre.Bites
import Ogre.RTShader

import threading

class KeyListener(Ogre.Bites.InputListener):
    def keyPressed(self, evt):
        if evt.keysym.sym == Ogre.Bites.SDLK_ESCAPE:
            Ogre.Root.getSingleton().queueEndRendering()
        return True

class App(threading.Thread):
    def run(self):
        ctx = Ogre.Bites.ApplicationContext("PySample")
#
        ctx.initApp()
#
        root = ctx.getRoot()
        scn_mgr = root.createSceneManager()
#
        # register for input events
        klistener = KeyListener() # must keep a reference around
        ctx.addInputListener(klistener)
#
        shadergen = Ogre.RTShader.ShaderGenerator.getSingleton()
        shadergen.addSceneManager(scn_mgr)  # must be done before we do anything with the scene
#
        # without light we would just get a black screen
        scn_mgr.setAmbientLight((.1, .1, .1))
#
        light = scn_mgr.createLight("MainLight")
        lightnode = scn_mgr.getRootSceneNode().createChildSceneNode()
        lightnode.setPosition(0, 10, 15)
        lightnode.attachObject(light)
#
        # create the camera
        cam = scn_mgr.createCamera("myCam")
        cam.setNearClipDistance(5)
        cam.setAutoAspectRatio(True)
        camnode = scn_mgr.getRootSceneNode().createChildSceneNode()
        camnode.attachObject(cam)
#
        # map input events to camera controls
        camman = Ogre.Bites.CameraMan(camnode)
        camman.setStyle(Ogre.Bites.CS_ORBIT)
        camman.setYawPitchDist(0, 0.3, 15)
        ctx.addInputListener(camman)
#
        # and tell it to render into the main window
        vp = ctx.getRenderWindow().addViewport(cam)
        vp.setBackgroundColour((.3, .3, .3))
#
        # finally something to render
        ent = scn_mgr.createEntity("Sinbad.mesh")
        node = scn_mgr.getRootSceneNode().createChildSceneNode()
        node.attachObject(ent)
#
        root.startRendering() # blocks until queueEndRendering is called
        ctx.closeApp()

Then I started python3, imported sample.py, instantiated App and started the thread:

Code:

  $ python3
  >>> import sample
  >>> app = sample.App()
  >>> app.start()
  .... lots of output from OGRE ....
  .... But no command prompt :-( ....

Unfortunately, while root.startRendering() executes, the so-called GIL (Global Interpreter Lock) is held for the whole call, so the interactive prompt never gets a chance to run. The GIL exists because Python data must not be accessed concurrently. So, I created a method that releases the lock before calling startRendering in C++.

Of course, this caused a crash! :mrgreen:

The reason for the crash is the "KeyListener" Python class that is registered as an InputListener callback. When SWIG executes the KeyListener keyPressed() method, it does not acquire the GIL.

I do not know anything about SWIG. I am using pybind11 in my code. In pybind11, virtual methods that are overridden by Python code do acquire the GIL. So, in the end, what was needed (in C++) was:

  1. A Python-bound method that releases the GIL before calling Root::startRendering()
  2. An InputListener wrapper that acquires the GIL before calling the Python code.
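
The two items above can be modelled in pure Python. This is only a sketch: a threading.Lock stands in for the real GIL, and the names ListenerWrapper, render_without_lock and record_key are hypothetical, not part of OGRE, SWIG or pybind11.

```python
import threading

gil = threading.Lock()  # stand-in for the real GIL, for illustration only


class ListenerWrapper:
    """Model of item 2: take the lock before running Python callback code."""

    def __init__(self, callback):
        self._callback = callback

    def key_pressed(self, key):
        with gil:  # acquire before touching Python, release afterwards
            return self._callback(key)


def render_without_lock(listener, keys):
    """Model of item 1: release the lock for the duration of the 'C++' loop."""
    gil.release()
    try:
        for key in keys:  # pretend these are input events raised by C++
            listener.key_pressed(key)
    finally:
        gil.acquire()  # restore the lock before returning to Python


seen = []


def record_key(key):
    # Record whether the lock is held while the callback runs.
    seen.append((key, gil.locked()))
    return True


gil.acquire()  # Python code normally runs while holding the lock
render_without_lock(ListenerWrapper(record_key), ["a", "ESC"])
held_after = gil.locked()
gil.release()
```

In this model every callback sees the lock held, and the lock is held again once the render call returns.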

The final python code became:

Code:

import Ogre
import Ogre.Bites
import Ogre.RTShader

import my_bindings as mb

import threading

#class KeyListener(Ogre.Bites.InputListener):
class KeyListener(mb.InputListener):
# changed the python method name just to make it easier...
    def escPressed(self, evt):
        Ogre.Root.getSingleton().queueEndRendering()
        return True

class App(threading.Thread):
    def run(self):
        ctx = Ogre.Bites.ApplicationContext("PySample")
#
        ctx.initApp()
#
        root = ctx.getRoot()
        scn_mgr = root.createSceneManager()
#
        # register for input events
        klistener = KeyListener() # must keep a reference around
#        ctx.addInputListener(klistener)
        mb.addInputListener(ctx, klistener)
#
        shadergen = Ogre.RTShader.ShaderGenerator.getSingleton()
        shadergen.addSceneManager(scn_mgr)  # must be done before we do anything with the scene
#
        # without light we would just get a black screen
        scn_mgr.setAmbientLight((.1, .1, .1))
#
        light = scn_mgr.createLight("MainLight")
        lightnode = scn_mgr.getRootSceneNode().createChildSceneNode()
        lightnode.setPosition(0, 10, 15)
        lightnode.attachObject(light)
#
        # create the camera
        cam = scn_mgr.createCamera("myCam")
        cam.setNearClipDistance(5)
        cam.setAutoAspectRatio(True)
        camnode = scn_mgr.getRootSceneNode().createChildSceneNode()
        camnode.attachObject(cam)
#
        # map input events to camera controls
        camman = Ogre.Bites.CameraMan(camnode)
        camman.setStyle(Ogre.Bites.CS_ORBIT)
        camman.setYawPitchDist(0, 0.3, 15)
        ctx.addInputListener(camman)
#
        # and tell it to render into the main window
        vp = ctx.getRenderWindow().addViewport(cam)
        vp.setBackgroundColour((.3, .3, .3))
#
        # finally something to render
        ent = scn_mgr.createEntity("Sinbad.mesh")
        node = scn_mgr.getRootSceneNode().createChildSceneNode()
        node.attachObject(ent)
#
#        root.startRendering() # blocks until queueEndRendering is called
        mb.no_gil_render(root) # blocks until queueEndRendering is called
        ctx.closeApp()

Of course, we could have a loop and call the method to render only one frame. However, the GIL would be locked during each frame rendering, and I want python to be doing other things while rendering... 8)
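
For comparison, the one-frame-at-a-time loop mentioned above might look like the sketch below. It assumes the bindings expose Root.renderOneFrame() and Root.endRenderingQueued() as the C++ API does; FakeRoot is a hypothetical stand-in so the loop can be tried without OGRE.

```python
def render_loop(root, between_frames=None):
    """Render frame by frame until the end of rendering is queued.

    Other Python threads can only be scheduled between frames, because
    each renderOneFrame() call holds the GIL unless the binding releases it.
    """
    frames = 0
    while not root.endRenderingQueued():
        if not root.renderOneFrame():  # a listener asked to stop
            break
        frames += 1
        if between_frames is not None:
            between_frames()  # e.g. process queued commands
    return frames


class FakeRoot:
    """Hypothetical stand-in for Ogre.Root, used only for illustration."""

    def __init__(self, frames_before_quit):
        self._left = frames_before_quit

    def endRenderingQueued(self):
        return self._left <= 0

    def renderOneFrame(self):
        self._left -= 1
        return True


rendered = render_loop(FakeRoot(3))
```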

Are we interested in this kind of thing?

PS: I have written an SDL3 ApplicationContext for OgreBites. My changes, however, are too drastic and, if ever merged, could not go into the current release. Since I am already doing "drastic" and "useless" changes, I could try to replace SWIG with pybind11. Would that be welcome? :-)

paroj
OGRE Team Member
Posts: 2106
Joined: Sun Mar 30, 2014 2:51 pm
x 1132

Post by paroj »

did you see Ogre.Root.getSingleton().allowPyThread()? in:
https://www.ogre3d.org/2024/09/22/ogre- ... for-python

Post by andrecaldas »

paroj wrote: Wed Oct 02, 2024 9:04 am

did you see Ogre.Root.getSingleton().allowPyThread()? in:
https://www.ogre3d.org/2024/09/22/ogre- ... for-python

I couldn't find any "allowPyThread()"; I suppose it is not in master yet. Even the name "allowPyThread()" seems to assume people will call this method without really knowing what they are doing. They are "allowing a python thread"... why not? :( :shock: :?

Anyway, I find it way too hackish. Locks need to be handled with great care and awareness.

  1. Locks should use RAII (temporarily acquire or release).
  2. General C++ code should not have to be aware of python. The bindings need to be aware of python. Otherwise, it is too messy.
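
The closest Python analogue of the RAII idea in item 1 is a context manager. The sketch below uses a plain threading.Lock in place of the GIL, and temporarily_released is a hypothetical helper name; the point is only that the lock is restored even when the guarded code raises.

```python
import threading
from contextlib import contextmanager


@contextmanager
def temporarily_released(lock):
    """Release an already-held lock for the body and always re-acquire it."""
    lock.release()
    try:
        yield
    finally:
        lock.acquire()  # runs on normal exit and on exceptions alike


lock = threading.Lock()
lock.acquire()
states = []
try:
    with temporarily_released(lock):
        states.append(lock.locked())  # lock is released inside the block
        raise RuntimeError("render failed")
except RuntimeError:
    pass
states.append(lock.locked())  # lock was restored despite the exception
lock.release()
```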

The ideal way to implement it is to have python bindings release the lock when they know it will not be needed.

The biggest challenge in doing this is that general C++ code can end up calling Python code inadvertently, because Python callbacks might be registered. In Ogre, (Python) callbacks are registered as a (Python) descendant of some virtual class, like OgreBites::InputListener. The solution, however, is simple: the binding code needs to temporarily acquire the GIL whenever a virtual method bound to Python code is called. It seems to me SWIG does not do that. Of course, this can be worked around even in SWIG. It is just not as clean.

I actually do not see much need for the HighPy module. It seems to me it avoids calling back from C++ simply by reimplementing the whole startRendering or renderOneFrame in python. The only advantage seems to be the fact that the GIL is not held for a long time. I suppose it is still "fake multitasking" (without the allowPyThread) because no code is executed in concurrent threads.

I suggest getting rid of allowPyThread() and, instead, implementing python bindings properly.

Post by paroj »

allowPyThread() is part of the 14.3 release and is not related to the highpy module. I just put it there to group the python stuff.
I agree that documentation can be improved here.

Note that SWIG can be configured to globally handle the GIL, but the constant GIL un/-locking results in a 60% overhead in the bindings:
https://www.swig.org/Doc4.0/Python.html ... erformance

allowPyThread() avoids that by only releasing it during swap() when no callbacks can happen

Post by andrecaldas »

paroj wrote: Wed Oct 02, 2024 1:44 pm

allowPyThread() is part of the 14.3 [...]

Sorry, I did not search ".i" files. :-)

paroj wrote: Wed Oct 02, 2024 1:44 pm

Note that SWIG can be configured to globally handle the GIL, but the constant GIL un/-locking results in a 60% overhead in the bindings:
https://www.swig.org/Doc4.0/Python.html ... erformance

I do not really understand how SWIG works internally, nor how this magic is done.

I do believe locks need great awareness from those who deal with them:

  1. People who deal with locks need awareness.
  2. People who do not really understand those locks shall not deal with them.

The SWIG magic will probably check for the GIL and try to acquire it every time a binding is "used". This is "magical" and wasteful. The reason is SWIG's lack of awareness of your code. It is designed so that things work without SWIG, or SWIG users, needing to understand what is happening.

The only time you need to acquire a GIL lock is in the sequence of events:

  1. The C++ code gets a pointer "ptr" to a virtual base. (eg: InputListener* ptr)
  2. The "ptr" is actually a python class that derives from InputListener.
  3. The C++ code calls a virtual method that was implemented in python.

It is only at step 3 that the GIL needs to be acquired.

paroj wrote: Wed Oct 02, 2024 1:44 pm

allowPyThread() avoids that by only releasing it during swap() when no callbacks can happen

I am taking a look at "allowPyThread()". It does not release during swap(). What it actually does is register a frame event listener that releases the GIL in "frameRenderingQueued()" and re-acquires it in "frameEnded()".

This is a hack. I see no guarantees that no other callbacks will be triggered between those two.
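
The concern can be illustrated with a small pure-Python model. It is a sketch under assumptions: a threading.Lock stands in for the GIL, ThreadAllowListener only mimics the release/re-acquire pattern described above, and whether OGRE can actually fire another callback inside that window is exactly the open question.

```python
import threading

gil = threading.Lock()  # stand-in for the GIL, for illustration only


class ThreadAllowListener:
    """Mimics the pattern: release in one frame event, re-acquire in the next."""

    def frameRenderingQueued(self):
        gil.release()
        return True

    def frameEnded(self):
        gil.acquire()
        return True


def run_frame(listener, callbacks_in_window):
    """Fire the two frame events with hypothetical callbacks in between."""
    observed = []
    listener.frameRenderingQueued()
    for callback in callbacks_in_window:
        observed.append(callback())  # runs while the lock is NOT held
    listener.frameEnded()
    return observed


gil.acquire()  # Python code normally runs while holding the lock
observed = run_frame(ThreadAllowListener(), [gil.locked])
held_after_frame = gil.locked()
gil.release()
```

Any callback that runs inside the window observes the lock as released, which for real Python code would mean touching interpreter state without the GIL.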

If you need to go too deep to demonstrate the code is correct and not just "works most of the time", it looks like a bad idea.

Another Example

Suppose I want to write an Ogre application that has a python console. I do believe, for example, that many would find it useful to have a python console inside their Ogre app. :-)

I do everything in C++. But the application user can register an InputListener in python. Or some other type of callback (if there is any - I don't know and I shouldn't be required to).

This will cause a crash when any key is pressed or released, or the mouse is moved, etc, etc, etc.

Just the rendering

You are assuming that only frame rendering needs protection and only frame rendering will be executed without the GIL.

If this is the case, then instead of this "allowPyThread()", you could simply release the GIL and re-acquire it before executing the frame callbacks. This would be much cleaner and less hackish.

But still, as long as the C++ part is thread safe, the solution I propose works for everything, not just rendering.

PS: I do not actually know what rendering means. :mrgreen:
PPS: I do not understand what "during swap()" means. :mrgreen: :mrgreen:

Post by paroj »

If this is the case, then instead of this "allowPyThread()", you could simply release the GIL and re-acquire it before executing the frame callbacks. This would be much cleaner and less hackish.

this is actually what allowPyThread() aims for. As it only works "most of the time" it is an opt-in solution. You can try to use it and check whether it breaks for you. It should cover most use-cases.

A proper solution would need special support in the C++ API, which I tried to avoid for now as the GIL will become optional in the near future: https://peps.python.org/pep-0703/

Post by andrecaldas »

paroj wrote: Wed Oct 02, 2024 5:13 pm

If this is the case, then instead of this "allowPyThread()", you could simply release the GIL and re-acquire it before executing the frame callbacks. This would be much cleaner and less hackish.

this is actually what allowPyThread() aims for.

I don't get it.
From what you said, you could simply release the GIL while executing this:
https://github.com/OGRECave/ogre/blob/c ... #L268-L275
Of course, it would be better to use RAII for the lock. But you could simply release and restore it as well.

paroj wrote: Wed Oct 02, 2024 5:13 pm

A proper solution would need special support in the C++ API,

Not necessarily. As far as I understood, you could enable threads in SWIG and disable it by default using the "nothreadallow" feature. Then you can enable it just for OgreBites::InputListener.

If I understood correctly how SWIG, which I met a couple of days ago, handles the GIL, this would suffice to cover much more than "allowPyThread()" does. And it would require almost nothing.

paroj wrote: Wed Oct 02, 2024 5:13 pm

which I tried to avoid for now as the GIL will become optional in the near future: https://peps.python.org/pep-0703/

I do not know the proper meaning of "near future". :mrgreen:

Jokes aside, everything I described requires neither the C++ programmer nor the Python programmer to know anything about the GIL.

The only person that has to be aware of the GIL is the python binding developer.

Post by paroj »

andrecaldas wrote: Wed Oct 02, 2024 6:32 pm

I don't get it.
From what you said, you could simply release the GIL while executing this:
https://github.com/OGRECave/ogre/blob/c ... #L268-L275

you need listeners to do so - otherwise OgreMain will depend on libPython.
Incidentally, this is exactly what allowPyThread() does using the existing listeners:
https://github.com/OGRECave/ogre/blob/c ... .cpp#L1155

andrecaldas wrote: Wed Oct 02, 2024 6:32 pm

Not necessarily. As far as I understood, you could enable threads in SWIG and disable it by default using the "nothreadallow" feature. Then you can enable it just for OgreBites::InputListener.

you would need to re-acquire the GIL for all callbacks that you identified above, not just InputListener.

Post by andrecaldas »

paroj wrote: Wed Oct 02, 2024 11:20 pm

you would need to re-acquire the GIL for all callbacks that you identified above, not just InputListener.

Do you mean all those?

Ogre::FrameListener;
Ogre::LodListener;
Ogre::RenderObjectListener;
Ogre::RenderQueueListener;
Ogre::RenderTargetListener;
Ogre::MeshSerializerListener;
Ogre::ResourceLoadingListener;

Post by andrecaldas »

Or even those:

Ogre::RenderSystem::Listener

In SWIG, it would be every class that has a "director"...

Post by andrecaldas »

paroj wrote: Wed Oct 02, 2024 11:20 pm

you would need to re-acquire the GIL for all callbacks that you identified above, not just InputListener.

Now I understand the problem. There are too many of them.
I have submitted a feature request to SWIG. :-)
https://github.com/swig/swig/issues

I suppose Listeners should be protected by SWIG (acquire the GIL when executed from C++), so they could be called by any C++ code. The only problem here is that SWIG requires you to list all virtual methods for all those *::Listener classes. If I could specify a class instead, it would be as easy as it is with the "director" feature. We would have two lines per "director":

Code:

%feature("nothread", "0") *::Listener;
%feature("nothreadallow") *::Listener;
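
As long as every virtual method has to be listed, the per-method lines can be generated rather than typed by hand. The helper below is a hypothetical sketch, not part of OGRE or SWIG; the class and method names in the example dictionary are taken from the listeners discussed in this thread, and the dictionary itself would have to be maintained by hand.

```python
def gil_feature_lines(class_name, methods):
    """Return the SWIG %feature lines for one listener class."""
    lines = ["/* Lock the GIL when Python is called from C++ */"]
    lines += [f'%feature("nothread", "0") {class_name}::{m};' for m in methods]
    lines += ["", "/* Do not release the GIL when called from Python */"]
    lines += [f'%feature("nothreadallow") {class_name}::{m};' for m in methods]
    return lines


# Example input: listener class -> virtual methods to protect.
listeners = {
    "Ogre::FrameListener": ["frameStarted", "frameRenderingQueued", "frameEnded"],
    "Ogre::LogListener": ["messageLogged"],
}

text = "\n\n".join(
    "\n".join(gil_feature_lines(cls, methods)) for cls, methods in listeners.items()
)
print(text)
```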

I have listed lots of listeners and made a patch that replaces "allowPyThread()". I do not expect you will like it... but I did it anyway. :mrgreen:

As you can see, the problem is having to enumerate all virtual methods... for all those *::Listener classes. Otherwise, it works like a charm, and could be used in many other places, because this way it is safe to call back those listeners from anywhere inside C++, not just the rendering parts (although I do not really understand what the rendering is, exactly).

Code:

diff --git a/Components/Bites/include/OgreBites.i b/Components/Bites/include/OgreBites.i
index 5a40d3dd0..c506a8fe8 100644
--- a/Components/Bites/include/OgreBites.i
+++ b/Components/Bites/include/OgreBites.i
@@ -1,4 +1,4 @@
-%module(package="Ogre", directors="1") Bites
+%module(package="Ogre", directors="1", threads="1") Bites
 %{
 /* Includes the header in the wrapper code */
 #include "Ogre.h"
@@ -15,6 +15,7 @@
 #include "OgreImGuiInputListener.h"
 %}
 
+%feature("nothread");
 %include std_vector.i
 %include std_string.i
 %include exception.i 
@@ -26,6 +27,39 @@
 %include "OgreSGTechniqueResolverListener.h"
 %template(InputListenerList) std::vector<OgreBites::InputListener*>;
 %feature("director") OgreBites::ApplicationContextBase;
+#ifdef SWIGPYTHON
+/* Locks GIL when python is called from c++ */
+%feature("nothread", "0") OgreBites::InputListener::frameRendered;
+%feature("nothread", "0") OgreBites::InputListener::keyPressed;
+%feature("nothread", "0") OgreBites::InputListener::keyReleased;
+%feature("nothread", "0") OgreBites::InputListener::touchMoved;
+%feature("nothread", "0") OgreBites::InputListener::touchPressed;
+%feature("nothread", "0") OgreBites::InputListener::touchReleased;
+%feature("nothread", "0") OgreBites::InputListener::mouseMoved;
+%feature("nothread", "0") OgreBites::InputListener::mouseWheelRolled;
+%feature("nothread", "0") OgreBites::InputListener::mousePressed;
+%feature("nothread", "0") OgreBites::InputListener::mouseReleased;
+%feature("nothread", "0") OgreBites::InputListener::textInput;
+%feature("nothread", "0") OgreBites::InputListener::axisMoved;
+%feature("nothread", "0") OgreBites::InputListener::buttonPressed;
+%feature("nothread", "0") OgreBites::InputListener::buttonReleased;
+
+/* Do not release lock when calling from python */
+%feature("nothreadallow") OgreBites::InputListener::frameRendered;
+%feature("nothreadallow") OgreBites::InputListener::keyPressed;
+%feature("nothreadallow") OgreBites::InputListener::keyReleased;
+%feature("nothreadallow") OgreBites::InputListener::touchMoved;
+%feature("nothreadallow") OgreBites::InputListener::touchPressed;
+%feature("nothreadallow") OgreBites::InputListener::touchReleased;
+%feature("nothreadallow") OgreBites::InputListener::mouseMoved;
+%feature("nothreadallow") OgreBites::InputListener::mouseWheelRolled;
+%feature("nothreadallow") OgreBites::InputListener::mousePressed;
+%feature("nothreadallow") OgreBites::InputListener::mouseReleased;
+%feature("nothreadallow") OgreBites::InputListener::textInput;
+%feature("nothreadallow") OgreBites::InputListener::axisMoved;
+%feature("nothreadallow") OgreBites::InputListener::buttonPressed;
+%feature("nothreadallow") OgreBites::InputListener::buttonReleased;
+#endif
 %feature("director") OgreBites::InputListener;
 %include "OgreInput.h"
 
@@ -73,4 +107,4 @@ JNIEnv* OgreJNIGetEnv();
 #ifndef SWIGCSHARP
 %include "OgreTrays.h"
 %include "OgreAdvancedRenderControls.h"
-#endif
\ No newline at end of file
+#endif
diff --git a/OgreMain/include/Ogre.i b/OgreMain/include/Ogre.i
index d7850673f..6badcce1d 100644
--- a/OgreMain/include/Ogre.i
+++ b/OgreMain/include/Ogre.i
@@ -1,4 +1,4 @@
- %module(package="Ogre", directors="1") Ogre
+ %module(package="Ogre", directors="1", threads="1") Ogre
  %{
  /* Includes the header in the wrapper code */
 #include "Ogre.h"
@@ -19,6 +19,7 @@
 #include "OgreDefaultDebugDrawer.h"
 %}
 
+%feature("nothread");
 %include stdint.i
 %include std_shared_ptr.i
 %include std_string.i
@@ -54,6 +55,65 @@ typedef long int time_t;
 // should be turned on globally if all renames are in place
 %feature("flatnested") Ogre::MaterialManager::Listener;
 %feature("flatnested") Ogre::SceneManager::Listener;
+
+
+/* Locks GIL when python is called from c++ */
+%feature("nothread", "0") Ogre::MaterialManager::Listener::handleSchemeNotFound;
+%feature("nothread", "0") Ogre::MaterialManager::Listener::afterIlluminationPassesCreated;
+%feature("nothread", "0") Ogre::MaterialManager::Listener::beforeIlluminationPassesCleared;
+
+/* Do not release lock when calling from python */
+%feature("nothreadallow") Ogre::MaterialManager::Listener::handleSchemeNotFound;
+%feature("nothreadallow") Ogre::MaterialManager::Listener::afterIlluminationPassesCreated;
+%feature("nothreadallow") Ogre::MaterialManager::Listener::beforeIlluminationPassesCleared;
+
+
+/* Locks GIL when python is called from c++ */
+%feature("nothread", "0") Ogre::RenderSystem::Listener::eventOccurred;
+
+/* Do not release lock when calling from python */
+%feature("nothreadallow") Ogre::RenderSystem::Listener::eventOccurred;
+
+
+/* Locks GIL when python is called from c++ */
+%feature("nothread", "0") Ogre::Camera::Listener::cameraPreRenderScene;
+%feature("nothread", "0") Ogre::Camera::Listener::cameraPostRenderScene;
+%feature("nothread", "0") Ogre::Camera::Listener::cameraDestroyed;
+
+/* Do not release lock when calling from python */
+%feature("nothreadallow") Ogre::Camera::Listener::cameraPreRenderScene;
+%feature("nothreadallow") Ogre::Camera::Listener::cameraPostRenderScene;
+%feature("nothreadallow") Ogre::Camera::Listener::cameraDestroyed;
+
+
+/* Locks GIL when python is called from c++ */
+%feature("nothread", "0") Ogre::SceneManager::Listener::preUpdateSceneGraph;
+%feature("nothread", "0") Ogre::SceneManager::Listener::postUpdateSceneGraph;
+%feature("nothread", "0") Ogre::SceneManager::Listener::preFindVisibleObjects;
+%feature("nothread", "0") Ogre::SceneManager::Listener::postFindVisibleObjects;
+%feature("nothread", "0") Ogre::SceneManager::Listener::sceneManagerDestroyed;
+
+/* Do not release lock when calling from python */
+%feature("nothreadallow") Ogre::SceneManager::Listener::preUpdateSceneGraph;
+%feature("nothreadallow") Ogre::SceneManager::Listener::postUpdateSceneGraph;
+%feature("nothreadallow") Ogre::SceneManager::Listener::preFindVisibleObjects;
+%feature("nothreadallow") Ogre::SceneManager::Listener::postFindVisibleObjects;
+%feature("nothreadallow") Ogre::SceneManager::Listener::sceneManagerDestroyed;
+
+
+/* Locks GIL when python is called from c++ */
+%feature("nothread", "0") Ogre::MaterialSerializer::Listener::materialEventRaised;
+%feature("nothread", "0") Ogre::MaterialSerializer::Listener::techniqueEventRaised;
+%feature("nothread", "0") Ogre::MaterialSerializer::Listener::passEventRaised;
+%feature("nothread", "0") Ogre::MaterialSerializer::Listener::gpuProgramRefEventRaised;
+%feature("nothread", "0") Ogre::MaterialSerializer::Listener::textureUnitStateEventRaised;
+
+/* Do not release lock when calling from python */
+%feature("nothreadallow") Ogre::MaterialSerializer::Listener::materialEventRaised;
+%feature("nothreadallow") Ogre::MaterialSerializer::Listener::techniqueEventRaised;
+%feature("nothreadallow") Ogre::MaterialSerializer::Listener::passEventRaised;
+%feature("nothreadallow") Ogre::MaterialSerializer::Listener::gpuProgramRefEventRaised;
+%feature("nothreadallow") Ogre::MaterialSerializer::Listener::textureUnitStateEventRaised;
 #endif
 
 %ignore *::operator=;  // needs rename to wrap
@@ -487,6 +547,13 @@ ADD_REPR(Plane)
 %ignore Ogre::Log::Stream; // not useful in bindings
 %ignore Ogre::Log::stream;
 %feature("director") Ogre::LogListener;
+#ifdef SWIGPYTHON
+/* Locks GIL when python is called from c++ */
+%feature("nothread", "0") Ogre::LogListener::messageLogged;
+
+/* Do not release lock when calling from python */
+%feature("nothreadallow") Ogre::LogListener::messageLogged;
+#endif
 %ignore Ogre::Log::setLogDetail;
 %include "OgreLog.h"
 %ignore Ogre::LogManager::stream; // not useful in bindings
@@ -515,19 +582,124 @@ SHARED_PTR(FileHandleDataStream);
 %include "OgreCodec.h"
 %include "OgreSerializer.h"
 %include "OgreScriptLoader.h"
+
 // Listeners
 %feature("director") Ogre::FrameListener;
+#ifdef SWIGPYTHON
+/* Locks GIL when python is called from c++ */
+%feature("nothread", "0") Ogre::FrameListener::frameStarted;
+%feature("nothread", "0") Ogre::FrameListener::frameRenderingQueued;
+%feature("nothread", "0") Ogre::FrameListener::frameEnded;
+
+/* Do not release lock when calling from python */
+%feature("nothreadallow") Ogre::FrameListener::frameStarted;
+%feature("nothreadallow") Ogre::FrameListener::frameRenderingQueued;
+%feature("nothreadallow") Ogre::FrameListener::frameEnded;
+#endif
 %include "OgreFrameListener.h"
+
 %feature("director") Ogre::LodListener;
+#ifdef SWIGPYTHON
+/* Locks GIL when python is called from c++ */
+%feature("nothread", "0") Ogre::LodListener::prequeueMovableObjectLodChanged;
+%feature("nothread", "0") Ogre::LodListener::postqueueMovableObjectLodChanged;
+%feature("nothread", "0") Ogre::LodListener::prequeueEntityMeshLodChanged;
+%feature("nothread", "0") Ogre::LodListener::postqueueEntityMeshLodChanged;
+%feature("nothread", "0") Ogre::LodListener::prequeueEntityMaterialLodChanged;
+%feature("nothread", "0") Ogre::LodListener::postqueueEntityMaterialLodChanged;
+
+/* Do not release lock when calling from python */
+%feature("nothreadallow") Ogre::LodListener::prequeueMovableObjectLodChanged;
+%feature("nothreadallow") Ogre::LodListener::postqueueMovableObjectLodChanged;
+%feature("nothreadallow") Ogre::LodListener::prequeueEntityMeshLodChanged;
+%feature("nothreadallow") Ogre::LodListener::postqueueEntityMeshLodChanged;
+%feature("nothreadallow") Ogre::LodListener::prequeueEntityMaterialLodChanged;
+%feature("nothreadallow") Ogre::LodListener::postqueueEntityMaterialLodChanged;
+#endif
 %include "OgreLodListener.h"
+
 %feature("director") Ogre::RenderObjectListener;
+#ifdef SWIGPYTHON
+/* Locks GIL when python is called from c++ */
+%feature("nothread", "0") Ogre::RenderObjectListener::prequeueMovableObjectLodChanged;
+%feature("nothread", "0") Ogre::RenderObjectListener::postqueueMovableObjectLodChanged;
+%feature("nothread", "0") Ogre::RenderObjectListener::prequeueEntityMeshLodChanged;
+%feature("nothread", "0") Ogre::RenderObjectListener::postqueueEntityMeshLodChanged;
+%feature("nothread", "0") Ogre::RenderObjectListener::prequeueEntityMaterialLodChanged;
+%feature("nothread", "0") Ogre::RenderObjectListener::postqueueEntityMaterialLodChanged;
+
+/* Do not release lock when calling from python */
+%feature("nothreadallow") Ogre::RenderObjectListener::prequeueMovableObjectLodChanged;
+%feature("nothreadallow") Ogre::RenderObjectListener::postqueueMovableObjectLodChanged;
+%feature("nothreadallow") Ogre::RenderObjectListener::prequeueEntityMeshLodChanged;
+%feature("nothreadallow") Ogre::RenderObjectListener::postqueueEntityMeshLodChanged;
+%feature("nothreadallow") Ogre::RenderObjectListener::prequeueEntityMaterialLodChanged;
+%feature("nothreadallow") Ogre::RenderObjectListener::postqueueEntityMaterialLodChanged;
+#endif
 %include "OgreRenderObjectListener.h"
+
 %feature("director") Ogre::RenderQueueListener;
+#ifdef SWIGPYTHON
+/* Locks GIL when python is called from c++ */
+%feature("nothread", "0") Ogre::RenderQueueListener::preRenderQueues;
+%feature("nothread", "0") Ogre::RenderQueueListener::postRenderQueues;
+%feature("nothread", "0") Ogre::RenderQueueListener::renderQueueStarted;
+%feature("nothread", "0") Ogre::RenderQueueListener::renderQueueEnded;
+
+/* Do not release lock when calling from python */
+%feature("nothreadallow") Ogre::RenderQueueListener::preRenderQueues;
+%feature("nothreadallow") Ogre::RenderQueueListener::postRenderQueues;
+%feature("nothreadallow") Ogre::RenderQueueListener::renderQueueStarted;
+%feature("nothreadallow") Ogre::RenderQueueListener::renderQueueEnded;
+#endif
 %include "OgreRenderQueueListener.h"
+
 %feature("director") Ogre::RenderTargetListener;
+#ifdef SWIGPYTHON
+/* Locks GIL when python is called from c++ */
+%feature("nothread", "0") Ogre::RenderTargetListener::preRenderTargetUpdate;
+%feature("nothread", "0") Ogre::RenderTargetListener::postRenderTargetUpdate;
+%feature("nothread", "0") Ogre::RenderTargetListener::preViewportUpdate;
+%feature("nothread", "0") Ogre::RenderTargetListener::postViewportUpdate;
+%feature("nothread", "0") Ogre::RenderTargetListener::viewportAdded;
+%feature("nothread", "0") Ogre::RenderTargetListener::viewportRemoved;
+
+/* Do not release lock when calling from python */
+%feature("nothreadallow") Ogre::RenderTargetListener::preRenderTargetUpdate;
+%feature("nothreadallow") Ogre::RenderTargetListener::postRenderTargetUpdate;
+%feature("nothreadallow") Ogre::RenderTargetListener::preViewportUpdate;
+%feature("nothreadallow") Ogre::RenderTargetListener::postViewportUpdate;
+%feature("nothreadallow") Ogre::RenderTargetListener::viewportAdded;
+%feature("nothreadallow") Ogre::RenderTargetListener::viewportRemoved;
+#endif
 %include "OgreRenderTargetListener.h"
+
 %feature("director") Ogre::MeshSerializerListener;
+#ifdef SWIGPYTHON
+/* Locks GIL when python is called from c++ */
+%feature("nothread", "0") Ogre::MeshSerializerListener::processMaterialName;
+%feature("nothread", "0") Ogre::MeshSerializerListener::processSkeletonName;
+%feature("nothread", "0") Ogre::MeshSerializerListener::processMeshCompleted;
+
+/* Do not release lock when calling from python */
+%feature("nothreadallow") Ogre::MeshSerializerListener::processMaterialName;
+%feature("nothreadallow") Ogre::MeshSerializerListener::processSkeletonName;
+%feature("nothreadallow") Ogre::MeshSerializerListener::processMeshCompleted;
+#endif
+
 %feature("director") Ogre::ResourceLoadingListener;
+#ifdef SWIGPYTHON
+/* Locks GIL when python is called from c++ */
+%feature("nothread", "0") Ogre::ResourceLoadingListener::resourceLoading;
+%feature("nothread", "0") Ogre::ResourceLoadingListener::resourceStreamOpened;
+%feature("nothread", "0") Ogre::ResourceLoadingListener::resourceCollision;
+
+/* Do not release lock when calling from python */
+%feature("nothreadallow") Ogre::ResourceLoadingListener::resourceLoading;
+%feature("nothreadallow") Ogre::ResourceLoadingListener::resourceStreamOpened;
+%feature("nothreadallow") Ogre::ResourceLoadingListener::resourceCollision;
+#endif
+
 // More Data Types
 %ignore Ogre::ColourValue::getHSB; // deprecated
 %include "OgreColourValue.h"
@@ -925,33 +1097,14 @@ SHARED_PTR(Mesh);
 %ignore Ogre::Root::createSceneManager(uint16, const String&);
 %ignore Ogre::Root::getMovableObjectFactoryIterator;
 #ifdef SWIGPYTHON
-%{
-class ThreadAllowFrameListener : public Ogre::FrameListener {
-    PyThreadState* _save = 0;
-public:
-    bool frameRenderingQueued(const Ogre::FrameEvent& evt)
-    {
-        if(!_save)
-            _save = PyEval_SaveThread();
-        return true;
-    }
-    bool frameEnded(const Ogre::FrameEvent& evt)
-    {
-        if(_save) {
-            PyEval_RestoreThread(_save);
-            _save = 0;
-        }
-        return true;
-    }
-};
-%}
-%extend Ogre::Root {
-    void allowPyThread()
-    {
-        static ThreadAllowFrameListener listener;
-        $self->addFrameListener(&listener);
-    }
-}
+/* Unlocks SIG when called from python */
+%feature("nothread", "0") Ogre::Root::startRendering;
+%feature("nothread", "0") Ogre::Root::renderOneFrame;
+%feature("nothread", "0") Ogre::Root::_updateAllRenderTargets;
+
+%feature("nothreadblock") Ogre::Root::startRendering;
+%feature("nothreadblock") Ogre::Root::renderOneFrame;
+%feature("nothreadblock") Ogre::Root::_updateAllRenderTargets;
 #endif
 %include "OgreRoot.h"
 // dont wrap: not useful in high level languages
 
Post by andrecaldas »

paroj wrote: Wed Oct 02, 2024 11:20 pm

you need listeners to do so - otherwise OgreMain will depend on libPython.
[...]
you would need to re-acquire the GIL for all callbacks that you identified above, not just InputListener.

I knew how to do it in pybind11. Now I know how to do it in SWIG. :mrgreen:

It turned out to be very simple and I believe there will be no performance penalty as compared to allowPyThread().

Python methods called from C++ code through virtual functions (i.e., listeners) will lock the GIL. This is a very cheap operation, especially compared to the whole process of calling Python code from C++.

As with allowPyThread(), a few methods will release the GIL while processing occurs through the expensive pair PyEval_SaveThread/PyEval_RestoreThread. If you use startRendering(), this will be done only once, not every frame. If you call one python binding per frame, performance will be no worse than with allowPyThread().

Using the SWIG technique, the period during which the GIL is released is longer, which allows smoother multi-tasking. The GIL will be released at the beginning of the Root methods startRendering, renderOneFrame or _updateAllRenderTargets, and reacquired at the end.
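
The effect of a long call that releases the GIL can be observed without OGRE at all. Below is a minimal sketch, assuming `time.sleep` as a stand-in for a binding call that runs with the GIL released (as `startRendering()` would under the SWIG approach). The names `fake_render_loop` and `worker` are hypothetical and not part of the OGRE bindings.

```python
import threading
import time

def fake_render_loop(frames, stop):
    # Stand-in for a C++ call that runs with the GIL released:
    # time.sleep() lets other Python threads run while it waits.
    for _ in range(frames):
        time.sleep(0.01)
    stop.set()

def worker(stop, counter):
    # Pure-Python work that can only advance while it holds the GIL.
    while not stop.is_set():
        counter[0] += 1

stop = threading.Event()
counter = [0]
render = threading.Thread(target=fake_render_loop, args=(20, stop))
other = threading.Thread(target=worker, args=(stop, counter))
render.start()
other.start()
render.join()
other.join()
print("worker iterations while 'rendering':", counter[0])
```

If the stand-in call kept the GIL for its whole duration, the worker would make no progress until it returned.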

I have submitted a PR:
https://github.com/OGRECave/ogre/pull/3238

Post by andrecaldas »

andrecaldas wrote: Fri Oct 04, 2024 5:17 am

If you call one python binding per frame, performance will be no worse than with allowPyThread().

This was not accurate. The allowPyThread() method has zero impact when not used.

paroj
OGRE Team Member
Posts: 2106
Joined: Sun Mar 30, 2014 2:51 pm
x 1132

Re: Multithread in python: root.startRendering()

Post by paroj »

additionally, even with %feature("nothread") there are some checks added to all function calls, not only the thread-enabled ones

Post by andrecaldas »

Dear @paroj,

Please, forgive me for being so insistent. :-)
Please, feel very free to tell me:

  • Sorry, not interested.

Anyway... do you think it is a bad idea to have some cmake OPTION to use one or the other? I could work on my PR... what could the option be?...

Code: Select all

$ cmake -DPYTHREAD_STYLE="allowPyThread"
$ cmake -DPYTHREAD_STYLE="pure_swig"

Of course, the "pure_swig" style should have a dummy "allowPyThread()" so as not to break the API.

Maybe we could even have a list of PROS and CONS for each style.

paroj wrote: Fri Oct 04, 2024 9:07 pm

additionally, even with %feature("nothread") there are some checks added to all function calls, not only the thread-enabled ones

It would be nice to have some benchmarks. But I have no idea what to test for. :mrgreen:

I keep saying I don't really know what "rendering" means. So, last, I'd like to ask you a (two paragraph) question...

If I use startRendering() with "pure swig", the GIL is released once and only reacquired if there are registered Python listeners. Otherwise, it is never reacquired and the thread becomes totally independent of the Python thread. If I use renderOneFrame(), the GIL is released from the very beginning of the method call and reacquired at the end (or whenever a registered Python listener is called).

I realize that in the case of "allowPyThread()", the GIL is released when the frame rendering is queued and reacquired when the frameEnded listener is called. The unlocked period was chosen so as to avoid Python listener calls. During which part of "renderOneFrame()" is the GIL held in this case? I mean, when I call renderOneFrame():

  1. Many things happen.
  2. The GIL is released.
  3. Rendering occurs. (I don't really know what this is)
  4. The GIL is locked again.
  5. Many things happen.

I'd like to know what happens in 1, 3 and 5. :mrgreen:

Post by paroj »

no, actually I am interested in learning how people use threading in python to see whether allowPyThread is sufficient or not.
Currently, I do not see any advantages of your approach but the cost of having it, even when threads are not used.

If you take a look at where the most time is spent, it is for most applications swapBuffers:
Image

the listener in allowPyThread releases the GIL during this period. swapBuffers is when rendering happens. As GPUs are asynchronous, it is more efficient to wait for the final image to be presented than for a single render operation to complete. This and waiting for vsync is what happens in swapBuffers.

Before that we are busy submitting work to the GPU (internally and with listeners) so any interruptions with another python thread would be undesirable - besides having to allow the listeners to acquire the GIL.
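
To make the phases concrete, here is a small model of one frame under allowPyThread(). It is only a sketch: `FakeGil` is a book-keeping flag rather than the real GIL, the phase names are taken from the description above, and `render_one_frame` is a hypothetical stand-in for the real method. The listener mirrors the C++ ThreadAllowFrameListener shown in the diff.

```python
class FakeGil:
    """Book-keeping flag only: models whether the render thread holds the GIL."""
    def __init__(self):
        self.held_by_render_thread = True

class ThreadAllowListener:
    """Python model of the C++ listener installed by allowPyThread()."""
    def __init__(self, gil):
        self.gil = gil

    def frame_started(self):
        pass

    def frame_rendering_queued(self):
        self.gil.held_by_render_thread = False  # models PyEval_SaveThread()

    def frame_ended(self):
        self.gil.held_by_render_thread = True   # models PyEval_RestoreThread()

def render_one_frame(gil, listeners):
    # Hypothetical stand-in that only records the GIL state in each phase.
    log = []
    for listener in listeners:
        listener.frame_started()
    log.append(("submit work to GPU", gil.held_by_render_thread))
    for listener in listeners:
        listener.frame_rendering_queued()
    log.append(("swapBuffers", gil.held_by_render_thread))
    for listener in listeners:
        listener.frame_ended()
    log.append(("after frameEnded", gil.held_by_render_thread))
    return log

gil = FakeGil()
log = render_one_frame(gil, [ThreadAllowListener(gil)])
for name, held in log:
    print(f"{name:20s} GIL held: {held}")
```

In this model only the swapBuffers phase runs without the GIL, which matches the intent stated above: other Python threads get to run while the render thread waits for the GPU and vsync.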

The open questions for me are right now: is threading widespread enough to allow it by default (I assume no) and is allowing threading within renderOneFrame sufficient (I don't know).

Post by andrecaldas »

Disclaimer: I actually have no idea what I am talking about. I have never written a game nor used Ogre seriously. I am very new to all of this.

paroj wrote: Sat Oct 05, 2024 9:01 am

no, actually I am interested in learning how people use threading in python to see whether allowPyThread is sufficient or not.

Good!
I guess you mean how people will (if they do) use threading in ogre-python.

paroj wrote: Sat Oct 05, 2024 9:01 am

Currently, I do not see any advantages of your approach but the cost of having it, even when threads are not used.

I have a very naive vision of how games should work. Most people will not agree. And I am probably wrong. People advocate for too much synchronization between rendering, user input processing, physics, AI and networking. I don't see it like that. To me, in a sense, the game exists without the rendering. Rendering means taking a snapshot and presenting it to the user. Just as I press a button whenever I want, I don't see why AI has to be that synchronized either. There is also no need, IMO, to sync user input processing and frame rendering or any other part.

The physics engine should be constantly updating the game. User input should be constantly processed, changing the current state of some objects. AI should be constantly processed, acting as if the controlled entities were really thinking. Networking should be syncing things between computers as fast as possible.

I do naively believe that this need for synchronization is because:

  1. Doing things in a single thread is easier.
  2. When using multiple threads people actually lock too much.
  3. Besides locking too much, people lock it wrong (they do not offer the guarantees they believe they do).

Ogre has its own "GIL", as far as I know. I actually do not know what is and what is not safe to do in Ogre when running a separate thread that manipulates the scene. I do not know the proper way to safely manipulate the scene. This means that while rendering is happening, anyone trying to do things (properly) in some other thread will be blocked. This happens because locks are globalized instead of localized.

So, as far as I am concerned, a multi-threaded Ogre application should have one thread just rendering. In my scenario, I would have a "startRendering()" call that would release the GIL to never acquire it again! I do not really see (but I do not have enough experience, either) any need for python callbacks for things related to rendering. But if people really want, they can have it.

Now, manipulating the scene tree is something else. I would love to change the scene without worries about when rendering is happening. If possible, I would lock some nodes for a very short period of time and that would be the only thing I would be worried about... not holding a lock for too long.

Manipulating the scene tree... yes... this could be done in python. But always, without holding any lock for too long.
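
A minimal sketch of what "localized" locking could look like, in plain Python. The `Node` class here is hypothetical and is not an OGRE class. This says nothing about what is actually safe to do with a real Ogre scene graph; it only illustrates holding a per-node lock for as short a time as possible.

```python
import threading

class Node:
    """Hypothetical scene node with its own lock (not an OGRE class)."""
    def __init__(self, x=0.0, y=0.0, z=0.0):
        self._lock = threading.Lock()
        self._pos = (x, y, z)

    def move(self, dx, dy, dz):
        # Hold the lock only for the read-modify-write of one tuple.
        with self._lock:
            x, y, z = self._pos
            self._pos = (x + dx, y + dy, z + dz)

    def snapshot(self):
        # A renderer would copy the state out and release the lock at once.
        with self._lock:
            return self._pos

node = Node()

def physics(steps):
    for _ in range(steps):
        node.move(1.0, 0.0, 0.0)

threads = [threading.Thread(target=physics, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(node.snapshot())
```

Each writer blocks the others only for the duration of a single update, and a reader never sees a half-written position.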

Personally, in Ogre, I would like to have:

  1. One rendering thread.
  2. One event processing thread (no frame events). (capable of calling python callbacks)
  3. One physics thread. (capable of calling python callbacks)
  4. One AI thread. (capable of calling python callbacks)
  5. One network thread. (capable of calling python callbacks)
  6. One python thread where I could be doing stuff in python while everything happens.
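
The layout in the list above can be sketched with plain Python threads. Everything here is an assumption made for illustration: the subsystem names, the periods, and the idea of funnelling callbacks through a `queue.Queue` to the Python thread. `time.sleep` stands in for native work done without the GIL.

```python
import queue
import threading
import time

stop = threading.Event()
callbacks = queue.Queue()  # requests handed over to the Python thread

def subsystem(name, period, calls_python):
    # Stand-in loop for native work running at its own pace.
    while not stop.is_set():
        time.sleep(period)
        if calls_python:
            callbacks.put(name)

# name -> (period in seconds, may call back into Python); values are made up.
layout = {
    "render": (0.005, False),
    "events": (0.01, True),
    "physics": (0.01, True),
    "ai": (0.02, True),
    "network": (0.02, True),
}
threads = [threading.Thread(target=subsystem, args=(n, p, c))
           for n, (p, c) in layout.items()]
for t in threads:
    t.start()

expected = {n for n, (_, c) in layout.items() if c}
seen = set()
deadline = time.monotonic() + 5.0
# The plain Python thread: serve callbacks until each caller was seen once.
while seen != expected and time.monotonic() < deadline:
    try:
        seen.add(callbacks.get(timeout=0.1))
    except queue.Empty:
        pass

stop.set()
for t in threads:
    t.join()
print(sorted(seen))
```

The render loop never touches the queue, which corresponds to a rendering thread that releases the GIL once and never needs it again.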

paroj wrote: Sat Oct 05, 2024 9:01 am

If you take a look at where the most time is spent, it is for most applications swapBuffers:

I find it hard to believe. Too many people say that rendering requires lots of processing and frames get dropped (not in Ogre... in general). But I suppose swapping buffers is almost instantaneous. In this example, the application spends most of its time waiting for the vsync. I don't believe this is what happens in an intensive application.

If you do have an application that spends most of its rendering time parsing the scene tree, then you will be holding the GIL for all that time. Another thread could be pythoning stuff while the renderer parses the scene. Ogre does not need the GIL for anything. Ogre is C++ code.

paroj wrote: Sat Oct 05, 2024 9:01 am

the listener in allowPyThread releases the GIL during this period. swapBuffers is when rendering happens. As GPUs are asynchronous, it is more efficient to wait for the final image to be presented than for a single render operation to complete. This and waiting for vsync is what happens in swapBuffers.

I am probably misunderstanding what "rendering" means. Or, what "swapping buffers" means.

paroj wrote: Sat Oct 05, 2024 9:01 am

Before that we are busy submitting work to the GPU (internally and with listeners) so any interruptions with another python thread would be undesirable - besides having to allow the listeners to acquire the GIL.

That is my point. There should not be "another python thread", because the rendering should not be a python thread. Rendering should only have to wait for the very quick scene locks.

paroj wrote: Sat Oct 05, 2024 9:01 am

The open questions for me are right now: is threading widespread enough to allow it by default (I assume no) and is allowing threading within renderOneFrame sufficient (I don't know).

First, I would like to know if the cost of the swig approach is actually of any significance for a single threaded application.
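
One way to approach this would be to time a trivial call in a tight loop and run the same script against both builds of the bindings. The harness below is only a sketch: `per_call_ns` and `cheap_call` are hypothetical names, and `cheap_call` uses `len(())` as a placeholder where a cheap binding call (a simple getter, say) would go. The measured figure includes the Python loop overhead, so only the difference between two builds is meaningful.

```python
import time

def per_call_ns(fn, calls=200_000, repeats=5):
    """Return the best observed average cost of one fn() call, in nanoseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter_ns()
        for _ in range(calls):
            fn()
        elapsed = time.perf_counter_ns() - start
        best = min(best, elapsed / calls)
    return best

def cheap_call():
    # Placeholder: swap in a trivial binding call and run the same
    # script against each build to compare the per-call cost.
    return len(())

print(f"{per_call_ns(cheap_call):.1f} ns per call")
```

Taking the best of several repeats reduces noise from other processes.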

I also assume no one uses multi-threading in python-ogre applications. Maybe they will, now that they have "allowPyThread()". :mrgreen:

As for "renderOneFrame()"... I would advocate for something like "startRendering()", instead. But I would not process user input events inside it.

PS: Sorry for speaking too much about things I do not really understand. :mrgreen: