[Ogre 1.9+] Threading redesign

m2codeGEN
Halfling
Posts: 52
Joined: Tue Apr 26, 2011 9:13 am
Location: Russia, Tver
x 2

Post by m2codeGEN »

After reading the Ogre 1.9/2.0 thread, I decided to show the community our OgreMain improvements.
1. Parallel update of the scene graph.

void CRootSceneNode::_update(bool updateChildren, bool parentHasChanged)
{
   //return SceneNode::_update(updateChildren, parentHasChanged);   // <-- this is the serial update

   if (m_bChildsDirty)  // rebuild the flat std::vector copy of the child hash_map
   {
      m_bChildsDirty = false;
      m_vecChilds.resize(mChildren.size());

      ChildNodeMap::const_iterator it = mChildren.begin(), itend = mChildren.end();
      for (size_t i = 0, nCount = m_vecChilds.size(); i != nCount; ++i, ++it)
         m_vecChilds[i] = it->second;
   }

   updateOgreNodeImpl(false, parentHasChanged);

   // Parallel update of the children: roughly 3.5-4x faster on an i7 (8 threads) with Intel TBB
   {
      class Updater
      {
         Ogre::Node**         m_arrNodes;
         bool                 m_bUpdateChildren;
         bool                 m_bParentHasChanged;

      public:
         AxisAlignedBox       m_aabb;

         Updater(Ogre::Node **nodes_, bool updateChildren, bool parentHasChanged)
            : m_arrNodes(nodes_), m_bUpdateChildren(updateChildren), m_bParentHasChanged(parentHasChanged),
            m_aabb(AxisAlignedBox::BOX_NULL)
         {
            // empty
         }

         /// Splitter constructor
         Updater(const Updater &u_, tbb::split)
            : m_arrNodes(u_.m_arrNodes),
            m_bUpdateChildren(u_.m_bUpdateChildren),
            m_bParentHasChanged(u_.m_bParentHasChanged),
            m_aabb(AxisAlignedBox::BOX_NULL)
         {
            // empty
         }


         /// parallel update
         void operator() (const tbb::blocked_range<size_t> &range_)
         {
            Ogre::Node **nodes = m_arrNodes;
            bool bChild = m_bUpdateChildren, bParent = m_bParentHasChanged;
            AxisAlignedBox aabb = m_aabb;
            SceneNode *node;

            for (size_t i = range_.begin(), iEnd = range_.end(); i != iEnd; ++i)
            {
               node = static_cast<SceneNode*>(nodes[i]);
               node->_update(bChild, bParent);
               node->_getFullTransform();       // also compute the node's full Matrix4 transform here
               aabb.merge(node->_getWorldAABB());
            }

            m_aabb = aabb;
         }

         /// merge parallel calculations
         void join(const Updater &u_)
         {
            m_aabb.merge(u_.m_aabb);
         }
      };

      // Let TBB choose the grain size.
      // Guard the empty case: &m_vecChilds[0] is undefined on an empty vector.
      Updater updater(m_vecChilds.empty() ? 0 : &m_vecChilds[0], updateChildren, parentHasChanged);
      tbb::parallel_reduce(tbb::blocked_range<size_t>(0, m_vecChilds.size()), updater);

      mWorldAABB = updater.m_aabb;

      // Optionally, also merge the AABBs of objects attached to the root node here.
      // We don't attach any objects to the root node.
   }
}
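2. STL containers in Ogre must use a thread-safe allocator (we use nedmalloc).
3. _updateBounds improvement: with the loop condition while (++child != itEnd && !aabb.isInfinite()), the loop over the children stops as soon as the box becomes infinite, since further merges cannot change it.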

   AxisAlignedBox aabb;
   assert(aabb.isNull() && "Ogre changed its logic; this code needs correcting");

   // Update bounds from own attached objects
   if (!mObjectsByName.empty())
   {
      ObjectMap::iterator i = mObjectsByName.begin(), iEnd = mObjectsByName.end();
      do
      {
         // Merge world bounds of each object
         aabb.merge(i->second->getWorldBoundingBox(true));
      } while (++i != iEnd);
   }

   // Merge with children
   if (!mChildren.empty() && !aabb.isInfinite())
   {
      ChildNodeMap::iterator child = mChildren.begin(), itEnd = mChildren.end();
      do
      {
         aabb.merge(static_cast<SceneNode*>(child->second)->_getWorldAABB());
      } while (++child != itEnd && !aabb.isInfinite());
   }
   
   mWorldAABB = aabb;
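
The split/join pattern from item 1 can be tried without Ogre or TBB. Below is a minimal sketch under stated assumptions, not the code we ship: Box, SplitTag, BoundsReducer, reduceRange and mergeAll are hypothetical names made up for this illustration, and the driver runs the two halves one after the other, where tbb::parallel_reduce would hand them to worker threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Ogre::AxisAlignedBox (not the Ogre class):
// a box that starts out "null" and grows as other boxes are merged in.
struct Box
{
    float lo[3], hi[3];
    bool  isNull;

    Box() : isNull(true)
    {
        for (int k = 0; k < 3; ++k) { lo[k] = 0.0f; hi[k] = 0.0f; }
    }

    Box(float x0, float y0, float z0, float x1, float y1, float z1) : isNull(false)
    {
        lo[0] = x0; lo[1] = y0; lo[2] = z0;
        hi[0] = x1; hi[1] = y1; hi[2] = z1;
    }

    void merge(const Box &other)
    {
        if (other.isNull) return;               // nothing to add
        if (isNull) { *this = other; return; }  // first real box wins
        for (int k = 0; k < 3; ++k)
        {
            lo[k] = std::min(lo[k], other.lo[k]);
            hi[k] = std::max(hi[k], other.hi[k]);
        }
    }
};

struct SplitTag {};   // hypothetical stand-in for tbb::split

// Same shape as the Updater class: splitting constructor, range operator, join.
class BoundsReducer
{
    const Box *m_boxes;

public:
    Box m_aabb;       // partial result owned by this body

    explicit BoundsReducer(const Box *boxes) : m_boxes(boxes) {}

    // Splitting constructor: shares the input, starts again from a null box.
    BoundsReducer(const BoundsReducer &other, SplitTag) : m_boxes(other.m_boxes) {}

    // Accumulate one sub-range; may be called several times on the same body.
    void operator()(std::size_t begin, std::size_t end)
    {
        for (std::size_t i = begin; i != end; ++i)
            m_aabb.merge(m_boxes[i]);
    }

    // Fold another body's partial result into this one.
    void join(const BoundsReducer &rhs) { m_aabb.merge(rhs.m_aabb); }
};

// Sequential stand-in for the scheduler: split until a chunk is at most
// `grain` elements, process both halves, then join the right half into the left.
void reduceRange(BoundsReducer &body, std::size_t begin, std::size_t end, std::size_t grain)
{
    assert(begin <= end);
    if (end - begin <= std::max<std::size_t>(grain, 1))
    {
        body(begin, end);
        return;
    }
    const std::size_t mid = begin + (end - begin) / 2;
    BoundsReducer right(body, SplitTag());
    reduceRange(body, begin, mid, grain);
    reduceRange(right, mid, end, grain);
    body.join(right);
}

// Convenience wrapper; an empty input yields a null box.
Box mergeAll(const std::vector<Box> &boxes, std::size_t grain)
{
    BoundsReducer body(boxes.empty() ? 0 : &boxes[0]);
    reduceRange(body, 0, boxes.size(), grain);
    return body.m_aabb;
}
```

Each body only writes to its own m_aabb, so the halves never share mutable state; the only combining step is join. That is why the per-child work in the real Updater can run concurrently, provided updating one child does not touch another child's data.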
syedhs
Silver Sponsor
Posts: 2703
Joined: Mon Aug 29, 2005 3:24 pm
Location: Kuala Lumpur, Malaysia
x 51

Re: [Ogre 1.9+] Threading redesign

Post by syedhs »

What stage is the design at? Is it usable, i.e. can you apply the patch and have it work? More importantly, has there been any test showing a performance increase from the multithreaded design? Perhaps a sample with 5000 objects visible within the camera frustum.
m2codeGEN
Halfling
Posts: 52
Joined: Tue Apr 26, 2011 9:13 am
Location: Russia, Tver
x 2

Re: [Ogre 1.9+] Threading redesign

Post by m2codeGEN »

Yes, it should be usable with 1.7 and above.
We attached about 2000 child nodes to the root (each one is a game entity of ours with 50-100 sub-nodes).
On a Core i7 3930K (12 threads) the speedup is about 4x (built with MSVC 2005).

But scene management can be very application-specific, so this may not be a universal solution.
Transporter
Minaton
Posts: 933
Joined: Mon Mar 05, 2012 11:37 am
Location: Germany
x 110

Re: [Ogre 1.9+] Threading redesign

Post by Transporter »

I've added a threading example to the wiki: http://www.ogre3d.org/tikiwiki/tiki-ind ... =Threading