Forward 3D vs Forward Clustered

Discussion area about developing with Ogre-Next (2.1, 2.2 and beyond)


TaaTT4
OGRE Contributor
Posts: 267
Joined: Wed Apr 23, 2014 3:49 pm
Location: Bologna, Italy


Post by TaaTT4 »

As the title says, what are the differences between these two techniques? I'm asking from a user's perspective. Which algorithm consumes more memory? Which performs better? Which gives better visual quality? Trying the OGRE sample (in RelWithDebInfo mode), it seems that Forward Clustered is both the most lightweight and the least artifact-prone.
I guess the Forward Clustered algorithm subdivides the space as explained here (both algorithms share the majority of their initialization parameters), but what is the purpose of the decalsPerCell and cubemapProbesPerCell parameters?

Senior programmer at 505 Games; former senior engine programmer at Sandbox Games
Worked on: Racecraft Esport, Racecraft Coin-Op, Victory: The Age of Racing

dark_sylinc
OGRE Team Member
Posts: 5511
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina

Re: Forward 3D vs Forward Clustered

Post by dark_sylinc »

Aside from the subdivision strategy which you pointed out:
  1. F3D is (usually) lighter on the CPU. F. Clustered (usually) consumes more CPU resources but is highly parallel (more CPU cores = higher perf), while F3D doesn't take advantage of multithreading.
  2. F3D consumes more GPU, F. Clustered consumes less GPU. That's not because the algorithm performs fewer calculations; it's simply because lights are culled more tightly against the clusters (i.e. F3D has more false positives, hence pixels may end up processing more lights than they have to).
  3. For the same reason, F. Clustered may produce fewer artifacts (fewer false positives means there's more room available for genuine lights in the limited space of lights per cluster).
  4. F3D probably consumes less VRAM than F. Clustered, but the difference shouldn't be considerable (we would have to do the math).
  5. Per-pixel cubemap reflections (see the LocalCubemaps sample in 2.2) and decals (see the Decals sample) are only supported in F. Clustered.
If you're CPU bound and don't have many CPU cores, F3D may be faster (e.g. I prefer to use F3D in debug builds because optimizations are off and, by default, we restrict the number of threads to 1 for easier debugging).
If you're GPU bound, F. Clustered will be a win.

Overall F. Clustered is often a win, but I prefer to use F3D during debug builds.
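
The rules of thumb above can be written down as a small decision helper. This is only a sketch: ForwardMode, BuildProfile and chooseForwardMode are hypothetical names made up for illustration (none of them are Ogre-Next API), and the "two threads or fewer" threshold is an arbitrary assumption standing in for "not many CPU cores".

```cpp
// Hypothetical helper; nothing here is part of Ogre-Next.
enum class ForwardMode
{
    Forward3D,
    ForwardClustered
};

struct BuildProfile
{
    bool debugBuild;          // unoptimized build, threads typically limited to 1
    unsigned workerThreads;   // CPU threads available for light culling
    bool cpuBound;            // profiling shows the CPU is the bottleneck
    bool usesDecalsOrProbes;  // decals or per-pixel cubemap probes are needed
};

ForwardMode chooseForwardMode( const BuildProfile &p )
{
    // Decals and per-pixel cubemap reflections are only supported in F. Clustered.
    if( p.usesDecalsOrProbes )
        return ForwardMode::ForwardClustered;

    // F3D is usually lighter on the CPU and does not depend on multithreading.
    // The threshold of 2 threads is an assumption, not a measured value.
    if( p.debugBuild || ( p.cpuBound && p.workerThreads <= 2u ) )
        return ForwardMode::Forward3D;

    // Otherwise (in particular when GPU bound) F. Clustered is often a win.
    return ForwardMode::ForwardClustered;
}
```

For example, chooseForwardMode( { true, 1u, true, false } ) picks Forward3D for a single-threaded debug build, while any profile that needs decals or probes always ends up on Forward Clustered. Profiling your own scene should override whatever a helper like this suggests.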
"but what is the purpose of the decalsPerCell and cubemapProbesPerCell parameters?"
They're for per-pixel cubemaps (see the LocalCubemaps sample in 2.2) and decals (see the Decals sample). If you don't use these features, leave them at 0.

As with the per-cell limit for lights, they set the maximum number of cubemap probes and decals per cluster. If a scene needs more than this limit at render time, artifacts may appear (the same as with lights).
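
To illustrate why a per-cell limit causes artifacts, and why tighter culling helps, here is a toy model of filling one cell. It is a sketch under stated assumptions: CellBudget, Candidate, CellResult and fillCell are hypothetical and do not reflect how Ogre-Next actually stores or culls lights. The field is named lightsPerCell only by analogy with decalsPerCell and cubemapProbesPerCell; the same reasoning would apply to those two limits.

```cpp
#include <cstddef>
#include <vector>

// Toy model of a single cell; not Ogre-Next's real data layout.
struct CellBudget
{
    std::size_t lightsPerCell;  // maximum entries the cell can hold
};

struct Candidate
{
    int id;
    bool reallyTouchesCell;  // false = false positive from loose culling
};

struct CellResult
{
    std::vector<int> keptLightIds;  // lights the cell will shade with
    std::size_t dropped;            // candidates that did not fit
};

CellResult fillCell( const CellBudget &budget, const std::vector<Candidate> &candidates,
                     bool tightCulling )
{
    CellResult result{ {}, 0u };
    for( const Candidate &c : candidates )
    {
        // Tight culling rejects false positives before they take up a slot.
        if( tightCulling && !c.reallyTouchesCell )
            continue;

        if( result.keptLightIds.size() < budget.lightsPerCell )
            result.keptLightIds.push_back( c.id );
        else
            ++result.dropped;  // over the limit: this entry is lost
    }
    return result;
}
```

With a budget of 2 and four candidates of which the first two are false positives, loose culling fills the cell with the false positives and drops both genuine lights, whereas tight culling keeps the two genuine lights and drops nothing. Raising the limit avoids the drops in either case, at the cost of more memory per cell.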