If the talented and generous Eric Matyas doesn't mind me getting in his way for some moments...
Some great news for our team: we finally have a tool that can (mostly) automate this entire procedure, so I felt compelled to talk about it a whole bunch. It's automated to the point that we make a few simple adjustments to our XMLs and then run them through a script that appends the LOD stages into the vertex buffer section, increments the indexes of the submeshes, and appends those submeshes below with the appropriate LOD information in their headers, all into one XML file. More or less, that's all there is to it. I just wanted to update this thread one more time to explain a bit more about it and why it's so important. I'm not a technical wizard by any stretch of the imagination, but I will do my best to explain it to the extent of my knowledge...
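To give a rough idea of what a script like that does, here's a minimal Python sketch. It is not our actual tool, and the XML layout is a simplified, made-up one (a `vertexbuffer` element, a `submeshes` element, `index` and `lod` attributes, faces stored as `v1`/`v2`/`v3`), so treat every tag and attribute name as an assumption. Shifting the face indices by the number of vertices already in the buffer is also an assumption about what sharing one vertex buffer requires.

```python
import xml.etree.ElementTree as ET


def merge_lod_stages(stage_xmls):
    """Merge per-stage mesh XML strings (most detailed first) into one mesh element."""
    merged = ET.Element("mesh")
    vertex_buffer = ET.SubElement(merged, "vertexbuffer")
    submeshes = ET.SubElement(merged, "submeshes")
    next_index = 0

    for lod, xml_text in enumerate(stage_xmls):
        stage = ET.fromstring(xml_text)
        # Vertices already in the shared buffer; this stage's faces shift by this much.
        offset = len(vertex_buffer)
        vertex_buffer.extend(list(stage.find("vertexbuffer")))

        for submesh in list(stage.find("submeshes")):
            submesh.set("index", str(next_index))  # keep submesh indexes unique
            submesh.set("lod", str(lod))           # LOD info in the submesh header
            for face in submesh.iter("face"):
                for key in ("v1", "v2", "v3"):
                    face.set(key, str(int(face.get(key)) + offset))
            submeshes.append(submesh)
            next_index += 1

    return merged


def make_stage(vertex_count):
    """Build a tiny hypothetical stage: N vertices and one submesh with one face."""
    vertices = "".join(f'<vertex x="{i}" y="0" z="0"/>' for i in range(vertex_count))
    return (f"<mesh><vertexbuffer>{vertices}</vertexbuffer>"
            '<submeshes><submesh index="0"><face v1="0" v2="1" v3="2"/></submesh>'
            "</submeshes></mesh>")


merged = merge_lod_stages([make_stage(4), make_stage(3)])
print(ET.tostring(merged, encoding="unicode"))
```

The point of the sketch is just the three steps: one shared vertex buffer, renumbered submeshes, and a LOD tag on each submesh, all ending up in a single file.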
First of all, you may think LOD is somehow antiquated, or you've been told so by other "experts", on the grounds that computers are powerful enough now (which game developers make sure they never will be, let's be honest). Here's the deal with that point of view. On one hand, it's true if you're working on a very closed-quarters game like Doom3, where anything at medium distance would be around a corner or behind a door before any LOD is required. Pretty obvious situation, really, although I would argue that Doom3 could have made use of some LOD for character models anyway, with a bit of care with the normalmaps, and it actually did use LOD for its patch-meshes. Another situation might be a top-down RTS or something similar with a fixed viewpoint, where everything sits at roughly the same depth and LOD never really gets a chance to be useful.
Finally, the more technical theory I've heard is that further down the z-buffer/depth-buffer (same thing, AFAIK) vertices are more heavily "approximated", and at a certain distance they get crunched together and end up sharing the same (tighter) coordinates, thus requiring fewer calculations and saving some performance anyway, because floating-point accuracy decreases with depth. That loss of precision is exactly why you get more z-fighting the further away tightly-spaced geometry is. I know that's true to some degree, but how much you can weigh that against the performance benefits of LOD, I really don't know. Some people obviously believe it's magnificent (enough to neglect LOD completely), but I'm not so quick to presume that. I just can't imagine it beating well-implemented LOD stages when you're talking about bringing a mesh down from something like a quarter million triangles at the highest stage to several hundred at the lowest.
Neglecting LOD also means you can't give your user an option to "lock" the LOD stage to prevent excessive polycounts from being reached, for lower-end systems or higher framerate desires. So that wraps up my general understanding of everything. Any "experts" are very welcome to chime in about all of that.
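On the depth-buffer precision point, the falloff itself is easy to put rough numbers on. This is a minimal sketch, assuming a standard perspective projection and a fixed-point depth buffer. The near/far planes and the 24-bit depth are just example values, not anything from our setup. It only covers the precision side (why distant z-fighting happens) and says nothing about whether any vertex work is actually saved.

```python
def depth_resolution(z, near=0.1, far=1000.0, bits=24):
    """Approximate smallest depth gap (in world units) that a fixed-point depth
    buffer can resolve at view distance z, for a standard perspective projection."""
    # Stored depth: d(z) = far / (far - near) * (1 - near / z), with d in [0, 1].
    slope = far * near / ((far - near) * z * z)  # dd/dz, shrinks with 1 / z^2
    step = 1.0 / (2 ** bits - 1)                 # one depth-buffer increment
    return step / slope


for z in (1.0, 10.0, 100.0):
    print(f"z = {z:6.1f}  resolvable gap ~ {depth_resolution(z):.2e}")
```

The resolvable gap grows with the square of the distance, so two surfaces that are cleanly separated up close can land on the same depth value far away.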
If you have many stages of LOD and they're being swapped in and out on-the-fly as separate meshes (a method for manually-created LOD stages I've seen, and was stuck with in various engines), then you end up with an extra batch for each stage, and a little bit of overhead while the engine is swapping them in and out. It might not seem like a big deal for one character, but when you multiply this by every single object being LOD'd and by every single time the LOD stages need to be swapped (which could be at any moment), you add a lot more batches to the total batch count and a lot more overhead, which makes your framerate unstable and lag/chug/stutter a whole heap. This is the difference between what we call "inline LOD" (all stages kept in one mesh) and "streamed LOD" (stages swapped in and out as separate meshes). Much like all game development terminology, others may call it something different.
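Here's a minimal sketch of the "inline" idea as described above, not our engine code: every stage lives in one shared index buffer, so changing stage just means drawing a different range instead of swapping meshes. The class and function names, the index counts and the switch distances are all made up for illustration, and the `lod_lock` parameter is one assumed way a user-facing LOD lock could be applied.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class LodRange:
    first_index: int   # where this stage's indices start in the shared index buffer
    index_count: int   # how many indices to draw for this stage


def build_lod_ranges(index_counts):
    """Lay the stages out back to back, most detailed stage first."""
    ranges, first = [], 0
    for count in index_counts:
        ranges.append(LodRange(first, count))
        first += count
    return ranges


def select_stage(distance, switch_distances, lod_lock=0):
    """Pick a stage by distance; lod_lock is the most detailed stage the user allows."""
    stage = bisect_right(switch_distances, distance)
    return max(stage, lod_lock)


ranges = build_lod_ranges([300, 120, 30])  # index counts per stage (made up)
switch_distances = [20.0, 60.0]            # stage 1 from 20 units, stage 2 from 60

close_up = ranges[select_stage(5.0, switch_distances)]
locked = ranges[select_stage(5.0, switch_distances, lod_lock=1)]
print(close_up, locked)
```

With the lock set to 1, the same close-up object draws the second range instead of the first, which is the whole point of the setting on lower-end systems.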
What we've been trying to do is LOD not only our characters, their "accoutrements" (weapons, etc.), pickups and other things like physics objects and other interactibles, but the entire level itself. We do that by breaking our maps into smaller pieces we call "sectors", which we need to do anyway for real-time lighting, distance-culling and giving them all individual lightmaps. It made a lot of sense to also try to LOD it all, if we could.
However, there is a big trade-off to doing this, and I've asked a lot of people and never really got a clear answer to it: it makes frustum culling operate per-object instead of per-vertex(/face) as it should with static objects. What this means is that if you're at the very edge of a sector and all you can see of that sector in your viewport is a tiny edge of a triangle and a tiny texel, the entire sector is still going to be rendered "behind your back", as it were. For smaller LOD'd stuff like characters this is also true, but that's to be expected: if you can only see the pinky toe of a character, the whole character is still there, because you can't expect frustum culling to work like that with smaller dynamic objects. For our "interior sectors" it's a different story, and it's a real penalty.
One approach is to force the LOD down to the next stage _while_ the player is still inside the sector, which we can do, and then design around that by keeping heavy-duty geometry away from the peripheries of every sector as much as we can, so any LOD changes are not right in the player's face. If there were some way to solve this that would increase performance even more, we'd definitely love it, but like I said, I have never gotten a clear answer out of anyone I've mentioned it to. For our approach right now, we think it's worth it. We still get per-object culling, which means any other sector out of the viewport is gone, and because we LOD our interiors to such an extreme degree, we're saving so much memory and so many batches that it justifies this frustum culling penalty as a side-effect. I'm convinced there probably is a solution to it anyway, to get frustum culling to behave again, and one day we may have the best of both worlds, but as for right now, we do unfortunately have this issue.
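To show what per-object culling means in practice, here's a minimal sketch. It assumes each sector is tested as a whole against the frustum planes using a bounding sphere, which is a common approach but only an assumption about how any given engine does it. The sector names, triangle counts and the box-shaped "frustum" are made up to keep the numbers easy.

```python
def sphere_in_frustum(center, radius, planes):
    """planes: (nx, ny, nz, d) with unit normals pointing into the frustum.
    Returns False only when the sphere is entirely outside some plane."""
    cx, cy, cz = center
    for nx, ny, nz, d in planes:
        if nx * cx + ny * cy + nz * cz + d < -radius:
            return False
    return True


# A simple box-shaped "frustum" covering -1..1 on each axis.
planes = [
    (1, 0, 0, 1), (-1, 0, 0, 1),
    (0, 1, 0, 1), (0, -1, 0, 1),
    (0, 0, 1, 1), (0, 0, -1, 1),
]

# Hypothetical sectors: (name, bounding-sphere center, radius, triangle count).
sectors = [
    ("hallway", (0.0, 0.0, 0.0), 0.5, 40_000),
    ("atrium", (1.9, 0.0, 0.0), 1.0, 250_000),  # only a sliver pokes into view
    ("cellar", (5.0, 0.0, 0.0), 1.0, 90_000),   # fully out of view
]

submitted = [(name, tris) for name, c, r, tris in sectors
             if sphere_in_frustum(c, r, planes)]
print(submitted)
```

The "atrium" only barely touches the view volume, yet all of its triangles get submitted, which is the penalty described above. The "cellar" is dropped entirely, which is the part that still works in our favour.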
So, the end result is that we basically LOD everything, can use many millions of triangles per scene in our projects, and have an extremely scalable set of assets that can be adjusted by the user with an LOD-lock setting. Keep in mind, when we talk about scalability and LOD settings, it's not just for lowering the polycount but also for driving the polycount up, for posterity. We all know it takes such a long time to develop a game of this complexity that it's suicide to not aim that far ahead.
I didn't mention our workflow for actually making these carefully-crafted LOD stages, but it involves some other automated tools and making almost everything out of "spline-cages" and other patch-like geometry (Radiant is our software of choice for "mapping"). That lets us increase or decrease the complexity of most of the geometry with almost no effort, without turning our meshes into garbage and messing up all of the UV layouts, like automated LOD tools usually do. I'm not saying auto-LOD tools are completely useless, but really, I've never had experience with any one-click tool that gave you something you could present to the player up close using some LOD-lock setting.
If you've made it through this whole rant this far, thanks and well done. Sorry to say, I've decided not to make any of this stuff public, but if you contact me privately then I will probably be generous enough to discuss more and provide scripts/software for you to do the same thing. I've just had so much neglect and been involved in terrible flamewars with arrogant folks suffering from severe superiority complexes over this stuff that I don't really feel like just lumping everything into a ZIP file for everyone, but that doesn't mean we're unwilling to share it.
I also basically just wanted to type all of this out so that, soon, when we release some of our screenshots to the world and get accused of "cheating", instead of needing to explain everything over and over again, we can just point folks to this thread so they might get a deeper understanding of how we squeeze so many damn triangles into our scenes without dipping into single-digit framerates, or even "SPF" - seconds-per-frame.
Many indie-devs know what I'm talking about...
I will wrap up by saying that, while we have done a LOT of testing of this, it's still not quite 100%, and we won't really know until we get a game-full of assets together in a few scenes to stress-test with and do some comparison testing, which is so much work, but coming very soon! Because the penalties from issues like batch-count and overdraw pile up fast as a scene grows, you never fully know until you get that deep into game development and push things that far, but it really does look like we have a promising approach to everything from here.
Any comments welcome and please don't hesitate to contact me privately with any questions. If you'd like to get involved, we're always looking for new folks to dev with.