Over the past couple of evenings, I have been tracking down a bug that drastically impacted my game's performance. In the end, it was caused by an Instancer template that recursively instanced itself. So every frame, a new child node would be added, quickly giving 1000s of children.

The offending file ThickFlameRing.xml can be seen here:

<?xml version="1.0" ?>
<node id="1">
    <attribute name="Is Enabled" value="true" />
    <attribute name="Name" value="ThickFlameRing" />
    <attribute name="Tags" value="EMPTY" />
    <attribute name="Position" value="0 0 0" />
    <attribute name="Rotation" value="0.5 0 -0.866025 0" />
    <attribute name="Scale" value="1 1 1" />
    <component id="1" type="Instancer">
        <attribute name="Object" value="Objects/ThickFlameRing.xml" />
    </component>
</node>

Unfortunately, my (modified) Blender3D exporter subtly doesn't handle certain collection naming arrangements somehow I had two collections, one inside Collection1 and one outside, and the outside one was just a linked copy of the other (or something like that I'm not sure what I did). But both where named ThickFlameRing, and the one simply instanced the other, hence the file above. Deleting that second (extra) collection and just having the first fixed the problem after I re-exported. However, tracking down the cause of the problem was a lot more complicated, as I made several incorrect guesses along the way.

I noticed the problem after I would let the game run for a few minutes. The performance would drop from >60FPS to <10FPS. Turning on the profiler's output, I saw that the issue was in UpdatePhysics. "Naturally," I assumed I had some physics issue (though I believe now that such an issue should show up in StepSimulation. I tracked down the issue to be caused by the CharacterController (by removing all of them from the scene). So I suspected the problem was in some of the physics code with that, perhaps SphereCastMultiple. Using callgrind, I saw that the bullet3d broadphase was fairly costly, so I thought maybe it was some bug in that, or maybe I was somewhere creating physics objects in an extra scene, or that were just falling forever, etc.

But I was hopeful that the performance was actually just caused by the debug build, which is slower. So I tried switching to a RelWithDebInfo build, which nearly immediately crashed. It turns out the AddDirs directories were not being added to the ResourceCache, which caused a crash when a resource that was needed was not found (yes, I should fix that bug later so it doesn't crash). But It seemed like it was actually adding them, I even added logging that showed that to be the case. Well, actually it didn't. It showed that I thought they were being added, but I was doing that in an assert(). And assert() is replaced with a no-op ((void)0) or similar on NDEBUG builds. Which meant I wasn't actually adding them. So I added my own do_assert() macro that will be used to always call the code, even on Release builds (and for now it throws an exception for use in debugging with the RelWithDebInfo build).

And with that, I could actually run the RelWithDebInfo build successfully. Unfortunately, performance was still bad! I thought "Well, maybe it's just my laptop is heating up as I run it, and that is causing it to throttle and killing performance." But that isn't acceptable, as I have a decent processor, and want ordinary computers to be able to play my game. I would need to fix it some how.

So I threw in some new URHO3D_PROFILE markers to my code, which did show me the problem was in CharacterController::FixedUpdate(). I eliminated the SphereCastMultiple from being the cause by commenting it out still the performance degradation (aside: an annoying bug, I had to wait many seconds to confirm that the bug was still there every time I ran it). But I couldn't figure out what was the actual cause.

Then, praise the Lord, last night I realized it could have been due to new Nodes being created. Every frame there were 2 InstantiateXML calls, after all (a bug I had not bothered to track down when I was looking at the profiler earlier). Since there was a GetComponentRecursive call in the FixedUpdate, this could be the cause! Indeed, further investigation showed that it was the cause there were new children being created every frame, coming from the Instancer component. And, looking at the file, I found what I have copied above a recursive instance template. Which quickly swelled the number of children to search through from about 50 to thousands in seconds.

So, I made some fixes in Blender, re-exported, and no more bug. I also added some safeguards in the Instancer component to track the instancing depth so I will receive some error messages should such a situation arise again in the future (even if it is a more subtle case where it has several templates in the recursive loop, rather than just a single one).


Lessons:

  • Remember that assert(x()) only calls x() in a debug build.
  • Don't assume the fault is in the library you've been using successfully up to now without good cause.
  • Pay attention to what is labeled as the cause (UpdatePhysics) and what is not (StepSimulation).
  • Don't blindly trust the tools I think callgrind caused me more of a delay than help here, as I was worried about the UI text batches, and the Sound, and the Physics, and not about the actual problem (GetComponent recursive). Not sure why that didn't show up. I do think the tools are very helpful and good in general, but here it didn't help.

Previous Post Next Post