Improving Performance with and Support for Large Numbers of Buildables

dGr8LookinSparky · March 12, 2018, 1:50am

I noticed a few familiar names participating in that thread that I remember from GPP.

That is an interesting test. I noticed in their screenshots that the ones with fewer teslas in view had higher frame rates (which makes sense, since stuff that isn’t in the scene wouldn’t be rendered). The displayed frame rates ranged from 30 fps with the largest number of teslas in view to 50 fps with fewer in view.

There is a claim made by mopython in the original post in that thread:

Perhaps that was true with that benchmark in 2013, without breaking compatibility in the trem engine, on an older version of trem. I don’t know which Tremulous client he might have used in comparison, maybe the stock gpp client from 2009, maybe the 1.1 modded client tremfusion, maybe the gpp modded client segfault, maybe some other client. I assume that most likely a client with the old renderer was used, and was missing out on a ton of other improvements/enhancements/fixes present in the Trem 1.3 repo, including merges from ioquake3’s continued development.

Also, I don’t know which Trem map may have been used, and I don’t know what kind of specs the computer that was used in that demonstration had, nor if it was the same machine used that had the claimed “0-4 frames per second” with over 50 buildables on trem (it would be helpful to know those specs, and to see the frame rates on that machine with the same views on the map without any teslas present as a base comparison).

I know that Tremulous’ engine can perform better than it does now just by breaking backwards compatibility with old clients/servers. And it can perform even better with various optimizations.

One basic example is that currently the engine limits the maximum number of game entities to 1024, and it limits the number of entities that can be broadcast to clients with updated info to 256 “snap shot entities.” What that means is that if you enter an area of a map with a large number of buildables that you haven’t seen before in that match, you would often get a huge lag spike, because you did not have any of the info for those buildables beforehand.

If, however, the number of “snap shot entities” that can be broadcast is the same as the maximum number of entities, clients would only have to receive the information that changed (in other words, no lag spikes from entering rooms with a lot of buildables). But those values can only be changed if backwards compatibility with old clients is broken.
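To make that argument concrete, here is a toy model in C. It is not engine code: `toy_client_t`, `toy_send_snapshot` and `toy_room_entry_cost` are made-up names, and the model only assumes that an entity has to be sent when the client doesn't already have it or when it changed. It counts how many entity updates land on the single frame in which you walk into a room full of idle buildables.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_ENTITIES 1024  /* the 1024 game-entity limit quoted above */

/* Hypothetical per-client bookkeeping: which entities the client already has. */
typedef struct {
    bool known[TOY_MAX_ENTITIES];
} toy_client_t;

/*
 * Send one snapshot. in_snapshot[i] marks entities included this frame,
 * changed[i] marks entities whose state changed since the previous frame.
 * Returns how many entity updates have to go over the wire.
 */
int toy_send_snapshot(toy_client_t *cl, const bool *in_snapshot,
                      const bool *changed, int count)
{
    int updates = 0;

    assert(count >= 0 && count <= TOY_MAX_ENTITIES);
    for (int i = 0; i < count; i++) {
        if (!in_snapshot[i]) {
            cl->known[i] = false;   /* left out: the client loses track of it */
            continue;
        }
        if (!cl->known[i] || changed[i]) {
            updates++;              /* new to the client, or modified */
        }
        cl->known[i] = true;
    }
    return updates;
}

/*
 * Entity updates sent on the frame in which a player enters a room holding
 * `buildables` idle buildables. With always_in_snapshot set, the buildables
 * were already included in the snapshot on the frame before entry.
 */
int toy_room_entry_cost(int buildables, bool always_in_snapshot)
{
    static bool in_snapshot[TOY_MAX_ENTITIES];
    static bool changed[TOY_MAX_ENTITIES];
    toy_client_t cl;

    assert(buildables >= 0 && buildables <= TOY_MAX_ENTITIES);
    memset(&cl, 0, sizeof(cl));
    memset(changed, 0, sizeof(changed));   /* idle buildables: nothing changes */

    /* Frame before entry. */
    for (int i = 0; i < buildables; i++) {
        in_snapshot[i] = always_in_snapshot;
    }
    toy_send_snapshot(&cl, in_snapshot, changed, buildables);

    /* Entry frame: every buildable in the room is now included. */
    for (int i = 0; i < buildables; i++) {
        in_snapshot[i] = true;
    }
    return toy_send_snapshot(&cl, in_snapshot, changed, buildables);
}
```

In this model, `toy_room_entry_cost(300, false)` puts all 300 updates on the entry frame, while `toy_room_entry_cost(300, true)` puts none there, because that cost was already paid on an earlier frame. The real engine's snapshot and delta logic is more involved than this.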

I started changing various values in trem’s engine to see if I could test with a large number of teslas. Some things to note:

I increased the max game entities to 2048 (by increasing GENTITYNUM_BITS from 10 to 11), and I increased MAX_SNAPSHOT_ENTITIES and MAX_ENTITIES_IN_SNAPSHOT both from 256 to 2048.
I disabled the tesla’s idle sounds for this experiment, to not have to worry about making adjustments to support the large number of sounds yet. But the animations (with the sparking effects) were still enabled.
Each tesla buildable has its own range marker (even when not drawn), that takes up an entity slot, so for however many tesla buildable entities there are, there are that many range marker entities. It is planned to move the range markers out of the entity slots at some point, allowing for that much more room for additional entities.
I decreased the cost of the tesla to 1 bp (to help with the spamming), and increased the human build points.
I conducted this experiment on @Xembie 's Epic5 map.
I was making these changes based on our current game logic repo, which at the moment uses an older version of the game engine (maybe about 2 years behind the game engine for the latest 1.3 alpha release client, so performance in this kind of test may be better on that latest release).
I tried this on my 3-year-old ASUS 64-bit laptop with integrated Intel® Haswell Mobile graphics and an Intel® Core™ i5-4210U CPU @ 1.70GHz × 4 processor (on this laptop my frame rate for trem is usually around 90 fps). Frame rates would likely be much higher on the new pc I built a couple of months ago, which has a dedicated GeForce GTX 1060 graphics card with 3GB of GDDR5 RAM (on that computer, I can usually get 333 fps playing Trem).
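As a rough sketch of the entity-slot arithmetic behind the notes above: this assumes the entity limit is derived from the bit count as `1 << GENTITYNUM_BITS` (which is how ioquake3-derived code defines it), and that each tesla costs two slots, one for the buildable and one for its range marker. The `toy_*` function names and the `reserved` parameter are made up for illustration.

```c
#include <assert.h>

/* Slots each tesla occupies per the notes above: the buildable itself
   plus its range marker. */
#define TOY_SLOTS_PER_TESLA 2

/* Entity limit for a given bit width, e.g. 10 -> 1024, 11 -> 2048. */
int toy_entity_limit(int gentitynum_bits)
{
    assert(gentitynum_bits > 0 && gentitynum_bits < 31);
    return 1 << gentitynum_bits;
}

/* Entity slots consumed by a given number of teslas. */
int toy_slots_for_teslas(int teslas)
{
    assert(teslas >= 0);
    return teslas * TOY_SLOTS_PER_TESLA;
}

/*
 * Most teslas that fit once `reserved` slots are set aside for everything
 * else (players, other buildables, map entities). `reserved` is a
 * hypothetical parameter; the real figure depends on the map and the match.
 */
int toy_max_teslas(int gentitynum_bits, int reserved)
{
    int free_slots = toy_entity_limit(gentitynum_bits) - reserved;
    return free_slots > 0 ? free_slots / TOY_SLOTS_PER_TESLA : 0;
}
```

Under this counting, 250 teslas use 500 slots, while 1100 teslas would need 2200, which is more than the 2048 available with GENTITYNUM_BITS at 11. So that mark would seem to depend on moving the range markers out of the entity slots, or on raising the limit again.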

Results of the Preliminary Experiment on the Trem 1.3 engine:
I was able to spam about 250 teslas in addition to the one reactor, one defense computer, and the default alien and human bases (which I did not decon). Beyond that amount, newly built teslas would sometimes disappear and sometimes not, so it seems that more adjustments will have to be made for larger numbers of teslas to be reliably visible, and to reach the 1100 teslas mark. With the 250 teslas built, I was getting 48 frames per second when all of them were in view (along with the default human base in the background), and 88 to 90 frames per second when only a portion of the teslas was in view.

dGr8LookinSparky · March 12, 2018, 2:43am

I should add that most of these kinds of improvements discussed in this thread would be for the next major version after 1.3, as compatibility for old clients would generally have to be broken, and one of the core features we are keeping in 1.3 is backwards compatibility.

With that said, we did implement a command for 1.3 servers as a stopgap for the problem of lag spikes when entering a room with a large number of buildables. It refreshes your connection to the server and rebroadcasts the information on all of the entities. This is not seamless though, as you see the loading screen when it is executed, and you have to manually use this command whenever you know you are about to enter such a room. This command is called /delag. Also note that delag is available on 1.3 servers regardless of which trem client you use, but generally not available on non-1.3 servers even when using the 1.3 client (unless someone backports delag to an older server).