Creating a custom TCP protocol

Published: Mar 30, 2024

Creating a protocol for TCP with binary data to get live game data from game servers.

Background

For some background: I run a few L4D2 servers that periodically send a JSON POST to my admin panel with “live data” such as the online players, their health, and their state (dead, incapped, etc.), so that you can see what’s happening in a game, live. This system is definitely not efficient, as I’m working with existing SourceMod extensions. It constantly fetches the state on demand, serializes it to JSON, and creates an entirely new HTTP request, every 5 seconds.

That system did have some optimizations: when no one was viewing it, the interval increased. But I never liked that it was sending these huge blobs of data. The data never had to travel over the internet, as both the panel and the game servers are hosted on the same machine, but still, it’s inefficient.

The new system

So I started on a new system with some goals already in mind. My panel was already connected to each server over RCON, and I could have hosted a socket server on each game server, but I didn’t want the panel to have to remember each server’s “live” port.

Goals

  • The panel itself hosts the TCP server, so only one port is needed (to be remembered, that is)
  • Partial changes can be sent (player took damage, map changed, an enemy spawned, etc.)

The basic implementation was simple: the game server would establish a TCP connection to the panel, and any time an event happened, it would send a packet consisting of a header followed by one or more records. The header contained a byte for the version, then a null-terminated string for the server’s ID, which gets mapped on the panel.
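As a rough illustration of that header layout, here is a minimal TypeScript sketch. The version value, the function names, and the choice of UTF-8 for the server ID are my assumptions for the example, not part of the actual specification.

```typescript
// Assumed layout, following the description above:
// [version: 1 byte][server ID bytes][0x00 terminator][record bytes...]
const PROTOCOL_VERSION = 1; // hypothetical value

function encodePacket(serverId: string, records: Uint8Array): Uint8Array {
  const id = new TextEncoder().encode(serverId);
  // A NUL byte inside the ID would end the string early on the receiving side.
  if (id.indexOf(0) !== -1) throw new Error("server ID must not contain a NUL byte");
  const packet = new Uint8Array(1 + id.length + 1 + records.length);
  packet[0] = PROTOCOL_VERSION;
  packet.set(id, 1);
  packet[1 + id.length] = 0; // null terminator
  packet.set(records, 1 + id.length + 1);
  return packet;
}

function decodeHeader(packet: Uint8Array): { version: number; serverId: string; body: Uint8Array } {
  if (packet.length < 2) throw new Error("packet too short");
  // Look for the terminator after the version byte.
  const end = packet.indexOf(0, 1);
  if (end === -1) throw new Error("unterminated server ID");
  return {
    version: packet[0],
    serverId: new TextDecoder().decode(packet.subarray(1, end)),
    body: packet.subarray(end + 1), // everything after the header is record data
  };
}
```

The panel side would call something like decodeHeader on each packet and use serverId to look up the matching server before touching the records.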

I initially hoped to use UDP, as I didn’t care if a packet was dropped: in less than a minute it would more than likely be stale anyway. Unfortunately, UDP did not seem to work and was very hard to debug (oh, and there were plenty of issues).

Each record started with a byte holding the record type enum, followed by the data specific to that type. Records were separated by a 0x1e record separator, so that many records could be sent per packet.
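To make the record layout concrete, here is a small TypeScript sketch of splitting a packet body on the 0x1e separator. The record type numbers are made up for the example (the real enum is in the specification), and the sketch assumes the record data itself never contains a 0x1e byte, which a real parser would have to deal with.

```typescript
const RECORD_SEPARATOR = 0x1e;

// Hypothetical record type values, for illustration only.
const KNOWN_RECORD_TYPES: number[] = [1, 2];

interface RawRecord {
  type: number;     // first byte of the record
  data: Uint8Array; // remaining bytes, still undecoded
}

function splitRecords(body: Uint8Array): RawRecord[] {
  const records: RawRecord[] = [];
  let start = 0;
  for (let i = 0; i <= body.length; i++) {
    // A record ends at a separator or at the end of the body.
    if (i === body.length || body[i] === RECORD_SEPARATOR) {
      if (i > start) {
        const type = body[start];
        // Skip records with an unknown type instead of failing the whole packet.
        if (KNOWN_RECORD_TYPES.indexOf(type) !== -1) {
          records.push({ type, data: body.subarray(start + 1, i) });
        }
      }
      start = i + 1;
    }
  }
  return records;
}
```

Dropping unknown record types rather than throwing is one possible choice here; it keeps a single bad record from discarding the rest of the packet.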

Binary

It was fun working with binary, but I don’t think it was really necessary. I could have simply sent snapshots as small JSON payloads, or sent CSV/TSV records, but I felt that binary would be a good experience, and it was. The new system is definitely far more efficient: a compact sequence of raw bytes instead of all the extra ASCII characters needed just to describe the JSON.

Problems

Now this system worked, somewhat, but I had lots of issues with the “partial updates” aspect. Often the game info would not be sent to or received by the panel, or players would not be listed. As there was no periodic full refresh, these issues would persist until I forced the correct records to be sent.

But the protocol itself seemed to work fine. There were some parsing issues, and sometimes invalid records were sent, but that forced me to acknowledge that bad data can be sent.

Expanding the protocol

The next feature I added to the admin panel was the ability to specify actions. The panel was built for SourceMod, but I wanted to list a Minecraft server I was hosting, and the issue was that there were built-in actions that did not apply to Minecraft servers.

So I set some more goals:

Goals

  • Add builtin commands that work regardless of the actual server type (Minecraft, Source engine, etc.)
  • Commands can return a boolean, integer, or float instead of just a string response

The previous system simply used commands I knew, sent over RCON, but I always felt that RCON had some issues, at least with the Node.js clients I’ve seen. It also only returned a string response message, which is usually meant for a user, not for a machine to understand.

Example

An example of a command I want is builtin:request_stop, which only restarts the server if there are no players online.

I already had a system for this: a command, sm_request_stop, that returned a message such as players are online, and the panel simply used response.includes('players are online') as a check, which is quite ugly and inflexible. If the response ever changed (translated, made clearer), the entire system would break.
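For contrast with the string check, here is a TypeScript sketch of what a typed response could look like on the panel side. The one-byte type tags, the little-endian 32-bit encodings, and the idea that builtin:request_stop answers with a boolean are all assumptions for this example, not the actual wire format.

```typescript
// Assumed layout: [type tag: 1 byte][value bytes]
// Hypothetical tags: 0 = string, 1 = boolean, 2 = int32, 3 = float32
type ResponseValue =
  | { kind: "string"; value: string }
  | { kind: "boolean"; value: boolean }
  | { kind: "integer"; value: number }
  | { kind: "float"; value: number };

function decodeResponse(bytes: Uint8Array): ResponseValue {
  if (bytes.length < 1) throw new Error("empty response");
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  switch (bytes[0]) {
    case 0:
      return { kind: "string", value: new TextDecoder().decode(bytes.subarray(1)) };
    case 1:
      return { kind: "boolean", value: view.getUint8(1) !== 0 };
    case 2:
      return { kind: "integer", value: view.getInt32(1, true) }; // little-endian
    case 3:
      return { kind: "float", value: view.getFloat32(1, true) }; // little-endian
    default:
      throw new Error("unknown response type " + bytes[0]);
  }
}

// With a typed result the check no longer depends on the message wording.
function stopWasAccepted(response: ResponseValue): boolean {
  return response.kind === "boolean" && response.value;
}
```

DataView throws a RangeError when the value bytes are missing, so a truncated response fails loudly instead of being read as garbage.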

Expansion

So, why couldn’t I just use the live system I already implemented? The server is already connected to the admin panel, and already receives responses from the panel.

So that’s what I did. I added simple authentication: the plugin sends a JWT (generated by the panel ahead of time) to the panel, and the panel decodes it, gets the server ID from it, and links the connection to that server instance. This removes the need for a header on every packet.
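A sketch of that handshake in TypeScript, under a few assumptions: the first packet on a connection carries only the token, and verifyToken is a hypothetical stand-in for real JWT verification (signature and expiry checks would normally come from a JWT library). The class and method names are made up for the example.

```typescript
// Hypothetical verifier: returns the server ID carried by a valid token, or null.
type VerifyToken = (token: string) => string | null;

class PluginConnection {
  private serverId: string | null = null;
  private verifyToken: VerifyToken;

  constructor(verifyToken: VerifyToken) {
    this.verifyToken = verifyToken;
  }

  isAuthenticated(): boolean {
    return this.serverId !== null;
  }

  // Until the connection is authenticated, each packet is treated as a token.
  // Afterwards packets are pure record data tagged with the stored server ID.
  handlePacket(packet: Uint8Array): { serverId: string; body: Uint8Array } | null {
    if (this.serverId === null) {
      const token = new TextDecoder().decode(packet);
      this.serverId = this.verifyToken(token);
      return null; // the auth packet itself carries no records
    }
    return { serverId: this.serverId, body: packet };
  }
}
```

Because the server ID lives on the connection object, later packets can drop the per-packet header entirely.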

Problems

Now there was only one problem, which I unfortunately found out after I finished the entire implementation: the game servers hibernate, ending the TCP socket connection.

I could just stop the servers from hibernating, but hibernation is useful: there’s no need for these servers to be ticking and processing when they sit empty for most of the day.

So unfortunately, I had to reimplement RCON support as a fallback on my servers. It’s an unfortunate limitation of SourceMod: I would need to implement my own extension or have the socket extension work around the issue.
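The fallback decision itself can be very small. A TypeScript sketch, where the Transport interface and its fields are hypothetical names for this example rather than the panel’s real code:

```typescript
// Hypothetical minimal view of a way to reach a game server.
interface Transport {
  name: "socket" | "rcon";
  isConnected(): boolean;
}

// Prefer the plugin socket while it is connected; a hibernating server
// drops that connection, so fall back to RCON in that case.
function pickTransport(socket: Transport, rcon: Transport): Transport {
  return socket.isConnected() ? socket : rcon;
}
```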

Conclusion

The “plugin socket” feature (I still need a better name) is very unnecessary, but it was a great experience working with binary and being as efficient as possible, instead of bloated JSON and an unnecessary waste of memory.

Documentation

I also documented my entire specification, which was quite fun to do as well. View it here