A continuation of #151
See past 14 day view of metrics:
1. CPU usage (and thus load) spiking and staying at a high baseline
This is a long-standing issue and is of lesser concern than point 2. It typically rears its head ~2 weeks after a game server start: the update (and restart) preceding this one was on December 3rd, and the spiking began on December 11th. The usage is caused directly by the game server process, which sits at 100% on a single core, hence the ~50% baseline spike (one core of two fully busy).
There isn't any measurable impact on player experience when this issue appears.
This can also be seen via htop:
Oddly, the process reports an uptime of 8 days, which lines up roughly with when the spiking began. Coincidence?
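For reference, the uptime htop reports can be cross-checked straight from /proc. A minimal sketch, assuming a Linux host; the game server's pid would be substituted in:

```python
import os

def process_uptime_seconds(pid: int) -> float:
    """Seconds since the given process started, per /proc (Linux)."""
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # Field 22 (starttime) is in clock ticks since boot. The (comm)
    # field can contain spaces, so split after its closing paren.
    fields = stat.rsplit(")", 1)[1].split()
    starttime_ticks = int(fields[19])  # field 22 overall
    hertz = os.sysconf("SC_CLK_TCK")
    with open("/proc/uptime") as f:
        system_uptime = float(f.read().split()[0])
    return system_uptime - starttime_ticks / hertz
```

Comparing this against the host's own uptime would confirm whether the process restarted (or was restarted) around when the spiking began.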
2. Memory usage continually marching up
Memory has typically been a slight concern, though, from memory, we first started having significant problems in the lead-up to the most recent wipe (3rd November). We put this down to the map size and waited for the wipe. The rise in memory usage early in this map's lifetime was concerning; we hoped it would level out, but it appears that is not the case.
This inevitably causes the host machine to run out of memory and kill the game server process. Adding swap in previous testing appeared to mitigate this to some degree. Suspected memory leak; now to find it 🙃
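Before reaching for heavier tooling, the leak hypothesis can be supported cheaply by sampling the process's resident set size over time; growth that never levels off under zero player load points at a leak rather than workload. A minimal sketch, again assuming Linux, with the server's pid as input:

```python
import time

def rss_kib(pid: int) -> int:
    """Resident set size in KiB, read from /proc/<pid>/status (Linux)."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])  # e.g. "VmRSS:  123456 kB"
    raise RuntimeError(f"VmRSS not found for pid {pid}")

def sample_rss(pid: int, interval_s: float, samples: int) -> list[int]:
    """Collect periodic RSS readings; diff or plot these to see the trend."""
    readings = []
    for _ in range(samples):
        readings.append(rss_kib(pid))
        time.sleep(interval_s)
    return readings
```

Logging a reading every hour over a few days on the idle test server should make the trend unambiguous.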
Action
I've taken a backup of the live server and will run it without any player load on a smaller server configuration (1 vCPU, 1 GB RAM). I'll let it run for at least the remainder of the year to give the CPU issue time to appear, while focusing on debugging memory.
I'll add 2 GB of swap, given memory appears to be growing at ~5 percentage points a day (currently at 80%).
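Back-of-the-envelope for that swap size, assuming the observed growth rate stays linear and carries over to the 1 GB test box:

```python
# Figures from the observations above; linear growth is an assumption.
current_pct = 80.0          # current memory usage (% of RAM)
growth_per_day = 5.0        # observed growth (% points per day)
swap_as_pct_of_ram = 200.0  # 2 GB swap on a 1 GB RAM machine

days_until_oom = (100.0 - current_pct) / growth_per_day
days_with_swap = (100.0 + swap_as_pct_of_ram - current_pct) / growth_per_day

print(days_until_oom)  # 4.0 days of headroom without swap
print(days_with_swap)  # 44.0 days with the 2 GB swap
```

So the swap should buy roughly six weeks of runtime, comfortably covering the rest-of-year observation window.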
Looks like I was initially wrong about the testing server results: Valgrind did find a notable memory leak, which should soon be handled, as linked above.