Rebuilding the Entire Infrastructure

The Dark Ages

Until not long ago, NetherGames was using the same infrastructure design we had back in 2020. While that may not sound bad, a lot has happened since then. NetherGames has grown rapidly, and we had to adapt to higher demand.

In the old design of the infrastructure, we had several physical servers in a data centre, which we call nodes in this blog post. And we had a few of these. (See on the right)
Screenshot 2022-03-23 at 21.32.51.png



Now, what's the issue, you might ask. Well, we managed all of these servers separately. If we wanted to schedule more Bedwars Squads servers, we would have to manually start them after calculating which node would be the best to start them on.

Imagine the pain of manually restarting servers on all of these nodes separately when pushing an update. You had to essentially open every node's control panel and restart the servers there. That usually took us a couple of minutes.

Changing settings on, say, all Bedwars servers was a huge deal back then. It probably took us 20-30 minutes to update all the individual servers' settings.

All this mess with tons of servers led to us creating beautiful big charts of server locations, just to keep track of where each server was running.

The Big Change

Now, those were the dark ages. Rolling out updates was slow, updating nodes was tedious, and scaling game modes up was a pain.

So what did we change? We moved our entire infrastructure into a Kubernetes cluster in a month-long preparation project.

Now, What Is Kubernetes?

That's a question that one of our executives, Callum, asked me recently. Explaining its full feature set to a non-techie would be quite a challenge, so I responded with this:

"Kubernetes is essentially a way to connect all of our nodes into one cluster." While Callum was satisfied with this explanation, it's much deeper than that, and we will explore that depth today!

Kubernetes connects our servers in a single network. As such, nothing from the outside can reach them unless otherwise specified.

You can imagine this like a castle. Our game servers are rooms inside the castle, hidden away from public view. And then you have the entrance, our game proxies. Through our game proxies, you enter the court and can move between rooms under the guidance of the proxy.

Now, that's very useful. It solves a few security challenges, while all our game servers and other services can still communicate seamlessly with each other inside the cluster.

Scaling

And yes, there's another issue solved. Kubernetes essentially lets us provide templates for our servers. We say that any Bedwars-Solo server should look like this, and Kubernetes can then create any number of replicas of that server across the entire cluster with ease. We can specify everything in a simple configuration file (one per server type).

As such, the scaling part goes from being a considerable undertaking down to this little slider:
Screenshot 2022-03-13 at 23.31.43.png
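To make the template idea a bit more concrete, here is a minimal sketch in Python that builds a Kubernetes Deployment manifest for one server type. The field layout follows the standard `apps/v1` Deployment format, but the name `bedwars-solo`, the image URL, the `region` label and the resource numbers are all made up for illustration; they are not our actual configuration.

```python
import json


def game_server_deployment(name: str, image: str, replicas: int, region: str) -> dict:
    """Build a Deployment manifest that acts as the template for one server type."""
    labels = {"app": name, "region": region}  # hypothetical label scheme
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # The "slider": how many copies of the template should be running.
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            # The template every replica is created from.
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "game-server",
                            "image": image,
                            # Assumed resource requests, used by the scheduler.
                            "resources": {"requests": {"cpu": "500m", "memory": "1Gi"}},
                        }
                    ]
                },
            },
        },
    }


# Hypothetical values: 8 Bedwars-Solo servers in the US region.
manifest = game_server_deployment(
    "bedwars-solo", "registry.example/bedwars-solo:latest", replicas=8, region="us"
)
print(json.dumps(manifest, indent=2))
```

Scaling up or down then only means changing the `replicas` number; the template itself stays untouched.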

Scheduling

And that's something else that Kubernetes does for us. Say we want 12 US BW-Squads servers. We give that instruction, and Kubernetes does the rest for us. It calculates which nodes in our cluster fit the resource requirements of the service we want to schedule. It spreads the servers across those nodes, so no single node handles all the load at once. And it respects region boundaries, so no US servers are spawned in the AP region.
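The three steps above (filter by resources, spread the load, respect regions) can be illustrated with a toy scheduler in Python. This is a deliberately simplified sketch and not how the real Kubernetes scheduler is implemented; the `Node` class, the node names and the CPU and memory numbers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    region: str
    free_cpu: float  # free CPU cores
    free_mem: float  # free memory in GiB
    servers: list = field(default_factory=list)


def schedule(nodes, server_type, region, count, cpu, mem):
    """Place `count` servers, one at a time, on the least-loaded node that fits."""
    placements = []
    for i in range(count):
        # Steps 1 and 3: keep only nodes in the right region with enough free resources.
        candidates = [
            n for n in nodes
            if n.region == region and n.free_cpu >= cpu and n.free_mem >= mem
        ]
        if not candidates:
            raise RuntimeError(f"no node in region {region!r} can fit {server_type}")
        # Step 2: spread the load by picking the node that runs the fewest servers.
        target = min(candidates, key=lambda n: len(n.servers))
        target.free_cpu -= cpu
        target.free_mem -= mem
        server = f"{server_type}-{i}"
        target.servers.append(server)
        placements.append((server, target.name))
    return placements


# Hypothetical cluster: three US nodes and one AP node.
nodes = [
    Node("us-1", "us", 8, 16),
    Node("us-2", "us", 8, 16),
    Node("us-3", "us", 8, 16),
    Node("ap-1", "ap", 8, 16),
]
placements = schedule(nodes, "bw-squads", "us", count=12, cpu=0.5, mem=1)
for server, node_name in placements:
    print(server, "->", node_name)
```

With these made-up numbers, the 12 servers end up evenly distributed over the three US nodes, and the AP node stays empty.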

I Tripped Over a Power Cord

And yes, that has happened before. So what if a node goes offline entirely, for any one of a vast number of possible reasons? Kubernetes automatically reschedules all the services from that node onto the other available nodes, so they are back as soon as they have restarted on a node that is actually online.
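As a rough illustration of that behaviour, here is a toy model in Python. It is only a sketch: in a real cluster the services are recreated on another node rather than literally moved, and the node and service names below are invented.

```python
def reschedule(assignments: dict, failed: str) -> dict:
    """Return a new node -> services mapping with the failed node's services
    handed to the least-loaded surviving nodes."""
    survivors = {
        node: list(services)
        for node, services in assignments.items()
        if node != failed
    }
    if not survivors:
        raise RuntimeError("no healthy nodes left")
    for service in assignments.get(failed, []):
        # Pick the surviving node that currently runs the fewest services.
        target = min(survivors, key=lambda node: len(survivors[node]))
        survivors[target].append(service)
    return survivors


# Hypothetical cluster state before someone trips over the power cord of node-a.
cluster = {
    "node-a": ["bw-1", "bw-2"],
    "node-b": ["bw-3"],
    "node-c": ["sw-1", "sw-2"],
}
after = reschedule(cluster, failed="node-a")
print(after)
```

The original mapping is left unchanged, so the before and after states can be compared.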

The Downside

Yes, as with everything, there's a downside. Kubernetes is better suited for stateless services, meaning services that do not persist any data, for example to disk. However, looking at Skyblock or Factions, we face the issue of having to persist data. While it is difficult to get working, there are many solutions that resolve this issue by providing network-synchronized, highly available storage to the services that need it.

At the same time, Kubernetes works well in the cloud, for example on Google Cloud Platform, Amazon Web Services, IBM Cloud or similar. However, NetherGames brings its own servers, so the configuration and integration are more challenging.
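For the curious: a service usually asks Kubernetes for such storage through a PersistentVolumeClaim. The sketch below builds one in Python. The manifest layout is the standard `v1` format, but the claim name, the size and the storage class `replicated-storage` are purely hypothetical, and this post does not say which storage solution sits behind it.

```python
import json


def storage_claim(name: str, size: str, storage_class: str) -> dict:
    """Build a PersistentVolumeClaim manifest requesting `size` of storage."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteOnce: the volume is mounted read-write by a single node.
            "accessModes": ["ReadWriteOnce"],
            # Hypothetical storage class backed by network-synchronized storage.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical claim for a Skyblock server's world data.
claim = storage_claim("skyblock-data", "20Gi", "replicated-storage")
print(json.dumps(claim, indent=2))
```

Because the data lives in the claimed volume rather than on one node's local disk, the server can be restarted on a different node without losing it.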

Effect on Players

Little to none. It should reduce the number of game server restarts you experience, but that's about it. For us developers, however, it's a massive improvement over the previous architecture, making it easier to bring more features and experiences to the players.

Graphs and Charts

Everyone loves graphs, and so do we. So here are some:


Screenshot 2022-03-13 at 23.45.28.png

Screenshot 2022-03-13 at 23.46.06.png

Screenshot 2022-03-13 at 23.47.18.png
 

Huy Enter

Mod
Staff member
Mod
Appeals & Disputes Division

Wow. So that explains a bit about how the server update procedure has changed: updates now take less time and are easier to do. Impressive. That's a huge achievement for the Developer Team. I hope we will see more cool developer projects that reduce server lag, especially in Factions and Skyblock, where players keep bow spamming and cause significant lag, so that our players have a better experience on our server. :D <3
 

New

New Member

great work!
 

Tektion

Supervisor
Staff member
Supervisor
Applications Department
Player Services Department
Discord Moderation Division
Designer

Excellent, I understood around 40% or less of that article. I did understand one major point though: you updated the infrastructure so it is now easier to push updates. Congrats and thank you!
 

Aestrio

New Member
Huh, what do you mean by Skyblock "persist"? Please help me with that. Does that mean Skyblock will reset???
 

exegolbrine

Member
Great work.
NG has come so far.

It’s really good to inform the community of what actually goes into running NG. This should make the community more aware and understanding, and hopefully lead to less complaining.
 

CyberGenius

New Member
Huh, what do you mean by Skyblock "persist"? Please help me with that. Does that mean Skyblock will reset???
When they say that in gamemodes such as skyblock, data is "persisted", they mean it is written to some sort of storage (such as a disk) on their servers. The reason this was troublesome was that Kubernetes is not easily suited for persisting data (as Tobias said, Kubernetes works better with "stateless services", where data doesn't have to be saved in some sort of storage).