Network 2 players co-op lag issues

4 comments, last by hplus0603 2 years, 2 months ago

Hi there,

I'm currently adding a multiplayer co-op mode to my game and so far I've got some results but I'm experiencing lag issues and I was wondering if there is a better way to implement it.

A little bit of context

The game is a single screen puzzler, you move on a grid in 4 directions but it's not turn based, meaning time and latency are very important parts of the gameplay.

The logic is computed every frame just before rendering at 60fps and is 100% deterministic (no RNG)

Normal 1 player game loop

  • get inputs from keyboard or gamepad and compute game command (NONE, UP, DOWN, LEFT or RIGHT)
  • compute game logic
  • render frame

The approach I used to add multiplayer (2 players)

Player 1:

  • read received messages with command from Player 2
  • if there is more than one message, discard all but the latest game command from Player 2
  • compute self game command
  • send both its own game command and the one from Player 2 back to it
  • compute game logic
  • render frame

Player 2:

  • compute game command
  • send it to Player 1
  • read messages from Player 1 containing both game commands, there might be multiple messages for some reason
  • compute game logic for every message or don't if none
  • render frame

This way I'm sure both P1 and P2 have the exact same commands to compute the next frames and there is no desync. If P2 gets multiple messages, it catches up to the latest state before rendering the frame.

Here's what it looks like (recording of player 1):

Issue

When multiple messages arrive at once on P1's side, it's not a problem for gameplay or animation, because P1 never waits and runs exactly one game logic pass per rendered frame. When it happens on P2's side, P2 runs several logic passes before rendering a single frame, so the animation is choppy.

The thing is, I expect maybe 2 or 3 messages to be delayed and received at the same time, but sometimes it's a lot more, around 10, and it's really noticeable on P2's side.

I'm using the ISteamNetworkingMessages interface for network communication, and I send messages with the k_nSteamNetworkingSend_ReliableNoNagle flag to try to minimize latency.

And all this happens while I'm using two computers connected locally to the same router in my office.

Any help or advice would be greatly appreciated

Thanks!


A couple things immediately jump to mind.

First, you have an extremely chatty interface. You're sending a lot of messages, one per frame. Instead, you probably only need one per event. I'm not sure what your frame rate is: maybe 60 Hz, 75 Hz, or 90 Hz; 144 Hz is increasingly common, and some games are even faster. If you're sending out perhaps 90 updates per second, the vast majority will be “NONE”. It's far better to send only when the state changes. Assume that if they were pushing “LEFT” in one frame, they'll continue pushing “LEFT” 11 milliseconds later.
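The "send only on change" idea can be sketched as follows. This is a minimal illustration, not the poster's code: `ChangeOnlySender`, `ChangeOnlyReceiver`, and the `send` callback are hypothetical names, and the sketch assumes messages arrive reliably and that both sides start from NONE.

```python
from enum import IntEnum


class Command(IntEnum):
    NONE = 0
    UP = 1
    DOWN = 2
    LEFT = 3
    RIGHT = 4


class ChangeOnlySender:
    """Emits a (frame, command) message only when the command changes."""

    def __init__(self, send):
        self._send = send          # hypothetical callback that transmits a message
        self._last = Command.NONE  # both sides assume NONE until told otherwise

    def update(self, frame, command):
        if command != self._last:
            self._send((frame, command))
            self._last = command


class ChangeOnlyReceiver:
    """Rebuilds the per-frame command stream from change messages."""

    def __init__(self):
        self._changes = {}            # frame number -> new command
        self._current = Command.NONE

    def on_message(self, message):
        frame, command = message
        self._changes[frame] = command

    def command_for(self, frame):
        # Must be called once per frame, in increasing frame order.
        if frame in self._changes:
            self._current = self._changes.pop(frame)
        return self._current


def demo():
    C = Command
    inputs = [C.NONE, C.LEFT, C.LEFT, C.LEFT, C.NONE, C.NONE, C.UP]
    wire = []                                  # stands in for the network
    sender = ChangeOnlySender(wire.append)
    for frame, command in enumerate(inputs):
        sender.update(frame, command)
    receiver = ChangeOnlyReceiver()
    for message in wire:
        receiver.on_message(message)
    rebuilt = [receiver.command_for(f) for f in range(len(inputs))]
    return inputs, wire, rebuilt
```

In this run, seven frames of input produce three messages, and the receiver reconstructs the same seven commands. Each message carries its frame number so the receiver knows when the change took effect.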

Second, you're not taking into account the nature of networking. You might be on a local network with a high quality switch and good network cards and get your messages delivered in around 50 microseconds, far less than one graphics frame but still long enough to straddle a frame boundary. You might be across the Internet at a distance where times are much greater. Someone located across the country may be multiple graphics frames away in latency, because you cannot beat the speed of light or the speed of electric signals. Lag can be introduced at all levels by a variety of other factors, like sharing the network with other devices, congestion at switches and hubs, and interference due to sun spots. It often isn't your program in particular, but your program's contribution to the network load. You can never rely on network packets to arrive in the way you've described.

You're already turning off the Nagle flag, which is a double-edged sword here. While it does prevent coalescing of small packets, it also means your chatty interface will generate a terrible signal-to-noise ratio. The minimum Ethernet payload is 46 octets, but depending on your networking options each thing you send may be even larger than that. If the only thing you're sending is a single byte, that's a lot of data and network processing to handle the single byte. That just increases congestion on the network equipment even more.
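As a rough back-of-the-envelope check of the overhead described above, the sketch below computes the size of an Ethernet frame carrying a one-byte application payload. It assumes plain UDP over IPv4 over Ethernet with no options; whatever headers the Steam networking layer adds are not known here and are left out, so real messages would be larger.

```python
# Standard header sizes in bytes (IPv4 without options, Ethernet II).
ETH_HEADER = 14
ETH_FCS = 4
ETH_MIN_PAYLOAD = 46
IPV4_HEADER = 20
UDP_HEADER = 8


def wire_frame_bytes(app_payload):
    """Ethernet frame size for one UDP datagram (excludes preamble and gap)."""
    ip_packet = IPV4_HEADER + UDP_HEADER + app_payload
    # Short packets are padded up to the minimum Ethernet payload.
    return ETH_HEADER + max(ip_packet, ETH_MIN_PAYLOAD) + ETH_FCS


def payload_fraction(app_payload):
    """Share of the frame that is actual game data."""
    return app_payload / wire_frame_bytes(app_payload)
```

Under these assumptions, `wire_frame_bytes(1)` is 64, so a single command byte makes up 1/64 of the frame that carries it.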

You don't mention what you're doing to account for time differences in the game, or how you know how far in the past a message was sent. If you aren't doing it already, send a timestamp such as the game's simulation time so you can monitor the flow on the opposite side. Exactly how you work that out will make a difference in your game logic: the two players will be seeing different things, and you'll need to resolve that somehow. For example, if a player pushed a boulder while a network message was in transit, one side simulates with the boulder in the new position, the other with it in the old position, and they get out of sync. Lockstep would certainly resolve a lot of issues, but there are various options to correct for desync caused by out-of-date data.

It could easily be that you're upsetting the network hardware from the chatty interface between the two machines and seeing the result.

Two things about the deterministic networking model:

  1. if you use floating point, determinism may not be guaranteed anyway, because the rounding mode for a thread may be randomly changed by window messages, and some CPUs may use 80-bit internal precision versus 64-bit internal precision (for x87 mode) or do different things with denormals (SSE mode.)
  2. the round-trip time in general when networking will be significantly higher than the roughly 16.7 milliseconds between two frame renders at 60 Hz. You need to establish a game clock, and queue commands to happen at “now plus x frames” to give them time to arrive. You need to do this even for the host, if you want the game to feel the same on both sides.
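The "now plus x frames" idea from point 2 can be sketched as a small simulation. This is only an illustration under stated assumptions: `DelayedInputQueue`, `fake_input`, and `simulate` are hypothetical names, the network is modelled as a fixed latency measured in ticks, and messages are never lost. Each peer schedules its own input `delay` frames ahead, sends it, and advances a frame only once it holds the commands of both players for that frame.

```python
class DelayedInputQueue:
    """Holds commands per simulation frame until every player's input is in."""

    def __init__(self, players, delay):
        self.players = tuple(players)
        self.delay = delay
        # No input can exist yet for the first `delay` frames: run them with NONE.
        self.pending = {f: {p: "NONE" for p in self.players} for f in range(delay)}

    def schedule_local(self, current_frame, player, command):
        target = current_frame + self.delay          # "now plus x frames"
        self.pending.setdefault(target, {})[player] = command
        return (target, player, command)             # message for the peer

    def on_remote(self, message):
        target, player, command = message
        self.pending.setdefault(target, {})[player] = command

    def take_if_ready(self, frame):
        if len(self.pending.get(frame, {})) < len(self.players):
            return None                              # still waiting on someone
        return self.pending.pop(frame)


def fake_input(player, frame):
    # Deterministic stand-in for reading the keyboard or gamepad.
    commands = ("NONE", "UP", "DOWN", "LEFT", "RIGHT")
    return commands[(frame + (1 if player == "P2" else 0)) % 5]


def simulate(ticks, latency, delay):
    names = ("P1", "P2")
    queues = {n: DelayedInputQueue(names, delay) for n in names}
    frame = {n: 0 for n in names}
    sampled = {n: -1 for n in names}   # last frame whose input was sent
    history = {n: [] for n in names}
    stalls = {n: 0 for n in names}
    in_flight = []                     # (arrival_tick, destination, message)

    for tick in range(ticks):
        arrived = [m for m in in_flight if m[0] <= tick]
        in_flight = [m for m in in_flight if m[0] > tick]
        for _, destination, message in arrived:
            queues[destination].on_remote(message)

        for me, peer in (("P1", "P2"), ("P2", "P1")):
            if sampled[me] < frame[me]:              # sample once per sim frame
                message = queues[me].schedule_local(
                    frame[me], me, fake_input(me, frame[me]))
                in_flight.append((tick + latency, peer, message))
                sampled[me] = frame[me]
            commands = queues[me].take_if_ready(frame[me])
            if commands is None:
                stalls[me] += 1                      # render without stepping
            else:
                history[me].append((frame[me], commands["P1"], commands["P2"]))
                frame[me] += 1
    return history, stalls
```

In this model, both peers apply identical commands on every frame, and neither stalls as long as the latency in ticks does not exceed the input delay. When latency exceeds the delay, the peers stay in sync but occasionally wait, which suggests choosing the delay from measured latency rather than hard-coding it.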

If you doubt the Steam networking implementation, you could build a simple test that calls socket() and bind() with a UDP socket on the server, poll using select() and recvfrom() (or recvfrom() on a non-blocking socket,) and just uses sendto() from the client. It's actually very easy to get going when you can hard-code 127.0.0.1 and the port number for the server. This will let you test unreliable transmission without packet merging, to see whether it makes a difference.
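A loopback version of that test might look like the sketch below, using Python's standard `socket` and `select` modules, which wrap the same calls named above. The function name `udp_loopback_probe` and the one-byte sequence payload are illustrative choices; the server binds to port 0 so the operating system picks a free port instead of a hard-coded one.

```python
import select
import socket
import time


def udp_loopback_probe(count=5, timeout=1.0):
    """Send `count` one-byte datagrams to a local UDP socket and time each one."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    results = []  # (sequence number, seconds from sendto() to recvfrom())
    try:
        server.bind(("127.0.0.1", 0))      # port 0: let the OS choose a free port
        server.setblocking(False)
        address = server.getsockname()
        for sequence in range(count):
            started = time.perf_counter()
            client.sendto(bytes([sequence]), address)
            # Wait until the server socket is readable, or give up after timeout.
            readable, _, _ = select.select([server], [], [], timeout)
            if readable:
                data, _sender = server.recvfrom(64)
                results.append((data[0], time.perf_counter() - started))
    finally:
        client.close()
        server.close()
    return results
```

To compare against the Steam path, the same probe would need the two sockets on separate machines, with the real address in place of 127.0.0.1. UDP gives no delivery guarantee, so a missing sequence number in the results is possible off the loopback interface.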

enum Bool { True, False, FileNotFound };

Thanks for the replies and suggestions.

So if I understood correctly the network is saturated with a lot of small messages (60 messages per second of 1 byte + packet size in both ways) and most of them are unnecessary as you mentioned, it's true I can just send the commands when they change state.

I'm rewriting the code to minimize the number of messages from P2 to P1, and I will also try to merge multiple commands into one message (probably 5 frames' worth) from P1 to P2. It will introduce a little more latency, but my tests showed that up to 10 frames of latency on P2's side is totally acceptable.

As for the determinism of the game logic, I'm not using floating point calculations at all, so it should be fine?

I will post my progress here and hopefully I will get satisfying results

Thanks again

I would aggregate commands within the same frame, but probably not aggregate across many frames. 5 frames seems like it might be too much, but if it works for you, great!
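One way to aggregate the commands of a single frame into one message is a small fixed-size record. The layout below (a 32-bit frame number followed by one byte per player) is an assumption made for illustration, not something specified in this thread.

```python
import struct

# Little-endian, no padding: uint32 frame number, uint8 P1 command, uint8 P2 command.
FRAME_MESSAGE = struct.Struct("<IBB")


def encode_frame(frame, p1_command, p2_command):
    """Pack both players' commands for one frame into a single 6-byte message."""
    return FRAME_MESSAGE.pack(frame, p1_command, p2_command)


def decode_frame(data):
    """Return (frame, p1_command, p2_command) from a packed message."""
    return FRAME_MESSAGE.unpack(data)
```

With commands numbered 0 to 4 for NONE, UP, DOWN, LEFT and RIGHT, `encode_frame(120, 3, 0)` yields 6 bytes that decode back to `(120, 3, 0)`.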

And, yes, using integer math is the best way to ensure determinism! It's especially important in worlds where you might have mobile or macOS clients running ARM CPUs together with other clients/servers running x64 CPUs (or even ARM servers, using Amazon Graviton instances!)

enum Bool { True, False, FileNotFound };

