Torque/Networking

From TDN

Torque Networking Overview

Introduction

The Torque Networking Layer is a core system within any Torque application. Even single player projects (with no connection to other computers) use the networking system, so a solid understanding of how the system works is critical to a successful project.

There are 5 basic areas of interest:

Each of these areas is detailed in a separate section, and I highly suggest following up this article with the additional research and detail those sections provide.

For those new to networking, this article is aimed at answering what a server and a client are, the differences between the two, and how to write your code and script appropriately.

Overview

Torque Networking evolved from the original Dynamix Starsiege: TRIBES networking model, which was designed to handle many of the pitfalls of networked multiplayer games across the Internet.

Architecture

All Torque projects are organized by default as a client-server topology (yes, even single player games!). There are several reasons for this design decision, the most important of which is having a single authoritative owner of all simulation information. This lets the developer send each client only the information that is important to that player at that time--which limits the ability of players to "hack the client" and gain advantages within your project.

Since the server is the authoritative data owner, only the server is allowed to make direct changes to simulation objects (with the exception of the ControlObject, described in detail below). This means that players cannot directly manipulate game-changing information--they are only given the minimum amount of information needed to properly observe the simulation from their player object's perspective, and have no mechanism for directly accessing the authoritative data on the server. The server maintains control of the simulation, and in effect each client simply replicates the subset of the simulation appropriate to its current position.

In any Internet-based network architecture, there are several issues that must be handled, some of the most critical of which include:

Limited Bandwidth
Packet Loss
Latency

(A detailed discussion of these issues can be reviewed here: OpenTNL Introduction to Network Programming Concepts.)

As Torque networking evolved, many refactoring steps were performed to both expand functionality and continue to optimize bandwidth use for networked gaming. These refactoring steps are what provide the award-winning capabilities of the model.

Networking Strategies

There are several strategies that the Torque Networking model uses to overcome the critical issues:

1: Bandwidth Limitation

From both an infrastructure capability perspective and a cost of bandwidth perspective, it is extremely important to minimize bandwidth use as much as possible while still achieving your networking requirements. Torque uses several tactics to meet this strategic networking goal.

Send static data once, or not at all

We use a couple of strategies for ensuring that we only transmit static data when it is appropriate, which combine to satisfy this design requirement.

Torque projects are built around the concept of Datablocks, which are loosely defined as information about a grouping of simulation objects which will not change during runtime of the application. Since this data is completely static, we can network that data to each client at the beginning of the session, when the user has a much larger margin of "wait time acceptability", instead of having to stream the content to each client on demand, when the margin for acceptance of delays is much lower. Once all of the datablocks have been transmitted from the server to a client as it logs in, we no longer need to transmit this data during the session, and can instead focus on the information which is important for the client's ability to replicate the simulation.

Torque also provides a String Table which keeps track of all strings that are delivered to the client (as long as the developer uses the syntax for tagged strings). If a tagged string ever needs to be transmitted again, the server can simply transmit a string index, resulting in a much smaller transmission. Since the client has already received the tagged string and stored it in its own String Table, it can use this index number to reference the string itself.
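The idea behind tagged strings can be sketched in a few lines of C++. This is only an illustration of "send the text once, then send the index"; the names StringMessage, TaggedStringSender and TaggedStringReceiver are hypothetical and are not Torque's actual string table classes.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical wire message: the index is always present, the text only once.
struct StringMessage {
    std::uint32_t index;   // slot in the receiver's table
    bool hasText;          // true only the first time this string is sent
    std::string text;      // empty when hasText is false
};

// Sender side: remembers which strings this client has already been given.
class TaggedStringSender {
public:
    StringMessage encode(const std::string& s) {
        auto it = mKnown.find(s);
        if (it != mKnown.end())
            return StringMessage{it->second, false, std::string()};  // index only
        std::uint32_t index = static_cast<std::uint32_t>(mKnown.size());
        mKnown.emplace(s, index);
        return StringMessage{index, true, s};  // first time: send the text too
    }
private:
    std::unordered_map<std::string, std::uint32_t> mKnown;
};

// Receiver side: stores strings by index and resolves index-only messages.
class TaggedStringReceiver {
public:
    const std::string& decode(const StringMessage& m) {
        if (m.hasText) {
            if (m.index >= mTable.size()) mTable.resize(m.index + 1);
            mTable[m.index] = m.text;
        }
        assert(m.index < mTable.size());  // index-only messages must be known
        return mTable[m.index];
    }
private:
    std::vector<std::string> mTable;
};
```

Encoding the same string twice yields one message carrying the text and one carrying only the index, and the receiver resolves both to the same string.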

Compress data to the minimum space necessary

When conserving bandwidth, it is critical to use the absolute minimum number of bits that can represent the data during the delivery of information. Torque implements the BitStream class to write both standard and (with developer enhancement) custom datatypes with the fewest possible bits for maximum compression. For example, boolean data values are delivered with just a single bit, integers are configured by the developer to use only the number of bits necessary to represent the maximum value needed for that data element, and real numbers can be compressed to a 0..1 range (or -1..1 in the case of signed reals) with developer-assigned compression sizes. Finally, we utilize Huffman compression for strings, as well as additional compression techniques for both 3D positions and surface normals.
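To make the bit counts concrete, here is a minimal sketch of a bit-level writer. It is a simplified stand-in written for this article, not Torque's BitStream: the class name BitWriter and all of its method names are assumptions, and the real class offers more (reading, strings, vectors, normals).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bit-level writer illustrating minimal-width encoding.
class BitWriter {
public:
    // A boolean costs exactly one bit.
    void writeFlag(bool value) { writeBits(value ? 1u : 0u, 1); }

    // Write the low bitCount bits of value, least significant bit first.
    void writeBits(std::uint32_t value, int bitCount) {
        assert(bitCount >= 0 && bitCount <= 32);
        for (int i = 0; i < bitCount; ++i) {
            if (mBitPos % 8 == 0) mBytes.push_back(0);  // start a new byte
            if ((value >> i) & 1u)
                mBytes.back() |= static_cast<std::uint8_t>(1u << (mBitPos % 8));
            ++mBitPos;
        }
    }

    // Integer known to lie in [lo, hi]: store the offset from lo in just
    // enough bits to represent (hi - lo).
    void writeRangedInt(int value, int lo, int hi) {
        assert(lo <= value && value <= hi);
        writeBits(static_cast<std::uint32_t>(value - lo), bitsForRange(lo, hi));
    }

    // Real number in [0, 1], quantized to bitCount bits.
    void writeUnitFloat(float value, int bitCount) {
        assert(value >= 0.0f && value <= 1.0f);
        assert(bitCount > 0 && bitCount < 32);
        const std::uint32_t maxCode = (1u << bitCount) - 1u;
        writeBits(static_cast<std::uint32_t>(std::lround(value * maxCode)), bitCount);
    }

    // Number of bits needed to represent every value in [lo, hi].
    static int bitsForRange(int lo, int hi) {
        std::uint32_t span = static_cast<std::uint32_t>(hi - lo);
        int bits = 0;
        while (span > 0) { ++bits; span >>= 1; }
        return bits;
    }

    std::size_t bitCount() const { return mBitPos; }
    std::size_t byteCount() const { return mBytes.size(); }

private:
    std::vector<std::uint8_t> mBytes;
    std::size_t mBitPos = 0;
};
```

With this sketch, a flag, a health value in the range 0..100 and an 8-bit unit float occupy 1 + 7 + 8 = 16 bits, compared with 9 bytes if each were sent as a full bool, int and float.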

Only send information that is relevant to the client

In a client-server model, the server maintains information on every single object in the simulation--but not every client needs to be updated on each of those objects all of the time. We use the concept of "scoping" to dynamically track all objects in the simulation that are "important" to each client, and only deliver updates to a particular client for those objects that are "in scope". This scoping calculation is fully controllable by the developer, and can be based on any measurable simulation criteria. By default, scoping is performed based on the 3D distance from the client's control object to each of the objects in question.
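A distance-based scope query might look roughly like the following. This is a sketch under assumptions, not the engine's scoping code: SceneObject, its alwaysInScope flag and objectsInScope() are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical simulation object as seen by the scoping pass.
struct SceneObject {
    int id;
    Vec3 position;
    bool alwaysInScope;  // e.g. objects every client needs regardless of distance
};

float distanceSquared(const Vec3& a, const Vec3& b) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Return the ids of the objects this client should receive updates for.
std::vector<int> objectsInScope(const std::vector<SceneObject>& objects,
                                const Vec3& controlPosition,
                                float visibleDistance) {
    assert(visibleDistance >= 0.0f);
    const float limit = visibleDistance * visibleDistance;  // avoid sqrt
    std::vector<int> inScope;
    for (const SceneObject& obj : objects) {
        if (obj.alwaysInScope ||
            distanceSquared(obj.position, controlPosition) <= limit)
            inScope.push_back(obj.id);
    }
    return inScope;
}
```

A custom scoping rule would replace the distance test with whatever criterion the project needs, such as team membership or line of sight.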

Prioritize Updates

Since the server commonly tracks a lot more information than the connection between the server and client can handle at any given time, Torque networking allows for prioritization of updates to each specific client to make sure that the player has access to the information most important to them at any one time. This prioritization is fully controllable by the developer, and can even change dynamically based on simulation states, ensuring that the player has the highest frequency updates for the most important information.
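One way to picture prioritization is as filling a fixed-size packet from a sorted list. The sketch below assumes a hypothetical PendingUpdate record and an invented weighting (base priority plus a small bonus for every packet in which the object was skipped); the weights and names are illustrative only and do not reproduce Torque's priority calculation.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical record for one object that has changes waiting to be sent.
struct PendingUpdate {
    int objectId;
    float basePriority;  // e.g. derived from distance or gameplay importance
    int updateSkips;     // packets built since this object last got an update
    int sizeBytes;       // size of its update in this packet
};

// Assumed weighting: starved objects slowly climb the list.
float effectivePriority(const PendingUpdate& u) {
    return u.basePriority + 0.1f * static_cast<float>(u.updateSkips);
}

// Sort by priority, then add updates until the byte budget is exhausted.
// Returns the ids that made it into this packet.
std::vector<int> fillPacket(std::vector<PendingUpdate>& pending, int budgetBytes) {
    assert(budgetBytes >= 0);
    std::stable_sort(pending.begin(), pending.end(),
                     [](const PendingUpdate& a, const PendingUpdate& b) {
                         return effectivePriority(a) > effectivePriority(b);
                     });
    std::vector<int> sent;
    int used = 0;
    for (PendingUpdate& u : pending) {
        if (used + u.sizeBytes <= budgetBytes) {
            used += u.sizeBytes;
            u.updateSkips = 0;       // updated now, reset starvation counter
            sent.push_back(u.objectId);
        } else {
            ++u.updateSkips;         // skipped, so it ranks higher next time
        }
    }
    return sent;
}
```

The skip counter is the design point worth noting: without it, a low-priority object could be starved indefinitely on a saturated connection.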

Partial state updates

In most simulations, only certain groupings of data within an object will change due to actions within the simulation. For example, an FPS could logically group a player's position and velocity, knowing that when one of these changes it is very highly probable that the second data element in the logical grouping will change as well. A change in position, however, says nothing about whether the health has changed, so there is no reason to send a client an update of the health when only the position has changed. Torque therefore gives you the capability to define and update up to 32 (stock) logical groupings of data to trim down the size of your updates.
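The 32 groupings map naturally onto the bits of a 32-bit mask. The following sketch shows the pattern with two groups; PlayerState, PartialUpdate, packPartial() and unpackPartial() are hypothetical names for this example, and the real packUpdate()/unpackUpdate() methods write to a bit stream rather than a struct.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One bit per logical grouping of state (up to 32 with a 32-bit mask).
enum StateMask : std::uint32_t {
    MoveMask   = 1u << 0,  // position and velocity change together
    HealthMask = 1u << 1
};

struct PlayerState {
    float position[3] = {0.0f, 0.0f, 0.0f};
    float velocity[3] = {0.0f, 0.0f, 0.0f};
    int health = 100;
};

// Hypothetical update record: only the groups named in mask are meaningful.
struct PartialUpdate {
    std::uint32_t mask = 0;
    float position[3] = {0.0f, 0.0f, 0.0f};
    float velocity[3] = {0.0f, 0.0f, 0.0f};
    int health = 0;

    // Payload bytes that would actually go on the wire for this mask.
    std::size_t payloadBytes() const {
        std::size_t bytes = 0;
        if (mask & MoveMask)   bytes += sizeof(position) + sizeof(velocity);
        if (mask & HealthMask) bytes += sizeof(health);
        return bytes;
    }
};

// Server side: copy only the groups flagged as dirty.
PartialUpdate packPartial(const PlayerState& server, std::uint32_t dirtyMask) {
    assert((dirtyMask & ~(MoveMask | HealthMask)) == 0);
    PartialUpdate u;
    u.mask = dirtyMask;
    if (dirtyMask & MoveMask) {
        for (int i = 0; i < 3; ++i) {
            u.position[i] = server.position[i];
            u.velocity[i] = server.velocity[i];
        }
    }
    if (dirtyMask & HealthMask) u.health = server.health;
    return u;
}

// Client side: apply only the groups present in the update.
void unpackPartial(PlayerState& ghost, const PartialUpdate& u) {
    if (u.mask & MoveMask) {
        for (int i = 0; i < 3; ++i) {
            ghost.position[i] = u.position[i];
            ghost.velocity[i] = u.velocity[i];
        }
    }
    if (u.mask & HealthMask) ghost.health = u.health;
}
```

Packing with only MoveMask set sends the six floats of position and velocity and leaves the client's copy of health untouched.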

2: Packet Loss

In any networked simulation, it is extremely important to plan for and handle packet loss. Standard network programming practice would normally indicate that TCP should be used in an application where packet delivery must be guaranteed, but unfortunately TCP comes with additional requirements (including guaranteed ordered delivery of all data) that make it a poor choice as a delivery protocol for a networked simulation. There are certain types of data where we want guaranteed delivery and order, but there are even more types of data where these properties will actually cause us severe problems in our networked simulation.

There are a couple of different strategies that could be used for handling our requirements, and Torque has elected to build on top of the UDP protocol. Our implementation gives us total control over the type of packet we want to send and its delivery parameters, giving us the ability to have at least 5 different delivery policies--which is more than enough to handle the vast majority of networked simulation needs.

Guaranteed Ordered Delivery

When a specific dataset must be guaranteed to be delivered in an ordered manner, we can utilize the NetEvent class, as well as the RPC (commandToXX) functionality to perform the delivery. This policy ensures that all updates of the same policy are processed (delivered and then handled by the client) before any new updates of this policy type are processed.

Guaranteed Delivery

This policy is very similar to Guaranteed Ordered, but does allow the client to immediately process any updates that are received even if the client is aware of sent but not received events within the same policy type.

Unguaranteed Data

At first this may not seem useful, but there are cases where it isn't application critical that data is guaranteed to be received, and where guaranteed ordered is too interruptive to the player's experience. A good example of this is voice communications: we don't want to hold up an entire voice stream just because a single packet is held up (Guaranteed Ordered), but we also don't want to have a particular voice packet show up out of order (Guaranteed Delivery), as it would make no sense to the player.

Current State Data

There are many cases in a networked simulation where data may be important to the client, but has a "lifetime" after which it becomes stale if transmission and processing are delayed. For example, if a position update for a player is sent to a client, but the client never receives the packet, then by the time the server is informed that the packet was lost, the information within the packet is stale (we assume that if a player is moving, they tend to continue to move during our time scale of packet delivery). Instead of simply retransmitting that previous state data update, the server will send the most current state of the object (updated with any additional movement information the server is aware of in our example) so that the client receives the most accurate information.
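One common way to get this behavior is to remember which state groups each packet carried and, when a packet is reported lost, simply flag those groups as dirty again so that the next packet picks up their current values. The sketch below illustrates that bookkeeping; StateChannel and its methods are hypothetical names, and it ignores the refinement of checking whether a later packet has already resent the same groups.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Which state groups went out in which packet.
struct SentRecord {
    std::uint32_t sequence;
    std::uint32_t mask;
};

// Hypothetical per-object, per-client bookkeeping for current-state data.
class StateChannel {
public:
    // The simulation changed these state groups.
    void markDirty(std::uint32_t mask) { mDirty |= mask; }

    // Building packet `sequence`: returns the groups to pack, using the
    // object's values as they are right now.
    std::uint32_t onSend(std::uint32_t sequence) {
        const std::uint32_t mask = mDirty;
        mDirty = 0;
        mInFlight.push_back(SentRecord{sequence, mask});
        return mask;
    }

    // The connection learned whether packet `sequence` arrived.
    void onNotify(std::uint32_t sequence, bool delivered) {
        auto it = std::find_if(mInFlight.begin(), mInFlight.end(),
                               [sequence](const SentRecord& r) {
                                   return r.sequence == sequence;
                               });
        assert(it != mInFlight.end());
        if (!delivered)
            mDirty |= it->mask;  // resend the groups, not the old bytes
        mInFlight.erase(it);
    }

private:
    std::uint32_t mDirty = 0;
    std::vector<SentRecord> mInFlight;
};
```

Note that nothing from the lost packet is stored except its mask: the stale position values are never retransmitted.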

Quickest Delivery Data

The final policy we have in Torque networking is for data that is so important to a client that we don't want to have to deal with latency issues involved with lost packets and the time required to resend that data once the server becomes aware that the packet was lost. In this case, we send the update along with every packet delivered to that client until the client acknowledges that the packet was received and processed. In other words, we "spend" bandwidth by repeatedly sending the same data every update, and we "buy" faster delivery by ensuring that there is no delay between a client losing a packet, asking for retransmit, and the server retransmitting--if the client loses a packet, our "Quickest Delivery" data is guaranteed to be in the next packet (and every future packet until the client processes the data) for immediate utilization. An example of using this policy is how controlled objects are updated--it's critical that the client simulation does its best to know about the server's authoritative position of the player's object, so we use this policy to guarantee high fidelity updates of the object.
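The "send it in every packet until acknowledged" policy amounts to a queue that is only drained by acknowledgements. Here is a minimal sketch; QuickestDeliveryQueue and Entry are hypothetical names, and the int payload stands in for whatever data is being delivered.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <vector>

struct Entry {
    std::uint32_t sequence;
    int payload;  // stand-in for the real data
};

// Hypothetical queue: every outgoing packet repeats all unacknowledged
// entries, trading bandwidth for the lowest possible delivery delay.
class QuickestDeliveryQueue {
public:
    // Queue new data; returns the sequence number assigned to it.
    std::uint32_t queue(int payload) {
        mUnacked.push_back(Entry{mNext, payload});
        return mNext++;
    }

    // Contents to append to the next outgoing packet.
    std::vector<Entry> buildPacket() const {
        return std::vector<Entry>(mUnacked.begin(), mUnacked.end());
    }

    // The receiver has processed everything up to and including `upTo`.
    void acknowledge(std::uint32_t upTo) {
        while (!mUnacked.empty() && mUnacked.front().sequence <= upTo)
            mUnacked.pop_front();
    }

private:
    std::deque<Entry> mUnacked;
    std::uint32_t mNext = 0;
};
```

If any one packet is lost, the same entries are already riding in the next one, so no retransmit round trip is needed.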

3: Latency

Latency derives from the time required to physically transmit data across a physical network connection. Total latency is dependent upon several factors, from the physical distance between the server and the client, to the speed and throughput of intermediate steps along the route, to packet loss and retransmission (transport and protocol layer dependent). Latency can be an extremely immersion-disruptive issue in a networked simulation, so we utilize several tactics to overcome the issue:

Interpolation

Interpolation is a tactic to overcome unsynchronized states between the server and client. It is commonly used to handle "warping", which occurs when the client thinks an object is at a certain location within the simulation, but then receives an update from the (authoritative) server that is noticeably different. Instead of instantly changing the client's version of the simulation, we use interpolation to quickly but smoothly correct the observed object's out-of-sync data: instead of watching another player jump across your screen, we rapidly step through interpolated positions between the "current" one and the one the server tells us is correct. Note: Used by itself, client side interpolation can actually exacerbate an out-of-sync state, because the client must spend time simulating the lost data instead of actually correcting the out-of-sync state, so it must be combined with other techniques.

Extrapolation

To enhance the benefits of interpolation, as well as help to correct the problems interpolation causes, we combine it with extrapolation: we make guesses based on known historical states of an object, and extrapolate expected future states of that object. This, combined with interpolation of our client side guess towards the server's authoritative update, gives us a smooth and controlled capability that overcomes many of the artifacts of a high latency connection.

For example, if we know both the current position (client side), as well as the last reported velocity of a particular object, we can extrapolate a future position of that object using our simulation physics. Once we get an update back from the server, our position should be pretty close, and we can then interpolate between our best guess position, and the authoritative server position, resulting in only a very small change.
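In code, the two steps from this example reduce to a pair of small functions. This is a minimal sketch assuming constant-velocity motion and a simple linear blend; the engine's own physics and smoothing are more involved, and Vec3, extrapolate() and interpolate() are names chosen for the example.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Extrapolation: guess where the object will be after dt seconds, given
// its last known position and velocity (constant velocity assumed).
Vec3 extrapolate(const Vec3& position, const Vec3& velocity, float dt) {
    return Vec3{position.x + velocity.x * dt,
                position.y + velocity.y * dt,
                position.z + velocity.z * dt};
}

// Interpolation: move a fraction t (0..1) of the way from the client's
// guess toward the server's authoritative position.
Vec3 interpolate(const Vec3& guess, const Vec3& authoritative, float t) {
    assert(t >= 0.0f && t <= 1.0f);
    return Vec3{guess.x + (authoritative.x - guess.x) * t,
                guess.y + (authoritative.y - guess.y) * t,
                guess.z + (authoritative.z - guess.z) * t};
}
```

For an object last seen at the origin moving at 10 units per second along x, the guess after half a second is x = 5. If the server then reports x = 6, interpolating with t = 0.5 moves the displayed position to 5.5 rather than snapping it to 6.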

This combined tactic works well for observed objects, because the client is not only doing its best to predict the server's authoritative state, but is also constantly correcting its guesses by interpolating to each authoritative update as it is received. Unfortunately, when the player is directly applying control inputs to an object, even these two tactics aren't enough, so we introduce a third.

Client Side Prediction

Fortunately, when a client is controlling an object, the client is also the very first part of the total networked simulation that is aware of the user move inputs. Therefore, we can allow the client to directly utilize the inputs the player has made to more accurately predict the extrapolated position of the control object. As long as there aren't forces on the server that may affect our control object that we aren't aware of yet at the controlling client, we can directly apply user inputs to our client side simulation of the control object--in effect getting ahead of the server since the moves haven't even been delivered to the server yet! Of course, the server is still authoritative, so we must temper our predicted states with actual authoritative updates the server will send us once it performs the move in the server simulation, but client side prediction gives us a very powerful tool for the best "feel of control" to the player.
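A widely used way to implement this is to keep every move the server has not yet confirmed and, whenever an authoritative update arrives, restart from the server's state and replay the remaining moves on top of it. The sketch below shows that pattern in one dimension; PredictedObject, Move and the sequence numbering are assumptions made for the example, and a real implementation would typically smooth the correction rather than apply it in one step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// One player input, reduced to a 1D displacement for brevity.
struct Move {
    std::uint32_t sequence;
    float dx;
};

// Hypothetical client-side copy of the control object.
class PredictedObject {
public:
    // Apply local input immediately and remember it until the server
    // confirms that it has processed it.
    void applyLocalMove(float dx) {
        mPending.push_back(Move{mNextSequence++, dx});
        mPosition += dx;
    }

    // The server reports its authoritative position after processing all
    // moves up to and including lastProcessed.
    void onServerUpdate(float serverPosition, std::uint32_t lastProcessed) {
        while (!mPending.empty() && mPending.front().sequence <= lastProcessed)
            mPending.pop_front();            // confirmed, no longer needed
        mPosition = serverPosition;          // the server is authoritative
        for (const Move& m : mPending)
            mPosition += m.dx;               // replay unconfirmed input
    }

    float position() const { return mPosition; }
    std::size_t pendingMoves() const { return mPending.size(); }

private:
    float mPosition = 0.0f;
    std::uint32_t mNextSequence = 0;
    std::deque<Move> mPending;
};
```

If the client applies three moves of 1.0 and the server then reports position 0.5 after the first move (because some server-side force pushed the object back), the client ends up at 0.5 + 1.0 + 1.0 = 2.5 with two moves still pending.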

Utilization from a developer's perspective

Control Object--the "Player"

...

Ghosted Objects--what the "Player" sees

The primary method for creating objects that a player can interact with is to make the object Ghostable. A Ghostable object is one where any state or data changes (that are networked via the object's packUpdate() and unpackUpdate() methods) will be transmitted via the Ghosting System to every client that has this object within Scope.

The important things to keep in mind when determining if an object should be ghosted, and how it should be ghosted are:

Is this object one that any and all clients should be able to observe?

For example, Player objects are ghosted, but a player's inventory is not--the inventory is only important to that particular player, and no other client should be aware of the state of the inventory.

Is this object one that a client should only see when nearby?

Most objects only need to be seen when they are near the client's control object within the world. However, there are objects that should always be updated to all clients regardless of where the control object is--for example, the terrain.

In a standard Torque project, you will not have to decide the "ghostability" of every object: the stock inheritable classes already define it, both by ancestry (inheriting from NetObject) and by making the mNetFlags.set(Ghostable); call in the default constructor of the base class. However, if you need to override the default, or are creating a completely new networkable object in your project, you will need to follow the guidelines for making a ghostable object.
