Persistence

Learn how to manage persistent worlds with 24/7 always online deployments on Private Fleets.

Many genres (MMOs, Sandboxes, Social Games) leverage persistent worlds to let players:

  • meet and socialize with new friends and nurture organic player communities,

  • explore a living open world filled with user-generated content placed by players,

  • engage in epic raid battles lasting hours with large groups or entire guilds.

Explore strategies to provide the best possible player experience, keep cost under control, and remove player frustration due to outages or rollbacks. Enhance the traditional server model by bringing the advantages of edge computing packaged for easy use by game developers.

✔️ Preparation

To enable persistent, uninterrupted 24/7 always online deployments:

If you need help, please reach out to us over Discord. For live games support see our ticketing system.

🔑 Server Ownership

Explore pros and cons of modern and traditional ownership models with an edge computing twist.

Studio Hosting

Server hosting is traditionally managed by the studio, covering cost of hosting from game revenue.

👍 Advantages

  • transparent product pricing - cost of hosting is covered by license/subscription of player,

  • strong client/server compatibility with loose coupling of clients, services, and scaling,

  • more resilient to cheating and reverse engineering due to closed source nature of servers.

👎 Disadvantages

  • community modding support is limited to ensure server integrity and stability.

Community Servers

Let your players host and fund their own servers and remove the need for third-party rental services. Funnel revenue through your studio instead of a third party that lacks insight into the quality of the end-user experience.

👍 Advantages

  • enhanced modding support through curated list of modded Apps and Versions,

  • improved player feedback loop due to closer collaboration with community,

  • reduced financial risk due to players covering cost of hosting.

👎 Disadvantages

  • more operations for studio - moderating player requests and collecting payments,

  • weaker client/server compatibility due to increased number of modded versions,

  • prone to cheaters due to distributed codebase and possibility of reverse engineering.

🥛 Capacity & Scaling

Learn advanced techniques to optimize server availability, hosting cost, and quality of service.

Capacity

Deployments don't track or manage active player connections after you start them (see 1. Start a Deployment). This gives you absolute control and freedom to implement any design.

Implement capacity management to ensure your servers:

  • maximize cost savings - benchmark and utilize server resources efficiently,

  • provide smooth gameplay - prevent overloading servers with too many concurrent players,

  • prevent bad reviews due to crashes - catch and handle unexpected exceptions.

To ensure efficient server capacity management:

  • release player slots if players matched to game server don't connect within a few seconds,

  • frequently send a minimal heartbeat message from clients to server to keep track of activity,

  • disconnect clients and release player slots if no activity is detected for several seconds,

  • prevent players from being added to servers with full capacity and no available player slots.
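
The capacity rules above can be sketched as a small slot tracker running inside the game server. This is a minimal illustration, not part of any SDK: the class name `SlotTracker`, its methods, and the timeout values are all assumptions chosen for the example, and timestamps are passed in explicitly so the logic is easy to test.

```python
from dataclasses import dataclass, field


@dataclass
class SlotTracker:
    """Tracks player slots on one game server (all names are hypothetical)."""
    max_players: int
    connect_timeout: float = 10.0    # seconds a matched player has to connect
    heartbeat_timeout: float = 15.0  # seconds of silence before disconnecting
    reserved: dict = field(default_factory=dict)   # player_id -> reservation time
    connected: dict = field(default_factory=dict)  # player_id -> last heartbeat time

    def free_slots(self) -> int:
        return self.max_players - len(self.reserved) - len(self.connected)

    def reserve(self, player_id: str, now: float) -> bool:
        """Reserve a slot for a matched player; refuse when the server is full."""
        if player_id in self.reserved or player_id in self.connected:
            return True
        if self.free_slots() <= 0:
            return False
        self.reserved[player_id] = now
        return True

    def connect(self, player_id: str, now: float) -> bool:
        """Promote a reservation to an active connection."""
        if self.reserved.pop(player_id, None) is None:
            return False
        self.connected[player_id] = now
        return True

    def heartbeat(self, player_id: str, now: float) -> None:
        """Record activity for a connected player."""
        if player_id in self.connected:
            self.connected[player_id] = now

    def sweep(self, now: float) -> list:
        """Release expired reservations and idle connections; return released ids."""
        expired = [p for p, t in self.reserved.items() if now - t > self.connect_timeout]
        idle = [p for p, t in self.connected.items() if now - t > self.heartbeat_timeout]
        for p in expired:
            del self.reserved[p]
        for p in idle:
            del self.connected[p]
        return expired + idle
```

A server loop would call `sweep` on a timer (for example once per second) and close the network connection for every returned player id. Counting reservations against capacity is a deliberate choice here: it prevents two matched players from racing for the last slot.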

Scale Up

Scaling persistent servers doesn't require "guesstimating" regional traffic or server cost. Reserve Private Fleets capacity and automatically overflow to cloud during unexpected peaks.

Implement server scaling strategies to:

  • enable large scale hosting while carefully protecting against abuse,

  • minimize wasted server cost due to empty standby servers,

  • prevent long queue times by responding to increased player demand quickly.

Auto-scaling Reference Architecture

Integration Key Points

  1. Clients integrate Scaling Authority - Matchmaking, Server Browser, or a custom solution.

    1. Discover other players, reserve capacity in running servers, or request new servers if needed.

  2. Scaling Authority assigns running servers or starts new servers to serve players.

  3. Server notifies Scaling Authority in real time about server start/stop and player connection changes.

    1. Scaling Authority deletes (expires) stale records of unresponsive (crashed) servers.

  4. Clients connect and establish game sessions with Server directly, proceed to gameplay.

We strongly recommend scaling based on number of connections instead of physical load (CPU & RAM), since momentary fluctuations in physical load may result in unpredictable availability.
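
The integration points above can be illustrated with a toy Scaling Authority that scales on connection counts. This is a sketch under stated assumptions: `ScalingAuthority`, `ServerRecord`, and the injected `start_server` callable are hypothetical names, the registry is held in memory, and a real implementation would call your deployment API and use durable storage.

```python
from dataclasses import dataclass


@dataclass
class ServerRecord:
    server_id: str
    max_players: int
    players: int = 0
    last_seen: float = 0.0


class ScalingAuthority:
    """Hypothetical in-memory registry; start_server is injected by the caller."""

    def __init__(self, start_server, max_players=50, stale_after=30.0):
        self.start_server = start_server  # callable returning a new server id
        self.max_players = max_players
        self.stale_after = stale_after    # seconds without a report before expiry
        self.servers = {}

    def report(self, server_id, players, now):
        """Called by servers on start and on every player connection change."""
        rec = self.servers.setdefault(
            server_id, ServerRecord(server_id, self.max_players)
        )
        rec.players = players
        rec.last_seen = now

    def remove(self, server_id):
        """Called by servers on graceful stop."""
        self.servers.pop(server_id, None)

    def expire_stale(self, now):
        """Delete records of servers that stopped reporting (likely crashed)."""
        stale = [s for s, r in self.servers.items()
                 if now - r.last_seen > self.stale_after]
        for s in stale:
            del self.servers[s]
        return stale

    def assign(self, now):
        """Return a server id with a free slot, starting a new server if needed."""
        self.expire_stale(now)
        open_servers = [r for r in self.servers.values()
                        if r.players < r.max_players]
        if open_servers:
            # Fill the fullest server first to keep the number of servers low.
            rec = max(open_servers, key=lambda r: r.players)
        else:
            server_id = self.start_server()
            rec = ServerRecord(server_id, self.max_players, last_seen=now)
            self.servers[server_id] = rec
        rec.players += 1  # count the reservation until the server reports again
        return rec.server_id
```

Filling the fullest server first is one possible packing policy; games that prefer evenly spread populations could pick the emptiest open server instead.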

Scale Down

Efficient scale-down policies are key to optimizing cost, but shutting down servers without caution may impact player experience negatively. Consider these factors and test changes before releasing:

Is your detection of player activity / disconnection reliable?

  • Does absence of input reliably indicate player inactivity? Players often use bots, macros, and other techniques to fake activity and maintain active connection to avoid queue times.

  • Are there any actions that active players take often and that are hard to fake?

  • Is using bots or macros an issue or a feature for your persistent servers?

Is shutting off servers easily and quickly reversible (scale back up)?

  • After reaching 3. Deployment Ready, your server may require additional time for engine initialization and State Management (restoration of state). Do you incur any additional costs for compute or data transfer with game services? Does this wait time impact player experience?

  • Can you hide server loading with a loading scene, mini-game, a lobby, or through other means?

Are players bound to specific server instances or can they migrate easily?

  • How does connecting to a different server influence player's account, purchase history, social experience, progression, inventory, and other gameplay aspects?

  • Review your Recovery Objectives and ensure critical data isn't lost.

  • Implement automated methods or player tools for restoring critical data.

  • Provide human support and communicate with your community about outages and issues.
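
One cautious way to act on the scale-down questions above is to stop only servers that have stayed empty for a grace period, while always keeping a minimum number running. The sketch below assumes this policy. The function name, the `empty_grace` and `min_running` parameters, and the input shape are hypothetical and should be tuned and tested against your own player activity detection.

```python
def servers_to_stop(servers, now, empty_grace=300.0, min_running=1):
    """Pick servers that are safe to stop.

    servers: dict of server_id -> (player_count, empty_since or None),
             where empty_since is the time the last player left.
    """
    # Only servers that have been empty for at least the grace period qualify.
    candidates = sorted(
        (empty_since, sid)
        for sid, (players, empty_since) in servers.items()
        if players == 0
        and empty_since is not None
        and now - empty_since >= empty_grace
    )
    # Never drop below the minimum number of running servers.
    allowed = max(0, len(servers) - min_running)
    # Stop the longest-empty servers first.
    return [sid for _, sid in candidates[:allowed]]
```

The grace period trades cost for responsiveness: a longer value keeps warm servers around for returning players, while a shorter value saves hosting cost but may force players to wait for a restart.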

🔎 Discoverability

To find active servers accepting new players, implement one or more discovery methods:

💭 Configuration & State

Integrate services to define initial server requirements and manage player and server state.

Configuration Management

Configuration refers to the initial data passed to your server during deployment:

Configuration is immutable - it's read once after starting your server and doesn't change later on.
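
A read-once, immutable configuration can be modeled as a frozen object built at startup. The sketch below assumes configuration arrives as environment variables. The variable names (`GAME_REGION`, `GAME_MAX_PLAYERS`, `GAME_MODE`) and the defaults are made up for illustration and are not names defined by the platform.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerConfig:
    """Immutable server configuration (field names are hypothetical)."""
    region: str
    max_players: int
    game_mode: str


def load_config(env=None) -> ServerConfig:
    """Read configuration once at startup; env defaults to os.environ."""
    env = os.environ if env is None else env
    return ServerConfig(
        region=env.get("GAME_REGION", "unknown"),
        max_players=int(env.get("GAME_MAX_PLAYERS", "50")),
        game_mode=env.get("GAME_MODE", "sandbox"),
    )
```

Because the dataclass is frozen, any later attempt to assign to a field raises an error, which makes accidental runtime changes to configuration visible during development.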

State Management

State refers to data describing the result of a series of previous player actions and server events:

State data changes frequently. Clients are selectively updated on relevant information by server authority.

Game objects typically designate an owner who controls them. The owner can be either the server or a player.

Server Owned Objects

Server owned objects can be manipulated only by server. Connected players have limited read access to server owned objects. Server owned objects are usually not shared with other servers.

Player Owned Objects

Player owned objects can be manipulated both by players and server. Assigning ownership of persistent objects to players makes migration to other servers easier later on.

Prevent cheating by validating changes with server authority. Authority and ownership can be separate.
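
The separation of ownership and authority can be sketched as a server-side check that runs on every change request. Everything below is illustrative: `GameObject`, `apply_move`, and the `max_step` movement rule are hypothetical names and rules, standing in for whatever validation your game needs.

```python
from dataclasses import dataclass

SERVER = "server"


@dataclass
class GameObject:
    object_id: str
    owner: str                     # SERVER or a player id
    position: tuple = (0.0, 0.0)


def apply_move(obj, requester, new_position, max_step=5.0):
    """Server-side validation of a move request; returns True when applied."""
    # Ownership check: the server may change anything, players only what they own.
    if requester != SERVER and requester != obj.owner:
        return False
    # Authority check: even owners cannot move farther than the rules allow.
    dx = new_position[0] - obj.position[0]
    dy = new_position[1] - obj.position[1]
    if requester != SERVER and (dx * dx + dy * dy) ** 0.5 > max_step:
        return False
    obj.position = new_position
    return True
```

In this sketch a player owning an object decides who may request changes, while the server keeps authority over whether a requested change is valid.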

Recovery Objectives

In case of issues, some categories of data may be more sensitive to data loss, for example:

  • account, subscription, purchase, and microtransaction data - critical,

  • progression, achievement, leaderboards, and inventory data - important,

  • cheat detection, moderation, performance, and error tracking data - important,

  • player behavior, social, chat data - low importance.

We highly recommend discussing the following amongst your team:

  • categories of data handled in your game clients and servers,

  • importance and sensitivity of each category for your business and players,

  • Recovery Point Objective (RPO) - acceptable amount of data loss before serious harm occurs,

  • Recovery Time Objective (RTO) - acceptable amount of downtime before serious harm occurs.
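
Once your team agrees on an RPO per data category, it can drive how often each category is persisted. The sketch below assumes RPO is expressed as a maximum unsaved time window in seconds. The category names and values are placeholders for illustration, not recommendations.

```python
# Hypothetical RPO targets in seconds per data category; tune for your game.
RPO_SECONDS = {
    "purchases": 0,      # critical: persist immediately
    "progression": 60,   # important
    "moderation": 60,    # important
    "chat": 900,         # low importance
}


def categories_due(last_saved, now):
    """Return categories whose unsaved window has reached their RPO.

    last_saved: dict of category -> time of the last successful save.
    Categories that were never saved are always due.
    """
    return sorted(
        cat
        for cat, rpo in RPO_SECONDS.items()
        if now - last_saved.get(cat, float("-inf")) >= rpo
    )
```

A server could call `categories_due` on a timer and persist only the returned categories, so critical data is written far more often than low-importance data.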

👀 Observability

Long running persistent servers bring new observability challenges, specifically detecting anomalies in monitoring, logging, and bug tracking.

We strongly recommend implementing alerts for server restarts to gain more visibility over issues.

Our Endpoint Storage log integration only transfers logs after 5. Deployment Stopped. Add custom logs and bug tracking (such as Sentry) to troubleshoot partial failures and bugs.
