In the early days of game design, before high-quality commercial game engines like Unity and Unreal Engine were as widely and cheaply available as they are today, each new studio would write an engine for the types of games they were going to make with it. This allowed a high degree of code reuse between projects and made things like cross-platform enhancements easier to implement. The idea of commercial game engines is to act as an environment of tools where all types of games can be created. Both in theory and in practice, these generalist game engines have succeeded in the market, especially due to frequent and substantial updates to each of the different code libraries that make up the engine. Specialized game engines are still common, especially when an idea or core mechanic hasn’t been well explored in games, or when a next-generation game is being optimized to a bleeding-edge standard and existing engines have been evaluated and found not to meet requirements.
In all software design, there is a balance between developing a specialized solution and developing a general solution. Popular commercial game engines may seem to disprove that assertion, but it’s not the design of those engines that makes them general. It’s the idea that, with large amounts of code, many implementations of solutions to common problems, and the ability to extend those solutions, you can solve any of the typical problems found in the games industry. It’s not just the workbench-like approach that makes these engines prevalent, though; it’s that you can make games quickly and easily compared with a bespoke engine. One idea central to the philosophy of commercial engines is that a widely accepted solution to a common problem is usually good enough, and of course, letting someone else build the engine lets game developers focus on their game.
Whether composing a game engine from third-party components, establishing a workbench of tools for your asset pipeline, or evaluating commercial engines, there are a lot of similar considerations. Dozens of hours of research or training time often go into this step of a game project, and close examination of your project’s requirements up front usually pays off.
I’ll break down the core systems of a game engine and explain how the attributes of each system affect the resources it uses and the tools that must be created when an engine is designed. Entire books have been written on each of the topics I’ll present here; this overview is intended to be high-level and to summarize the theory of game engine design. There are many ways to write these systems and tools in software, and most implementations are not publicly available.
Game Engine Systems
The different systems in a game engine are broken down by the task they need to perform at a high level. Broadly categorized, those are input, rendering, physics, audio, and networking; often scripting, logic, and AI systems will be separate and present as well. The game itself is built by providing the correct configurations when initializing these systems and the correct resources when running them, especially the scripting or other logic system. It’s common for files and assets to be managed through a specialized system such as a resource manager if the other systems don’t load resources on their own. I won’t describe the resource management system here; instead, I’ll describe the resources each of the other systems requires. It’s also common to separate UI into its own system: rather than being built strictly from input, rendering, audio, and scripting calls, UI is usually built on top of those systems as a system in its own right. I won’t describe UI as a separate system; I’ll instead assume the appropriate functions exist within those other systems and that the UI implementation is designed on a per-game basis. There are other ways to slice this list from the perspective of other paradigms of software design, but this is how I’ll describe them.
The input system is responsible for accepting input from devices like controllers, keyboards, mice, and joysticks, and mapping it to the respective actions provided by the scripting system. Commonly an input system behaves as a message scheduling process: inputs are queued as events and processed from that queue rather than handled the moment they arrive. This opens many doors, such as AI built in the scripting system by simply queueing events just as if a controller had produced them. That abstraction between player-controlled actor and game-controlled actor can be useful for many types of games, such as those implementing squad mechanics or others where player control switches between different heroes throughout the game. It can also be useful for input systems to publish the controller type each player is using so that, for example, the correct button images can be shown in the UI.
- A configuration file of key and button action mappings for each input type, where actions are named for the given game context. Those action names may be listed in an input system configuration file as well.
- (Potentially) user profile configurations loaded by the network or scripting systems for modern controllers.
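The queue-based design described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `ACTION_MAP` dictionary stands in for the mapping configuration file, and the names `InputSystem`, `ActionEvent`, `on_raw_input`, and `post_action` are hypothetical, not taken from any particular engine.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical action map, standing in for the mapping configuration file.
ACTION_MAP = {
    "keyboard": {"space": "jump", "a": "move_left", "d": "move_right"},
    "gamepad": {"button_south": "jump", "dpad_left": "move_left"},
}


@dataclass(frozen=True)
class ActionEvent:
    actor_id: int
    action: str


class InputSystem:
    def __init__(self, action_map):
        self.action_map = action_map
        self.queue = deque()

    def on_raw_input(self, actor_id, device, control):
        """Translate a raw device control into a named action and queue it."""
        action = self.action_map.get(device, {}).get(control)
        if action is not None:  # unmapped controls are ignored
            self.queue.append(ActionEvent(actor_id, action))

    def post_action(self, actor_id, action):
        """Queue a named action directly, as an AI script might."""
        self.queue.append(ActionEvent(actor_id, action))

    def drain(self):
        """Yield queued events in arrival order, emptying the queue."""
        while self.queue:
            yield self.queue.popleft()


inputs = InputSystem(ACTION_MAP)
inputs.on_raw_input(actor_id=1, device="keyboard", control="space")  # player
inputs.post_action(actor_id=2, action="jump")                        # AI actor
inputs.on_raw_input(actor_id=1, device="keyboard", control="F13")    # unmapped
events = list(inputs.drain())
```

Because the player and the AI both end up as `ActionEvent` entries in the same queue, whatever consumes the queue does not need to know which kind of actor produced an event.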
The rendering system is responsible for loading and placing textures, shader programs, and point data for 3D meshes into GPU buffers and using them to compose all visual aspects of a corresponding view. A pure software view won’t have 3D hardware acceleration, in which case only textures and raw screen coordinates are available. Sometimes the view is a separate object, so there may be several; otherwise the rendering system itself is responsible for splitting up a single view for split-screen and multiple render targets. In addition to the primitives that the rendering hardware uses to construct geometry from points and triangles, the rendering system usually adds geometric primitives such as planes, spheres, cones, and cubes. These primitives are useful for debugging and for Constructive Solid Geometry (CSG) composition, and they are typically available from a physics system anyway, as those primitives are required for broad-phase collision optimizations.
- The most important resource in a modern game rendering system is a shader program. An entire game’s visual feature set could be created in a single, sufficiently sophisticated shader program. Usually, several shader programs will be written for a game, and often several for a single object or effect. Shader programs should be compiled once, as needed, and reused where possible. Loading and binding many shader programs can be expensive, so a common optimization is to put as much functionality in a single shader as possible, but the usefulness of this approach varies with the complexity of the shader, and hardware optimizations are likely to make this technique obsolete in the near future.
- Textures are any 2D color data that can be sent to a GPU and used by a shader program. In a shader program, textures are usually referred to as surfaces, because their color and transparency information is used for creating light and thus the entire visual aspect of surfaces on 3D models and 2D primitives.
- Materials are a way of binding together and composing a set of textures and shader programs that together create the desired effect, and are often a separate file.
- A mesh is any 3D point data and is rarely a separate file from a model. When meshes are separate from models, the model may refer to those meshes to create its model data. A mesh stored as a separate file typically does not have a reference to any materials embedded within it.
- Models are 3D vector data files and are usually composed of several meshes and materials. A model usually embeds file links or references to meshes, textures, shader programs, and materials, but it may instead be the sole file containing the data for all of those resource types, acting more like a scene or archive of them.
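The advice above about compiling shader programs once and reusing them can be sketched as a small cache keyed by the shader source. This is a minimal sketch, not a real graphics API: `ShaderCache` is a hypothetical name, and `fake_compile` is a stand-in for whatever compile-and-link call the rendering backend actually provides.

```python
import hashlib


class ShaderCache:
    """Compile each distinct (vertex, fragment) source pair at most once."""

    def __init__(self, compile_fn):
        self._compile_fn = compile_fn  # backend-specific compile step (assumed)
        self._programs = {}
        self.compile_count = 0

    def get(self, vertex_src, fragment_src):
        # Key the cache on a hash of both sources.
        joined = (vertex_src + "\0" + fragment_src).encode("utf-8")
        key = hashlib.sha256(joined).hexdigest()
        program = self._programs.get(key)
        if program is None:
            program = self._compile_fn(vertex_src, fragment_src)
            self._programs[key] = program
            self.compile_count += 1
        return program


def fake_compile(vertex_src, fragment_src):
    # Placeholder for a real compile-and-link call; returns a dummy handle.
    return {"vs_len": len(vertex_src), "fs_len": len(fragment_src)}


cache = ShaderCache(fake_compile)
unlit_a = cache.get("void main() {}", "void main() { }")
unlit_b = cache.get("void main() {}", "void main() { }")          # cache hit
lit = cache.get("void main() {}", "void main() { /* lit */ }")    # new program
```

A material in this scheme would hold references to textures plus a program obtained from the cache, so two materials that share shader source also share the compiled program.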
The physics system is usually a black-box style constraint solver with primitives and functions which calculate solutions to the most fundamental physics problems, such as collision detection and collision response. We use these constraint-solving functions in a series of passes over shape data or point cloud information to integrate each successive step of a physics simulation with respect to time. Integrating with respect to time repositions the points in proportion to the desired step along a curve, which in this case is position as a function of time; position is obtained by integrating velocity, the first derivative of displacement. Evaluating more intermediate points and fitting a more accurate approximation of the curve produces a more stable solution, but it takes more computation to reach that solution, so each time step takes longer to simulate. Two masses of point cloud information moved along their curves may collide, according to the physics system’s interpretation of the information. Where the point clouds are intended to bound a solid object, a function must be used to test for collisions; it returns either a truth value or an object containing penetration information, such as a contact manifold or simply a penetration vector.
- Physics systems rarely have many different types of resources associated with them, but it’s very common for materials (see the rendering system), or a similar separate file, to carry the information the physics system uses to specify differences in density, friction, restitution, and other physical properties.
- Physics systems are usually written as middleware or a third-party library, and most of their functionality takes place by passing point cloud and material information to functions known as integration functions, then using the partially integrated results to test for collisions and respond with another set of functions. The result of the first pass, the integration, is provided to a constraint solver, which enforces constraints such as movement along a path, distance from some object, or the extent to which a spring can stretch. Collisions can also be seen as a sort of spatial constraint, where one point cloud is constrained to not penetrate another. When a collision has been determined to have occurred, response information is prepared and passed to a collision resolution function, which will either prevent the collision ahead of time by integrating several steps ahead or correct it in a later time step.
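The integrate, test, and respond passes described above can be illustrated with a deliberately tiny example. This is a sketch under simplifying assumptions, not how a production physics library works: bodies are 2D circles rather than point clouds, integration uses semi-implicit Euler, and the response only pushes the bodies apart without changing their velocities. The names `Body`, `integrate`, `penetration`, and `resolve` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Body:
    x: float
    y: float
    vx: float
    vy: float
    radius: float


def integrate(body, ax, ay, dt):
    """Semi-implicit Euler: update velocity first, then position from it."""
    body.vx += ax * dt
    body.vy += ay * dt
    body.x += body.vx * dt
    body.y += body.vy * dt


def penetration(a, b):
    """Return the vector that would push `a` out of `b`, or None if apart."""
    dx, dy = a.x - b.x, a.y - b.y
    dist = math.hypot(dx, dy)
    depth = a.radius + b.radius - dist
    if depth <= 0.0:
        return None
    if dist == 0.0:
        return (depth, 0.0)  # coincident centers: pick an arbitrary direction
    return (dx / dist * depth, dy / dist * depth)


def resolve(a, b):
    """Positional correction only: move each body half the penetration."""
    push = penetration(a, b)
    if push is None:
        return False
    a.x += push[0] / 2
    a.y += push[1] / 2
    b.x -= push[0] / 2
    b.y -= push[1] / 2
    return True


# Two circles approaching each other head-on along the x axis.
a = Body(x=0.0, y=0.0, vx=1.0, vy=0.0, radius=0.5)
b = Body(x=2.0, y=0.0, vx=-1.0, vy=0.0, radius=0.5)
collided_at = None
for step in range(1, 4):
    integrate(a, 0.0, 0.0, dt=0.25)
    integrate(b, 0.0, 0.0, dt=0.25)
    if resolve(a, b) and collided_at is None:
        collided_at = step
```

In this run the circles only touch after the second step and overlap after the third, at which point `resolve` moves them back to the positions where they just touch. A fuller response would also apply impulses using material properties such as restitution and friction.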
The audio system is conceptually simple. The goal is to play sounds and music through the audio hardware on the desired device. This also means that the resources are conceptually simple: usually, they’re just audio files. Sometimes, however, more complex layered audio resources are present, or composition files describing how associated files are to be layered; either can provide values to generators for procedural sample generation or procedural track production. Until recently it would have been impractical for most kinds of procedural audio to be generated live during simulation time. In recent years, however, several tools and games have used generative sound, either live or by exporting the results as a static sound file to be used as usual. In practice, audio hardware computes a series of different data lines simultaneously, usually referred to as channels. Traditional software access to these channels is by way of queueing and activating them individually, or by capping how many sounds can play simultaneously at what the hardware can handle. The approach used for computing the final signal varies from hardware to hardware; mixing is typically done by summing the samples of the active channels, while the Fourier transform is widely used where frequency-domain processing such as filtering, equalization, or analysis is needed.
- Sound files make up the most common kind of resource used by the audio system; their processing and loading are too complicated to cover within the scope of this article. When audio signal processing needs to work with the frequency content of a sound, for example for filtering or analysis, the Fourier transform is the usual tool.
- Composition files associate other audio files with a generative track or a hierarchical set of rules on how the audio is to be generated. The generation of music composed with this kind of information is typically similar to the kind of procedural generation performed by an L-system grammar.
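One simple way to combine several playing channels in software is to add their samples together and clamp the result, with a cap on how many channels are mixed at once. The sketch below assumes samples are floats in the range -1.0 to 1.0; the function name `mix` and the `max_channels` parameter are hypothetical, and real mixers usually add per-channel volume and smarter limiting than a hard clamp.

```python
def mix(channels, max_channels=4):
    """Sum up to `max_channels` sample lists and clamp to [-1.0, 1.0]."""
    active = channels[:max_channels]  # extra channels are simply not played
    length = max((len(c) for c in active), default=0)
    out = []
    for i in range(length):
        # Channels that have already ended contribute silence.
        total = sum(c[i] for c in active if i < len(c))
        out.append(max(-1.0, min(1.0, total)))
    return out


mixed = mix([
    [0.5, 0.5, -0.25],
    [0.25, 0.75],
    [0.0, 0.0, -1.0],
])
```

The second and third output samples exceed the valid range before clamping, which is why hard clamping is only a placeholder here: it distorts loud passages.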
The network system is not simple, and its structure depends on the entire data model of the engine at hand. A unifying approach such as data-driven design becomes extremely useful in networking systems because it can mean little to no data manipulation is required for entities shared over the network. The network system typically separates host and client communication so that various kinds of structures can be developed using the same functionality; this is especially common because most network systems are built on middleware or third-party libraries. The host system is where all of the information for every entity in the simulation is stored, and it’s also where almost all of the data processing by the physics and scripting systems takes place. This structure greatly simplifies how entities that conceptually exist on different machines can interact, and it simplifies security, since clients can’t easily lie about physics information their device did not compute. For these security reasons, usually only input system signals are sent by the clients to the host. The host then uses each client’s input to feed the physics system and issues the new physics information to the clients, whose systems accept it as authoritative when online. There are several client-side prediction techniques used to estimate directions and positions under high latency or low throughput; one of the most common is known as dead reckoning, which primarily affects the rendering system but may also affect the audio system in some engines.
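Dead reckoning in its simplest form extrapolates an entity’s position from the last state the host sent, assuming the velocity has not changed since. The sketch below shows only that linear extrapolation; the `Snapshot` type and `dead_reckon` function are hypothetical names, and real implementations usually also blend smoothly toward the next authoritative update when it arrives.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Last authoritative state received from the host for one entity."""
    time: float
    x: float
    y: float
    vx: float
    vy: float


def dead_reckon(snapshot, now):
    """Estimate the current position by extrapolating at constant velocity."""
    dt = max(0.0, now - snapshot.time)  # never extrapolate backwards
    return (snapshot.x + snapshot.vx * dt, snapshot.y + snapshot.vy * dt)


last = Snapshot(time=10.0, x=4.0, y=2.0, vx=2.0, vy=-1.0)
predicted = dead_reckon(last, now=10.25)
```

The client would draw the entity at `predicted` until a newer snapshot replaces `last`.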
The scripting system is an embedded programming language: either a compiler or, in the case of an interpreted language, a virtual machine. The most important layer on top of the virtual machine is an adapter to each of the other engine systems, allowing the scripting system alone to operate all of the rest of the functionality within the engine. This provides a unifying formal semantics and way of reasoning about a project created using the engine. Another common reason people choose to implement or embed scripting systems in their engine is to be able to change the functionality of a project while it’s running, without having to recompile. This is still a valid reason today, but improvements to compilation techniques and modern language design, such as that seen in Jai or Zig, may largely eliminate the need for a scripting language. There are languages that already support this sort of hot-swappable, change-code-while-running approach, though their use in performance-critical software such as games is practically absent.
- The only resource consumed by a scripting system is script files, which are usually text files but can be precompiled depending on the language interpreting them.
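The adapter layer and the reload-while-running idea can be sketched with Python’s built-in `exec` playing the role of the embedded language. Everything here is an assumption made for illustration: `ScriptHost`, `load`, `call`, and the exposed `play_sound` function are hypothetical, and `exec` is not a sandbox, so a real engine would embed a dedicated scripting runtime instead.

```python
class ScriptHost:
    """Runs script source with access only to an explicit engine API."""

    def __init__(self, api):
        self._api = dict(api)  # name -> engine function exposed to scripts
        self._namespace = {}

    def load(self, source):
        """(Re)load script source; later loads replace earlier definitions."""
        namespace = dict(self._api)
        exec(source, namespace)  # NOTE: not a security boundary
        self._namespace = namespace

    def call(self, name, *args):
        """Invoke a function defined by the currently loaded script."""
        return self._namespace[name](*args)


log = []
host = ScriptHost({"play_sound": lambda name: log.append(("sound", name))})

host.load(
    "def on_jump(actor):\n"
    "    play_sound('jump')\n"
    "    return actor + ' jumped'\n"
)
first = host.call("on_jump", "hero")

# Swap in new behavior without restarting the host program.
host.load(
    "def on_jump(actor):\n"
    "    play_sound('boing')\n"
    "    return actor + ' bounced'\n"
)
second = host.call("on_jump", "hero")
```

The dictionary passed to `ScriptHost` is the adapter: scripts can reach only the engine functions listed there, and reloading the source changes behavior while the host keeps running.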
The structure of a game engine can vary wildly depending mostly on the data model chosen. This is because data is the very currency of the functions which make up a simulation, and so its structure dictates the type of functions that operate on it, and so also drives style decisions around their processing.
This isn’t comprehensive coverage of all the ways an engine can be designed, just a good overview of the different systems that must be developed in order to have a decent simulation or game engine working. Hopefully, this article helps to clarify some of the questions you might have about simulation and game engine design!