All game developers, from indie to lifetime professional, will at some point benefit from code that someone else wrote. Whether the goal is to understand a new feature, fix a previously unsolvable problem, or save time rather than write everything from the ground up, permissively licensed sample code is an invaluable tool. Intel provides a wealth of game sample code at the Game Developer section of the Intel® Developer Zone.
Figure 1: Intel provides game sample code at https://software.intel.com/en-us/gamedev/code-samples
Over the past few years, Intel has worked with many game developers to optimize the performance of their games on Intel® hardware. Our engagements often produce insights or functions that should be shared with the world in the form of sample code. Sometimes we create the samples based on requirements from developers to perfectly fit their needs. In the past few years, samples we’ve created have been adapted for use in games published by Blizzard and Codemasters – specifically Adaptive Volumetric Shadow Maps (AVSM), Conservative Morphological Anti-Aliasing (CMAA), and Software Occlusion Culling.
AVSM Gives GRID* 2 a Boost
Codemasters and Intel engineers have been working together on titles for several years. For GRID 2, Codemasters was looking for a way to provide more visual oomph in their game on Intel hardware. They brainstormed with Intel engineers and decided to pursue using Intel’s PixelSync feature for more realistic smoke. This was considered a high-visibility effect for a racing game, where the user can donut the car and create large smoke trails. The starting point was the work of Intel engineer Marco Salvi, who created an AVSM implementation using DirectX 11 and presented it at Siggraph 2010. The Intel sample code used atomic operations to ensure Order-Independent Transparency (OIT). To adapt this work to GRID 2, Codemasters and Intel engineers worked together to modify the algorithm using PixelSync so it could run in a bounded amount of memory. This modified, bounded-memory version of AVSM was also released as an Intel sample.
Integration happened over 14 days with Intel engineers on-site at Codemasters. The initial implementation was considered complete when Codemasters had a working test level that used the game’s own particle effects to generate the AVSM textures and apply self-shadowing. Once that was complete, Codemasters engineers expanded the system to support animated textures to better match the look and feel of the game. At the same time, the artists at Codemasters designed particle effects that complemented the new technique, using larger numbers of smaller particles instead of large billboards showing multiple smoke particles. Codemasters engineers discovered that the improved lighting drew attention to sorting issues in the additive-blending particle system, requiring rework to create a more robust CPU particle sort.
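The article does not describe how Codemasters reworked their sort, so the following is only a generic sketch of a deterministic back-to-front CPU particle sort. The Particle record, the function names, and the tie-break on particle id are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Particle:
    # Hypothetical particle record; a real engine would carry far more state.
    ident: int
    position: tuple  # world-space (x, y, z)


def view_depth(position, camera_pos, camera_forward):
    """Signed distance from the camera along its unit-length forward axis."""
    return sum((p - c) * f for p, c, f in zip(position, camera_pos, camera_forward))


def sort_back_to_front(particles, camera_pos, camera_forward):
    """Return particles ordered farthest-first.

    Depth ties are broken by particle id so the order is the same every
    frame, which avoids flicker between particles at equal depth.
    """
    return sorted(
        particles,
        key=lambda p: (-view_depth(p.position, camera_pos, camera_forward), p.ident),
    )


particles = [
    Particle(0, (0.0, 0.0, 5.0)),
    Particle(1, (0.0, 0.0, 9.0)),
    Particle(2, (1.0, 0.0, 5.0)),
]
ordered = sort_back_to_front(particles, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
print([p.ident for p in ordered])  # farthest particle first
```

Sorting on a (depth, id) pair is one simple way to make the ordering robust; an engine with many small particles would more likely sort per emitter or use a radix sort on quantized depth.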
After ensuring that the effect looked right, the engineering team then looked for any problems that could arise in extreme conditions. Since the game supported a player-controlled camera, it was possible for the camera to get so close to the smoke effects that the screen was covered with smoke. This introduced enormous overdraw, which was not handled by the AVSM sample. The engineering team combined AVSM with screen-space tessellation and lighting computed per vertex instead of per pixel. This new method handled significant amounts of overdraw to cover the worst-case scenario the game could face.
Intel sample code played multiple roles in this story. The initial research work inspired Codemasters to add a new feature. Codemasters then drove modifications to make the sample a better fit for their game. Intel followed up by updating and republishing the improved sample for other game developers to use.
Figure 2: Codemasters GRID* 2 applied the Intel® AVSM sample to achieve better visuals
CMAA Smooths Things Over in World of Warcraft*
With the more graphically stunning World of Warcraft expansion, Warlords of Draenor*, a new anti-aliasing algorithm was introduced to the game's graphics options. CMAA, intended to provide fast, effective anti-aliasing for mainstream hardware, is an image-based post-processing technique developed by Intel engineer Filip Strugar. By operating on the final frame buffer, it can perform anti-aliasing isolated from any other changes in the rendering pipeline. As some of World of Warcraft's newest deferred rendering techniques show, this approach allows anti-aliasing to operate independently of the chosen shading model, giving developers more flexibility.
CMAA is also an easily modifiable algorithm, allowing developers a measure of freedom to repurpose it for specific needs and make their own enhancements. The 6.1 content patch for World of Warcraft contains another new anti-aliasing mode called SSAA 2x + CMAA. This pairs naïve super-sampling with post-process anti-aliasing by performing a CMAA calculation on the 2x frame buffer object before sampling down to native resolution. This combination of algorithms provides the highest fidelity anti-aliasing for power users.
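The ordering of the SSAA 2x + CMAA mode (post-process anti-aliasing first, downsample second) can be sketched as follows. This is a minimal illustration, not the game's code: it assumes "2x" means twice the resolution on each axis, it uses a plain 2x2 box filter for the downsample, and the identity_aa function is a placeholder standing in for the CMAA pass.

```python
def box_downsample_2x(image):
    """Average each 2x2 block of a grayscale image (list of rows) into one pixel."""
    height, width = len(image), len(image[0])
    return [
        [
            (image[y][x] + image[y][x + 1] + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def resolve_ssaa_with_post_aa(supersampled, post_aa):
    """Run the post-process AA on the supersampled frame, then downsample."""
    return box_downsample_2x(post_aa(supersampled))


def identity_aa(image):
    # Placeholder standing in for the CMAA pass; it leaves the image unchanged.
    return [row[:] for row in image]


# A 4x4 supersampled frame with a hard edge, resolved to 2x2.
frame = [
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
resolved = resolve_ssaa_with_post_aa(frame, identity_aa)
print(resolved)
```

Passing the anti-aliasing step in as a function keeps the ordering explicit: whatever filter is supplied sees the full supersampled image before any detail is averaged away.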
Developers always start by deciding whether a technique is worth exploring. In the case of World of Warcraft, the decision to try out CMAA was made much easier since Intel provides a CMAA sample. The sample has a default test scene, but also allows developers to insert their own image to preview CMAA's level of effectiveness and measure its added cost in milliseconds for any resolution or test case. This enables developers to make a well-considered and educated decision. The World of Warcraft development team was able to drop a screen capture of the then-current Pandaren raid content into the sample and see exactly how much edge smoothing they would be getting for the performance cost.
Figure 3: The Intel® CMAA sample has a default scene, and also allows the effect to be tested on a custom image
Once the decision was made to add CMAA to the game, some changes needed to be made to the World of Warcraft engine to support DirectX 11 features used by CMAA. While the technique does nicely drop into the tail end of a rendering pipeline, data still needs to be prepared in a specific way. The algorithm requires a read-only depth buffer view, which means some engines may need to add an optional read-only flag to texture and FrameBuffer objects. It also relies on Unordered Access Views (UAVs, a.k.a. ImageBuffers) for some of its functionality and performance. While many DirectX 11 engines already have support for this, others will need to be updated to add UAV support. Outside of these supporting additions, the sample's shader code could be reused almost wholesale with only minor modification to some structures.
CMAA strikes an important balance of value for cost while remaining minimally invasive to the overall process. It conservatively provides better image quality and stability than FXAA 3.8 at 90% to 120% of the cost. “Enhanced Subpixel Morphological Antialiasing” (SMAA) is another popular post-processing anti-aliasing option. Its least costly version, SMAA 1x, offers more anti-aliasing and produces fewer overall graphical artifacts, but it causes more blurring and shape distortion, is more affected by small frame-to-frame changes (temporal instability), and runs at a 30% to 120% higher cost than CMAA. Leigh Davies and Filip Strugar’s analysis of these algorithms is available on IDZ.
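To make the relative costs concrete, the percentages above can be chained from a common baseline. The 1.0 ms FXAA figure below is a made-up unit of comparison, not a measurement, and pairing the low ends together and the high ends together is an assumption that gives the widest possible bracket.

```python
def scale_range(base_range, factor_range):
    """Multiply a (low, high) cost range by a (low, high) factor range."""
    return (base_range[0] * factor_range[0], base_range[1] * factor_range[1])


fxaa_ms = 1.0  # hypothetical FXAA 3.8 cost, used only as a unit of comparison

# CMAA costs 90% to 120% of FXAA 3.8.
cmaa_ms = scale_range((fxaa_ms, fxaa_ms), (0.90, 1.20))

# SMAA 1x costs 30% to 120% more than CMAA, that is 1.3x to 2.2x.
smaa_ms = scale_range(cmaa_ms, (1.30, 2.20))

print(f"CMAA:    {cmaa_ms[0]:.2f} to {cmaa_ms[1]:.2f} ms")
print(f"SMAA 1x: {smaa_ms[0]:.2f} to {smaa_ms[1]:.2f} ms")
```

Under these assumptions SMAA 1x lands somewhere between roughly 1.2 and 2.6 times the cost of FXAA 3.8, which is why CMAA is positioned as the option for mainstream hardware.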
Unlike MSAA, CMAA’s smoothing is also applied to alpha-tested textures, providing a more complete anti-aliasing of the frame. Warlords of Draenor has even demonstrated that CMAA can be paired with SSAA to provide a more beautiful and accurate level of anti-aliasing than any other option the game provides. The algorithm was designed to remain under 3 ms when running at 1600x900 resolution on a 15W 4th generation Intel® Core™ processor. From an algorithm complexity standpoint, its cost can be calculated as 3 passes at ½ resolution + 1 final native resolution pass.
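The pass breakdown can be turned into a rough pixel count at 1600x900. The text does not say whether "½ resolution" means half on each axis or half the pixel count, so the sketch below works out both readings. It counts pixels only and assumes nothing about how expensive each pass is per pixel.

```python
def pixels_processed(width, height, reduced_pixel_fraction):
    """Pixels touched by 3 reduced-resolution passes plus 1 native pass."""
    native = width * height
    return 3 * native * reduced_pixel_fraction + native


width, height = 1600, 900
native = width * height  # 1,440,000 pixels

# Reading 1: half resolution on each axis, so a quarter of the pixels.
per_axis = pixels_processed(width, height, 0.25)
# Reading 2: half the pixel count.
per_count = pixels_processed(width, height, 0.5)

# Express each total as an equivalent number of native-resolution passes.
print(per_axis / native)
print(per_count / native)
```

Either way the total stays well below four full-resolution passes, which is consistent with the stated design goal of a small, predictable cost.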
Figure 4: Blizzard’s World of Warcraft* uses CMAA to achieve great looking results on mainstream PCs
Software Occlusion Culling Reduces Unneeded Rendering in World of Warcraft*
Another Intel sample that was attractive for World of Warcraft is Software Occlusion Culling. By only rendering objects that the camera can actually see, rendering time is greatly reduced with little if any impact on the outcome. Fabian Giesen wrote a multi-part blog series analyzing Intel’s sample (which has since been updated), and Blizzard decided it was a good fit.
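The core idea can be sketched in a few lines: draw large occluders into a small depth buffer on the CPU, then test each object's screen-space bounds against it before submitting the object to the GPU. This is a deliberately simplified illustration that does not attempt to reproduce the sample's rasterizer. It substitutes axis-aligned screen rectangles with a single depth value for real geometry, and all names and the depth convention are assumptions.

```python
FAR = 1.0  # depth convention assumed here: 0.0 is the near plane, 1.0 the far plane


def make_depth_buffer(width, height):
    return [[FAR] * width for _ in range(height)]


def draw_occluder(depth, rect, occluder_depth):
    """Write an occluder's depth into every pixel of its screen rectangle.

    rect is (x0, y0, x1, y1) with exclusive upper bounds. Passing the
    occluder's farthest depth keeps the later visibility test conservative.
    """
    x0, y0, x1, y1 = rect
    for y in range(y0, y1):
        for x in range(x0, x1):
            depth[y][x] = min(depth[y][x], occluder_depth)


def is_visible(depth, rect, nearest_depth):
    """Visible if the object is at least as near as the buffer at any covered pixel."""
    x0, y0, x1, y1 = rect
    return any(
        nearest_depth <= depth[y][x]
        for y in range(y0, y1)
        for x in range(x0, x1)
    )


depth = make_depth_buffer(8, 8)
draw_occluder(depth, (0, 0, 4, 8), 0.3)  # a wall covering the left half

behind = is_visible(depth, (1, 1, 3, 3), 0.6)    # fully behind the wall
in_front = is_visible(depth, (1, 1, 3, 3), 0.2)  # in front of the wall
partial = is_visible(depth, (3, 1, 6, 3), 0.6)   # pokes out past the wall's edge
print(behind, in_front, partial)
```

The test errs on the side of drawing: an object is culled only when every pixel it could cover is already hidden by something nearer, so a wrong answer costs some rendering time rather than producing a missing object.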
As is often the case, the sample code needed to be rewritten to suit the game engine. Blizzard engineers adopted the kernel of this sample and built the rest on their own. The entire occlusion process ran in a low-cost range of 0.2 to 1.5 ms when first implemented in March 2013. Since then, it has become an even more integrated and beneficial part of the game.
The use of these techniques has helped World of Warcraft continue to run smoothly while Blizzard engineers improved the game’s visual effects without leaving behind users with mainstream hardware. The freedom to include the full spectrum of machines opens doors for engineers to pursue new opportunities, and things are already looking good.
Intel® Sample Code is for Everyone
The game sample code team at Intel works with game developers to identify real-life requirements and builds useful implementations for all developers to use. The Intel® Code Samples license will not slow down your development or encumber your game’s release. The examples given in this article are just a few of the ways the sample code at the Intel Developer Zone can help your game increase graphics fidelity and improve performance.
References
Code Samples at the Intel Game Developer Community - https://software.intel.com/en-us/gamedev/code-samples
Adaptive Volumetric Shadow Maps - https://software.intel.com/en-us/blogs/2013/03/27/adaptive-volumetric-shadow-maps
Conservative Morphological Anti-Aliasing (CMAA) - March 2014 Update - https://software.intel.com/en-us/articles/conservative-morphological-anti-aliasing-cmaa-update
Edge Detection Based Post Processing In Warlords of Draenor (GDC presentation by Blizzard and Intel®) - https://software.intel.com/sites/default/files/managed/4a/38/Edge-Detection-based-Post-Processing-in-Warlords-of-Draenor.pdf
Engineer’s Workshop: Engine Evolution in Warlords of Draenor - http://us.battle.net/wow/en/blog/15936285/
Intel® Code Samples License Agreement - https://software.intel.com/en-us/articles/code-samples-license-5/
Order-Independent Transparency Approximation with Pixel Synchronization - https://software.intel.com/en-us/articles/oit-approximation-with-pixel-synchronization-update-2014
Software Occlusion Culling Update 2 - https://software.intel.com/en-us/blogs/2013/09/06/software-occlusion-culling-update-2
About the Authors
Brad Hill is a Software Engineer at Intel in the Developer Relations Division. Brad investigates new technologies on Intel hardware and shares the best methods with software developers via the Intel® Developer Zone and at developer conferences. He is also the Engineering Director of Student/Indie Hackathons, running Code for Good hackathons and game jams at colleges and universities around the country.
John Hartwig is a Software Engineer in Intel’s Developer Relations Division. John’s focus is on enabling game developers in the PC client and Android mobile spaces by working with developers on optimizations and unique hardware features. John has worked at Intel since 2010, when he began as a graphics driver developer for GPGPU and media drivers. He makes DIY art toys and received a bachelor’s degree in Game Development from DePaul University.