Monday, March 1, 2010

Remote Performance Monitor II

In the previous post, we began a discussion on the XNA Framework Remote Performance Monitor. To recap, the Remote Performance Monitor exposes two common game code scenarios that unnecessarily generate garbage on the XNA platform:
  1. Unnecessary string object creation
  2. Unnecessary boxed value types
In the previous post, we discussed the first scenario: Unnecessary string object creation. Now, let's complete this discussion with the second scenario: Unnecessary boxed value types.

Unnecessary boxed value types
In a typical game, there are often enemies to kill, obstacles to avoid, gems to collect, etc. Game code usually stores each group, for example a list of enemy sprites to kill, in one collection variable:
IList<Sprite> Enemies { get; set; }
During game play, game code may need to iterate through the list of enemy sprites every single frame to invoke the Update() and/or Draw() methods accordingly. In .NET, iterating through an IList<T> can typically be done either using the for statement or the foreach.

Code Optimization Demos measure the performance of the for statement compared to the foreach: the for statement is generally more performant although the foreach statement provides better readability in code. However, if used incorrectly, the foreach statement can unnecessarily generate garbage, impact performance and potentially drop frames.

Let's check out an example using the foreach statement in more detail.
Consider the following Sprite class:
public class Sprite
{
 public Sprite()  {}  // ctor.
 public void Update(GameTime gameTime) {}
 public void Draw(GameTime gameTime) {}
}
As above, game code may store a list of enemy sprites to kill in one collection variable and construct the collection accordingly:
IList<Sprite> Enemies { get; set; }
Enemies = new List<Sprite>();
During game play, game code may iterate through the list of enemy sprites and update each sprite accordingly:
public void Update(GameTime gameTime)
{
 foreach (Sprite Enemy in Enemies)
 {
  Enemy.Update(gameTime);
 }
}
The previous game code snippet may seem harmless enough. However, the Remote Performance Monitor reveals that a single managed object is allocated on the heap and a single value type is boxed every single frame. When game code executes this Update() method at 60fps, 60 value types are boxed every second:


What happened? Why is this simple game code snippet generating so much garbage?

The problem begins with our collection variable declaration: the collection is declared as an IList<T> but game code actually constructs a new List<T>.
IList<Sprite> Enemies { get; set; }
Enemies = new List<Sprite>();
The problem then manifests itself with the foreach statement: the foreach statement requires an enumerator to iterate through each enemy sprite in the list.
foreach (Sprite Enemy in Enemies)
{
 Enemy.Update(gameTime);
}
In .NET, List<T> implements the IEnumerable<T> interface and IList<T> inherits from it. The IEnumerable<T> interface has one method: GetEnumerator(), which returns the enumerator required to iterate through each object in the list.
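To make the enumerator's role concrete, here is a rough sketch of what the compiler generates for a foreach over a collection declared as IList<T>. DemoSprite and ForeachExpansionDemo are hypothetical names used only for illustration, and the exact generated code may differ between compiler versions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Sprite class; it only counts Update() calls.
public class DemoSprite
{
    public int UpdateCount;
    public void Update() { UpdateCount++; }
}

public static class ForeachExpansionDemo
{
    // Approximately what "foreach (DemoSprite enemy in enemies) enemy.Update();"
    // expands to when enemies is declared as IList<DemoSprite>.
    public static void UpdateAll(IList<DemoSprite> enemies)
    {
        // The declared type is the interface, so the enumerator is obtained
        // as an IEnumerator<DemoSprite> reference.
        IEnumerator<DemoSprite> enumerator = enemies.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                DemoSprite enemy = enumerator.Current;
                enemy.Update();
            }
        }
        finally
        {
            // foreach always disposes the enumerator when the loop ends.
            enumerator.Dispose();
        }
    }
}
```

Calling ForeachExpansionDemo.UpdateAll() with a List<DemoSprite> of two sprites leaves each sprite with an UpdateCount of 1, the same result as the equivalent foreach loop.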

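However, the GetEnumerator() method that foreach ends up calling differs with the declared type of the collection. When the variable is declared as List<T>, foreach calls the public List<T>.GetEnumerator() method, which returns List<T>.Enumerator: a struct, and therefore a value type that can live on the stack. When the variable is declared as IList<T>, foreach calls GetEnumerator() through the interface, which returns IEnumerator<T>: an interface, and therefore a reference type whose instance lives on the heap.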

Therefore, in the previous game code snippet, the List<T> creates its enumerator as a value type, but the enumerator is then boxed to a reference type because the collection is actually declared as an IList<T>!
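The boxing can be observed directly with a small check. This is a minimal sketch that assumes the desktop .NET implementation of List<T>; EnumeratorBoxingDemo and its method names are hypothetical and not part of the XNA Framework.

```csharp
using System;
using System.Collections.Generic;

public static class EnumeratorBoxingDemo
{
    // True when the enumerator type exposed by List<T> itself is a struct.
    public static bool ConcreteEnumeratorIsStruct()
    {
        return typeof(List<int>.Enumerator).IsValueType;
    }

    // True when the enumerator obtained through IList<T> is that same struct,
    // now held behind an interface reference (that is, boxed on the heap).
    public static bool InterfaceEnumeratorIsBoxedStruct()
    {
        IList<int> list = new List<int> { 1, 2, 3 };
        using (IEnumerator<int> enumerator = list.GetEnumerator())
        {
            // The variable is interface-typed, but the runtime type is the struct.
            return enumerator.GetType() == typeof(List<int>.Enumerator);
        }
    }
}
```

Both methods return true on the desktop framework: the object behind the IEnumerator<int> reference is the List<int>.Enumerator struct, which can only be referenced that way after it has been boxed.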

There are two potential solutions to resolve this issue with Unnecessary boxed value types:
  1. Update the collection variable declaration to List<T>
  2. Replace foreach with the for statement altogether

The first solution simply updates the collection variable declaration, so no boxing is necessary:
List<Sprite> Enemies { get; set; }
Enemies = new List<Sprite>();

foreach (Sprite Enemy in Enemies)
{
 Enemy.Update(gameTime);
}
The second solution simply replaces the foreach with the for statement, so no enumerator is required:
IList<Sprite> Enemies { get; set; }
Enemies = new List<Sprite>();

for (Int32 index = 0; index < Enemies.Count; index++)
{
 Enemies[index].Update(gameTime);
}
Either way, the results in the Remote Performance Monitor are the same:


To summarize, when the collection is declared as a type whose enumerator is a value type, like List<T>, game code can use either the foreach or the for statement without generating garbage. When the collection is declared as an interface whose enumerator is the reference type IEnumerator<T>, like IList<T>, the foreach statement may generate garbage whereas the for statement will not.

In conclusion, the XNA Framework Remote Performance Monitor is a simple tool to detect if game code is generating garbage on the XNA platform. Typically, there are three statistics that require the most attention during performance testing:
  • Managed String Objects Allocated
  • Managed Objects Allocated
  • Boxed Value Types
However, there is also one final statistic that is important to monitor: "Exceptions Thrown". In a perfect game, the "Delta" column in the Remote Performance Monitor will be zero at all times.