As fast as computers are, heavy use of floating point functions can still slow them down. One way around this, is to use a lookup table instead of calculating floating point values at runtime. But keeping a generated table up-to-date is annoying work. In this post, I'll show you how to create a lookup table automatically using a custom build rule, making an OpenGL animation 5 times faster in the process.

## Introduction

In this post, I'm going to work with a modified version of Apple's default OpenGL ES 1.0 program. In this program, a colored square will spin in the center of the window:

To artificially create a situation where this simple program is bound by CPU performance, I will calculate the location of the square thousands of times for every frame.

## Initial code and performance

The initial code for updating the model matrix before drawing is:

``````static float transX = 3.14159265f * 0.5;
static float transY = 0.0f;
for (int i = 0; i < 2e5; i++)
{
translateX = sinf(transX)/2.0f;
translateY = sinf(transY)/2.0f;
doNothingff(&translateX, &transX);
doNothingff(&translateY, &transY);
}
transX += 0.075f;
transY += 0.075f;

glTranslatef((GLfloat)(translateX), (GLfloat)(translateY), 0.0f);``````

This is a very simple calculation: it translates out to the edge of the unit circle and progresses around the circle by 0.075 radians per frame.

The code loops 2e5 (i.e. 20,000) times and passes its variables by reference into a dummy function (to prevent the compiler optimizing the loop away).

On my iPhone 3G, this runs between 5 and 6 frames per second.

## Faster with tables

The reason this code is slow, is the `sinf()` function — the iPhone simply isn't meant for heavy floating point arithmetic.

However, this code only uses 84 steps to get around the circle. We can precalculate all of those points, store them in a table and use the table instead.

``````static int tableIndexX = -1;
static int tableIndexY = 0;
if (tableIndexX == -1)
tableIndexX = SinTableSize / 4;

for (int i = 0; i < 2e5; i++)
{
translateX = SinTable[tableIndexX];
translateY = SinTable[tableIndexY];
doNothingif(&tableIndexX, &translateX);
doNothingif(&tableIndexY, &translateY);
}
tableIndexX = (tableIndexX + 1) % SinTableSize;
tableIndexY = (tableIndexY + 1) % SinTableSize;

glTranslatef((GLfloat)(translateX), (GLfloat)(translateY), 0.0f);``````

This code is the same as the previous code except instead of calculating each point, it simply steps through an array of values.

The `SinTable` just looks like this:

``````const float SinTable[84] =
{
0.000000f,
0.037365f,
0.074521f,
0.111260f,
//... 80 more entries ...
};``````

With the change to a lookup table, the code now runs at 26-28 frames per second. A 5 times speed increase.

## Using code to generate code with a build rule

If you've ever used lookup tables, you'll know that they can be annoying to maintain. If you want to change them, you need to regenerate them and then reintegrate them into your project.

With a custom build rule, the regeneration and reintegration can be done automatically.

The code that generates the `SinTable` looks like this:

``````enum
{
ProcessName = 0,
OutputFile,
TableSteps,
ArgumentCount
};

int main (int argc, const char * argv[])
{
if (argc != ArgumentCount)
return 1;

FILE *outputFile = fopen(argv[OutputFile], "w");
long tableSteps = strtol(argv[TableSteps], NULL, 10);

fprintf(outputFile, "const float SinTable[%ld] = \n", tableSteps);
fprintf(outputFile, "{\n");
for (long i = 0; i < tableSteps; i++)
{
fprintf(outputFile, "    %ff,\n", 0.5f * sinf(i * 2.0f * 3.14159265f / tableSteps));
}
fprintf(outputFile, "\n};\n");
fprintf(outputFile, "const long SinTableSize = %ld;", tableSteps);

return 0;
}``````

I put it in a file named SinTable.gen.c then I added a custom build rule to the target (Project→Edit Active Target→Rules).

The custom build rule looks like this:

What this does is:

1. Takes any file that ends in .gen.c
2. Compiles it to a .gen file in the Derived files directory
3. Executes the .gen program to generate a gen.result.c file in the same directory
4. Tells Xcode that it produces the gen.result.c file and Xcode will then build it according to Xcode's standard C compilation rules, integrating it with the rest of the program.

## Custom Build Rules versus Custom Build Phases

Many people are familiar with Custom Build Phases for running scripts during their builds. It is important to understand the differences between the two:

 Custom Build Rule Run Script Custom Build Phase Is an arbitrary /bin/sh script Is an arbitrary /bin/sh script Runs once for input of a given type Runs exactly once per build Only runs when an input of the relevant type changes Runs on either every build (no inputs/outputs specified) or when specifically nominated "input" files are newer than nominated "output" files. Outputs can be automatically picked up by Xcode for further processing by other build rules Xcode doesn't touch outputs — you must handle all processing yourself Only visible in the Target's Settings window Visible in the Group Tree by expanding the Target

Custom Build Rules are ideal for generating files that will be picked up by another build rule (for example, generating source code or .o files). Custom Build Phases are a better option for applying build numbers or handling the final built product.