Indexing array of uniforms by variable in GLSL on ATI/AMD graphics card

Alec Jacobson

December 22, 2011


After a month or two of frustration I've finally understood why my GLSL vertex shader has been running so slowly on my iMac's AMD Radeon HD 6970M 2048 MB graphics card. I thought the problem was that indexing an array of uniforms by a variable (an attribute) switched the renderer into "software mode" (Apple Software Renderer). To get around this I had a super-hack in my shader that was very slow. What I wanted to achieve was:
...
attribute vec4 indices;
uniform mat4 T[LARGE_NUMBER];
void main()
{
  ... 
  mat4 t =  T[int(indices[0])];
  ...
}
But since I thought I couldn't index by a variable, I wrote a function that emulates variable indexing with a chain of if statements:
mat4 T_at_i(int i)
{
  if(i==0) return T[0];
  else if(i==1) return T[1];
  else if(i==2) return T[2];
  else if(i==3) return T[3];
  else if(i==4) return T[4];
  ...
}
These if statements, of course, destroyed the efficiency of my shader. But it turns out that indexing by variable is not really the problem. It was more of a symptom: I had too many uniform components.

I shirk some of the blame because ATI returns the wrong number when you ask for the number of components via GL_MAX_VERTEX_UNIFORM_COMPONENTS. Rather than the correct number (for my machine 1024) it returns 4 times that number (for my machine 4096). Thus I thought I had no problem with maxing out my allotted uniform memory. This is documented in very confusing language on the OpenGL wiki and in slightly less confusing language in the answer to my question on Stack Overflow.

The point is that when you ask for too many uniforms and you index them by a variable, your shader still compiles and you get no complaints, but secretly the graphics card gives up and the software renderer takes over (super slow). If you don't index by variable, you can use however many uniforms you want and the graphics card will still run the shader, but you have to use slow hacks like the one above to act like you're indexing by a variable.

Finally, the "solution" is to use the right number of uniforms (after dividing the number ATI reports by a possible factor of 4), and then you're free to index them by variables and have the shader run on the card and not in software mode.
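The budget arithmetic behind that fix can be sketched in a few lines of C on the host side. This is only a rough sketch under stated assumptions: `max_mat4_uniforms` is a hypothetical helper (not an OpenGL call), the factor of 4 is just what was observed on this one card and may not hold elsewhere, and `reserved` stands for whatever components the shader's other uniforms take up. The one firm number is that a mat4 occupies 16 float components.

```c
/* Float components occupied by one mat4 uniform: 4 columns x 4 rows. */
enum { COMPONENTS_PER_MAT4 = 16 };

/*
 * Hypothetical helper, not part of OpenGL: estimate how many mat4
 * uniforms fit in the vertex shader's uniform budget.
 *
 *   reported          value obtained from
 *                     glGetIntegerv(GL_MAX_VERTEX_UNIFORM_COMPONENTS, ...)
 *   overreport_factor assumed driver over-reporting factor
 *                     (4 on the card discussed here, 1 to trust the driver)
 *   reserved          components already used by other uniforms
 */
int max_mat4_uniforms(int reported, int overreport_factor, int reserved)
{
  int usable;

  /* Guard against a nonsensical factor; treat it as "trust the driver". */
  if (overreport_factor < 1)
  {
    overreport_factor = 1;
  }

  /* Correct the reported value, then subtract what is already spoken for. */
  usable = reported / overreport_factor - reserved;
  if (usable < 0)
  {
    usable = 0;
  }

  return usable / COMPONENTS_PER_MAT4;
}
```

With the numbers quoted above (4096 reported, 1024 real), `max_mat4_uniforms(4096, 4, 0)` gives 64, whereas trusting the reported value with a factor of 1 gives 256. The resulting count is what LARGE_NUMBER would have to stay under for T to fit, assuming nothing else competes for the same uniform storage.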