Posts Tagged ‘shader’

GLSL shaders using normal as vertex positions

Saturday, November 14th, 2015

Every few months I need to relearn how to properly set up a GLSL shader with a vertex array object. This time I was running into a strange bug where my normals were being used as vertex positions. I had a simple pass-through vertex shader looking something like:

#version 330 core
in vec3 position;
in vec3 normal;
out vec3 frag_normal;
void main()
{
  gl_Position = vec4(position,1.);
  frag_normal = normal;
}

And on the CPU side I was setting up my vertex attributes with:

glBindBuffer(GL_ARRAY_BUFFER,position_buffer_object);
glVertexAttribPointer(0,3, ... );
glEnableVertexAttribArray(0);
glBindBuffer(GL_ARRAY_BUFFER,normal_buffer_object);
glVertexAttribPointer(1,3, ... );
glEnableVertexAttribArray(1);

My folly was assuming that because I’d declared position and normal in that order in the vertex shader that they’d be bound to attribute ids 0 and 1 respectively. Not so! After some head-slamming I thought to query these ids using:

glGetAttribLocation(program_id,"position");
glGetAttribLocation(program_id,"normal");

And found that for whatever reason position was bound to 1 and normal was bound to 0. Of course I could try hacks to get these to reverse order, or hard code the different order. But there appear to be two correct options for fixing this problem:

The obvious one is to use glGetAttribLocation when creating the vertex array object:

GLint position_loc = glGetAttribLocation(program_id,"position");
glBindBuffer(GL_ARRAY_BUFFER,position_buffer_object);
glVertexAttribPointer(position_loc,3, ... );
glEnableVertexAttribArray(position_loc);
...

I was a little bothered that this solution requires that I know which shader is going to be in use at the time of creating the vertex array object.

The opposite solution is to assume a certain layout on the CPU-side when writing the shader:

layout(location = 0) in vec3 position;
layout(location = 1) in vec3 normal;

Now I can be sure that using ids 0 and 1 will correctly bind to position and normal respectively.

Depth peeling mini-app

Monday, May 4th, 2015

I’ve been lamenting poor transparency handling in OpenGL for as long as I’ve used OpenGL. From a graphics programmer’s perspective, it’s just so frustrating that you can’t just:

glEnable(GL_ALPHA_BLEND_CORRECTLY);

Finally I’ve gotten around to implementing a correct solution. The idea is simple, and, by now, well-known. Render the scene k times. For each pass keep track of the color and depth of the fragment with the smallest depth value which is greater than the previous pass’s. In this way, each pass peels off a layer of pixels from front-to-back. Finally render all of these as a composite image from back-to-front with the usual alpha blending.

If k is at least the maximum number of overlapping transparent layers then this will give the correct result. In practice, if the alpha values aren’t all tiny (everything’s only a bit transparent), then only a few passes are needed.
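To make the peeling rule concrete, here is a minimal CPU-side sketch for a single pixel. It is not the GPU implementation: the names Fragment, peel and composite are made up for illustration, and it assumes each fragment simply carries a window-space depth and an RGBA color.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical record for one fragment covering a pixel:
// window-space depth in [0,1] and an RGBA color.
struct Fragment { float depth; std::array<float,4> rgba; };

// Peel up to k layers, front to back. Each pass keeps the nearest fragment
// that is strictly farther than the depth kept by the previous pass.
std::vector<Fragment> peel(const std::vector<Fragment> & frags, const int k)
{
  std::vector<Fragment> layers;
  float prev_depth = -1.f; // nothing peeled yet: the first pass accepts all
  for(int pass = 0; pass < k; pass++)
  {
    const Fragment * best = nullptr;
    for(const Fragment & f : frags)
    {
      if(f.depth <= prev_depth) continue;           // already peeled: discard
      if(!best || f.depth < best->depth) best = &f; // plays the role of GL_LESS
    }
    if(!best) break; // only background left at this pixel
    layers.push_back(*best);
    prev_depth = best->depth;
  }
  return layers;
}

// Composite the peeled layers back to front over a background color using
// the usual alpha blending: result = a*src + (1-a)*dst.
std::array<float,3> composite(
  const std::vector<Fragment> & layers, std::array<float,3> bg)
{
  for(auto it = layers.rbegin(); it != layers.rend(); ++it)
  {
    const float a = it->rgba[3];
    for(int c = 0; c < 3; c++) bg[c] = a*it->rgba[c] + (1.f-a)*bg[c];
  }
  return bg;
}
```

Calling peel on fragments submitted in any order returns them sorted by depth, which is exactly why the composite no longer depends on the order the triangles were drawn in.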

Here’s a little 280-line glut app demonstrating this:

// make sure the modern opengl headers are included before any others
#include <OpenGL/gl3.h>
#define __gl_h_
#include <igl/frustum.h>
#include <igl/read_triangle_mesh.h>
#include <igl/opengl/create_shader_program.h>
#include <Eigen/Core>
#include <GLUT/glut.h>
#include <string>

void init_render_to_texture(
  const size_t w, const size_t h, GLuint & tex, GLuint & dtex, GLuint & fbo)
{
  const auto & gen_tex = [](GLuint & tex)
  {
    // http://www.opengl.org/wiki/Framebuffer_Object_Examples#Quick_example.2C_render_to_texture_.282D.29
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
  };
  // Generate texture for colors and attach to color component of framebuffer
  gen_tex(tex);
  glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, w, h, 0, GL_BGRA, GL_FLOAT, NULL);
  glBindTexture(GL_TEXTURE_2D, 0);
  glGenFramebuffers(1, &fbo);
  glBindFramebuffer(GL_FRAMEBUFFER, fbo);
  glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, tex, 0);
  // Generate texture for depth and attach to depth component of framebuffer
  gen_tex(dtex);
  glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT32, w, h, 0, GL_DEPTH_COMPONENT, GL_FLOAT, NULL);
  glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_2D, dtex, 0);
  // Clean up
  glBindFramebuffer(GL_FRAMEBUFFER, 0);
  glBindTexture(GL_TEXTURE_2D,0);
}


// For rendering a full-viewport quad, set tex-coord from position
std::string tex_v_shader = R"(
#version 330 core
in vec3 position;
out vec2 tex_coord;
void main()
{
  gl_Position = vec4(position,1.);
  tex_coord = vec2(0.5*(position.x+1), 0.5*(position.y+1));
}
)";
// Render directly from color or depth texture
std::string tex_f_shader = R"(
#version 330 core
in vec2 tex_coord;
out vec4 color;
uniform sampler2D color_texture;
uniform sampler2D depth_texture;
uniform bool show_depth;
void main()
{
  vec4 depth = texture(depth_texture,tex_coord);
  // Mask out background which is set to 1
  if(depth.r<1)
  {
    color = texture(color_texture,tex_coord);
    if(show_depth)
    {
      // Depth of background seems to be set to exactly 1.
      color.rgb = vec3(1,1,1)*(1.-depth.r)/0.006125;
    }
  }else
  {
    discard;
  }
}
)";

// Pass-through vertex shader with projection and model matrices
std::string scene_v_shader = R"(
#version 330 core
uniform mat4 proj;
uniform mat4 model;
in vec3 position;
void main()
{
  gl_Position = proj * model * vec4(position,1.);
}
)";
// Render if first pass or farther than closest frag on last pass
std::string scene_f_shader = R"(
#version 330 core
out vec4 color;
uniform bool first_pass;
uniform float width;
uniform float height;
uniform sampler2D depth_texture;
void main()
{
  color = vec4(0.8,0.4,0.0,0.75);
  color.rgb *= (1.-gl_FragCoord.z)/0.006125;
  if(!first_pass)
  {
    vec2 tex_coord = vec2(float(gl_FragCoord.x)/width,float(gl_FragCoord.y)/height);
    float max_depth = texture(depth_texture,tex_coord).r;
    if(gl_FragCoord.z <= max_depth)
    {
      discard;
    }
  }
}
)";

// shader id, vertex array object
GLuint scene_p_id=0,tex_p_id;
GLuint VAO,QVAO;
// Number of passes
#define k 4
GLuint tex_id[k],dtex_id[k],fbo_id[k];
// full width/height of window, width/height of viewports
int full_w=1440,full_h=480,w=full_w/(k+2),h=full_h/1;
// Mesh data: RowMajor is important to directly use in OpenGL
typedef Eigen::Matrix< float,Eigen::Dynamic,3,Eigen::RowMajor> MatrixV;
typedef Eigen::Matrix<GLuint,Eigen::Dynamic,3,Eigen::RowMajor> MatrixF;
MatrixV V,QV;
MatrixF F,QF;
int main(int argc, char * argv[])
{
  // Init glut and create window + OpenGL context
  glutInit(&argc,argv);
  glutInitDisplayMode(GLUT_3_2_CORE_PROFILE|GLUT_RGBA|GLUT_DOUBLE|GLUT_DEPTH); 
  glutInitWindowSize(full_w,full_h);
  glutCreateWindow("test");
  // Compile shaders
  igl::opengl::create_shader_program(scene_v_shader,scene_f_shader,{},scene_p_id);
  igl::opengl::create_shader_program(tex_v_shader,tex_f_shader,{},tex_p_id);
  // Prepare VAOs
  const auto & vao = [](const MatrixV & V, const MatrixF & F, GLuint & VAO)
  {
    // Generate and attach buffers to vertex array
    glGenVertexArrays(1, &VAO);
    GLuint VBO, EBO;
    glGenBuffers(1, &VBO);
    glGenBuffers(1, &EBO);
    glBindVertexArray(VAO);
    glBindBuffer(GL_ARRAY_BUFFER, VBO);
    glBufferData(GL_ARRAY_BUFFER, sizeof(float)*V.size(), V.data(), GL_STATIC_DRAW);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, EBO);
    glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(GLuint)*F.size(), F.data(), GL_STATIC_DRAW);
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 3 * sizeof(GLfloat), (GLvoid*)0);
    glEnableVertexAttribArray(0);
    glBindBuffer(GL_ARRAY_BUFFER, 0); 
    glBindVertexArray(0);
  };

  // Read input mesh from file
  igl::read_triangle_mesh(argv[1],V,F);
  V.rowwise() -= V.colwise().mean();
  V /= (V.colwise().maxCoeff()-V.colwise().minCoeff()).maxCoeff();
  vao(V,F,VAO);
  // square
  const MatrixV QV = (MatrixV(4,3)<<-1,-1,0,1,-1,0,1,1,0,-1,1,0).finished();
  const MatrixF QF = (MatrixF(2,3)<< 0,1,2, 0,2,3).finished();
  vao(QV,QF,QVAO);

  // Main display routine
  glutDisplayFunc(
    []()
    {
      // Projection and modelview matrices
      Eigen::Matrix4f proj = Eigen::Matrix4f::Identity();
      float near = 0.01;
      float far = 3;
      float top = tan(35./360.*M_PI)*near;
      float right = top * (double)w/(double)h;
      igl::frustum(-right,right,-top,top,near,far,proj);
      Eigen::Affine3f model = Eigen::Affine3f::Identity();
      model.translate(Eigen::Vector3f(0,0,-1.5));
      // spin around
      static size_t count = 0;
      model.rotate(Eigen::AngleAxisf(M_PI/180.*count++,Eigen::Vector3f(0,1,0)));

      glEnable(GL_DEPTH_TEST);
      glViewport(0,0,w,h);
      // select program and attach uniforms
      glUseProgram(scene_p_id);
      GLint proj_loc = glGetUniformLocation(scene_p_id,"proj");
      glUniformMatrix4fv(proj_loc,1,GL_FALSE,proj.data());
      GLint model_loc = glGetUniformLocation(scene_p_id,"model");
      glUniformMatrix4fv(model_loc,1,GL_FALSE,model.matrix().data());
      glUniform1f(glGetUniformLocation(scene_p_id,"width"),w);
      glUniform1f(glGetUniformLocation(scene_p_id,"height"),h);
      glBindVertexArray(VAO);
      glDisable(GL_BLEND);
      for(int pass = 0;pass<k;pass++)
      {
        const bool first_pass = pass == 0;
        glUniform1i(glGetUniformLocation(scene_p_id,"first_pass"),first_pass);
        if(!first_pass)
        {
          glUniform1i(glGetUniformLocation(scene_p_id,"depth_texture"),0);
          glActiveTexture(GL_TEXTURE0 + 0);
          glBindTexture(GL_TEXTURE_2D, dtex_id[pass-1]);
        }
        glBindFramebuffer(GL_FRAMEBUFFER, fbo_id[pass]);
        glClearColor(0.0,0.4,0.7,0.);
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
        glDrawElements(GL_TRIANGLES, F.size(), GL_UNSIGNED_INT, 0);
      }
      // clean up and set to render to screen
      glBindVertexArray(0);
      glBindFramebuffer(GL_FRAMEBUFFER, 0);
      glActiveTexture(GL_TEXTURE0 + 0);
      glBindTexture(GL_TEXTURE_2D,0);

      // Get ready to draw quads
      glBindVertexArray(QVAO);
      glClearColor(0.0,0.4,0.7,0.);
      glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
      glUseProgram(tex_p_id);
      // Draw result of each peel
      for(int pass = 0;pass<k;pass++)
      {
        GLint color_tex_loc = glGetUniformLocation(tex_p_id,"color_texture");
        glUniform1i(color_tex_loc, 0);
        glActiveTexture(GL_TEXTURE0 + 0);
        glBindTexture(GL_TEXTURE_2D, tex_id[pass]);
        GLint depth_tex_loc = glGetUniformLocation(tex_p_id,"depth_texture");
        glUniform1i(depth_tex_loc, 1);
        glActiveTexture(GL_TEXTURE0 + 1);
        glBindTexture(GL_TEXTURE_2D, dtex_id[pass]);
        glViewport(pass*w,0*h,w,h);
        glUniform1i(glGetUniformLocation(tex_p_id,"show_depth"),0);
        glDrawElements(GL_TRIANGLES,6, GL_UNSIGNED_INT, 0);
      }

      // Render final result as composite of all textures
      glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
      glEnable(GL_BLEND);
      glDepthFunc(GL_ALWAYS);
      glViewport(k*w,0*h,w,h);
      glUniform1i(glGetUniformLocation(tex_p_id,"show_depth"),0);
      GLint color_tex_loc = glGetUniformLocation(tex_p_id,"color_texture");
      GLint depth_tex_loc = glGetUniformLocation(tex_p_id,"depth_texture");
      for(int pass = k-1;pass>=0;pass--)
      {
        glUniform1i(color_tex_loc, 0);
        glActiveTexture(GL_TEXTURE0 + 0);
        glBindTexture(GL_TEXTURE_2D, tex_id[pass]);
        glUniform1i(depth_tex_loc, 1);
        glActiveTexture(GL_TEXTURE0 + 1);
        glBindTexture(GL_TEXTURE_2D, dtex_id[pass]);
        glDrawElements(GL_TRIANGLES,6, GL_UNSIGNED_INT, 0);
      }
      glDepthFunc(GL_LESS);
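      // Render scene using naive GL_BLEND transparency
      glUseProgram(scene_p_id);
      // Treat this as a first pass so that no fragments are peeled away
      glUniform1i(glGetUniformLocation(scene_p_id,"first_pass"),1);
      glBindVertexArray(VAO);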
      glViewport((k+1)*w,0*h,w,h);
      glDrawElements(GL_TRIANGLES, F.size(), GL_UNSIGNED_INT, 0);
      glBindVertexArray(0);
      glDisable(GL_BLEND);

      glutSwapBuffers();
      glutPostRedisplay();
    }
    );
  glutReshapeFunc(
    [](int w,int h)
    {
      full_h=h;
      full_w=w;
      ::w=full_w/(k+2);
      ::h=full_h/(1);
      // (re)-initialize textures and buffers
      for(size_t i = 0;i<k;i++)
      {
        init_render_to_texture(::w,::h,tex_id[i],dtex_id[i],fbo_id[i]);
      }
    });
  glutMainLoop();
}

Running this for an elephant mesh you get:

depth peeling vs gl-blend

The left 4 images are the individual peeled layers, rendered only for demonstration. The next image is the composite (the main result) and the last, for comparison, is the nasty mess you get if you just try to render your scene with:

glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
glEnable(GL_BLEND);

**Update:** It turns out that to get the right blending you should be careful to turn off GL_BLEND during the peeling passes and only turn it on for the texture compositing.
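One plausible way to see why blending must be off while peeling: with GL_SRC_ALPHA / GL_ONE_MINUS_SRC_ALPHA enabled, the texel stored in a layer's color texture is already mixed with the clear color, so the composite would apply alpha a second time. The sketch below only models the blend equation on the CPU. The helper names are hypothetical, and it assumes the same blend function is applied to all four channels.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using RGBA = std::array<float,4>;

// Model of glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA) applied to all
// four channels: result = src.a*src + (1-src.a)*dst
RGBA blend(const RGBA & src, const RGBA & dst)
{
  RGBA out;
  for(int c = 0; c < 4; c++) out[c] = src[3]*src[c] + (1.f-src[3])*dst[c];
  return out;
}

// Texel that ends up in a peel layer's color texture for one fragment drawn
// over the clear color, with blending either on or off.
RGBA peeled_texel(const RGBA & frag, const RGBA & clear, const bool blend_on)
{
  return blend_on ? blend(frag,clear) : frag;
}
```

Using the demo's fragment color (0.8,0.4,0.0,0.75) and clear color (0.0,0.4,0.7,0.0), the blended texel picks up some of the clear color's blue and its alpha drops from 0.75 to 0.5625, whereas with blending off the fragment is stored untouched.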

Old-style GPGPU reduction, average pixel color

Tuesday, April 28th, 2015

Here’s a little demo which computes the average pixel value of an OpenGL rendering. As a sanity check I compute the average value on the cpu-side using glReadPixels and then compare to computing the average value using a “mip-map” style, ping-pong, texture-buffer GPGPU reduction. Finally I render the buffers for fun.

I’m using yimg just to read in a .png file to render as an example.

On my weak little laptop, the GPU code seemed to be about 30 times faster. Can’t shrug at that! Correction: rookie mistake. I made the timings in debug mode. In release there’s hardly a speed up :-(

I was careful to compute the average in a coherent way (the images get progressively blurred out, rather than averaging the image quadrants recursively). This would be useful if, say, computing the average of 100 renderings. You could render them into a 10s x 10s texture and run the GPU reduction until the result is just a 10×10 image containing the 100 results. That’d only require a single final call to glReadPixels (rather than calling glReadPixels to read the final single pixel result of each reduction).
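For reference, the same reduction can be sketched on the CPU for a single-channel image. This is only an illustration of the arithmetic, not the shader: next_pow2 and reduce_average are hypothetical names, the image is assumed non-empty and zero-padded up to a power-of-2 square, and the padding is corrected for at the end in the same way as in the demo.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Round up to the next power of 2 (bit trick, valid for 32-bit sized values)
std::size_t next_pow2(std::size_t s)
{
  s--;
  s |= s >> 1; s |= s >> 2; s |= s >> 4; s |= s >> 8; s |= s >> 16;
  return s+1;
}

// Average a w-by-h image by zero-padding to an s-by-s square and repeatedly
// averaging 2x2 blocks into the corner of the other buffer (ping-pong).
double reduce_average(
  const std::vector<float> & img, const std::size_t w, const std::size_t h)
{
  const std::size_t s = next_pow2(std::max(w,h));
  std::vector<float> ping(s*s,0.f), pong(s*s,0.f);
  for(std::size_t y = 0; y < h; y++)
    for(std::size_t x = 0; x < w; x++)
      ping[y*s + x] = img[y*w + x];
  // Each pass halves the active square; the row stride stays s, like the
  // fixed-size textures in the GPU version
  for(std::size_t n = s/2; n >= 1; n /= 2)
  {
    for(std::size_t y = 0; y < n; y++)
      for(std::size_t x = 0; x < n; x++)
        pong[y*s + x] = 0.25f*(
          ping[(2*y  )*s + 2*x] + ping[(2*y  )*s + 2*x+1] +
          ping[(2*y+1)*s + 2*x] + ping[(2*y+1)*s + 2*x+1]);
    std::swap(ping,pong);
  }
  // Correct for the zero padding when the size is not a power of 2
  return double(ping[0])*double(s*s)/double(w*h);
}
```

For a 3×2 image holding the values 1 through 6 this pads to 4×4 and returns their average.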

#include <YImage.hpp>
#include <YImage.cpp>
#include <GLUT/glut.h>
#include <algorithm>
#include <cassert>
#include <iostream>
#include <string>
#include <iomanip>

// Size of image rounded up to next power of 2
size_t s;
// Shader program for doing reduction
GLuint reduction_prog_id;
// Need two textures and buffers to ping-pong
GLuint tex_id[] = {0,0};
GLuint fbo_id[] = {0,0};
GLuint dfbo_id[] = {0,0};
// Image (something to render in the first place)
std::string filename("hockney-512.png");
YImage yimg;

void init_render_to_texture(
  const size_t width,
  const size_t height,
  GLuint & tex_id,
  GLuint & fbo_id,
  GLuint & dfbo_id)
{
  using namespace std;
  // Delete if already exists
  glDeleteTextures(1,&tex_id);
  glDeleteFramebuffersEXT(1,&fbo_id);
  glDeleteFramebuffersEXT(1,&dfbo_id);
  // http://www.opengl.org/wiki/Framebuffer_Object_Examples#Quick_example.2C_render_to_texture_.282D.29
  glGenTextures(1, &tex_id);
  glBindTexture(GL_TEXTURE_2D, tex_id);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
  //NULL means reserve texture memory, but texels are undefined
  glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F_ARB, width, height, 0, GL_BGRA, GL_FLOAT, NULL);
  glBindTexture(GL_TEXTURE_2D, 0);
  glGenFramebuffersEXT(1, &fbo_id);
  glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fbo_id);
  //Attach 2D texture to this FBO
  glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT, GL_TEXTURE_2D, tex_id, 0);
  glGenRenderbuffersEXT(1, &dfbo_id);
  glBindRenderbufferEXT(GL_RENDERBUFFER_EXT, dfbo_id);
  glRenderbufferStorageEXT(GL_RENDERBUFFER_EXT, GL_DEPTH_COMPONENT24, width, height);
  //Attach depth buffer to FBO (for this example it's not really needed, but if
  //drawing a 3D scene it would be necessary to attach something)
  glFramebufferRenderbufferEXT(GL_FRAMEBUFFER_EXT, GL_DEPTH_ATTACHMENT_EXT, GL_RENDERBUFFER_EXT, dfbo_id);
  //Does the GPU support current FBO configuration?
  GLenum status;
  status = glCheckFramebufferStatusEXT(GL_FRAMEBUFFER_EXT);
  assert(status == GL_FRAMEBUFFER_COMPLETE_EXT);
  // Unbind to clean up
  glBindRenderbufferEXT(GL_RENDERBUFFER_EXT, 0);
  glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, 0);
}

int main(int argc, char * argv[])
{
  glutInit(&argc,argv);
  if(argc>1)
  {
    filename = argv[1];
  }
  yimg.load(filename.c_str());
  s = std::max(yimg.width(),yimg.height());
  // http://stackoverflow.com/a/466278/148668
  s--;
  s |= s >> 1;
  s |= s >> 2;
  s |= s >> 4;
  s |= s >> 8;
  s |= s >> 16;
  s++;

  glutInitDisplayString("rgba depth double stencil");
  glutInitWindowSize(2*s,s);
  glutCreateWindow("texture-reduction");
  glutDisplayFunc(
    []()
    {
      using namespace std;
      // Initialize **both** buffers and set to render into first
      for(int i = 1;i>=0;i--)
      {
        glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fbo_id[i]);
        glBindRenderbufferEXT(GL_RENDERBUFFER_EXT, dfbo_id[i]);
        glClearColor(0,0,0,1);
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
      }

      // Render something
      yimg.flip();
      glMatrixMode(GL_PROJECTION);
      glPushMatrix();
      glLoadIdentity();
      glOrtho(0,s,0,s, -10000,10000);
      glViewport(0,0,s,s);
      glRasterPos2f(0,0);
      glDrawPixels(yimg.width(), yimg.height(), GL_RGBA, GL_UNSIGNED_BYTE, yimg.data());
      glPopMatrix();

      // Even the cpu code should use a buffer (rather than the screen)
      GLfloat * rgb = new GLfloat[yimg.width() * yimg.height() * 3];
      glReadPixels(0, 0, yimg.width(), yimg.height(), GL_RGB, GL_FLOAT, rgb);
      // Gather into double: sequential add is prone to numerical error
      double avg[] = {0,0,0};
      for(size_t i = 0;i<yimg.width()*yimg.height()*3;i+=3)
      {
        avg[0] += rgb[i + 0];
        avg[1] += rgb[i + 1];
        avg[2] += rgb[i + 2];
      }
      for_each(avg,avg+3,[](double & c){c/=yimg.width()*yimg.height();});
      delete[] rgb;

      // Size of square being rendered
      assert(((s != 0) && ((s & (~s + 1)) == s)) && "s should be power of 2");
      size_t h = s/2;
      // odd or even ping-pong iteration
      int odd = 1;
      // Tell shader about texel step size
      glUseProgram(reduction_prog_id);
      glUniform1f(glGetUniformLocation(reduction_prog_id,"dt"),0.5f/float(s));
      glEnable(GL_TEXTURE_2D);
      while(h)
      {
        // Select which texture to draw/compute into
        glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fbo_id[odd]);
        glBindRenderbufferEXT(GL_RENDERBUFFER_EXT, dfbo_id[odd]);
        // Select which texture to draw/compute from
        glBindTexture(GL_TEXTURE_2D,tex_id[1-odd]);
        // Scale to smaller square
        glViewport(0,0,h,h);
        const float f = 2.*(float)h/(float)s;
        // Draw quad filling viewport with shrinking texture coordinates
        glBegin(GL_QUADS);
        glTexCoord2f(0,0);
        glVertex2f  (-1,-1);
        glTexCoord2f(f,0);
        glVertex2f  (1,-1);
        glTexCoord2f(f,f);
        glVertex2f  (1,1);
        glTexCoord2f(0,f);
        glVertex2f  (-1,1);
        glEnd();

        // ping-pong
        odd = 1-odd;
        h = h/2;
      }
      // Read corner pixel of last render
      float px[3];
      glReadPixels(0, 0, 1, 1, GL_RGB, GL_FLOAT, px);
      // Correct for size not power of 2
      for_each(px,px+3,[](float& c){c*=(float)s*s/yimg.width()/yimg.height();});
      cout<<" gpu: "<< px[0]<<" "<< px[1]<<" "<< px[2]<<" "<<endl;
      cout<<" cpu: "<<avg[0]<<" "<<avg[1]<<" "<<avg[2]<<" "<<endl;

      // Purely for vanity, draw the buffers
      glUseProgram(0);
      glBindFramebufferEXT(GL_FRAMEBUFFER_EXT,0);
      glBindRenderbufferEXT(GL_RENDERBUFFER_EXT,0);
      glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
      for(odd = 0;odd<2;odd++)
      {
        glViewport(odd*s,0,s,s);
        glEnable(GL_TEXTURE_2D);
        glBindTexture(GL_TEXTURE_2D,tex_id[odd]);
        glBegin(GL_QUADS);
        glTexCoord2f(0,0);
        glVertex2f  (-1,-1);
        glTexCoord2f(1,0);
        glVertex2f  (1,-1);
        glTexCoord2f(1,1);
        glVertex2f  (1,1);
        glTexCoord2f(0,1);
        glVertex2f  (-1,1);
        glEnd();
      }
      glutSwapBuffers();
    }
   );
  init_render_to_texture(s,s,tex_id[0],fbo_id[0],dfbo_id[0]);
  init_render_to_texture(s,s,tex_id[1],fbo_id[1],dfbo_id[1]);


  // Vertex shader is a **true** pass-through
  const std::string vertex_shader = R"(
#version 120
void main()
{
  gl_Position = gl_Vertex;
  gl_TexCoord[0] = gl_MultiTexCoord0;
}
)";
  // fragment shader sums texture of this pixel and left/bottom neighbors
  const std::string fragment_shader = R"(
#version 120
// size of a half-texel in 1/pixels units: 0.5/(full size)
uniform float dt;
uniform sampler2D texture;
void main()
{
  gl_FragColor = 0.25*(
    texture2D(texture, gl_TexCoord[0].st - vec2( 0, 0)) +
    texture2D(texture, gl_TexCoord[0].st - vec2(dt, 0)) +
    texture2D(texture, gl_TexCoord[0].st - vec2(dt,dt)) +
    texture2D(texture, gl_TexCoord[0].st - vec2( 0,dt)));
}
)";

  // Compile and link reduction shaders into program
  const auto & compile_shader = [](const GLint type, const char * str)->GLuint
  {
    GLuint id = glCreateShader(type);
    glShaderSource(id,1,&str,NULL);
    glCompileShader(id);
    return id;
  };
  GLuint vid = compile_shader(GL_VERTEX_SHADER,vertex_shader.c_str());
  GLuint fid = compile_shader(GL_FRAGMENT_SHADER,fragment_shader.c_str());
  reduction_prog_id = glCreateProgram();
  glAttachShader(reduction_prog_id,vid);
  glAttachShader(reduction_prog_id,fid);
  glLinkProgram(reduction_prog_id);

  glutMainLoop();
}

GL_COMPILE_AND_EXECUTE is slow (but apparently everybody knew that)

Thursday, October 11th, 2012

Although it doesn’t seem to be deprecated, the glNewList() option GL_COMPILE_AND_EXECUTE is not properly supported by nVidia GPUs. On my linux machine, we installed a fancy pants nVidia card. So I was surprised to find out that my code ran slower there than on my dinky mac. After a long debugging session I found that it was all GL_COMPILE_AND_EXECUTE’s fault. I was displaying a mesh each frame using code that looked like:


  if(!display_list_compiled)
  {
    dl_id = glGenLists(1);
    glNewList(dl_id,GL_COMPILE_AND_EXECUTE);
    ... //draw mesh
    glEndList();
    display_list_compiled = true;
  }else
  {
    glCallList(dl_id);
  }

I expected that this would be slow the first time. But in fact it was significantly slower (by a factor of 100) on every frame, even though it was using the display list! I guess that passing GL_COMPILE_AND_EXECUTE creates a very badly organized display list and then you’re punished every time you use it.

The solution is of course trivial:


  if(!display_list_compiled)
  {
    dl_id = glGenLists(1);
    glNewList(dl_id,GL_COMPILE);
    ... //draw mesh
    glEndList();
    display_list_compiled = true;
  }
  glCallList(dl_id);

but $#%^&*! what a waste of time.

Force vim to use specific file type

Tuesday, September 27th, 2011

I recently downloaded the glsl syntax highlighting plugin for vim. It perfectly recognizes my .vert and .frag files. But as soon as I added the line:


#version 120

vim started thinking the filetype was not glsl but conf.

The cause seems to be that the glsl plugin suggests a rather weak form of recognizing the filetype. It suggests adding the following to your .vimrc file:


au BufNewFile,BufRead *.frag,*.vert,*.fp,*.vp,*.glsl setf glsl 

But setf only sets the filetype if it hasn’t already been set. To instead override whatever vim is doing to think that the file is conf, you must use the only slightly different:


au BufNewFile,BufRead *.frag,*.vert,*.fp,*.vp,*.glsl set filetype=glsl