Posts Tagged ‘opengl’

GL_COMPILE_AND_EXECUTE is slow (but apparently everybody knew that)

Thursday, October 11th, 2012

Although it doesn’t seem to be deprecated, the glNewList() option GL_COMPILE_AND_EXECUTE is not properly supported by nVidia GPUs. We installed a fancy-pants nVidia card in my Linux machine, so I was surprised to find that my code ran slower there than on my dinky Mac. After a long debugging session I found that it was all GL_COMPILE_AND_EXECUTE’s fault. I was displaying a mesh each frame using code that looked like:


  if(!display_list_compiled)
  {
    dl_id = glGenLists(1);
    glNewList(dl_id,GL_COMPILE_AND_EXECUTE);
    ... //draw mesh
    glEndList();
    display_list_compiled = true;
  }else
  {
    glCallList(dl_id);
  }

I expected this to be slow the first time. But in fact it was significantly slower (a factor of 100) on every frame, even though the later frames were using the display list. I guess that passing GL_COMPILE_AND_EXECUTE creates a very badly organized display list and then you’re punished every time you use it.

The solution is of course trivial:


  if(!display_list_compiled)
  {
    dl_id = glGenLists(1);
    glNewList(dl_id,GL_COMPILE);
    ... //draw mesh
    glEndList();
    display_list_compiled = true;
  }
  glCallList(dl_id);

but $#%^&*! what a waste of time.

Compiling and running Off Screen Mesa (OSMesa) demo (osdemo.c) on mac os x

Tuesday, October 9th, 2012

I installed mesa on my mac using macports:


sudo port install mesa

But I was sad to find out that the off-screen renderer (the whole reason I got it) didn’t work out of the box. The problem seems to be that things aren’t meshing well with the GLU implementation. I tried to compile the mesa demo file osdemo.c with:


gcc -o osdemo osdemo.c -I/opt/local/include/ -L/opt/local/lib/ -lOSMesa -lglu

which compiled but produced a funny result:
mesa wrong output with bad glu implementation

This picture looked ok at first, but then I realized that there should also be a green cone and a blue sphere in the image. These objects were rendered using glu*() commands in osdemo.c, but somehow had no effect, despite compiling correctly.

I also tried linking against the native GLU implementation in the mac os x OpenGL framework and the GLU implementation in my /usr/X11/lib directory. Both compiled but produced runtime errors:


Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x000000000000000c
0x00007fff8c8b2a7d in gluQuadricDrawStyle ()
(gdb) bt
#0  0x00007fff8c8b2a7d in gluQuadricDrawStyle ()
#1  0x00000001000011a6 in Cone ()
#2  0x000000010000172d in render_image ()
#3  0x0000000100000e26 in main ()

I finally fixed the problem by reinstalling GLU on its own from mesa’s repo:


./autogen.sh
./configure --enable-osmesa --prefix=/usr/local/
make
sudo make install

Now I can compile osdemo again (note that I’ve placed -L/usr/local/lib/ before -L/opt/local/lib/):


gcc -o osdemo osdemo.c -I/opt/local/include/ -L/usr/local/lib/ -L/opt/local/lib/ -lOSMesa -lglu

And when I run it I get the correct picture:
correct osdemo output with correct glu implementation

Note: A new version of mesa was just released yesterday and may fix this issue. I don’t know yet, because the mesa source does not compile out of the box for mac os x and the macports version has not yet been updated.

Compile and run mesa on bluehost web server

Sunday, October 7th, 2012

I want to use the off-screen renderer of Mesa in a php script on my Bluehost-served website. Compiling Mesa on my mac was dead simple (sudo port install mesa), but doing it on the Linux server without root access or repositories was a bit tricky. Here’s how I finally got it to work.

Download and compile llvm, if it’s not around already. I found that version 3.1 didn’t play nicely with Mesa but 3.0 did. LLVM installed smoothly.


./configure --prefix=[INSTALL_PREFIX]
make -j5
make install

Next, grab the latest glproto headers. As far as I can tell, there is nothing to compile as only headers are needed.

Download mesa, unzip and compile using the following:


# Set up glproto headers
export GLPROTO_LIBS=../glproto-1.4.16/;
export GLPROTO_CFLAGS=../glproto-1.4.16/;
# configure, disabling DRI support (i.e. graphics card support)
./configure --prefix=[INSTALL_PREFIX] --disable-driglx-direct --enable-xlib-glx --enable-osmesa --disable-dri
make -j5
make install

Then I got the Mesa demos and made sure I could compile src/osdemos/osdemo.c:


gcc -o osdemo osdemo.c -I[INSTALL_PREFIX]/include -L[INSTALL_PREFIX]/lib -lOSMesa -lGLU

Upon running osdemo, you might see:


./osdemo: error while loading shared libraries: libOSMesa.so.8: cannot open shared object file: No such file or directory

But this is fixed by adding your library install path to the LD_LIBRARY_PATH variable:


export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:[INSTALL_PREFIX]/lib/

or in php:


putenv("LD_LIBRARY_PATH=".$_ENV["LD_LIBRARY_PATH"].":[INSTALL_PREFIX]/lib/");

If it works then you can run the program with:


./osdemo foo.tga

and produce an image like:
output of osdemo of mesa demos running on web server

Cheap tricks for OpenGL transparency

Wednesday, September 5th, 2012

I’ve been experimenting with some hacks using basic OpenGL commands to achieve high/good-quality transparency (alpha-blending).

I’ll compare 4 methods.

  1. Traditional: For this one (and all the others) we turn on alpha blending:
    
    glEnable(GL_BLEND);
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
    

    and then just use a normal depth test without back face culling in a single pass:

    
    glDisable(GL_CULL_FACE);
    glDepthFunc(GL_LEQUAL);
    // render teapot with alpha = alpha
    

    single pass teapot
    single pass teapot
    Notice two things. The handle is not visible in the second picture and we have zebra stripe artifacts in the first picture due to alpha blending without sorting. Sorting is slow and view-dependent and won’t even guarantee correct results. So instead we’ll see how far we can get with cheap tricks.

  2. GL_GREATER: This is a 3-pass attempt.
    
    double f = 0.75; // Or some other factor
    glDisable(GL_CULL_FACE);
    glDepthFunc(GL_LEQUAL);
    // render teapot with alpha = f*alpha
    
    glDisable(GL_CULL_FACE);
    glDepthFunc(GL_GREATER);
    // render teapot with alpha = alpha
    
    glDisable(GL_CULL_FACE);
    glDepthFunc(GL_LEQUAL);
    // render teapot with alpha = (alpha-f*alpha)/(1.0-f*alpha);
    

    This often produces reasonable results but the idea is a bit strange. It’s sort of like depth peeling from opposite ends, hoping that if we only have two layers, we’ll catch both of them.
    three pass teapot greater
    three pass teapot greater
    Notice that now the handle shows up in the second image, but we still have ordering artifacts. Admittedly a lot of these would go away with back face culling, but then we don’t get two-faced rendering (blue inside, gold outside).

  3. ALWAYS: This method will seem like a step back because it introduces more ordering artifacts due to the two-face rendering, but the concept is a bit simpler.
    
    double f = 0.75; // Or some other factor
    glDisable(GL_CULL_FACE);
    glDepthFunc(GL_ALWAYS);
    // render teapot with alpha = f*alpha
    
    glDisable(GL_CULL_FACE);
    glDepthFunc(GL_LEQUAL);
    // render teapot with alpha = (alpha-f*alpha)/(1.0-f*alpha);
    
    The idea is again the same: by taking two passes we first pick up a little bit of everything, albeit not sorted in the correct order. Then we finish with one pass that ensures that the top layer is correct.
    three pass teapot greater
    three pass teapot greater
    We’re getting artifacts from the ordering in the first image, but I’ll show in a second that these are just coming from not handling the two-face rendering correctly. What we see in both cases is that the bottom of the teapot shows clearly through the bottom.

  4. QUINTUPLE PASS: This is the final solution that I arrived at. It involves five passes, which, using display lists, should be OK for many real-time applications.
    
    glDisable(GL_CULL_FACE);
    glDepthFunc(GL_LESS);
    // render teapot with alpha = 0, to prime the depth buffer
    
    glEnable(GL_CULL_FACE);
    glCullFace(GL_FRONT);
    glDepthFunc(GL_ALWAYS);
    // render teapot with alpha = f*alpha
    
    glEnable(GL_CULL_FACE);
    glCullFace(GL_FRONT);
    glDepthFunc(GL_LEQUAL);
    // render teapot with alpha = (alpha-f*alpha)/(1.0-f*alpha)
    
    
    glEnable(GL_CULL_FACE);
    glCullFace(GL_BACK);
    glDepthFunc(GL_ALWAYS);
    // render teapot with alpha = f*alpha
    
    // There's a trade off here. With culling enabled then a perfectly
    // opaque object (alpha=1) may be wrong. With it disabled, ordering
    // artifacts may appear
    // glEnable(GL_CULL_FACE);
    // glCullFace(GL_BACK);
    glDisable(GL_CULL_FACE);
    glDepthFunc(GL_LEQUAL);
    // render teapot with alpha = (alpha-f*alpha)/(1.0-f*alpha)
    

    The idea is that first we prepare the depth buffer with the final depth values, then we use the previous method but just to show the inside “back-facing” surface, then we do the same to show the front.
    three pass teapot greater
    three pass teapot greater

    Now we get two very nice renderings that even blend with the background well. We can continuously blend our alpha value to 1 and expect to see a perfectly correct opaque rendering. There are still some small ordering artifacts in the top image near the handle. If we know we want a transparent object then we can flip back face culling on for the last pass to get rid of these. But as noted, this will give an incorrect result when alpha ≈ 1.
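
The recurring second-pass value alpha = (alpha-f*alpha)/(1.0-f*alpha) can be checked with a little arithmetic. With the blend function used above, drawing the same surface twice with opacities a1 and then a2 covers the background by a1 + (1-a1)*a2, so setting a1 = f*alpha and asking for a total of alpha gives that expression for a2. Here is a minimal C sketch of that reading. The function names are mine, not from the original code, and the guard against a zero denominator is an added assumption.

```c
/* Combined opacity after drawing the same surface twice with
 * glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA):
 * first with opacity a1, then with opacity a2 on top. */
double composite_alpha(double a1, double a2)
{
  return a1 + (1.0 - a1) * a2;
}

/* Opacity to use in the second pass so that, after a first pass drawn
 * with opacity f*alpha, the two passes together add up to alpha. */
double second_pass_alpha(double alpha, double f)
{
  double first = f * alpha;
  double denom = 1.0 - first;
  if (denom <= 0.0)
  {
    /* Hypothetical guard: the first pass was already fully opaque,
     * so the second-pass value no longer matters. */
    return 1.0;
  }
  return (alpha - first) / denom;
}
```

For example, with alpha = 0.5 and f = 0.75 the first pass uses 0.375 and the second uses 0.2, and composite_alpha(0.375, 0.2) comes back to 0.5. With alpha = 1 the second pass is fully opaque for any f < 1, which matches the observation that the result blends continuously into a correct opaque rendering.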

Small RGB texture in openGL all white

Monday, May 21st, 2012

I was trying to load a small 2 pixel by 2 pixel texture in OpenGL and damned if it didn’t keep coming out all white. My mistake was storing the texture as tightly packed RGB (3 components per pixel) and not setting:


glPixelStorei(GL_UNPACK_ALIGNMENT, 1);

since OpenGL expects each row to be aligned to 4 bytes by default (which would be fine for RGBA, for example).
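
To see why a 2×2 RGB texture trips over this, it helps to compute the row size OpenGL assumes. The sketch below is a rough illustration, not driver code: the function name is made up, and it assumes one byte per color component (my assumption; with 4-byte components every row is already a multiple of 4 and no padding arises).

```c
#include <stddef.h>

/* Number of bytes OpenGL assumes per row of client pixel data, for
 * one-byte components, given the GL_UNPACK_ALIGNMENT value.
 * The tightly packed row is rounded up to a multiple of the alignment. */
size_t unpack_row_stride(size_t width, size_t bytes_per_pixel, size_t alignment)
{
  size_t row = width * bytes_per_pixel;
  return (row + alignment - 1) / alignment * alignment;
}
```

With the default alignment of 4, a row of two RGB pixels (6 bytes) is read as if it were 8 bytes long, so the second row starts in the wrong place. With alignment 1 the stride is the true 6 bytes, and for RGBA the row is 8 bytes either way.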

Matlab face/edge alpha and phong lighting

Wednesday, February 29th, 2012

Turns out matlab’s figure renderers can’t handle transparency and phong lighting simultaneously. I noticed this when trying to have transparent edges overlaid on a phong shaded surface.
Matlab documentation reads:

You do not specify Phong lighting (OpenGL does not support Phong lighting; if you specify Phong lighting, MATLAB uses the ZBuffer renderer).

Or

Figure objects use transparency (OpenGL is the only MATLAB renderer that supports transparency).

So that’s that. Hopefully Matlab will fix this in the future. For now I will probably just composite figures in Photoshop if need be.

Indexing array of uniforms by variable in GLSL on ATI/AMD graphics card

Thursday, December 22nd, 2011

After a month or two of frustration I’ve finally understood why my GLSL vertex shader has been running so slowly on my iMac’s AMD Radeon HD 6970M 2048 MB graphics card. I thought the problem was that when I index an array of uniforms using a variable (attribute), the shader switches the renderer into “software mode” (Apple Software Renderer). To get around this I had a super-hack in my shader that was very slow. I wanted to achieve:


...
attribute vec4 indices;
uniform mat4 T[LARGE_NUMBER];
void main()
{
  ... 
  mat4 t =  T[int(indices[0])];
  ...
}

But since I thought I couldn’t index by a variable I just made a function to index by a variable using if statements:


mat4 T_at_i(int i)
{
  if(i==0) return T[0];
  else if(i==1) return T[1];
  else if(i==2) return T[2];
  else if(i==3) return T[3];
  else if(i==4) return T[4];
  ...
}

These if statements, of course, destroyed the efficiency of my shader.

But it turns out that indexing by variable is not really the problem. It was more of a symptom. I had too many uniform components. I shirk some of the blame because ATI returns the wrong number when you ask for the number of components via GL_MAX_VERTEX_UNIFORM_COMPONENTS. Rather than the correct number (for my machine 1024) it returns 4 times that number (for my machine 4096). Thus I thought I had no problem with maxing out my allotted uniform memory. This is documented in very confusing language on the OpenGL wiki and in slightly less confusing language in the answer to my question on stackoverflow.

The point is that when you ask for too many uniforms and you index them by a variable, your shader still compiles and you get no complaints, but secretly the graphics card is giving up and making the software renderer activate (super slow). If you don’t index by variable you can use however many uniforms you want and the graphics card will still run the shader, but you have to use slow hacks like the one above to act like you’re indexing by a variable. Finally, the “solution” is to use the right number of uniforms (after dividing the number ATI reports by a possible factor of 4) and then you’re free to index them by variables and have the shader run on the card and not in software mode.
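
As a rough budget check, here is a small C sketch of the arithmetic. It assumes each mat4 costs 16 uniform components and that the reported limit has to be divided by the over-report factor described above; the function name and the reserved_components parameter (for whatever other uniforms the shader uses) are hypothetical, and real drivers may count things differently.

```c
/* Estimate how many mat4 uniforms fit in the vertex shader.
 * reported_components: value returned for GL_MAX_VERTEX_UNIFORM_COMPONENTS
 * overreport_factor:   1 if the driver is honest, 4 for the ATI case above
 * reserved_components: components used by other uniforms (assumption) */
int max_mat4_uniforms(int reported_components, int overreport_factor,
                      int reserved_components)
{
  const int components_per_mat4 = 16; /* 4 columns of 4 floats */
  int usable;
  if (overreport_factor <= 0)
  {
    return 0;
  }
  usable = reported_components / overreport_factor - reserved_components;
  if (usable < 0)
  {
    return 0;
  }
  return usable / components_per_mat4;
}
```

On my numbers, max_mat4_uniforms(4096, 4, 0) gives 64, the same as an honest report of 1024, so LARGE_NUMBER needs to stay at or below that, less whatever the rest of the shader uses.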

glUniform invalid operation mystery

Sunday, October 2nd, 2011

I burned way too many hours tracking down a foolish bug I had in my GLSL/OpenGL code. The bug had to do with setting uniform variables in my GLSL vertex shader. My shader code looked something like:


uniform mat4 my_matrices[100];
uniform int my_int;
void main()
{
  // ... something using my_int and my_matrices
}

I would then try to set them in my C++ OpenGL code with:


glUniformMatrix4fv(my_matrices_location,100,false,my_matrices_data);
glUniform1i(my_int_location,my_int_value);

This should work as long as my_matrices_location and my_int_location are set properly.

Here’s how I was setting them INCORRECTLY:


// get number of active uniforms
GLint n = 200;
glGetProgramiv(my_prog_id,GL_ACTIVE_UNIFORMS,&n);
// get max uniform name length
GLint max_name_length;
glGetProgramiv(my_prog_id,GL_ACTIVE_UNIFORM_MAX_LENGTH,&max_name_length);
// buffer for name
GLchar * name = new GLchar[max_name_length];
// buffer for length
GLsizei length = 100;
// Type of variable
GLenum type;
// Count of variables
GLint size;
// loop over active uniforms getting each's name
for(GLuint u = 0;u < n;u++)
{
  glGetActiveUniform(my_prog_id,u,max_name_length,&length,&size,&type,name);
  if(string(name) == "my_matrices[0]")
  {
    my_matrices_location = u;
  }else if(string(name) == "my_int")
  {
    my_int_location = u;
  }
}

The mistake is that I’m confusing the variable’s index, which I pass to glGetActiveUniform, with its location, which cannot be retrieved by glGetActiveUniform. Instead I need to retrieve the variable’s location with glGetUniformLocation.

Here’s how I’m now setting my uniform variable locations CORRECTLY:


my_matrices_location = glGetUniformLocation(my_prog_id,"my_matrices[0]");
my_int_location = glGetUniformLocation(my_prog_id,"my_int");

Basic OpenGL Cocoa App using C

Thursday, September 22nd, 2011

I’m rewriting this tutorial for Xcode 4 on Mac OS X 10.6. For whatever reason, when I tried it on my computer it gave compilation errors. The only thing I changed is the code inside glView.mm and glView.h.

- create a new Cocoa Application
- create glView.h/glView.mm
- copy and paste in this code and save files
- add frameworks: QuartzCore and OpenGL
- open up MainMenu.xib
- drag Custom View into main window and resize to fill window
- set the autosizing of view to have the two inner arrows on so it resizes automatically with the window
- set the class of the Custom View to be glView
- save nib
- go back to Xcode and run project

Should show a yellow window. Here’s the code:

glView.h

#import <Cocoa/Cocoa.h>

// for display link
#import <QuartzCore/QuartzCore.h>

@interface glView : NSOpenGLView
{
  CVDisplayLinkRef displayLink;
  
  double    deltaTime;
  double    outputTime;
  float    viewWidth;
  float    viewHeight;
}

@end

glView.mm


#import "glView.h"

@interface glView (InternalMethods)

- (CVReturn)getFrameForTime:(const CVTimeStamp *)outputTime;
- (void)drawFrame;

@end

@implementation glView

#pragma mark -
#pragma mark Display Link

static CVReturn MyDisplayLinkCallback(CVDisplayLinkRef displayLink, const CVTimeStamp *now,
                                      const CVTimeStamp *outputTime, CVOptionFlags flagsIn,
                                      CVOptionFlags *flagsOut, void *displayLinkContext)
{
  // go back to Obj-C for easy access to instance variables
  CVReturn result = [(glView *)displayLinkContext getFrameForTime:outputTime];
  return result;
}

- (CVReturn)getFrameForTime:(const CVTimeStamp *)outputTime
{
  // deltaTime is unused in this bare bones demo, but here's how to calculate it using display link info
  deltaTime = 1.0 / (outputTime->rateScalar * (double)outputTime->videoTimeScale / (double)outputTime->videoRefreshPeriod);
  
  [self drawFrame];
  
  return kCVReturnSuccess;
}

- (void)dealloc
{
  CVDisplayLinkRelease(displayLink);
  
  [super dealloc];
}

- (id)initWithFrame:(NSRect)frameRect
{
  // context setup
  NSOpenGLPixelFormat        *windowedPixelFormat;
  NSOpenGLPixelFormatAttribute    attribs[] = {
    NSOpenGLPFAWindow,
    NSOpenGLPFAColorSize, 32,
    NSOpenGLPFAAccelerated,
    NSOpenGLPFADoubleBuffer,
    NSOpenGLPFASingleRenderer,
    0 };
  
  windowedPixelFormat = [[NSOpenGLPixelFormat alloc] initWithAttributes:attribs];
  if (windowedPixelFormat == nil)
  {
    NSLog(@"Unable to create windowed pixel format.");
    exit(0);
  }
  self = [super initWithFrame:frameRect pixelFormat:windowedPixelFormat];
  if (self == nil)
  {
    NSLog(@"Unable to create a windowed OpenGL context.");
    exit(0);
  }
  [windowedPixelFormat release];
  
  // set synch to VBL to eliminate tearing
  GLint    vblSynch = 1;
  [[self openGLContext] setValues:&vblSynch forParameter:NSOpenGLCPSwapInterval];
  
  // set up the display link
  CVDisplayLinkCreateWithActiveCGDisplays(&displayLink);
  CVDisplayLinkSetOutputCallback(displayLink, MyDisplayLinkCallback, self);
  CGLContextObj cglContext = (CGLContextObj)[[self openGLContext] CGLContextObj];
  CGLPixelFormatObj cglPixelFormat = (CGLPixelFormatObj)[[self pixelFormat] CGLPixelFormatObj];
  CVDisplayLinkSetCurrentCGDisplayFromOpenGLContext(displayLink, cglContext, cglPixelFormat);
  
  return self;
}

- (void)awakeFromNib
{
  NSSize    viewBounds = [self bounds].size;
  viewWidth = viewBounds.width;
  viewHeight = viewBounds.height;
  
  // activate the display link
  CVDisplayLinkStart(displayLink);
}

- (void)reshape
{
  NSSize    viewBounds = [self bounds].size;
  viewWidth = viewBounds.width;
  viewHeight = viewBounds.height;
  
  NSOpenGLContext    *currentContext = [self openGLContext];
  [currentContext makeCurrentContext];
  
  // remember to lock the context before we touch it since display link is threaded
  CGLLockContext((CGLContextObj)[currentContext CGLContextObj]);
  
  // let the context know we've changed size
  [[self openGLContext] update];
  
  CGLUnlockContext((CGLContextObj)[currentContext CGLContextObj]);
}

- (void)drawRect:(NSRect)rect
{
  [self drawFrame];
}

- (void)drawFrame
{
  NSOpenGLContext    *currentContext = [self openGLContext];
  [currentContext makeCurrentContext];
  
  // must lock GL context because display link is threaded
  CGLLockContext((CGLContextObj)[currentContext CGLContextObj]);
  
  glViewport(0, 0, viewWidth, viewHeight);
  
  // Draw something that changes over time to prove to yourself that it's really updating in a tight loop
  glClearColor(
    sin(CFAbsoluteTimeGetCurrent()),
    sin(7.0*CFAbsoluteTimeGetCurrent()),
    sin(CFAbsoluteTimeGetCurrent()/3.0),0);
  glClear(GL_COLOR_BUFFER_BIT);
  
  // draw here
  
  [currentContext flushBuffer];
  
  CGLUnlockContext((CGLContextObj)[currentContext CGLContextObj]);
}

@end
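
The deltaTime expression in getFrameForTime is easier to sanity check in isolation. The sketch below repeats the same arithmetic in plain C, assuming the three arguments carry the values of the CVTimeStamp fields rateScalar, videoTimeScale and videoRefreshPeriod; the function name and the zero guards are my additions.

```c
#include <stdint.h>

/* Seconds per frame from display link timing values:
 * refresh rate (Hz) = rate_scalar * video_time_scale / video_refresh_period */
double frame_delta_seconds(double rate_scalar, int32_t video_time_scale,
                           int64_t video_refresh_period)
{
  double refresh_hz;
  if (video_refresh_period == 0)
  {
    return 0.0; /* hypothetical guard: no timing information yet */
  }
  refresh_hz = rate_scalar * (double)video_time_scale /
               (double)video_refresh_period;
  if (refresh_hz <= 0.0)
  {
    return 0.0;
  }
  return 1.0 / refresh_hz;
}
```

For instance, a time scale of 60000 ticks per second with a refresh period of 1000 ticks and a rate scalar of 1.0 is a 60 Hz display, so the function returns about 0.0167 seconds.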

TexMapPreview: simple texture mapping utility

Wednesday, September 21st, 2011

texmappreview simple texture mapping utility working on woody
I posted the source and binary of TexMapPreview. It’s a little texture mapping utility I’ve been using to visualize texture maps on the meshes I deform. It takes as input a mesh (with texture coordinates) and a (texture) image. Then it can either write the visualization of the texture mapped mesh to an output file or display it in a GLUT window. GLUT is the only dependency.

TexMapPreview itself can only read and write .tga image files. But, I include a bash script wrapper which uses ImageMagick’s convert tool to enable reading and writing of all sorts of file formats (.png, .jpg, .tiff, whatever convert can read/write).