Originally Posted by Slicker
...just downloaded the source. There are some easy minor optimizations to be made, things like:
float c = a * b;
float d = c * c;
float e = d * c;
That's really the same as (a*b) * (a*b) * (a*b), i.e. (a*b)^3, which can be done like:
float e = pow(a*b,3);
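For anyone who wants to check that the two forms above agree, here is a minimal sketch. The function names are hypothetical and not from the POEM source, and a and b just stand in for the values in the snippet. Which form is actually faster depends on the compiler and the hardware, so it is worth timing both before changing anything.

```cpp
#include <cmath>

// Hypothetical helper: the cube built from explicit multiplications,
// mirroring the c/d/e temporaries in the quoted snippet.
float cube_by_multiply(float a, float b) {
    float c = a * b;   // a*b
    float d = c * c;   // (a*b)^2
    return d * c;      // (a*b)^3
}

// Hypothetical helper: the same quantity through the library power function.
float cube_by_pow(float a, float b) {
    return std::pow(a * b, 3.0f);  // (a*b)^3, equal up to rounding
}
```

Both functions compute (a*b)^3; the results may differ in the last bit or so because the rounding steps are not identical.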
The big optimizations would come from getting rid of all the loops in the GPU code. Loops, especially ones with branching logic inside them, cause stream processors to sit idle. POEM has multiple nested loops. I'm sure that's why the utilization is so low.
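To illustrate the kind of change meant by removing branches inside a loop, here is a small sketch in plain C++. It is not taken from the POEM code; the function names and the sum-of-positives task are made up for the example. The idea is to replace an if inside the loop body with a select, so every iteration does the same work regardless of the data.

```cpp
#include <cstddef>

// Hypothetical example: the loop body branches on the data, so on a GPU
// neighbouring threads could end up taking different paths.
float sum_positive_branchy(const float* x, std::size_t n) {
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        if (x[i] > 0.0f) {
            s += x[i];
        }
    }
    return s;
}

// Same result, but every iteration performs one add; the condition only
// selects which value is added (x[i] or zero).
float sum_positive_select(const float* x, std::size_t n) {
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        s += (x[i] > 0.0f) ? x[i] : 0.0f;
    }
    return s;
}
```

Whether this helps in practice depends on the compiler and the GPU, since many compilers already turn short branches like this into selects on their own.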