jPCT benchmark?

EgonOlsen · March 16, 2009, 07:30:34 AM

Quote from: JavaMan on March 15, 2009, 11:30:02 PM
So, the only option that is lost from going non-compiled to compiled would be ability to change texture UVs on the fly?

Plus the limit on the number of texture layers and some more or less minor things where the pipelines differ due to the way OpenGL does some things (specular lighting looks a little bit different, as does environment mapping, plus the mentioned Config.lightMul=1 of the compiled pipeline...that kind of stuff). But that's nothing to go crazy about...

JavaMan · March 16, 2009, 01:43:09 PM

Oh, yes. I forgot about the texture layer limit and lightMul.

EgonOlsen · March 24, 2009, 02:06:21 PM

Quote from: .jayderyu on March 15, 2009, 08:57:57 PM
Also, I was wondering: since there is a new pipeline being made, might it become easier to get access to the shader tech? I would really like to see cel shading someday.

I've written a test shader (which sets every fragment's color to green...) and hooked it into the pipeline by using the IRenderHook that I've specified for this. It worked fine. The code is pretty simple:

import java.nio.*;
import com.threed.jpct.*;
import org.lwjgl.*;
import org.lwjgl.opengl.*;

public class MyFirstShader implements IRenderHook {

	// GLSL fragment shader: every fragment becomes opaque green.
	private String myShaderSource="void main() {gl_FragColor = vec4(0.0,1.0,0.0,1.0);}";
	private int prg=0;        // program object handle
	private int fragShade=0;  // fragment shader object handle
	private boolean init=false;
	
	public void beforeRendering(int polyID) {
		// Build the shader lazily on first use, then activate it.
		if (!init) {
			init();
		}
		ARBShaderObjects.glUseProgramObjectARB(prg);
	}

	public void afterRendering(int polyID) {
		// Switch back to the fixed function pipeline.
		ARBShaderObjects.glUseProgramObjectARB(0);
	}

	public void onDispose() {
		ARBShaderObjects.glDeleteObjectARB(fragShade);
		ARBShaderObjects.glDeleteObjectARB(prg);
	}

	public boolean repeatRendering() {
		return false;
	}
	
	private void init() {
		prg=ARBShaderObjects.glCreateProgramObjectARB();
		fragShade=ARBShaderObjects.glCreateShaderObjectARB(ARBFragmentShader.GL_FRAGMENT_SHADER_ARB);
		
		// Copy the source into a direct buffer.
		// Note: getBytes() uses the platform's default charset.
		byte[] src=myShaderSource.getBytes();
		ByteBuffer shader = BufferUtils.createByteBuffer(src.length);
		shader.put(src);
		shader.flip();
		
		ARBShaderObjects.glShaderSourceARB(fragShade, shader);
		
		// Compile, attach and link. The compile/link status isn't checked here.
		ARBShaderObjects.glCompileShaderARB(fragShade);
		ARBShaderObjects.glAttachObjectARB(prg, fragShade);
		ARBShaderObjects.glLinkProgramARB(prg);
		
		Logger.log("Shader compiled!", Logger.MESSAGE);
		
		init=true;
	}
}
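
A side note on the buffer handling in the listing above: it turns the shader source into bytes with getBytes(), which depends on the platform's default charset. The following is only a minimal sketch of doing that step with an explicit charset and plain java.nio, so it runs without LWJGL or jPCT. The class and method names (ShaderSourceBuffers, toDirectBuffer) are made up for illustration and are not part of jPCT or LWJGL, and StandardCharsets needs Java 7 or later. It assumes the GLSL source is plain ASCII, which holds for the shader above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of jPCT or LWJGL.
final class ShaderSourceBuffers {

	private ShaderSourceBuffers() {}

	// Encodes the GLSL source as ASCII into a direct, native-ordered buffer.
	static ByteBuffer toDirectBuffer(String glslSource) {
		byte[] src = glslSource.getBytes(StandardCharsets.US_ASCII);
		ByteBuffer buf = ByteBuffer.allocateDirect(src.length).order(ByteOrder.nativeOrder());
		buf.put(src);
		buf.flip(); // position=0, limit=src.length, ready to be read
		return buf;
	}

	public static void main(String[] args) {
		String source = "void main() {gl_FragColor = vec4(0.0,1.0,0.0,1.0);}";
		ByteBuffer buf = toDirectBuffer(source);
		System.out.println("direct=" + buf.isDirect() + ", bytes=" + buf.remaining());
	}
}
```

The returned buffer is in the same state (flipped, direct) as the one the listing hands to glShaderSourceARB, so it should be usable as a drop-in for that step.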

EgonOlsen · March 24, 2009, 08:49:28 PM

Quote from: JavaMan on March 13, 2009, 09:40:37 PM
If you keep increasing fps at this rate, I won't be able to post because of my keyboard shorting out with all my drool.

Beethoven is at 3700fps now on my machine. I don't think it'll go much further. Maybe 4000 is possible somehow, but I think I'm getting close to the limit of this setup...

Edit: It's @4000fps now...

EgonOlsen · March 27, 2009, 10:35:45 AM

Small update: Beethoven is @4200fps now...

Apart from Beethoven, I changed Robombs to use compiled objects for most of the entities in the game. I discovered two small bugs while doing so, which are fixed now. Now it just works fine...it didn't help much performance-wise, because Robombs is quite low-poly anyway.
However, I consider this to be an important milestone for this new feature. I think we are going beta with it really soon...(fingers crossed...).

fireside · March 27, 2009, 04:37:16 PM

I'm actually a little more excited about the shader hooks, even though I know nothing about shaders at this point.

.jayderyu · March 27, 2009, 04:40:31 PM

I admit that I'm pretty excited about the shader hook and the extra FPS. I don't think my projects will ever really need it, but who knows. With the shader hook, I will certainly be looking at jPCT for a bigger project I may start working on someday.

EgonOlsen · March 27, 2009, 08:04:21 PM

...well, then...here's a screen shot of the famous "all green shader" posted above:

(That's really a screen shot of it, not just something that i made with Photoshop...)

fireside · March 28, 2009, 12:56:56 AM

All Right!!! I can see I'll have some shader studying to do for my next project after the one I'm currently working on.

raft · March 28, 2009, 01:41:52 AM

very impressive Egon

it's very nice to come back to jPCT forum from time to time and see major improvements like this

so does this mean all the disadvantages of jPCT for large scenes are gone? i mean because of the overhead of its hybrid pipeline.. i also wonder how octrees fit into this pipeline: will they lose their significance?

again, really very impressive

r a f t

.jayderyu · March 28, 2009, 04:48:36 AM

Quote from: fireside on March 28, 2009, 12:56:56 AM
All Right!!! I can see I'll have some shader studying to do for my next project after the one I'm currently working on.

You and me both, but then again maybe someone will make a cel shader that I can use.

EgonOlsen · March 28, 2009, 12:24:18 PM

Quote from: raft on March 28, 2009, 01:41:52 AM
so does this mean all the disadvantages of jPCT for large scenes are gone? i mean because of the overhead of its hybrid pipeline.. i also wonder how octrees fit into this pipeline: will they lose their significance?

Yes, that overhead is gone for compiled objects. Currently, a compiled object still consumes a little bit more memory than it has to in some cases, but I'll fix this in a later version.
About octrees...actually no. They are still quite useful, but not necessarily for rendering. For collision detection, they still help a lot. For rendering...I'm undecided. The tests that I made showed a performance decrease when using them for rendering, because they tend to split the batches too much. When you compile an object, it will be split into a number of batches based on states (i.e. textures, blending modes...). If you use an octree in addition, the splitting includes the tree's leaves too. On a large terrain, this may still help...on a Quake3 level, it doesn't. That's the reason why I've added the option to use an octree for collision detection only.
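
To make the batch splitting described above a bit more concrete, here is a small self-contained sketch. It is not jPCT's actual code: the Poly class, its fields and the way batches are keyed are assumptions for illustration only. It simply counts how many distinct batches come out when polygons are grouped by render state alone, and how many when the octree leaf becomes part of the grouping too.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of state-based batching, not jPCT internals.
final class BatchSplitSketch {

	// Only the attributes that drive the grouping in this sketch.
	static final class Poly {
		final String texture;
		final String blendMode;
		final int octreeLeaf;

		Poly(String texture, String blendMode, int octreeLeaf) {
			this.texture = texture;
			this.blendMode = blendMode;
			this.octreeLeaf = octreeLeaf;
		}
	}

	// One batch per distinct state; optionally per distinct (state, leaf) pair.
	static int countBatches(List<Poly> polys, boolean splitByLeaf) {
		Set<String> keys = new HashSet<String>();
		for (Poly p : polys) {
			String key = p.texture + "|" + p.blendMode;
			if (splitByLeaf) {
				key += "|" + p.octreeLeaf;
			}
			keys.add(key);
		}
		return keys.size();
	}

	public static void main(String[] args) {
		List<Poly> polys = Arrays.asList(
			new Poly("wall", "opaque", 0),
			new Poly("wall", "opaque", 1),
			new Poly("wall", "opaque", 2),
			new Poly("glass", "alpha", 1));
		System.out.println("state only:   " + countBatches(polys, false));
		System.out.println("state + leaf: " + countBatches(polys, true));
	}
}
```

In this toy scene, the three wall polygons share one state but sit in three different leaves, so adding the leaf to the key turns one batch into three. That is the kind of extra splitting the post refers to.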

JavaMan · March 29, 2009, 01:13:39 AM

Hi all,
With all this excitement over the new shader hook, it must be something worth looking into. I, um, have no idea what it is. So I did a search and found this tutorial on the LWJGL wiki site: http://lwjgl.org/wiki/doku.php/lwjgl/tutorials/opengl/basicshaders. Is GLSL what is supposed to be used with the new hook? It looks rather complete for a complete Schultz ("I know nothing") on the subject, so I wanted to make sure this is what I should learn before plunging in.

Oh, and by the way Egon, I didn't notice (for some reason??) that Beethoven was up to 4300. I'm sitting very far back in my chair so I won't have to replace another keyboard

Great job!

EgonOlsen · March 29, 2009, 10:57:24 AM

Quote from: JavaMan on March 29, 2009, 01:13:39 AM
Is GLSL what is supposed to be used with the new hook? It looks rather complete for a complete Schultz ("I know nothing") on the subject, so I wanted to make sure this is what I should learn before plunging in.

The hook doesn't force you to use one way or the other, but GLSL is what should be used nowadays IMHO. The "all-green-example" above uses it too. You can see the shader's source code in it: it's hard-coded into the first field of the class (myShaderSource).

EgonOlsen · March 29, 2009, 09:05:28 PM

In reply to raft's question about octrees, here are some performance statistics taken on my main machine (Core2 Quad@3GHz, Radeon HD 4870, Vista, Java 6) displaying a typical first-person view of a Quake3 level. With other geometry, another view or another machine, the results can be completely different. Anyway:

Default pipeline:
no octree: 305fps
coarse octree: 300fps
finer octree: 340fps

Compiled pipeline:
no octree: 950fps
coarse octree: 450fps
finer octree: 305fps

You can see that the default pipeline benefits from the finer octree, because it takes the geometry load away from the pipeline very early. Using the octree on the default pipeline causes additional overhead for processing the tree, but doesn't require more state changes in the pipeline than without it. The coarse tree doesn't help, because it can't discard enough geometry to make up for the additional processing needed.
The compiled pipeline, on the other hand, is the complete opposite. The hardware has no problems with processing the complete geometry of a Quake3 level each frame, but using the octree causes more render calls and more state changes (because of smaller render batches), slowing it down.

Keep in mind that these results are only valid for a quite low-poly object displayed on hardware with high poly throughput. When rendering a landscape or similar on an Intel onboard chipset, it may be a different story.
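
For anyone who wants to put the numbers above into proportion, here is a small sketch that only does arithmetic on the posted fps figures: relative speed against a baseline and the time per frame. The class and method names are made up for illustration, and the figures are just the ones from this post, so the same caveats about machine and scene apply.

```java
// Plain arithmetic on the fps figures posted above; nothing here is measured.
final class OctreeFpsRatios {

	private OctreeFpsRatios() {}

	// How many times faster (or slower, if below 1.0) than the baseline.
	static double ratio(double fps, double baselineFps) {
		return fps / baselineFps;
	}

	// Time spent per frame in milliseconds.
	static double frameTimeMs(double fps) {
		return 1000.0 / fps;
	}

	public static void main(String[] args) {
		double defaultNoOctree = 305, defaultFine = 340;
		double compiledNoOctree = 950, compiledCoarse = 450, compiledFine = 305;

		System.out.printf("compiled vs default (no octree): %.2fx%n",
			ratio(compiledNoOctree, defaultNoOctree));
		System.out.printf("default, finer octree vs none:   %.2fx%n",
			ratio(defaultFine, defaultNoOctree));
		System.out.printf("compiled, coarse octree vs none: %.2fx%n",
			ratio(compiledCoarse, compiledNoOctree));
		System.out.printf("compiled, finer octree vs none:  %.2fx%n",
			ratio(compiledFine, compiledNoOctree));
		System.out.printf("frame time, compiled no octree:  %.2f ms%n",
			frameTimeMs(compiledNoOctree));
	}
}
```

Looking at frame times rather than fps can be a more intuitive way to compare such high frame rates, since a large fps difference at this level is only a small difference in milliseconds per frame.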