Tried it...
First, your code didn't compile anything, because you didn't compile the actual object but the one that you fed into the super(Object3D) call, which doesn't work. So "Block" should actually look like this:
package env;

import com.threed.jpct.*;

/**
 *
 * @author User
 */
public class Block extends Object3D
{
    private static final long serialVersionUID = 1L;

    public static Object3D MODEL = null;

    public static Object3D getModel()
    {
        if (MODEL == null)
        {
            MODEL = Primitives.getCube(0.5f);
            MODEL.setTexture("box");
            MODEL.setEnvmapped(Object3D.ENVMAP_ENABLED);
            MODEL.setCollisionMode(Object3D.COLLISION_CHECK_OTHERS);
            MODEL.build();
            MODEL.compile();
            MODEL.rotateY((float) -Math.PI / 4f);
        }
        return MODEL;
    }

    public Block()
    {
        super(getModel(), true);
        shareCompiledData(MODEL);
        compile();
    }
}
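The key idea here - build and compile the model once, then let every instance reuse it - can be sketched engine-free in plain Java. Model, SharedModelDemo and the field names below are stand-ins for illustration, not jPCT classes:

```java
// Stand-in for a compiled mesh; not a jPCT class.
class Model {
    final String name;
    Model(String name) { this.name = name; }
}

class Block {
    // Built and "compiled" once, shared by all instances.
    private static Model MODEL = null;

    static Model getModel() {
        if (MODEL == null) {
            MODEL = new Model("cube"); // build() + compile() would happen here
        }
        return MODEL;
    }

    final Model shared;

    Block() {
        shared = getModel(); // every Block reuses the same compiled data
    }
}

public class SharedModelDemo {
    public static void main(String[] args) {
        Block a = new Block();
        Block b = new Block();
        System.out.println(a.shared == b.shared); // prints "true"
    }
}
```

This is why the lazy initialization in getModel() matters: the expensive build/compile work happens on the first Block only, and every later Block just picks up the shared result.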
With that, the block will actually be compiled. However, that doesn't really help much (if at all)...
(But you should do it anyway...).
So, what's the reason for this? I did some benchmarking with your test code and discovered some minor bottlenecks, which I've fixed in this beta jar:
http://www.jpct.net/download/beta/jpct.jar
However, even with this jar, it's still not much faster. So I decided to tweak the settings for this particular scene (i.e. all blocks in view, nothing clipped, nothing culled, nothing transparent etc.) by adding these lines:
Config.doSorting=false;
Config.alwaysSort=false;
Config.glMultiPassSorting=false;
Config.useFrustumCulling=false;
The first three lines disable sorting completely. It's not needed in your scene, but it might be needed in more realistic scenes.
The last line disables frustum culling, which isn't helpful if all objects of a scene are in view.
With that, performance increases slightly on my machine (Core2 Quad @ 3.2GHz, Radeon HD 5870, Windows Vista, Java 6), but it's still pretty low. So I did some more detailed benchmarks to see where the time is actually spent.
The time taken for a complete render cycle is around 34ms. This consists of:
- Engine work, i.e. iterating over all objects, setting up transformation matrices, setting engine states... that kind of stuff: 12ms
- Render time, i.e. setting up OpenGL states and drawing the display lists: 22ms
The latter 22ms split into:
- Setting up GL and similar actions: 9ms
- Rendering the display lists: 13ms
So we have 13ms that we can't avoid, because that's plain display list drawing that has to be done no matter what (which means that the highest possible framerate when making only display list calls (which isn't possible) is somewhere around 76fps).
That leaves us with 9ms for GL setup and 12ms for engine work. Those 12ms split again into 9ms of pure processing code and 3ms of OpenGL state management. Those 3ms can't be optimized away without killing performance on everything that isn't as simple as this test scene.
So we have 2 * 9ms left. The parts that take this time are already pretty well optimized. It might be possible to squeeze out a millisecond here or there, but that's pretty much what it takes.
This application is a very special case: it renders very simple objects, but tons of them. All the overhead (GL's and jPCT's) stacks up pretty high in this case. I don't really see how I can improve this much further. I would like to try your direct-GL test case to see how that one really performs and whether it's in line with the assumptions about maximum possible performance that I made above. Maybe I'm doing something really stupid that I just don't see right now...