I was already using multiple threads. Disabling them for the clearing increases the time from 14 to 20 ms, so that's not helping. Guess I will have to accept it. Could you explain what is happening when clearing? As I understand it, it's "just" deleting all the old data and (re)initializing the data structures. Why is that taking such a long time (I'm just curious)?
Actually not much. It's clearing two buffers (color and depth) and some internal data structures for HSR and things like that. However, at 2000*1000, each buffer has 2,000,000 pixels, which at 4 bytes per pixel means 8,000,000 bytes. So these two buffers alone amount to 16 MB that have to be cleared. That simply takes its time...
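To get a feeling for the numbers, here is a minimal sketch that times clearing two buffers of that size in plain Java. It is not the renderer's actual clearing code: the class `ClearCost`, its methods, and the use of `int[]` for both buffers are assumptions made for illustration (4 bytes per pixel matches the 8,000,000 bytes mentioned above), and it leaves out the internal HSR structures entirely.

```java
import java.util.Arrays;

// Hypothetical stand-in for a software renderer's frame buffer clear.
class ClearCost {
    // Size of one buffer in bytes, assuming 4 bytes (one int) per pixel.
    static long bufferBytes(int width, int height) {
        return (long) width * height * 4L;
    }

    // Fills both buffers once and returns the elapsed time in nanoseconds.
    static long timeClear(int[] color, int[] depth, int clearColor, int clearDepth) {
        long start = System.nanoTime();
        Arrays.fill(color, clearColor);
        Arrays.fill(depth, clearDepth);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int w = 2000, h = 1000;
        int[] color = new int[w * h];
        int[] depth = new int[w * h];

        // Warm up so the JIT has compiled the fill loops before measuring.
        for (int i = 0; i < 50; i++) {
            timeClear(color, depth, 0, Integer.MAX_VALUE);
        }

        // Take the best of several runs to reduce noise from the OS and GC.
        long best = Long.MAX_VALUE;
        for (int i = 0; i < 50; i++) {
            best = Math.min(best, timeClear(color, depth, 0, Integer.MAX_VALUE));
        }

        System.out.printf("2 x %d bytes cleared in %.3f ms%n",
                bufferBytes(w, h), best / 1e6);
    }
}
```

The result depends heavily on the machine's memory bandwidth, so treat it as a lower bound for the buffer part of the clear rather than a prediction of the full clearing time.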
Is there a way to determine what the bottleneck is? Considering that the optimization I have in mind is far from trivial, it'd be nice to know if it's worth it. If you can think of an easy way to add profiling, that would be very much appreciated!
I would create a simple test case for this. Just render a large cube filled with small cubes and another one where you are rendering the hull only. Comparing the two should give you a pretty good approximation of what to expect.
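A rough sketch of how the two cases could be timed against each other, assuming each test scene can be wrapped in a `Runnable` that renders one frame. The class `SceneTimer`, the method `averageMillis`, and the two empty scene placeholders are hypothetical; the actual rendering calls have to be filled in.

```java
// Hypothetical timing helper for comparing two render test cases.
class SceneTimer {
    // Runs the task `warmup` times untimed, then `frames` times timed,
    // and returns the average time per timed run in milliseconds.
    static double averageMillis(Runnable task, int warmup, int frames) {
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < frames; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1e6 / frames;
    }

    public static void main(String[] args) {
        // Placeholders: replace the bodies with the code that renders one frame
        // of the filled cube and of the hull-only cube respectively.
        Runnable filledCube = () -> { /* render large cube filled with small cubes */ };
        Runnable hullOnly = () -> { /* render the hull only */ };

        double filled = averageMillis(filledCube, 20, 100);
        double hull = averageMillis(hullOnly, 20, 100);

        System.out.printf("filled: %.3f ms/frame, hull only: %.3f ms/frame%n", filled, hull);
    }
}
```

The difference between the two averages is roughly the upper limit of what the optimization could save per frame in this kind of scene.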