Testing against the nodes is almost trivial. It's a simple test of a primitive (sphere, ellipsoid, ray...) against an axis-aligned bounding box, which is pretty fast. It's not much slower than testing against a single polygon, so even at a depth of 10, it's not such a big deal compared to the alternative (testing against all polygons in the mesh). BTW: In jPCT, the nodes overlap if needed to avoid polygon splitting.
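To give an idea of how cheap such a node test is, here is a minimal sketch of a sphere vs. box and a ray vs. box test in plain Java. This is not jPCT's actual code; the class name AabbTests, the method names and the float[3] layout for points and box corners are just assumptions for illustration.

```java
/** Hypothetical helper, not part of jPCT: cheap primitive vs. axis-aligned box tests. */
public final class AabbTests {
    private AabbTests() {}

    /**
     * Sphere vs. AABB: clamp the sphere center to the box to get the closest
     * point on the box, then compare the squared distance with radius squared.
     */
    public static boolean sphereIntersectsAabb(float[] center, float radius,
                                               float[] min, float[] max) {
        float distSq = 0f;
        for (int i = 0; i < 3; i++) {
            float closest = Math.max(min[i], Math.min(center[i], max[i]));
            float d = center[i] - closest;
            distSq += d * d;
        }
        return distSq <= radius * radius;
    }

    /**
     * Ray vs. AABB (slab method): intersect the ray with the three pairs of
     * parallel planes and check that the resulting intervals overlap.
     * The ray starts at origin and extends in direction dir (t >= 0).
     */
    public static boolean rayIntersectsAabb(float[] origin, float[] dir,
                                            float[] min, float[] max) {
        float tMin = 0f;
        float tMax = Float.POSITIVE_INFINITY;
        for (int i = 0; i < 3; i++) {
            if (dir[i] == 0f) {
                // Ray is parallel to this slab: it misses unless the origin lies inside it.
                if (origin[i] < min[i] || origin[i] > max[i]) {
                    return false;
                }
            } else {
                float inv = 1f / dir[i];
                float t1 = (min[i] - origin[i]) * inv;
                float t2 = (max[i] - origin[i]) * inv;
                if (t1 > t2) {
                    float tmp = t1;
                    t1 = t2;
                    t2 = tmp;
                }
                tMin = Math.max(tMin, t1);
                tMax = Math.min(tMax, t2);
                if (tMin > tMax) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

Both tests are a handful of comparisons and multiplications per axis, which is why doing one per tree level costs so little compared to checking every polygon.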
There's no method to tell the optimal depth/polygon count for a tree/node. It's a matter of trial and error to find that out for your particular mesh. I tend to keep the polygon count in each node at around 500-1000. Anything lower usually doesn't help performance-wise because (as you've mentioned) there is an overhead in traversing the tree, and if that overhead becomes too high, it might eat up the benefits of the tree.
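If you want to experiment with that trade-off outside the engine, a rough sketch like the following might help. It is not jPCT's OcTree implementation; LeafBudgetTree, its fields and the float[9] triangle layout are made up for this example. It subdivides until a node holds at most maxPolys triangles (or maxDepth is reached) and assigns each triangle to a child by its centroid instead of splitting it, so the child boxes may overlap.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not jPCT's OcTree: subdivide until a node holds at most maxPolys triangles. */
public final class LeafBudgetTree {
    public final float[] min = {Float.POSITIVE_INFINITY, Float.POSITIVE_INFINITY, Float.POSITIVE_INFINITY};
    public final float[] max = {Float.NEGATIVE_INFINITY, Float.NEGATIVE_INFINITY, Float.NEGATIVE_INFINITY};
    public final List<float[]> polys = new ArrayList<>();            // filled for leaves only
    public final List<LeafBudgetTree> children = new ArrayList<>();  // empty for leaves

    /** Each triangle is a float[9]: x0,y0,z0, x1,y1,z1, x2,y2,z2. */
    public LeafBudgetTree(List<float[]> tris, int maxPolys, int maxDepth) {
        // The node's box encloses all of its triangles completely.
        for (float[] t : tris) {
            for (int i = 0; i < 9; i++) {
                int axis = i % 3;
                min[axis] = Math.min(min[axis], t[i]);
                max[axis] = Math.max(max[axis], t[i]);
            }
        }
        if (tris.size() <= maxPolys || maxDepth == 0) {
            polys.addAll(tris);
            return;
        }
        // Sort triangles into 8 buckets by centroid; nothing gets split.
        List<List<float[]>> buckets = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            buckets.add(new ArrayList<>());
        }
        for (float[] t : tris) {
            int index = 0;
            for (int axis = 0; axis < 3; axis++) {
                float centroid = (t[axis] + t[axis + 3] + t[axis + 6]) / 3f;
                float mid = (min[axis] + max[axis]) * 0.5f;
                if (centroid >= mid) {
                    index |= 1 << axis;
                }
            }
            buckets.get(index).add(t);
        }
        // If one bucket took everything, subdividing gains nothing: stay a leaf.
        for (List<float[]> b : buckets) {
            if (b.size() == tris.size()) {
                polys.addAll(tris);
                return;
            }
        }
        for (List<float[]> b : buckets) {
            if (!b.isEmpty()) {
                children.add(new LeafBudgetTree(b, maxPolys, maxDepth - 1));
            }
        }
    }

    /** Number of leaf nodes below (and including) this node. */
    public int leafCount() {
        if (children.isEmpty()) {
            return 1;
        }
        int n = 0;
        for (LeafBudgetTree c : children) {
            n += c.leafCount();
        }
        return n;
    }

    /** Largest polygon count found in any leaf. */
    public int maxLeafSize() {
        int m = polys.size();
        for (LeafBudgetTree c : children) {
            m = Math.max(m, c.maxLeafSize());
        }
        return m;
    }

    /** Total number of polygons stored in all leaves. */
    public int totalPolys() {
        int n = polys.size();
        for (LeafBudgetTree c : children) {
            n += c.totalPolys();
        }
        return n;
    }
}
```

Building the tree a few times with different maxPolys values and comparing leafCount() and maxLeafSize() against your measured collision times is one way to do the trial and error for a particular mesh.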
Using the tree for rendering is another story, because it increases the number of draw calls and that will eat up the performance gain pretty quickly.