No, translate() is fine. However, apart from rotateAxis(), which created one SimpleVector and one Matrix per call, some of the methods you are using in your own code are part of the problem, because they allocate a new object each time they are called. Whenever I encounter such a bottleneck, I add a variant of the method that takes the object to return as a parameter, so that you can reuse objects that you create in your own code.
I've uploaded a new jar (as well as new javadocs) that fixes the rotateAxis() flaw and adds two new methods, getTransformedCenter(<SimpleVector>) and fillDump(<float[]>) (to be used instead of getDump()). Both should help you to optimize your code: create some "pool" objects once, outside the loop, and reuse them in each iteration. Also try to get rid of the call to new Matrix().
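To illustrate the reuse idea in general terms, here is a minimal, self-contained sketch. It does not use the engine's classes: Vec3, Body, getCenter() and PoolSketch are hypothetical stand-ins made up for this example, and only the pattern (an allocating getter next to a variant that fills a caller-supplied object) is meant to carry over.

```java
// Hypothetical stand-in for a vector type; NOT the engine's SimpleVector.
final class Vec3 {
    float x, y, z;

    void set(float x, float y, float z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }
}

// Hypothetical stand-in for an object that offers both variants of a getter.
final class Body {
    // Counts how many Vec3 instances the allocating getter has created.
    static int allocations = 0;

    private final float cx, cy, cz;

    Body(float cx, float cy, float cz) {
        this.cx = cx;
        this.cy = cy;
        this.cz = cz;
    }

    // Allocating variant: creates a new Vec3 on every call.
    Vec3 getCenter() {
        allocations++;
        Vec3 v = new Vec3();
        v.set(cx, cy, cz);
        return v;
    }

    // Fill variant: writes into the caller-supplied object and returns it.
    Vec3 getCenter(Vec3 toFill) {
        toFill.set(cx, cy, cz);
        return toFill;
    }
}

final class PoolSketch {
    // Per-iteration work using the allocating getter.
    static float sumXAllocating(Body[] bodies, int iterations) {
        float sum = 0f;
        for (int i = 0; i < iterations; i++) {
            for (Body b : bodies) {
                sum += b.getCenter().x;
            }
        }
        return sum;
    }

    // Same work, but one "pool" object is created once and reused.
    static float sumXPooled(Body[] bodies, int iterations) {
        Vec3 pooled = new Vec3(); // created once, outside the loop
        float sum = 0f;
        for (int i = 0; i < iterations; i++) {
            for (Body b : bodies) {
                sum += b.getCenter(pooled).x;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Body[] bodies = { new Body(1f, 0f, 0f), new Body(2f, 0f, 0f) };

        sumXAllocating(bodies, 1000);
        System.out.println("allocating variant created: " + Body.allocations);

        Body.allocations = 0;
        sumXPooled(bodies, 1000);
        System.out.println("pooled variant created: " + Body.allocations);
    }
}
```

The pooled object is only safe to reuse as long as nothing holds on to it between iterations; if you need to keep a value, copy it out first.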
It's a shame that Dalvik's GC sucks so much that this stuff is actually needed...