My "Stress Test" stage has 256 guys fighting each other on screen. Because of the AI decision process and hit detection, the lowest number of collision tests I've managed so far is (N^2 + N) / 2, which comes to 32896 tests per cycle.
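For reference, here is a rough sketch of where that number comes from. It assumes a plain triangular loop in which the pair (i, i) is also counted, which is what the (N^2 + N) / 2 formula implies; the class and method names are made up for illustration and are not taken from the game. Skipping the self-pair would give (N^2 - N) / 2 instead.

```java
class PairCount {
    // Counts the tests a triangular loop performs for n entities.
    // includeSelf = true also counts the pair (i, i).
    static int countTests(int n, boolean includeSelf) {
        int tests = 0;
        for (int i = 0; i < n; i++) {
            for (int j = includeSelf ? i : i + 1; j < n; j++) {
                tests++; // a real game would run the hit test for (i, j) here
            }
        }
        return tests;
    }

    public static void main(String[] args) {
        System.out.println(countTests(256, true));  // 32896
        System.out.println(countTests(256, false)); // 32640
    }
}
```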
@Informatix suggested that I use a quadtree (thanks!), but I'm still studying how to implement it.
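For anyone following along, this is a minimal sketch of the general idea, not the implementation I will end up with. It assumes each character is reduced to a single point with an integer id, and the capacity and depth limits are arbitrary values picked for the example. Each node holds a few points and splits into four children when it fills up, so a query only visits the nodes that overlap the search rectangle.

```java
import java.util.ArrayList;
import java.util.List;

class QuadTree {
    private static final int CAPACITY = 4;  // points per node before it splits (assumed)
    private static final int MAX_DEPTH = 8; // stops endless splitting on stacked points (assumed)

    private static class Entry {
        final float x, y; final int id;
        Entry(float x, float y, int id) { this.x = x; this.y = y; this.id = id; }
    }

    private final float minX, minY, maxX, maxY; // node covers [min, max) on both axes
    private final int depth;
    private final List<Entry> entries = new ArrayList<>();
    private QuadTree[] children; // null until the node splits

    QuadTree(float minX, float minY, float maxX, float maxY) { this(minX, minY, maxX, maxY, 0); }

    private QuadTree(float minX, float minY, float maxX, float maxY, int depth) {
        this.minX = minX; this.minY = minY; this.maxX = maxX; this.maxY = maxY; this.depth = depth;
    }

    // Returns false if the point lies outside this node's bounds.
    boolean insert(float x, float y, int id) {
        if (x < minX || x >= maxX || y < minY || y >= maxY) return false;
        if (children == null) {
            if (entries.size() < CAPACITY || depth >= MAX_DEPTH) {
                entries.add(new Entry(x, y, id));
                return true;
            }
            split();
        }
        for (QuadTree c : children) if (c.insert(x, y, id)) return true;
        return false;
    }

    // Creates four children around the midpoint and pushes stored points down into them.
    private void split() {
        float midX = (minX + maxX) / 2, midY = (minY + maxY) / 2;
        children = new QuadTree[] {
            new QuadTree(minX, minY, midX, midY, depth + 1),
            new QuadTree(midX, minY, maxX, midY, depth + 1),
            new QuadTree(minX, midY, midX, maxY, depth + 1),
            new QuadTree(midX, midY, maxX, maxY, depth + 1)
        };
        for (Entry e : entries) {
            for (QuadTree c : children) if (c.insert(e.x, e.y, e.id)) break;
        }
        entries.clear();
    }

    // Appends the ids of all points inside the inclusive query rectangle to out.
    void query(float qMinX, float qMinY, float qMaxX, float qMaxY, List<Integer> out) {
        if (qMaxX < minX || qMinX >= maxX || qMaxY < minY || qMinY >= maxY) return; // no overlap
        for (Entry e : entries) {
            if (e.x >= qMinX && e.x <= qMaxX && e.y >= qMinY && e.y <= qMaxY) out.add(e.id);
        }
        if (children != null) {
            for (QuadTree c : children) c.query(qMinX, qMinY, qMaxX, qMaxY, out);
        }
    }

    public static void main(String[] args) {
        QuadTree tree = new QuadTree(0, 0, 100, 100);
        float[][] pts = { {10, 10}, {12, 11}, {90, 90}, {50, 50}, {11, 12}, {13, 13} };
        for (int i = 0; i < pts.length; i++) tree.insert(pts[i][0], pts[i][1], i);
        List<Integer> near = new ArrayList<>();
        tree.query(5, 5, 15, 15, near);
        System.out.println(near); // [0, 1, 4, 5]
    }
}
```

Since characters have a size and are not just points, a real query would probably need to grow the search rectangle by the largest hitbox, and the tree would have to be rebuilt or updated every cycle as characters move.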
Each character has a shadow, a special effects layer and a lifebar. That's 512 (up to a maximum of 1024) objects (libGDX texture regions) being rendered per frame.
This is just a stress test; a normal stage will never have more than 16 characters. Dead characters always get recycled as new characters for the next stage area(s).
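The recycling works along the lines of an object pool. The sketch below is a simplified plain-Java illustration rather than the game's actual code: `GameCharacter`, `obtain` and `release` are hypothetical names, and the character state is cut down to two fields. libGDX also ships its own `Pool` class that could fill the same role.

```java
import java.util.ArrayDeque;

class CharacterPool {
    // Hypothetical stand-in for the game's character class.
    static class GameCharacter {
        int health;
        boolean alive;
        void spawn(int health) { this.health = health; this.alive = true; }
        void kill() { this.alive = false; }
    }

    private final ArrayDeque<GameCharacter> free = new ArrayDeque<>();
    private int created = 0; // how many instances were actually allocated

    // Reuses a dead character if one is available, otherwise allocates a new one.
    GameCharacter obtain(int health) {
        GameCharacter c = free.poll();
        if (c == null) {
            c = new GameCharacter();
            created++;
        }
        c.spawn(health);
        return c;
    }

    // Marks the character as dead and keeps it around for the next stage area.
    void release(GameCharacter c) {
        c.kill();
        free.push(c);
    }

    int createdCount() { return created; }
}
```

With a cap of 16 characters per normal stage, a pool like this should stop allocating after the first area, which keeps the garbage collector out of the way during fights.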
PS: I'm going to write my collision detector in C++ this week and compile it as a .so library, so hopefully that will give it some extra horsepower.