By doing any of this, you will _increase_ the number of heap allocations. My point was that, if you make a struct, then you will reduce the number of allocations from pushing every variable, one at a time, onto your stack. Using the heap is way slower than just using the built-in stack.
This type of thing *will* slow down your algorithm, unless you can find patterns in the algorithm.
E.g., quicksort has two recursive calls on the bottom that look like this:
quicksort(array, left, pivot - 1);
quicksort(array, pivot + 1, right);
You'll notice that, aside from as parameters, left, right, and pivot will never be used again (since this is the last part of the function). Also, array never changes, so there's no point in storing that. Therefore, I could modify quicksort to not make a recursive call by doing this instead:
push(new pair(0, array.count - 1));
while (!stack_empty) {
pop
// Do quicksort on popped pair
push(new pair(left, pivot - 1));
push(new pair(pivot + 1, right));
}
*If* you implement that correctly, it *might* be faster than the pure recursive version. However, if every pair is a heap allocation that is pushed onto the heap, I'd say it's not going to be good enough. Chances are, you'd need a vector with a large pre-allocated buffer that simulated a stack to actually see a performance improvement.
Keep in mind, however, that pushing/popping on the CPU's stack are usually single instructions, plus the stack is likely to be hot on the cache, and a function call is very low overhead. Finally, you can probably get better performance by marking the function fastcall (pass parameters in registers), rather than trying to tune things like this. |