In part 1, I talked a little about how Erlang optimizes tail-recursive functions, a process generally known as Tail Call Optimization (TCO). To verify this, we can compile the functions into Erlang assembler source code and take a look. Erlang assembler source is a human-readable listing of the BEAM instructions that the compiler would otherwise assemble into a BEAM file.
The erlc compiler command has a flag (-S) that makes it stop after generating Erlang assembly and write the listing to a .S file (here, map.S) instead of a BEAM file.
$ erlc -S map.erl
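For readers who do not have part 1 at hand, here is a minimal sketch of what a map.erl module with these two functions might look like. The names map_tail/3 and map_body/2 come from this post, but the function bodies, the argument order, and the map_tail/2 wrapper are assumptions made for illustration, not necessarily the original code.

```erlang
-module(map).
-export([map_tail/2, map_body/2]).

%% Hypothetical wrapper: starts the tail-recursive version with an
%% empty accumulator.
map_tail(F, List) ->
    map_tail(F, List, []).

%% Tail-recursive: the recursive call is the last thing the clause does,
%% so nothing in the current frame is needed once the call is made.
map_tail(_F, [], Acc) ->
    lists:reverse(Acc);
map_tail(F, [H | T], Acc) ->
    map_tail(F, T, [F(H) | Acc]).

%% Body-recursive: the list cell is built after the recursive call
%% returns, so the frame has to stay alive across the call.
map_body(_F, []) ->
    [];
map_body(F, [H | T]) ->
    [F(H) | map_body(F, T)].
```

Saving this as map.erl and running the command above should produce a map.S listing, though the exact instructions and line numbers depend on the OTP version.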
A few notes about Erlang assembly:
- The full list of opcodes can be found here: genop.tab
- The CP register stands for Continuation Pointer
- There are two sets of registers: ‘x’ and ‘y’.
  - The ‘x’ registers are used for passing function parameters
  - The ‘y’ registers are slots in the current stack frame, used for local variables that must survive a function call
Assembly code for map_tail/3
Lines 24-27 implement the main function clause, which executes the mapping. It makes its recursive call with the call_last/3 opcode. I pulled the description from the genop.tab file.
Opcode call_last/3 comments
    ## @spec call_last Arity Label Deallocate
    ## @doc Deallocate and do a tail recursive call to the function at Label.
    ##      Do not update the CP register.
    ##      Before the call deallocate Deallocate words of stack.
    5: call_last/3
Assembly code for map_body/2
Lines 50-68 implement the main function clause, which executes the mapping. It makes its recursive call with the call/2 opcode. I pulled the description from the genop.tab file.
Opcode call/2 comments
    ## @spec call Arity Label
    ## @doc Call the function at Label.
    ##      Save the next instruction as the return address in the CP register.
    4: call/2
The tail-recursive implementation replaces call/2 with call_last/3. From the description, call_last/3 deallocates the current stack frame before making the function call and does not update the CP register, so no new return address is saved. Therefore, the tail-recursive implementation is optimized so that the stack does not grow: each recursive call effectively reuses the same stack space instead of pushing a new frame.
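The effect can also be observed at runtime rather than in the assembly. The following is a rough sketch, not part of the original map.erl: the module name stack_demo and all of its functions are hypothetical. It recurses N levels deep in both styles and, at the bottom of the recursion, asks the runtime how many words of stack the current process is using via erlang:process_info/2 with the stack_size item.

```erlang
-module(stack_demo).
-export([tail_depth/1, body_depth/1]).

%% Stack usage of the calling process, in words.
stack_words() ->
    {stack_size, Words} = erlang:process_info(self(), stack_size),
    Words.

%% Tail-recursive countdown: returns {Depth, StackWords} measured
%% when N reaches 0.
tail_depth(N) when N >= 0 ->
    tail_depth(N, 0).

tail_depth(0, Depth) ->
    {Depth, stack_words()};
tail_depth(N, Depth) ->
    tail_depth(N - 1, Depth + 1).

%% Body-recursive countdown: the tuple is rebuilt after the recursive
%% call returns, so every level keeps its frame until then.
body_depth(0) ->
    {0, stack_words()};
body_depth(N) when N > 0 ->
    {Depth, Words} = body_depth(N - 1),
    {Depth + 1, Words}.
```

Calling stack_demo:tail_depth/1 and stack_demo:body_depth/1 with a small and a large N should show the second element staying roughly flat for the tail-recursive version and growing with N for the body-recursive one. The absolute numbers depend on the OTP version and on where the call is made from.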