Stack register

A stack register is a computer central processor register whose purpose is to keep track of a call stack. On an accumulator-based architecture machine, this may be a dedicated register. On a machine with multiple general-purpose registers, it may be a register that is reserved by convention, such as on the IBM System/360 through z/Architecture architecture and RISC architectures, or it may be a register that procedure call and return instructions are hardwired to use, such as on the PDP-11, VAX, and Intel x86 architectures. Some designs such as the Data General Eclipse had no dedicated register, but used a reserved hardware memory address for this function.

Machines before the late 1960s—such as the PDP-8 and HP 2100—did not have compilers which supported recursion. Their subroutine instructions typically would save the current location in the jump address, and then set the program counter to the next address.^[1] While this is simpler than maintaining a stack, since there is only one return location per subroutine code section, there cannot be recursion without considerable effort on the part of the programmer.

A stack machine may have 2 or more stack registers — one of them keeps track of a call stack, the other(s) keep track of other stack(s).

Stack engine

Simpler processors store the stack pointer in a regular hardware register and use the arithmetic logic unit (ALU) to manipulate its value. Typically push and pop are translated into multiple micro-ops, to separately add/subtract the stack pointer, and perform the load/store in memory.^[2]

Newer processors contain a dedicated stack engine to optimize stack operations. Pentium M was the first x86 processor to introduce a stack engine. In its implementation, the stack pointer is split among two registers: ESP_O, which is a 32-bit register, and ESP_d, an 8-bit delta value that is updated directly by stack operations. PUSH, POP, CALL and RET opcodes operate directly with the ESP_d register. If ESP_d is near overflow or the ESP register is referenced from other instructions (when ESP_d ≠ 0), a synchronisation micro-op is inserted that updates the ESP_O using the ALU and resets ESP_d to 0. This design has remained largely unmodified in later Intel processors, although ESP_O has been expanded to 64 bits.^[3]

A stack engine similar to Intel's was also adopted in the AMD K8 microarchitecture. In Bulldozer, the need for synchronization micro-ops was removed, but the internal design of the stack engine is not known.^[3]

References

^ Salomon, David (February 1993) [1992]. Written at California State University, Northridge, California, USA. Chivers, Ian D. (ed.). Assemblers and Loaders (PDF). Ellis Horwood Series In Computers And Their Applications (1 ed.). Chicester, West Sussex, UK: Ellis Horwood Limited / Simon & Schuster International Group. ISBN 0-13-052564-2. Archived (PDF) from the original on 2020-03-23. Retrieved 2008-10-01. Most computers save the return address in either the stack, in one of the registers, or in the first word of the procedure (in which case the first executable instruction of the procedure should be stored in the second word). If the latter method is used, a return from the procedure is a jump to the memory location whose address is contained in the first word of the procedure. (xiv+294+4 pages)
^ Stokes, Jon "Hannibal" (2004-02-25). "A Look at Centrino's Core: The Pentium M". archive.arstechnica.com. p. 5.
^ ^a ^b Fog, Agner. "The microarchitecture of Intel, AMD and VIA CPUs" (PDF). Technical University of Denmark.

[Salomon_1993-1] Salomon, David (February 1993) [1992]. Written at California State University, Northridge, California, USA. Chivers, Ian D. (ed.). Assemblers and Loaders (PDF). Ellis Horwood Series In Computers And Their Applications (1 ed.). Chicester, West Sussex, UK: Ellis Horwood Limited / Simon & Schuster International Group. ISBN 0-13-052564-2. Archived (PDF) from the original on 2020-03-23. Retrieved 2008-10-01. Most computers save the return address in either the stack, in one of the registers, or in the first word of the procedure (in which case the first executable instruction of the procedure should be stored in the second word). If the latter method is used, a return from the procedure is a jump to the memory location whose address is contained in the first word of the procedure. (xiv+294+4 pages)

[Stokes_2004-2] Stokes, Jon "Hannibal" (2004-02-25). "A Look at Centrino's Core: The Pentium M". archive.arstechnica.com. p. 5.

[Fog-3] Fog, Agner. "The microarchitecture of Intel, AMD and VIA CPUs" (PDF). Technical University of Denmark.

[1]

[2]

[3]