1-12 RECURSION
***************
(Thanks to Dieter Britz for the excellent comments, and to Craig Burley
for the corrections)
A small glossary
----------------
ACTIVATION RECORD Context associated with a given execution of a procedure
CONTEXT The state (values) of all variables at a given time
ENTRY POINT The place in the code where a routine starts executing
INSTANCE Another word for procedure activation
INVOCATION Calling a procedure
STACK A dynamic data structure (see below)
STACK POINTER A variable keeping the place of the stack's top
STATIC MEMORY Pre-allocated by the compiler (at compile time)
Introduction
------------
Recursion is calling a procedure from itself, directly (simple recursion)
or via calls to other procedures (mutual recursion). FORTRAN 77 does not
allow recursion, Fortran 90 does, (recursive routines must be explicitly
declared so).
Most FORTRAN 77 compilers allow recursion, some (e.g. DEC) require
using a compiler option (see compiler options chapter). The GNU g77,
which conforms strictly to the Fortran 77 standard, doesn't allow
recursion at all.
Recursion is usually less computationally efficient, compiler generated
procedure calls introduce overhead when storing all variables before a
procedure call and loading them back upon entering the procedure.
Automatic optimizations reduce the overhead associated with procedure
calls by eliminating redundant store/load operations to some degree,
but of course hand optimizing may improve upon that.
Efficiency is usually not the most important consideration when
choosing an algorithm, in the name of good programming we already
agreed to gladly suffer some measure of inefficiency.
In spite of the inherent inefficacy, recursion is important as the
natural solution to some problems is a recursive algorithm,
for example:
1) Identifying pixels belonging to the same 'blob'
2) Searching a tree-structured database
Writing a program using a recursive algorithm is usually difficult,
every small mistake, and in a recursive procedure, almost every
modification is a mistake, renders the procedure useless.
If your compiler doesn't support recursion, or you want to write a
fully portable program, or you want your program to be more efficient,
you can implement a recursive algorithm by manual simulation.
Simultaneous invocations of a procedure
---------------------------------------
Why the execution of recursive procedures requires special treatment?
Non-recursive programs have the property that at any given time many
procedures may be 'active', but only one 'instance' of a given procedure
is 'active'.
That means that non-recursive programs can be implemented STATICALLY,
i.e. each of the various variables (and arrays) used in the program are
assigned a pre-determined location in main memory at compile time
(subject to relocation by the link editor).
The characteristic property of recursion is that different invocation
of the same procedure are active at the same time, and the variables
associated with each invocation must be kept separately, so they may
hold different values.
We need to keep more than one copy of the procedure set of variables
(the code may be shared), one for each possible activation. Clearly
the simple static memory allocation scheme fails for recursive procedures,
and something more elaborate is needed.
Simple-minded implementation of recursion
-----------------------------------------
Let's say we want to implement a procedure that call itself, and
for some reason or other we know that the 'call depth' is less than
say 100 activations.
We create 100 copies of the procedure, name them in some way to keep
the names unique, e.g. SUB001, SUB002, ... ,SUB100, the CALL statements
inside each copy will refer to the next one, i.e. SUB001 will call
SUB002, SUB002 will call SUB003, ... and instead of the CALL statement
in SUB100 we will put a STOP 'Not enough copies '.
In this way we can perform recursion with a compiler that supports
only a static memory allocation scheme. Of course this method is very
inefficient and wasteful, We don't need to duplicate the procedure's
code, only the data part (activation record).
The solution to the problem of managing several 'activation records'
at the same time is the STACK data structure.
The stack data structure
------------------------
A stack is a sequence of memory locations with an associated STACK POINTER
that points to the 'current location', a stack can be manipulated by two
operations: PUSH and POP.
Stacks (like main-memory) are really one-dimensional entities, they go
from location 0 to location N, N is supposed to be large enough for the
task on hand ('real' stacks may go from -N to 0).
All locations 'below' the current value of the stack pointer are
considered to contain valid values, all locations above are considered
'garbage' left from previous operations. The stack can 'grow' into
the garbage area if needed.
PUSH increments the stack pointer, and puts a value at the location
the stack pointer points to, so it will point to the next available
place on the stack.
POP takes the value pointed to by the stack pointer, and decrements
the stack pointer, thus 'trashing' that location.
0006 |013|
0005 |140| GARBAGE AREA
0004 |112|
0003 |358|<----Stack-pointer
0002 |871|
0001 |348| VALID DATA AREA
0000 |231|
+---+
Address Data
Placing activation records on the stack
---------------------------------------
An ACTIVATION RECORD is a memory region holding the variables associated
with a certain activation (instance) of a procedure, and some control
information. Activation records are placed on the stack in the order
of activation.
Recursion simulation
--------------------
You can simulate recursion in FORTRAN 77, in an efficient way. The
warnings about the sensitivity of recursive procedures to programming
errors, apply even more to simulated recursion.
The general idea behind recursion simulation is to replace the recursion
with some suitable DO loop.
The straightforward method to simulate recursion is to closely imitate
the way compilers implement recursive procedure calls, that way we will
get a bulky and inefficient code, that can be manually optimized.
With enough experience you can do it more informally, replace the call
to the recursive procedure with a GOTO and add the necessary data
structures as needed.
Remember that the recursive algorithm may be equivalent to a simple
iterative algorithm, and imitating a compiler may hide that simple
algorithm behind redundant complexity.
Implementation of recursion simulation: general scheme
------------------------------------------------------
This is a general scheme for simulating one procedure calling
itself (simple recursion). See the following text for explanation
of the scheme.
Data-declarations (Of Modified data structures)
.......................................
stack-pointer = 1
Initialize the stack (First activation)
Entry-point:
Read local variables from the stack
.......................................
.-----------------------------------------.
| A recursive call: |
| stack-pointer = stack-pointer + 1 |
| Store local variables on the stack |
| GOTO Entry-point |
'-----------------------------------------'
.......................................
stack-pointer = stack-pointer - 1
IF (stack-pointer .GE. 1) GOTO Entry-point (exit point!)
Implementation of recursion simulation: data structures
-------------------------------------------------------
We need to keep the value of each variable for each activation of the
procedure, so it's clear we should replace every scalar variable with
a 1D array of the same data type (REAL etc.), and size large enough
to hold all expected activations. The array will be indexed by the
'call depth', the first activation will have index 1, the second and
third will be 2,3...
Similarly we will replace a 1D array by a 2D array, a 2D array by a
3D array, etc:
REAL scalar, array(10)
Will be replaced by:
INTEGER MAXACT
PARAMETER(MAXACT = 200)
REAL scalar, Xscalar(MAXACT), array(10), Xarray(10, MAXACT)
We add the expanded variables to the original ones, the original
variables will serve as 'local' variables (we will implement a
call by value parameter passing mechanism), the expended variables
taken together will serve as our simulated stack.
The added dimension is put last, it's more efficient that way because
of FORTRAN row major storage order of arrays.
Implementation of recursion simulation: stack pointer
-----------------------------------------------------
Our replacement for the stack pointer is an INTEGER variable holding
the value of the 'call depth', with possible values ranging from 1
(first simulated activation) to MAXACT.
On starting a new simulated activation the value of the 'stack pointer',
will be incremented by 1, and used to index the modified procedure data
structures (simulated stack).
Implementation of recursion simulation: entry point
---------------------------------------------------
After passing the simulated entry point, all variables are read from
the simulated stack. This step is necessary, as program control can
flow to the entry point from the 'exit point' (the conditional GOTO
at the end of the code).
If the exit point transfers control to the entry point, we have to
read the variables from the stack using the decremented stack-pointer,
the current values are no more good.
The scheme can be improved to skip reading the variables if the
current values are valid, i.e. the stack-pointer was not decremented.
Implementation of recursion simulation: recursive call
------------------------------------------------------
Instead of a 'recursive' CALL statement we will use a series of
FORTRAN statements:
1) Incrementing the stack-pointer by 1.
2) Several assignments that will store all variables on the
stack, thus creating a simulated 'activation record'.
3) Unconditional GOTO to the 'entry point'.
Implementation of recursion simulation: exit point
--------------------------------------------------
Recursion (simulated or not) must stop sometimes, the call depth may
go up and down while executing, but in the end it must drop below 1.
The natural exit point is at the end of the code, reaching the end
means we finished an activation, so we decrement the stack-pointer.
If the stack-pointer is zero, it means everything is over, otherwise
we loop again to the entry point.
Implementation of recursion simulation: termination condition
-------------------------------------------------------------
We exit the recursion when the call depth (value of stack-pointer)
drops below 1, something in the computation have to make the call
depth eventually drop to zero.
The termination condition of the recursion is a delicate matter,
in our blobs example we mark more and more elements of matrix B,
the marked elements keep a record of the places we have already
visited, as we don't visit places we have already visited, we avoid
the trap of going after blob points in circles.
That is a subtle and important problem in multi-dimensional searches,
because we manage to avoid it with the help of the history matrix B,
we will eventually get to the end of the recursion.
Example: Identifying 'blobs'
----------------------------
MATRIX(M,N) is the input array, elements larger than TRSH are 'blob'
pixels, the output array is BLOBS(M,N), when the double loop completes,
every element in BLOBS will contain the serial number of the 'blob' it
belongs to (or stay zero).
MAXSTACK should be raised if necessary, it should be on the order
of M*N, to allow large blobs.
This is a corrected version of the procedure, to be more portable
and understandable it was made a bit longer than necessary.
C ------------------------------------------------------------------
C | Procedure: Identify and mark "blobs" in a matrix.
C | Input is matrix A, matrix B gives for each point,
C | the serial number of the blob it belongs to.
C | Author: Abraham Agay
C | Update:
C ------------------------------------------------------------------
SUBROUTINE BLOBER(MATRIX, BLOBS, M, N, TRSH, CURBLB)
IMPLICIT NONE
C ------------------------------------------------------------------
INTEGER
* M, N, BLOBS(M,N), CURBLB,
* I, J, II, JJ,
* SP, MAXSTACK
REAL MATRIX(M,N), TRSH
PARAMETER (MAXSTACK = 100)
INTEGER STACK(2,MAXSTACK)
LOGICAL NEWGOOD, GOAHEAD
C ------------------------------------------------------------------
NEWGOOD(I,J) = (MATRIX(I,J) .GE. TRSH) .AND. (BLOBS(I,J) .EQ. 0)
C ==================================================================
CURBLB = 0
DO J = 1, N
DO I = 1, M
BLOBS(I,J) = 0
ENDDO
ENDDO
C ------------------------------------------------------------------
DO J = 1, N
DO I = 1, M
IF (.NOT. NEWGOOD(I,J)) GOTO 200
II = I
JJ = J
SP = 0
CURBLB = CURBLB + 1
GOAHEAD = .TRUE.
C ------------------------------------------------------------------
100 CONTINUE
IF (GOAHEAD) THEN
SP = SP + 1
IF (SP .GT. MAXSTACK) STOP 'BLOBER: Increase stack size! '
STACK(1,SP) = II
STACK(2,SP) = JJ
IF (MATRIX(II,JJ) .GE. TRSH) BLOBS(II,JJ) = CURBLB
ELSE
SP = SP - 1
IF (SP .LE. 0) GOTO 200
II = STACK(1,SP)
JJ = STACK(2,SP)
GOAHEAD = .TRUE.
ENDIF
C ------------------------------------------------------------------
IF (II .LT. M) THEN
IF (NEWGOOD(II+1,JJ)) THEN
II = II + 1
GOTO 100
ENDIF
ENDIF
C ------------------------------------------------------------------
IF (JJ .LT. N) THEN
IF (NEWGOOD(II,JJ+1)) THEN
JJ = JJ + 1
GOTO 100
ENDIF
ENDIF
C ------------------------------------------------------------------
IF (II .GT. 1) THEN
IF (NEWGOOD(II-1,JJ)) THEN
II = II - 1
GOTO 100
ENDIF
ENDIF
C ------------------------------------------------------------------
IF (JJ .GT. 1) THEN
IF (NEWGOOD(II,JJ-1)) THEN
JJ = JJ - 1
GOTO 100
ENDIF
ENDIF
C ------------------------------------------------------------------
GOAHEAD = .FALSE.
GOTO 100
C ------------------------------------------------------------------
200 CONTINUE
ENDDO
ENDDO
RETURN
END
By the way, there is a good non-recursive algorithm by Yoav Levy
of the Hebrew University of Jerusalem ( yoavl@vms.huji.ac.il ).
Bibliography
------------
On simulating recursion:
Data structures using Pascal (2nd edition)
Aaron M. Tenenbaum, Moshe J. Augenstein
Prentice-Hall 1986
ISBN 0-13-196684-7
On computer architecture:
Computer Organization, 3rd edition
V. Carl Hamacher, Zvonko G. Vranesic, Safwat G. Zaky
McGraw-Hill 1984
ISBN 0-07-Y66313-0 (2nd edition)
Return to contents page