1-8 FORTRAN COMMON PITFALLS
****************************
(Thanks to Sergio Gelato for the important contributions,
to Dan Pop for the enlightening comments and suggestions,
and to Craig Burley for the very important comments)
This chapter explains the less-obvious pitfalls that appear
in the 'common pitfalls list' of the chapter on debugging.
Contents
--------
1) Passing constants to a subprogram and modifying them
2) Using floating-point comparisons tests
3) Loops with a REAL control variable
4) Using the MOD function with REAL arguments
5) Assigning a real to an integer
6) Assuming intrinsic conversion functions take care of result type
7) Uncareful use of automatic type promotions
8) Letting array indexes go out of bounds
9) Common blocks losing their values while the program runs
10) Aliasing of dummy arguments and common block variables
11) Depending on assumptions about the computing order of subexpressions
12) Assuming Short-circuit evaluation of expressions
13) Using trigonometric functions with large arguments on some machines
14) Incompatible argument lists in CALL and the routine definition
Passing constants to a subprogram
---------------------------------
In FORTRAN, when a variable is passed as a actual argument to
a procedure, and the procedure modifies the corresponding dummy
argument, the variable itself gets modified, unlike C or the
default argument-passing mechanism of Pascal.
In other words, changing the value of a dummy argument inside
a procedure changes the actual/formal variable (corresponding
entity used in the CALL statement) that invoked it.
FORTRAN constants are declared with the PARAMETER statement,
Constants are useful when we want to make sure that some quantity
wouldn't change its value no matter what we do, it's a safety
measure against a certain kind of programming errors.
Compilers take a variety of approaches to implementing constants,
often largely influenced by the underlying OS and CPU, depending
on performance vs. debugging considerations, and so on.
Constants can be implemented in various ways:
1) Replacement during compilation of every occurrence
of the constant with its value. If the constant is
passed to another procedure, it is no longer protected.
2) Storing them in memory like ordinary variables, but
making the associated memory area 'read only'.
This method insures that the protection against
modification is "propagated" to called procedures.
Passing a constant to a procedure may be done:
1) Just like ordinary variables with no extra protection
against modification, this may be bad practice if
a "non-propagating" method is used.
2) Using passing by value, then at least the calling
procedure is protected.
If you pass a constant, and modify it, either you get a segmentation
fault (access violation error), or something gets modified. If your
compiler pools and reuses constants, you might find that other
instances of the constant got modified as well!
If modification of constants is possible, it defeats the original
purpose of a programming safety measure, and turns constants into
a programming trap.
Operating systems like DOS work directly with physical addresses,
and can't protect memory, so you will not get a warning from the
system if you access entities supposed to be 'read only'.
Operating systems with Virtual memory (VMS, UNIX) can protect
specific memory areas.
Using floating-point comparisons tests
--------------------------------------
When comparing two floating-point values, one has to keep in mind
that .EQ. and .NE. test for strict equality of the model numbers
being compared.
Often one needs to allow for small round-off errors in the results
of floating-point calculations, so that ABS(A-B) .LE. EPS, for some
suitable tolerance EPS, is more appropriate than A .EQ. B . EPS may
be a constant, or it may depend on A and B, e.g. It could be a
constant times MAX(ABS(A), ABS(B)).
The best way to choose EPS is by studying the numerical properties
of the algorithms used to compute A and B; there is no single best
answer for all applications.
Remember: in Fortran, and most other languages, floating-point
values are always _approximations_ of mathematical
values (See the chapters on floating-point numbers).
In simple language, using the relational operators .EQ. or .NE.
with floating-point numbers is dangerous, because the value of
numbers may be different from the expected mathematical value,
due to radix conversion and roundoff errors. If you can't avoid
FP tests, use the following statement-functions:
FPEQ(X,Y) = ABS(X - Y) .LT. EPS
FPEQ(X,Y) = ABS(X - Y) .LT. EPS * MAX(ABS(X), ABS(Y))
with EPS larger than the machine precision.
Loops with a REAL or DOUBLE PRECISION control variable
------------------------------------------------------
Another version of the floating-point comparisons problem, with a REAL
or DOUBLE PRECISION loop control variable, you may get different number
of loop iterations than what you expected.
A DO loop:
DO I = E1, E2, E3
Will be executed:
( N times if N > 0
= (
( 0 times if N <= 0
where N = INT((E2 - E1 + E3) / E3)
The INT intrinsic function is very sensitive to small errors in
the quotient, if the quotient changes from 1.00001 to 0.99999 the
result will be quite different (zero iterations instead of one).
DO loops with floating-point control variables are still allowed
in Fortran 90, but programmers are strongly advised against using
them, you can stop using them now.
Using the MOD function with REAL arguments
------------------------------------------
This is yet another version of the floating-point comparisons problem.
A small example program will make it clear:
PROGRAM RNDOFF
C ------------------------------------------------------------------
REAL
* R1, R2,
* RESULT
C ------------------------------------------------------------------
R1 = 1.0
R2 = 0.2
RESULT = MOD(R1,R2)
WRITE(*,'(1X,F3.1)') RESULT
C ------------------------------------------------------------------
END
The MOD function really computes:
MOD = R1 - (R2 * INT(R1 / R2))
Upon converting the constant 0.2 to the internal binary representation
a small roundoff error is introduced, on some machines the internal
value will be rounded UP a little. Because of the rounding up,
R1/R2 will be slightly smaller than 5.0, and the INT function will
give the value 4 instead of the correct value 5. The program then
will proceed to write the WRONG result 0.2.
On other machines you may get some very small value.
Assigning a REAL/DOUBLE PRECISION to an INTEGER
-----------------------------------------------
In a similar way roundoff errors are amplified by a conversion
to integer.
An assignment of a REAL or DOUBLE PRECISION to an INTEGER is
really an implicit conversion from a floating-point to integer,
and the roundoff errors just wait for such moments.
A likely situation to fall into this trap is when you compute
array indexes from floating-point expressions.
Assuming intrinsic conversion functions take care of result type
----------------------------------------------------------------
Because intrinsic routines are generic, some people assume that
calling an intrinsic routine takes care of all data typing problems.
A possible trap is:
INTEGER I
DOUBLE PRECISION X
.....................
I = 123456789
X = REAL(I)
WRITE(*,*) X
The result of REAL() is just a REAL, it is followed by an implicit
conversion (by assignment) to DOUBLE PRECISION. Instead of directly
converting the integer to DOUBLE we are 'chaining' conversions,
this may cause precision loss.
It is correct in general that DBLE(I) will never be less accurate
than DBLE(REAL(I)), and may be more accurate in many situations.
The REAL(I) call may cause precision loss if the number of significant
bits of the value of I exceeds the number of bits in the mantissa of a
REAL, or radix conversion errors are introduced.
Uncareful use of automatic type promotions
------------------------------------------
Automatic type promotions (by assignment) sweeps data-type problems
under the rug, all problems seem to take care of themselves with
that feature.
The danger is that we will 'chain' conversions and lose precision.
In the following example we convert a decimal floating-point number
to DOUBLE PRECISION, directly and via an intermediary REAL and
compare the results:
PROGRAM CONV
C ------------------------------------------------------------------
REAL y
C ------------------------------------------------------------------
DOUBLE PRECISION xx, yy
C ------------------------------------------------------------------
CHARACTER*40 str
C ------------------------------------------------------------------
100 CONTINUE
WRITE(*,'(1X,A)') 'Enter a FP number: '
READ(*,'(A)') str
READ(UNIT=STR, FMT=*) xx
READ(UNIT=STR, FMT=*) y
yy = y
WRITE (*,'(1X,3E20.10)') xx, yy, xx - yy
GOTO 100
C ------------------------------------------------------------------
END
Explicitly specified numeric constants (by assignment or PARAMETER)
may create similar problems (see the chapter on constants).
Letting array indexes go out of bounds
--------------------------------------
By default, array indexes are not checked before being used. The memory
protection mechanism discovers out of bound references only when they
get out of allocated memory.
The best way is to compile the program (until you are sure it works
o.k.) with:
$ FORTRAN/CHECK=BOUNDS (VMS)
f77 -C (SUN)
f77 -C (IRIX)
f77 -C (OSF/1)
f77 -C (ULTRIX)
xlf -C (AIX)
In order to reap full benefits from the compiler's bounds checking
option, formal arguments that are arrays should not be declared as
assumed-size (with their last dimension given as *).
They should also not be declared with a size different from that of
the actual argument; in particular, declaring them of size (1) is
usually not a good idea.
Fortran-66 legacy code may contain such formal arguments with size 1.
At the very least they should be replaced with assumed-size (*) arrays.
Even better, the actual size should be passed as a separate argument
and the formal array should be declared with that size.
Example:
SUBROUTINE S(A,N)
INTEGER N
REAL A(N)
is better than
SUBROUTINE S(A)
REAL A(*)
which in turn is better than
SUBROUTINE S(A)
REAL A(1)
The last choice will cause the -C option to report spurious errors;
the assumed-size version will prevent -C from detecting anything at all.
Fortran 90 adds an even more convenient option (but only when the
subprogram interface is explicit at the point of call):
SUBROUTINE S(A)
REAL A(:)
Be sure to declare S in a module or interface block in this case.
Common blocks losing their values while the program runs
--------------------------------------------------------
The FORTRAN standard DOESN'T guarantee that named common blocks
will keep their data values. When none of the procedures that
declared the common block are activated, the common block may
lose the data.
Remember that an _unnamed_ (blank) common block does _not_
lose its data.
See the chapter on common blocks for more information on
common blocks.
Static code checkers like FTNCHEK can warn you if your program
is in such danger.
Aliasing of dummy arguments and common block variables
------------------------------------------------------
The FORTRAN standard prohibits using the same variable (or array
element) more than once in the same CALL statement, if one of the
associated formal arguments (i.e. associated with the aliased
actual arguments) is assigned a value during the subprogram call.
An example of such aliasing:
CALL SUB1(ARRAY, ARRAY(1))
Passing the same variable (or array element) at the same time
with a CALL statement and in a common block is also prohibited.
What happens if you alias arguments by mistake?
Eager to optimize compilers may assume the no-aliasing condition,
because it makes possible some more code optimizations, so if you
do use the illegal dummy aliases, the results of your computations
may be wrong.
The important point is that in the presence of dummy aliasing,
the results of a calculation may be different whether the compiler
assumes dummy aliasing or not.
You can check your compiler with the following program:
PROGRAM ALIAS
C ------------------------------------------------------------------
INTEGER
* I, A(5)
C ------------------------------------------------------------------
DO I=1,5
A(I) = 2
ENDDO
C ------------------------------------------------------------------
WRITE(*,*) 'Before: ', A
CALL SUB(A, A(1))
WRITE(*,*) 'After: ', A
C ------------------------------------------------------------------
END
SUBROUTINE SUB(X, Y)
C ------------------------------------------------------------------
INTEGER
* I, X(5), Y
C ------------------------------------------------------------------
DO I=1,5
X(I) = Y * X(I)
ENDDO
C ------------------------------------------------------------------
RETURN
END
Our point can be easily checked on VMS, DEC Fortran provides a
compiler option that toggles on and off the aliasing assumption,
our example program will give different results when compiled on
VMS with:
Deafult option: /ASSUME=NODUMMY_ALIASES (wrong in our case)
====================================================================
Before: 2 2 2 2 2
After: 4 4 4 4 4
Option: /ASSUME=DUMMY_ALIASES (slow, assures correct results)
====================================================================
Before: 2 2 2 2 2
After: 4 8 8 8 8
The no-aliasing assumption may be implemented by having a local
copy of Y that is not affected by writes to array X, we get then
the all 4s result.
Don't try to use these effects in a program, in our small example
it seems simple enough, but with more complex programs it may give
unpredictable results. Moreover, different Fortran implementations
may display different behaviour.
To make it more clear why the second result is the correct one,
let's inline the subroutine and unroll all loops (substituting
Y = X(1)). A basic rule is that subroutines should give the same
results as the inlined code.
loop in main program:
X(1) = 2 X = (2, *, *, *, *)
X(2) = 2 X = (2, 2, *, *, *)
X(3) = 2 X = (2, 2, 2, *, *)
X(4) = 2 X = (2, 2, 2, 2, *)
X(5) = 2 X = (2, 2, 2, 2, 2)
loop in subroutine:
X(1) = X(1) * X(1) X = (4, 2, 2, 2, 2)
X(2) = X(1) * X(2) X = (4, 8, 2, 2, 2)
X(3) = X(1) * X(3) X = (4, 8, 8, 2, 2)
X(4) = X(1) * X(4) X = (4, 8, 8, 8, 2)
X(5) = X(1) * X(5) X = (4, 8, 8, 8, 8)
Our rule implies that (4,8,8,8,8) is the right answer.
+---------------------------+
| AVOID VARIABLE ALIASING |
+---------------------------+
Depending on assumptions about the computing order of subexpressions
--------------------------------------------------------------------
The FORTRAN standard says nothing about the order in which the compiler
will compute subexpressions when computing an arithmetical expression,
except that the compiler will obey nested parentheses.
Moreover, the standard explicitly requires that your program shouldn't
depend on the computing order.
A small program can check the computing order on your machine:
PROGRAM ARGORD
INTEGER I, F1, F2, F3, F4
EXTERNAL F1, F2, F3, F4
WRITE (*,*) 'Evaluation order: '
I = F1(I) + F2(I) + F3(I) + F4(I)
END
INTEGER FUNCTION F1()
WRITE (*,*) ' ARG #1 '
F1 = 1
RETURN
END
INTEGER FUNCTION F2()
WRITE (*,*) ' ARG #2 '
F2 = 1
RETURN
END
INTEGER FUNCTION F3()
WRITE (*,*) ' ARG #3 '
F3 = 1
RETURN
END
INTEGER FUNCTION F4()
WRITE (*,*) ' ARG #4 '
F4 = 1
RETURN
END
All evaluation sequences are possible outputs of this program,
a common behaviour is (#1,#2,#3,#4), but exceptions like (#2,#3,#4,#1)
are known.
Assuming Short-circuit evaluation of expressions
------------------------------------------------
Programmers accustomed to other languages (e.g. C) may try coding
tricks based on the assumption that logical expressions are computed
only up to the point the value is known, for example:
INTEGER I
REAL ARRAY(100)
...........................
IF ((I .GE. 1) .AND. (I .LE. 100) .AND. ARRAY(I) .GT. 0) THEN
WRITE (*,*) 'Array element is positive '
ENDIF
REAL X
..............................................
IF ((X .GT. 0.0) .AND. (LOG(X) .LE. 1.23) THEN
The idea behind these IF conditions is to avoid the testing of the
last part of the condition, if the range-checking conditions are false.
In these examples testing the last condition with incorrect value
of I or X is an error and may abort the program.
Fortran doesn't guarantee short-circuiting evaluation of the .AND.
operator, so the compiler is free to evaluate the last condition
even if the range-checking conditions were evaluated first and the
result was .FALSE.
Furthermore, if the last condition contains a function-call, the
compiler is forced to evaluate the expression, even if it already
"knows" that the result of the whole expression will be .FALSE.
because the function might have side effects.
Programs using such a 'trick' may work when using one compiler and
abort with an error message on another.
The FORTRAN standard allows short-circuiting evaluation, and many
compilers indeed provide it, you can check that by looking at the
assembly listings of several test programs.
A VAX/VMS example:
SUBROUTINE SHRTCT(I, J, K)
C ------------------------------------------------------------------
INTEGER
* I, J, K
C ------------------------------------------------------------------
IF ((I .GT. 0) .AND. (J .GT. 0)) THEN
K = 1
ELSE
K = 2
ENDIF
C ------------------------------------------------------------------
RETURN
END
.TITLE SHRTCT
.IDENT 01
0000 .PSECT $CODE
SUBROUTINE SHRTCT(I, J, K)
0000 SHRTCT::
0000 .WORD ^M
0002 MOVAL @K(AP), R0
IF ((I .GT. 0) .AND. (J .GT. 0)) THEN
0006 TSTL @I(AP)
0009 BLEQ L$1
000B TSTL @J(AP)
000E BLEQ L$1
K = 1
0010 MOVL #1, (R0)
0013 BRB L$2
0015 NOP
0016 NOP
0017 NOP
0018 L$1:
K = 2
0018 MOVL #2, (R0)
001B L$2:
RETURN
001B RET
.END
Using trigonometric functions with large arguments on some machines
-------------------------------------------------------------------
Computing trigonometric functions with large arguments is difficult
because a very exact reduction of the argument to the primary range
(say between -Pi and Pi) must be performed.
In other words, the reminder upon dividing the argument by the
transcendental number Pi must be found very accurately.
Good algorithms were found around 1982, but some compilers didn't
implement them, so they return garbage results, or 0.0 with an
error message. Beware especially of PC and old UNIX compilers.
Some standards (AT&T system V) says that the result of computing
the trigonometric functions in such cases should be 0.0 with an
error message.
Incompatible argument lists in CALL and the routine definition
--------------------------------------------------------------
At run-time all variables reside in memory without any data type
information, and no type checking is done, variables are just
accessed by pre-computed addresses (starting address of variable).
When you pass variables in an argument list (or common block),
and they are typed differently in the called procedure you will
get wrong results.
If you are familiar with the internal representations, you can
perform special (and usually non-portable) processing tricks
this way, it's just like using EQUIVALENCE (in a wild way).
In short, if you don't know enough about the internal representation
of data types don't use incompatible declarations.
Return to contents page