FORTRAN pitfalls

 1-8  FORTRAN COMMON PITFALLS
 ****************************
 (Thanks to Sergio Gelato for the important contributions, 
  to Dan Pop for the enlightening comments and suggestions, 
  and to Craig Burley for the very important comments)

 This chapter explains the less-obvious pitfalls that appear
 in the 'common pitfalls list' of the chapter on debugging.

 Contents
 --------
    1) Passing constants to a subprogram and modifying them 
    2) Using floating-point comparisons tests 
    3) Loops with a REAL control variable 
    4) Using the MOD function with REAL arguments 
    5) Assigning a real to an integer 
    6) Assuming intrinsic conversion functions take care of result type 
    7) Uncareful use of automatic type promotions 
    8) Letting array indexes go out of bounds 
    9) Common blocks losing their values while the program runs 
   10) Aliasing of dummy arguments and common block variables 
   11) Depending on assumptions about the computing order of subexpressions 
   12) Assuming Short-circuit evaluation of expressions 
   13) Using trigonometric functions with large arguments on some machines 
   14) Incompatible argument lists in CALL and the routine definition 





 Passing constants to a subprogram 
 ---------------------------------
 In FORTRAN, when a variable is passed as a actual argument to 
 a procedure, and the procedure modifies the corresponding dummy
 argument, the variable itself gets modified, unlike C or the 
 default argument-passing mechanism of Pascal.

 In other words, changing the value of a dummy argument inside 
 a procedure changes the actual/formal variable (corresponding 
 entity used in the CALL statement) that invoked it.

 FORTRAN constants are declared with the PARAMETER statement, 
 Constants are useful when we want to make sure that some quantity
 wouldn't change its value no matter what we do, it's a safety 
 measure against a certain kind of programming errors.

 Compilers take a variety of approaches to implementing constants, 
 often largely influenced by the underlying OS and CPU, depending 
 on performance vs. debugging considerations, and so on.

 Constants can be implemented in various ways: 

    1) Replacement during compilation of every occurrence 
       of the constant with its value. If the constant is
       passed to another procedure, it is no longer protected.

    2) Storing them in memory like ordinary variables, but
       making the associated memory area 'read only'.
       This method insures that the protection against
       modification is "propagated" to called procedures.

 Passing a constant to a procedure may be done:

    1) Just like ordinary variables with no extra protection
       against modification, this may be bad practice if 
       a "non-propagating" method is used. 

    2) Using passing by value, then at least the calling 
       procedure is protected.

 If you pass a constant, and modify it, either you get a segmentation 
 fault (access violation error), or something gets modified. If your 
 compiler pools and reuses constants, you might find that other 
 instances of the constant got modified as well!

 If modification of constants is possible, it defeats the original 
 purpose of a programming safety measure, and turns constants into 
 a programming trap.

 Operating systems like DOS work directly with physical addresses,
 and can't protect memory, so you will not get a warning from the
 system if you access entities supposed to be 'read only'.
 Operating systems with Virtual memory (VMS, UNIX) can protect 
 specific memory areas. 





 Using floating-point comparisons tests 
 --------------------------------------
 When comparing two floating-point values, one has to keep in mind 
 that .EQ. and .NE. test for strict equality of the model numbers 
 being compared.

 Often one needs to allow for small round-off errors in the results 
 of floating-point calculations, so that ABS(A-B) .LE. EPS, for some 
 suitable tolerance EPS, is more appropriate than A .EQ. B . EPS may 
 be a constant, or it may depend on A and B, e.g. It could be a 
 constant times MAX(ABS(A), ABS(B)).

 The best way to choose EPS is by studying the numerical properties 
 of the algorithms used to compute A and B; there is no single best 
 answer for all applications.

 Remember: in Fortran, and most other languages, floating-point
           values are always _approximations_ of mathematical 
           values (See the chapters on floating-point numbers). 

 In simple language, using the relational operators .EQ. or .NE.  
 with floating-point numbers is dangerous, because the value of 
 numbers may be different from the expected mathematical value, 
 due to radix conversion and roundoff errors. If you can't avoid
 FP tests, use the following statement-functions:

   FPEQ(X,Y) = ABS(X - Y) .LT. EPS 
   FPEQ(X,Y) = ABS(X - Y) .LT. EPS * MAX(ABS(X), ABS(Y))

 with EPS larger than the machine precision.




 Loops with a REAL or DOUBLE PRECISION control variable 
 ------------------------------------------------------
 Another version of the floating-point comparisons problem, with a REAL
 or DOUBLE PRECISION loop control variable, you may get different number 
 of loop iterations than what you expected.

 A DO loop:

      DO I = E1, E2, E3

 Will be executed:  

      (  N times  if  N  >  0
    = (
      (  0 times  if  N  <= 0

 where N = INT((E2 - E1 + E3) / E3) 

 The INT intrinsic function is very sensitive to small errors in 
 the quotient, if the quotient changes from 1.00001 to 0.99999 the 
 result will be quite different (zero iterations instead of one).

 DO loops with floating-point control variables are still allowed 
 in Fortran 90, but programmers are strongly advised against using 
 them, you can stop using them now.
 




 Using the MOD function with REAL arguments 
 ------------------------------------------
 This is yet another version of the floating-point comparisons problem.
 A small example program will make it clear:


      PROGRAM RNDOFF
C     ------------------------------------------------------------------
      REAL
     *          R1, R2,
     *          RESULT
C     ------------------------------------------------------------------
      R1 = 1.0
      R2 = 0.2
      RESULT = MOD(R1,R2)
      WRITE(*,'(1X,F3.1)') RESULT
C     ------------------------------------------------------------------
      END


 The MOD function really computes:

       MOD = R1 - (R2 * INT(R1 / R2))

 Upon converting the constant 0.2 to the internal binary representation 
 a small roundoff error is introduced, on some machines the internal 
 value will be rounded  UP a little. Because of the rounding up, 
 R1/R2 will be slightly smaller than 5.0, and the INT function will 
 give the value 4 instead of the correct value 5. The program then 
 will proceed to write the WRONG result 0.2.

 On other machines you may get some very small value. 





 Assigning a REAL/DOUBLE PRECISION to an INTEGER 
 -----------------------------------------------
 In a similar way roundoff errors are amplified by a conversion 
 to integer.

 An assignment of a REAL or DOUBLE PRECISION to an INTEGER is 
 really an implicit conversion from a floating-point to integer,
 and the roundoff errors just wait for such moments.

 A likely situation to fall into this trap is when you compute 
 array indexes from floating-point expressions.





 Assuming intrinsic conversion functions take care of result type 
 ----------------------------------------------------------------
 Because intrinsic routines are generic, some people assume that
 calling an intrinsic routine takes care of all data typing problems.
 A possible trap is:

      INTEGER               I
      DOUBLE PRECISION      X
      .....................
      I = 123456789
      X = REAL(I)
      WRITE(*,*) X

 The result of REAL() is just a REAL, it is followed by an implicit 
 conversion (by assignment) to DOUBLE PRECISION. Instead of directly 
 converting the integer to DOUBLE we are 'chaining' conversions, 
 this may cause precision loss. 

 It is correct in general that DBLE(I) will never be less accurate
 than DBLE(REAL(I)), and may be more accurate in many situations.

 The REAL(I) call may cause precision loss if the number of significant 
 bits of the value of I exceeds the number of bits in the mantissa of a 
 REAL, or radix conversion errors are introduced. 





 Uncareful use of automatic type promotions 
 ------------------------------------------
 Automatic type promotions (by assignment) sweeps data-type problems
 under the rug, all problems seem to take care of themselves with 
 that feature. 

 The danger is that we will 'chain' conversions and lose precision.
 In the following example we convert a decimal floating-point number
 to DOUBLE PRECISION, directly and via an intermediary REAL and 
 compare the results:


      PROGRAM CONV
C     ------------------------------------------------------------------
      REAL              y
C     ------------------------------------------------------------------
      DOUBLE PRECISION  xx, yy
C     ------------------------------------------------------------------
      CHARACTER*40      str
C     ------------------------------------------------------------------
100   CONTINUE
        WRITE(*,'(1X,A)') 'Enter a FP number: '
        READ(*,'(A)') str
        READ(UNIT=STR, FMT=*) xx
        READ(UNIT=STR, FMT=*) y
        yy = y
        WRITE (*,'(1X,3E20.10)') xx, yy, xx - yy
      GOTO 100
C     ------------------------------------------------------------------
      END


 Explicitly specified numeric constants (by assignment or PARAMETER)
 may create similar problems (see the chapter on constants).





 Letting array indexes go out of bounds 
 --------------------------------------
 By default, array indexes are not checked before being used. The memory 
 protection mechanism discovers out of bound references only when they 
 get out of allocated memory.

 The best way is to compile the program (until you are sure it works 
 o.k.) with:

   $ FORTRAN/CHECK=BOUNDS                        (VMS)
     f77 -C                                      (SUN)
     f77 -C                                      (IRIX)
     f77 -C                                      (OSF/1)
     f77 -C                                      (ULTRIX)
     xlf -C                                      (AIX) 

 In order to reap full benefits from the compiler's bounds checking 
 option, formal arguments that are arrays should not be declared as 
 assumed-size (with their last dimension given as *). 

 They should also not be declared with a size different from that of 
 the actual argument; in particular, declaring them of size (1) is 
 usually not a good idea. 

 Fortran-66 legacy code may contain such formal arguments with size 1. 
 At the very least they should be replaced with assumed-size (*) arrays. 
 Even better, the actual size should be passed as a separate argument 
 and the formal array should be declared with that size. 

 Example:

      SUBROUTINE S(A,N)
      INTEGER N
      REAL A(N)

 is better than

      SUBROUTINE S(A)
      REAL A(*)

 which in turn is better than

      SUBROUTINE S(A)
      REAL A(1)

 The last choice will cause the -C option to report spurious errors;
 the assumed-size version will prevent -C from detecting anything at all.

 Fortran 90 adds an even more convenient option (but only when the 
 subprogram interface is explicit at the point of call):

      SUBROUTINE S(A)
      REAL A(:)

 Be sure to declare S in a module or interface block in this case.





 Common blocks losing their values while the program runs 
 --------------------------------------------------------
 The FORTRAN standard DOESN'T guarantee that named common blocks 
 will keep their data values. When none of the procedures that 
 declared the common block are activated, the common block may 
 lose the data.

 Remember that an _unnamed_ (blank) common block does _not_ 
 lose its data.

 See the chapter on common blocks for more information on 
 common blocks.

 Static code checkers like FTNCHEK can warn you if your program 
 is in such danger. 





 Aliasing of dummy arguments and common block variables 
 ------------------------------------------------------
 The FORTRAN standard prohibits using the same variable (or array 
 element) more than once in the same CALL statement, if one of the 
 associated formal arguments (i.e. associated with the aliased 
 actual arguments) is assigned a value during the subprogram call.

 An example of such aliasing:

    CALL SUB1(ARRAY, ARRAY(1))

 Passing the same variable (or array element) at the same time 
 with a CALL statement and in a common block is also prohibited.

 What happens if you alias arguments by mistake?

 Eager to optimize compilers may assume the no-aliasing condition, 
 because it makes possible some more code optimizations, so if you 
 do use the illegal dummy aliases, the results of your computations 
 may be wrong. 

 The important point is that in the presence of dummy aliasing, 
 the results of a calculation may be different whether the compiler 
 assumes dummy aliasing or not. 

 You can check your compiler with the following program:


      PROGRAM ALIAS
C     ------------------------------------------------------------------
      INTEGER
     *          I, A(5)
C     ------------------------------------------------------------------
      DO I=1,5
        A(I) = 2
      ENDDO
C     ------------------------------------------------------------------
      WRITE(*,*) 'Before: ', A
      CALL SUB(A, A(1))
      WRITE(*,*) 'After:  ', A
C     ------------------------------------------------------------------
      END


      SUBROUTINE SUB(X, Y)
C     ------------------------------------------------------------------
      INTEGER
     *          I, X(5), Y
C     ------------------------------------------------------------------
      DO I=1,5
        X(I) = Y * X(I)
      ENDDO
C     ------------------------------------------------------------------
      RETURN
      END

 Our point can be easily checked on VMS, DEC Fortran provides a 
 compiler option that toggles on and off the aliasing assumption, 
 our example program will give different results when compiled on 
 VMS with:  

   Deafult option:   /ASSUME=NODUMMY_ALIASES (wrong in our case)
   ====================================================================
   Before:            2           2           2           2           2
   After:             4           4           4           4           4

   Option:   /ASSUME=DUMMY_ALIASES (slow, assures correct results)
   ====================================================================
   Before:            2           2           2           2           2
   After:             4           8           8           8           8


 The no-aliasing assumption may be implemented by having a local 
 copy of Y that is not affected by writes to array X, we get then 
 the all 4s result. 

 Don't try to use these effects in a program, in our small example
 it seems simple enough, but with more complex programs it may give
 unpredictable results. Moreover, different Fortran implementations 
 may display different behaviour. 

 To make it more clear why the second result is the correct one, 
 let's inline the subroutine and unroll all loops (substituting 
 Y = X(1)). A basic rule is that subroutines should give the same 
 results as the inlined code. 

     loop in main program:

        X(1) = 2                  X = (2, *, *, *, *)
        X(2) = 2                  X = (2, 2, *, *, *)
        X(3) = 2                  X = (2, 2, 2, *, *)
        X(4) = 2                  X = (2, 2, 2, 2, *)
        X(5) = 2                  X = (2, 2, 2, 2, 2)

     loop in subroutine:

        X(1) = X(1) * X(1)        X = (4, 2, 2, 2, 2)
        X(2) = X(1) * X(2)        X = (4, 8, 2, 2, 2)
        X(3) = X(1) * X(3)        X = (4, 8, 8, 2, 2)
        X(4) = X(1) * X(4)        X = (4, 8, 8, 8, 2)
        X(5) = X(1) * X(5)        X = (4, 8, 8, 8, 8)

 Our rule implies that (4,8,8,8,8) is the right answer.

    +---------------------------+
    |  AVOID VARIABLE ALIASING  |
    +---------------------------+





 Depending on assumptions about the computing order of subexpressions 
 --------------------------------------------------------------------
 The FORTRAN standard says nothing about the order in which the compiler
 will compute subexpressions when computing an arithmetical expression, 
 except that the compiler will obey nested parentheses. 

 Moreover, the standard explicitly requires that your program shouldn't 
 depend on the computing order.

 A small program can check the computing order on your machine: 


      PROGRAM ARGORD
      INTEGER		I, F1, F2, F3, F4
      EXTERNAL		F1, F2, F3, F4
      WRITE (*,*) 'Evaluation order: '
      I = F1(I) + F2(I) + F3(I) + F4(I)
      END

      INTEGER FUNCTION F1()
      WRITE (*,*) '	ARG #1 '
      F1 = 1
      RETURN
      END

      INTEGER FUNCTION F2()
      WRITE (*,*) '	ARG #2 '
      F2 = 1
      RETURN
      END

      INTEGER FUNCTION F3()
      WRITE (*,*) '	ARG #3 '
      F3 = 1
      RETURN
      END

      INTEGER FUNCTION F4()
      WRITE (*,*) '	ARG #4 '
      F4 = 1
      RETURN
      END

 All evaluation sequences are possible outputs of this program, 
 a common behaviour is (#1,#2,#3,#4), but exceptions like (#2,#3,#4,#1) 
 are known.





 Assuming Short-circuit evaluation of expressions 
 ------------------------------------------------
 Programmers accustomed to other languages (e.g. C) may try coding
 tricks based on the assumption that logical expressions are computed 
 only up to the point the value is known, for example: 


      INTEGER   I
      REAL      ARRAY(100)
      ...........................
      IF ((I .GE. 1) .AND. (I .LE. 100) .AND. ARRAY(I) .GT. 0) THEN
        WRITE (*,*) 'Array element is positive '
      ENDIF


      REAL      X
      ..............................................
      IF ((X .GT. 0.0) .AND. (LOG(X) .LE. 1.23) THEN


 The idea behind these IF conditions is to avoid the testing of the 
 last part of the condition, if the range-checking conditions are false. 
 In these examples testing the last condition with incorrect value
 of I or X is an error and may abort the program.

 Fortran doesn't guarantee short-circuiting evaluation of the .AND.
 operator, so the compiler is free to evaluate the last condition 
 even if the range-checking conditions were evaluated first and the 
 result was  .FALSE. 

 Furthermore, if the last condition contains a function-call, the 
 compiler is forced to evaluate the expression, even if it already 
 "knows" that the result of the whole expression will be  .FALSE.  
 because the function might have side effects.

 Programs using such a 'trick' may work when using one compiler and 
 abort with an error message on another.

 The FORTRAN standard allows short-circuiting evaluation, and many 
 compilers indeed provide it, you can check that by looking at the
 assembly listings of several test programs. 

 A VAX/VMS example: 


      SUBROUTINE SHRTCT(I, J, K)
C     ------------------------------------------------------------------
      INTEGER
     *			I, J, K
C     ------------------------------------------------------------------
      IF ((I .GT. 0) .AND. (J .GT. 0)) THEN
        K = 1
      ELSE
        K = 2
      ENDIF
C     ------------------------------------------------------------------
      RETURN
      END


            .TITLE  SHRTCT
            .IDENT  01
    
        0000    .PSECT      $CODE

SUBROUTINE SHRTCT(I, J, K)

        0000  SHRTCT::
        0000    .WORD       ^M
        0002    MOVAL       @K(AP), R0

IF ((I .GT. 0) .AND. (J .GT. 0)) THEN

        0006    TSTL        @I(AP)
        0009    BLEQ        L$1
        000B    TSTL        @J(AP)
        000E    BLEQ        L$1

K = 1

        0010    MOVL        #1, (R0)
        0013    BRB         L$2
        0015    NOP
        0016    NOP
        0017    NOP
        0018  L$1:
    
K = 2

        0018    MOVL        #2, (R0)
        001B  L$2:

RETURN

        001B    RET     
            .END




 Using trigonometric functions with large arguments on some machines 
 -------------------------------------------------------------------
 Computing trigonometric functions with large arguments is difficult
 because a very exact reduction of the argument to the primary range 
 (say between  -Pi  and  Pi) must be performed. 

 In other words, the reminder upon dividing the argument by the 
 transcendental number Pi must be found very accurately.

 Good algorithms were found around 1982, but some compilers didn't
 implement them, so they return garbage results, or 0.0 with an 
 error message. Beware especially of PC and old UNIX compilers.

 Some standards (AT&T system V) says that the result of computing 
 the trigonometric functions in such cases should be 0.0 with an 
 error message.





 Incompatible argument lists in CALL and the routine definition 
 --------------------------------------------------------------
 At run-time all variables reside in memory without any data type 
 information, and no type checking is done, variables are just 
 accessed by pre-computed addresses (starting address of variable).

 When you pass variables in an argument list (or common block), 
 and they are typed differently in the called procedure you will 
 get wrong results. 

 If you are familiar with the internal representations, you can
 perform special (and usually non-portable) processing tricks
 this way, it's just like using EQUIVALENCE (in a wild way).

 In short, if you don't know enough about the internal representation
 of data types don't use incompatible declarations.
Return to contents page