2-13 USING STRINGS AND CHARACTER ARRAYS
****************************************
Comparison between strings and character arrays
-----------------------------------------------
                             |       Strings        |    Character arrays
=============================|======================|========================
Substring notation           | ST(I:J)              | not allowed
-----------------------------|----------------------|------------------------
Array notation               | not allowed          | AR(I)
=============================|======================|========================
Constant declaration syntax  | CHARACTER ST*10      | CHARACTER AR(10)
-----------------------------|----------------------|------------------------
Block I/O operations         | whole [sub]string    | only with an implied DO
of the constant notation     |                      |
=============================|======================|========================
Star declaration syntax      | CHARACTER ST*(*)     | CHARACTER AR(*)
-----------------------------|----------------------|------------------------
Semantics of the star        | the length is passed | no length information
declaration syntax           | transparently        | is passed
-----------------------------|----------------------|------------------------
Block I/O operations         | whole [sub]string    | only with an implied DO
of the star notation         |                      |
-----------------------------|----------------------|------------------------
Mechanisms used for          | hidden argument      | you are responsible
passing the length           | or descriptor        | for staying in bounds
=============================|======================|========================
Variable declaration syntax  | CHARACTER ST*(N)     | CHARACTER AR(N)
-----------------------------|----------------------|------------------------
Variable declaration         |                      | the usual adjustable
semantics                    |                      | array mechanism
-----------------------------|----------------------|------------------------
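The first rows of the table can be seen side by side in a short sketch.
The program name STRARR and the variables ST, AR and I are invented for
illustration only:

```fortran
      PROGRAM STRARR
C     Illustrative sketch: the names STRARR, ST, AR, I are made up.
      CHARACTER ST*10, AR(10)
      INTEGER I
      ST = 'ABCDEFGHIJ'
C     Copy the string into the array one character at a time;
C     ST(I:I) is substring notation, AR(I) is array notation.
      DO 10 I = 1, 10
        AR(I) = ST(I:I)
 10   CONTINUE
C     A substring is written as one block ...
      WRITE (*,'(1X,A)') ST(3:6)
C     ... the same part of the array needs an implied DO.
      WRITE (*,'(1X,4A1)') (AR(I), I = 3, 6)
      END
```

Both WRITE statements print the same four characters, CDEF.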
The blank padding
-----------------
When you declare a FORTRAN string, you define the maximal length it can
have ("the physical length"). Whole-string operations that "use" only
"part" of it, e.g. assigning a shorter string or reading a shorter record,
automatically pad the rest of the string with blanks (spaces).
      CHARACTER STRING*12
      ...........................
      STRING = 'FORTRAN'
|--------------- Physical Length ---------------|
+---+---+---+---+---+---+---+---+---+---+---+---+
| F | O | R | T | R | A | N | | | | | |
+---+---+---+---+---+---+---+---+---+---+---+---+
|------ Logical Length -----|---- Blank tail ---|
It is clear that the physical length doesn't change, but the logical
length may change on each assignment or read operation.
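A minimal sketch of this behaviour follows. The program name PADDEM is
made up, and the vertical bars are there only to make the blank tail
visible:

```fortran
      PROGRAM PADDEM
C     Illustrative sketch; PADDEM is a made-up program name.
      CHARACTER STRING*12
      STRING = 'FORTRAN'
C     LEN reports the physical length, whatever was assigned.
      WRITE (*,*) 'LEN = ', LEN(STRING)
C     The bars make the blank tail visible.
      WRITE (*,'(1X,3A)') '|', STRING, '|'
C     A shorter assignment changes only the logical length.
      STRING = 'F77'
      WRITE (*,'(1X,3A)') '|', STRING, '|'
      END
```

LEN returns 12 in both cases, and each bar-delimited line shows the
text followed by blanks up to the twelfth character.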
A subtle point about FORTRAN strings is that the "logical" length of the
string is not well-defined: it is defined only up to an arbitrary number
of trailing blank characters. Once some text has been assigned to a
string, all information about the original number of trailing blanks is
irreversibly lost. Concatenating the text of two strings is therefore
ambiguous: you can't tell whether there were zero, one or more blanks at
the end of the first one. For the same reason, the two strings below
compare equal:
      CHARACTER ST1*10, ST2*10
      ...........................
      ST1 = 'FORTRAN'
      ST2 = 'FORTRAN '
      IF (ST1 .EQ. ST2) WRITE (*,*) 'The strings are equal! '
The blank padding at the end of the string is counted when you use the
LEN() function to find the string's length, or when you WRITE the string.
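To find the "true" length of the string (0 if it contains only blanks)
use the function below; Fortran 90 provides the LEN_TRIM intrinsic for
the same purpose:
      integer function strlen(st)
      integer i
      character st*(*)
c     scan backwards from the physical end of the string;
c     stop at i = 0 so that st(0:0) is never referenced
      i = len(st)
      do while (i .gt. 0)
        if (st(i:i) .ne. ' ') goto 10
        i = i - 1
      enddo
 10   strlen = i
      return
      end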
Strings are not automatically initialized with blanks. If the compiler
initializes them at all (VMS, Sun), they are initialized to NULs. Note
that some terminals (e.g. VTnnn) ignore NUL characters, so if such a
string is written to the screen there will be no visible output (except
the start of a new line).
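One way to avoid depending on the compiler is to blank the string
explicitly before use. The sketch below, with the made-up program name
INIBLK, relies only on the blank-padding rule described above:

```fortran
      PROGRAM INIBLK
C     Illustrative sketch; INIBLK is a made-up program name.
      CHARACTER STRING*12
C     A one-blank constant is padded to the full length,
C     so every character of STRING becomes a blank.
      STRING = ' '
      WRITE (*,'(1X,3A)') '|', STRING, '|'
      END
```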
Self-assignment of strings
--------------------------
Be careful when assigning strings to themselves: the FORTRAN 77 standard
prohibits some common situations (Fortran 90 lifted this restriction).
A string (or substring) STR may not be assigned a character expression
that references any of the character positions being assigned, e.g. an
expression containing an overlapping substring of STR itself.
A small example program:
      PROGRAM SLFASS
      CHARACTER
     *          STRING*10
      STRING = '1234567890'
      STRING(2:) = STRING
      WRITE (*,*) ' Correct result is: 1123456789 '
      WRITE (*,*) ' Local result is: ', STRING
      END
On some machines you'll get a string composed of 1s only!
The FORTRAN 77 standard didn't allow "self assignments" because the
"right" way to do them requires (in the general case) a temporary
character variable whose length is known only at run-time. Some older
machines in use when the FORTRAN 77 standard was written had problems
with dynamic (run-time) memory allocation; to accommodate them, the
standard chose to restrict character assignments.
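A possible workaround for "self assignments" is concatenation
with a null string, which usually makes the compiler build the result
in a temporary before storing it. Note that this is not strictly
standard either: the null string constant is itself not FORTRAN 77,
and the right-hand side still refers to the characters being assigned:
      STRING(2:) = STRING // ''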
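A workaround that stays inside the FORTRAN 77 rules is to copy the
string into an explicit temporary first. The sketch below assumes the
maximal length is known when the program is written; the names SLFTMP
and TEMP are made up for illustration:

```fortran
      PROGRAM SLFTMP
C     Illustrative sketch; SLFTMP and TEMP are made-up names.
      CHARACTER STRING*10, TEMP*10
      STRING = '1234567890'
C     Copy first, so the right-hand side no longer overlaps
C     the substring being assigned.
      TEMP = STRING
      STRING(2:) = TEMP
      WRITE (*,*) STRING
      END
```

The nine-character substring STRING(2:) receives the first nine
characters of TEMP, so STRING becomes 1123456789.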
Null strings
------------
The FORTRAN 77 standard doesn't allow null constant strings (strings
with length = 0), while Fortran 90 does. You can check what your
compiler does with a small program:
      PROGRAM NULSTR
C ------------------------------------------------------------------
      CHARACTER*1 STRING
C ------------------------------------------------------------------
      STRING = ''
      WRITE(*,*) ' STRING= |', STRING, '|'
C ------------------------------------------------------------------
      END
Input/Output
------------
FORTRAN supports input and output of strings, a very convenient
feature, and a rich set of string operations.
You can use WRITE and READ with passed-length strings (declared with
*(*)): they are not assumed-size arrays, although the syntax looks
similar, because the length is passed along with the string.
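A small sketch of whole-string output from a passed-length argument is
given here. The names PASSIO, SHOW and LINE are invented for the
example:

```fortran
      PROGRAM PASSIO
C     Illustrative sketch; PASSIO, SHOW and LINE are made up.
      CHARACTER LINE*20
      LINE = 'FORTRAN'
      CALL SHOW(LINE)
      CALL SHOW(LINE(1:4))
      END

      SUBROUTINE SHOW(ST)
C     ST takes its length from the actual argument.
      CHARACTER ST*(*)
      WRITE (*,'(1X,3A,I3)') '|', ST, '| length', LEN(ST)
      END
```

The first call reports a length of 20 and the second a length of 4,
with no length argument appearing anywhere in the source.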
Sub-string manipulations
------------------------
The following code shows some elementary 'tricks':
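      INTEGER OFFSET1, OFFSET2
C     STRING1 must be long enough to hold the whole constant
      CHARACTER STRING1*40, STRING2*20
      ......................................
      STRING1 = 'bla bla bla (FORTRAN) bla bla ... '
C     INDEX(string, substring): the string to search comes first
      OFFSET1 = INDEX(STRING1, '(') + 1
      OFFSET2 = INDEX(STRING1, ')') - 1
C     Build the result directly from STRING1, because
C     STRING2 = ' ' // STRING2 would be a self-assignment
      STRING2 = ' ' // STRING1(OFFSET1:OFFSET2)
      WRITE(UNIT=*, FMT=*) STRING2
      WRITE(UNIT=*, FMT=*) STRING2 // STRING2
      WRITE(UNIT=*, FMT=*) STRING2 // STRING2 // STRING2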
INDEX() is a standard FORTRAN intrinsic function, i.e. a function that
every FORTRAN compiler knows. INDEX takes two arguments, both of them
strings: it looks for the second string inside the first and returns
the position at which the second string begins inside the first, or 0
if it does not occur there.
For example:
      ST1 = 'good'
      ST2 = 'Fortran is good'
             123456789012345
      INDEX(ST2,ST1) is equal to 12
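The argument order of the standard intrinsic is INDEX(string,
substring). A runnable sketch, with the made-up names INDDEM and TEXT,
shows the three typical outcomes:

```fortran
      PROGRAM INDDEM
C     Illustrative sketch; INDDEM and TEXT are made-up names.
      CHARACTER TEXT*15
      TEXT = 'Fortran is good'
C     First argument: string to search; second: what to find.
      WRITE (*,*) INDEX(TEXT, 'good')
C     Only the first occurrence is reported.
      WRITE (*,*) INDEX(TEXT, 'o')
C     A result of 0 means the substring does not occur.
      WRITE (*,*) INDEX(TEXT, 'bad')
      END
```

The three values printed are 12, 2 and 0.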
The // is the concatenation operator: it takes two strings and 'adds'
them one after the other, to form one larger string.
In FORTRAN 77 you may use the // operator with passed-length string
operands only in assignment statements. Compilers may accept such
concatenations in other statements (e.g. WRITE), but that is against
the FORTRAN 77 standard.
For example:
C ------------------------------------------------------------------
      CHARACTER
     *          ST1*7,
     *          ST2*3,
     *          ST3*5
C ------------------------------------------------------------------
      ST1 = 'Fortran'
      ST2 = ' is'
      ST3 = ' good'
      ST1 // ST2 // ST3 is equal to 'Fortran is good'
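When one operand has passed length, the portable route is to do the
concatenation in an assignment to a local variable and then write that
variable. The sketch below assumes a 40-character buffer is large
enough; CATDEM, GREET, NAME and BUFFER are made-up names:

```fortran
      PROGRAM CATDEM
C     Illustrative sketch; CATDEM, GREET, BUFFER are made up.
      CALL GREET('Fortran')
      END

      SUBROUTINE GREET(NAME)
      CHARACTER NAME*(*), BUFFER*40
C     NAME has passed length, so FORTRAN 77 permits the
C     concatenation only in an assignment statement.
      BUFFER = NAME // ' is good'
      WRITE (*,'(1X,A)') BUFFER
      END
```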
If the strings were declared with lengths larger than their non-blank
content, they would be padded with blanks, and the // operator would
concatenate the strings complete with their padding blanks, producing
a rather ugly result.
To avoid this, find where the blank padding begins (e.g. with a
function like strlen above; INDEX with a blank works only when the
text itself contains no blanks) and use a substring that excludes it.
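The difference between concatenating with and without the padding can
be sketched as follows. TRMCAT and the helper TRMLEN are hypothetical
names; TRMLEN simply returns the position of the last non-blank
character:

```fortran
      PROGRAM TRMCAT
C     Illustrative sketch; TRMCAT and TRMLEN are made-up names.
      CHARACTER ST1*10, ST2*10, RESULT*30
      INTEGER TRMLEN
      ST1 = 'Fortran'
      ST2 = ' is good'
C     Padding included: ST1 contributes three trailing blanks.
      RESULT = ST1 // ST2
      WRITE (*,'(1X,3A)') '|', RESULT, '|'
C     Padding excluded: use only the non-blank part of ST1.
C     (ST1 is not all blank here, so TRMLEN(ST1) is at least 1.)
      RESULT = ST1(1:TRMLEN(ST1)) // ST2
      WRITE (*,'(1X,3A)') '|', RESULT, '|'
      END

      INTEGER FUNCTION TRMLEN(ST)
C     Position of the last non-blank character, 0 if none.
      CHARACTER ST*(*)
      INTEGER I
      DO 10 I = LEN(ST), 1, -1
        IF (ST(I:I) .NE. ' ') GOTO 20
 10   CONTINUE
      I = 0
 20   TRMLEN = I
      END
```

The first line printed has a gap of four blanks between "Fortran" and
"is", while the second reads "Fortran is good". A substring such as
ST1(1:0) is not valid FORTRAN 77, so an all-blank string needs a
separate test before it is trimmed this way.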
+-------------------------------------------------+
| USE STRINGS TO MANIPULATE FILE NAMES, ETC |
+-------------------------------------------------+