Professional Programmer's Guide to Fortran77 [Note: © Clive G. Page, 1988, 1995]
Clive G. Page, University of Leicester, UK
22nd February 1995
This file contains the full text of Professional Programmer's Guide to Fortran77 published by Pitman in 1988. Since the book went out of print in 1994, it seemed reasonable to make the text available free of charge over the Internet. Fortran77 is, of course, technically obsolete, since the ISO Standard for Fortran90 has replaced it. Many programmers are likely to continue using Fortran77, however, until Fortran90 compilers become much more widely available.
I am retaining all rights to this text, except that it may be copied and reproduced without fee, provided that the attribution to the author is preserved.
This file is written in LaTeX and is called prof77.tex: it is substantially the same as the published version but the opportunity has been taken to correct a few mistakes and make some minor updates.
In order to keep the price down, the book was deliberately kept rather shorter than the average Fortran textbook, but it still covered the entire Fortran77 language as defined in the ANSI and ISO Standards. I also managed to include several topics which are often omitted from much larger textbooks because they are deemed to be too "advanced".
I also wanted to encourage the writing of clear, reliable, portable, robust, and well structured code, so short sections appear throughout the book offering guidance on the practical use of Fortran. Various obsolete or superfluous features of the language, mainly those which have been retained for compatibility with earlier versions of Fortran, are omitted from the main text but are covered in section 13. This is provided solely for the assistance of those who inherit elderly software.
--------------------------------------------------------------------------------
What Is Fortran?
Fortran is the most widely used programming language in the world for numerical applications. It has achieved this position partly by being on the scene earlier than any of the other major languages and partly because it seems gradually to have evolved the features which its users, especially scientists and engineers, found most useful. In order to retain compatibility with old programs, Fortran has advanced mainly by adding new features rather than by removing old ones. The net result is, of course, that some parts of the language are, by present standards, rather archaic: some of these can be avoided easily, others can still be a nuisance.
This section gives a brief history of the language, outlines its future prospects, and summarises its strengths and weaknesses.
Early Development
Fortran was invented by a team of programmers working for IBM in the early nineteen-fifties. This group, led by John Backus, produced the first compiler, for an IBM 704 computer, in 1957. They used the name Fortran because one of their principal aims was "formula translation". But Fortran was in fact one of the very first high-level languages: it came complete with control structures and facilities for input/output. Fortran became popular quite rapidly and compilers were soon produced for other IBM machines. Before long other manufacturers were forced to design Fortran compilers for their own hardware. By 1963 all the major manufacturers had joined in and there were dozens of different Fortran compilers in existence, many of them rather more powerful than the original.
All this resulted in a chaos of incompatible dialects. Some order was restored in 1966 when an American national standard was defined for Fortran. This was the first time that a standard had ever been produced for a computer programming language. Although it was very valuable, it hardly checked the growth of the language. Quite deliberately the Fortran66 standard only specified a set of language features which had to be present: it did not prevent other features being added. As time went on these extensions proliferated and the need for a further standardization exercise became apparent. This eventually resulted in the current version of the language: Fortran77.
Standardization
One of the most important features of Fortran programs is their portability, that is the ease with which they can be moved from one computer system to another. Now that each generation of hardware succeeds the previous one every few years, while good software often lasts for much longer, more and more programs need to be portable. The growth in computer networks is also encouraging the development of portable programs.
The first step in achieving portability is to ensure that a standard form of programming language is acceptable everywhere. This need is now widely recognised and has resulted in the development of standards for all the major programming languages. In practice, however, many of the new standards have been ignored and standard-conforming systems for languages like Basic and Pascal are still very rare.
Fortunately Fortran is in much better shape: almost all current Fortran systems are designed to conform to the standard usually called Fortran77. This was produced in 1977 by a committee of the American National Standards Institute (ANSI) and was subsequently adopted by the International Standards Organisation (ISO). The definition was published as ANSI X3.9-1978 and ISO 1539-1980. The term "Standard Fortran" will be used in the rest of this book to mean Fortran77 according to this definition.
Fortran is now one of the most widely used computer languages in the world with compilers available for almost every type of computer on the market. Since Fortran77 is quite good at handling character strings as well as numbers and also has powerful file-handling and input/output facilities, it is suitable for a much wider range of applications than before.
Full and Subset Fortran
The ANSI Standard actually defines two different levels for Fortran77. The simpler form, subset Fortran, was intended for use on computers which were too small to handle the full language. Now that even personal computers are powerful enough to handle full Fortran77, subset Fortran is practically obsolete. This book, therefore, only describes full Fortran77.
Fortran90
The ISO Standard for Fortran90 has, officially, replaced that for Fortran77. It introduces a wealth of new features, many of them already in use in other high-level languages, which will make programming easier and facilitate the construction of portable and robust programs. The whole of the Fortran77 Standard is included as a proper subset, so existing (standard-conforming) Fortran programs will automatically conform also to the new Standard. Until well-tested compilers for Fortran90 are widespread, however, most programmers are still using Fortran77, with perhaps a few minor extensions.
Strengths and Weaknesses
Fortran has become popular and widespread because of its unique combination of properties. Its numerical and input/output facilities are almost unrivalled while those for logic and character handling are as good as most other languages. Fortran is simple enough that you do not need to be a computer specialist to become familiar with it fairly quickly, yet it has features, such as the independent compilation of program units, which allow it to be used on very large applications. Programs written in Fortran are also more portable than those in other major languages. The efficiency of compiled code also tends to be quite high because the language is straightforward to compile and techniques for handling Fortran have reached a considerable degree of refinement. Finally, the ease with which existing procedures can be incorporated into new software makes it especially easy to develop new programs out of old ones.
It cannot be denied, however, that Fortran has more than its fair share of weaknesses and drawbacks. Many of these have existed in Fortran since it was first invented and ought to have been eliminated long ago: examples include the 6-character limit on symbolic names, the fixed statement layout, and the need to use statement labels.
Fortran also has rather liberal rules and an extensive system of default values: while this reduces programming effort it also makes it harder for the system to detect the programmer's mistakes. In many other programming languages, for example, the data type of every variable has to be declared in advance. Fortran does not insist on this but, in consequence, if you make a spelling mistake in a variable name the compiler is likely to use two variables when you only intended to use one. Such errors can be serious but are not always easy to detect.
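The spelling-mistake hazard can be illustrated with a minimal sketch. The program name TYPO and the variable names TOTAL and TOTL are invented for this illustration; TOTL stands for a mistyped TOTAL. The compiler is entitled to accept the program without complaint, quietly creating a second real variable:

```fortran
      PROGRAM TYPO
*     Hypothetical example: TOTL is a deliberate misspelling of TOTAL.
*     Implicit typing makes both names valid real variables.
      TOTAL = 10.0
*     The programmer meant to update TOTAL here.
      TOTL = TOTAL + 5.0
*     TOTAL has not changed: the sum went into TOTL instead.
      WRITE(UNIT=*, FMT=*) 'TOTAL is still ', TOTAL
      END
```

When run, this should report the original value of TOTAL, since the addition was stored in the accidental variable.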
Fortran also lacks various control and data structures which simplify programming in languages with a more modern design. These limitations, and others, are all eliminated with the advent of Fortran90.
Precautions
Extensions and Portability
Computer manufacturers have a natural tendency to compete with each other by providing Fortran systems which are "better" than before, usually by providing extensions to the language. This does not conflict with the Fortran Standard, provided that standard-conforming programs are still processed correctly. Indeed in the long term languages advance by the absorption of such extensions. In the short term, however, their use is more problematical, since they necessarily make programs less portable.
When the latest Fortran Standard was issued in 1977 there was fairly widespread disappointment that it did not go just a little further in eliminating some of the tiresome restrictions that had persisted since the early days. The US Department of Defense issued a short list of extensions which manufacturers were encouraged to add to their Fortran77 systems. The most important of these were the following:
the END DO statement
the DO WHILE loop
the INCLUDE statement
the IMPLICIT NONE facility
intrinsic functions for bit-wise operations on integers.
Many Fortran systems, especially those produced in the United States, now support these extensions but they are by no means universal and should not be used in portable programs.
One of the most irksome restrictions of Fortran77 is that symbolic names cannot be more than six characters long. This forces programmers to devise all manner of contractions, abbreviations, and acronyms in place of meaningful symbolic names. It is very tempting to take advantage of systems which relax this rule but this can have serious repercussions. Consider a program which makes use of variables called TEMPERATURE and TEMPERED. Many compilers will be quite happy with these, though a few will reject both names on grounds of length. Unfortunately there are also one or two compilers in existence which will simply ignore all letters after the sixth so that both names will be taken as references to the same variable, TEMPER. Such behaviour, while deplorable, is quite in accordance with the Standard which only requires systems to compile programs correctly if they conform to its rules.
The only way to be certain of avoiding problems like this is to ignore such temptations entirely and just use Standard Fortran. Many compilers provide a switch or option which can be set to cause all non-standard syntax to be flagged. Everything covered in this book is part of Standard Fortran unless clearly marked to the contrary.
Guidelines
Computer programming always requires a very high standard of care and accuracy if it is to be successful. This is even more vital when using Fortran than with some other languages, because, as explained above, the liberal rules of Fortran make it harder for the system to detect mistakes. To program successfully it is not enough just to conform to the rules of the language; it is also important to defend yourself against known pitfalls.
There is a useful lesson to be learned from the failure of one of the earliest planetary probes launched by NASA. The cause of the failure was eventually traced to a statement in its control software similar to this:
      DO 15 I = 1.100
when what should have been written was:
      DO 15 I = 1,100
but somehow a dot had replaced the comma. Because Fortran ignores spaces, this was seen by the compiler as:
      DO15I = 1.100
which is a perfectly valid assignment to a variable called DO15I and not at all what was intended.
Fortran77 permits an additional comma to be inserted after the label in a DO statement, so it could now be written as:
      DO 15,I = 1,100
which has the great advantage that it is no longer as vulnerable to a single-point failure.
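The effect of the misplaced dot can be seen in a small test program. This is only a sketch: the program name DOTYPO and the counter N are invented for the illustration, and some compilers may warn that label 15 is never referenced.

```fortran
      PROGRAM DOTYPO
*     Hypothetical demonstration of the dot-for-comma mistake.
*     The next-but-one line is NOT a DO statement: it assigns the
*     value 1.100 to an implicitly typed real variable DO15I.
      N = 0
      DO 15 I = 1.100
         N = N + 1
   15 CONTINUE
*     With no loop in force the counter is incremented just once.
      WRITE(UNIT=*, FMT=*) 'Body executed ', N, ' time(s)'
      WRITE(UNIT=*, FMT=*) 'DO15I = ', DO15I
      END
```

Changing the statement to the form with the extra comma, DO 15,I = 1.100, should turn the silent mistake into a syntax error that the compiler has to report.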
There are many hazards of this sort in Fortran, but the risk of falling victim to them can be minimised by adopting the programming practices of more experienced users. To help you, various recommendations and guidelines are given throughout this book. Some of the most outdated and unsatisfactory features of Fortran are not described in the main part of the book at all but have been relegated to section 13.
There is not room in a book of this size to go further into the techniques of program design and software engineering. As far as possible everything recommended here is consistent with the methods of modular design and structured programming, but you should study these topics in more detail before embarking on any large-scale programming projects.
--------------------------------------------------------------------------------
Basic Fortran Concepts
This section presents some of the basic ideas of Fortran by showing some complete examples. In the interests of simplicity, the problems which these solve are hardly beyond the range of a good pocket calculator, and the programs shown here do not include various refinements that would usually be present in professional software. They are, however, complete working programs which you can try out for yourself if you have access to a Fortran system. If not, it is still worth reading through them to see how the basic elements of Fortran can be put together into complete programs.
Statements
To start with, here is one of the simplest programs that can be devised:
      PROGRAM TINY
      WRITE(UNIT=*, FMT=*) 'Hello, world'
      END
As you can probably guess, all this program does is to send a rather trite message "Hello, world" to your terminal. Even so its layout and structure deserve some explanation.
The program consists of three lines, each containing one statement. Each Fortran statement must have a line to itself (or more than one line if necessary), but the first six character positions on each line are reserved for statement labels and continuation markers. Since the statements in this example need neither of these features, the first six columns of each line have been left blank.
The PROGRAM statement gives a name to the program unit and declares that it is a main program unit. Other types of program unit will be covered later on. The program can be called anything you like provided the name conforms to the Fortran rules: a Fortran symbolic name must start with a letter and, unfortunately, cannot be more than six characters long in total. It is generally sensible to give the same name to the program and to the file which holds the Fortran source code (the original text).
The WRITE statement produces output: the parentheses enclose a list of control items which determine where and in what form the output appears. UNIT=* selects the standard output file which is normally your own terminal; FMT=* selects a default output layout (technically known as list-directed format). Asterisks are used here, as in many places in Fortran, to select a default or standard option. This program could, in fact, have been made slightly shorter by using an abbreviated form of the WRITE statement:
      WRITE(*,*) 'Hello, world'
Although the keywords UNIT= and FMT= are optional, they help to make the program more readable. The items in the control list, like those in all lists in Fortran, are separated by commas.
The control information in the WRITE statement is followed by a list of the data items to be output: here there is just one item, a character constant which is enclosed in a pair of apostrophe (single quote) characters.
An END statement is required at the end of every program unit. When the program is compiled (translated into machine code) it tells the compiler that the program unit is complete; when encountered at run-time the END statement stops the program running and returns control to the operating system.
The Standard Fortran character set does not contain any lower-case letters so statements generally have to be written all in upper case. But Fortran programs can process as data any characters supported by the machine; character constants (such as the message in the last example) are not subject to this constraint.
Expressions and Assignments
The next example solves a somewhat more realistic problem: it computes the repayments on a fixed-term loan (such as a home mortgage loan). The fixed payments cover the interest and repay part of the capital sum; the annual payment can be calculated by the following formula:
\[ \mathrm{payment} = \frac{\mathrm{rate} \times \mathrm{amount}}{1 - (1 + \mathrm{rate})^{-\mathrm{nyears}}} \]
In this formula, rate is the annual interest rate expressed as a fraction; since it is more conventional to quote interest rates as a percentage the program does this conversion for us.
      PROGRAM LOAN
      WRITE(UNIT=*, FMT=*)'Enter amount, % rate, years'
      READ(UNIT=*, FMT=*) AMOUNT, PCRATE, NYEARS
      RATE = PCRATE / 100.0
      REPAY = RATE * AMOUNT / (1.0 - (1.0+RATE)**(-NYEARS))
      WRITE(UNIT=*, FMT=*)'Annual repayments are ', REPAY
      END
This example introduces two new forms of statement: the READ and assignment statements, both of which can be used to assign new values to variables.
The READ statement has a similar form to WRITE: here it reads in three numbers entered on the terminal in response to the prompt and assigns their values to the three named variables. FMT=* again selects list-directed (or free-format) input which allows the numbers to be given in any convenient form: they can be separated by spaces or commas or even given one on each line.
The fourth statement is an assignment statement which divides PCRATE by 100 and assigns the result to another variable called RATE. The next assignment statement evaluates the loan repayment formula and assigns the result to a variable called REPAY.
Several arithmetic operators are used in these expressions: as in most programming languages "/" represents division and "*" represents multiplication; in Fortran "**" is used for exponentiation, i.e. raising one number to the power of another. Note that two operators cannot appear in succession as this could be ambiguous, so that instead of "**-N" the form "**(-N)" has to be used.
Another general point concerning program layout: spaces (blanks) are not significant in Fortran statements so they can be inserted freely to improve the legibility of the program.
When the program is run, the terminal dialogue will look something like this:
Enter amount, % rate, years
20000, 9.5, 15
Annual repayments are 2554.873
The answer given by your system may not be exactly the same as this because the number of digits provided by list-directed formatting depends on the accuracy of the arithmetic, which varies from one computer to another.
Integer and Real Data Types
The LOAN program would have been more complicated if it had not taken advantage of some implicit rules of Fortran concerning data types: this requires a little more explanation.
Computers can store numbers in several different ways: the most common numerical data types are those called integer and real. Integer variables store numbers exactly and are mainly used to count discrete objects. Real variables are useful in many other circumstances as they store numbers using a floating-point representation which can handle numbers with a fractional part as well as whole numbers. The disadvantage of the real data type is that floating-point numbers are not stored exactly: typically only the first six or seven decimal digits will be correct. It is important to select the correct type for every data item in the program. In the last example, the number of years was an integer, but all of the other variables were of real type.
The data type of a constant is always evident from its form: character constants, for example, are enclosed in a pair of apostrophes. In numerical constants the presence of a decimal point indicates that they are real and not integer constants: this is why the value one was represented as "1.0" and not just "1".
There are several ways to specify the data type of a variable. One is to use explicit type statements at the beginning of the program. For example, the previous program could have begun like this:
      PROGRAM LOAN
      INTEGER NYEARS
      REAL AMOUNT, PCRATE, RATE, REPAY
Although many programming languages require declarations of this sort for every symbolic name used in the program, Fortran does not. Depending on your point of view, this makes Fortran programs easier to write, or allows Fortran programmers to become lazy. The reason that these declarations can often be omitted in Fortran is that, in the absence of an explicit declaration, the data type of any item is determined by the first letter of its name. The general rule is:
initial letters I-N integer type
initial letters A-H and O-Z real type.
In the preceding program, because the period of the loan was called NYEARS (and not simply YEARS) it automatically became an integer, while all the other variables were of real type.
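The practical effect of the first-letter rule can be seen in a short sketch. The program name IMPLCT and the variable NRATE are invented for this illustration; the same value is assigned to an implicitly integer variable and an implicitly real one.

```fortran
      PROGRAM IMPLCT
*     Hypothetical example of the implicit typing rule.
*     NRATE begins with N, so it is an integer: the fractional
*     part of 9.5 is discarded when the value is assigned.
      NRATE = 9.5
*     RATE begins with R, so it is real and keeps the fraction.
      RATE = 9.5
      WRITE(UNIT=*, FMT=*) 'NRATE = ', NRATE, ' RATE = ', RATE
      END
```

The integer variable should end up holding 9, which is why the choice of initial letter (or an explicit type statement) matters whenever a value may have a fractional part.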
DO Loops
Although the annual repayments on a home loan are usually fixed, the outstanding balance does not decline linearly with time. The next program demonstrates this with the aid of a DO-loop.
      PROGRAM REDUCE
      WRITE(UNIT=*, FMT=*)'Enter amount, % rate, years'
      READ(UNIT=*, FMT=*) AMOUNT, PCRATE, NYEARS
      RATE = PCRATE / 100.0
      REPAY = RATE * AMOUNT / (1.0 - (1.0+RATE)**(-NYEARS))
      WRITE(UNIT=*, FMT=*)'Annual repayments are ', REPAY
      WRITE(UNIT=*, FMT=*)'End of Year Balance'
      DO 15,IYEAR = 1,NYEARS
          AMOUNT = AMOUNT + (AMOUNT * RATE) - REPAY
          WRITE(UNIT=*, FMT=*) IYEAR, AMOUNT
15    CONTINUE
      END
The first part of the program is similar to the earlier one. It continues with another WRITE statement which produces headings for the two columns of output which will be produced later on.
The DO statement then defines the start of a loop: the statements in the loop are executed repeatedly with the loop-control variable IYEAR taking successive values from 1 to NYEARS. The first statement in the loop updates the value of AMOUNT by adding the annual interest to it and subtracting the actual repayment. This results in AMOUNT storing the amount of the loan still owing at the end of the year. The next statement outputs the year number and the latest value of AMOUNT. After this there is a CONTINUE statement which actually does nothing but act as a place-marker. The loop ends at the CONTINUE statement because it is attached to the label, 15, that was specified in the DO statement at the start of the loop.
The active statements in the loop have been indented a little to the right of those outside it: this is not required but is very common practice among Fortran programmers because it makes the structure of the program more conspicuous.
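The same loop structure can be shown in an even smaller setting. The following sketch, in which the program name SUMTEN, the label 25, and the variables ISUM and K are all chosen just for the illustration, adds up the whole numbers from 1 to 10:

```fortran
      PROGRAM SUMTEN
*     Hypothetical example: sums the integers 1 to 10.
*     ISUM and K are implicitly integer (initial letters I-N).
      ISUM = 0
      DO 25,K = 1,10
*        K takes the values 1, 2, ... 10 in turn.
         ISUM = ISUM + K
   25 CONTINUE
      WRITE(UNIT=*, FMT=*) 'Sum is ', ISUM
      END
```

The loop body is executed ten times, so the program should report a sum of 55.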
The program REDUCE produces a table of values which, while mathematically correct, is not very easy to read:
Enter amount, % rate, years
2000, 9.5, 5
Annual repayments are 520.8728
End of Year Balance
1 1669.127
2 1306.822
3 910.0968
4 475.6832
5 2.9800416E-04
Formatted Output
The table of values would have a better appearance if the decimal points were properly aligned and if there were only two digits after them. The last figure in the table is actually less than a thirtieth of a penny, which is effectively zero to within the accuracy of the machine. A better layout can be produced easily enough by using an explicit format specification instead of the list-directed output used up to now. To do this, the last WRITE statement in the program should be replaced with one like this:
      WRITE(UNIT=*, FMT='(1X,I9,F11.2)') IYEAR, AMOUNT
The amended program will then produce a neater tabulation:
Enter amount, % rate, years
2000, 9.5, 5
Annual repayments are 520.8728
End of Year Balance
1 1669.13
2 1306.82
3 910.10
4 475.68
5 .00
The format specification has to be enclosed in parentheses and, as it is actually a character constant, in a pair of apostrophes as well. The first item in the format list, 1X, is needed to cope with the carriage-control convention: it provides an additional blank at the start of each line which is later removed by the Fortran system. There is no logical explanation for this: it is there for compatibility with very early Fortran systems. The remaining items specify the layout of each number: I9 specifies that the first number, an integer, should occupy a field 9 columns wide; similarly F11.2 puts the second number, a real (floating-point) value, into a field 11 characters wide with exactly 2 digits after the decimal point. Numbers are always right-justified in each field. The field widths in this example have been chosen so that the columns of figures line up satisfactorily with the headings.
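To experiment with a format specification in isolation, a one-statement test program is enough. In this sketch the program name FMTDEM is invented and the two constants are simply sample values taken from the third line of the table above:

```fortran
      PROGRAM FMTDEM
*     Hypothetical test of the format used in program REDUCE:
*     1X     one blank for the carriage-control convention
*     I9     integer, right-justified in a 9-column field
*     F11.2  real, 11-column field, 2 digits after the point
      WRITE(UNIT=*, FMT='(1X,I9,F11.2)') 3, 910.0968
      END
```

The real value should appear rounded to two decimal places, as 910.10, with both numbers right-justified in their fields.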
Functions
Fortran provides a useful selection of intrinsic functions to carry out various mathematical operations such as square root, maximum and minimum, sine, cosine, etc., as well as various data type conversions. You can also write your own functions. The next example, which computes the area of a triangle, shows both forms of function in action.
The formulae for the area of a triangle with sides of length a, b, and c are:
\[ s = (a + b + c)/2 \]
\[ \mathrm{area} = \sqrt{s(s-a)(s-b)(s-c)} \]
      PROGRAM TRIANG
      WRITE(UNIT=*,FMT=*)'Enter lengths of three sides:'
      READ(UNIT=*,FMT=*) SIDEA, SIDEB, SIDEC
      WRITE(UNIT=*,FMT=*)'Area is ', AREA3(SIDEA,SIDEB,SIDEC)
      END
      FUNCTION AREA3(A, B, C)
*Computes the area of a triangle from lengths of sides
      S = (A + B + C)/2.0
      AREA3 = SQRT(S * (S-A) * (S-B) * (S-C))
      END
This program consists of two program units. The first is the main program, and it has a similar form to those seen earlier. The only novel feature is that the list of items output by the WRITE statement includes a call to a function called AREA3. This computes the area of the triangle. It is an external function which is specified by means of a separate program unit technically known as a function subprogram.
The external function starts with a FUNCTION statement which names the function and specifies its set of dummy arguments. This function has three dummy arguments called A, B, and C. The values of the actual arguments, SIDEA, SIDEB, and SIDEC, are transferred to the corresponding dummy arguments when the function is called. Variable names used in the external function have no connection with those of the main program: the actual and dummy argument values are connected only by their relative position in each list. Thus SIDEA transfers its value to A, and so on. The name of the function can be used as a variable within the subprogram unit; this variable must be assigned a value before the function returns control, as this is the value returned to the calling program.
Within the function the dummy arguments can also be used as variables. The first assignment statement computes the sum, divides it by two, and assigns it to a local variable, S; the second assignment statement uses the intrinsic function SQRT which computes the square-root of its argument. The result is returned to the calling program by assigning it to the variable which has the same name as the function.
The END statement in a procedure does not cause the program to stop but just returns control to the calling program unit.
There is one other novelty: a comment line describing the action of the function. Any line of text can be inserted as a comment anywhere except after an END statement. Comment lines have an asterisk in the first column.
These two program units could be held on separate source files and even compiled separately. An additional stage, usually called linking, is needed to construct the complete executable program out of these separately compiled object modules. This seems an unnecessary overhead for such simple programs but, as described in the next section, it has advantages when building large programs.
In this very simple example it was not really necessary to separate the calculation from the input/output operations but in more complicated cases this is usually a sensible practice. For one thing it allows the same calculation to be executed anywhere else that it is required. For another, it reduces the complexity of the program by dividing the work up into small independent units which are easier to manage.
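The benefit of re-use can be sketched with a second, purely illustrative function. The names HYPDEM and HYPOT2 are invented here and do not appear elsewhere in this book; the function is called from two places with different actual arguments.

```fortran
      PROGRAM HYPDEM
*     Hypothetical example: one external function, called twice.
      WRITE(UNIT=*, FMT=*) 'First  ', HYPOT2(3.0, 4.0)
      WRITE(UNIT=*, FMT=*) 'Second ', HYPOT2(5.0, 12.0)
      END
      FUNCTION HYPOT2(A, B)
*Length of the hypotenuse of a right-angled triangle whose
*shorter sides have lengths A and B (implicitly real result).
      HYPOT2 = SQRT(A**2 + B**2)
      END
```

The two calls should return 5.0 and 13.0 respectively, and the calculation itself is written only once.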
IF-blocks
Another important control structure in Fortran is the IF statement which allows a block of statements to be executed conditionally, or allows a choice to be made between different courses of action.
One obvious defect of the function AREA3 is that it has no protection against incorrect input. Many sets of three real numbers could not possibly form the sides of a triangle, for example 1.0, 2.0, and 7.0. A little analysis shows that in all such impossible cases the argument of the square root function will be negative, which is illegal. Fortran systems should detect errors like this at run-time but will vary in their response. Even so, a message like "negative argument for square-root" may not be enough to suggest to the user what is wrong. The next version of the function is slightly more user-friendly:
      REAL FUNCTION AREA3(A, B, C)
*Computes the area of a triangle from lengths of its sides.
*If arguments are invalid issues error message and returns zero.
      REAL A, B, C
      S = (A + B + C)/2.0
      FACTOR = S * (S-A) * (S-B) * (S-C)
      IF(FACTOR .LE. 0.0) THEN
          WRITE(UNIT=*, FMT=*)'Impossible triangle', A, B, C
          AREA3 = 0.0
      ELSE
          AREA3 = SQRT(FACTOR)
      END IF
      END
The IF statement works with the ELSE and END IF statements to enclose two blocks of code. The statements in the first block are only executed if the expression in the IF statement is true, those in the second block only if it is false. The statements in each block are indented for visibility, but this is, again, just a sensible programming practice.
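The two-way choice can also be tried out on its own. In this minimal sketch the program name SIGNCK and the variable VALUE are invented for the illustration, and the test value is fixed in the program rather than read from the terminal:

```fortran
      PROGRAM SIGNCK
*     Hypothetical example of an IF-block with an ELSE branch.
      VALUE = -2.5
      IF(VALUE .LT. 0.0) THEN
*        Executed only when the logical expression is true.
         WRITE(UNIT=*, FMT=*) 'Negative'
      ELSE
*        Executed only when the logical expression is false.
         WRITE(UNIT=*, FMT=*) 'Zero or positive'
      END IF
      END
```

With the value shown only the first block is executed; changing VALUE to any non-negative number should select the second block instead.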
With this modification, the value of FACTOR is tested and if it is negative or zero then an error message is produced; AREA3 is also set to an impossible value (zero) to flag the mistake. Note that the form ".LE." is used because the less-than-or-equals character is not available in the Fortran character set.
[...]
... if A1 > A2 it returns (A1-A2), otherwise zero.
D = DPROD(R,R) Computes the double precision product of two real values.
R = AIMAG(X) Extracts the imaginary component of a complex number. Note that the real component can be obtained by using the REAL function.
X = CONJG(X) Computes the complex conjugate of a complex number.
The NINT and ANINT functions round to the nearest whole number, rounding away from zero when the fractional part of the argument is exactly 0.5, whereas INT and AINT always round towards zero. Thus:
INT(+3.5) = 3     NINT(+3.5) = 4
INT(-3.5) = -3    NINT(-3.5) = -4
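These four results are easy to check on any particular system. The following sketch (the program name ROUND is invented for the illustration) simply prints them:

```fortran
      PROGRAM ROUND
*     Hypothetical check of truncation (INT) against rounding to
*     the nearest whole number (NINT) for +3.5 and -3.5.
      WRITE(UNIT=*, FMT=*) INT(+3.5), NINT(+3.5)
      WRITE(UNIT=*, FMT=*) INT(-3.5), NINT(-3.5)
      END
```

The first line of output should show 3 and 4, the second -3 and -4.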
The fractional part of a floating point number, X, can easily be found either by:
X - AINT(X)
or
MOD(X, 1.0)
In either case, if X is negative the result will also be negative. The ABS function can always be used to alter the sign if required.
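A short sketch shows both expressions applied to a negative value. The program name FRACT and the test value are invented for the illustration; -2.75 was chosen because it can be represented exactly in binary floating-point.

```fortran
      PROGRAM FRACT
*     Hypothetical example: fractional part of a negative number.
      X = -2.75
*     Both expressions keep the sign of X.
      WRITE(UNIT=*, FMT=*) X - AINT(X), MOD(X, 1.0)
*     ABS removes the sign if a positive fraction is wanted.
      WRITE(UNIT=*, FMT=*) ABS(X - AINT(X))
      END
```

Both expressions on the first line should give -0.75, and the second line 0.75.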
The MOD function has other uses. For example it can find the day of the week from an absolute day count such as Modified Julian Date (MJD):
MOD(MJD,7)
has a value between 0 and 6 for days from Wednesday to Tuesday. Similarly if you use the ATAN2 function but want the result to lie in the range 0 to 2\pi (rather than -\pi to +\pi) then, assuming the value of TWOPI is suitably defined, the required expression is:
MOD(ATAN2(X,Y) + TWOPI, TWOPI)
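Both uses of MOD can be tried in one small program. This is a sketch under stated assumptions: the program name MODDEM is invented, TWOPI is defined here with a PARAMETER statement to single-precision accuracy, and the day count 49770 is a sample value which should correspond to 22nd February 1995, a Wednesday.

```fortran
      PROGRAM MODDEM
*     Hypothetical example of two uses of the MOD function.
      INTEGER MJD
      REAL TWOPI, ANGLE
      PARAMETER (TWOPI = 6.2831853)
*     Sample day count: 0 means Wednesday, 6 means Tuesday.
      MJD = 49770
      WRITE(UNIT=*, FMT=*) 'Day index ', MOD(MJD, 7)
*     ATAN2(-1.0, 0.0) is -pi/2; the MOD expression moves it
*     into the range 0 to 2*pi.
      ANGLE = MOD(ATAN2(-1.0, 0.0) + TWOPI, TWOPI)
      WRITE(UNIT=*, FMT=*) 'Angle ', ANGLE
      END
```

The day index should be 0, and the angle should come out close to 4.712 (that is, three-quarters of TWOPI) rather than a negative value.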
Arithmetic Assignment Statements
An arithmetic assignment statement has the form:
arithmetic-var = arithmetic-expression
where arithmetic-var can be an arithmetic variable or array element. For example, the following assignment statement is valid provided that N, K, and ANGLE are all defined values:
IMAGE(N/2+1,3*K-1) = SIN(ANGLE)**2 + 1.0
If the object on the left has a different data type from that of the expression on the right then a data type conversion is applied automatically. The type conversion function (INT, REAL, DBLE, or CMPLX) is selected to match the object on the left. Note that many type conversions lose information. If the object on the left is an array element, its subscripts can be arbitrary integer expressions, but all the operands in these expressions must be defined before the statement is executed and each must be in the range declared for the corresponding subscript of the array.
Remember that with an integer item on the left and an expression of one of the floating-point types on the right, the INT function is invoked, so the value is truncated. If the rounding of the NINT function is really needed then it must be used explicitly to convert the value of the expression.
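A minimal sketch of the difference between the automatic conversion and explicit rounding follows; the program and variable names are invented for this illustration.

```fortran
      PROGRAM CONVRT
*     Automatic type conversion on assignment truncates.
*     All names here are illustrative only.
      INTEGER ITRUNC, IROUND
      REAL VALUE
      VALUE = 2.75
*     Implicit conversion: equivalent to ITRUNC = INT(VALUE)
      ITRUNC = VALUE
*     Explicit rounding to the nearest integer
      IROUND = NINT(VALUE)
      PRINT *, ITRUNC, IROUND
      END
```

Here ITRUNC receives 2 while IROUND receives 3.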
--------------------------------------------------------------------------------
Character Handling and Logic
This section describes the facilities for handling non-numerical data in Fortran. Character data are actually present in almost all programs, if only in the form of file names and error messages, but the facilities for character manipulation are now quite powerful. The logical data type is even more indispensable since a logical expression is used in every IF statement.
Character Facilities
The character data type differs from all the others in one important respect: every character item has a fixed length. This specifies the number of characters it holds.
The length of a literal character constant is just the number of characters between the enclosing apostrophes (except that two consecutive apostrophes within the string count as one). Thus:
'it''s'
is a character constant of length four. Because the length of every character variable, array, and function has to be specified in advance it is nearly always necessary to use CHARACTER statements to declare them, for example:
CHARACTER NAME*20, ADDRSS(3)*40, ZIP*7
The same applies to named character constants but for these a special notation sets the length to that of the attached constant, which saves the trouble of counting characters:
CHARACTER TITLE*(*)
PARAMETER (TITLE = 'Latest mailing list')
The fixed length of character objects makes it easy to output data in a fixed format as when printing a table with neatly aligned columns, but sometimes it would be more convenient to have a variable length string type as some other languages do. The rules for character assignment go some way towards this: if an expression is too short then blanks are appended to it; if it is too long then characters are removed from the right-hand end. For many purposes, therefore, it is only necessary to ensure that character variables are at least as long as the longest string you need to store in them.
When transferring character information to procedures the length of the dummy argument can be set automatically to that of the corresponding actual argument. With this passed length notation it is easy to write general-purpose character handling procedures. This is described further in section 9.5.
The most common operations carried out on character strings are splitting them up and joining them together. Any section of a character variable or array element can be extracted by using the substring notation. Strings (and substrings) can be joined end to end by using the concatenation operator in a character expression. These are described in the next two sections.
Another fairly common requirement is to search for a particular sequence of characters within a longer string: this can be done with the intrinsic function INDEX.
Other intrinsic functions ICHAR and CHAR are provided to convert a single character to an integer or vice-versa according to its position within the native character set. More complicated conversions from a numerical data type to character form and vice-versa are best carried out using the internal file READ and WRITE statements which allow the power of the format specification to be applied to the task. This mechanism is described in section 10.3.
Character strings can be compared to each other using relational operators or intrinsic functions. The latter use the ASCII collating sequence irrespective of the native character code. Further details are given in section 7.6.
Character Substrings
The substring notation can be used to select any contiguous section of any character variable or array element. The characters in any string are numbered starting from one on the left: the lower bound cannot be altered as it can in arrays. A substring is selected simply by giving the first and last character positions of the extract. For example, with:
CHARACTER METAL*10
METAL = 'CADMIUM'
then METAL(1:3) has the value 'CAD' while METAL(8:8) has the value blank because the value is padded out with blanks to its declared length.
Substrings must be at least one character long. They can be used in general in the same ways as character variables. Continuing with the last example, the assignment statement:
METAL(3:4) = 'ES'
will change the value of METAL to 'CAESIUM   ' (with three blanks at the end, since the total length stays at 10).
Substring Rules
The parentheses denoting a substring must contain a colon: there may be an integer expression on either side of the colon. The first expression denotes the initial character position, the second one the last character position. Both values must be within the range 1 to LEN, where LEN is the length of the parent string, and the length of the resulting substring must not be less than one.
Although the colon must always be present, the two integer expressions are optional. The default value for the first one is one, the default for the second is the position of the last character of the parent string. Thus, staying with the last example: METAL(:2) has the value 'CA' while METAL(7:) has the value 'M' with three blanks.
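The substring forms discussed so far can be collected into one short program. This is an illustrative sketch only: the program name SUBSTR is invented, and the brackets are printed simply to make trailing blanks visible.

```fortran
      PROGRAM SUBSTR
*     Substring extraction, default bounds, and assignment.
*     Brackets are printed only to show trailing blanks.
      CHARACTER METAL*10
      METAL = 'CADMIUM'
*     Explicit bounds
      PRINT *, '[', METAL(1:3), ']'
*     Default lower bound (1) and default upper bound (10)
      PRINT *, '[', METAL(:2), ']'
      PRINT *, '[', METAL(7:), ']'
*     Overwrite characters 3 and 4 only; the rest is untouched.
      METAL(3:4) = 'ES'
      PRINT *, '[', METAL, ']'
      END
```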
With array elements the substring expression follows the subscript expression, for example:
CHARACTER PLAY(30)*80
PLAY(10) = 'AS YOU LIKE IT'
Then the substring PLAY(10)(4:11) has the value 'YOU LIKE'. Substrings can be used in expressions anywhere except in the definition of a statement function; they can also be used on the left-hand side of an assignment statement, and can also be defined by input/output statements.
Character Expressions
The character operator // is used to concatenate, or join, two character strings. It is, in fact, the only character operator that Fortran provides. Thus:
'CUP' // 'BOARD' \Longrightarrow 'CUPBOARD'
The length of the result is just the sum of the lengths of the operands. Parentheses may be used in character expressions but make no difference to the result. Note that any embedded or trailing blanks (spaces) will be reproduced exactly in the resulting string.
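The effect of trailing blanks on concatenation is easy to overlook when the operands are variables rather than constants. The sketch below, with names invented for the illustration, shows the blanks of the first operand being carried into the result unless a substring is used to exclude them.

```fortran
      PROGRAM CONCAT
*     Concatenation keeps the trailing blanks of its operands.
*     Variable names and lengths are illustrative assumptions.
      CHARACTER FIRST*6, LAST*8, FULL*14
      FIRST = 'CUP'
      LAST = 'BOARD'
*     FIRST contributes all six characters, blanks included.
      FULL = FIRST // LAST
      PRINT *, '[', FULL, ']'
*     A substring excludes the unwanted blanks.
      FULL = FIRST(1:3) // LAST
      PRINT *, '[', FULL, ']'
      END
```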
The general form of a character-expression is thus:
character-operand
or
character-expression // character-operand
where character-operand can be any of the following:
character constant (literal or named),
character variable,
character array element,
character substring,
character function reference.
There is one special restriction on character concatenation in procedures: a passed-length dummy argument can only be an operand of the concatenation operator in an assignment statement. This seemingly arbitrary rule allows the compiler to determine how much work-space is required.
Character Assignment Statements
The character assignment statement has the general form:
char-var = character-expression
where char-var can be a character variable, array element, or substring.
There is one important restriction on character assignment statements: none of the characters being referenced in the expression on the right may be defined in char-var on the left, that is to say there can be no overlap. Thus the assignment statement:
STRING(1:N) = STRING(10:)
is valid only as long as N is no higher than 9. It is, of course, easy to get around this restriction by using a temporary character variable with a suitable length.
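One way of working around the overlap restriction with a temporary variable is sketched below. The names SLIDE and TEMP, the string length, and the shift of three characters are all assumptions made for the illustration.

```fortran
      PROGRAM SLIDE
*     Shift a string left by three characters via a temporary.
*     Names and lengths here are illustrative only.
      CHARACTER STRING*20, TEMP*20
      STRING = 'ABCDEFGHIJKLMNOPQRST'
*     STRING(1:17) = STRING(4:20) would overlap, so copy first.
      TEMP = STRING(4:)
*     TEMP is blank-padded, so the vacated positions become blank.
      STRING = TEMP
      PRINT *, '[', STRING, ']'
      END
```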
Note that when a value is assigned to a substring (as in the last example) the other characters in the parent string are not affected at all. If the string was previously undefined then the other character positions will still be undefined; otherwise they will retain their previous contents.
The expression and the character object to which its value is assigned may have different lengths: if the expression is longer then the excess characters on the right are lost; if it is shorter then blanks are appended. Care is needed to declare adequate lengths or else the results can be unexpected:
CHARACTER AUTHOR*30, SHORT*5, EXPAND*10
AUTHOR = 'SHAKESPEARE, WILLIAM'
SHORT = AUTHOR
EXPAND = SHORT
The resulting value of EXPAND will be 'SHAKE     ' where the last five characters are blanks.
Character Intrinsic Functions
The four main character intrinsic functions are described in this section. There are another four functions provided to compare character strings with each other using the ASCII collating sequence: these are described in section 7.6.
CHAR and ICHAR
These two functions perform integer to character conversion and vice-versa using the internal code of the machine. Although most computers now use the ASCII character code, it is by no means universal, so these functions can only be used in a very limited way in portable software.
CHAR(I) returns the character at position I in the code table. For example, on a machine using ASCII code, CHAR(74) = 'J', since "J" is character number 74 in the ASCII code table.
ICHAR(STRING) returns the integer position in the code table of the first character of the argument STRING. For example, on a machine using ASCII code,
ICHAR('JOHN') \Longrightarrow 74
ICHAR('john') \Longrightarrow 106
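One use of ICHAR and CHAR that does not depend on particular code values is converting between a digit character and its numerical value. The sketch below assumes that the characters '0' to '9' occupy consecutive positions in the code table, which holds for ASCII and EBCDIC; the program and variable names are invented for the illustration.

```fortran
      PROGRAM DIGDEM
*     Convert a digit character to its numeric value and back.
*     Assumes '0' to '9' occupy consecutive code positions.
      CHARACTER C*1
      INTEGER N
      C = '7'
*     Offset from '0' gives the numeric value of the digit.
      N = ICHAR(C) - ICHAR('0')
      PRINT *, N
*     The reverse conversion, here producing the next digit.
      PRINT *, CHAR(ICHAR('0') + N + 1)
      END
```

Only differences between code positions are used, so no actual code value appears in the program.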
INDEX
INDEX is a search function; it takes two character arguments and returns an integer result. INDEX(S1, S2) searches for the character-string S2 in another string S1, which is usually longer. If S2 is present in S1 the function returns the character position at which it starts. If there is no match (or S1 is shorter than S2) then it returns the value zero. For example:
CHARACTER*20 SPELL
SPELL = 'ABRACADABRA'
K = INDEX(SPELL, 'RA')
Here K will be set to 3 because this is the position of the first occurrence of the string 'RA'. To find the second occurrence it is necessary to restart the search at the next character in the main string, for example:
L = INDEX(SPELL(K+1:), 'RA')
This will return the value 7 because the first occurrence of 'RA' in the substring 'ACADABRA' is at position 7. To find its position in the parent string the offset, K, must be added, making 10.
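The restart-and-add-offset technique extends naturally to finding every occurrence. The following is one possible sketch; the program name FINDAL and the variable NEXT are invented here, and the loop is written with IF and GO TO since Fortran77 has no other general loop construct.

```fortran
      PROGRAM FINDAL
*     List every position of 'RA' in SPELL using repeated INDEX
*     calls. K holds the start of the latest match found so far.
      CHARACTER SPELL*20
      INTEGER K, NEXT
      SPELL = 'ABRACADABRA'
      K = 0
   10 CONTINUE
*     Guard keeps the substring SPELL(K+1:) at least one long.
      IF (K .LT. LEN(SPELL)) THEN
         NEXT = INDEX(SPELL(K+1:), 'RA')
         IF (NEXT .GT. 0) THEN
*           Convert position in the substring to the parent string.
            K = K + NEXT
            PRINT *, 'Found at position ', K
            GO TO 10
         END IF
      END IF
      END
```

With the string shown, this reports positions 3 and 10.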
The INDEX function is often useful when manipulating character information. Suppose, for example, we have a string NAME containing a person's surname and initials, e.g.
Mozart,W.A.
The name can be reformatted to put the initials before the surname and omit the comma like this:
CHARACTER NAME*25, PERSON*25
*...
KCOMMA = INDEX(NAME, ',')
KSPACE = INDEX(NAME, ' ')
PERSON = NAME(KCOMMA+1:KSPACE-1) // NAME(1:KCOMMA-1)
Then PERSON will contain the string 'W.A.Mozart' (with blanks appended to the length of 25). Note that a separate variable, PERSON, was necessary because of the rule about overlapping strings in assignments.
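A complete program built around this fragment might look like the sketch below. The program name REFORM and the fallback branch are additions made for illustration: they guard against a name with no comma, or one that fills all 25 characters so that no blank is found, since either case would produce an invalid substring.

```fortran
      PROGRAM REFORM
*     Put the initials before the surname and drop the comma.
*     The guard and fallback are illustrative additions.
      CHARACTER NAME*25, PERSON*25
      INTEGER KCOMMA, KSPACE
      NAME = 'Mozart,W.A.'
      KCOMMA = INDEX(NAME, ',')
      KSPACE = INDEX(NAME, ' ')
      IF (KCOMMA .GT. 1 .AND. KSPACE .GT. KCOMMA + 1) THEN
         PERSON = NAME(KCOMMA+1:KSPACE-1) // NAME(1:KCOMMA-1)
      ELSE
*        Unexpected layout: leave the name unchanged.
         PERSON = NAME
      END IF
      PRINT *, PERSON
      END
```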
LEN
The LEN function takes a character argument and returns its length as an integer. The argument may be a local character variable or array element but this will just return a constant. LEN is more useful in procedures where character dummy arguments (and character function names) may have their length passed over from the calling unit, so that the length may be different on each procedure call. The length returned by LEN is that declared for the item. Sometimes it is more useful to find the length excluding trailing blanks. The next function does just that, using LEN in the process.
      INTEGER FUNCTION LENGTH(STRING)
*Returns length of string ignoring trailing blanks
      CHARACTER*(*) STRING
      DO 15, I = LEN(STRING), 1, -1
         IF(STRING(I:I) .NE. ' ') GO TO 20
   15 CONTINUE
   20 LENGTH = I
      END
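The function might be used as in the following sketch, which repeats the function so that the example is complete in itself. The program name TRIMIT and the test string are assumptions made here. Note the test on N before the substring is formed: an entirely blank string gives a length of zero, and a substring must be at least one character long.

```fortran
      PROGRAM TRIMIT
*     Compare the declared length with the length in use.
*     TRIMIT and TITLE are illustrative names.
      CHARACTER TITLE*30
      INTEGER LENGTH, N
      TITLE = 'AS YOU LIKE IT'
      N = LENGTH(TITLE)
      PRINT *, 'Declared length: ', LEN(TITLE)
      PRINT *, 'Used length:     ', N
*     A zero-length substring is not allowed, so test N first.
      IF (N .GT. 0) PRINT *, '[', TITLE(1:N), ']'
      END

      INTEGER FUNCTION LENGTH(STRING)
*     Returns length of string ignoring trailing blanks
      CHARACTER*(*) STRING
      INTEGER I
      DO 15, I = LEN(STRING), 1, -1
         IF (STRING(I:I) .NE. ' ') GO TO 20
   15 CONTINUE
*     If the loop runs to completion I is zero at this point.
   20 LENGTH = I
      END
```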
Relational Expressions
A relational expression compares the values of two arithmetic expressions or two character expressions: the result is a logical value, either true or false. Relational expressions are commonly used in IF statements, as in this example:
      IF(SENSOR .GT. UPPER) THEN
          CALL COOL
      ELSE IF(SENSOR .LT. LOWER) THEN
          CALL HEAT
      END IF
The relational operators have forms such as .GT. and .LT. because the Fortran character set does not include the usual characters > and <.
[...]
* = DIM(IRD,IRD) Returns the positive difference of its arguments: if arg1 > arg2 it returns (arg1 - arg2), otherwise zero.
D = DPROD(R,R) Computes the double precision product of two real values.
* = EXP(RDX) Returns the exponential, i.e. e to the power of the argument. This is the inverse of the natural logarithm.
I = ICHAR(C) Returns position of first character of the string in the local character code table.
I = INDEX(C,C) Searches first string and returns position of second string within it starting at 1, otherwise zero.
I = INT(IRDX) Converts to integer by truncation.
I = LEN(C) Returns length of the argument in characters.
L = LGE(C,C) Lexical comparison using ASCII collating sequence: returns true if arg1 >= arg2.
L = LGT(C,C) Lexical comparison using ASCII collating sequence: returns true if arg1 > arg2.
L = LLE(C,C) Lexical comparison using ASCII collating sequence: returns true if arg1 <= arg2.