This is a development/technology-transfer repository, not a production
repository. It contains a GitHub Oct. 28, 2023 version of
SMOKE (.0.f
), a M3UTILIO
ized, I/O API 3.2-ized (.f
) of
SMOKE, an findent (.1.f90
) free-source-format reference version of
SMOKE constructed from the .f
, and a cleaned-up .f90
version of
SMOKE. This work has been done entirely on my own time (because many
of these changes should have been done long ago, and because I hated
to see something I was originally responsible for in the shape SMOKE
now is in). I had no external support for this work whatsoever.
-- Carlie J. Coats, Jr., Ph.D.
This code version has not been tested (I don't have the facilities to do that), but it does compile successfully with Intel ifort version 14, GNU gfortran version 4.9.2, and Sun sunf95 version 8.6. (Due to the idiosyncratic interpretation of the Fortran Standard in GNU gfortran versions 10 and later, those compiler-versions need testing, and I don't have a system that has them.)
In several places, copyright, attribution, and license notices that had been stripped from the code have been replaced.
SMOKE should not have its own private, out-of-date copy of I/O API
routine NEXTIME
. This has been removed.
Where appropriate (and easy), codes have been brought up to I/O API 3.2.
For example, make-symbol E132
now gives the compile-flags for the
(non-Standard) "extended-132-column fixed" Fortran source format used by
SMOKE (eliminating the previously-necessary compiler-dependency in
SMOKE/src/Makeinclude).
Argument-list bugs that were found, as well as the FORMAT
bugs in
src/movesmrg/rdmrclist and lib/efsetup (MCREFIDX
and VOLNAM
are
CHARACTER
, not INTEGER
as FORMAT 94010
demands; see
https://forum.cmascenter.org/t/error-running-movesmerg-for-rpd/4606/4)
have been fixed. There is a greatly-simplified lib/efsetup.2.f90.
Fortran MODULE M3UTILIO
does the
"proper F90 way" of encapsulating the Models-3 I/O API parameter, data
structure, and function-declarations, and has done so since I/O API 3.0
(2002). In addition to the I/O routines like WRITE3
, it also provides
INTERFACE
blocks for almost all the routines in the I/O API, which
provides for compile-time checking of argument lists, etc.: so far, the
conversion from I/O API-2 style code to USE M3UTILIO
has found
argument-list errors in every code to which it has been applied. This
now includes eliminating argument-list bugs in the current (GitHub Oct.
28, 2023) version of SMOKE, upon which this code distribution is based.
Note that MODULE M3UTILIO
also provides generic/polymorphic interfaces
for a number of routines: for example,
ENVGET
provides a
type-safe name for calling any of ENVINT(), ENVINT8(), ENVDBLE(), ENVREAL(), ENVSTR(), ENVYN(), BENVINT(), BENVINT8(), BENVDBLE(),
and
BENVREAL()
(These have not been introduced into this version of SMOKE.)
The resulting code has then been converted to Fortran-90 Standard (.f90) free source format, which should have been done easily and quickly as an automated process using Willem Vermin's findent program, with command-line options:
findent -i4 -k6 -ofree -RR !*
The resulting reference code was named .1.f90
. The final .f90
codes
needed fix-up on loop-nest and comment indentation, etc., as well as
fixing up numerous improper-blank botches in the original code, such as
"( /", "/ )", ". AND.", ". LT."
that should never have had embedded blanks. The corresponding free
source-format INCLUDE
-files are named .h90
.
Makefile
s were updated to include missing dependencies where the
changes to them had not kept up with the changes to the source-code
base.
The directory structure matches that of the default SMOKE/src subtree from GitHub.
There is a reference read-only .0.f
copy of each of the original GitHub
source files, e.g., src/biog/czangle.0.f.
New "fixed-132" codes go by the "standard" SMOKE naming, e.g.,
src/biog/czangle.f Intermediate "scratch" or "improved" versions of
the codes have other in-fixes, e.g., src/biog/tmpbeis4.1.f and
src/biog/tmpbeis4.2.f. Read-only reference-copy findent outputs use
a .1
in-fix, e.g., src/biog/czangle.1.f90, and final "free
.f90-style" codes go by .f90
(without an in-fix).
For comparing these code-versions, note that programs xxdiff, tkdiff, and kompare are graphical "difference" programs that display the codes side-by-side with highlighted differences, making it easy to see what the changes are between all these file versions, e.g.,
xxdiff temporal/wrtsup.0.f temporal/wrtsup.f
Note that the .f
and .f90
codes require different Makefile and
Makeinclude files. Files .f.Makefile
and .f.Makeinclude
are for
the former, and .f90.Makefile
and .f90.Makeinclude
are for the
latter. To build SMOKE: in your src directory, copy the relevant pair
of these to the standard Makefile and Makeinclude names, and then
make. For reference, the original Makefile
can be found in Makefile.0
All relevant codes have been M3UTILIO
-ized. This entailed removing
INCLUDE
statements for the I/O API include-files PARMS3.EXT, IODECL3.EXT, FDESC3.EXT
as well as external-function declarations for
I/O API fuctions that were called. Other external function declarations
were put into Fortran-90 style, e.g.
INTEGER, EXTERNAL :: GETFLINE
Argument-list bugs were found and fixed for certain calls to SORTI2
,
SORTR2
, and INDEX1
.
Error-checking was missing for almost all of the ENVINT, ENVREAL, ENVYN
calls. This error-checking has now been added, so if the
user does something inappropriate in a script like assigning a
non-integer where an environment variable should be an integer:
setenv IFOO dingbats
Then the new code now error-checks to detect this invalid value, report the problem, and exit (whereas the old code would have allowed the code to continue inappropriately with a potentially-bad value).
I/O API routine M3ERR
was deprecated (replaced by M3EXIT
and M3WARN
)
for I/O API Prototype 0.5 of July 1992, and TRIMLEN
was deprecated
(replaced by Fortran-90 intrinsic LEN_TRIM
) for I/O API version 3.0
(2002). Occurrences of both of these have been replaced.
ALLOCATE
s were grouped together for improved readability (and improved
performance: each ALLOCATE
makes a separate system-call, and these
are expensive). For example, the more-readable
ALLOCATE( WSAT( NCOLS, NROWS ), & RN( NCOLS, NROWS ), & RC( NCOLS, NROWS ), & TEMPG( NCOLS, NROWS ), & RADYNI( NCOLS, NROWS ), & RATM( NCOLS, NROWS ), & LAI( NCOLS, NROWS ), & Q2( NCOLS, NROWS ), & RSTOMI( NCOLS, NROWS ), & RSTOM( NCOLS, NROWS ), & USTAR( NCOLS, NROWS ), & RGRND( NCOLS, NROWS ), & PRES( NCOLS, NROWS ), & RAINFALL( NCOLS, NROWS, RHOURS ), & PTYPE( NCOLS, NROWS ), & PULSEDATE( NCOLS, NROWS ), & PULSETIME( NCOLS, NROWS ), & EMPOL( NCOLS, NROWS, NSEF ), & EMISL( NCOLS, NROWS, MSPCS ), & EMISS( NCOLS, NROWS, MSPCS ), & SEMIS( NCOLS, NROWS, NSEF ), & NONAGNO( NCOLS, NROWS ), & NGROWAGNO( NCOLS, NROWS ), & GROWAGNO( NCOLS, NROWS ), & SLAI( NCOLS, NROWS, NLAI ), STAT=IOS ) CALL CHECKMEM( IOS, 'WSAT...SLAI', PROGNAME )
replaces the far more cluttered
ALLOCATE( WSAT( NCOLS, NROWS ), STAT=IOS ) CALL CHECKMEM( IOS, 'WSAT', PROGNAME ) ALLOCATE( RN( NCOLS, NROWS ), STAT=IOS ) CALL CHECKMEM( IOS, 'RN', PROGNAME ) ALLOCATE( RC( NCOLS, NROWS ), STAT=IOS ) CALL CHECKMEM( IOS, 'RC', PROGNAME ) ALLOCATE( TEMPG(NCOLS, NROWS), STAT=IOS ) CALL CHECKMEM( IOS, 'TEMPG', PROGNAME ) ALLOCATE( RADYNI(NCOLS, NROWS), STAT=IOS ) CALL CHECKMEM( IOS, 'RADYNI', PROGNAME ) ALLOCATE( RATM(NCOLS, NROWS), STAT=IOS ) CALL CHECKMEM( IOS, 'RATM', PROGNAME ) ALLOCATE( LAI(NCOLS, NROWS), STAT=IOS ) CALL CHECKMEM( IOS, 'LAI', PROGNAME ) ALLOCATE( Q2(NCOLS, NROWS), STAT=IOS ) CALL CHECKMEM( IOS, 'Q2', PROGNAME ) ALLOCATE( RSTOMI(NCOLS, NROWS), STAT=IOS ) CALL CHECKMEM( IOS, 'RSTOMI', PROGNAME ) ALLOCATE( RSTOM(NCOLS, NROWS), STAT=IOS ) CALL CHECKMEM( IOS, 'RSTOM', PROGNAME ) ALLOCATE( USTAR(NCOLS, NROWS), STAT=IOS ) CALL CHECKMEM( IOS, 'USTAR', PROGNAME ) ALLOCATE( RGRND(NCOLS, NROWS), STAT=IOS ) CALL CHECKMEM( IOS, 'RGRND', PROGNAME ) ALLOCATE( RAINFALL( NCOLS, NROWS, RHOURS ),STAT=IOS ) CALL CHECKMEM( IOS, 'RAINFALL', PROGNAME ) ALLOCATE( PTYPE( NCOLS, NROWS), STAT=IOS ) CALL CHECKMEM( IOS, 'PTYPE', PROGNAME ) ALLOCATE( PULSEDATE( NCOLS, NROWS), STAT=IOS ) CALL CHECKMEM( IOS, 'PULSEDATE', PROGNAME ) ALLOCATE( PULSETIME( NCOLS, NROWS), STAT=IOS ) CALL CHECKMEM( IOS, 'PULSETIME', PROGNAME ) ALLOCATE( EMPOL( NCOLS, NROWS, NSEF ), STAT=IOS ) CALL CHECKMEM( IOS, 'EMPOL', PROGNAME ) ALLOCATE( PRES( NCOLS, NROWS ), STAT=IOS ) CALL CHECKMEM( IOS, 'PRES', PROGNAME ) ALLOCATE( EMISL( NCOLS, NROWS, MSPCS ), STAT=IOS ) CALL CHECKMEM( IOS, 'EMISL', PROGNAME ) ALLOCATE( EMISS( NCOLS, NROWS, MSPCS ), STAT=IOS ) CALL CHECKMEM( IOS, 'EMISS', PROGNAME ) ALLOCATE( SEMIS( NCOLS, NROWS, NSEF ), STAT=IOS ) CALL CHECKMEM( IOS, 'SEMIS', PROGNAME ) ALLOCATE( NONAGNO( NCOLS, NROWS), STAT=IOS ) CALL CHECKMEM( IOS, 'NONAGNO', PROGNAME ) ALLOCATE( NGROWAGNO( NCOLS, NROWS), STAT=IOS ) CALL CHECKMEM( IOS, 'NGROWAGNO', PROGNAME ) ALLOCATE( GROWAGNO( NCOLS, NROWS), STAT=IOS ) CALL CHECKMEM( IOS, 'GROWAGNO', PROGNAME ) ALLOCATE( SLAI( NCOLS, NROWS, NLAI ), STAT=IOS ) CALL CHECKMEM( IOS, 'SLAI', PROGNAME )
It was recognized early-on in Models-3 development that grid-coordinate
related calculations needed to be done in DOUBLE PRECISION
, i.e.,
REAL*8
— the critical situation being exactly the situation
dealt with in lib/ingrid. This routine has been corrected to do so.
A number of loop nests (especially in biogenics) were found to be in nest-order that is as cache-hostile as possible. These have been replaced by the cache-friendly versions. This should result in improved performance (where these nests occur), especially for large-grid scenarios. See Optimizing Environmental Models for Microprocessor Based Systems — The Easy Stuff.
It was recognized from the very beginning of the Models-3 project in
1991 (and documented as such) that because REAL
arithmetic is always
subject to machine dependent round-off problems 1 and therefore
REAL
values should never be tested for exact equality, a robust
scheme was needed for the detection of "missing" for REAL
values.
Therefore, (with over-kill) the I/O API provided parameters
REAL, PARAMETER :: BADVAL3 = -9.999E36 ! for "missing" REAL, PARAMETER :: AMISS3 = -9.000E36 ! for testing
with the intention that the code for testing REAL
"missing" should
look like
REAL X ... X = BADVAL3 ! set X to "missing" ... IF ( X .LT. AMISS3 ) THEN ! test X for "missing" ...
Even on systems with the most extreme round-off difficulty properties
(e.g., IBM 360 or POWER, Intel x86/x87, early versions of SunOS or
IRIX), this is still a robust test. In many places in SMOKE (especially
program UAM2NCF
), the tests were for equality to AMISS3
, for which
the programmer should consider himself or herself lucky if the answer
comes out "right". In this new edition of SMOKE, these have all been
chased down and fixed.
The I/O API version of the FLTERR
and DBLERR
"these REAL
s are
certainly unequal, up to reasonable round-off" functions were carefully
tuned with tolerances that handle the numerical inadequacies of all of
the reasonable compute platforms (including WMO GRIB, which is quite
bad). In several places, the tolerances had been tightened unreasonably
by those un-knowing of the actual problems. These have been fixed.
In the biogenics, the photolysis process needs the tangent of the zenith
angle. The code from the original biogenics authors computed the cosine
CZEN
of the zenith angle, then took the inverse cosine
ZEN=ARCCOS(CZEN)
, used bounds on ZEN
to detect day/night, and
finally computed
`SQRT( 1.0 + TAN(ZEN)**2 )`
which (by the Pythagorean identity) equals 1.0/CZEN
, with the effect
of both added computational costs due to the use of extremely-expensive
trigonometric and inverse-trigonometric functions, and additional
round-off error. The original SMOKE prototype biogenics took advantage
of the high-school trigonometry Pythagorean identities to compute this
1.0/CZEN
, with day/night detection re-phrased in terms of CZEN
.
Someone in the mean-time replaced the improved version with the (lower
quality) original. This has now been fixed.
Fortran-90 provides very simple and flexible array structure for "auto"
local-variable arrays (which, BTW, provide "leak-proof" re-use of
memory). SMOKE systematically avoids this simplicity, using ALLOCATE
and DEALLOCATE
for what should be "auto" local-variable arrays. Note
that DEALLOCATE
does not necessarily reverse the effects of the
matching ALLOCATE
: it generally causes "holes" in the program's
memory-map, and if performed repeated constitutes a memory leak. In a
number of places, ALLOCATE / DEALLOCATE
arrays were turned into auto
arrays. (Doing this to the maximum extent possible would have been an
extremely tedious task ;-( ) [See PERMUTI
, below.]
Many constants were moved from DATA
statements to PARAMETER
statements.
For file-name and variable-name arguments, src/filesetapi/ routines
OPENSET
, CLOSESET
, READSET
, 'WRITESETand
DESCSETinappropriately check
LEN(NAME)instead of
LEN_TRIM(NAME)`.
In many places, constructs involving CHARACTER
-string lengths of the
form FOO( 1 : LEN_TRIM( FOO ) )
are replaced by the cheaper,
much-simpler, and more readable but equivalent TRIM( FOO )
. Likeise,
some "stupid" double-assignments like:
ALLOCATE( ENDLEN( MXPTCHR3 ), STAT=IOS ) CALL CHECKMEM( IOS, 'ENDLEN', PROGNAME ) ENDLEN = 1 ! array ENDLEN( 1:MXPTCHR3 ) = PTENDL3( 1:MXPTCHR3 )
have been simplified to:
ALLOCATE( ENDLEN( MXPTCHR3 ), STAT=IOS ) CALL CHECKMEM( IOS, 'ENDLEN', PROGNAME ) ENDLEN( 1:MXPTCHR3 ) = PTENDL3( 1:MXPTCHR3 )
Numerous embedded tab-characters were found (these make code-indentation problematical, depending upon one's editor-settings); these were replaced by the appropriate numbers of blanks.
In many places, the original code-authors introduced potential bugs due to a failure to understand how declaration-time initialization for variables works. The following sort of declaration
INTEGER :: NFOUND = 0
should almost always instead have a declaration, followed by a separate initialization-statement at the beginning of the body of the routine: INTEGER :: NFOUND ... NFOUND = 0 A number of these were fixed. (It is conceivable that the potential bugs were never triggered, because the relevant routines were called only a single time. However, the routines do not error-check to ensure that they are not called repeatedly.)
SMOKE parameters IOVLEN3
, IOULEN3
, and IODLEN3
, which must of necessity
match I/O API parameters NAMLEN3
and MXDLEN3
(but were introduced by
a failure to understand how to do this sort of thing properly in Fortran-90),
have been replaced by the matching I/O API parameters, now obtained from
M3UTILIO
. This results in the effective elimination of
src/inc/IOSTRG3.EXT and src/inc/IOPRVT3.EXT.
PARAMETER ALLFILES
belongs in MODULE MODFILESETAPI
, not scattered
through the filesetapi source code and additionally in INCLUDE
-file
SETDECL
. This has been implemented for the .f90
version of the
code.
SMOKE was converted to Fortran-90 Standard (.f90) free source format,
(from the M3UTILIO
ized code, using Willem Vermin's findent
program, with command-line options:
findent -i4 -k6 -ofree -RR !*
creating the .1.f90
version of the code. From this, the final .f90
code was generated, fixing up continuation and comment-indentation
styles.
Lines were re-folded to allow for the wider effective source-width created by the new free-format source indentation, especially when doing so eliminated "breaks" within phrases and enhanced readability.
Scripts should be re-coded so that they respect and use the UNIX/LINUX
exit-status returned by the programs: according to the standards, this
status is 0 if the program is successful, and non-zero otherwise. I/O API
routine M3EXIT
terminates the programs with the status given by its
status-argument; if the program is terminated otherwise (e.g., by a
segmentation fault), the operating system will set an appropriate exit
status, also. The scripting system should terminate as soon as a non-zero
exit status is encountered.
The set of supported map projections should be expanded to include at
least the Mercator and polar map projection types, and ideally to
include the full set of I/O API map projections. This can easily be
done using routines GRID2XY
and XY2XY
from I/O API
MODULE MODGCTP
.
File timestep-consistency checks in the TMPBEIS*
programs are bogus:
they should be replaced by checks that ask "does this file contain the
data" (formulated in terms, perhaps, of I/O API routine JSTEP3()
)
instead of "does this file have exactly the time steps I want?".
Likewise, hard-coded one-hour-timestep assumptions are inconsistent with many kinds of modeling and should be removed. They are so pervasive that this effort will be quite tedious.
The whole allocation-and-sorting system can and should be simplified
enormously, using a variant of lib/getfline.f that also returns the
number of non-comment lines in ASCII files, together with the new I/O
API generic routine PERMUTI
PERMUTI(N, INDX, ARR1 [, ARR2 [, ARR3]] )
that sorts its array-arguments ARR1
etc. in-place, on the basis of
the INDX
array returned by the SORTI
, without the need for extra
temporary scratch-arrays. Note that retrofitting PERMUTI
would be a
major task.
There is a new filesetapi implementation found mostly in
src/filesetapi/modfileset.new.f90 that (a) eliminates
src/inc/SETDECL.h90 in favor of USE MODFILESET
and moves
routines OPENSET, DESCSET, CLOSESET, PROMPTSET
that manipulate
fileset data structures into the CONTAINS
part of the module,
together with now-PRIVATE
worker routines CREATESET, CHKFILESET, CHKSETDESC, APPENDNAME
. For this, the Makefiles need to be
modified to eliminate source-files for the now-CONTAIN
ed routines, and
the rest of the SMOKE codes edited to USE MODFILESET
where they now
INCLUDE SETDECL.h90
.
The "generate new files in a file-set" constructs ensuring that file sizes do not exceed the (netCDF-2, 1990's) 2GB file size limit have not been needed since the introduction of netCDF-3 in the late 1990's. These constructs are no longer needed and should go away. Possibly (since I/O API version 3.2 supports up to 2048 variables and I/O API 3.2-large supports up to 16384 variables), all of src/filesetapi should be eliminated, using standard I/O API routines instead. (Note that netCDF-3 supports file sizes larger than SMOKE's 2 GB assumption, with a 2GB-per-timestep limit prior to netCDF-3.6 (2002) which does away with that limit...)
In this matter, lib/ioapi_grd_size needs to be changed from type
INTEGER
to INTEGER*8
, since with large grids and variable-counts,
timestep size in files may well exceed 2 GB.
Since SMOKE scripts always set environment variable PROMPTFLAG
to
NO
, the interactive-style prompt-for-file routines PROMPTMFILE
and
PROMPTFFILE
should not have been used; these should be replaced by the
script-style OPEN3 and
GETEFILE
I/O API functions
STR2INT,STR2REAL,STR2DBLE
were originally developed for use in SMOKE (starting with the SMOKE
prototpe version 0.2 (1993), where they are used to read
INTEGER,REAL,REAL*8
numbers from character strings with error
checking, returning IMISS3
or BADVAL3
in case of errors. As such,
the new-fangled SMOKE CHKINT
and CHKREAL
are superfluous; such
sequences
as
IF( .NOT. CHKINT( SEGMENT( 2 ) ) ... ) THEN ... ELSE IMON = STR2INT( SEGMENT( 2 ) ) ... END IF
are more complicated and much more expensive than
IMON = STR2INT( SEGMENT( 2 ) ) IF( IMON .EQ. IMISS3 ) THEN ... END IF
The original coding should be preferred.
REAL, PARAMETER :: A = 1.0 / 3.0 REAL, PARAMETER :: B = 0.333333333333333333 ! with extra-digit overkill REAL, PARAMETER :: C = FLOAT( 1 ) / FLOAT( 3 ) REAL D, E, F, G ... D = 1.0 / 3.0 E = FLOAT( 1 ) / FLOAT( 3 ) F = 0.333333333333333333 READ( '0.333333333333333333', * ) G ...
Footnotes
-
The Fortran Standard explicitly refuses to dictate the quality of how round-off behaves. As an extreme example, no two of the following are guaranteed to be equal (a problem found especially on Cray vector, IBM mainfreame and POWER, Intel x86/x87, and SGI platforms), although the naive impression is that they should all be equal: ↩