https://github.com/urbanjost/m_starpac

An update of NIST STARPAC 2.0.8 targeted at fpm(1) users

Host: GitHub
URL: https://github.com/urbanjost/m_starpac
Owner: urbanjost
Created: 2022-05-13T20:05:30.000Z (almost 3 years ago)
Default Branch: main
Last Pushed: 2024-03-29T08:13:01.000Z (11 months ago)
Last Synced: 2024-11-08T03:48:02.277Z (3 months ago)
Language: Fortran
Size: 7.01 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# WORK IN PROGRESS
*This is just a collection of files to use as seeds for a STARPAC
library callable from fpm (Fortran Package Manager).*

https://urbanjost.github.io/M_starpac/

[details="User's Guide"]
```text
USER'S GUIDE

STARPAC

THE STANDARDS TIME SERIES AND REGRESSION PACKAGE

------------------------------------------------

National Institute of Standards and Technology
(formerly the National Bureau of Standards)
Internal Report NBSIR 86-3448
(text revised 3/15/90)
(STARPAC sample output revised 03/15/90)

Janet R. Donaldson
Peter V. Tryon

U.S. Department of Commerce
Center for Computing and Applied Mathematics
National Institute of Standards and Technology
Boulder, Colorado 80303-3328

In memory of
Peter V. Tryon
1941 - 1982

Disclaimer

No warranties, express or implied, are made
by the distributors or developers that
STARPAC or its constituent parts are free of
error. They should not be relied upon as
the sole basis for solving a problem whose
incorrect solution could result in injury to
person or property. If the programs are
employed in such a manner, it is at the
user's own risk and the distributors and
developers disclaim all liability for such
misuse.

Computers have been identified in this
paper in order to adequately specify the
sample programs and test results. Such
identification does not imply recommendation
or endorsement by the National
Institute of Standards and Technology, nor
does it imply that the equipment identified
is necessarily the best available for the
purpose.

Preface

STARPAC, the Standards Time Series and Regression Package, is a library
of Fortran subroutines for statistical data analysis developed by the
Statistical Engineering Division (SED) of the National Institute of Standards
and Technology (NIST), formerly the National Bureau of Standards (NBS),
Boulder, Colorado. Earlier versions of this library were distributed by the
SED under the name STATLIB [Tryon and Donaldson, 1978]. Chapter 1 and
chapter 9 of this document were previously distributed as NBS Technical Notes
1068-1 and 1068-2, respectively [Donaldson and Tryon, 1983a and 1983b].
STARPAC incorporates many changes to STATLIB, including additional
statistical techniques, improved algorithms and enhanced portability.

STARPAC consists of families of subroutines for nonlinear least squares
regression, time series analysis (in both time and frequency domains), line
printer graphics, basic statistical analysis, and linear least squares
regression. These subroutines feature:

* ease of use, alone and with other Fortran subroutine libraries;

* extensive error handling facilities;

* comprehensive printed reports;

* no problem size restrictions other than effective machine size; and

* portability.

Notation, format and naming conventions are constant throughout the
STARPAC documentation, allowing the documentation for each family of
subroutines to be used alone or in conjunction with the documentation for
another.

STARPAC is written in ANSI Fortran 77 [American National Standards
Institute, 1977]. Workspace and machine-dependent constants are supplied
using subroutines based on the Bell Laboratories "Framework for a Portable
Library" [Fox et al., 1978a]. We have also used subroutines from LINPACK
[Dongarra et al., 1979], from the "Basic Linear Algebra Subprograms for
Fortran Usage" [Lawson et al., 1979], from DATAPAC [Filliben, 1977] and from
the portable special function subroutines of Fullerton [1977]. The analyses
generated by several of the subroutine families have been adapted from
OMNITAB II [Hogben et al., 1971]; users are directed to Peavy et al. [1985]
for information about OMNITAB 80, the current version of OMNITAB.

Computer facilities for the STARPAC project have been provided in part by
the National Oceanic and Atmospheric Administration Mountain Administrative
Support Center Computer Division, Boulder, Colorado, and we gratefully
acknowledge their support. The STARPAC subroutine library is the result of
the programming efforts of Janet R. Donaldson and John E. Koontz, with
assistance from Ginger A. Caldwell, Steven M. Keefer, and Linda L. Mitchell.
Valuable contributions have also been made by each of the members of the
Statistical Engineering Division in Boulder, and from many within the STARPAC
user community. We are grateful for the many valuable comments that we have
received on early drafts of the STARPAC documentation; we wish especially to
thank Paul T. Boggs, Ginger A. Caldwell, Sally E. Howe, John E. Koontz, James
T. Ringland, Ralph J. Slutz, and Dominic F. Vecchia. Finally, we wish to
thank Lorna Buhse for excellent manuscript support.

Janet R. Donaldson
Peter V. Tryon (deceased)
October 1985
Revised September 1987
Revised February 1990

Contents

Preface
1. INTRODUCTION TO USING STARPAC
A. Overview of STARPAC and Its Contents
B. Documentation Conventions
C. A Sample Program
D. Using STARPAC
D.1 The PROGRAM Statement
D.2 The Dimension Statements
D.3 The CALL Statements
D.4 STARPAC Output
D.5 STARPAC Error Handling
D.6 Common Programming Errors When Using STARPAC
2. LINE PRINTER GRAPHICS
A. Introduction
B. Subroutine Descriptions
C. Subroutine Declaration and CALL Statements
D. Dictionary of Subroutine Arguments and COMMON Variables
E. Computational Details
F. Examples
3. NORMAL RANDOM NUMBER GENERATION
A. Introduction
B. Subroutine Descriptions
C. Subroutine Declaration and CALL Statements
D. Dictionary of Subroutine Arguments
E. Computational Methods
F. Example
G. Acknowledgments
4. HISTOGRAMS
A. Introduction
B. Subroutine Descriptions
C. Subroutine Declaration and CALL Statements
D. Dictionary of Subroutine Arguments
E. Computational Methods
E.1 Algorithms
E.2 Computed Results and Printed Output
F. Example
G. Acknowledgments
5. STATISTICAL ANALYSIS OF A UNIVARIATE SAMPLE
A. Introduction
B. Subroutine Descriptions
C. Subroutine Declaration and CALL Statements
D. Dictionary of Subroutine Arguments and COMMON Variables
E. Computational Methods
E.1 Algorithms
E.2 Computed Results and Printed Output
F. Example
G. Acknowledgments

6. ONE-WAY ANALYSIS OF VARIANCE
A. Introduction
B. Subroutine Descriptions
C. Subroutine Declaration and CALL Statements
D. Dictionary of Subroutine Arguments and COMMON Variables
E. Computational Methods
E.1 Algorithms
E.2 Computed Results and Printed Output
F. Example
G. Acknowledgments
7. CORRELATION ANALYSIS
A. Introduction
B. Subroutine Descriptions
C. Subroutine Declaration and CALL Statements
D. Dictionary of Subroutine Arguments and COMMON Variables
E. Computational Methods
E.1 Algorithms
E.2 Computed Results and Printed Output
F. Example
G. Acknowledgments
8. LINEAR LEAST SQUARES
A. Introduction
B. Subroutine Descriptions
C. Subroutine Declaration and CALL Statements
D. Dictionary of Subroutine Arguments and COMMON Variables
E. Computational Methods
E.1 The Linear Least Squares Algorithm
E.2 Computed Results and Printed Output
F. Examples
G. Acknowledgments
9. NONLINEAR LEAST SQUARES
A. Introduction
B. Subroutine Descriptions
B.1 Nonlinear Least Squares Estimation Subroutines
B.2 Derivative Step Size Selection Subroutines
B.3 Derivative Checking Subroutines
C. Subroutine Declaration and CALL Statements
D. Dictionary of Subroutine Arguments and COMMON Variables
E. Computational Methods
E.1 Algorithms
E.1.a Nonlinear Least Squares Estimation
E.1.b Derivative Step Size Selection
E.1.c Derivative Checking
E.2 Computed Results and Printed Output
E.2.a The Nonlinear Least Squares Estimation
Subroutines
E.2.b The Derivative Step Size Selection Subroutines
E.2.c The Derivative Checking Subroutines
F. Examples
G. Acknowledgments

10. DIGITAL FILTERING
A. Introduction
B. Subroutine Descriptions
B.1 Symmetric Linear Filter Subroutines
B.2 Autoregressive or Difference Linear Filter
Subroutines
B.3 Gain and Phase Function Subroutines
C. Subroutine Declaration and CALL Statements
D. Dictionary of Subroutine Arguments and COMMON Variables
E. Computational Methods
E.1 Algorithms
E.2 Computed Results and Printed Output
F. Examples
G. Acknowledgments
11. COMPLEX DEMODULATION
A. Introduction
B. Subroutine Descriptions
C. Subroutine Declaration and CALL Statements
D. Dictionary of Subroutine Arguments and COMMON Variables
E. Computational Methods
E.1 Algorithms
E.2 Computed Results and Printed Output
F. Example
G. Acknowledgments
12. CORRELATION AND SPECTRUM ANALYSIS
A. Introduction
B. Subroutine Descriptions
B.1 Correlation Analysis
B.1.a Univariate Series
B.1.b Bivariate Series
B.2 Spectrum Estimation
B.2.a Univariate Series
B.2.b Bivariate Series
C. Subroutine Declaration and CALL Statements
D. Dictionary of Subroutine Arguments and COMMON Variables
E. Computational Methods
E.1 Algorithms
E.1.a Correlation Analysis
E.1.b Spectrum Analysis
E.2 Computed Results and Printed Output
E.2.a Correlation Analysis
E.2.b Spectrum Analysis
F. Examples
G. Acknowledgments

13. ARIMA MODELING
A. Introduction
B. Subroutine Descriptions
B.1 ARIMA Estimation Subroutines
B.2 ARIMA Forecasting Subroutines
C. Subroutine Declaration and CALL Statements
D. Dictionary of Subroutine Arguments and COMMON Variables
E. Computational Methods
E.1 Algorithms
E.1.a ARIMA Estimation
E.1.b ARIMA Forecasting
E.2 Computed Results and Printed Output
E.2.a The ARIMA Estimation Subroutines
E.2.b The ARIMA Forecasting Subroutines
F. Examples
G. Acknowledgments
Appendix A. CONTINUITY OF VERTICAL PLOTS ON THE CDC CYBER 840 AND 855
Appendix B. WEIGHTED LEAST SQUARES
Appendix C. ESTIMATING THE NUMBER OF RELIABLE DIGITS IN THE RESULTS OF A
FUNCTION
Appendix D. LIST OF STARPAC SUBPROGRAM NAMES
D.1 Subprograms specifically written for STARPAC
D.2 Subprograms from NL2SOL
D.3 Subprograms from miscellaneous public domain sources
D.4 Subprograms from LINPACK and BLAS
D.5 Subprograms specifying machine dependent constants
Appendix E. LIST OF STARPAC LABELED COMMON NAMES
References.

STARPAC

The Standards Time Series and Regression Package

Janet R. Donaldson and Peter V. Tryon

U.S. Department of Commerce
Center for Computing and Applied Mathematics
National Institute of Standards and Technology
Boulder, Colorado 80303-3328

STARPAC, the Standards Time Series and Regression Package, is a
library of Fortran subroutines for statistical data analysis
developed by the Statistical Engineering Division of the National
Institute of Standards and Technology (formerly the National
Bureau of Standards), Boulder, Colorado. Earlier versions of this
library were distributed by the SED under the name STATLIB [Tryon
and Donaldson, 1978]. STARPAC incorporates many changes to STATLIB,
including additional statistical techniques, improved algorithms and
enhanced portability.

STARPAC emphasizes the statistical interpretation of results,
and, for this reason, comprehensive printed reports of auxiliary
statistical information, often in graphical form, are automatically
provided to augment the basic statistical computations performed by
each user-callable STARPAC subroutine. STARPAC thus provides the
best features of many stand-alone statistical software programs
within the flexible environment of a subroutine library.

Key words: data analysis; nonlinear least squares; STARPAC;
statistical computing; statistical subroutine library; statistics;
STATLIB; time series analysis.

----- CHAPTER 1 -----

INTRODUCTION TO USING STARPAC

A. Overview of STARPAC and Its Contents

STARPAC is a portable library of approximately 150 user-callable ANSI 77
Fortran subroutines for statistical data analysis. Designed primarily for
time series analysis and for nonlinear least squares regression, STARPAC also
includes subroutines for normal random number generation, line printer plots,
basic statistical analyses and linear least squares. Emphasis has been placed
on facilitating the interpretation of statistical analyses, and, for this
reason, comprehensive printed reports of auxiliary statistical information,
often in graphical form, are automatically provided to augment the basic
statistical computations performed by each user-callable STARPAC subroutine.
STARPAC thus provides the best features of many stand-alone statistical
software programs within the flexible environment of a subroutine library.

STARPAC is designed to be easy to use; in many situations, only a few
lines of elementary Fortran code are required for the users' main programs. A
fundamental STARPAC philosophy is to provide two or more user-callable
subroutines for each method of analysis: one which minimizes the complexity
of the CALL statement, automatically producing a comprehensive printed report
of the results; and one or more others which provide user control of the
computations, allow suppression of all or part of the printed reports, and/or
provide storage of computed results for further analyses.

STARPAC was developed and is maintained by the Center for Computing and
Applied Mathematics of the National Institute of Standards and Technology
(NIST), Boulder, Colorado. Users' comments and suggestions, which have had
significant impact already, are highly valued and always welcomed. Comments
and suggestions should be directed to:

Janet R. Donaldson
NIST Center for Computing and Applied Mathematics
Mail Code 719
325 Broadway
Boulder, CO 80303-3328.

B. Documentation Conventions

The documentation for the various STARPAC subroutine families uses a
standard format description of the information needed to use a STARPAC
subroutine, including one or more examples.

References to chapter sections within the STARPAC documentation refer to
the identified section within the current chapter unless explicitly stated
otherwise. Figures are identified by the section in which they occur. For
example, figure B-1 refers to the first figure in section B of this chapter
(chapter 1).

Names of INTEGER and REAL STARPAC subroutine arguments are consistent
with the implicit Fortran convention for specifying variable type. That is,
variable names beginning with I through N are type INTEGER while all others
are type REAL unless otherwise explicitly typed DOUBLE PRECISION or COMPLEX.
All dimensioned variables are explicitly declared in STARPAC documentation by
means of INTEGER, REAL, DOUBLE PRECISION, or COMPLEX statements, as
appropriate. The convention used to specify the dimension statements is
discussed below in section D.2.

The precision of the STARPAC library is indicated in the printed reports
generated by STARPAC: an S following the STARPAC version number in the output
heading indicates the single precision version is being used, while a D
indicates the double precision version. The STARPAC documentation is designed
for use with both single and double precision versions. Subroutine arguments
which are double precision in both versions are declared DOUBLE PRECISION;
similarly, arguments which are single precision in both versions are declared
REAL. Arguments whose precision is dependent upon whether the single or
double precision version of STARPAC is being used are declared <real>. If the
double precision version of the STARPAC library is being used, then the user
should substitute DOUBLE PRECISION for <real>; if the single precision version
is being used, then the user should substitute REAL for <real>. Other
precision-dependent features are discussed as they occur.
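
As an illustration, the vectors Y and X of the sample program in section C
hold precision-dependent data. With the single precision version of STARPAC
their declaration reads

      REAL Y(100), X(100)

and with the double precision version it would read

      DOUBLE PRECISION Y(100), X(100)

The work vector DSTAK is declared DOUBLE PRECISION with either version.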

C. A Sample Program

The sample program shown below illustrates the use of STARPAC
subroutines. The code shown is portable ANSI 77 Fortran. Section D below
uses this example to discuss Fortran programming as it relates to STARPAC.

The data used in this example are 84 relative humidity measurements taken
at Pikes Peak, Colorado. STARPAC subroutine PP, documented in chapter 2,
plots the data versus time-order and STARPAC subroutine STAT, documented in
chapter 5, prints a comprehensive statistical analysis of the data.

Program:

      PROGRAM EXAMPL
C
C     DEMONSTRATE STAT AND PP USING SINGLE PRECISION VERSION OF STARPAC
C
C     N.B. DECLARATION OF Y AND X MUST BE CHANGED TO DOUBLE PRECISION
C          IF DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL Y(100), X(100)
      DOUBLE PRECISION DSTAK(100)
C
      COMMON /CSTAK/ DSTAK
      COMMON /ERRCHK/ IERR
C
C     SET UP INPUT AND OUTPUT FILES
C     [CHAPTER 1, SECTION D.4, DESCRIBES HOW TO CHANGE OUTPUT UNIT.]
C
      CALL IPRINT(IPRT)
      OPEN (UNIT=IPRT, FILE='FILENM')
      OPEN (UNIT=5, FILE='DATA')
C
C     DEFINE LDSTAK, THE LENGTH OF DSTAK
C
      LDSTAK = 100
C
C     READ NUMBER OF OBSERVATIONS INTO N AND
C     DATA INTO VECTOR Y
C
      READ (5,100) N
      READ (5,101) (Y(I), I=1,N)
C
C     CREATE A VECTOR OF ORDER INDICES IN X
C
      DO 10 I=1,N
         X(I) = I
   10 CONTINUE
C
C     PRINT TITLE, PLOT OF DATA AND ERROR INDICATOR
C
      WRITE (IPRT,102)
      CALL PP (Y, X, N)
      WRITE (IPRT,103) IERR
C
C     PRINT TITLE, STATISTICAL ANALYSIS OF DATA AND ERROR INDICATOR
C
      WRITE (IPRT,102)
      CALL STAT (Y, N, LDSTAK)
      WRITE (IPRT,103) IERR
C
C     FORMAT STATEMENTS
C
  100 FORMAT (I5)
  101 FORMAT (11F7.4)
  102 FORMAT ('1DAVIS-HARRISON PIKES PEAK RELATIVE HUMIDITY DATA')
  103 FORMAT (' IERR = ', I1)
      END

Data:

84
0.6067 0.6087 0.6086 0.6134 0.6108 0.6138 0.6125 0.6122 0.6110 0.6104 0.7213
0.7078 0.7021 0.7004 0.6981 0.7242 0.7268 0.7418 0.7407 0.7199 0.6225 0.6254
0.6252 0.6267 0.6218 0.6178 0.6216 0.6192 0.6191 0.6250 0.6188 0.6233 0.6225
0.6204 0.6207 0.6168 0.6141 0.6291 0.6231 0.6222 0.6252 0.6308 0.6376 0.6330
0.6303 0.6301 0.6390 0.6423 0.6300 0.6260 0.6292 0.6298 0.6290 0.6262 0.5952
0.5951 0.6314 0.6440 0.6439 0.6326 0.6392 0.6417 0.6412 0.6530 0.6411 0.6355
0.6344 0.6623 0.6276 0.6307 0.6354 0.6197 0.6153 0.6340 0.6338 0.6284 0.6162
0.6252 0.6349 0.6344 0.6361 0.6373 0.6337 0.6383

1DAVIS-HARRISON PIKES PEAK RELATIVE HUMIDITY DATA
STARPAC 2.08S (03/15/90)

[Page plot produced by CALL PP (Y, X, N): the 84 relative humidity values on
the vertical axis (scaled from .5951 to .7418) plotted against time order on
the horizontal axis (scaled from 1.0000 to 84.0000).]

IERR = 0

1DAVIS-HARRISON PIKES PEAK RELATIVE HUMIDITY DATA
STARPAC 2.08S (03/15/90)
+STATISTICAL ANALYSIS

N = 84

FREQUENCY DISTRIBUTION (1-6) 5 25 35 8 1 0 0 4 4 2

MEASURES OF LOCATION (2-2) MEASURES OF DISPERSION (2-6)

UNWEIGHTED MEAN = 6.3734048E-01 WTD STANDARD DEVIATION = 3.2405213E-02
WEIGHTED MEAN = 6.3734048E-01 WEIGHTED S.D. OF MEAN = 3.5356987E-03
MEDIAN = 6.2915000E-01 RANGE = 1.4670000E-01
MID-RANGE = 6.6845000E-01 MEAN DEVIATION = 2.1076417E-02
25 PCT UNWTD TRIMMED MEAN= 6.2885952E-01 VARIANCE = 1.0500979E-03
25 PCT WTD TRIMMED MEAN = 6.2885952E-01 COEF. OF. VAR. (PERCENT) = 5.0844430E+00

A TWO-SIDED 95 PCT CONFIDENCE INTERVAL FOR MEAN IS 6.3030811E-01 TO 6.4437284E-01 (2-2)
A TWO-SIDED 95 PCT CONFIDENCE INTERVAL FOR S.D. IS 2.8137110E-02 TO 3.8211736E-02 (2-7)

LINEAR TREND STATISTICS (5-1) OTHER STATISTICS

SLOPE = -2.4736661E-04 MINIMUM = 5.9510000E-01
S.D. OF SLOPE = 1.4414086E-04 MAXIMUM = 7.4180000E-01
SLOPE/S.D. OF SLOPE = T = -1.7161450E+00 BETA ONE = 3.7288258E+00
PROB EXCEEDING ABS VALUE OF OBS T = .090 BETA TWO = 5.9283926E+00
WTD SUM OF VALUES = 5.3536600E+01
WTD SUM OF SQUARES = 3.4208200E+01
TESTS FOR NON-RANDOMNESS WTD SUM OF DEV SQUARED = 8.7158122E-02
STUDENTS T = 1.8025871E+02
NO. OF RUNS UP AND DOWN = 47 WTD SUM ABSOLUTE VALUES = 5.3536600E+01
EXPECTED NO. OF RUNS = 55.7 WTD AVE ABSOLUTE VALUES = 6.3734048E-01
S.D. OF NO. OF RUNS = 3.82
MEAN SQ SUCCESSIVE DIFF = 3.6382337E-04
MEAN SQ SUCC DIFF/VAR = .346

DEVIATIONS FROM WTD MEAN

NO. OF + SIGNS = 22
NO. OF - SIGNS = 62
NO. OF RUNS = 14
EXPECTED NO. OF RUNS= 33.5
S.D. OF RUNS = 3.51
DIFF./S.D. OF RUNS = -5.550

NOTE - ITEMS IN PARENTHESES REFER TO PAGE NUMBER IN NBS HANDBOOK 91 (NATRELLA, 1966)
IERR = 0
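
Several of the printed values can be cross-checked by hand from the others,
which is a quick way to confirm how the report is to be read. The relations
below are the usual textbook definitions; they are offered as a reading aid
and are not quoted from the STARPAC computational details.

RANGE               = MAXIMUM - MINIMUM     = .7418 - .5951     = .1467
MID-RANGE           = (MAXIMUM + MINIMUM)/2 = (.7418 + .5951)/2 = .66845
MEAN                = SUM OF VALUES / N     = 53.5366/84        = .63734
VARIANCE            = (S.D.)**2             = (.032405)**2      = .0010501
S.D. OF MEAN        = S.D./SQRT(N)          = .032405/9.1652    = .0035357
COEF. OF VAR. (PCT) = 100*S.D./MEAN         = 3.2405/.63734     = 5.0844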

D. Using STARPAC

The following subsections provide general information needed when using
STARPAC, including a discussion of Fortran programming as it relates to
STARPAC usage. Although only elementary knowledge of Fortran is required to
use STARPAC, users may still have to consult with a Fortran text and/or their
Computing Center staff when questions arise.

D.1 The PROGRAM Statement

The PROGRAM statement is used to name the user's main program. The name
EXAMPL is assigned to the main program in this example. The program name
cannot be the name of any variable in the user's main program and, in
addition, cannot be the name of any other subroutine or function called during
execution of the user's code. Specifically, it cannot be the name of any
subroutine within STARPAC. To ensure that the name of a STARPAC subroutine is
not inadvertently chosen for the name of the main program, users should
consult with the local installer of STARPAC to obtain a list of the STARPAC
subroutine names.

D.2 The Dimension Statements

The user's program must include dimension statements to define the sizes
and types of the vectors, matrices and three-dimensional arrays required by
each STARPAC subroutine used; STARPAC itself has no inherent upper limit on
problem size.

Within the STARPAC documentation for the subroutine declaration and CALL
statements, lowercase identifiers in the dimension statements
represent integer constants which must equal or exceed the value
of the identically-spelled uppercase argument. For example,
if the documentation specifies the minimum dimension of a variable
as XM(n,m), and if the number of observations N is 15,
and the number of columns of data M is 3, then (assuming the single
precision version of STARPAC is being used) the minimum array size is
given by the dimension statement REAL XM(15,3).

The exact dimensions assigned to some vectors and matrices must be
supplied in the CALL statements to some STARPAC subroutines. For example, the
argument IXM is defined as "the exact value of the first dimension of the
matrix XM as declared in the calling program." Continuing the example from
the preceding paragraph, if the statement REAL XM(20,5) is used to dimension
the matrix XM for a particular subroutine, and IXM is an argument in the CALL
statement, then IXM must have the value 20 regardless of the value assigned to
the variable N.
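
The distinction between N and IXM can be seen in a short stand-alone sketch.
The subroutine SUMXM and the program LDDEMO below are hypothetical and are
not part of STARPAC; they only illustrate why the declared first dimension
must be passed unchanged. XM is declared as REAL XM(20,5), but only its first
15 rows and 3 columns are used.

      PROGRAM LDDEMO
C
C     HYPOTHETICAL ILLUSTRATION, NOT PART OF STARPAC.  XM IS DECLARED
C     LARGER THAN THE PROBLEM, SO THE DECLARED FIRST DIMENSION (IXM)
C     IS PASSED SEPARATELY FROM THE NUMBER OF OBSERVATIONS (N).
C
      INTEGER I, J, N, M, IXM
      REAL XM(20,5), TOTAL
C
      N = 15
      M = 3
      IXM = 20
C
C     SET THE N BY M BLOCK THAT IS ACTUALLY USED TO ONE
C
      DO 20 J = 1, M
         DO 10 I = 1, N
            XM(I,J) = 1.0
   10    CONTINUE
   20 CONTINUE
C
      CALL SUMXM (XM, N, M, IXM, TOTAL)
      WRITE (*,100) TOTAL
  100 FORMAT (' SUM OF THE N BY M BLOCK = ', F6.1)
      END
C
      SUBROUTINE SUMXM (XM, N, M, IXM, TOTAL)
C
C     SUM THE FIRST N ROWS AND M COLUMNS OF XM.  THE DUMMY ARRAY IS
C     DIMENSIONED WITH IXM SO THAT COLUMN J IS FOUND WHERE THE CALLING
C     PROGRAM STORED IT.
C
      INTEGER I, J, N, M, IXM
      REAL XM(IXM,M), TOTAL
C
      TOTAL = 0.0
      DO 20 J = 1, M
         DO 10 I = 1, N
            TOTAL = TOTAL + XM(I,J)
   10    CONTINUE
   20 CONTINUE
      RETURN
      END

With IXM = 20 the subroutine sums exactly the 45 elements that were set to
one. Passing 15 in place of IXM would make the subroutine look for column 2
at the sixteenth storage location of XM instead of the twenty-first, so
elements outside the 15 by 3 block would be summed.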

Many STARPAC subroutines require a work area for internal computations.
This work area is provided by the DOUBLE PRECISION vector DSTAK. The rules
for defining DSTAK are as follows.

1. Programs which call subroutines requiring the work vector DSTAK must
include the statements

      DOUBLE PRECISION DSTAK (ldstak)
      COMMON /CSTAK/ DSTAK

where ldstak indicates the integer constant used to dimension DSTAK.

2. Since all STARPAC subroutines use the same work vector, the length of
DSTAK must equal or exceed the longest length required by any of the
individual STARPAC subroutines called by the user's program.

3. The length, LDSTAK, of the work vector DSTAK must be specified in the
CALL statement of any STARPAC subroutine using DSTAK to enable STARPAC
to verify that there will be sufficient work area for the problem.

It is recommended that a variable LDSTAK be set to the length of DSTAK,
and that this variable be used in each CALL statement requiring the length of
DSTAK to be specified. Then, if a future modification to the user's program
requires the length of DSTAK to be changed, the only alterations required in
the existing code would be to the DOUBLE PRECISION dimension statement and to
the statement which assigns the length of DSTAK to LDSTAK.
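
For instance, enlarging the work area of the sample program in section C
from 100 to 500 locations (500 is an arbitrary value chosen only for
illustration) would change the declaration to

      DOUBLE PRECISION DSTAK(500)

and the assignment to

      LDSTAK = 500

while the statement CALL STAT (Y, N, LDSTAK) is left as written.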

STARPAC manages its work area using subroutines modeled after those in
ACM Algorithm 528: Framework for a Portable Library [Fox et al. 1978a].
Although STARPAC and the Framework share the same COMMON for their work areas,
there are differences between the STARPAC management subroutines and those of
the Framework. In particular, the STARPAC management subroutines
reinitialize DSTAK each time the user invokes a STARPAC subroutine requiring
work area, destroying all data previously stored in DSTAK; the Framework only
initializes DSTAK the first time any of its management subroutines are
invoked, preserving work area allocations still in use. Thus, users must be
cautious when utilizing STARPAC with other libraries which employ the
Framework, such as PORT [Fox et al., 1978b].

The sample program shown in figure C-1 provides an example of the use of
dimensioned variables with STARPAC. The REAL vector Y, used by both
subroutines PP and STAT, contains the 84 relative humidity measurements; its
minimum length, N (the number of observations), is 84. The REAL vector X used
by subroutine PP contains the corresponding time order indices of the data;
its minimum length is also 84. The DOUBLE PRECISION vector DSTAK contains the
work area needed by STAT for intermediate computations; its minimum length, 49
in this case, is defined in section D of chapter 5. In this example, the
dimensions of Y, X, and DSTAK, are each 100, exceeding the required minimum
values.

D.3 The CALL Statements

The STARPAC CALL statement arguments provide the interface for specifying
the data to be used, controlling the computations, and providing space for any
returned results. The CALL statements used in the example (fig. C-1) are
CALL PP(Y, X, N) and CALL STAT(Y, N, LDSTAK). Note that scalar arguments may
be specified either by a variable preset to the desired value, as was done in
the example, or by the actual numerical values. For example, CALL
PP(Y, X, 84) and CALL STAT(Y, 84, 100) could have been used instead of the
forms shown. We recommend using variables rather than the actual numerical
values in order to simplify future changes in the program. When variables are
used, changes need to be made in only one place; numerical values have to be
changed every place they occur. The use of variables can also clarify the
meaning of the program.

D.4 STARPAC Output

Most STARPAC subroutines produce extensive printed reports, freeing the
user from formatting and printing all statistics of interest. The standard
output device is used for these reports. The user has the options of titling
the reports and changing the output device.

The first page of the report from each STARPAC subroutine does not start
on a new page. This allows the user to supply titles. For example,

      WRITE (6, 100)
  100 FORMAT ('1DAVIS-HARRISON PIKES PEAK RELATIVE HUMIDITY DATA')
      CALL PP (Y, X, N)

will print the title DAVIS-HARRISON PIKES PEAK RELATIVE HUMIDITY DATA on the
top line of a new page, immediately preceding the output produced by the call
to subroutine PP. Users should note that titles more than one line in length
can cause a printed report designed for one page to extend beyond the bottom
of the page.

The unit number, IPRT, of the output device used by STARPAC is returned
by STARPAC subroutine IPRINT. Users can change the output device unit number
by including with their program a subroutine IPRINT which will supersede the
STARPAC subroutine of the same name. The subroutine must have the form

      SUBROUTINE IPRINT(IPRT)
      IPRT = u
      RETURN
      END

where u is an integer value specifying the output unit to which all STARPAC
output will be written.
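
As a minimal sketch, the replacement below directs STARPAC output to unit 8.
The unit number 8, the file name REPORT and the program name UNITEX are
arbitrary choices made for this illustration. The main program obtains the
unit number from IPRINT and opens it before any report is written, in the
same way as the sample program of section C.

      PROGRAM UNITEX
C
C     SKETCH ONLY.  OBTAIN THE OUTPUT UNIT FROM IPRINT AND OPEN IT
C     BEFORE ANY STARPAC SUBROUTINE WRITES A REPORT.
C
      INTEGER IPRT
      CALL IPRINT(IPRT)
      OPEN (UNIT=IPRT, FILE='REPORT')
      WRITE (IPRT,100) IPRT
  100 FORMAT (' STARPAC OUTPUT UNIT = ', I2)
      END
C
      SUBROUTINE IPRINT(IPRT)
C
C     USER-SUPPLIED REPLACEMENT FOR THE STARPAC SUBROUTINE IPRINT.
C     UNIT 8 IS AN ARBITRARY CHOICE FOR THIS SKETCH.
C
      INTEGER IPRT
      IPRT = 8
      RETURN
      END

Because every STARPAC report is written to the unit returned by IPRINT,
changing the single assignment to IPRT redirects all of them at once.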

D.5 STARPAC Error Handling

STARPAC provides extensive error-checking facilities which include both
printed reports and a program-accessible error flag variable. There are
essentially two types of errors STARPAC can detect.

The first type of error involves incorrect problem specification, i.e.,
one or more of the input arguments in the subroutine statement has an improper
value. For example, the number of observations, N, might have an obviously
meaningless non-positive value. In the case of improper problem specification
STARPAC generates a printed report identifying the subroutine involved, the
error detected, and the proper form of the subroutine CALL statement. The
latter is provided because improper input is often the result of an
incorrectly specified subroutine argument list.

A second type of error can be thought of as a computation error: either
the initiated calculation cannot be completed or the results from the called
subroutine are questionable. For example, when the least squares model and
data are found to be singular, the desired computations cannot be completed;
when one or more of the standardized residuals from a least squares fit cannot
be computed because the standard deviation of the residual is zero, the
results of the error estimates from the least squares regression may be
questionable. If a computation error is detected, STARPAC generates a report
which identifies the error, and, to aid the user in determining the cause of
the error, summarizes the completed results in a printed report.

STARPAC error reports cannot be suppressed, even when the normal output
from the STARPAC subroutine has been suppressed. (STARPAC output must be
directed to a separate output device [see section D.4] when users do not want
any STARPAC reports displayed under any conditions.) Because of this, users
seldom have to consciously handle STARPAC error conditions in their code.

When proper execution of the user's program depends on knowing whether or
not an error has been detected, the error flag can be examined from within the
user's code. When access to the error flag is desired, the statement

      COMMON /ERRCHK/ IERR

must be placed with the Fortran declaration statements in the user's program.
Following the execution of a STARPAC subroutine, the variable IERR will be set
to zero if no errors were detected, and to a nonzero value otherwise; the
value of IERR may indicate the type of error [e.g., see chapter 9, section D,
argument IERR]. If the CALL statement is followed with a statement of the
form

      IF (IERR .NE. 0) STOP

then the program will stop when an error is detected. (In figure C-1, the
value of IERR is printed following each CALL statement to show the value
returned.)
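
A variation on this test, sketched below, prints the value of the flag before
stopping. The program name ERRDEM and the message texts are hypothetical. N is
deliberately set to zero, the kind of non-positive value described above as an
improper problem specification, so the call to STAT is expected to be flagged;
the exact value returned in IERR is not assumed here.

      PROGRAM ERRDEM
C
C     SKETCH ONLY.  N IS DELIBERATELY SET TO ZERO SO THAT THE CALL TO
C     STAT IS IMPROPERLY SPECIFIED AND THE ERROR FLAG CAN BE TESTED.
C
      REAL Y(100)
      DOUBLE PRECISION DSTAK(100)
C
      COMMON /CSTAK/ DSTAK
      COMMON /ERRCHK/ IERR
C
      CALL IPRINT(IPRT)
      LDSTAK = 100
C
C     FILL Y SO THAT NO UNDEFINED VALUES ARE PASSED TO STARPAC
C
      DO 10 I=1,100
         Y(I) = 0.0
   10 CONTINUE
C
      N = 0
      CALL STAT (Y, N, LDSTAK)
C
C     REPORT THE FLAG AND STOP IF STARPAC DETECTED AN ERROR
C
      IF (IERR .NE. 0) THEN
         WRITE (IPRT,100) IERR
         STOP
      END IF
      WRITE (IPRT,101)
C
  100 FORMAT (' STAT REPORTED AN ERROR, IERR = ', I3)
  101 FORMAT (' STAT COMPLETED WITHOUT ERROR')
      END

The STARPAC error report itself is still printed by the library; the test on
IERR only decides whether the user's program continues.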

D.6 Common Programming Errors When Using STARPAC

STARPAC error-checking procedures catch many programming errors and print
informative diagnostics when such errors are detected. However, there are
some errors which STARPAC cannot detect. The more common of these are
discussed below.

1. The most common error involves array dimensions which are too small.
Although certain arguments are checked by STARPAC to verify that array
dimensions are adequate, if incorrect information is supplied to
STARPAC, or if the dimension of an array which is not checked is too
small, the program will produce erroneous results and/or will stop
prematurely. Users should check the dimension statements in their
program whenever difficulties are encountered in using STARPAC.

2. The second most common error involves incorrect CALL statements, that
is, CALL statements in which the STARPAC subroutine name is
misspelled, the arguments are incorrectly ordered, one or more
arguments are omitted, or the argument types (INTEGER, REAL, DOUBLE
PRECISION, and COMPLEX) are incorrect. Users having problems using
STARPAC should carefully check their declaration and CALL statements
to verify that they agree with the documentation.

3. The third most common error involves incorrect specification of the
work vector DSTAK. Programs which call STARPAC subroutines requiring
work area must include both the DOUBLE PRECISION statement that dimensions
DSTAK and the COMMON /CSTAK/ DSTAK statement.

4. The final common error involves user-supplied subroutines which have
the same name as a subroutine in the STARPAC library. Users should

<1-10>
1 consult with the local installer of STARPAC to obtain a list of all
STARPAC subroutine names. This list can then be used to ensure that a
STARPAC subroutine name has not been duplicated.
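
For item 3 above, the two required statements might be written as in the
following sketch. The length 500 is an illustrative assumption only; the
length actually required depends on the STARPAC subroutine being called.

C     WORK VECTOR FOR STARPAC.  THE LENGTH 500 IS ILLUSTRATIVE ONLY.
      DOUBLE PRECISION DSTAK(500)
      COMMON /CSTAK/ DSTAK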

Users who have not found the cause of a problem after checking the
possibilities mentioned above should consult with their Computing Center
advisers.

<1-11>
1----- CHAPTER 2 -----

LINE PRINTER GRAPHICS

A. Introduction

STARPAC contains 36 subroutines for producing 2 basic styles of line
printer plots.

The first, called a page plot, uses a single 11 x 14 inch page of line
printer paper.

The second, called a vertical plot, is designed for plotting time series.
The user specifies only the y-axis values since the x-axis values (independent
variable) are assumed to be equally spaced and ordered consecutively. The
independent variable in the resulting plot is oriented vertically and the plot
will continue over as many pages as necessary to plot one point per line.

Within these two basic styles the user has many options, including
controlling the plot symbol, plotting multivariate values, designating missing
data, using log scales and specifying plot limits and plot size.

Users are directed to section B for a brief description of the
subroutines. The actual declaration and CALL statements are given in section
C and the subroutine arguments are defined in section D. The algorithms used
and output produced by these subroutines are discussed in section E. Sample
programs are shown in section F.

B. Subroutine Descriptions

PP (Page Plot) and VP (Vertical Plot) are the simplest of the STARPAC
line printer plot subroutines. For both, plot limits are automatically set by
the range of the data and other control parameters are set to the default
values given in section D. The remaining plotting subroutines are named by
adding letters to the beginning and/or end of PP and VP.

* Prefix S (e.g., SPP) indicates the user controls the plot symbol for
each point.

* Prefix M indicates the subroutine will accept multivariate y-axis
values (e.g., MPP).

* Suffix M subroutines allow data with missing observations to be plotted
(e.g., VPM).

* Suffix L indicates log scales can be used (e.g., PPL).

* Suffix C subroutines allow control of the parameters which specify the
plot limits, size, scale, etc. (e.g., VPC).

The following table, which indicates the capabilities of each of the
STARPAC plotting subroutines, can be used to select from the available
subroutines. Subroutine declaration and CALL statements are given in section
C, listed in the same order as in the table.

<2-1>
1
STARPAC Plotting Subroutines

           Plot
          Symbol  Multiple     Page    Vertical  Missing     Log     Control
         Control    Y-Axis     Plot      Plot      Data     Scale   Parameters
Name       (S)       (M)       (PP)      (VP)      (M)       (L)       (C)

PP                              *
PPL                             *                             *
PPC                             *                             *         *
PPM                             *                   *
PPML                            *                   *         *
PPMC                            *                   *         *         *

SPP         *                   *
SPPL        *                   *                             *
SPPC        *                   *                             *         *
SPPM        *                   *                   *
SPPML       *                   *                   *         *
SPPMC       *                   *                   *         *         *

MPP                   *         *
MPPL                  *         *                             *
MPPC                  *         *                             *         *
MPPM                  *         *                   *
MPPML                 *         *                   *         *
MPPMC                 *         *                   *         *         *

VP                                        *
VPL                                       *                   *
VPC                                       *                   *         *
VPM                                       *         *
VPML                                      *         *         *
VPMC                                      *         *         *         *

SVP         *                             *
SVPL        *                             *                   *
SVPC        *                             *                   *         *
SVPM        *                             *         *
SVPML       *                             *         *         *
SVPMC       *                             *         *         *         *

MVP                   *                   *
MVPL                  *                   *                   *
MVPC                  *                   *                   *         *
MVPM                  *                   *         *
MVPML                 *                   *         *         *
MVPMC                 *                   *         *         *         *

<2-2>
1C. Subroutine Declaration and CALL Statements

NOTE: Argument definitions and sample programs are given in section D and
section F, respectively. The conventions used to present the following
declaration and CALL statements are given in chapter 1, sections B and D.

Page Plots

PP: Print Y versus X scatterplot; linear axes; default control values and
axis limits; no missing values allowed

Y(n), X(n)
:
:
CALL PP (Y, X, N)

===

PPL: Print Y versus X scatterplot; log or linear axes; default control
values and axis limits; no missing values allowed

Y(n), X(n)
:
:
CALL PPL (Y, X, N, ILOG)

===

PPC: Print Y versus X scatterplot; log or linear axes; user-supplied
control values and axis limits; no missing values allowed

Y(n), X(n), YLB, YUB, XLB, XUB
:
:
CALL PPC (Y, X, N, ILOG,
+ ISIZE, NOUT, YLB, YUB, XLB, XUB)

===

PPM: Print Y versus X scatterplot; linear axes; default control values and
axis limits; missing values allowed

Y(n), YMISS, X(n), XMISS
:
:
CALL PPM (Y, YMISS, X, XMISS, N)

===

<2-3>
1PPML: Print Y versus X scatterplot; log or linear axes; default control
values and axis limits; missing values allowed

Y(n), YMISS, X(n), XMISS
:
:
CALL PPML (Y, YMISS, X, XMISS, N, ILOG)

===

PPMC: Print Y versus X scatterplot; log or linear axes; user-supplied
control values and axis limits; missing values allowed

Y(n), YMISS, X(n), XMISS, YLB, YUB, XLB, XUB
:
:
CALL PPMC (Y, YMISS, X, XMISS, N, ILOG,
+ ISIZE, NOUT, YLB, YUB, XLB, XUB)

===

SPP: Print Y versus X scatterplot with individual plot symbols specified
by user; linear axes; default control values and axis limits; no
missing values allowed

Y(n), X(n)
INTEGER ISYM(n)
:
:
CALL SPP (Y, X, N, ISYM)

===

SPPL: Print Y versus X scatterplot with individual plot symbols specified
by user; log or linear axes; default control values and axis limits;
no missing values allowed

Y(n), X(n)
INTEGER ISYM(n)
:
:
CALL SPPL (Y, X, N, ISYM, ILOG)

===

SPPC: Print Y versus X scatterplot with individual plot symbols specified
by user; log or linear axes; user-supplied control values and axis
limits; no missing values allowed

Y(n), X(n), YLB, YUB, XLB, XUB
INTEGER ISYM(n)
:
:
CALL SPPC (Y, X, N, ISYM, ILOG,
+ ISIZE, NOUT, YLB, YUB, XLB, XUB)

===

<2-4>
1
SPPM: Print Y versus X scatterplot with individual plot symbols specified
by user; linear axes; default control values and axis limits; missing
values allowed

Y(n), YMISS, X(n), XMISS
INTEGER ISYM(n)
:
:
CALL SPPM (Y, YMISS, X, XMISS, N, ISYM)

===

SPPML: Print Y versus X scatterplot with individual plot symbols specified
by user; log or linear axes; default control values and axis limits;
missing values allowed

Y(n), YMISS, X(n), XMISS
INTEGER ISYM(n)
:
:
CALL SPPML (Y, YMISS, X, XMISS, N, ISYM, ILOG)

===

SPPMC: Print Y versus X scatterplot with individual plot symbols specified
by user; log or linear axes; user-supplied control values and axis
limits; missing values allowed

Y(n), YMISS, X(n), XMISS, YLB, YUB, XLB, XUB
INTEGER ISYM(n)
:
:
CALL SPPMC (Y, YMISS, X, XMISS, N, ISYM, ILOG,
+ ISIZE, NOUT, YLB, YUB, XLB, XUB)

===

MPP: Print plot of multiple Y vectors versus a common X vector; linear
axes; default control values and axis limits; no missing values
allowed

YM(n,m), X(n)
:
:
CALL MPP (YM, X, N, M, IYM)

===

<2-5>
1MPPL: Print plot of multiple Y vectors versus a common X vector; log or
linear axes; default control values and axis limits; no missing values
allowed

YM(n,m), X(n)
:
:
CALL MPPL (YM, X, N, M, IYM, ILOG)

===

MPPC: Print plot of multiple Y vectors versus a common X vector; log or
linear axes; user-supplied control values and axis limits; no missing
values allowed

YM(n,m), X(n), YLB, YUB, XLB, XUB
:
:
CALL MPPC (YM, X, N, M, IYM, ILOG,
+ ISIZE, NOUT, YLB, YUB, XLB, XUB)

===

MPPM: Print plot of multiple Y vectors versus a common X vector; linear
axes; default control values and axis limits; missing values allowed

YM(n,m), YMMISS(m), X(n), XMISS
:
:
CALL MPPM (YM, YMMISS, X, XMISS, N, M, IYM)

===

MPPML: Print plot of multiple Y vectors versus a common X vector; log or
linear axes; default control values and axis limits; missing values
allowed

YM(n,m), YMMISS(m), X(n), XMISS
:
:
CALL MPPML (YM, YMMISS, X, XMISS, N, M, IYM, ILOG)

===

MPPMC: Print plot of multiple Y vectors versus a common X vector; log or
linear axes; user-supplied control values and axis limits; missing
values allowed

YM(n,m), YMMISS(m), X(n), XMISS, YLB, YUB, XLB, XUB
:
:
CALL MPPMC (YM, YMMISS, X, XMISS, N, M, IYM, ILOG,
+ ISIZE, NOUT, YLB, YUB, XLB, XUB)

===

<2-6>
1 Vertical Plots

VP: Print vertical plot of Y versus input order; linear axes; default
control values and axis limits; no missing values allowed

Y(n)
:
:
CALL VP (Y, N, NS)

===

VPL: Print vertical plot of Y versus input order; log or linear horizontal
(Y) axis; default control values and axis limits; no missing values
allowed

Y(n)
:
:
CALL VPL (Y, N, NS, ILOG)

===

VPC: Print vertical plot of Y versus input order; log or linear horizontal
(Y) axis; user-supplied control values and axis limits; no missing
values allowed

Y(n), YLB, YUB, XLB, XINC
:
:
CALL VPC (Y, N, NS, ILOG,
+ ISIZE, IRLIN, IBAR, YLB, YUB, XLB, XINC)

===

VPM: Print vertical plot of Y versus input order; linear axis; default
control values and axis limits; missing values allowed

Y(n), YMISS
:
:
CALL VPM (Y, YMISS, N, NS)

===

VPML: Print vertical plot of Y versus input order; log or linear horizontal
(Y) axis; default control values and axis limits; missing values
allowed

Y(n), YMISS
:
:
CALL VPML (Y, YMISS, N, NS, ILOG)

===

<2-7>
1VPMC: Print vertical plot of Y versus input order; log or linear horizontal
(Y) axis; user-supplied control values and axis limits; missing values
allowed

Y(n), YMISS, YLB, YUB, XLB, XINC
:
:
CALL VPMC (Y, YMISS, N, NS, ILOG,
+ ISIZE, IRLIN, IBAR, YLB, YUB, XLB, XINC)

===

SVP: Print vertical plot of Y versus input order with individual plot
symbols specified by user; linear axis; default control values and
axis limits; no missing values allowed

Y(n)
INTEGER ISYM(n)
:
:
CALL SVP (Y, N, NS, ISYM)

===

SVPL: Print vertical plot of Y versus input order with individual plot
symbols specified by user; log or linear horizontal (Y) axis; default
control values and axis limits; no missing values allowed

Y(n)
INTEGER ISYM(n)
:
:
CALL SVPL (Y, N, NS, ISYM, ILOG)

===

SVPC: Print vertical plot of Y versus input order with individual plot
symbols specified by user; log or linear horizontal (Y) axis;
user-supplied control values and axis limits; no missing values
allowed

Y(n), YLB, YUB, XLB, XINC
INTEGER ISYM(n)
:
:
CALL SVPC (Y, N, NS, ISYM, ILOG,
+ ISIZE, IRLIN, IBAR, YLB, YUB, XLB, XINC)

===

<2-8>
1SVPM: Print vertical plot of Y versus input order with individual plot
symbols specified by user; linear axis; default control values and
axis limits; missing values allowed

Y(n), YMISS
INTEGER ISYM(n)
:
:
CALL SVPM (Y, YMISS, N, NS, ISYM)

===

SVPML: Print vertical plot of Y versus input order with individual plot
symbols specified by user; log or linear horizontal (Y) axis; default
control values and axis limits; missing values allowed

Y(n), YMISS
INTEGER ISYM(n)
:
:
CALL SVPML (Y, YMISS, N, NS, ISYM, ILOG)

===

SVPMC: Print vertical plot of Y versus input order with individual plot
symbols specified by user; log or linear horizontal (Y) axis;
user-supplied control values and axis limits; missing values allowed

Y(n), YMISS, YLB, YUB, XLB, XINC
INTEGER ISYM(n)
:
:
CALL SVPMC (Y, YMISS, N, NS, ISYM, ILOG,
+ ISIZE, IRLIN, IBAR, YLB, YUB, XLB, XINC)

===

MVP: Print vertical plot of multiple Y vectors versus input order; linear
axis; default control values and horizontal (Y) axis limits; no
missing values allowed

YM(n,m)
:
:
CALL MVP (YM, N, M, IYM, NS)

===

<2-9>
1MVPL: Print vertical plot of multiple Y vectors versus input order; log or
linear horizontal (Y) axis; default control values and axis limits; no
missing values allowed

YM(n,m)
:
:
CALL MVPL (YM, N, M, IYM, NS, ILOG)

===

MVPC: Print vertical plot of multiple Y vectors versus input order; log or
linear horizontal (Y) axis; user-supplied control values and axis
limits; no missing values allowed

YM(n,m), YLB, YUB, XLB, XINC
:
:
CALL MVPC (YM, N, M, IYM, NS, ILOG,
+ ISIZE, YLB, YUB, XLB, XINC)

===

MVPM: Print vertical plot of multiple Y vectors versus input order; linear
axis; default control values and axis limits; missing values allowed

YM(n,m), YMMISS(m)
:
:
CALL MVPM (YM, YMMISS, N, M, IYM, NS)

===

MVPML: Print vertical plot of multiple Y vectors versus input order; log or
linear horizontal (Y) axis; default control values and axis limits;
missing values allowed

YM(n,m), YMMISS(m)
:
:
CALL MVPML (YM, YMMISS, N, M, IYM, NS, ILOG)

===

MVPMC: Print vertical plot of multiple Y vectors versus input order; log or
linear horizontal (Y) axis; user-supplied control values and axis
limits; missing values allowed

YM(n,m), YMMISS(m), YLB, YUB, XLB, XINC
:
:
CALL MVPMC (YM, YMMISS, N, M, IYM, NS, ILOG,
+ ISIZE, YLB, YUB, XLB, XINC)

===

<2-10>
1D. Dictionary of Subroutine Arguments and COMMON Variables

NOTE: --> indicates that the argument is input to the subroutine and that
the input value is preserved;
<-- indicates that the argument is returned by the subroutine;
<-> indicates that the argument is input to the subroutine and that
the input value is overwritten by the subroutine;
--- indicates that the argument is input to some subroutines and is
returned by others;
*** indicates that the argument is a subroutine name;
... indicates that the variable is passed via COMMON.

IBAR --> The indicator variable used to designate whether a vertical plot
is to be a bar plot or not. Bar plots connect the plotted points
to the reference line [see argument IRLIN] with a string of plot
symbols, as is done for example in the correlation plots. [See
chapter 12.] If IBAR >= 1, the plot is a bar plot. If IBAR <= 0,
it is not. The default value is IBAR = 0. When IBAR is not an
argument in the subroutine CALL statement the default value is
used.

IERR ... An error flag returned in COMMON /ERRCHK/. [See chapter 1,
section D.5.] Note that using (or not using) the error flag will
not affect the printed error messages that are automatically
provided.

IERR = 0 indicates that no errors were detected and that the plot
was completed satisfactorily.

IERR = 1 indicates that improper input was detected or that some
error prevented the plot from being completed.

ILOG --> The indicator variable used to designate whether the axes are to
be on a log or linear scale. ILOG is a two-digit integer, pq,
where the value of p is used to designate the scale of the x-axis
and the value of q is used to designate the scale of the y-axis.
If p = 0 (q = 0) the x-axis (y-axis) is on a linear scale; if
p <> 0 (q <> 0) the x-axis (y-axis) is on a log scale. For
vertical plots, the value of q is used to specify the scale on the
horizontal-axis and the value of p is ignored. The default value
is ILOG = 0, corresponding to linear scale for both the x-axis and
the y-axis. When ILOG is not an argument in the subroutine CALL
statement the default value is used.
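
As an illustration of the two-digit encoding, the sketch below requests a
log x-axis and a linear y-axis from PPL by setting ILOG = 10 (p = 1,
q = 0). The program name LOGDEM and the data values are illustrative
assumptions, and the single precision version of STARPAC is assumed.

      PROGRAM LOGDEM
C
C     SKETCH ONLY.  LOGDEM AND THE DATA VALUES ARE ILLUSTRATIVE.
C
      REAL Y(4), X(4)
      DATA X /1.0, 10.0, 100.0, 1000.0/
      DATA Y /2.0, 4.0, 6.0, 8.0/
      N = 4
C
C     ILOG = PQ WITH P = 1 (LOG X-AXIS) AND Q = 0 (LINEAR Y-AXIS)
C
      ILOG = 10
      CALL PPL (Y, X, N, ILOG)
      END

By the same rule, ILOG = 1 would give a linear x-axis with a log y-axis,
and ILOG = 11 would put both axes on a log scale.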

IRLIN --> The indicator variable used to designate whether zero or the
series mean is to be plotted as a reference line on the vertical
plots or whether no reference line should be used. If IRLIN <=
-1, no reference line is plotted. If IRLIN = 0, a reference line
is plotted showing the location of zero on the plot. If IRLIN >=
1, a reference line is plotted showing the series mean. The
default value is IRLIN = -1. When IRLIN is not an argument in the
subroutine CALL statement the default value is used.

ISIZE --> The indicator variable used to designate the size of a page plot.
ISIZE is a two-digit integer, pq, where the value of p is used to
designate the size of the x-axis and the value of q is used to

<2-11>
1 designate the size of the y-axis. If p = 0 (q = 0) the x-axis
(y-axis) is the maximum possible, 101 columns (51 rows), i.e., 101
(51) plot positions. If p <> 0 (q <> 0) the x-axis (y-axis) is
half the maximum, or 51 columns (26 rows). For vertical plots,
the value of q is used to specify the size of the horizontal-axis
and the value of p is ignored. The default value is ISIZE = 0,
corresponding to a plot of 51 rows by 101 columns. When ISIZE is
not an argument in the subroutine CALL statement the default value
is used.
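
A sketch of a call that sets the plot size explicitly is given below. It
uses PPC with ISIZE = 11 (p = 1, q = 1), which by the rule above requests
a half-size plot of 51 columns by 26 rows. The program name SIZDEM, the
data values and the axis limits are illustrative assumptions, and the
single precision version of STARPAC is assumed.

      PROGRAM SIZDEM
C
C     SKETCH ONLY.  SIZDEM, THE DATA AND THE LIMITS ARE ILLUSTRATIVE.
C
      REAL Y(4), X(4), YLB, YUB, XLB, XUB
      DATA X /1.0, 2.0, 3.0, 4.0/
      DATA Y /2.0, 4.0, 6.0, 8.0/
      N = 4
C
C     LINEAR AXES, HALF-SIZE PLOT, LIST UP TO 5 POINTS OUTSIDE LIMITS
C
      ILOG = 0
      ISIZE = 11
      NOUT = 5
      YLB = 0.0
      YUB = 10.0
      XLB = 0.0
      XUB = 5.0
      CALL PPC (Y, X, N, ILOG, ISIZE, NOUT, YLB, YUB, XLB, XUB)
      END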

ISYM --> The vector of dimension at least N that contains the values
designating the plotting symbol to be used for each point. The
plot symbols designated by each possible integer value are given
below.

ISYM()<= 1 -> +    ISYM() = 9 -> E    ISYM() =16 -> L    ISYM() =23 -> S
ISYM() = 2 -> .    ISYM() =10 -> F    ISYM() =17 -> M    ISYM() =24 -> T
ISYM() = 3 -> *    ISYM() =11 -> G    ISYM() =18 -> N    ISYM() =25 -> U
ISYM() = 4 -> -    ISYM() =12 -> H    ISYM() =19 -> O    ISYM() =26 -> V
ISYM() = 5 -> A    ISYM() =13 -> I    ISYM() =20 -> P    ISYM() =27 -> W
ISYM() = 6 -> B    ISYM() =14 -> J    ISYM() =21 -> Q    ISYM() =28 -> Y
ISYM() = 7 -> C    ISYM() =15 -> K    ISYM() =22 -> R    ISYM()>=29 -> Z
ISYM() = 8 -> D
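
The sketch below shows one way ISYM might be filled for a call to SPP,
giving each of four points its own symbol. The program name SYMDEM and
the data values are illustrative assumptions, and the single precision
version of STARPAC is assumed.

      PROGRAM SYMDEM
C
C     SKETCH ONLY.  SYMDEM AND THE DATA VALUES ARE ILLUSTRATIVE.
C
      REAL Y(4), X(4)
      INTEGER ISYM(4)
      DATA X /1.0, 2.0, 3.0, 4.0/
      DATA Y /2.0, 4.0, 6.0, 8.0/
C
C     PLOT SYMBOLS FOR THE FOUR POINTS ARE +, *, A AND B
C
      DATA ISYM /1, 3, 5, 6/
      N = 4
      CALL SPP (Y, X, N, ISYM)
      END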

IYM --> The exact value of the first dimension of the matrix YM as
specified in the calling program.

M --> The number of columns of data in YM.

N --> The number of observations.

NOUT --> The number of points falling outside the plot limits that are to
be listed following the plot. If NOUT >= 1, a message giving the
total number of points falling outside the plot limits and a list
of the coordinates of these points (up to a maximum of NOUT or 50,
whichever is smaller) is printed. If NOUT = 0, only a message
listing the number of points falling outside the limits is
printed. If NOUT < 0, no points are listed and no message is
given. The default value is NOUT = 0. When NOUT is not an
argument in the subroutine CALL statement the default value is
used.

NS --> The sampling frequency of the points plotted on a vertical plot.
If NS = 1, every point is plotted; if NS = 2, every second point
is plotted; if NS = 3, every third point is plotted, etc. The
default value is NS = 1. When NS <= 0 or NS is not an argument
in the subroutine CALL statement the default value is used.

X --> The vector of dimension at least N that contains the x-axis
values.

XINC --> The increment to be used for labeling the x-axis (i.e., the
vertical-axis) on vertical plots. The x-axis labels are XLB,
XLB + NS*XINC, XLB + 2*NS*XINC, etc. The default value is
XINC = 1.0. When XINC is not an argument in the subroutine CALL
statement the default value is used.
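
As a worked illustration of the labeling rule, the sketch below plots
every second point of a six-point series with XLB = 0.0 and XINC = 0.5,
so that by the rule above the three plotted lines are labeled 0.0, 1.0
and 2.0. The program name INCDEM and the data values are illustrative
assumptions, and the single precision version of STARPAC is assumed.

      PROGRAM INCDEM
C
C     SKETCH ONLY.  INCDEM AND THE DATA VALUES ARE ILLUSTRATIVE.
C
      REAL Y(6), YLB, YUB, XLB, XINC
      DATA Y /3.0, 1.0, 4.0, 1.0, 5.0, 9.0/
      N = 6
C
C     PLOT EVERY SECOND POINT ON A FULL-SIZE LINEAR PLOT WITH
C     NO REFERENCE LINE AND NO BARS
C
      NS = 2
      ILOG = 0
      ISIZE = 0
      IRLIN = -1
      IBAR = 0
C
C     YLB .GE. YUB REQUESTS THE DEFAULT Y-AXIS LIMITS
C
      YLB = 0.0
      YUB = 0.0
C
C     LABELS ARE XLB, XLB + NS*XINC, XLB + 2*NS*XINC
C
      XLB = 0.0
      XINC = 0.5
      CALL VPC (Y, N, NS, ILOG,
     +          ISIZE, IRLIN, IBAR, YLB, YUB, XLB, XINC)
      END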

<2-12>
1XLB --> The lower bound for the x-axis.

For page plots:

The default value is the smallest x-axis value within the range
of the y-axis values to be plotted. If XLB >= XUB, the default
value is used.

For vertical plots:

The default value is 1.0.

For both page and vertical plots, when XLB is not an argument in
the subroutine CALL statements the default value is used. (The
plot limits may be adjusted slightly from the user-supplied values
when the plotting subroutine uses a log scale.)

XMISS --> The missing value code used within the vector X to indicate that a
value is missing. The user must indicate missing observations by
putting the missing value code in place of each missing
observation. Missing data are not indicated on page plots in any
way.

XUB --> The upper bound for the x-axis. The default value is the largest
x-axis value within the range of the y-axis values to be plotted.
If XLB >= XUB or XUB is not an argument in the subroutine CALL
statement the default value is used. (The plot limits may be
adjusted slightly from the user-supplied value when the plotting
subroutines use a log scale.)

Y --> The vector of dimension at least N that contains the y-axis
values.

YLB --> The lower bound for the y-axis. The default value is the smallest
y-axis value within the range of the x-axis values to be plotted.
If YLB >= YUB or YLB is not an argument in the subroutine CALL
statement the default value is used. (The plot limits may be
adjusted slightly from the user-supplied value when the plotting
subroutines use a log scale.)

YM --> The matrix of dimension at least N by M whose columns each contain
one of the M sets of N observations to be plotted against a common
X vector.

YMISS --> The missing value code used within the vector Y to indicate a
value is missing. The user must indicate missing observations by
putting the missing value code in place of each missing
observation. Missing data are indicated on vertical plots by the
word "MISSING" next to the right axis of the appropriate line.
Missing data are not indicated on page plots in any way.
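
A sketch of the missing value convention is given below, using VPM with
the third observation coded as missing. The program name MISDEM, the
code -999.0 and the data values are illustrative assumptions, and the
single precision version of STARPAC is assumed.

      PROGRAM MISDEM
C
C     SKETCH ONLY.  MISDEM, THE CODE -999.0 AND THE DATA ARE
C     ILLUSTRATIVE.
C
      REAL Y(5), YMISS
      DATA Y /2.0, 4.0, -999.0, 5.0, 6.0/
      N = 5
      NS = 1
C
C     THE THIRD OBSERVATION CARRIES THE MISSING VALUE CODE
C
      YMISS = -999.0
      CALL VPM (Y, YMISS, N, NS)
      END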

YMMISS --> The vector of dimension at least M that contains the codes used
within each of the M columns of YM to indicate a value is missing,
where the first element of YMMISS is the missing value code for
the first column of YM, etc. The user must indicate missing
observations by putting the appropriate missing value code in
place of each missing observation. Missing data are indicated on

<2-13>
1 vertical plots by the word "MISSING" next to the right axis of the
appropriate line. Missing data are not indicated on page plots in
any way.

YUB --> The upper bound for the y-axis. The default value is the largest
y-axis value within the range of the x-axis values to be plotted.
If YLB >= YUB or YUB is not an argument in the subroutine CALL
statement the default value is used. (The plot limits may be
adjusted slightly from the user-supplied value when the plotting
subroutines use a log scale.)

E. Computational Details

Plotting Symbols. The plotting symbol used depends on the type of plot
and whether or not more than one point falls on a given plot position. If two
to nine points fall on a single plot position, the integer corresponding to
the number of points is used as the plotting symbol. When 10 or more values
fall on a single position the plotting symbol X is used. This is the only way
that integers or X are used as plot symbols.

Subroutines without an S or M prefix use the plotting symbol + to
indicate one point on a single printer position.

For subroutines with an S prefix the user-supplied vector ISYM of integer
values is used to specify the plotting symbol for each data point. The
Fourier spectrum plot shown in chapter 12 is an example of this option.

Subroutines with an M prefix use a different letter as the plot symbol
for each column of the matrix of the dependent variables (y-axis): A for the
first, B for the second, ..., Z for columns 25 and beyond, with X still only
used to indicate that 10 or more points fell on a single plot position.

Continuity of Vertical Plots. Normally, a line printer will
automatically provide margins at the top and bottom of each page, causing a
break in the continuity of a vertical plot or any other output continuing over
two or more pages. However, these automatic page-ejects can be suppressed by
the user on many systems. Appendix A gives the control sequence necessary to
suppress automatic page-ejects on a Cyber computer. Users of other systems
should consult their Computer Center staff for any equivalent method
available.

F. Examples

A sample program, its data and the resulting output for the STARPAC
plotting routine MPP are listed at the end of this section. The program uses
MPP to display the 12 years of monthly airline data listed on page 531 of Box
and Jenkins [1976] versus month. The year is indicated by the plotting symbol
(A = 1949, B = 1950, etc.).

Other examples of STARPAC plots can be found in the output of many of the
subroutines discussed elsewhere. The output from the complex demodulation
subroutines includes a sample of the simple vertical plot style (VP) and of
the vertical plot of multivariate data style (MVPC) (chapter 11); the output
from the autocorrelation and cross correlation subroutines includes vertical
plots using the bar plot option style (VPC) (chapter 12); and the output from

<2-14>
1the Fourier spectrum subroutines (chapter 12) is produced using the symbol
plot style (SPPC).

Program:

      PROGRAM EXAMPL
C
C     DEMONSTRATE MPP USING SINGLE PRECISION VERSION OF STARPAC
C
C     N.B. DECLARATION OF X AND YM MUST BE CHANGED TO DOUBLE PRECISION
C     IF DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL X(20), YM(20,20)
C
C     SET UP INPUT AND OUTPUT FILES
C     [CHAPTER 1, SECTION D.4, DESCRIBES HOW TO CHANGE OUTPUT UNIT.]
C
      CALL IPRINT(IPRT)
      OPEN (UNIT=IPRT, FILE='FILENM')
      OPEN (UNIT=5, FILE='DATA')
C
C     SPECIFY NECESSARY DIMENSIONS
C
      IYM = 20
C
C     READ NUMBER OF OBSERVATIONS AND NUMBER OF COLUMNS OF DATA
C          X-AXIS VALUES
C          Y-AXIS VALUES
C
      READ (5,100) N, M
      READ (5,101) (X(I), I=1,N)
      READ (5,101) ((YM(I,J), I=1,N), J=1,M)
C
C     PRINT TITLE AND CALL MPP FOR PLOT
C
      WRITE (IPRT,102)
      CALL MPP (YM, X, N, M, IYM)
C
C     FORMAT STATEMENTS
C
  100 FORMAT (2I5)
  101 FORMAT (12F6.1)
  102 FORMAT ('1RESULTS OF STARPAC PLOT SUBROUTINE MPP')
      END

<2-15>
1Data:

   12   12
   1.0   2.0   3.0   4.0   5.0   6.0   7.0   8.0   9.0  10.0  11.0  12.0
 112.0 118.0 132.0 129.0 121.0 135.0 148.0 148.0 136.0 119.0 104.0 118.0
 115.0 126.0 141.0 135.0 125.0 149.0 170.0 170.0 158.0 133.0 114.0 140.0
 145.0 150.0 178.0 163.0 172.0 178.0 199.0 199.0 184.0 162.0 146.0 166.0
 171.0 180.0 193.0 181.0 183.0 218.0 230.0 242.0 209.0 191.0 172.0 194.0
 196.0 196.0 236.0 235.0 229.0 243.0 264.0 272.0 237.0 211.0 180.0 201.0
 204.0 188.0 235.0 227.0 234.0 264.0 302.0 293.0 259.0 229.0 203.0 229.0
 242.0 233.0 267.0 269.0 270.0 315.0 364.0 347.0 312.0 274.0 237.0 278.0
 284.0 277.0 317.0 313.0 318.0 374.0 413.0 405.0 355.0 306.0 271.0 306.0
 315.0 301.0 356.0 348.0 355.0 422.0 465.0 467.0 404.0 347.0 305.0 336.0
 340.0 318.0 362.0 348.0 363.0 435.0 491.0 505.0 404.0 359.0 310.0 337.0
 360.0 342.0 406.0 396.0 420.0 472.0 548.0 559.0 463.0 407.0 362.0 405.0
 417.0 391.0 419.0 461.0 472.0 535.0 622.0 606.0 508.0 461.0 390.0 432.0

<2-16>
1RESULTS OF STARPAC PLOT SUBROUTINE MPP
STARPAC 2.08S (03/15/90)
[Page plot of the monthly airline data versus month. The y-axis runs from
104.0000 to 622.0000 in steps of 51.8000 and the x-axis runs from 1.0000 to
12.0000 in steps of 1.1000. Plot symbols A through L identify the years 1949
through 1960, and the digit 2 marks plot positions shared by two points.]

<2-17>
1----- CHAPTER 3 -----

NORMAL RANDOM NUMBER GENERATION

A. Introduction

STARPAC contains two subroutines for generating pseudo-random numbers
(noise) which obey a normal probability law with mean mu and standard
deviation sigma. Such random numbers are often useful for evaluating data
analysis procedures or computer programs.

Users are directed to section B for a brief description of the
subroutines. The declaration and CALL statements are given in section C and
the subroutine arguments are defined in section D. The algorithm used by
these subroutines is discussed in section E. A sample program showing the use
of these subroutines is given in section F.

B. Subroutine Descriptions

STARPAC subroutine NRAND generates a vector of standard (zero mean and
unit standard deviation) normal (Gaussian) random numbers. There is no
printed output from this subroutine.

STARPAC subroutine NRANDC generates Gaussian noise with mean mu and
standard deviation sigma using the transformation

z = sigma*y + mu

where

y is a standard normal pseudo-random number;

mu is the desired mean (see section D, argument YMEAN); and

sigma is the desired standard deviation (see section D, argument SIGMA).

There is no printed output from NRANDC.

C. Subroutine Declaration and CALL Statements

NOTE: Argument definitions and a sample program are given in section D and
section F, respectively. The conventions used to present the following
declaration and CALL statements are given in chapter 1, sections B and D.

NRAND: Generate a vector of normal pseudo-random numbers with zero mean and
unit standard deviation

Y(n)
:
:
CALL NRAND (Y, N, ISEED)

===

<3-1>
1NRANDC: Generate a vector of normal pseudo-random numbers with mean YMEAN and
standard deviation SIGMA

Y(n)
:
:
CALL NRANDC (Y, N, ISEED, YMEAN, SIGMA)

===

D. Dictionary of Subroutine Arguments

IERR ... An error flag returned in COMMON /ERRCHK/. [See chapter 1,
section D.5.]

IERR = 0 indicates that no errors were detected.

IERR = 1 indicates that improper input was detected.

ISEED --> The basis for the pseudo-random number generation. ISEED must lie
between 0 and 2**31 - 1, inclusive. If ISEED is not equal to 0,
ISEED must be odd.

For ISEED > 0, use of the same value of ISEED will always
generate the same data set.

For ISEED = 0, a different data set will be produced by each call
to NRAND or NRANDC in the user's program, although
the numbers generated will not differ from run to
run.

N --> The number of random numbers to be generated.

SIGMA --> The standard deviation of the generated random numbers.

Y <-- The vector of dimension at least N that contains the generated
normal pseudo-random numbers.

YMEAN --> The mean of the generated random numbers.

E. Computational Methods

The normal pseudo-random number generation procedure is that of Marsaglia

<3-2>
1and Tsang [1984]. The same pseudo-random numbers (to within final round-off
error) will be generated on all computers with at least 32 binary digits
available for representing integers. The code was written by Boisvert and

<3-3>
1Kahaner of the NIST Applied and Computational Mathematics Division.

F. Example

The sample program shown below illustrates the use of both NRAND and
NRANDC. NRAND is used to generate a pseudo-random sample of size 50 from a
normal population with zero mean and unit standard deviation. NRANDC is then
used to generate a sequence of normal pseudo-random numbers with a mean of 4
and standard deviation 0.5. The same seed is used for both NRAND and NRANDC.
Therefore, the values generated by NRANDC are YMEAN plus SIGMA times the
values generated by NRAND, i.e.,

YM(I,2) = YMEAN + SIGMA*YM(I,1) for I = 1, ..., N.

The generated random numbers are displayed using STARPAC plot subroutine MVP.
There is no printed output from NRAND and NRANDC.

<3-4>
1Program:

      PROGRAM EXAMPL
C
C     DEMONSTRATE NRAND AND NRANDC AND DISPLAY RESULTS WITH MVP USING
C     SINGLE PRECISION VERSION OF STARPAC
C
C     N.B. DECLARATION OF YM MUST BE CHANGED TO DOUBLE PRECISION IF
C     DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL YM(100,2)
C
C     SET UP OUTPUT FILE
C     [CHAPTER 1, SECTION D.4, DESCRIBES HOW TO CHANGE OUTPUT UNIT.]
C
      CALL IPRINT(IPRT)
      OPEN (UNIT=IPRT, FILE='FILENM')
C
C     SPECIFY NECESSARY DIMENSIONS
C
      IYM = 100
C
C     SET THE SEED
C         THE NUMBER OF VALUES TO BE GENERATED
C         THE NUMBER OF SETS OF DATA TO BE GENERATED
C
      ISEED = 531
      N = 50
      M = 2
C
C     GENERATE STANDARD NORMAL PSEUDO-RANDOM NUMBERS INTO COLUMN 1 OF YM
C     AND NORMAL PSEUDO-RANDOM NUMBERS WITH MEAN 4.0 AND
C     STANDARD DEVIATION 0.5 INTO COLUMN 2 OF YM
C
      CALL NRAND (YM(1,1), N, ISEED)
C
      YMEAN = 4.0
      SIGMA = 0.5
      CALL NRANDC (YM(1,2), N, ISEED, YMEAN, SIGMA)
C
C     PRINT TITLE AND CALL MVP TO PLOT RESULTS,
C     SAMPLING EVERY OBSERVATION
C
      WRITE (IPRT,100)
      CALL MVP (YM, N, M, IYM, 1)
C
C     FORMAT STATEMENTS
C
  100 FORMAT ('1RESULTS OF STARPAC NORMAL PSEUDO-RANDOM NUMBER',
     1  ' GENERATION SUBROUTINES',
     2  ' DISPLAYED WITH STARPAC PLOT SUBROUTINE MVP')
      END

Data:

NO DATA NEEDED FOR THIS EXAMPLE

<3-5>
1RESULTS OF STARPAC NORMAL PSEUDO-RANDOM NUMBER GENERATION SUBROUTINES DISPLAYED WITH STARPAC PLOT SUBROUTINE MVP
STARPAC 2.08S (03/15/90)
[Vertical plot of the two generated series against observation number,
1.0000 through 50.000, one observation per line. The horizontal (Y) axis runs
from -2.4249 to 5.4406. Plot symbol A marks column 1 of YM (from NRAND) and
plot symbol B marks column 2 of YM (from NRANDC).]

<3-6>
G. Acknowledgments

The code used to generate the pseudo-random numbers was written by
Boisvert and Kahaner of the NIST Applied and Computational Mathematics
Division.

<3-7>
----- CHAPTER 4 -----

HISTOGRAMS

A. Introduction

STARPAC contains two subroutines for producing histograms. Both
subroutines produce a one-page printout which includes, in addition to the
histogram, a number of summary statistics (mean, median, standard deviation,
cell fractions, etc.) and several tests for normality.

Users are directed to section B for a brief description of the
subroutines. The declaration and CALL statements are given in section C and
the subroutine arguments are defined in section D. The algorithms used and
output produced by these subroutines are discussed in section E. Sample
programs and their output are shown in section F.

B. Subroutine Descriptions

HIST provides the analysis described in section A using a preset
procedure for choosing the number of cells. The lower and upper bounds of the
histogram are chosen from the range of the observations.

HISTC provides the same analysis as HIST but allows the user to specify
the number of cells and the upper and lower cell boundaries. Statistics are
based only on the data within the user-supplied bounds.

C. Subroutine Declaration and CALL Statements

NOTE: Argument definitions and sample programs are given in sections D and F,
respectively. The conventions used to present the following declaration and
CALL statements are given in chapter 1, sections B and D.

HIST: Compute and print a histogram and summary statistics, with automatic
selection of number of cells and cell boundaries

<real> Y(n)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL HIST (Y, N, LDSTAK)

===

<4-1>
HISTC: Compute and print a histogram and summary statistics with user
control of number of cells and cell boundaries

<real> Y(n)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL HISTC (Y, N, NCELL, YLB, YUB, LDSTAK)

===

D. Dictionary of Subroutine Arguments

DSTAK ... The DOUBLE PRECISION vector in COMMON /CSTAK/ of dimension at
least LDSTAK. DSTAK provides workspace for the computations. The
first LDSTAK locations of DSTAK will be overwritten during
subroutine execution.

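IERR ... An error flag returned in COMMON /ERRCHK/. [See chapter 1,
section D.5.] Note that using (or not using) the error flag will
not affect the printed error messages that are automatically
provided.

IERR = 0 indicates that no errors were detected.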

IERR = 1 indicates that improper input was detected.

LDSTAK --> The length of the DOUBLE PRECISION workspace vector DSTAK. LDSTAK
must equal or exceed the appropriate value given below, where if
the single precision version of STARPAC is being used P = 0.5,
otherwise P = 1.0. [See chapter 1, section B.]

For HIST: LDSTAK >= (17+N)/2 + 26*P

For HISTC: LDSTAK >= (17+N)/2 + max(NCELL, 26)*P
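
As a worked check of these bounds, take the 39 observations used in the
example of section F with the single precision version (P = 0.5). HIST
then requires

        LDSTAK >= (17+39)/2 + 26*0.5 = 28 + 13 = 41

so the value LDSTAK = 200 used in the example program is more than
sufficient. For HISTC with the same N and any NCELL <= 26 the bound is
unchanged, since max(NCELL, 26) = 26.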

N --> The number of observations.

NCELL --> The number of cells in the histogram. If NCELL <= 0 or NCELL is
not an argument in the subroutine CALL statement the subroutine
will choose the number of cells.

Y --> The vector of dimension at least N that contains the observed
data.

<4-2>
YLB --> The lower bound for constructing the histogram. The interval
[YLB, YUB] is divided into NCELL increments. If YLB >= YUB, the
lower and upper bounds for constructing the histogram will be the
minimum and maximum observations.

YUB --> The upper bound for constructing the histogram. The interval
[YLB, YUB] is divided into NCELL increments. If YLB >= YUB, the
lower and upper bounds for constructing the histogram will be the
minimum and maximum observations.

E. Computational Methods

E.1 Algorithms

The code and output for the histogram subroutines are modeled after early
versions of MINITAB [Ryan et al., 1974].

E.2 Computed Results and Printed Output

The output from the histogram family of subroutines includes a summary of
the input data in addition to the actual histogram. This summary includes the
following information, where the actual output headings are given by the
uppercase phrases enclosed in angle brackets (<...>). Results which correspond
to subroutine CALL statement arguments are identified by the argument name in
uppercase. In the formulas, x(1), x(2), ..., x(k) denote the ordered
observations of Y such that YLB <= Y(i) <= YUB, i = 1, ..., N, i.e., x(1) is
the smallest observation of Y such that YLB <= Y(i), x(k) is the largest
observation of Y such that Y(i) <= YUB, etc. The value of expressions
enclosed in square brackets, e.g., [(k/2) + 1], is the largest integer less
than or equal to the value of the expression.

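* <NUMBER OF OBSERVATIONS>, N

* <HISTOGRAM LOWER BOUND>, YLB

* <HISTOGRAM UPPER BOUND>, YUB

* <NUMBER OF CELLS>, NCELL

* <OBSERVATIONS USED>, k, where

        k = the number of observations for which
            YLB <= Y(i) <= YUB, i = 1, ..., N

* <MIN. OBSERVATION USED>, x(1), where

        x(1) = the smallest observation such that YLB <= Y(i), i = 1, ..., N

* <MAX. OBSERVATION USED>, x(k), where

        x(k) = the largest observation such that Y(i) <= YUB, i = 1, ..., N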

<4-3>
* <MEAN VALUE>, xmean, where

                      k
        xmean = 1/k  SUM  x(i)
                     i=1

* <MEDIAN VALUE>, xmedian, where

        xmedian = x([(k+1)/2])                      if k is odd

        xmedian = 0.5*( x([k/2]) + x([(k/2)+1]) )   if k is even

* <25 PCT TRIMMED MEAN>, xtrim, where

                                  k-[k/4]
        xtrim = 1/(k-(2*[k/4]))     SUM     x(i)
                                 i=1+[k/4]

* <STANDARD DEVIATION>, s, where

                            k
        s = sqrt( 1/(k-1)  SUM  (x(i)-xmean)**2 )
                           i=1

* <MEAN DEV./STD. DEV.>, r, where

                      k
        r = 1/(s*k)  SUM  |x(i)-xmean|
                     i=1

* <SQRT(BETA ONE)>, sqrt(beta1), where

                                         k
        beta1 = k/((k-1)**3 * s**6) * ( SUM (x(i)-xmean)**3 )**2
                                        i=1

* <BETA TWO>, beta2, where

                                      k
        beta2 = k/((k-1)**2 * s**4)  SUM  (x(i)-xmean)**4
                                     i=1

Information provided for each cell, u = 1, ..., NCELL, of the histogram
includes the following.

* <INTERVAL MID POINT>, cmid, where

        cmid = YLB + (2*u-1)*(YUB-YLB)/(2*NCELL)

* <CUM. FRACT.>, the cumulative fraction of the observations which are in
cells 1 through u

* <1-CUM. FRACT.>, the cumulative fraction of the observations which are
in cells u through NCELL

* <CELL FRACT.>, the fraction of the observations which are in cell u

* <NO. OBS.>, the actual number of observations which are in cell u.

<4-4>
The histogram itself displays the actual number of observations in each
cell when the largest number of observations per cell does not exceed 50.
When the largest number of observations per cell does exceed 50 then the
histogram displays the cell fraction.
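
The summary statistics defined in this section can be cross-checked outside
STARPAC. The listing below is a minimal free-form Fortran sketch, not
STARPAC code: the program name and all variable names are illustrative
assumptions, and the data are the 39 observations listed in section F. It
sorts the data and then evaluates the formulas for xmean, xmedian, xtrim, s,
r, beta1 and beta2 exactly as written above.

      program hist_summary_sketch
         ! Illustrative only: evaluates the section E.2 formulas directly.
         ! It does not call STARPAC; all names here are hypothetical.
         implicit none
         integer, parameter :: k = 39
         real    :: x(k), d(k), t
         real    :: xmean, xmedian, xtrim, s, r, beta1, beta2
         integer :: i, j, q

         ! The 39 observations from the Data listing in section F.
         x = [ 0.4, 0.6, 1.0, 1.0, 1.0, 0.5, 0.6, 0.7, 1.0, 0.6, 0.2, 1.9, 0.2, &
               0.4, 0.0,-0.4,-0.3, 0.0,-0.4,-0.3, 0.1,-0.1, 0.2,-0.5, 0.3,-0.1, &
               0.2,-0.2, 0.8, 0.5, 0.6, 0.8, 0.7, 0.7, 0.2, 0.5, 0.7, 0.8, 1.1 ]

         ! Insertion sort so that x(1) <= x(2) <= ... <= x(k).
         do i = 2, k
            t = x(i)
            j = i - 1
            do while (j >= 1)
               if (x(j) <= t) exit
               x(j+1) = x(j)
               j = j - 1
            end do
            x(j+1) = t
         end do

         xmean = sum(x)/real(k)

         if (mod(k,2) == 1) then
            xmedian = x((k+1)/2)
         else
            xmedian = 0.5*(x(k/2) + x(k/2+1))
         end if

         q     = k/4                      ! integer division gives [k/4]
         xtrim = sum(x(1+q:k-q))/real(k-2*q)

         d     = x - xmean                ! deviations from the mean
         s     = sqrt(sum(d**2)/real(k-1))
         r     = sum(abs(d))/(s*real(k))
         beta1 = real(k)*sum(d**3)**2/(real(k-1)**3*s**6)
         beta2 = real(k)*sum(d**4)/(real(k-1)**2*s**4)

         print '(a,es16.8)', 'MEAN VALUE          =', xmean
         print '(a,es16.8)', 'MEDIAN VALUE        =', xmedian
         print '(a,es16.8)', '25 PCT TRIMMED MEAN =', xtrim
         print '(a,es16.8)', 'STANDARD DEVIATION  =', s
         print '(a,es16.8)', 'MEAN DEV./STD. DEV. =', r
         print '(a,es16.8)', 'SQRT(BETA ONE)      =', sqrt(beta1)
         print '(a,es16.8)', 'BETA TWO            =', beta2
      end program hist_summary_sketch

For these data the observations sum to 16.0, so xmean = 16/39, and the
median is x(20) = 0.5. The values HIST itself prints are shown in the
sample output of section F; results from the sketch may differ from that
listing in the trailing digits because of single precision rounding.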

F. Example

The example program below uses HIST to analyze the 39 measurements of the
velocity of light shown on page 81 of Mandel [1964].

<4-5>
Program:

      PROGRAM EXAMPL
C
C     DEMONSTRATE HIST USING SINGLE PRECISION VERSION OF STARPAC
C
C     N.B. DECLARATION OF Y MUST BE CHANGED TO DOUBLE PRECISION IF
C     DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL Y(200)
      DOUBLE PRECISION DSTAK(200)
C
      COMMON /CSTAK/ DSTAK
C
C     SET UP INPUT AND OUTPUT FILES
C     [CHAPTER 1, SECTION D.4, DESCRIBES HOW TO CHANGE OUTPUT UNIT.]
C
      CALL IPRINT(IPRT)
      OPEN (UNIT=IPRT, FILE='FILENM')
      OPEN (UNIT=5, FILE='DATA')
C
C     SPECIFY NECESSARY DIMENSIONS
C
      LDSTAK = 200
C
C     READ NUMBER OF OBSERVATIONS
C          OBSERVED DATA
C
      READ (5,100) N
      READ (5,101) (Y(I), I=1,N)
C
C     PRINT TITLE AND CALL HIST TO ANALYZE RESULTS
C
      WRITE (IPRT,102)
      CALL HIST (Y, N, LDSTAK)
C
C     FORMAT STATEMENTS
C
  100 FORMAT (I5)
  101 FORMAT (13F5.1)
  102 FORMAT ('1RESULTS OF STARPAC HISTOGRAM SUBROUTINE HIST')
      END

Data:

   39
  0.4  0.6  1.0  1.0  1.0  0.5  0.6  0.7  1.0  0.6  0.2  1.9  0.2
  0.4  0.0 -0.4 -0.3  0.0 -0.4 -0.3  0.1 -0.1  0.2 -0.5  0.3 -0.1
  0.2 -0.2  0.8  0.5  0.6  0.8  0.7  0.7  0.2  0.5  0.7  0.8  1.1

<4-6>
1RESULTS OF STARPAC HISTOGRAM SUBROUTINE HIST
STARPAC 2.08S (03/15/90)
HISTOGRAM

NUMBER OF OBSERVATIONS = 39
MINIMUM OBSERVATION = -5.00000000E-01
MAXIMUM OBSERVATION = 1.90000000E+00

HISTOGRAM LOWER BOUND = -5.00000000E-01
HISTOGRAM UPPER BOUND = 1.90000000E+00

NUMBER OF CELLS = 9
OBSERVATIONS USED     =  39                  25 PCT TRIMMED MEAN =  4.23809524E-01
MIN. OBSERVATION USED = -5.00000000E-01      STANDARD DEVIATION  =  5.06689395E-01
MAX. OBSERVATION USED =  1.90000000E+00      MEAN DEV./STD. DEV. =  7.99040249E-01
MEAN VALUE            =  4.10256410E-01      SQRT(BETA ONE)      =  3.10646614E-01
MEDIAN VALUE          =  5.00000000E-01      BETA TWO            =  3.33793260E+00

FOR A NORMAL DISTRIBUTION, THE VALUES (MEAN DEVIATION/STANDARD DEVIATION), SQRT(BETA ONE), AND BETA TWO ARE APPROXIMATELY
0.8, 0.0 AND 3.0, RESPECTIVELY. TO TEST THE NULL HYPOTHESIS OF NORMALITY, SEE TABLES OF CRITICAL VALUES PP. 207-208,
BIOMETRIKA TABLES FOR STATISTICIANS, VOL. 1. SEE PP. 67-68 FOR A DISCUSSION OF THESE TESTS.

   INTERVAL       CUM.    1-CUM.    CELL    NO.           NUMBER OF OBSERVATIONS
   MID POINT     FRACT.   FRACT.   FRACT.   OBS.
+                                                  0        10        20        30        40        50
 ------------------------------------------        +---------+---------+---------+---------+---------+
 -3.666667E-01    .128    1.000     .128      5    +++++
 -1.000000E-01    .256     .872     .128      5    +++++
  1.666667E-01    .410     .744     .154      6    ++++++
  4.333333E-01    .564     .590     .154      6    ++++++
  7.000000E-01    .846     .436     .282     11    +++++++++++
  9.666667E-01    .949     .154     .103      4    ++++
  1.233333E+00    .974     .051     .026      1    +
  1.500000E+00    .974     .026     .000      0
  1.766667E+00   1.000     .026     .026      1    +

<4-7>
G. Acknowledgments

The code and output for the histogram subroutines are modeled on those in
early versions of MINITAB [Ryan et al., 1974].

<4-8>
----- CHAPTER 5 -----

STATISTICAL ANALYSIS OF A UNIVARIATE SAMPLE

A. Introduction

STARPAC contains four subroutines for performing a comprehensive statistical
analysis of a univariate sample. They each compute 53 different statistics
which summarize the sample through measures of location (mean, median, etc.),
measures of dispersion (standard deviation, mean deviation, etc.) and
diagnostic features such as tests for outliers, non-normality, trends and
non-randomness (assuming the input order of the data is a meaningful time
sequence). Common statistics such as Student's t and confidence intervals for
the mean and standard deviation are also included. NBS Technical Note 756, A
User's Guide to the OMNITAB Command "STATISTICAL ANALYSIS," by H. H. Ku [1973]
provides a complete discussion of the output of these subroutines, which is
the same output as that provided by the OMNITAB II Command STATISTICAL [Hogben
et al., 1971].

Users are directed to section B for a brief description of the
subroutines. The declaration and CALL statements are given in section C and
the subroutine arguments are defined in section D. The algorithms used and
output produced by these subroutines are discussed in section E. Sample
programs and their output are shown in section F.

B. Subroutine Descriptions

STAT computes and prints the 53 descriptive statistics described in
section A.

STATS provides the same analysis as STAT but allows the user to suppress
the printed output and store the computed statistics for further use.

STATW and STATWS perform a weighted analysis and are otherwise identical
to STAT and STATS, respectively.

C. Subroutine Declaration and CALL Statements

STAT: Compute and print 53 statistics describing the input data

<real> Y(n)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL STAT (Y, N, LDSTAK)

===

<5-1>
STATS: Compute and optionally print 53 statistics describing input data;
return statistics

<real> Y(n), STS(53)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL STATS (Y, N, LDSTAK, STS, NPRT)

===

STATW: Compute and print 53 statistics describing weighted input data

<real> Y(n), WT(n)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL STATW (Y, WT, N, LDSTAK)

===

STATWS: Compute and optionally print 53 statistics describing weighted input
data; return statistics

<real> Y(n), WT(n), STS(53)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL STATWS (Y, WT, N, LDSTAK, STS, NPRT)

===

D. Dictionary of Subroutine Arguments and COMMON Variables

<5-2>
IERR ... An error flag returned in COMMON /ERRCHK/. [See chapter 1,
section D.5.] Note that using (or not using) the error flag will
not affect the printed error messages that are automatically
provided.

IERR = 0 indicates that no errors were detected.

IERR = 1 indicates that improper input was detected.

LDSTAK --> The length of the DOUBLE PRECISION workspace vector DSTAK. LDSTAK
must equal or exceed (N/2) + 7.

N --> The number of observations.

NPRT --> The argument controlling printed output.

If NPRT = 0 the printed output is suppressed.

If NPRT <> 0 the printed output is provided.

STS <-- The vector of dimension at least 53 that contains the computed
statistics. The contents of STS are listed below, along with
applicable references; the number in parentheses is the location
within STS that the given statistic is stored. In the formulas, x
denotes the ordered observations of Y for which the weight is
nonzero, i.e., x(1) is the smallest observation of Y with a
nonzero weight, x(k) is the largest observation of Y with a
nonzero weight, etc. The weight associated with x(i) is denoted
w(i). Zero weighted observations are not included in the
analysis. The value of expressions enclosed in square brackets,
e.g., [(k/2) + 1], is the largest integer less than or equal to
the value of the expression.

(1) NUMBER OF OBSERVATIONS, N

(2) NUMBER OF NONZERO WEIGHTS, k

(3) UNWEIGHTED MEAN [Dixon and Massey, 1957, p. 14],

                      k
        xmean = 1/k  SUM  x(i)
                     i=1

(4) WEIGHTED MEAN [Brownlee, 1965, pp. 95-97],

                     k
                    SUM  w(i)*x(i)
                    i=1
        xwtdmean = ----------------
                       k
                      SUM  w(i)
                      i=1

(5) MEDIAN [Dixon and Massey, 1957, p. 70],

        xmedian = x([(k+1)/2])                      if k is odd

        xmedian = 0.5*( x([k/2]) + x([(k/2)+1]) )   if k is even

<5-3>
(6) MID-RANGE [Dixon and Massey, 1957, p. 71],

        xmid = 0.5*(x(1) + x(k))

(7) 25 PCT UNWTD TRIMMED MEAN [Crow and Siddiqui, 1967],

                                k-[k/4]
        xtrim = 1/(k-2*[k/4])     SUM     x(i)
                               i=1+[k/4]

(8) 25 PCT WTD TRIMMED MEAN,

                     k-[k/4]
                       SUM   w(i)*x(i)
                    i=1+[k/4]
        xwtdtrim = --------------------
                     k-[k/4]
                       SUM   w(i)
                    i=1+[k/4]

(9) WEIGHTED STANDARD DEVIATION [Snedecor and Cochran, 1967,
p. 44],

                            k
        s = sqrt( 1/(k-1)  SUM  w(i)*(x(i)-xwtdmean)**2 )
                           i=1

(10) WTD S.D. OF MEAN [Brownlee, 1965, p. 80],

                            k
        smean = s / sqrt(  SUM  w(i) )
                           i=1

(11) RANGE [Snedecor and Cochran, 1967, p. 39],

        xrange = x(k) - x(1)

(12) MEAN DEVIATION [Duncan, 1965, p. 50],

                         k
        xmeandev = 1/k  SUM  |x(i)-xwtdmean|
                        i=1

(13) VARIANCE [Snedecor and Cochran, 1967, p. 44],

                       k
        s2 = 1/(k-1)  SUM  w(i)*(x(i)-xwtdmean)**2
                      i=1

(14) COEF. OF VAR. (PERCENT) [Snedecor and Cochran, 1967, p. 62],

        cvar = |100*s/xwtdmean|

<5-4>
(15) LOWER CONFIDENCE LIMIT, MEAN [Natrella, 1966, pp. 2-2, 2-3],

xmean - t(0.025)*smean

where t(0.025) is the appropriate t-statistic with (k-1)
degrees of freedom

(16) UPPER CONFIDENCE LIMIT, MEAN [Natrella, 1966, pp. 2-2, 2-3],

xmean + t(0.025)*smean

where t(0.025) is the appropriate t-statistic with (k-1)
degrees of freedom

(17) LOWER CONFIDENCE LIMIT, S.D. [Natrella, 1966, p. 2-7],

s*sqrt((k-1)/chi2(0.975))

where chi2(0.975) is the appropriate chi-square statistic
with (k-1) degrees of freedom

(18) UPPER CONFIDENCE LIMIT, S.D. [Natrella, 1966, p. 2-7],

s*sqrt((k-1)/chi2(0.025))

where chi2(0.025) is the appropriate chi-square statistic
with (k-1) degrees of freedom

(19) SLOPE [Fisher, 1950, p. 136],

                              k
        B = 12/(k*(k**2-1))  SUM  i*(x(i) - xwtdmean)
                             i=1

(20) S.D. OF SLOPE,

                                          k
             sqrt( -B**2*k*(k**2-1) + 12 SUM (x(i)-xwtdmean)**2 )
                                         i=1
        sB = ----------------------------------------------------
                           sqrt( k*(k**2-1)*(k-2) )

(21) SLOPE/S.D. OF SLOPE = T,

        t0 = B / sB with (k-2) degrees of freedom

(22) PROB EXCEEDING ABS VALUE OF OBS T [Brownlee, 1965, p. 344],

        Prob (t < -|t0| or t > +|t0| )

(23) NO. OF RUNS UP AND DOWN [Brownlee, 1965, p. 223], r

(24) EXPECTED NO. OF RUNS [Bradley, 1965, p. 279],

E(r) = (2k-1)/3

<5-5>
(25) S.D. OF NO. OF RUNS [Bradley, 1965, p. 279],

        rsd = sqrt( (16*k-29)/90 )

(26) MEAN SQ SUCCESSIVE DIFF [Brownlee, 1965, p. 222],

                      k-1
        D2 = 1/(k-1)  SUM  (x(i+1)-x(i))**2
                      i=1

(27) MEAN SQ SUCC DIFF/VAR [Brownlee, 1965, p. 222],

D2/s2

(28) NO. OF + SIGNS,

u = number of times sign of (x(i)-xwtdmean) is positive

(29) NO. OF - SIGNS,

v = number of times sign of (x(i)-xwtdmean) is negative

(30) NO. OF RUNS [Brownlee, 1965, p. 224],

RUNS = 1 + number of changes in sign of (x(i)-xwtdmean)

(31) EXPECTED NO. OF RUNS [Brownlee, 1965, p. 227],

E(RUNS) = 1 + (2*u*v)/k

(32) S.D. OF RUNS [Brownlee, 1965, p. 230],

RUNSsd = sqrt( 2*u*v*(2*u*v-u-v)/((u+v)**2*(k-1)) )

(33) DIFF./S.D. OF RUNS [Brownlee, 1965, p. 230],

        (RUNS - E(RUNS)) / RUNSsd

(34) MINIMUM [Natrella, 1966, p. 19-1],

x(1) = smallest value with nonzero weight

(35) MAXIMUM [Natrella, 1966, p. 19-3],

x(k) = largest value with nonzero weight

(36) BETA ONE [Snedecor and Cochran, 1967, p. 86],

                                       k
        beta1 = k/((k-1)**3*s**6) * ( SUM (x(i)-xwtdmean)**3 )**2
                                      i=1

(37) BETA TWO [Snedecor and Cochran, 1967, p. 87],

                                    k
        beta2 = k/((k-1)**2*s**4)  SUM  (x(i)-xwtdmean)**4
                                   i=1

<5-6>
(38) WTD SUM OF VALUES,

         k
        SUM  w(i)*x(i)
        i=1

(39) WTD SUM OF SQUARES,

         k
        SUM  w(i)*x(i)**2
        i=1

(40) WTD SUM OF DEVS SQUARED,

         k
        SUM  w(i)*(x(i)-xwtdmean)**2
        i=1

(41) STUDENT'S T [Brownlee, 1965, p. 296],

                    k
        t = sqrt(  SUM  w(i) )*xwtdmean/s   with (k-1) degrees of freedom
                   i=1

(42) WTD SUM ABSOLUTE VALUES,

         k
        SUM  w(i)*|x(i)|
        i=1

(43) WTD AVE ABSOLUTE VALUES,

         k
        SUM  w(i)*|x(i)|
        i=1
        ----------------
            k
           SUM  w(i)
           i=1

(44-53) FREQUENCY DISTRIBUTION [Freund and Williams, 1958, p. 17].

WT --> The vector of dimension at least N that contains the weights. A
zero weight excludes the corresponding observation from the
analysis. If the weights are all equal to 1.0, the resulting
analysis is equivalent to an unweighted analysis.

Y --> The vector of dimension at least N that contains the observations.
The tests for trend and randomness will not be meaningful unless
the sample is ordered with respect to time or some other relevant
variable.
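
To illustrate how the weights enter formulas (4), (9), (10) and (41) listed
under STS, the listing below is a minimal free-form Fortran sketch. It is
not STARPAC code: the program and variable names are illustrative
assumptions, the data are the 39 observations of the example in section F,
and unit weights are assumed so that the result corresponds to an
unweighted analysis.

      program weighted_stats_sketch
         ! Illustrative only: evaluates STS formulas (4), (9), (10), (41).
         ! It does not call STARPAC; all names here are hypothetical.
         implicit none
         integer, parameter :: n = 39
         real    :: y(n), w(n)
         real    :: sumw, xwtdmean, s, smean, t
         integer :: k

         ! The 39 observations from the Data listing in section F.
         y = [ 0.4, 0.6, 1.0, 1.0, 1.0, 0.5, 0.6, 0.7, 1.0, 0.6, 0.2, 1.9, 0.2, &
               0.4, 0.0,-0.4,-0.3, 0.0,-0.4,-0.3, 0.1,-0.1, 0.2,-0.5, 0.3,-0.1, &
               0.2,-0.2, 0.8, 0.5, 0.6, 0.8, 0.7, 0.7, 0.2, 0.5, 0.7, 0.8, 1.1 ]

         ! Assumed weights: all 1.0.  Setting an element of w to zero
         ! drops that observation from every sum below.
         w = 1.0

         k        = count(w /= 0.0)          ! number of nonzero weights
         sumw     = sum(w)
         xwtdmean = sum(w*y)/sumw                                ! (4)
         s        = sqrt(sum(w*(y - xwtdmean)**2)/real(k - 1))   ! (9)
         smean    = s/sqrt(sumw)                                 ! (10)
         t        = sqrt(sumw)*xwtdmean/s                        ! (41)

         print '(a,es15.7)', 'WEIGHTED MEAN          =', xwtdmean
         print '(a,es15.7)', 'WTD STANDARD DEVIATION =', s
         print '(a,es15.7)', 'WEIGHTED S.D. OF MEAN  =', smean
         print '(a,es15.7)', 'STUDENTS T             =', t
      end program weighted_stats_sketch

The corresponding values printed by STAT for these data appear in the
sample output of section F; agreement is expected only to within single
precision rounding.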

<5-7>
E. Computational Methods

E.1. Algorithms

Formulas for the computed statistics are given in section D under
argument STS. The code for the statistical analysis subroutines is adapted
from OMNITAB II [Hogben et al., 1971].

E.2. Computed Results and Printed Output

The output consists of a one-page display of the 53 statistics listed for
argument STS in section D. The argument, NPRT, controlling the printed output
is discussed in section D.

F. Example

The example program below uses STAT to analyze the 39 measurements of the
velocity of light shown on page 81 of Mandel [1964].

<5-8>
Program:

      PROGRAM EXAMPL
C
C     DEMONSTRATE STAT USING SINGLE PRECISION VERSION OF STARPAC
C
C     N.B. DECLARATION OF Y MUST BE CHANGED TO DOUBLE PRECISION IF
C     DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL Y(200)
      DOUBLE PRECISION DSTAK(200)
C
      COMMON /CSTAK/ DSTAK
C
C     SET UP INPUT AND OUTPUT FILES
C     [CHAPTER 1, SECTION D.4, DESCRIBES HOW TO CHANGE OUTPUT UNIT.]
C
      CALL IPRINT(IPRT)
      OPEN (UNIT=IPRT, FILE='FILENM')
      OPEN (UNIT=5, FILE='DATA')
C
C     SPECIFY NECESSARY DIMENSIONS
C
      LDSTAK = 200
C
C     READ NUMBER OF OBSERVATIONS
C          OBSERVED DATA
C
      READ (5,100) N
      READ (5,101) (Y(I), I=1,N)
C
C     PRINT TITLE AND CALL STAT TO PERFORM STATISTICAL ANALYSIS
C
      WRITE (IPRT,102)
      CALL STAT (Y, N, LDSTAK)
C
C     FORMAT STATEMENTS
C
  100 FORMAT (I5)
  101 FORMAT (13F5.1)
  102 FORMAT ('1RESULTS OF STARPAC STATISTICAL ANALYSIS',
     *        ' SUBROUTINE STAT')
      END

Data:

   39
  0.4  0.6  1.0  1.0  1.0  0.5  0.6  0.7  1.0  0.6  0.2  1.9  0.2
  0.4  0.0 -0.4 -0.3  0.0 -0.4 -0.3  0.1 -0.1  0.2 -0.5  0.3 -0.1
  0.2 -0.2  0.8  0.5  0.6  0.8  0.7  0.7  0.2  0.5  0.7  0.8  1.1

<5-9>
1RESULTS OF STARPAC STATISTICAL ANALYSIS SUBROUTINE STAT
STARPAC 2.08S (03/15/90)
+STATISTICAL ANALYSIS

N = 39

FREQUENCY DISTRIBUTION (1-6) 5 3 8 3 7 7 5 0 0 1

MEASURES OF LOCATION (2-2)                     MEASURES OF DISPERSION (2-6)

UNWEIGHTED MEAN           = 4.1025641E-01      WTD STANDARD DEVIATION   = 5.0668940E-01
WEIGHTED MEAN             = 4.1025641E-01      WEIGHTED S.D. OF MEAN    = 8.1135237E-02
MEDIAN                    = 5.0000000E-01      RANGE                    = 2.4000000E+00
MID-RANGE                 = 7.0000000E-01      MEAN DEVIATION           = 4.0486522E-01
25 PCT UNWTD TRIMMED MEAN = 4.2380952E-01      VARIANCE                 = 2.5673414E-01
25 PCT WTD TRIMMED MEAN   = 4.2380952E-01      COEF. OF. VAR. (PERCENT) = 1.2350554E+02

A TWO-SIDED 95 PCT CONFIDENCE INTERVAL FOR MEAN IS 2.4600615E-01 TO 5.7450667E-01 (2-2)
A TWO-SIDED 95 PCT CONFIDENCE INTERVAL FOR S.D. IS 4.1408928E-01 TO 6.5301023E-01 (2-7)

LINEAR TREND STATISTICS (5-1)                        OTHER STATISTICS

SLOPE                             = -4.0080972E-03   MINIMUM                 = -5.0000000E-01
S.D. OF SLOPE                     =  7.2760495E-03   MAXIMUM                 =  1.9000000E+00
SLOPE/S.D. OF SLOPE = T           = -5.5086172E-01   BETA ONE                =  9.6501319E-02
PROB EXCEEDING ABS VALUE OF OBS T =  .585            BETA TWO                =  3.3379326E+00
                                                     WTD SUM OF VALUES       =  1.6000000E+01
                                                     WTD SUM OF SQUARES      =  1.6320000E+01
TESTS FOR NON-RANDOMNESS                             WTD SUM OF DEV SQUARED  =  9.7558974E+00
                                                     STUDENTS T              =  5.0564517E+00
NO. OF RUNS UP AND DOWN = 23                         WTD SUM ABSOLUTE VALUES =  2.0600000E+01
EXPECTED NO. OF RUNS    = 25.7                       WTD AVE ABSOLUTE VALUES =  5.2820513E-01
S.D. OF NO. OF RUNS     = 2.57
MEAN SQ SUCCESSIVE DIFF = 2.8289474E-01
MEAN SQ SUCC DIFF/VAR   = 1.102

DEVIATIONS FROM WTD MEAN

NO. OF + SIGNS       = 20
NO. OF - SIGNS       = 19
NO. OF RUNS          = 8
EXPECTED NO. OF RUNS = 20.5
S.D. OF RUNS         = 3.08
DIFF./S.D. OF RUNS   = -4.056

NOTE - ITEMS IN PARENTHESES REFER TO PAGE NUMBER IN NBS HANDBOOK 91 (NATRELLA, 1966)
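
The run statistics in this listing can be cross-checked against formulas
(24), (25) and (31) through (33) of section D. The listing below is a
minimal free-form Fortran sketch, not STARPAC code; the program and
variable names are illustrative assumptions, and the counts k = 39, u = 20,
v = 19 and RUNS = 8 are read off the sample output above.

      program runs_check_sketch
         ! Illustrative only: evaluates STS formulas (24), (25), (31)-(33)
         ! from counts taken from the STAT sample output.  Names are
         ! hypothetical and nothing here calls STARPAC.
         implicit none
         integer, parameter :: k = 39      ! observations with nonzero weight
         integer, parameter :: u = 20      ! NO. OF + SIGNS
         integer, parameter :: v = 19      ! NO. OF - SIGNS
         integer, parameter :: runs = 8    ! NO. OF RUNS
         real :: er, rsd, eruns, runssd, z

         er     = real(2*k - 1)/3.0                      ! (24)
         rsd    = sqrt(real(16*k - 29)/90.0)             ! (25)
         eruns  = 1.0 + real(2*u*v)/real(k)              ! (31)
         runssd = sqrt(real(2*u*v)*real(2*u*v - u - v)/ &
                       (real(u + v)**2*real(k - 1)))     ! (32)
         z      = (real(runs) - eruns)/runssd            ! (33)

         print '(a,f8.3)', 'EXPECTED NO. OF RUNS (UP AND DOWN) =', er
         print '(a,f8.3)', 'S.D. OF NO. OF RUNS                =', rsd
         print '(a,f8.3)', 'EXPECTED NO. OF RUNS (ABOUT MEAN)  =', eruns
         print '(a,f8.3)', 'S.D. OF RUNS                       =', runssd
         print '(a,f8.3)', 'DIFF./S.D. OF RUNS                 =', z
      end program runs_check_sketch

Evaluated by hand, the five quantities are approximately 25.67, 2.57,
20.49, 3.08 and -4.06, consistent with the values 25.7, 2.57, 20.5, 3.08
and -4.056 printed by STAT.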

<5-10>
G. Acknowledgments

The code and output for the statistical analysis subroutines are adapted
from OMNITAB II [Hogben et al., 1971].

<5-11>
----- CHAPTER 6 -----

ONE-WAY ANALYSIS OF VARIANCE

A. Introduction

STARPAC contains two subroutines for one-way analysis of variance. The
output from these subroutines includes the usual analysis of variance table
plus the robust Kruskal-Wallis rank test. Comprehensive summary statistics
are also given, including means, standard deviations, standard deviations of
the mean and confidence intervals for the mean of each group. Within group
standard deviations, standard deviations of the mean and 95-percent confidence
intervals for the mean are given assuming fixed effects and random effects
models and also assuming ungrouped data. The output also includes pair-wise
multiple comparisons using the Newman-Keuls and Scheffe' techniques; the
Cochran's C, the Bartlett-Box F and variance ratio tests for homogeneity of
variances; and the random effects model components of variance estimate. The
analysis performed by these subroutines is the same as that performed by the
OMNITAB II command ONEWAY [Hogben et al., 1971]. A reference for one-way
analysis of variance is Brownlee [1965], chapter 10.

Users are directed to section B for a brief description of the
subroutines. The declaration and CALL statements are given in section C and
the subroutine arguments are defined in section D. The algorithms used and
output produced by these subroutines are discussed in section E. Sample
programs and their output are shown in section F.

B. Subroutine Descriptions

Subroutine AOV1 computes and prints the one-way analysis of variance
described in section A.

Subroutine AOV1S provides the same analysis as AOV1 but allows the user
to suppress the printed output and to store the number of observations in each
group, the group means and the group standard deviations.

C. Subroutine Declaration and CALL Statements

AOV1: Compute and print a one-way analysis of variance of the input data

<real> Y(n), TAG(n)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL AOV1 (Y, TAG, N, LDSTAK)

===

<6-1>
AOV1S: Compute and optionally print a one-way analysis of variance of the
input data; return tag value of each group, number of observations in
each group, group averages, and group standard deviations

<real> Y(n), TAG(n), GSTAT(igstat,4)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL AOV1S (Y, TAG, N, LDSTAK, NPRT, GSTAT, IGSTAT, NG)

===

D. Dictionary of Subroutine Arguments and COMMON Variables

GSTAT <-- The matrix of dimension at least NG by four whose columns contain,
in order, the tag value of the group, number of observations in
the group, group average and group standard deviation. The groups
are in order of ascending tag values.

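IERR ... An error flag returned in COMMON /ERRCHK/. [See chapter 1,
section D.5.] Note that using (or not using) the error flag will
not affect the printed error messages that are automatically
provided.

IERR = 0 indicates that no errors were detected.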

IERR = 1 indicates that improper input was detected.

IGSTAT --> The exact value of the first dimension of the matrix GSTAT as
specified in the calling program.

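LDSTAK --> The length of the DOUBLE PRECISION workspace vector DSTAK. LDSTAK
must equal or exceed the appropriate value given below, where if
the single precision version of STARPAC is being used P = 0.5,
otherwise P = 1.0. [See chapter 1, section B.]

For AOV1: LDSTAK >= 22 + N + (8*NG+N)*P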

For AOV1S: LDSTAK >= 22 + N + (4*NG+N)*P
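
As a worked check, the example in section F has N = 16 observations in
NG = 3 groups. Taking P = 0.5 for the single precision version (the
convention stated for LDSTAK in chapter 4, section D), the bounds are

        AOV1:   LDSTAK >= 22 + 16 + (8*3+16)*0.5 = 38 + 20 = 58

        AOV1S:  LDSTAK >= 22 + 16 + (4*3+16)*0.5 = 38 + 14 = 52

so the value LDSTAK = 200 used in the example program satisfies both.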

<6-2>
N --> The total number of observations (all of the groups combined).

NG <-- The number of distinct groups, that is, the number of different
positive tag values.

NPRT --> The parameter controlling printed output.

If NPRT = 0 the printed output is suppressed.

If NPRT <> 0 the printed output is provided.

TAG --> The vector of dimension at least N that contains the tag value for
each observation. The tag values may be any number.
Groups are formed from observations having the same positive tag
value. Observations having a zero or negative tag value will not
be included in the analysis.

Y --> The vector of dimension at least N that contains the observed
data. The order of the observations in Y is arbitrary since
groups are specified by the values in the corresponding elements
of vector TAG.

E. Computational Methods

E.1 Algorithms

The code and output for the one-way analysis of variance subroutines are
adapted from OMNITAB II [Hogben et al., 1971]. The computations performed are
discussed below.

E.2 Computed Results and Printed Output

The argument controlling the printed output, NPRT, is discussed in
section D.

Each of the five sections of automatic printing is described below under
the headings which appear on the printed page as shown in example F-1b. The
discussion is taken from the OMNITAB II User's Reference Manual [Hogben et
al., 1971], pages 122 to 124.

ANALYSIS OF VARIANCE. The traditional analysis of variance for a one-way
classification is printed. This shows the sources of variation, the degrees
of freedom, the sums of squares, the mean squares, the F-ratio for testing for
differences between group means and the significance level of the F-ratio.
The usual assumptions of normality, independence and constant variance of
measurement errors are made. A discussion of the statistical treatment of a
one-way classification can be found in section 10.2 of Brownlee [1965].
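
The between groups and within groups entries of this table can be
reproduced with a few lines of code. The listing below is a minimal
free-form Fortran sketch, not the AOV1 implementation: the program and
variable names are illustrative assumptions, the data are those of the
example in section F, and the sketch assumes the tag values are exactly
1.0, 2.0, ..., NG (AOV1 itself accepts any positive tag values).

      program aov1_sketch
         ! Illustrative only: one-way analysis of variance F ratio.
         ! It does not call STARPAC; all names here are hypothetical.
         implicit none
         integer, parameter :: n = 16, ng = 3
         real    :: y(n), tag(n)
         real    :: grand, gmean, ssb, ssw, f
         integer :: g, m

         ! Observations and tags from the Data listing in section F.
         y   = [ 83.0, 81.0, 76.0, 78.0, 79.0, 72.0, 61.0, 61.0, &
                 67.0, 67.0, 64.0, 78.0, 71.0, 75.0, 72.0, 74.0 ]
         tag = [ 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, &
                 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0 ]

         grand = sum(y)/real(n)          ! grand mean of all observations
         ssb   = 0.0                     ! between groups sum of squares
         ssw   = 0.0                     ! within groups sum of squares
         do g = 1, ng
            m     = count(tag == real(g))
            gmean = sum(y, mask = tag == real(g))/real(m)
            ssb   = ssb + real(m)*(gmean - grand)**2
            ssw   = ssw + sum((y - gmean)**2, mask = tag == real(g))
         end do

         ! F = (Between Groups Mean Square) / (Within Groups Mean Square)
         f = (ssb/real(ng - 1))/(ssw/real(n - ng))

         print '(a,f10.3)', 'BETWEEN GROUPS SUM OF SQUARES =', ssb
         print '(a,f10.3)', 'WITHIN GROUPS SUM OF SQUARES  =', ssw
         print '(a,f10.3)', 'F RATIO                       =', f
      end program aov1_sketch

Worked by hand for these data, the group means are 469/6, 64 and 74, the
between groups and within groups sums of squares are about 565.1 and 140.8
on 2 and 13 degrees of freedom, and the F ratio is about 26.1.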

If the significance level of the F ratio (F PROB.) for

F = (Between Groups Mean Square) / (Within Groups Mean Square)

is less than 0.10 and the number of groups exceeds two, the between groups
(means) sum of squares is separated into two components: the first associated
with the slope (one degree of freedom) and the second representing deviations
about the straight line regression of group averages on group number. This

<6-3>
information, which does not appear in a traditional analysis of variance, can
be used to examine the effect of time. A discussion of some of the
statistical aspects of this procedure is found in section 11.12 of Brownlee
[1965].

Following the above mentioned analysis of variance, the results for the
Kruskal-Wallis non-parametric H-test for testing for differences between group
means (averages) are printed. The value of H is printed along with its
significance level (F PROB.). The H-test uses the ranks of the measurements
and avoids any assumption about the distribution of measurement errors.
Details of this test may be found in section 7.7 of Brownlee [1965].

ESTIMATES. The following items are printed for each group:
(1) group number,
(2) number of observations in the group,
(3) mean,
(4) within standard deviation,
(5) standard deviation of the mean,
(6) minimum (i.e., smallest) observation,
(7) maximum (i.e., largest) observation,
(8) the sum of the ranks of the observations, and
(9) a 95-percent confidence interval for the mean.

The results are printed with the group numbers (tags) in consecutive,
increasing order regardless of the order in which the numbers were entered.

In printing the means and standard deviations of the groups, the
characters + and - are put immediately after the largest and smallest values.
If two or more values are tied for the largest value, the character + is put
immediately after all of the tied values. Ties for the smallest values are
handled in an analogous manner using the character -. If the number of
observations in a group equals one, ESTIMATE NOT AVAILABLE is printed under
WITHINS S.D. and S.D. OF MEAN. Also, ********** TO ********** appears under
95PCT CONF INT FOR MEAN.

The total number of observations, mean, minimum observation and maximum
observation are also printed for the whole dataset combined. In addition, the
within standard deviation, standard deviation of the mean and 95-percent
confidence interval for the mean are printed for three different models: a
fixed effects model (Model I), a random effects model (Model II) and a model
which assumes that all observations are from a single group. The confidence
limits are formed by taking the grand mean plus (and minus) the product of the
percentage point of Student's t distribution and the standard deviation of the
mean. Let k be the number of groups and n be the total number of observations
with positive tag. Then, the standard deviation of the mean is the square
root of the variance of the mean formed as follows:

Model       Variance                               Variance of   Degrees of
                                                   Mean          Freedom

I           VI  = Within groups mean square        VI/n          n-k

                   k
II          VII = SUM (Y(i)-mean(Y))**2/(k-1)      VII/k         k-1
                  i=1

Ungrouped   Vu  = Total mean square                Vu/n          n-1

<6-4>

PAIRWISE MULTIPLE COMPARISON OF MEANS. This section only appears if the
significance level (value of F PROB.) of the between groups F-ratio is less
than 0.10. The Newman-Keuls-Hartley procedure is not performed if the number
of measurements with positive tag is less than four plus the number of
groups.

The purpose of this comparison is to divide the groups in such a way that
the group means within a division are not significantly different at the 0.05
significance level, whereas the group means in different divisions are
significantly different at the 0.05 level. Two different procedures are used:
the Newman-Keuls-Hartley method and the Scheffe' method. The two methods are
similar but not identical and frequently give slightly different results. The
Newman-Keuls-Hartley method is described in section 10.6 of Snedecor [1956]
and section 10.8 of Snedecor and Cochran [1967]. The Scheffe' method is
discussed in section 10.3 of Brownlee [1965]. Groups are separated by a
string of five asterisks. If two divisions have no group means in common, the
two divisions are separated by two strings of five asterisks.

Both the Newman-Keuls-Hartley method and the Scheffe' method require
percentage points of the studentized range: an approximation developed by
Mandel is used to compute them. Since the Newman-Keuls-Hartley method is
designed for use when the number of observations in each group is the same,
the number of observations in each of the two groups is approximated by m,
where

1/m = (1/2)*(1/mi + 1/mj)

and mi and mj are the actual number of measurements in each of the two
groups.
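
For example, the data in section F have six observations with tag 1.0 and
five with tag 2.0, so a comparison of those two groups uses

        1/m = (1/2)*(1/6 + 1/5) = 11/60,   m = 60/11, or about 5.45

that is, m is the harmonic mean of the two group sizes.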

TEST FOR HOMOGENEITY OF VARIANCES. The usual analysis of variance for a
one-way classification assumes that the variances of all groups are the same.
This section of output provides information for assessing the validity of this
assumption. Small values of the significance level P indicate lack of
homogeneity of variance. The Cochran's C statistic printed is discussed on
page 180 of Dixon and Massey [1957] and in more detail in chapter 15 of
Eisenhart et al. [1947]. The Bartlett-Box F-test is a modification of
Bartlett's test which uses the F-distribution rather than the chi-squared
distribution and is less sensitive to non-normality. It is discussed on pages
179 and 180 of Dixon and Massey [1957]. A table of critical values of
(maximum variance)/(minimum variance) for equal sample sizes is given on pages
100 and 101 of Owen [1962].

If either P value is less than or equal to 0.10, the approximate between
mean F-test in the presence of heterogeneous variance and its significance
level P are also printed. This approximate F-test for testing for differences
between means is described on pages 287-289 of Snedecor [1956]. This
information does not appear in figure F-1b because both P values (significance
levels) exceed 0.10.

MODEL II - COMPONENTS OF VARIANCE. This is the usual analysis of
variance estimate for the between component in a random effects model (Model
II). For a discussion of this analysis, see section 10.6 and section 10.7 of
Brownlee [1965].

<6-5>

F. Example

The example program below uses AOV1 to analyze 16 determinations of the
gravitational constant, grouped according to the material used to make the
measurements. A discussion of this example can be found on pages 314-316 of
Brownlee [1965].

<6-6>
Program:

      PROGRAM EXAMPL
C
C     DEMONSTRATE AOV1 USING SINGLE PRECISION VERSION OF STARPAC
C
C     N.B. DECLARATION OF Y AND TAG MUST BE CHANGED TO DOUBLE PRECISION
C          IF DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL Y(20), TAG(20)
      DOUBLE PRECISION DSTAK(200)
C
      COMMON /CSTAK/ DSTAK
C
C     SET UP INPUT AND OUTPUT FILES
C     [CHAPTER 1, SECTION D.4, DESCRIBES HOW TO CHANGE OUTPUT UNIT.]
C
      CALL IPRINT(IPRT)
      OPEN (UNIT=IPRT, FILE='FILENM')
      OPEN (UNIT=5, FILE='DATA')
C
C     SPECIFY NECESSARY DIMENSIONS
C
      LDSTAK = 200
C
C     READ NUMBER OF OBSERVATIONS
C          OBSERVED DATA
C          TAG VALUES
C
      READ (5,100) N
      READ (5,101) (Y(I), I=1,N)
      READ (5,101) (TAG(I), I=1,N)
C
C     PRINT TITLE AND CALL AOV1 FOR ANALYSIS OF VARIANCE
C
      WRITE (IPRT,102)
      CALL AOV1 (Y, TAG, N, LDSTAK)
C
C     FORMAT STATEMENTS
C
  100 FORMAT (I5)
  101 FORMAT (20F5.1)
  102 FORMAT ('1RESULTS OF STARPAC ONE-WAY ANALYSIS OF VARIANCE',
     *  ' SUBROUTINE AOV1')
      END

Data:

   16
 83.0 81.0 76.0 78.0 79.0 72.0 61.0 61.0 67.0 67.0 64.0 78.0 71.0 75.0 72.0 74.0
  1.0  1.0  1.0  1.0  1.0  1.0  2.0  2.0  2.0  2.0  2.0  3.0  3.0  3.0  3.0  3.0

<6-7>
1RESULTS OF STARPAC ONE-WAY ANALYSIS OF VARIANCE SUBROUTINE AOV1
STARPAC 2.08S (03/15/90)

ANALYSIS OF VARIANCE

*GROUP NUMBERS HAVE BEEN ASSIGNED ACCORDING TO TAG VALUES GIVEN, WHERE THE SMALLEST TAG GREATER THAN ZERO HAS BEEN ASSIGNED *
*GROUP NUMBER 1, THE NEXT SMALLEST, GROUP NUMBER 2, ETC. TAGS LESS THAN OR EQUAL TO ZERO HAVE NOT BEEN INCLUDED IN ANALYSIS.*
*NUMBER OF VALUES EXCLUDED FROM ANALYSIS IS 0 *

SOURCE                D.F.     SUM OF SQUARES   MEAN SQUARES     F RATIO     F PROB.

BETWEEN GROUPS           2     5.651042E+02     2.825521E+02     .261E+02     .000
   SLOPE                 1     6.450893E+01     6.450893E+01     .141E+01     .257
   DEVS. ABOUT LINE      1     5.005952E+02     5.005952E+02     .462E+02     .000
WITHIN GROUPS           13     1.408333E+02     1.083333E+01
TOTAL                   15     7.059375E+02

KRUSKAL-WALLIS RANK TEST FOR DIFFERENCE BETWEEN GROUP MEANS * H = .114E+02, F PROB = .000 (APPROX.)

                                                  ESTIMATES
                                                                                                   SUM OF
TAG                  NO.   MEAN           WITHIN S.D.    S.D. OF MEAN  MINIMUM       MAXIMUM       RANKS   95PCT CONF INT FOR MEAN

1.000000E+00           6   7.81667E+01+   3.86868E+00+   1.57938E+00   7.20000E+01   8.30000E+01    76.0   7.41067E+01 TO 8.22266E+01
2.000000E+00           5   6.40000E+01-   3.00000E+00    1.34164E+00   6.10000E+01   6.70000E+01    15.0   6.02750E+01 TO 6.77250E+01
3.000000E+00           5   7.40000E+01    2.73861E+00-   1.22474E+00   7.10000E+01   7.80000E+01    45.0   7.05996E+01 TO 7.74004E+01

TOTAL                 16   7.24375E+01                                 6.10000E+01   8.30000E+01

FIXED EFFECTS MODEL                       3.29140E+00    8.22851E-01                                       7.06594E+01 TO 7.42156E+01
RANDOM EFFECTS MODEL                      7.29576E+00    4.21221E+00                                       5.43138E+01 TO 9.05612E+01
UNGROUPED DATA                            6.86021E+00    1.71505E+00                                       6.87815E+01 TO 7.60935E+01

PAIRWISE MULTIPLE COMPARISON OF MEANS. THE MEANS ARE PUT IN INCREASING ORDER IN GROUPS SEPARATED BY *****. A MEAN IS
ADJUDGED NON-SIGNIFICANTLY DIFFERENT FROM ANY MEAN IN THE SAME GROUP AND SIGNIFICANTLY DIFFERENT AT THE .05 LEVEL FROM
ANY MEAN IN ANOTHER GROUP. ***** ***** INDICATES ADJACENT GROUPS HAVE NO COMMON MEAN.

NEWMAN-KEULS TECHNIQUE, HARTLEY MODIFICATION. (APPROXIMATE IF GROUP NUMBERS ARE UNEQUAL.)
6.40000E+01,
***** *****
7.40000E+01, 7.81667E+01,

SCHEFFE TECHNIQUE.
6.40000E+01,
***** *****
7.40000E+01, 7.81667E+01,

TESTS FOR HOMOGENEITY OF VARIANCES.
COCHRANS C = MAX. VARIANCE/SUM(VARIANCES) = .4756, P = .395 (APPROX.)
BARTLETT-BOX F = .269, P = .764
MAXIMUM VARIANCE / MINIMUM VARIANCE = 1.9956

MODEL II - COMPONENTS OF VARIANCE.
ESTIMATE OF BETWEEN COMPONENT 5.114706E+01

<6-8>
G. Acknowledgments

The code and output for the one-way analysis of variance subroutines are
adapted from OMNITAB II [Hogben et al., 1971]. The discussion of the results
(section E.2) is taken from the OMNITAB II User's Reference Manual [Hogben et
al., 1971].

<6-9>
----- CHAPTER 7 -----

CORRELATION ANALYSIS

A. Introduction

STARPAC contains two subroutines for correlation analysis of a multivariate
random sample. The analysis provided by these subroutines consists of
seven tables, which, when used together, aid the user in using correlation
techniques effectively for prediction and model building. The analysis is the
same as that provided by the OMNITAB II command CORRELATION [Hogben et al.,
1971]. For further information on correlation techniques users should consult
Kendall and Stuart [1973], Brownlee [1965] and Anderson [1958].

Users are directed to section B for a brief description of the
subroutines. The declaration and CALL statements are given in section C and
the subroutine arguments are defined in section D. The algorithms used and
output produced by these subroutines are discussed in section E. Sample
programs and their output are shown in section F.

B. Subroutine Descriptions

Subroutine CORR computes and prints 1) the simple correlation matrix and
2) the significance levels of the simple correlation coefficients; 3) the
partial correlation coefficients and 4) their significance levels; 5) the
Spearman rank correlation coefficients; 6) a test for a quadratic relationship
among the variables; and 7) 95-percent and 99-percent confidence intervals for
the simple correlation coefficients.

Subroutine CORRS provides the same analysis as CORR but returns the
variance-covariance matrix used to compute the correlation coefficients. The
user can also optionally suppress the printed output.

C. Subroutine Declaration and CALL Statements

CORR: Compute and print a correlation analysis of a multivariate random
sample

<real> YM(n,m)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL CORR (YM, N, M, IYM, LDSTAK)

===

<7-1>
CORRS: Compute and optionally print a correlation analysis of a multivariate
random sample; return variance-covariance matrix

<real> YM(n,m), VCV(m,m)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL CORRS (YM, N, M, IYM, LDSTAK, NPRT, VCV, IVCV)

===

D. Dictionary of Subroutine Arguments and COMMON Variables

IERR ... An error flag returned in COMMON /ERRCHK/ [see chapter 1,
section D.5].

IERR = 0 indicates that no errors were detected.

IERR = 1 indicates that improper input was detected.

IVCV --> The exact value of the first dimension of the matrix VCV as
specified in the calling program.

IYM --> The exact value of the first dimension of the matrix YM as
specified in the calling program.

LDSTAK --> The length of the double precision workspace vector DSTAK. LDSTAK
must equal or exceed the appropriate value given below, where if
the single precision version of STARPAC is being used P = 0.5,
otherwise P = 1.0. [See chapter 1, section B.]

For CORR:

LDSTAK >= (47+max(N,M))/2 + (max(N,M)+N*(M+3)+M+7*M**2)*P

<7-2>
For CORRS:

LDSTAK >= (47+IO*max(N,M))/2 + IO*(max(N,M)+N*(M+3)+M+6*M**2)*P

where IO = 0 if NPRT = 0 and IO = 1 if NPRT <> 0.

M --> The number of variables measured for each observation, that is,
the number of columns of data in YM.

N --> The number of observations, that is, the number of rows of data in
YM.

NPRT --> The argument controlling printed output.

If NPRT = 0 the printed output is suppressed.

If NPRT <> 0 the printed output is provided.

VCV <-- The matrix of dimension at least M by M that contains the
variance-covariance matrix,

                   1     N
      VCV(j,k) = ----- * SUM (YM(i,j)-YjMEAN)*(YM(i,k)-YkMEAN)
                 (N-1)   i=1

                     1   N                            1   N
      where YjMEAN = - * SUM YM(i,j)   and   YkMEAN = - * SUM YM(i,k).
                     N   i=1                          N   i=1

YM --> The matrix of dimension at least N by M that contains the observed
multivariate data. The element in the ith row and jth column is
the ith observation on the jth variable.
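
As a check on the LDSTAK bounds given above, consider the example of section
F, which has N = 9 and M = 4 and uses the single precision version of STARPAC
(P = 0.5). The arithmetic below is an illustration worked by hand from the
formulas in this section; it is not STARPAC output.

      For CORR:
         (47+9)/2 + (9 + 9*(4+3) + 4 + 7*4**2)*0.5 = 28 + 188*0.5 = 122

      For CORRS with NPRT <> 0 (IO = 1):
         (47+9)/2 + (9 + 9*(4+3) + 4 + 6*4**2)*0.5 = 28 + 172*0.5 = 114

      For CORRS with NPRT = 0 (IO = 0):
         47/2 = 23.5, so LDSTAK = 24 is sufficient.

The value LDSTAK = 200 used in the example program satisfies all three
bounds.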

E. Computational Methods

E.1 Algorithms

Formulas for the computed tables are given in section E.2. The code and
output for the correlation analysis subroutines are adapted from OMNITAB II
[Hogben et al., 1971].

E.2 Computed Results and Printed Output

The argument controlling the printed output, NPRT, is discussed in
section D.

STARPAC correlation analysis subroutines compute and print a seven-part
correlation analysis. The output for each part is discussed individually
below; the text for this discussion is taken from the OMNITAB II User's
Reference Manual [Hogben et al., 1971].

The Simple Correlation Coefficient Matrix. The (j,k)th entry of this
matrix is the simple (product moment) correlation coefficient, rjk, between
the data in columns j and k defined by

rjk = VCV(j,k)/sqrt( VCV(j,j)*VCV(k,k) )

<7-3>
Note that when more than two variables are under study, the simple correlation
coefficient can be seriously distorted by the effect of other variables. The
partial correlation coefficient (see below) can be used to identify such
distortion.

The Significance Levels of the Simple Correlation Coefficients. The
(j,k)th entry of this table is the significance level, Sr(j,k), of the
corresponding simple correlation coefficient, rjk,

Sr(j,k) = probability of exceeding F0(1,N-2)

where F0(1,N-2) is an F-statistic with 1 and N-2 degrees of freedom,

                  (N-2)*rjk**2
      F0(1,N-2) = ------------
                    1-rjk**2

If the "true" correlation coefficient is equal to zero, then Sr(j,k) is the
probability that in a random sample (of the same size) the absolute value of a
sample correlation coefficient will exceed the absolute value of the observed
correlation coefficient, rjk.
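
For example (hand arithmetic, using values from the sample output in section
F), the simple correlation between columns 4 and 1 is rjk = .80175221 with
N = 9, so that

      rjk**2 = .642807,   1-rjk**2 = .357193,

      F0(1,7) = 7*.642807/.357193, or about 12.60.

The significance level printed for this entry, .93530479E-02, is the
probability that an F-statistic with 1 and 7 degrees of freedom exceeds this
value.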

The Partial Correlation Coefficients. The partial correlation
coefficient, pjk, between the data in columns j and k, j<>k, with the
remaining variables fixed, i.e., held constant, is given by

pjk = -cjk/sqrt(cjj*ckk)

where cjk denotes the (j,k)th element of the inverse of the simple correlation
matrix. Because the partial correlation coefficient measures the correlation
between two variables after eliminating the effect of the remaining variables,
it may avoid the distortion suffered by the simple correlation coefficient
when more than two variables are under study. The user should therefore
compare the simple correlation coefficients with the partial correlation
coefficients. Any "large" discrepancy indicates that one or more of the
remaining variables is having an important effect on the relationship. [See
Kendall and Stuart, 1961, section 27.5, page 318.]

The Significance Levels of the Partial Correlation Coefficients. The
(j,k)th entry of this table is the significance level, Sp(j,k) of the
corresponding partial correlation coefficient, pjk,

Sp(j,k) = probability of exceeding F0(1,N-M)

where F0(1,N-M) is an F-statistic with 1 and N-M degrees of freedom,

                  (N-M)*pjk**2
      F0(1,N-M) = ------------
                    1-pjk**2

If the "true" partial correlation coefficient is equal to zero, then Sp(j,k)
is the probability that in a random sample (of the same size) the absolute
value of a partial correlation coefficient will exceed the absolute value of
the observed partial correlation coefficient, pjk.

Spearman Rank Correlation Coefficient. The rank correlation coefficient
does not require the assumption that the data have a bivariate normal

<7-4>
distribution. The Spearman rank correlation coefficient, sjk, for the data in
columns j and k is computed from

                A - Djk**2 - Tj - Tk
      sjk = ----------------------------- ,
            sqrt( (A - 2*Tj)*(A - 2*Tk) )

where

                N
      Djk**2 = SUM (rank(YM(i,j)) - rank(YM(i,k)))**2,
               i=1

      A = (N-1)*N*(N+1)/6,

      Tj = (1/12) SUM (tj-1)*tj*(tj+1),
                   j

      Tk = (1/12) SUM (tk-1)*tk*(tk+1),
                   k

tj = number of ties in a set of tied values in column j of YM, and

tk = number of ties in a set of tied values in column k of YM.

The quantities Tj and Tk adjust for ties in the ranks. If there are no
ties, Tj and Tk equal zero and

sjk = 1 - (Djk**2 / A).

A comparison should be made between the rank correlation coefficients and the
corresponding simple and partial correlation coefficients. Again, a "large"
discrepancy between two comparable coefficients is an indicator of some
abnormality in the data. See Kendall [1948] for further details.
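
As a worked illustration (hand arithmetic, using the data of section F), take
columns 1 and 3 of the sample data. Neither column contains tied values, so
Tj = Tk = 0 and the simplified formula applies.

      observation i       1   2   3   4   5   6   7   8   9
      rank(YM(i,1))       5   9   6   2   1   7   3   4   8
      rank(YM(i,3))       9   2   7   6   8   1   5   3   4
      difference         -4   7  -1  -4  -7   6  -2   1   4

      Djk**2 = 16+49+1+16+49+36+4+1+16 = 188

      A = 8*9*10/6 = 120

      sjk = 1 - 188/120 = -.56666667

This agrees with the entry in row 3, column 1 of the Spearman rank
correlation table in the sample output.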

Significance Level of Quadratic Fit over Linear Fit. Underlying the use
of a correlation coefficient is the assumption that the two variables are
linearly related. The results in this part are useful in assessing the
validity of this assumption of linearity. The variables are all assumed to be
normally distributed. The numbers printed are the significance levels for an
F-test of the hypothesis that the quadratic term in a quadratic model is zero.
The F-statistic used is

                  RSS(linear model) - RSS(quadratic model)
      F0(1,N-3) = ----------------------------------------
                         RSS(quadratic model)/(N-3)

with 1 and (N-3) degrees of freedom, where RSS is the residual sum of squares
function. The values of 1 and (N-3) are printed in the heading. The
significance level, SQ, is then computed as

SQ = Prob(F > F0).

Small values of the significance level (less than 0.05, for example) indicate
lack of linearity. The test results differ depending upon which variable of a
pair is considered the dependent variable and which one is considered the
independent (or predictor) variable. Hence, the entire table is printed,
rather than just the lower triangle. The diagonal entries are always equal to

<7-5>
one and have no particular relevance. Tests of hypotheses in linear
regression are discussed in section 13.8 of Brownlee [1965].

Confidence Intervals For Simple Correlation Coefficients. Both 95-percent
and 99-percent confidence intervals for the simple correlation coefficients
are printed in this two-way table. There are two entries in each cell of the
table. The values 0.95 and 0.99 are printed along the upper left to lower
right diagonal. The 95-percent confidence limits are printed below the
diagonal and the 99-percent confidence limits are printed above the diagonal.
The number in the lower left of each cell is the lower confidence limit and
the number in the upper right is the upper confidence limit.

The confidence intervals are based on a normal approximation. [See
Morrison, 1967, chapter 3, page 101.] They are computed as follows:

Lower confidence limit: tanh( z - u/sqrt(N-3) )

Upper confidence limit: tanh( z + u/sqrt(N-3) )

where

      z = inv(tanh)(rjk) = 0.5 * ln( (1+rjk)/(1-rjk) )

and

      u = 1.96 for 95-percent confidence interval,
      u = 2.58 for 99-percent confidence interval.
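
The formulas above can be checked with a few lines of standard Fortran. The
program below is an illustrative sketch only and is not part of STARPAC; the
program name FISHCI and the variable names are choices made for this
illustration. It uses rjk = .80175221 and N = 9 from the example of section F
(row 4, column 1) and the rounded value u = 1.96.

      PROGRAM FISHCI
C
C     ILLUSTRATIVE SKETCH ONLY (NOT PART OF STARPAC).  COMPUTE THE
C     95-PERCENT CONFIDENCE LIMITS DEFINED ABOVE FOR ONE SIMPLE
C     CORRELATION COEFFICIENT.
C
      REAL R, Z, U, H
      INTEGER N
C
      R = 0.80175221
      N = 9
      U = 1.96
C
C     Z = INV(TANH)(R), WRITTEN WITH LOG SO THAT ONLY FORTRAN 77
C     INTRINSIC FUNCTIONS ARE NEEDED.  H = U/SQRT(N-3).
C
      Z = 0.5*LOG((1.0+R)/(1.0-R))
      H = U/SQRT(REAL(N-3))
C
      WRITE (*,100) TANH(Z-H), TANH(Z+H)
  100 FORMAT (' 95 PER CENT LIMITS: ', F9.6, ' TO ', F9.6)
      END

By hand, z is about 1.1035 and u/sqrt(N-3) is about 0.8002, giving limits of
about .2944 and .9565. These agree to about four significant digits with the
95-percent limits .29437227 and .95654889 printed for this pair in the sample
output. Agreement beyond that should not be expected from this sketch, since
it uses the rounded value of u.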

F. Example

The sample program shown below illustrates the use of CORR. The data are
taken from Draper and Smith [1968], page 216. The data are part of a study to
determine the effect of relative urbanization, educational level, and relative
income (columns 1, 2, and 3, respectively) on product usage (column 4).

<7-6>
Program:

      PROGRAM EXAMPL
C
C     DEMONSTRATE CORR USING SINGLE PRECISION VERSION OF STARPAC
C
C     N.B. DECLARATION OF YM MUST BE CHANGED TO DOUBLE PRECISION
C          IF DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL YM(20,6)
      DOUBLE PRECISION DSTAK(200)
C
      COMMON /CSTAK/ DSTAK
C
C     SET UP INPUT AND OUTPUT FILES
C     [CHAPTER 1, SECTION D.4, DESCRIBES HOW TO CHANGE OUTPUT UNIT.]
C
      CALL IPRINT(IPRT)
      OPEN (UNIT=IPRT, FILE='FILENM')
      OPEN (UNIT=5, FILE='DATA')
C
C     SPECIFY NECESSARY DIMENSIONS
C
      LDSTAK = 200
      IYM = 20
C
C     READ NUMBER OF OBSERVATIONS AND NUMBER OF VARIABLES
C          OBSERVED MULTIVARIATE DATA
C
      READ (5,100) N, M
      READ (5,101) ((YM(I,J), J=1,M), I=1,N)
C
C     PRINT TITLE AND CALL CORR TO PERFORM CORRELATION ANALYSIS
C
      WRITE (IPRT,102)
      CALL CORR (YM, N, M, IYM, LDSTAK)
C
C     FORMAT STATEMENTS
C
  100 FORMAT (2I5)
  101 FORMAT (4F6.1)
  102 FORMAT ('1RESULTS OF STARPAC',
     *  ' CORRELATION ANALYSIS SUBROUTINE CORR')
      END

<7-7>
Data:

    9    4
  42.2  11.2  31.9 167.1
  48.6  10.6  13.2 174.4
  42.6  10.6  28.7 160.8
  39.0  10.4  26.1 162.0
  34.7   9.3  30.1 140.8
  44.5  10.8   8.5 174.6
  39.1  10.7  24.3 163.7
  40.1  10.0  18.6 174.5
  45.9  12.0  20.4 185.7

<7-8>
1RESULTS OF STARPAC CORRELATION ANALYSIS SUBROUTINE CORR
STARPAC 2.08S (03/15/90)

CORRELATION ANALYSIS FOR 4 VARIABLES WITH 9 OBSERVATIONS

CORRELATION MATRIX
- STANDARD DEVIATIONS ARE ON THE DIAGONAL
- CORRELATION COEFFICIENTS ARE BELOW THE DIAGONAL

COLUMN     1                 2                 3                 4

     1    4.1764552
     2     .68374212         .74628711
     3    -.61596989        -.17249295        7.9279218
     4     .80175221         .76795025        -.62874595       12.645157

SIGNIFICANCE LEVELS OF SIMPLE CORRELATION COEFFICIENTS (ASSUMING NORMALITY)

COLUMN     1                 2                 3                 4

     1    0.
     2     .42269734E-01    0.
     3     .77358710E-01     .65719690        0.
     4     .93530479E-02     .15660429E-01     .69711268E-01    0.

PARTIAL CORRELATION COEFFICIENTS WITH 2 REMAINING VARIABLES FIXED

COLUMN     1                 2                 3                 4

     1    1.0000000
     2     .43170939        1.0000000
     3    -.45663591         .69717000        1.0000000
     4     .10539042         .72682008        -.64778927        1.0000000

SIGNIFICANCE LEVELS OF PARTIAL CORRELATION COEFFICIENTS (ASSUMING NORMALITY)

COLUMN     1                 2                 3                 4

     1    0.
     2     .33344895        0.
     3     .30301787         .81676691E-01    0.
     4     .82207563         .64249351E-01     .11565980        0.

<7-9>
SPEARMAN RANK CORRELATION COEFFICIENTS (ADJUSTED FOR TIES)

COLUMN     1                 2                 3                 4

     1    1.0000000
     2     .61088401        1.0000000
     3    -.56666667        -.12552411        1.0000000
     4     .68333333         .60251573        -.71666667        1.0000000

SIGNIFICANCE LEVEL OF QUADRATIC FIT OVER LINEAR FIT BASED ON F RATIO WITH 1 AND 6 DEGREES OF FREEDOM
(FOR EXAMPLE, .1703 IS THE SIGNIFICANCE LEVEL OF THE QUADRATIC TERM WHEN COLUMN 2 IS FITTED TO COLUMN 1)

COLUMN     1                 2                 3                 4

     1    1.0000000          .40442827         .94936019         .85222263
     2     .17034675        1.0000000          .80987340         .93773837
     3     .71654009         .56763094        1.0000000          .84988526
     4     .15652586         .59973987         .36810270        1.0000000

CONFIDENCE INTERVALS FOR SIMPLE CORRELATION COEFFICIENTS (USING FISHER TRANSFORMATION)
95 PER CENT LIMITS BELOW DIAGONAL, 99 PER CENT LIMITS ABOVE DIAGONAL

COLUMN         1                         2                         3                         4

     1          99.000000                   .95517075                 .32129731                 .97349304
         95.000000                  -.21219610                -.94361630                 .51874096E-01

     2            .92694792               99.000000                   .70508573                 .96846093
           .35940598E-01           95.000000                  -.84136050                -.36249864E-01

     3            .81486051E-01             .55523436               99.000000                   .30247207
          -.90845980                -.75062574               95.000000                  -.94585730

     4            .95654889                 .94838422                 .60737579E-01           99.000000
           .29437227                 .21190035                -.91203488               95.000000
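
The example above calls CORR. When the variance-covariance matrix itself is
needed, CORRS can be called instead. The following variation of the example
program is a sketch only and has not been run against STARPAC. The program
name EXAMPS, the dimensions chosen for VCV, and the output format 102 are
choices made for this illustration; everything else follows the declaration
and CALL statements in section C and the argument definitions in section D.

      PROGRAM EXAMPS
C
C     SKETCH ONLY.  CALL CORRS IN PLACE OF CORR, SUPPRESS THE PRINTED
C     REPORT, AND PRINT THE RETURNED VARIANCE-COVARIANCE MATRIX.
C     SINGLE PRECISION VERSION OF STARPAC IS ASSUMED.
C
      REAL YM(20,6), VCV(6,6)
      DOUBLE PRECISION DSTAK(200)
C
      COMMON /CSTAK/ DSTAK
C
      CALL IPRINT(IPRT)
      OPEN (UNIT=IPRT, FILE='FILENM')
      OPEN (UNIT=5, FILE='DATA')
C
C     IYM AND IVCV ARE THE FIRST DIMENSIONS OF YM AND VCV AS DECLARED
C
      LDSTAK = 200
      IYM = 20
      IVCV = 6
C
      READ (5,100) N, M
      READ (5,101) ((YM(I,J), J=1,M), I=1,N)
C
C     NPRT = 0 SUPPRESSES THE PRINTED OUTPUT
C
      NPRT = 0
      CALL CORRS (YM, N, M, IYM, LDSTAK, NPRT, VCV, IVCV)
C
C     PRINT THE RETURNED MATRIX, ONE ROW PER LINE
C
      DO 10 J = 1, M
         WRITE (IPRT,102) (VCV(J,K), K=1,M)
   10 CONTINUE
C
  100 FORMAT (2I5)
  101 FORMAT (4F6.1)
  102 FORMAT (1X, 6E15.7)
      END

With the data of this section, the diagonal of the returned matrix would be
expected to hold the squares of the standard deviations printed on the
diagonal of the correlation matrix above; for example, VCV(1,1) should be
close to 4.1764552**2, or about 17.44.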

<7-10>
G. Acknowledgments

The code and output for the correlation subroutines are adapted from
OMNITAB II [Hogben et al., 1971]. The discussion of the results (section E.2)
is taken from the OMNITAB II User's Reference Manual [Hogben et al., 1971].

<7-11>
----- CHAPTER 8 -----

LINEAR LEAST SQUARES

A. Introduction

STARPAC contains eight subroutines for linear least squares analysis.
For four of these, the user specifies the model by supplying the design matrix
(the matrix whose columns are the independent variables plus a column of ones
if a constant term is being estimated). The other four perform the same
analysis for the special case of a polynomial model, where the need for the
user to explicitly create the design matrix is eliminated.

Each of the subroutines described in this chapter assumes that the
observations of the dependent variable, Y(i), which are measured with error,
are modeled by

             NPAR
      Y(i) = SUM PAR(j)*x(i,j) + e(i)      for i = 1, ..., N,
             j=1

where

N is the number of observations;

NPAR is the number of parameters;

x(i,j) is the jth element of the ith row of the design matrix (for the
user-specified model, x(i,j) = XM(i,j) for i = 1, ..., N and j = 1, ...,
NPAR, while for the polynomial model, x(i,j) = X(i)**(j-1) for i =
1,..., N and j = 1, ..., NPAR);

PAR is the vector of the NPAR unknown parameters (coefficients); and

e(i) is the random error in the ith observation.

The least squares solution, PARhat, is that which minimizes (with respect to
PAR) the residual sum of squares function

                  N
      RSS(PAR) = SUM e(i)**2*wt(i)
                 i=1

                  N                  NPAR
               = SUM wt(i)*( Y(i) -  SUM PAR(j)*x(i,j) )**2
                 i=1                 j=1

where ''hat'' (e.g., PARhat, Yhat, etc.) denotes the estimated quantity, and

wt(i) is the weight assigned to the ith observation (wt(i) = 1.0 in
the ''unweighted'' case). Appendix B discusses several common
applications for weighted least squares.

<8-1>
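Users are directed to section B for a brief description of the
subroutines. The declaration and CALL statements are given in section C and
the subroutine arguments are defined in section D. The algorithms used and
output produced by these subroutines are discussed in section E. Sample
programs and their output are shown in section F.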

B. Subroutine Descriptions

The linear least squares estimation subroutines permit both weighted and
unweighted analysis. The user has two levels of control over the computations
and printed output.

* In level one, a four-part printed report is automatically provided
and the residuals are returned to the user via the subroutine
argument list.

* Level two also returns the residuals. It allows the user to specify
the amount of printed output, and, in addition, returns the
following estimated values via the argument list:
- the estimated parameters;
- the residual standard deviation;
- the predicted values;
- the standard deviations of the predicted values;
- the standardized residuals; and
- the variance-covariance matrix of the estimated parameters.

The simplest of the linear least squares subroutines are LLS for the
user-specified model and LLSP for the polynomial model. They perform
unweighted analyses, provide a four-part printed report and return the
residuals via the argument list (level one control). The other six
subroutines provide greater flexibility by adding the weighting and/or level
two control features. These features are each indicated by a different suffix
letter on the subroutine name (e.g., LLSS and LLSPWS).

* Suffix W indicates user-supplied weights.

* Suffix S indicates level two control of the computations and output.

C. Subroutine Declaration and CALL Statements

NOTE: Argument definitions and sample programs are given in sections D and
F, respectively. The conventions used to present the following
declaration and CALL statements are given in chapter 1, sections B and D.

LLS: Compute and print a four-part unweighted linear least squares
analysis with user-specified model (design matrix); return residuals

<real> Y(n), XM(n,npar), RES(n)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL LLS (Y, XM, N, IXM, NPAR, RES, LDSTAK)

===

<8-2>
LLSS: Compute and optionally print a four-part unweighted linear least
squares analysis with user-specified model (design matrix); return
residuals, parameter estimates, residual standard deviation, predicted
values, standard deviations of the predicted values, standardized
residuals, and variance-covariance matrix of parameters

<real> Y(n), XM(n,npar), RES(n)
<real> PAR(npar), PV(n), SDPV(n), SDRES(n), VCV(npar,npar)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL LLSS (Y, XM, N, IXM, NPAR, RES, LDSTAK,
+ NPRT, PAR, RSD, PV, SDPV, SDRES, VCV, IVCV)

===

LLSW: Compute and print a four-part weighted linear least squares analysis
with user-specified model (design matrix); return residuals

<real> Y(n), XM(n,npar), RES(n)
<real> WT(n)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL LLSW (Y, WT, XM, N, IXM, NPAR, RES, LDSTAK)

===

LLSWS: Compute and optionally print a four-part weighted linear least
squares analysis with user-specified model (design matrix); return
residuals, parameter estimates, residual standard deviation, predicted
values, standard deviations of the predicted values, standardized
residuals, and variance-covariance matrix of parameters

<real> Y(n), XM(n,npar), RES(n)
<real> WT(n), PAR(npar), PV(n), SDPV(n), SDRES(n), VCV(npar,npar)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL LLSWS (Y, WT, XM, N, IXM, NPAR, RES, LDSTAK,
+ NPRT, PAR, RSD, PV, SDPV, SDRES, VCV, IVCV)

===

LLSP: Compute and print a four-part unweighted linear least squares
analysis with polynomial model (design matrix); return residuals

<real> Y(n), X(n), RES(n)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL LLSP (Y, X, N, NDEG, RES, LDSTAK)

===

<8-3>
LLSPS: Compute and optionally print a four-part unweighted linear least
squares analysis with polynomial model (design matrix); return
residuals, parameter estimates, residual standard deviation, predicted
values, standard deviations of the predicted values, standardized
residuals, and variance-covariance matrix of parameters

<real> Y(n), X(n), RES(n)
<real> PAR(npar), PV(n), SDPV(n), SDRES(n), VCV(npar,npar)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL LLSPS (Y, X, N, NDEG, RES, LDSTAK,
+ NPRT, LPAR, PAR, NPAR, RSD,
+ PV, SDPV, SDRES, VCV, IVCV)

===

LLSPW: Compute and print a four-part weighted linear least squares analysis
with polynomial model (design matrix); return residuals

<real> Y(n), X(n), RES(n)
<real> WT(n)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL LLSPW (Y, WT, X, N, NDEG, RES, LDSTAK)

===

LLSPWS: Compute and optionally print a four-part weighted linear least
squares analysis with polynomial model (design matrix); return
residuals, parameter estimates, residual standard deviation, predicted
values, standard deviations of the predicted values, standardized
residuals, and variance-covariance matrix of parameters

<real> Y(n), X(n), RES(n)
<real> WT(n), PAR(npar), PV(n), SDPV(n), SDRES(n), VCV(npar,npar)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL LLSPWS (Y, WT, X, N, NDEG, RES, LDSTAK,
+ NPRT, LPAR, PAR, NPAR, RSD,
+ PV, SDPV, SDRES, VCV, IVCV)

===

<8-4>
D. Dictionary of Subroutine Arguments and COMMON Variables

IERR ... An error flag returned in COMMON /ERRCHK/ [see chapter 1,
section D.5]. Note that using (or not using) the error flag will
not affect the printed error messages that are automatically
provided.

IERR = 0 indicates that no errors were detected and that the
least squares solution was found.

IERR = 1 indicates that improper input was detected.

IERR = 2 indicates that the model is computationally singular,
which may mean the model has too many parameters. The
user should examine the model and data to determine and
remove the cause of the singularity.

IERR = 3 indicates that at least one of the standardized residuals
could not be computed because its standard deviation was
zero. The validity of the variance-covariance matrix is
questionable.

IVCV --> The exact value of the first dimension of the matrix VCV as
specified in the calling program.

IXM --> The exact value of the first dimension of the matrix XM as
specified in the calling program.

LDSTAK --> The length of the DOUBLE PRECISION workspace vector DSTAK. LDSTAK
must equal or exceed the value given below, where if the single
precision version of STARPAC is being used P = 0.5, otherwise
P = 1.0 [see chapter 1, section B].

LDSTAK >= 28 + [6*N+1+5*NPAR+NPAR*N+2*NPAR**2]*P

LPAR --> The actual length of the vector PAR as specified in the calling
program.

N --> The number of observations.

NDEG --> The degree of the polynomial model. The number of estimated
parameters is NPAR = NDEG + 1.

<8-5>
NPAR --- The number of parameters to be estimated. NPAR is input to the
subroutines with a user-specified model; for the subroutines with
a polynomial model, NPAR = NDEG + 1 is returned.

NPRT --> The argument controlling printed output. NPRT is a four-digit
integer, where the value of the Ith digit (counting from left to
right) controls the Ith section of the output.

If the Ith digit = 0, the output from the Ith section is
suppressed;
= 1, the brief form of the Ith section is given;
>= 2, the full form of the Ith section is given.

The default value for NPRT is 1112. If the user-supplied value of
NPRT is less than zero or NPRT is not an argument in the
subroutine CALL statement the default value will be used.

A full discussion of the printed output is given in section E.2
and is summarized as follows.

Section 1 provides information for each observation based on the
solution. Brief output includes information for the
first 40 observations, while full output provides the
information for all of the data.

Section 2 is a set of four residual plots. Brief output and full
output are the same for this section.

Section 3 is an analysis of variance. Brief output and full
output are the same for this section.

Section 4 is the final summary of the estimated parameters. Brief
output does not include printing the variance-covariance
matrix while full output does.

PAR <-- The vector of dimension at least NPAR that contains the estimated
parameter values.

PV <-- The vector of dimension at least N that contains the predicted
values of the dependent variable,

              NPAR
      PV(i) = SUM PARhat(j)*x(i,j) = Yhat(i)      for i = 1, ..., N.
              j=1

RES <-- The vector of dimension at least N that contains the residuals at
the solution,

                      NPAR
      RES(i) = Y(i) - SUM PARhat(j)*x(i,j)
                      j=1

             = Y(i) - Yhat(i) = ehat(i)

      for i = 1, ..., N.

<8-6>
RSD <-- The residual standard deviation at the solution,

RSD = sqrt( RSS(PARhat)/(Nnzw-NPAR) )

where Nnzw is the number of observations with nonzero weights.

SDPV <-- The vector of dimension at least N that contains the standard
deviation of each predicted value at the solution,

SDPV(i) = the ith diagonal element of sqrt[D*VCV*trans(D)]

for i = 1, ..., N, where D is the design matrix, D(i,j) = x(i,j)
with x(i,j) defined in section A, and trans(D) is the transpose of
D.

SDRES <-- The vector of dimension at least N that contains the standardized
residuals at the solution, i.e., the ith residual divided by its
estimated standard deviation,

SDRES(i) = RES(i)/sqrt( RSD**2/wt(i) - SDPV(i)**2 )

for i = 1, ..., N.

VCV <-- The matrix of dimension at least NPAR by NPAR that contains the
variance-covariance matrix of the estimated parameters at the
solution,

VCV = RSD**2*inv(trans(D)*W*D)

where W is the N by N diagonal matrix of weights,

W = diag( wt(i), i=1, ..., N),

and D is the design matrix, D(i,j) = x(i,j) with x(i,j) defined in
section A, trans(D) is the transpose of D, and inv(.) is the
inverse of the designated matrix.

WT --> The vector of dimension at least N that contains the weights.
Negative weights are not allowed and the number of nonzero weights
must equal or exceed the number of parameters being estimated. A
zero weight eliminates the corresponding observation from the
analysis, although the residual, the predicted value and the
standard deviation of the predicted value of a zero-weighted
observation are still computed [see Appendix B].

X --> The vector of dimension at least N that contains the independent
variable used to construct the design matrix for the polynomial
model.

XM --> The matrix of dimension at least N by NPAR that contains the
design matrix, i.e., the matrix whose columns are the independent
variables plus a column of ones if a constant term is being
estimated.

Y --> The vector of dimension at least N that contains the dependent
variable.
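
As an illustration of two of the arguments defined above (hand arithmetic
with hypothetical values, not STARPAC output), suppose N = 20 observations
are fitted with NPAR = 3 parameters using the single precision version of
STARPAC (P = 0.5). The workspace bound is then

      LDSTAK >= 28 + [6*20 + 1 + 5*3 + 3*20 + 2*3**2]*0.5
              = 28 + 214*0.5
              = 135.

For NPRT, the default value 1112 requests the brief form of sections 1, 2
and 3 and the full form of section 4, so the variance-covariance matrix is
printed. A value such as NPRT = 2000 would request the full form of section 1
(information for all observations) and suppress sections 2, 3 and 4.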

<8-7>
E. Computational Methods

E.1 The Linear Least Squares Algorithm

The linear least squares estimation subroutines use a modified
Gram-Schmidt algorithm [Davis, 1962; Walsh, 1962]. The printed output for the
linear least squares subroutines has been modeled on the linear least squares
output used by OMNITAB II [Hogben et al., 1971].

E.2 Computed Results and Printed Output

The argument controlling the printed output, NPRT, is discussed in
section D.

The output from the linear least squares estimation subroutines consists
of four sections, several of which include tables summarizing the results. In
the following descriptions, the actual table headings are given by the
uppercase phrases enclosed in angle brackets (<...>). Results which
correspond to input or returned subroutine CALL statement arguments are
identified by the argument name in uppercase (not enclosed in angle brackets).

Section 1 provides the following information for each observation, i, i = 1,
..., N, based on the solution.

* : the row number of the observation.

* : the values for up to the first three columns of
the independent variable (design matrix). For subroutines with a
user-supplied model, this is up to the first three columns of the
matrix XM (excluding the first column if it is all ones, indicating a
constant term); for the polynomial model subroutines, this is the
variable X.

* : the values of the dependent variable, Y.

* : the predicted values, PV, from the fit.

* : the standard deviations of the predicted
values, SDPV.

* : the error estimate, RES.

* : the standardized residual, SDRES.

* : the user-supplied weights, WT, printed only when weighted
analysis is performed.

Section 2 displays the following plots of the standardized residuals.

* The standardized residuals versus row numbers.

* The standardized residuals versus predicted values.

* The autocorrelation function of the (non-standardized) residuals.

* The normal probability plot of the standardized residuals.


Section 3 provides an analysis of variance. The results of this analysis
depend upon the order of the columns of the design matrix unless the
columns are orthogonal. The analysis includes the following
information.

* <PAR INDEX>: the index, j, of the parameter being examined, PAR(j).

* <SUM OF SQUARES RED DUE TO PAR>: SSj, the reduction in the sum of
squares due to fitting PAR(j) after having fit parameters PAR(1),
PAR(2), ..., PAR(j-1). SSj depends on the order of the parameters
unless the design matrix has orthogonal columns. This is a
decomposition of the total sum of squares, TSS, into NPAR + 1 parts

                          NPAR
     TSS = RSS(PARhat) +  SUM  SSj .
                          j=1

The residual sum of squares and total sum of squares are also listed in
this column.

* <CUM MS RED>: the cumulative mean square reduction,

                 j
     MSREDj = ( SUM SSk ) / j     for j = 1, ..., NPAR.
                k=1

* <DF(MSRED)>: the degrees of freedom associated with the cumulative
mean square reduction for each parameter, DF(MSREDj) = j for j = 1,
..., NPAR. The degrees of freedom for the residuals, Nnzw-NPAR, and
the total degrees of freedom, Nnzw, where Nnzw is the number of
observations with nonzero weights, are also listed in this column.

* <CUM RES MS>: the cumulative residual mean square,

                             NPAR
     RMSj = ( RSS(PARhat) +  SUM  SSk ) / (Nnzw-j)     for j = 1, ..., NPAR.
                            k=j+1

* <DF(RMS)>: the degrees of freedom associated with the cumulative
residual mean square for each parameter, DF(RMSj) = Nnzw-j for j = 1,
..., NPAR.

* <PAR=0>, <F> and <PROB(F)>: the F-ratio and its significance level
under the null hypothesis that PAR(j) is zero after allowance has been
made for parameters PAR(1), PAR(2), ..., PAR(j-1). This F-ratio is

Fj = SSj/RMSnpar for j = 1, ..., NPAR,

with 1 and Nnzw-NPAR degrees of freedom. The significance level
listed is the probability of exceeding the calculated F-ratio under
the null hypothesis that the corresponding parameter, PAR(j), is zero.

* <PARS=0>, <F> and <PROB(F)>: the F-ratio and its significance level
under the null hypothesis that parameters PAR(j), PAR(j+1), ...,
PAR(NPAR), are zero after allowance has been made for parameters
PAR(1), PAR(2), ..., PAR(j-1). This F-ratio is

              NPAR
     Fj = ( ( SUM SSk ) / (NPAR-j+1) ) / RMSnpar     for j = 1, ..., NPAR,
              k=j

with NPAR-j+1 and Nnzw-NPAR degrees of freedom. The significance
level listed is the probability of exceeding the calculated F-ratio
under the null hypothesis that all of the parameters PAR(j), PAR(j+1),
..., PAR(NPAR) are zero.

The numerator of this ratio is the extra sum of squares accounted for
by inclusion of the terms PAR(j)*x(j) + PAR(j+1)*x(j+1) + ... +
PAR(npar)*x(npar) in the model, divided by its degrees of freedom; the
denominator is the residual mean square of the full model. This ratio
is a means of comparing the extra sum of squares to its expected value
as estimated by the residual mean square. When the terms of the model
have a logical order of entry, this series of F-tests can be used to
judge how many terms should be included in the model [see Draper and
Smith, 1981, pages 97 and 98].
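
As a rough check of these definitions, the analysis of variance printed
for the LLS example in section F (N = Nnzw = 21, NPAR = 4) can be
reproduced by hand from its sum of squares column. Values are rounded
as printed, so the last digits may differ slightly.

     TSS    = 178.8300 + 6448.76190 + 1750.12199 + 130.320772
              + 9.96537226                                =  8518.000
     MSRED2 = (6448.76190 + 1750.12199) / 2               =  4099.442
     RMS3   = (178.8300 + 9.96537226) / (21 - 3)          =  10.4886
     RMS4   = 178.8300 / (21 - 4)                         =  10.5194
     F4 for PAR(4) = 0:
              9.96537226 / 10.5194                        =  .9473
     F3 for PAR(3) = PAR(4) = 0:
              ((130.320772 + 9.96537226) / 2) / 10.5194   =  6.668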

Section 4 summarizes the following information about the final parameter
estimates and their variances.

* The variance-covariance matrix, VCV, of the estimated parameters, and
the corresponding correlation matrix,

rjk = VCV(j,k)/sqrt(VCV(j,j)*VCV(k,k)) for j = 1, ..., NPAR
and k = 1, ..., NPAR.

* <ESTIMATES FROM FIT>:

- <ESTIMATED PARAMETER>: the final estimate for each parameter, PAR(j)
for j = 1, ..., NPAR.

- <SD OF PAR>: the standard deviation of the estimated parameter,
sqrt(VCV(j,j)) for j = 1, ..., NPAR.

- <T(PAR=0)>: the Student's t statistic under the null hypothesis that
PAR(j) is actually zero,

T(PAR=0)j = PAR(j)/sqrt(VCV(j,j)) for j = 1, ..., NPAR.

- <PROB(T)>: the two-sided significance level of T(PAR=0)j. This is
the probability of exceeding the given t value under the null
hypothesis that the parameter PAR(j) is actually zero.

- <ACC DIG>: an estimate of the number of reliable digits in the
parameter estimates, i.e., an indication of the computational
accuracy of the solution. A computationally accurate solution will
produce values between DIGITS-2 and DIGITS, where DIGITS is the
number of decimal digits carried by the user's computer for a single
precision value when the single precision version of STARPAC is
being used and is the number carried for a double precision value
otherwise. Values less than DIGITS-4 may indicate some
computational difficulty such as poor scaling or near singularity.

* <ESTIMATES FROM FIT OMITTING LAST PREDICTOR VALUE>: the <ESTIMATED
PARAMETER>, <SD OF PAR>, <T(PAR=0)> and <PROB(T)> values for a fit
omitting the last column of the design matrix and thus omitting the
last parameter from the model.

* The residual standard deviation, RSD.

* The residual degrees of freedom, Nnzw-NPAR, where Nnzw is the number
of observations with nonzero weights.

* The squared multiple correlation coefficient,

                         RSS(PARhat)
     R2 = 1.0 - --------------------------
                  N
                 SUM wt(i) * (Y(i)-Yw)**2
                 i=1

where

             N                       N
     Yw = ( SUM wt(i)*Y(i) )  /  (  SUM wt(i) ) .
            i=1                     i=1

R2 is a measure of how well the fitted equation accounts for the total
variation of the dependent variable, Y. It is only computed when the
first parameter of the model is a constant, i.e., when the elements of
the first column of the design matrix are all equal.

* An approximation to the condition number of the design matrix, D(i,j)=
x(i,j) with x(i,j) defined in section A, under the assumption that the
absolute error in each column of D is roughly equal. The approximation
will be meaningless if this assumption is not valid; otherwise it
will usually underestimate the actual condition number by a factor of
from 2 to 10 [see Dongarra et al., 1979, p. 1.8 to 1.12]. (Note that
the condition number returned by the linear least squares subroutines
is not exactly the same as that returned by the nonlinear least
squares subroutines because of differences in the computational
procedures used by the two families of subroutines.)
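
As a rough check of the Section 4 quantities, several entries printed
for the LLS example in section F can be reproduced by hand. Values are
rounded as printed. The R2 check assumes an unweighted fit whose first
column is all ones, so that SS1 equals N times the squared mean of Y and
TSS - SS1 is the sum of squares about the mean.

     SD of PAR(1) = sqrt(VCV(1,1)) = sqrt(141.51474)         =  11.896
     T(PAR=0) for PAR(1) = -39.9196744 / 11.8959969          = -3.356
     r21 = .28758711 / sqrt(141.51474 * .18186730E-01)       =  .1793
     R2  = 1.0 - 178.8300 / (8518.000 - 6448.76190)          =  .9136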

F. Examples

User-Specified Model (Design Matrix). In the first example below, LLS is
used to compute the least squares solution for the example given on pages 61-
65 of Daniel and Wood [1971]. The results for this problem are also discussed
in Draper and Smith [1981], pages 372 and 373.

Polynomial Model (Design Matrix). In the second example, LLSP is used to
compute the least squares solution for the example given on page 311 of Miller
and Freund [1977].

Program:

      PROGRAM EXAMPL
C
C     DEMONSTRATE LLS USING SINGLE PRECISION VERSION OF STARPAC
C
C     N.B. DECLARATION OF Y, XM AND RES MUST BE CHANGED TO DOUBLE
C          PRECISION IF DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL Y(30), XM(30,5), RES(30)
      DOUBLE PRECISION DSTAK(500)
C
      COMMON /CSTAK/ DSTAK
C
C     SET UP INPUT AND OUTPUT FILES
C     [CHAPTER 1, SECTION D.4, DESCRIBES HOW TO CHANGE OUTPUT UNIT.]
C
      CALL IPRINT(IPRT)
      OPEN (UNIT=IPRT, FILE='FILENM')
      OPEN (UNIT=5, FILE='DATA')
C
C     SPECIFY NECESSARY DIMENSIONS
C
      LDSTAK = 500
      IXM = 30
C
C     READ NUMBER OF OBSERVATIONS AND NUMBER OF UNKNOWN PARAMETERS
C          INDEPENDENT VARIABLES
C          DEPENDENT VARIABLES
C
      READ (5,100) N, NPAR
      READ (5,101) ((XM(I,J), I=1,N), J=1,NPAR)
      READ (5,101) (Y(I), I=1,N)
C
C     PRINT TITLE AND CALL LLS TO PERFORM LINEAR LEAST SQUARES ANALYSIS
C
      WRITE (IPRT,102)
      CALL LLS (Y, XM, N, IXM, NPAR, RES, LDSTAK)
C
C     FORMAT STATEMENTS
C
  100 FORMAT (2I5)
  101 FORMAT (21F3.0)
  102 FORMAT ('1RESULTS FROM STARPAC',
     *  ' LINEAR LEAST SQUARES SUBROUTINE LLS')
      END

Data:

   21    4
  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
 80 80 75 62 62 62 62 62 58 58 58 58 58 58 50 50 50 50 50 56 70
 27 27 25 24 22 23 24 24 23 18 18 17 18 19 18 18 19 19 20 20 20
 89 88 90 87 87 87 93 93 87 80 89 88 82 93 89 86 72 79 80 82 91
 42 37 37 28 18 18 19 20 15 14 14 13 11 12  8  7  8  8  9 15 15

1RESULTS FROM STARPAC LINEAR LEAST SQUARES SUBROUTINE LLS
STARPAC 2.08S (03/15/90)
+***************************************************************
* LINEAR LEAST SQUARES ESTIMATION WITH USER-SPECIFIED MODEL *
***************************************************************

RESULTS FROM LEAST SQUARES FIT
-------------------------------

                                       DEPENDENT  PREDICTED STD DEV OF                    STD
 ROW          PREDICTOR VALUES          VARIABLE      VALUE PRED VALUE        RESIDUAL    RES

   1  80.000000  27.000000  89.000000  42.000000  38.765363  1.7810630       3.2346372   1.19
   2  80.000000  27.000000  88.000000  37.000000  38.917485  1.8285238      -1.9174853   -.72
   3  75.000000  25.000000  90.000000  37.000000  32.444467  1.3553032       4.5555330   1.55
   4  62.000000  24.000000  87.000000  28.000000  22.302226  1.1626690       5.6977742   1.88
   5  62.000000  22.000000  87.000000  18.000000  19.711654  .74116599      -1.7116536   -.54
   6  62.000000  23.000000  87.000000  18.000000  21.006940  .90284068      -3.0069397   -.97
   7  62.000000  24.000000  93.000000  19.000000  21.389491  1.5186314      -2.3894907   -.83
   8  62.000000  24.000000  93.000000  20.000000  21.389491  1.5186314      -1.3894907   -.48
   9  58.000000  23.000000  87.000000  15.000000  18.144379  1.2143513      -3.1443789  -1.05
  10  58.000000  18.000000  80.000000  14.000000  12.732806  1.4506369       1.2671941    .44
  11  58.000000  18.000000  89.000000  14.000000  11.363703  1.2770503       2.6362968    .88
  12  58.000000  17.000000  88.000000  13.000000  10.220540  1.5114771       2.7794604    .97
  13  58.000000  18.000000  82.000000  11.000000  12.428561  1.2872987      -1.4285609   -.48
  14  58.000000  19.000000  93.000000  12.000000  12.050499  1.4714398  -.50499291E-01   -.02
  15  50.000000  18.000000  89.000000  8.0000000  5.6385816  1.4154780       2.3614184    .81
  16  50.000000  18.000000  86.000000  7.0000000  6.0949492  1.1742308       .90505080    .30
  17  50.000000  19.000000  72.000000  8.0000000  9.5199506  2.0821373      -1.5199506   -.61
  18  50.000000  19.000000  79.000000  8.0000000  8.4550930  1.2997464      -.45509295   -.15
  19  50.000000  20.000000  80.000000  9.0000000  9.5982566  1.3549990      -.59825656   -.20
  20  56.000000  20.000000  82.000000  15.000000  13.587853  .91842680       1.4121473    .45
  21  70.000000  20.000000  91.000000  15.000000  22.237713  1.7300647      -7.2377129  -2.64

1 STARPAC 2.08S (03/15/90)
+LINEAR LEAST SQUARES ESTIMATION WITH USER-SPECIFIED MODEL, CONTINUED

[Four character plots: STD RES VS ROW NUMBER (row numbers 1.0 to 21.0);
STD RES VS PREDICTED VALUES (predicted values 5.639 to 38.92);
AUTOCORRELATION FUNCTION OF RESIDUALS (lags 1 to 26, horizontal axis
-1.00 to 1.00); NORMAL PROBABILITY PLOT OF STD RES (horizontal axis
-2.5 to 2.5). The standardized residual axes run from -3.75 to 3.75.]
STARPAC 2.08S (03/15/90)
+LINEAR LEAST SQUARES ESTIMATION WITH USER-SPECIFIED MODEL, CONTINUED

ANALYSIS OF VARIANCE
-DEPENDENT ON ORDER VARIABLES ARE ENTERED, UNLESS VECTORS ARE ORTHOGONAL-

  PAR   SUM OF SQUARES                                                 ------ PAR=0 ------ ------ PARS=0 -----
INDEX   RED DUE TO PAR    CUM MS RED  DF(MSRED)    CUM RES MS  DF(RMS)          F  PROB(F)          F  PROB(F)

    1       6448.76190    6448.76190          1    103.461905       20    613.035     .000    198.185     .000
    2       1750.12199    4099.44195          2    16.7955845       19    166.371     .000    59.9022     .000
    3       130.320772    2776.40156          3    10.4886297       18    12.3886     .003    6.66797     .007
    4       9.96537226    2084.79251          4    10.5194095       17    .947332     .344    .947332     .344

RESIDUAL      178.8300                       17
TOTAL         8518.000                       21

STARPAC 2.08S (03/15/90)
+LINEAR LEAST SQUARES ESTIMATION WITH USER-SPECIFIED MODEL, CONTINUED

VARIANCE-COVARIANCE AND CORRELATION MATRICES OF THE ESTIMATED PARAMETERS
------------------------------------------------------------------------

- COVARIANCES ARE ABOVE THE DIAGONAL
- VARIANCES ARE ON THE DIAGONAL
- CORRELATION COEFFICIENTS ARE BELOW THE DIAGONAL

COLUMN 1 2 3 4

1 141.51474 .28758711 -.65179437 -1.6763208
2 .17926325 .18186730E-01 -.36510675E-01 -.71435215E-02
3 -.14887895 -.73564128 .13544186 .10476827E-04
4 -.90159992 -.33891642 .18214234E-03 .24427828E-01

   ------------------------- ESTIMATES FROM FIT ------------------------   ---- ESTIMATES FROM FIT OMITTING LAST PREDICTOR VALUE ----

     ESTIMATED PARAMETER       SD OF PAR  T(PAR=0)  PROB(T)  ACC DIG*       ESTIMATED PARAMETER       SD OF PAR  T(PAR=0)  PROB(T)

  1          -39.9196744      11.8959969    -3.356     .004      14.1               -50.3588401      5.13832806    -9.801     .000
  2           .715640200      .134858185     5.307     .000      14.1                .671154441      .126691047     5.298     .000
  3           1.29528612      .368024265     3.520     .003      13.8                1.29535137      .367485444     3.525     .002
  4          -.152122519      .156294043    -.9733     .344      14.1

RESIDUAL STANDARD DEVIATION        3.243364                                            3.238615
BASED ON DEGREES OF FREEDOM        21 - 4 = 17                                         21 - 3 = 18

MULTIPLE CORRELATION COEFFICIENT SQUARED .9136

APPROXIMATE CONDITION NUMBER 1047.370

* THE NUMBER OF CORRECTLY COMPUTED DIGITS IN EACH PARAMETER USUALLY DIFFERS BY LESS THAN 1 FROM THE VALUE GIVEN HERE.

Program:

      PROGRAM EXAMPL
C
C     DEMONSTRATE LLSP USING SINGLE PRECISION VERSION OF STARPAC
C
C     N.B. DECLARATION OF Y, X AND RES MUST BE CHANGED TO DOUBLE
C          PRECISION IF DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL Y(30), X(30), RES(30)
      DOUBLE PRECISION DSTAK(500)
C
      COMMON /CSTAK/ DSTAK
C
C     SET UP INPUT AND OUTPUT FILES
C     [CHAPTER 1, SECTION D.4, DESCRIBES HOW TO CHANGE OUTPUT UNIT.]
C
      CALL IPRINT(IPRT)
      OPEN (UNIT=IPRT, FILE='FILENM')
      OPEN (UNIT=5, FILE='DATA')
C
C     SPECIFY NECESSARY DIMENSIONS
C
      LDSTAK = 500
C
C     READ NUMBER OF OBSERVATIONS AND DEGREE OF THE POLYNOMIAL TO BE FIT
C          INDEPENDENT AND DEPENDENT VARIABLES
C
      READ (5,100) N, NDEG
      READ (5,101) (X(I), I=1,N)
      READ (5,101) (Y(I), I=1,N)
C
C     PRINT TITLE AND CALL LLSP TO PERFORM LINEAR LEAST SQUARES ANALYSIS
C
      WRITE (IPRT,102)
      CALL LLSP (Y, X, N, NDEG, RES, LDSTAK)
C
C     FORMAT STATEMENTS
C
  100 FORMAT (2I5)
  101 FORMAT (10F5.1)
  102 FORMAT ('1RESULTS OF STARPAC',
     *  ' LINEAR LEAST SQUARES SUBROUTINE LLSP')
      END

Data:

    9    2
  0.0  1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0
 12.0 10.5 10.0  8.0  7.0  8.0  7.5  8.5  9.0

1RESULTS OF STARPAC LINEAR LEAST SQUARES SUBROUTINE LLSP
STARPAC 2.08S (03/15/90)
+***********************************************************
* LINEAR LEAST SQUARES ESTIMATION WITH POLYNOMIAL MODEL *
***********************************************************

RESULTS FROM LEAST SQUARES FIT
-------------------------------

                        DEPENDENT  PREDICTED STD DEV OF                    STD
 ROW  PREDICTOR VALUES   VARIABLE      VALUE PRED VALUE        RESIDUAL    RES

   1                0.  12.000000  12.184848  .41999992      -.18484848   -.61
   2         1.0000000  10.500000  10.521212  .27284429  -.21212121E-01   -.05
   3         2.0000000  10.000000  9.2233766  .23159593       .77662338   1.68
   4         3.0000000  8.0000000  8.2913420  .24891687      -.29134199   -.64
   5         4.0000000  7.0000000  7.7251082  .26115476      -.72510823  -1.63
   6         5.0000000  8.0000000  7.5246753  .24891687       .47532468   1.05
   7         6.0000000  7.5000000  7.6900433  .23159593      -.19004329   -.41
   8         7.0000000  8.5000000  8.2212121  .27284429       .27878788    .64
   9         8.0000000  9.0000000  9.1181818  .41999992      -.11818182   -.39

1 STARPAC 2.08S (03/15/90)
+LINEAR LEAST SQUARES ESTIMATION WITH POLYNOMIAL MODEL, CONTINUED

[Four character plots: STD RES VS ROW NUMBER (row numbers 1.0 to 9.0);
STD RES VS PREDICTED VALUES (predicted values 7.525 to 12.18);
AUTOCORRELATION FUNCTION OF RESIDUALS (lags 1 to 26, horizontal axis
-1.00 to 1.00); NORMAL PROBABILITY PLOT OF STD RES (horizontal axis
-2.5 to 2.5). The standardized residual axes run from -3.75 to 3.75.]
STARPAC 2.08S (03/15/90)
+LINEAR LEAST SQUARES ESTIMATION WITH POLYNOMIAL MODEL, CONTINUED

ANALYSIS OF VARIANCE
-DEPENDENT ON ORDER VARIABLES ARE ENTERED, UNLESS VECTORS ARE ORTHOGONAL-

  PAR   SUM OF SQUARES                                                 ------ PAR=0 ------ ------ PARS=0 -----
INDEX   RED DUE TO PAR    CUM MS RED  DF(MSRED)    CUM RES MS  DF(RMS)          F  PROB(F)          F  PROB(F)

    1       720.027778    720.027778          1    2.59027778        8    2696.46     .000    922.687     .000
    2       8.81666667    364.422222          2    1.70079365        7    33.0178     .001    35.8017     .000
    3       10.3033911    246.382612          3    .267027417        6    38.5855     .001    38.5855     .001

RESIDUAL      1.602165                        6
TOTAL         740.7500                        9

STARPAC 2.08S (03/15/90)
+LINEAR LEAST SQUARES ESTIMATION WITH POLYNOMIAL MODEL, CONTINUED

VARIANCE-COVARIANCE AND CORRELATION MATRICES OF THE ESTIMATED PARAMETERS
------------------------------------------------------------------------

- COVARIANCES ARE ABOVE THE DIAGONAL
- VARIANCES ARE ON THE DIAGONAL
- CORRELATION COEFFICIENTS ARE BELOW THE DIAGONAL

COLUMN 1 2 3

1 .17639993 -.82535747E-01 .80917399E-02
2 -.80268762 .59936673E-01 -.69357771E-02
3 .65431992 -.96215765 .86697213E-03

   ------------------------- ESTIMATES FROM FIT ------------------------   ---- ESTIMATES FROM FIT OMITTING LAST PREDICTOR VALUE ----

     ESTIMATED PARAMETER       SD OF PAR  T(PAR=0)  PROB(T)  ACC DIG*       ESTIMATED PARAMETER       SD OF PAR  T(PAR=0)  PROB(T)

  1           12.1848485      .419999917     29.01     .000      14.1                10.4777778      .801574729     13.07     .000
  2          -1.84653680      .244819675    -7.542     .000      14.1               -.383333333      .168364369    -2.277     .057
  3           .182900433  .294443905E-01     6.212     .001      14.0

RESIDUAL STANDARD DEVIATION        .5167470                                            1.304145
BASED ON DEGREES OF FREEDOM        9 - 3 = 6                                           9 - 2 = 7

MULTIPLE CORRELATION COEFFICIENT SQUARED .9227

APPROXIMATE CONDITION NUMBER 118.0340

* THE NUMBER OF CORRECTLY COMPUTED DIGITS IN EACH PARAMETER USUALLY DIFFERS BY LESS THAN 1 FROM THE VALUE GIVEN HERE.

G. Acknowledgments

The code and printed output for the linear least squares subroutines have
been modeled on the linear least squares code and output used by OMNITAB II
[Hogben et al., 1971].

----- CHAPTER 9 -----

NONLINEAR LEAST SQUARES

A. Introduction

STARPAC contains 16 user-callable subroutines for nonlinear least squares
regression. Twelve of these are estimation subroutines that compute the least
squares solution as described below, performing either weighted or unweighted
regression with either numerically approximated or user-supplied (analytic)
derivatives. The estimation subroutines allow three levels of control of the
computations and printed output, and allow the user to specify a subset of the
parameters to be treated as constants, with their values held fixed at their
input values. This last feature allows the user to examine the results
obtained estimating various subsets of the parameters of a general model
without rewriting the model subroutine for each subset. The other four
subroutines described in this chapter are utility procedures which choose
optimum step sizes for numerically approximating the derivative and which
verify the correctness of user-supplied (analytic) derivatives.

Each of the subroutines described in this chapter assumes that the
observations of the dependent variable, y(i), are modeled by

y(i) = f(x(i),PAR) + e(i) for i = 1, ..., N,

where

N is the number of observations;

f is the function (nonlinear in its parameters) that models the ith
observation;

x(i) is the vector of the M independent variables at the ith observation;

PAR is the vector of the NPAR model parameters; and

e(i) is the unobservable random error in the ith observation, which is
estimated by the ith residual.

The least squares estimates of the parameters, PAR, are obtained using an
iterative procedure that requires the matrix of partial derivatives of the
model with respect to each parameter,

D(i,k) = partial [ f(x(i),PAR) wrt PAR(k) ]

for i = 1, ..., N and k = 1, ..., NPAR.

The derivative matrix may be supplied analytically or approximated
numerically.

The least squares solution is that which minimizes (with respect to PAR)
the residual sum of squares function,

                 N                                       N
     RSS(PAR) = SUM wt(i)*(y(i) - f(x(i),PAR))**2   =   SUM wt(i)*e(i)**2
                i=1                                     i=1

where

wt(i) is the weight assigned to the ith observation (wt(i) = 1.0 in the
"unweighted" case). Appendix B discusses several common applications
for weighted least squares.

The user must supply both initial values for the parameters and the
subroutine NLSMDL (described in section D) used to compute f(x(i),PAR), i = 1,
..., N, i.e., the predicted values of the dependent variable given the
independent variables and the parameter values from each iteration. Initial
parameter values should be chosen with care, since good values can
significantly reduce computing time.

STARPAC provides a variety of subroutines to accommodate many levels of
user sophistication and problem difficulty. Users are directed to section B
for a brief description of the subroutines. The declaration and CALL
statements are given in section C, and the subroutine arguments are defined in
section D. The algorithms used and the output produced by these subroutines
are discussed in section E. Sample programs and their output are shown in
section F.

B. Subroutine Descriptions

B.1 Nonlinear Least Squares Estimation Subroutines

The simplest of the 12 nonlinear least squares estimation subroutines,
NLS, requires neither user-supplied weights nor analytic derivatives. The
estimated results and a variety of statistics are automatically summarized in
a five-part printed report, and the estimated parameters and residuals are
returned to the user via the subroutine argument list (level one control,
described below). Most nonlinear least squares problems can be solved using
NLS.

The other 11 estimation subroutines add the weighting, derivative and
level two and three control features both singly and in combination, providing
greater flexibility to the user at the price of less simplicity. These
features are indicated by the suffix letter(s) on the subroutine name (e.g.,
NLSS and NLSWDC).

* Suffix W indicates user-supplied weights.

* Suffix D indicates user-supplied (analytic) derivatives.

* Suffix C indicates level two control of the computations.

* Suffix S indicates level three control of the computations.
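
For example, reading the suffix letters of three of the names listed in
section C under this scheme:

     NLSW    = NLS + W          weighted, numerically approximated
                                derivatives, level one control
     NLSDC   = NLS + D + C      unweighted, user-supplied derivatives,
                                level two control
     NLSWDS  = NLS + W + D + S  weighted, user-supplied derivatives,
                                level three control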

The three levels of computation and printed output control are as
follows.

* In level one, a five-part printed report, discussed in detail in
section E.2.a, is automatically provided and the estimated model
parameters and residuals are returned to the user via the argument
list.

* Level two also returns the estimated parameters and residuals, and,
in addition, allows the user to supply arguments to indicate
- a subset of the model parameters to be treated as constants, with
their values held fixed at their input values;
- either the step sizes used to compute the numerical approximations
to the derivative, or, when user-supplied analytic derivatives are
used, whether they will be checked;
- the maximum number of iterations allowed;
- the convergence criteria;
- the scale (i.e., the typical size) of each parameter;
- the maximum change allowed in the parameters at the first
iteration;
- how the variance-covariance matrix is to be approximated; and
- the amount of printed output desired.

* Level three has all the features of level two, and, in addition,
returns the following estimated values via the argument list:
- the number of nonzero weighted observations (only when a weighted
analysis is performed);
- the number of parameters actually estimated;
- the residual standard deviation;
- the predicted values;
- the standard deviations of the predicted values;
- the standardized residuals; and
- the variance-covariance matrix of the estimated parameters.

B.2 Derivative Step Size Selection Subroutines

When the partial derivatives used in the nonlinear least squares solution
are not available analytically, STARPAC subroutines approximate them
numerically. In this case, the subroutines can select optimum step sizes for
approximating the derivatives [see section E.1.b]. The user also has the
option of computing these step sizes independently of the estimation process
by calling either of the two step size selection subroutines directly. For
example, when planning to use the parameter fixing capability [argument
IFIXED] to examine several subsets of the parameters of a general model,
computing the step sizes first and passing them to the estimation subroutine
is more efficient than recomputing them each time the estimation subroutine is
called.

The simpler of the two user-callable step size selection subroutines,
STPLS, summarizes the step size selection information for each parameter in a
printed report and returns the step sizes to the user via the subroutine
argument list.

The second step size selection subroutine, STPLSC, differs from STPLS
only in that it enables the user to supply arguments to indicate
- the number of reliable digits in the model results;
- the number of exemptions allowed by the acceptance criteria,
specified as a proportion of the total number of observations (see
section E.1.b);
- the scale (i.e., the typical size) of each parameter; and
- the amount of printed output desired.

B.3 Derivative Checking Subroutines

When the partial derivatives used in the nonlinear least squares solution
are available analytically, the user can code them for use by the estimation
subroutines [see section D, argument NLSDRV]. Because coding errors are a
common problem with user-supplied derivatives, the STARPAC estimation
subroutines automatically check the validity of the user-supplied derivative
code by comparing its results to numerically approximated values for the
derivative. When the results are questionable, the checking procedure
attempts to determine whether the problem lies with the user's code or with
the accuracy of the numerical approximation [see section E.1.c]. Although the
checking procedure is automatically available to the estimation subroutines
which accept user-supplied derivatives, the user may want to check the
derivative code independently of the estimation process. In these cases, the
user can call either of the two derivative checking subroutines directly, and
suppress checking by the estimation subroutines [see section D, argument
IDRVCK].

The simpler of the two derivative checking subroutines, DCKLS,
summarizes the results of the check in a printed report.

The second derivative checking subroutine, DCKLSC, differs from
DCKLS only in that it enables the user to supply arguments to indicate
- the number of reliable digits in the model results;
- the agreement tolerance;
- the scale (i.e., the typical size) of each parameter;
- the row at which the derivative is to be checked; and
- the amount of printed output desired.

C. Subroutine Declaration and CALL Statements

Nonlinear Least Squares Estimation Subroutines

The <basic declaration block> identifies declaration statements that are
needed by all of the nonlinear least squares estimation subroutines. The user
should substitute the following four statements for each occurrence of
<basic declaration block> given below.

<real> Y(n), XM(n,m), PAR(npar), RES(n)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
EXTERNAL NLSMDL

===

NLS: Compute and print a five-part unweighted nonlinear least squares
analysis with numerically approximated derivatives; return parameter
estimates and residuals

<basic declaration block>
:
:
CALL NLS (Y, XM, N, M, IXM, NLSMDL,
+ PAR, NPAR, RES, LDSTAK)

===

NLSC: Compute and optionally print a five-part unweighted nonlinear least
squares analysis with numerically approximated derivatives using
user-supplied control values; return parameter estimates and
residuals

<basic declaration block>
INTEGER IFIXED(npar)
<real> STP(npar), STOPSS, STOPP, SCALE(npar), DELTA
:
:
CALL NLSC (Y, XM, N, M, IXM, NLSMDL,
+ PAR, NPAR, RES, LDSTAK,
+ IFIXED, STP, MIT, STOPSS, STOPP,
+ SCALE, DELTA, IVAPRX, NPRT)

===

NLSS: Compute and optionally print a five-part unweighted nonlinear least
squares analysis with numerically approximated derivatives using
user-supplied control values; return parameter estimates, residuals,
number of parameters estimated, residual standard deviation,
predicted values, standard deviations of the predicted values and
variance-covariance matrix of the estimated parameters

<basic declaration block>
INTEGER IFIXED(npar)
<real> STP(npar), STOPSS, STOPP, SCALE(npar), DELTA
<real> RSD, PV(n), SDPV(n), SDRES(n), VCV(npare,npare)
:
:
CALL NLSS (Y, XM, N, M, IXM, NLSMDL,
+ PAR, NPAR, RES, LDSTAK,
+ IFIXED, STP, MIT, STOPSS, STOPP,
+ SCALE, DELTA, IVAPRX, NPRT,
+ NPARE, RSD, PV, SDPV, SDRES, VCV, IVCV)

===

NLSW: Compute and print a five-part weighted nonlinear least squares
analysis with numerically approximated derivatives; return parameter
estimates and residuals

<basic declaration block>
<real> WT(n)
:
:
CALL NLSW (Y, WT, XM, N, M, IXM, NLSMDL,
+ PAR, NPAR, RES, LDSTAK)

===

NLSWC: Compute and optionally print a five-part weighted nonlinear least
squares analysis with numerically approximated derivatives using
user-supplied control values; return parameter estimates and residuals

<basic declaration block>
INTEGER IFIXED(npar)
<real> WT(n)
<real> STP(npar), STOPSS, STOPP, SCALE(npar), DELTA
:
:
CALL NLSWC (Y, WT, XM, N, M, IXM, NLSMDL,
+ PAR, NPAR, RES, LDSTAK,
+ IFIXED, STP, MIT, STOPSS, STOPP,
+ SCALE, DELTA, IVAPRX, NPRT)

===

NLSWS: Compute and optionally print a five-part weighted nonlinear least
squares analysis with numerically approximated derivatives using
user-supplied control values; return parameter estimates, residuals,
number of nonzero weights, number of parameters estimated, residual
standard deviation, predicted values, standard deviations of the
predicted values and variance-covariance matrix of the estimated
parameters

<basic declaration block>
INTEGER IFIXED(npar)
<real> WT(n)
<real> STP(npar), STOPSS, STOPP, SCALE(npar), DELTA
<real> RSD, PV(n), SDPV(n), SDRES(n), VCV(npare,npare)
:
:
CALL NLSWS (Y, WT, XM, N, M, IXM, NLSMDL,
+ PAR, NPAR, RES, LDSTAK,
+ IFIXED, STP, MIT, STOPSS, STOPP,
+ SCALE, DELTA, IVAPRX, NPRT,
+ NNZW, NPARE, RSD, PV, SDPV, SDRES, VCV, IVCV)

===

NLSD: Compute and print a five-part unweighted nonlinear least squares
analysis with user-supplied derivatives; return parameter estimates
and residuals

<basic declaration block>
EXTERNAL NLSDRV
:
:
CALL NLSD (Y, XM, N, M, IXM, NLSMDL, NLSDRV,
+ PAR, NPAR, RES, LDSTAK)

===

NLSDC: Compute and optionally print a five-part unweighted nonlinear least
squares analysis with user-supplied derivatives using user-supplied
control values; return parameter estimates and residuals

<basic declaration block>
EXTERNAL NLSDRV
INTEGER IFIXED(npar)
<real> STOPSS, STOPP, SCALE(npar), DELTA
:
:
CALL NLSDC (Y, XM, N, M, IXM, NLSMDL, NLSDRV,
+ PAR, NPAR, RES, LDSTAK,
+ IFIXED, IDRVCK, MIT, STOPSS, STOPP,
+ SCALE, DELTA, IVAPRX, NPRT)

===

NLSDS: Compute and optionally print a five-part unweighted nonlinear least
squares analysis with user-supplied derivatives using user-supplied
control values; return parameter estimates, residuals, number of
parameters estimated, residual standard deviation, predicted values,
standard deviations of the predicted values and variance-covariance
matrix of the estimated parameters

<basic declaration block>
EXTERNAL NLSDRV
INTEGER IFIXED(npar)
<real> STOPSS, STOPP, SCALE(npar), DELTA
<real> RSD, PV(n), SDPV(n), SDRES(n), VCV(npare,npare)
:
:
CALL NLSDS (Y, XM, N, M, IXM, NLSMDL, NLSDRV,
+ PAR, NPAR, RES, LDSTAK,
+ IFIXED, IDRVCK, MIT, STOPSS, STOPP,
+ SCALE, DELTA, IVAPRX, NPRT,
+ NPARE, RSD, PV, SDPV, SDRES, VCV, IVCV)

===

NLSWD: Compute and print a five-part weighted nonlinear least squares
analysis with user-supplied derivatives; return parameter estimates
and residuals

EXTERNAL NLSDRV
<real> WT(n)
:
:
CALL NLSWD (Y, WT, XM, N, M, IXM, NLSMDL, NLSDRV,
+ PAR, NPAR, RES, LDSTAK)

===

NLSWDC: Compute and optionally print a five-part weighted nonlinear least
squares analysis with user-supplied derivatives using user-supplied
control values; return parameter estimates and residuals

EXTERNAL NLSDRV
INTEGER IFIXED(npar)
<real> WT(n)
<real> STOPSS, STOPP, SCALE(npar), DELTA
:
:
CALL NLSWDC (Y, WT, XM, N, M, IXM, NLSMDL, NLSDRV,
+ PAR, NPAR, RES, LDSTAK,
+ IFIXED, IDRVCK, MIT, STOPSS, STOPP,
+ SCALE, DELTA, IVAPRX, NPRT)

===

NLSWDS: Compute and optionally print a five-part weighted nonlinear least
squares analysis with user-supplied derivatives using user-supplied
control values; return parameter estimates, residuals, number of
nonzero weights, number of parameters estimated, residual standard
deviation, predicted values, standard deviations of the predicted
values and variance-covariance matrix of the estimated parameters

EXTERNAL NLSDRV
INTEGER IFIXED(npar)
<real> WT(n)
<real> STOPSS, STOPP, SCALE(npar), DELTA
<real> RSD, PV(n), SDPV(n), SDRES(n), VCV(npare,npare)
:
:
CALL NLSWDS (Y, WT, XM, N, M, IXM, NLSMDL, NLSDRV,
+ PAR, NPAR, RES, LDSTAK,
+ IFIXED, IDRVCK, MIT, STOPSS, STOPP,
+ SCALE, DELTA, IVAPRX, NPRT,
+ NNZW, NPARE, RSD, PV, SDPV, SDRES, VCV, IVCV)

===

Step Size Selection Subroutines

STPLS: Compute and print optimum step sizes for numerically approximating
derivatives; return selected step sizes

<real> XM(n,m), PAR(npar), STP(npar)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
EXTERNAL NLSMDL
:
:
CALL STPLS (XM, N, M, IXM, NLSMDL, PAR, NPAR, LDSTAK, STP)

===

STPLSC: Compute and optionally print optimum step sizes for numerically
approximating derivatives using user-supplied control values; return
selected step sizes

<real> XM(n,m), PAR(npar), STP(npar)
<real> EXMPT, SCALE(npar)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
EXTERNAL NLSMDL
:
:
CALL STPLSC (XM, N, M, IXM, NLSMDL, PAR, NPAR, LDSTAK, STP,
+ NETA, EXMPT, SCALE, NPRT)

===

Derivative Checking Subroutines

DCKLS: Perform and print derivative checking analysis; return error code

<real> XM(n,m), PAR(npar)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
EXTERNAL NLSMDL, NLSDRV
:
:
CALL DCKLS (XM, N, M, IXM, NLSMDL, NLSDRV, PAR, NPAR, LDSTAK)

===

DCKLSC: Perform and optionally print derivative checking analysis using
user-supplied control values; return error code

<real> XM(n,m), PAR(npar)
<real> SCALE(npar)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
EXTERNAL NLSMDL, NLSDRV
:
:
CALL DCKLSC (XM, N, M, IXM, NLSMDL, NLSDRV, PAR, NPAR, LDSTAK,
+ NETA, NTAU, SCALE, NROW, NPRT)

===

D. Dictionary of Subroutine Arguments and COMMON Variables

D <-- The matrix of exact dimension N by NPAR that contains the partial
derivatives of the model with respect to each parameter,
PAR(k), k = 1, ..., NPAR. This argument is used within derivative
subroutine NLSDRV [see argument NLSDRV below].

DELTA --> The maximum scaled change allowed in the parameters at the first
iteration [see section E.1.a]. The default value is 100.0. When
DELTA <= 0.0 or when DELTA is not an argument of the subroutine
CALL statement the default value is used. A smaller value of
DELTA may be appropriate if, at the first iteration, the
computation of the predicted values from the user's model subroutine
produces an arithmetic overflow or the parameters leave the
region of interest in parameter space. A reasonable alternative
to the default value of DELTA is an upper bound to the scaled
change that the estimated parameters should be allowed to make on
the first iteration,

DELTA = min(|del(PAR(k))|/SCALE(k), for k = 1, ..., NPAR)

where del(PAR(k)) is the maximum change allowed for the kth
parameter at the first iteration.
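
As a rough illustration of the formula above, the following free-form
Fortran sketch computes DELTA from per-parameter limits. The array name
DELMAX and all numerical values are hypothetical and are not part of
STARPAC.

    program delta_sketch
      implicit none
      integer, parameter :: npar = 2
      ! Hypothetical typical sizes of the parameters (SCALE) and the
      ! largest change allowed in each at the first iteration (DELMAX).
      double precision :: scale(npar)  = (/ 10.0d0, 0.5d0 /)
      double precision :: delmax(npar) = (/ 50.0d0, 1.0d0 /)
      double precision :: delta
      ! DELTA = min(|del(PAR(k))|/SCALE(k), for k = 1, ..., NPAR)
      delta = minval(abs(delmax)/scale)
      print *, 'DELTA =', delta
    end program delta_sketch

Here the scaled limits are 5.0 and 2.0, so DELTA = 2.0.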

EXMPT --> The proportion used to compute the number of observations,
a = EXMPT*N, for which the forward difference quotient derivative
with respect to a given parameter is exempted from meeting the
acceptance criteria for step size selection [see section E.1.b].
The default value for EXMPT is 0.1 (10 percent). When the
user-supplied value is outside the range [0.0, 1.0], or when EXMPT is
not an argument of the subroutine CALL statement, the default
value is used.

IDRVCK --> The indicator variable used to designate whether or not the
user-supplied derivative subroutine is to be checked. When
IDRVCK <> 0 the derivative is checked, and when IDRVCK = 0 it is
not. The default value is IDRVCK <> 0. When IDRVCK is not an
argument of the subroutine CALL statement the default value is
used.

IERR ... An error flag returned in COMMON /ERRCHK/ [see chapter 1,
section D.5].

For the estimation subroutines:

IERR = 0 indicates that no errors were detected, and that the
iterations converged satisfactorily.

IERR = 1 indicates that improper input was detected.

IERR = 2 indicates that the computation of the residual sum of
squares using the initial parameter values produced an
arithmetic overflow. The user should reduce the size
of DELTA or should supply new starting values.

IERR = 3 indicates that the model is computationally singular,
which means the model has too many parameters near the
solution. The user should examine the model and data
to determine and remove the cause of the singularity.

IERR = 4 indicates that at least one of the standardized
residuals could not be computed because its standard
deviation was zero. The validity of the covariance
matrix is questionable.

IERR = 5 indicates false convergence [see section E.1.a].

IERR = 6 indicates that convergence was not reached in the
allowed number of iterations or model subroutine calls
[see argument MIT].

IERR = 7 indicates that the variance-covariance matrix could not
be computed.

For the step size selection subroutines:

IERR = 0 indicates that no errors were detected, and that all
the step sizes satisfied the selection criteria.

IERR = 1 indicates that improper input was detected.

IERR = 2 indicates that one or more of the step sizes did not
satisfy the selection criteria.

For the derivative checking subroutines:

IERR = 0 indicates that no errors were detected, and that the
user-supplied derivative code appears to be correct.

IERR = 1 indicates that improper input was detected.

IERR = 2 indicates that the user-supplied derivative code and
numerical derivatives do not agree for at least one
parameter, but that in each case of disagreement the
accuracy of the numerical derivatives is questionable.
Further testing is suggested.

IERR = 3 indicates that the user-supplied derivative code and
numerical derivatives do not agree for at least one
parameter, and in at least one instance of disagreement
there is no reason to doubt the numerical derivatives.

IFIXED --> The vector of dimension at least NPAR that contains values used to
indicate whether the corresponding parameter in PAR is to be
treated as a fixed constant or is to be estimated. If
IFIXED(I) > 0, PAR(I) will be held fixed at its input value; if
IFIXED(I) = 0, PAR(I) will be estimated using the least squares
procedure described in section A. The default values are
IFIXED(I) = 0, I = 1, ..., NPAR, i.e., all parameters are
estimated. When IFIXED(1) <= -1, or when IFIXED is not an
argument of the subroutine CALL statement, the default value will
be used.
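
A minimal free-form Fortran sketch of how IFIXED might be set for a
hypothetical three-parameter model in which the second parameter is held
fixed. The count of zero elements corresponds to the value described
under argument NPARE; the program and its values are illustrative only.

    program ifixed_sketch
      implicit none
      integer, parameter :: npar = 3
      integer :: ifixed(npar), npare
      ! Hold PAR(2) fixed at its input value; estimate PAR(1) and PAR(3).
      ifixed = (/ 0, 1, 0 /)
      ! The number of parameters estimated is the number of zero
      ! elements in IFIXED.
      npare = count(ifixed == 0)
      print *, 'NPARE =', npare
    end program ifixed_sketch

With these values two parameters are estimated, so NPARE = 2.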

IVCV --> The exact value of the first dimension of the matrix VCV as
specified in the calling program.

IVAPRX --> The indicator variable used to specify how the variance-covariance
matrix, VCV, is to be approximated. Three approximations are
available:

(1) VCV = RSD**2 * inv(trans(Dhat)*W*Dhat)

(2) VCV = RSD**2 * inv(Hhat)

(3) VCV = RSD**2 * inv(Hhat) * (trans(Dhat)*W*Dhat) * inv(Hhat)

where

trans(.) indicates the transpose of the designated matrix;

inv(.) indicates the inverse of the designated matrix;

Hhat is the matrix of second partial derivatives of the model with
respect to each parameter (the Hessian matrix), evaluated at the
solution

= trans(Dhat)*W*Dhat +
SUM(i = 1, ..., N) [e(i)*wt(i)*(second partial of e(i) wrt PAR(j) & PAR(k))]
for j = 1, ..., NPAR & k = 1, ..., NPAR;

W is an N by N diagonal matrix of weights,

W = diag(wt(i), i = 1, ..., N),

when a weighted analysis is performed, and is the identity matrix
otherwise, and

Dhat is the matrix that contains the partial derivatives of the
model with respect to each parameter (the Jacobian matrix),
evaluated at the solution.

Approximation (1) is based on the assumption that H is
approximately equal to trans(D)*W*D because the residuals are
sufficiently small at the solution; approximation (2) is based on
the assumption that the necessary conditions for asymptotic
maximum likelihood theory have been met; and approximation (3) is
based on the assumption that the necessary conditions for
asymptotic maximum likelihood theory may be violated. The results
of a study by Donaldson and Schnabel [1987] indicate that
approximation (1) is preferable because it is simple, less
expensive, more numerically stable and at least as accurate as
approximations (2) and (3). However, all approximations to the
variance-covariance matrix are subject to sampling variation
because they are computed using the estimated parameter values.
The variance-covariance matrix computed for any particular
nonlinear least squares solution should thus be regarded as only a
rough estimate [Bard, 1974; Donaldson and Schnabel, 1987].

If IVAPRX = 1 or 4 then approximation (1) is used;
if IVAPRX = 2 or 5 then approximation (2) is used; and
if IVAPRX = 3 or 6 then approximation (3) is used.

If IVAPRX = 1, 2, or 3, then, when user-supplied analytic
derivatives are available [see argument NLSDRV], they are used to
compute VCV; if IVAPRX = 4, 5, or 6, then only the predicted
values from the model subroutine are used to compute VCV. When
analytic derivatives are available, options 1, 2, or 3 will
generally result in a faster, more accurate computation of VCV.

The default value for IVAPRX is 1. When argument IVAPRX is
outside the range [1, 6], or when IVAPRX is not an argument of the
subroutine CALL statement, then the default value will be used.

IXM --> The exact value of the first dimension of the matrix XM as
specified in the calling program.

LDSTAK --> The length of the DOUBLE PRECISION workspace vector DSTAK. LDSTAK
must equal or exceed the appropriate value given below. In the
following specifications of the limiting values for LDSTAK,
P = 0.5 if the single precision version of STARPAC is being used
and P = 1.0 otherwise.

For NLS, NLSC, NLSS, NLSW, NLSWC and NLSWS:

LDSTAK >= 27 + max(IS*(N+NPAR), 30+NPARE) +

max(IS*10*N, 94+N*(3+NPAR)+(3*NPARE**2+37*NPARE)/2)*P

with IS = 1 if default values are used for the derivative step
sizes, and IS = 0 otherwise.

For NLSD, NLSDC, NLSDS, NLSWD, NLSWDC and NLSWDS:

LDSTAK >= 45 + NPAR + (94+N*(3+NPAR)+(3*NPARE**2+35*NPARE)/2)*P

For STPLS and STPLSC:

LDSTAK >= 27 + (N+NPAR) + 10*N*P

For DCKLS and DCKLSC:

LDSTAK >= 14 + NPAR + (N*NPAR+N+NPAR)*P
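
The first of the bounds above can be evaluated as in the following
free-form Fortran sketch. The problem sizes are hypothetical, and P is
taken to be 1.0, which is an assumption of this sketch (the double
precision case); CEILING is used so that the bound stays an integer for
other values of P.

    program ldstak_sketch
      implicit none
      integer :: n, npar, npare, is, ldstak
      double precision :: p
      ! Hypothetical problem: 50 observations, 2 parameters, both estimated.
      n = 50
      npar = 2
      npare = 2
      is = 1        ! default derivative step sizes are used
      p = 1.0d0     ! assumed value of P
      ! Bound for NLS, NLSC, NLSS, NLSW, NLSWC and NLSWS.
      ldstak = 27 + max(is*(n+npar), 30+npare) + &
               ceiling(max(is*10*n, &
                           94+n*(3+npar)+(3*npare**2+37*npare)/2)*p)
      print *, 'LDSTAK >=', ldstak
    end program ldstak_sketch

For these values the two maxima are 52 and 500, giving LDSTAK >= 579.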

M --> The number of independent variables, i.e., the number of columns
of data in XM.

MIT --> The maximum number of iterations allowed. This argument is also
used to compute the maximum number of model subroutine calls,
(2*MIT). The iterations will stop if either limit is reached,
although, as a rule, the maximum number of iterations will be
reached first. The default value for the maximum number of
iterations is 21. When MIT <= 0 or when MIT is not an argument of
the subroutine CALL statement the default value will be used.

N --> The number of observations.

NETA --> The number of reliable decimal digits in the predicted values (PV)
computed by the user's model subroutine. The default value for
NETA is experimentally determined by the procedure described in
Appendix C. The default value will be used when NETA is not an
argument in the subroutine CALL statement, or when the
user-supplied value of NETA is outside the range [1, DIGITS],
where DIGITS is the number of decimal digits carried by the user's
computer for a single precision value when the single precision
version of STARPAC is being used and is the number carried for a
double precision value otherwise.

NLSDRV *** The name of the user-supplied subroutine that computes the partial
derivative matrix (Jacobian). This argument must be listed in an
EXTERNAL statement in the program which calls the STARPAC
estimation or derivative checking subroutine. The form of the
derivative subroutine argument list and dimensioning statements
must be exactly as shown below, although if there is only one
independent variable (M = 1), XM may be declared to be a vector
with dimension IXM.

SUBROUTINE NLSDRV (PAR, NPAR, XM, N, M, IXM, D)
<real> PAR(NPAR), XM(IXM,M), D(N,NPAR)

< Computations for D(I,J), I = 1, ..., N and J = 1, ..., NPAR >

RETURN
END

NLSMDL *** The name of the user-supplied subroutine that computes the
predicted value of the dependent variable given the independent
variables and the current values of the model parameters. This
argument must be listed in an EXTERNAL statement in the program
which calls the STARPAC estimation, step size selection, and/or
derivative checking subroutines. The form of the model subroutine
argument list and dimensioning statements must be exactly as shown
below, although if there is only one independent variable (M = 1),
XM may be declared to be a vector with dimension IXM.

SUBROUTINE NLSMDL (PAR, NPAR, XM, N, M, IXM, PV)
<real> PAR(NPAR), XM(IXM,M), PV(N)

< Computations for PV(I), I = 1, ..., N >

RETURN
END
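
As an illustration of the two templates (NLSDRV and NLSMDL), the sketch
below fills them in for a hypothetical exponential decay model,
PV(I) = PAR(1)*EXP(-PAR(2)*XM(I,1)). The model, the parameter values and
the small driver program DEMO are assumptions made for this example;
DOUBLE PRECISION assumes the double precision version of STARPAC, and
the source is written in free form. The driver only calls the two
user subroutines directly and does not call STARPAC.

    PROGRAM DEMO
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 3, M = 1, NPAR = 2
      DOUBLE PRECISION PAR(NPAR), XM(N,M), PV(N), D(N,NPAR)
      EXTERNAL NLSMDL, NLSDRV
      PAR = (/ 2.0D0, 0.5D0 /)
      XM(:,1) = (/ 0.0D0, 1.0D0, 2.0D0 /)
      ! IXM is the first dimension of XM as declared above, i.e. N.
      CALL NLSMDL (PAR, NPAR, XM, N, M, N, PV)
      CALL NLSDRV (PAR, NPAR, XM, N, M, N, D)
      PRINT *, 'PV     =', PV
      PRINT *, 'D(:,1) =', D(:,1)
      PRINT *, 'D(:,2) =', D(:,2)
    END PROGRAM DEMO

    ! Predicted values of the hypothetical model.
    SUBROUTINE NLSMDL (PAR, NPAR, XM, N, M, IXM, PV)
      IMPLICIT NONE
      INTEGER NPAR, N, M, IXM, I
      DOUBLE PRECISION PAR(NPAR), XM(IXM,M), PV(N)
      DO I = 1, N
        PV(I) = PAR(1)*EXP(-PAR(2)*XM(I,1))
      END DO
      RETURN
    END

    ! Partial derivatives of the model with respect to PAR(1) and PAR(2).
    SUBROUTINE NLSDRV (PAR, NPAR, XM, N, M, IXM, D)
      IMPLICIT NONE
      INTEGER NPAR, N, M, IXM, I
      DOUBLE PRECISION PAR(NPAR), XM(IXM,M), D(N,NPAR)
      DO I = 1, N
        D(I,1) = EXP(-PAR(2)*XM(I,1))
        D(I,2) = -PAR(1)*XM(I,1)*EXP(-PAR(2)*XM(I,1))
      END DO
      RETURN
    END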

NNZW <-- The number of observations with nonzero weights. N.B., this value
is returned by the estimation subroutines.

NPAR --> The number of parameters in the model, including both those held
fixed at their starting values and those which are to be
estimated.

NPARE <-- The number of parameters actually estimated, i.e., the number of
zero elements in IFIXED. N.B., this value is returned by the
estimation subroutines.

NPRT --> The argument controlling printed output.

For the estimation subroutines:

NPRT is a five-digit integer, in which the value of the Ith
digit (counting from left to right) is used to control the Ith
section of the output.

If the Ith digit = 0 the output from the Ith section is suppressed;
if the Ith digit = 1 the brief form of the Ith section is given;
if the Ith digit >= 2 the full form of the Ith section is given.

The default value for NPRT is 11112. When NPRT <= -1, or when
NPRT is not an argument in the subroutine CALL statement, the
default value will be used. If the convergence criteria are not
satisfied the subroutine gives a suitable warning and provides a
printed report even if NPRT = 0. A full discussion of the
printed output is given in section E.2.a and is summarized as
follows.

Section 1 lists the starting estimates and control values.
Brief output and full output are the same for this
section.

Section 2 reports the results of the iterations. Brief output
includes information only about the first and last
iteration while full output includes information about
all of the iterations.

Section 3 provides information for each observation based on the
final solution. Brief output includes information for
the first 40 observations while full output provides
the information for all of the data.

Section 4 is a set of four residual plots. Brief output and
full output are the same for this section.

Section 5 is the final summary of the estimated parameters.
Brief output does not include printing the
variance-covariance matrix while full output does.

For the step size selection and derivative checking subroutines:

If NPRT = 0 the printed output is suppressed.

If NPRT <> 0 the printed output is provided.

When the acceptance criteria are not met a printed report is
provided even if NPRT = 0.
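
The five-digit convention used by the estimation subroutines can be
illustrated with the free-form Fortran sketch below, which splits a
hypothetical NPRT value into its digits. This only demonstrates the
digit convention; it is not the code STARPAC uses internally.

    program nprt_sketch
      implicit none
      integer :: nprt, i, digit
      nprt = 10012     ! hypothetical value
      do i = 1, 5
        ! Ith digit, counting from left to right
        digit = mod(nprt/10**(5-i), 10)
        if (digit == 0) then
          print *, 'section', i, ': suppressed'
        else if (digit == 1) then
          print *, 'section', i, ': brief form'
        else
          print *, 'section', i, ': full form'
        end if
      end do
    end program nprt_sketch

For NPRT = 10012 the digits are 1, 0, 0, 1 and 2, so sections 1 and 4
are given in brief form, sections 2 and 3 are suppressed, and section 5
is given in full.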

NROW --> The row of the independent variable matrix at which the
user-supplied derivative code is to be checked. The default value
is the first row with no independent variables equal to zero; when
all rows have one or more independent variables equal to zero, row
one will be used for the default value. When the user-supplied
value is outside the range [1, N] or when NROW is not an argument
of the subroutine CALL statement the default value will be used.

NTAU --> The agreement tolerance, i.e., the number of digits of agreement
required between the user-supplied derivatives and the derivatives
numerically approximated by the derivative checking subroutine.
The default value is NETA/4. When the user-supplied value of NTAU
is outside the range [1, NETA/2] or when NTAU is not an argument
of the subroutine CALL statement the default value will be used.

PAR --- The vector of dimension at least NPAR that contains the parameter
values. For all estimation subroutines it must contain initial
values for the parameters on input and will contain the final
values on return. For the step size and derivative checking
subroutines it must contain the parameter values at which the
operations are to be performed.

PV <-- The vector of dimension at least N that contains the predicted
values of the dependent variable at the solution,

PV(i) = f(x(i),PAR) for i = 1, ..., N.

RES <-- The vector of dimension at least N that contains the residuals at
the solution,

RES(i) = y(i) - f(x(i),PAR) = e(i) for i = 1, ..., N.

RSD <-- The residual standard deviation at the solution,

RSD = sqrt(RSS(PAR)/(NNZW-NPARE)).

SCALE --> The vector of dimension at least NPAR that contains the scale, or
typical size, of each parameter. The vector SCALE is used to
normalize the size of each parameter so that

|PAR(j)/SCALE(j)| approximates |PAR(k)/SCALE(k)|

for k = 1, ..., NPAR and j = 1, ..., NPAR.

Values of |SCALE(k)| > |PAR(k)| can be used to increase the step
size in cases where the model function is known to be insensitive
to small changes in the value PAR(k).

For the estimation subroutines:

The default values for SCALE are selected by the NL2SOL
algorithm [Dennis et al., 1981a,b] and are updated at each
iteration. When SCALE is not an argument in the subroutine CALL
statement or when the user-supplied value for SCALE(1) <= 0 the
default procedure will be used to select scale values. When
SCALE(1) > 0, values of SCALE(k) <= 0 for k = 2, ..., NPAR will
be interpreted as an input error. User-supplied scale values
may be either a vector of the typical size of each parameter or
a vector of ones if the typical sizes of the parameters are
roughly equal; user-supplied scale values can sometimes result
in reduced computing time since these values are not updated at
each iteration.

For the derivative checking and step size selection subroutines:

The default values of SCALE are defined for k = 1, ..., NPAR as:

SCALE(k) = 1.0 if PAR(k) = 0.0

SCALE(k) = |PAR(k)| otherwise

where PAR(k) is the input value of the k-th parameter.

When SCALE is not an argument in the subroutine CALL statement
or when the user-supplied value of |SCALE(k)| <= |PAR(k)| the
default value for SCALE(k) is used. When SCALE(1) <= 0, the
default values will be used for SCALE(k), k = 1, ..., NPAR.
When SCALE(1) > 0, values of SCALE(k) <= 0 for k = 2, ..., NPAR
will be interpreted as an input error.

SDPV <-- The vector of dimension at least N that contains an approximation
to the standard deviation of each predicted value at the
solution,

SDPV(i) = sqrt(the ith diagonal element of Dhat*VCV*trans(Dhat))

for i = 1, ..., N, where

Dhat(i,j) = partial [ f(x(i),PAR) wrt PAR(j) ]

for i = 1, ..., N and j = 1, ..., NPAR, evaluated at the solution,
and trans(Dhat) is the transpose of Dhat.

This approximation is based on a linearization of the model in
the neighborhood of the solution; the validity of the
approximation depends on the nonlinearity of the model. This
approximation may be extremely inaccurate for a problem with a
highly nonlinear model.

SDRES <-- The vector of dimension at least N that contains an approximation
to the standardized residuals at the solution,

SDRES(i) = RES(i)/sqrt[(RSD**2/WT(i)) - SDPV(i)**2]

for i = 1, ..., N, which is the ith residual divided by its
individual estimated standard deviation. This approximation is
based on a linearization of the model in the neighborhood of the
solution; the validity of the approximation depends on the
nonlinearity of the model. This approximation may be extremely
inaccurate for a problem with a highly nonlinear model.

STOPP --> The stopping value for the convergence test based on the maximum
scaled relative change in the parameters at the most recent
iteration. The convergence criterion is satisfied if the current
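step is a Newton step and

max(|PARc(k)-PARp(k)|/SCALE(k)) / max((|PARc(k)|+|PARp(k)|)/SCALE(k)) < STOPP,

with both maxima taken over k = 1, ..., NPAR,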

where PARc(k) and PARp(k) indicate the current value and the value
from the previous iteration, respectively, of the kth parameter
[see Dennis et al. 1981a]. This convergence test is roughly
equivalent to the test based on the maximum relative change in
each parameter as measured by

max(|PARc(k)-PARp(k)|/|PARp(k)| for k = 1, ..., NPAR).

STOPP is not a scale-dependent value; if its value is 10**(-4)
then this criterion will be met when the first four digits of each
parameter are the same at two successive iterations regardless of
the size of the parameter values.

The default value is approximately 10**(-DIGITS/2), where DIGITS
is the number of decimal digits carried by the user's computer for
a single precision value when the single precision version of
STARPAC is being used and is the number carried for a double
precision value otherwise. When the user-supplied value for STOPP
is outside the interval [0.0, 1.0] or when STOPP is not an
argument of the subroutine CALL statement the default value will
be used.

STOPSS --> The stopping value for the convergence test based on the ratio of
the forecasted change in the residual sum of squares,
fcst(RSS(PAR)), to the residual sum of squares from the previous
iteration. The convergence criterion is satisfied if certain
conditions are met and

fcst(RSS(PAR))/RSS(PARp) < STOPSS,

where the notation is described in the description of argument
STOPP [see Dennis et al., 1981a]. This convergence test is
roughly equivalent to the test based on the relative change in the
residual standard deviation between two iterations as measured by
(RSDc - RSDp)/RSDc. STOPSS is not a scale-dependent value; if its
value is 10**(-5) this criterion will be met when the first five
digits of the residual sum of squares are the same at two
successive iterations regardless of the size of the residual sum
of squares.

The default value is approximately the maximum of 10**(-10) and
10**(-2*DIGITS/3), where DIGITS is the number of decimal digits
carried by the user's computer for a single precision value when
the single precision version of STARPAC is being used and is the
number carried for a double precision value otherwise. When the
user-supplied value for STOPSS is outside the interval
[10**(-DIGITS), 0.1] or when STOPSS is not an argument of the
subroutine CALL statement the default value will be used.
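
The approximate default values described for STOPP and STOPSS can be
evaluated as in the free-form Fortran sketch below. Taking DIGITS from
the PRECISION intrinsic for a double precision value is an assumption of
this sketch, and the values STARPAC actually uses are only approximately
equal to these expressions.

    program stop_defaults_sketch
      implicit none
      integer :: ndig
      double precision :: stopp, stopss
      ! Assumption: DIGITS is the decimal precision of a double
      ! precision value as reported by the PRECISION intrinsic.
      ndig = precision(1.0d0)
      ! STOPP default is approximately 10**(-DIGITS/2).
      stopp = 10.0d0**(-ndig/2.0d0)
      ! STOPSS default is approximately max(10**(-10), 10**(-2*DIGITS/3)).
      stopss = max(1.0d-10, 10.0d0**(-2.0d0*ndig/3.0d0))
      print *, 'DIGITS =', ndig
      print *, 'STOPP  is approximately', stopp
      print *, 'STOPSS is approximately', stopss
    end program stop_defaults_sketch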

STP --- The vector of dimension at least NPAR that contains the relative
step sizes used to approximate the derivative matrix numerically.
It is input to the estimation subroutines and returned from the
step size selection subroutines. The procedure used to select the
default values is described in section E.1.b. For the estimation
subroutines, when STP is not an argument of the subroutine CALL
statement or when STP(1) <= 0 the default values will be used for
all of the step sizes, and when STP(1) > 0 values of STP(k) <= 0
for k = 2, ..., NPAR will be interpreted as an input error.

VCV <-- The matrix of dimension at least NPARE by NPARE that contains the
variance-covariance matrix of the estimated parameters,
approximated as designated by argument IVAPRX. The parameters which are
held fixed [see argument IFIXED] are not included in the
variance-covariance matrix.

The approximation of the variance-covariance matrix is based on a
linearization of the model in the neighborhood of the solution;
the validity of the approximation depends on the nonlinearity of
the model. This approximation may be extremely inaccurate for a
problem with a highly nonlinear model.

XM --> The matrix of dimension at least N by M whose jth column contains
the N values of the jth independent variable, j = 1, ..., M.

Y --> The vector of dimension at least N that contains the dependent
variable.

E. Computational Methods

E.1 Algorithms

E.1.a Nonlinear Least Squares Estimation

The nonlinear least squares estimation subroutines use the NL2SOL
software package written by Dennis et al. [1981a,b]. The observations of the
dependent variable, which are measured with error, are iteratively fit to a
nonlinear model by minimizing the sums of squares of the errors as described
in section A. The iterations continue until the convergence criteria based on
the change in the parameter values or in the residual sum of squares are
satisfied [see section D, arguments STOPP and STOPSS], the maximum number of
iterations (or model subroutine calls) is reached [see section D, argument
MIT], or the iterations are terminated due to singularity in the model or
false convergence. All but the first of these stopping conditions may
indicate computational problems and will produce an error report [see chapter
1, section D.5].

Singular convergence means that the model contains too many parameters,
at least near the solution, while false convergence can indicate that either
STOPSS or STOPP is set too small for the accuracy to which the model and its
derivatives are being computed or that there is an error or discontinuity in
the derivative. Users should examine their models to determine and correct
the underlying cause of singular or false convergence.

Iterative procedures for solving nonlinear least squares problems are
discussed in Dennis and Schnabel [1983], Draper and Smith [1981] and Kennedy
and Gentle [1980]. The specific procedure used in STARPAC is as follows. At
the current iteration the values of the parameter vector PARc are given by

PARc = PARp - inv(trans(Dp)*W*Dp + Sp + Gp)*trans(Dp)*W*trans(ep)

subject to the restriction that

sqrt( SUM(k = 1, ..., NPAR) [(PARc(k) - PARp(k))/SCALE(k)]**2 ) <= dp,

where

trans(.) is the transpose of the designated matrix.

PARp is the vector of the NPAR estimated parameter values from the previous
iteration.

Dp is the N by NPAR matrix of the partial derivatives evaluated at PARp,

D(i,k) = partial[ f(x(i),PAR) wrt PAR(k) ]

for i = 1, ..., N and k = 1, ..., NPAR.

W is an N by N diagonal matrix of user-supplied weights,

W = diag(wt(i), i = 1, ..., N)

when a weighted analysis is performed and is the identity matrix
otherwise.

Sp is an APPROXIMATION to the exact term Sp* in the matrix of second
order terms (Hessian) of the Taylor series expansion of the residual
sum of squares function,

Sp*(j,k) = SUM(i = 1, ..., N) [ep(i)*wt(i)*
(second partial ep(i) wrt PARp(j) & PARp(k))],

for j = 1, ..., NPAR and k = 1, ..., NPAR.

ep is the vector of the N residuals from the previous iteration.

dp is the adaptively chosen diameter of the trust region, i.e., the
region in which the local approximation to the user's model function
is reliable. At each iteration, dp is computed based on information
from the previous iteration. At the first iteration, the initial
value, d0, is supplied by argument DELTA which can be used to control
the change in the parameters permitted at the first iteration.

Gp is an NPAR by NPAR diagonal matrix,

Gp = diag(gp/SCALE(k), k = 1, ..., NPAR),

where gp is chosen to approximate the smallest non-negative number
such that the restriction given above on the size of the change in the
parameters is satisfied.

The second order term Sp*, which is expensive and difficult to compute
accurately, is important only if it is large compared to the term
trans(Dp)*W*Dp, that is, when the residuals are large or the model is highly
nonlinear. When Sp* is large compared to trans(Dp)*W*Dp, algorithms which
ignore it, such as Levenberg-Marquardt or Gauss-Newton, may converge slowly.
The NL2SOL algorithm used by STARPAC, however, adaptively decides when
inclusion of this term is necessary for reliable results and uses an
inexpensive approximation to Sp* in those cases.

The matrix, D, of partial derivatives of the model with respect to each
parameter is either computed analytically using a user-supplied subroutine,
NLSDRV, or is numerically approximated using forward difference quotients as
described in section E.1.b. When the derivatives are approximated
numerically, the least squares solution, especially the variance-covariance
matrix, can be sensitive to the step sizes used for the approximation. The
user may want to use STARPAC subroutines STPLS or STPLSC to recompute the step
sizes at the solution provided by the estimation subroutines to assure that
the step sizes which were used are still acceptable. If there is a
significant change in the step size the least squares solution should be
recomputed with the new step sizes from the current point. In addition, if
the estimation subroutine has convergence problems the user may want to
recompute the step sizes with the most recent parameter values to see if a
change in the curvature of the model, which will be reflected as a change in
the optimum step sizes, is causing the problem.

Dennis et al. [1981a] provide a detailed description of the algorithm
used in STARPAC. STARPAC also includes the subroutines NL2SOL, NL2SNO, and
NL2ITR, which they reference, and which can be used as documented by them [see
Dennis et al., 1981b].

E.1.b Derivative Step Size Selection

The STARPAC step size selection subroutines use an algorithm developed by
Schnabel [1982] to compute optimum step sizes for approximating the partial
derivatives of the model with respect to each parameter. Briefly, the
relative step sizes selected by these subroutines are those which produce
forward difference quotient approximations to the derivative, Dfd, that agree
reasonably well with the central difference quotient approximations, Dcd. The
central difference quotient approximations are twice as accurate but also
twice as expensive to compute. Since the additional accuracy is not usually
needed, central difference quotient approximations are not used by the
estimation subroutines.

The number of reliable digits in these derivatives is a function of the
step sizes used to compute them. Given properly chosen step sizes, the number
of reliable digits in Dfd and Dcd will be approximately h/2 and h,
respectively, where h is the number of reliable digits in the predicted
values, PV, from the user's model subroutine. For example, if the predicted
values are computed using an iterative procedure (such as quadrature or a
solution of partial differential equations) which is expected to provide five
good digits, then h would be five; if the predicted values are calculated from
a simple algebraic expression translated directly into Fortran code, then h
would (usually) be the number of decimal digits carried by the user's computer
for the results.

The relative step size for PAR(k), k = 1, ..., NPAR, is initially

STP(k) = 2*sqrt(10**(-NETA)/q) for k = 1, ..., NPAR,

where

q is the average curvature (estimated by STARPAC) of the model with
respect to PAR(k).

The forward difference quotient approximations with respect to PAR(k), k = 1,
..., NPAR are then

Dfd(i,k) = [f(x(i),PARk) - f(x(i),PAR)] / [STP(k)*SCALE(k)*SIGN(PAR(k))]

for i = 1, ..., N,

where

f is the function which models the ith observation.

x(i) is the vector of the values of the M independent variables at the ith
observation.

PAR is the vector of the NPAR parameter values.

PARk is a vector which has the same values as PAR except that the kth
parameter is equal to

PAR(k) + STP(k)*SCALE(k)*SIGN(PAR(k)).

SIGN is a function which returns the sign of its argument.

The central difference approximations to the model derivative with
respect to PAR(k), k = 1, ..., NPAR, are

Dcd(i,k) = [f(x(i),PARpk) - f(x(i),PARmk)] /
[(3*10**(-NETA))**(1/3)*SCALE(k)*SIGN(PAR(k))]

for i = 1, ..., N,

where

PARpk is a vector which has the same values as PAR except that the kth
parameter is equal to

PAR(k) + (3*10**(-NETA))**(1/3)*SCALE(k)*SIGN(PAR(k)).

PARmk is a vector which has the same values as PAR except that the kth
parameter is equal to

PAR(k) - (3*10**(-NETA))**(1/3)*SCALE(k)*SIGN(PAR(k)).

The relative step size is considered acceptable if, for at least N-a
observations,

|Dfd(i,k) - Dcd(i,k)| <= min(10**(-NETA/4), .02) for i = 1, ..., N,

where a is the number of observations exempted from meeting the above
acceptance criterion [see section D, argument EXMPT]. If the step size is not
acceptable, it is adjusted by factors of 10 until the condition is met or
until no further decrease in the number of failures can be made, although in
no case will the selected relative step size be greater than 1.0 or less than
10**(-NETA).
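
As a worked illustration of the quantities above, take NETA = 13, the value
reported in the sample output of section F. The figures below are plain
arithmetic on the formulas of this section, rounded to three digits; they are
not STARPAC output.

   central difference step factor   (3*10**(-13))**(1/3)   =  6.69E-05
   acceptance tolerance             min(10**(-13/4), .02)
                                    = min(5.62E-04, .02)   =  5.62E-04
   allowed range for STP(k)         10**(-13) <= STP(k) <= 1.0

In that sample output N = 6 and one observation is exempted (a = 1), so a
step size is acceptable there when the tolerance is met at five or more rows.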

Note that the step size selection subroutines will return the selected
step sizes even when the number of failures exceeds the allowed value; this
condition will be noted by the value of IERR. The detailed printed output
should always be examined for problems discovered by the step size selection
subroutines.

E.1.c Derivative Checking

The STARPAC derivative checking subroutines use an algorithm developed by
Schnabel [1982] to determine the validity of the user-supplied derivative
subroutine. The user-supplied derivative subroutine is considered correct for
a given row i, i = 1, ..., N, and parameter PAR(k), k = 1, ..., NPAR, if

|Dfd(i,k) - D(i,k)| <= 10**(-t)*|D(i,k)|
where

D is the derivative computed by the user's subroutine.

Dfd is the forward difference quotient approximation to the derivative
described in section E.1.b.

t is the agreement tolerance, i.e., number of digits of agreement
required between D and Dfd, which must be less than or equal to the
number of good digits in Dfd [see section D, argument NTAU].

When the agreement tolerance is not satisfied the checking subroutine
attempts to determine whether the disagreement is due to an error in the
user's code or is due to the inaccuracy of the difference quotient
approximation, caused either by high curvature in the user's model or by
significant roundoff error.

The derivative checking subroutines each check only one row of the
derivative matrix. The user should examine the row at which the derivatives
were checked to ensure that some relation between the parameters and
independent variables, such as a zero-valued independent variable or a factor
(x(i) - PAR(k)) when x(i) = PAR(k), is not hiding the effect of an incorrectly
computed derivative. Checking only one row is appropriate since the same code
is frequently used to compute the model function and derivatives at each row i
= 1, ..., N, as is the case in the examples shown in section F. If the code
used to express the model function and derivatives is not the same for each
row, then each distinct section of the code should be checked by making
multiple calls to DCKLSC with argument NROW set to a row within each section.
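
The agreement test can be sketched for a single row and parameter as follows.
This is a minimal illustration under stated assumptions, not the STARPAC
implementation: the program name FDCHK and the variables PARK, H, F0 and F1
are hypothetical, STP is copied from the sample output of section F, and
SCALE = ABS(PAR(2)) is an assumed scale value. The model is the one used in
section F.

      PROGRAM FDCHK
C
C     ILLUSTRATIVE SKETCH ONLY (NOT STARPAC CODE).  COMPARES A FORWARD
C     DIFFERENCE QUOTIENT WITH AN ANALYTIC DERIVATIVE FOR THE MODEL
C     F = PAR(1)*X**PAR(2) AT ONE ROW, WITH RESPECT TO PAR(2).
C
      DOUBLE PRECISION PAR(2), PARK(2), X, STP, SCALE, H
      DOUBLE PRECISION F0, F1, DFD, D
      INTEGER NTAU
C
      PAR(1) = 0.725D0
      PAR(2) = 4.0D0
      X = 1.309D0
      NTAU = 4
C
C     STP IS TAKEN FROM THE SAMPLE OUTPUT, SCALE IS AN ASSUMPTION
C
      STP = 0.38782913D-06
      SCALE = ABS(PAR(2))
C
C     PERTURB THE SECOND PARAMETER ONLY
C
      H = STP*SCALE*SIGN(1.0D0, PAR(2))
      PARK(1) = PAR(1)
      PARK(2) = PAR(2) + H
C
C     FORWARD DIFFERENCE QUOTIENT
C
      F0 = PAR(1)*X**PAR(2)
      F1 = PARK(1)*X**PARK(2)
      DFD = (F1 - F0)/H
C
C     ANALYTIC DERIVATIVE WITH RESPECT TO PAR(2)
C
      D = PAR(1)*X**PAR(2)*LOG(X)
C
C     AGREEMENT TEST  |DFD - D| <= 10**(-NTAU)*|D|
C
      IF (ABS(DFD - D) .LE. 10.0D0**(-NTAU)*ABS(D)) THEN
         WRITE (*,*) 'DERIVATIVE AGREES TO NTAU DIGITS'
      ELSE
         WRITE (*,*) 'DERIVATIVE DISAGREES'
      END IF
      END

Replacing the line that computes D with a deliberately wrong expression is a
quick way to see the test fail.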

E.2 Computed Results and Printed Output

E.2.a The Nonlinear Least Squares Estimation Subroutines

The argument controlling the printed output, NPRT, is discussed in
section D.

The output from the nonlinear least squares estimation subroutines
consists of five sections, several of which include tables summarizing the
results. In the following descriptions, the actual table headings are given
by the uppercase phrases enclosed in angle braces (<...>). Results which
correspond to input or returned subroutine CALL statement arguments are
identified by the argument name in uppercase (not enclosed in angle braces).

Section 1 provides a summary of the initial estimates and control values. It
lists the following information.

* The initial values of the parameters, PAR, and whether they are to be
held fixed or not as specified by IFIXED.

* The scale values, SCALE.

* Either the step sizes used to approximate the derivatives numerically,
or, when user-supplied (analytic) derivatives are used, the results of
the checking procedure; and the control values used in these
computations as applicable [see section E.1.b and section E.1.c].

* The number of observations, N.

* The number of observations with nonzero weights, NNZW.

* The number of independent variables, M.

* The maximum number of iterations allowed, MIT.

* The maximum number of model subroutine calls allowed.

* The two convergence criteria, STOPSS and STOPP.

* The maximum change in the parameters allowed at the first iteration,
DELTA.

* The residual sum of squares computed using the starting parameter
values.

* The residual standard deviation, RSD, computed using the starting
parameter values.

Section 2 lists selected information about each iteration and includes the
reason the iterations were terminated. The information provided for
each iteration includes the following.

* The iteration number.

* <MODEL CALLS>: the total number of times since execution began that
the user's model subroutine has been called, not including calls
required to approximate the derivatives numerically.

* <RSD>: the residual standard deviation computed using the parameter
values from the current iteration.

* <RSS>: the residual sum of squares computed using the parameter values
from the current iteration.

* <REL CHNG RSS>: the relative change in the residual sum of squares
caused by the current iteration.

* <FORECASTED REL CHNG RSS>: the forecasted relative change in the
residual sum of squares at the current iteration, and whether this
value was checked against STOPSS (<CHKD> = Y) or not (<CHKD> = N).

* <REL CHNG PAR>: the maximum scaled relative change in the parameters
at the current iteration, and whether this value was checked against
STOPP (<CHKD> = Y) or not (<CHKD> = N).

* <CURRENT PARAMETER VALUES>: the estimated parameter values resulting
from the current iteration.

Section 3 provides the following information for each observation, i = 1, ...,
N, based on the final solution.

* <ROW>: the row number of the observations.

* <PREDICTOR VALUES>: the values for up to the first three columns of
the independent variable matrix, XM, not including the first column if
it is constant.

* <DEPENDENT VARIABLE>: the values of the dependent variable, Y.

* <PREDICTED VALUE>: the estimated predicted values, PV, from the fit.

* <STD DEV OF PRED VALUE>: the standard deviations of the predicted
values, SDPV.

* <RESIDUAL>: the error estimates, RES.

* <STD RES>: the standardized residuals, SDRES.

* <WEIGHT>: the user-supplied weights, WT, printed only when weighted
analysis is performed.

Section 4 displays the following plots of the standardized residuals.

* The standardized residuals versus row numbers.

* The standardized residuals versus predicted values.

* The autocorrelation function of the (non-standardized) residuals.

* The normal probability plot of the standardized residuals.

Section 5 summarizes the following information about the final parameter
estimates and their variances.

* The variance-covariance matrix, VCV, of the estimated (unfixed)
parameters, and the corresponding correlation matrix,

rjk = VCV(j,k) / sqrt(VCV(j,j)*VCV(k,k)) for j = 1, ..., NPARE
and k = 1, ..., NPARE.

* <PARAMETER>: the final value of each parameter, PAR(k), k = 1, ...,
NPAR.

* <SD OF PAR>: the standard deviation of each estimated parameter,

sqrt(VCV(k,k)) for k = 1, ..., NPAR.

* <RATIO>: the ratio of each estimated parameter to its standard
deviation,

RATIO(k) = PAR(k) / sqrt(VCV(k,k)) for k = 1, ..., NPAR.

* <APPROXIMATE 95 PERCENT CONFIDENCE LIMITS>: the lower and upper
95-percent confidence limits for each parameter, computed using the
appropriate value of the Student's t distribution with NNZW-NPARE
degrees of freedom.

* the residual sum of squares, RSS(PAR).

* the residual standard deviation at the solution, RSD.

* the residual degrees of freedom, NNZW-NPARE.

* an approximation to the condition number of the derivative matrix, D
(the Jacobian), under the assumption that the absolute error in each
column of D is roughly equal. The approximation will be meaningless
if this assumption is not valid; otherwise it will usually
underestimate the actual condition number by a factor of from 2 to 10
[see Dongarra et al., 1979, p. 9.5]. (Note that the condition number
returned by the nonlinear least squares subroutines is not exactly the
same as that returned by the linear least squares subroutines because of
differences in the computational procedures used by the two families
of subroutines.)

NOTE: The standard deviation of the predicted values, the standardized
residuals, the variance-covariance matrix, the standard deviations of the
parameters and the 95-percent confidence limits on the parameters are all
based on a linear approximation to the model in a neighborhood of the
solution; the validity of this approximation depends on the nonlinearity of
the model. The statistics based on this approximation may be extremely
inaccurate for a problem with a highly nonlinear model.

E.2.b The Derivative Step Size Selection Subroutines

The argument controlling the printed output, NPRT, is discussed in
section D.

The output from the step size selection subroutines consists of a summary
of the input and control values and, for each parameter, the selected relative
step size, the number of observations at which this step size failed the step
size selection criteria and the row numbers at which the failures occurred.

E.2.c The Derivative Checking Subroutines

The argument controlling the printed output, NPRT, is discussed in
section D.

The output for the derivative checking subroutines consists of a summary
of the input and control values and the results of the derivative checking
test with respect to each of the model parameters, PAR(k), k = 1, ..., NPAR.
The possible test results are:

OK -

* The user-supplied derivative and the numerical derivative agree to the
required number of digits.

QUESTIONABLE -

* The user-supplied derivative and the approximated derivative agree to
the required number of digits but both are equal to zero. The user
should recheck the derivative at another row.

* The user-supplied derivative and the approximated derivative do not
agree to the required number of digits but the user-supplied
derivative is identically zero and the approximated derivative
is nearly zero. The user should recheck the derivative at another row.

* The user-supplied derivative and the approximated derivative disagree
but the user-supplied derivative is identically zero. The user should
recheck the derivative at another row.

* The user-supplied derivative and the approximated derivative disagree
but the validity of the approximated derivative is questionable
because either the ratio of the relative curvature of the model to the
slope of the model is too high or SCALE(k) is wrong.

* The user-supplied derivative and the approximated derivative disagree
but the validity of the estimated derivative is questionable because
the ratio of the relative curvature of the model to the slope of the
model is too high.

INCORRECT -

* The user-supplied derivative and the approximated derivative disagree,
and there is no reason to question the accuracy of the approximated
derivative.

F. Examples

The sample programs of this section use the model and data given in
example one, pages 428 to 441 of Daniel and Wood [1980]; the model is

f(x(i),b) = PAR(1)*x(i,1)**PAR(2) for i = 1, ..., N.

Nonlinear Least Squares Estimation. In the first example program below,
NLS is used to compute the least squares solution using numerically
approximated derivatives. In the second example program, NLSD is used to
compute the least squares solution given analytic derivatives.

Derivative Step Size Selection. In the third example program, STPLS is
used to compute the optimum step sizes for numerically approximating the
derivatives with respect to each of the parameters, PAR(k), k = 1, 2.

Derivative Checking. In the fourth example program below, DCKLS is used
to check the validity of a user-supplied derivative subroutine. The
derivative subroutine has been intentionally coded incorrectly in order to
display the report obtained when the derivative checking subroutine determines
the derivatives are incorrect, and the starting parameter values have been
chosen in order to display the report obtained when the test results are
questionable.

Program:

      PROGRAM EXAMPL
C
C     DEMONSTRATE NLS USING SINGLE PRECISION VERSION OF STARPAC
C
C     N.B. DECLARATION OF Y, XM, PAR AND RES MUST BE CHANGED TO DOUBLE
C          PRECISION IF DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL Y(10), XM(10,5), PAR(5), RES(10)
      DOUBLE PRECISION DSTAK(200)
C
      EXTERNAL NLSMDL
C
      COMMON /CSTAK/ DSTAK
C
C     SET UP INPUT AND OUTPUT FILES
C     [CHAPTER 1, SECTION D.4, DESCRIBES HOW TO CHANGE OUTPUT UNIT.]
C
      CALL IPRINT(IPRT)
      OPEN (UNIT=IPRT, FILE='FILENM')
      OPEN (UNIT=5, FILE='DATA')
C
C     SPECIFY NECESSARY DIMENSIONS
C
      LDSTAK = 200
      IXM = 10
C
C     READ NUMBER OF PARAMETERS
C          STARTING VALUES FOR PARAMETERS
C          NUMBER OF OBSERVATIONS AND NUMBER OF INDEPENDENT VARIABLES
C          INDEPENDENT AND DEPENDENT VARIABLES
C
      READ (5,100) NPAR
      READ (5,101) (PAR(I), I=1,NPAR)
      READ (5,100) N, M
      READ (5,101) ((XM(I,J), I=1,N), J=1,M), (Y(I), I=1,N)
C
C     PRINT TITLE AND CALL NLS TO PERFORM NONLINEAR REGRESSION
C     WITH NUMERICALLY APPROXIMATED DERIVATIVES
C
      WRITE (IPRT,102)
      CALL NLS (Y, XM, N, M, IXM, NLSMDL, PAR, NPAR, RES, LDSTAK)
C
C     FORMAT STATEMENTS
C
  100 FORMAT (2I5)
  101 FORMAT (6F6.3)
  102 FORMAT ('1RESULTS OF STARPAC',
     *  ' NONLINEAR LEAST SQUARES SUBROUTINE NLS')
      END
      SUBROUTINE NLSMDL (PAR, NPAR, XM, N, M, IXM, PV)
C
C     SUBROUTINE TO COMPUTE PREDICTED VALUES OF DEPENDENT VARIABLE
C
C     N.B. DECLARATION OF PAR, XM AND PV MUST BE CHANGED TO DOUBLE
C          PRECISION IF DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL PAR(NPAR), XM(IXM,M), PV(N)
C
      DO 10 I = 1, N
         PV(I) = PAR(1) * XM(I, 1) ** PAR(2)
   10 CONTINUE
C
      RETURN
      END

Data:

    2
 0.725 4.000
    6    1
 1.309 1.471 1.490 1.565 1.611 1.680
 2.138 3.421 3.597 4.340 4.882 5.660

1RESULTS OF STARPAC NONLINEAR LEAST SQUARES SUBROUTINE NLS
STARPAC 2.08S (03/15/90)
+**********************************************************************************
* NONLINEAR LEAST SQUARES ESTIMATION WITH NUMERICALLY APPROXIMATED DERIVATIVES *
**********************************************************************************

SUMMARY OF INITIAL CONDITIONS
------------------------------

                                               STEP SIZE FOR    OBSERVATIONS FAILING STEP SIZE SELECTION CRITERIA
                                               APPROXIMATING              *
PARAMETER          STARTING VALUE     SCALE     DERIVATIVE      COUNT   NOTES   ROW NUMBER
  INDEX    FIXED       (PAR)         (SCALE)       (STP)                F   C

    1       NO       .72500000       DEFAULT   .46415888E-04      0
    2       NO       4.0000000       DEFAULT   .38782913E-06      0

* NOTES. A PLUS (+) IN THE COLUMNS HEADED F OR C HAS THE FOLLOWING MEANING.

F - NUMBER OF OBSERVATIONS FAILING STEP SIZE SELECTION CRITERIA EXCEEDS
NUMBER OF EXEMPTIONS ALLOWED.

C - HIGH CURVATURE IN THE MODEL IS SUSPECTED AS THE CAUSE OF
ALL FAILURES NOTED.

NUMBER OF RELIABLE DIGITS IN MODEL RESULTS (NETA) 13

PROPORTION OF OBSERVATIONS EXEMPTED FROM SELECTION CRITERIA (EXMPT) .1000

NUMBER OF OBSERVATIONS EXEMPTED FROM SELECTION CRITERIA 1

NUMBER OF OBSERVATIONS (N) 6

NUMBER OF INDEPENDENT VARIABLES (M) 1

MAXIMUM NUMBER OF ITERATIONS ALLOWED (MIT) 21

MAXIMUM NUMBER OF MODEL SUBROUTINE CALLS ALLOWED 42

CONVERGENCE CRITERION FOR TEST BASED ON THE

FORECASTED RELATIVE CHANGE IN RESIDUAL SUM OF SQUARES (STOPSS) .3696E-09
MAXIMUM SCALED RELATIVE CHANGE IN THE PARAMETERS (STOPP) .8425E-07

MAXIMUM CHANGE ALLOWED IN THE PARAMETERS AT THE FIRST ITERATION (DELTA) 100.0

RESIDUAL SUM OF SQUARES FOR INPUT PARAMETER VALUES .1472E-01

RESIDUAL STANDARD DEVIATION FOR INPUT PARAMETER VALUES (RSD) .6067E-01

1
STARPAC 2.08S (03/15/90)
+NONLINEAR LEAST SQUARES ESTIMATION WITH NUMERICALLY APPROXIMATED DERIVATIVES, CONTINUED

ITERATION NUMBER    1
----------------------
  MODEL                                                  FORECASTED
  CALLS        RSD           RSS        REL CHNG RSS    REL CHNG RSS     REL CHNG PAR
                                                        VALUE   CHKD     VALUE   CHKD
    2       .3390E-01     .4597E-02       .6877         .7109    Y     .1790E-01  Y

     CURRENT PARAMETER VALUES
         INDEX            1               2
         VALUE         .7679852        3.859309

ITERATION NUMBER    4
----------------------
  MODEL                                                  FORECASTED
  CALLS        RSD           RSS        REL CHNG RSS    REL CHNG RSS     REL CHNG PAR
                                                        VALUE   CHKD     VALUE   CHKD
    5       .3285E-01     .4317E-02     .8936E-12     .7028E-12  Y     .1123E-07  Y

     CURRENT PARAMETER VALUES
         INDEX            1               2
         VALUE         .7688623        3.860406

***** PARAMETER AND RESIDUAL SUM OF SQUARES CONVERGENCE *****

1
STARPAC 2.08S (03/15/90)
+NONLINEAR LEAST SQUARES ESTIMATION WITH NUMERICALLY APPROXIMATED DERIVATIVES, CONTINUED

RESULTS FROM LEAST SQUARES FIT
-------------------------------

                             DEPENDENT     PREDICTED     STD DEV OF                        STD
 ROW    PREDICTOR VALUES     VARIABLE        VALUE       PRED VALUE        RESIDUAL        RES

   1       1.3090000         2.1380000     2.1741175    .22079043E-01   -.36117492E-01    -1.48
   2       1.4710000         3.4210000     3.4111549    .16469586E-01    .98450831E-02      .35
   3       1.4900000         3.5970000     3.5844108    .15615321E-01    .12589151E-01      .44
   4       1.5650000         4.3400000     4.3326419    .14065814E-01    .73580833E-02      .25
   5       1.6110000         4.8820000     4.8453073    .16512112E-01    .36692701E-01     1.29
   6       1.6800000         5.6600000     5.6968365    .26183728E-01   -.36836492E-01    -1.86

1 STARPAC 2.08S (03/15/90)
+NONLINEAR LEAST SQUARES ESTIMATION WITH NUMERICALLY APPROXIMATED DERIVATIVES, CONTINUED

[Four character plots appear on this page of the output:
   STD RES VS ROW NUMBER                   vertical axis -3.75 to 3.75, horizontal axis 1.0 to 6.0
   STD RES VS PREDICTED VALUES             vertical axis -3.75 to 3.75, horizontal axis 2.174 to 5.697
   AUTOCORRELATION FUNCTION OF RESIDUALS   vertical axis 1 to 26, horizontal axis -1.00 to 1.00
   NORMAL PROBABILITY PLOT OF STD RES      vertical axis -3.75 to 3.75, horizontal axis -2.5 to 2.5]
1
STARPAC 2.08S (03/15/90)
+NONLINEAR LEAST SQUARES ESTIMATION WITH NUMERICALLY APPROXIMATED DERIVATIVES, CONTINUED

VARIANCE-COVARIANCE AND CORRELATION MATRICES OF THE ESTIMATED (UNFIXED) PARAMETERS
----------------------------------------------------------------------------------

- APPROXIMATION BASED ON ASSUMPTION THAT RESIDUALS ARE SMALL
- COVARIANCES ARE ABOVE THE DIAGONAL
- VARIANCES ARE ON THE DIAGONAL
- CORRELATION COEFFICIENTS ARE BELOW THE DIAGONAL

COLUMN            1                2

     1      .3342304E-03    -.9369370E-03
     2     -.9907719         .2675639E-02

ESTIMATES FROM LEAST SQUARES FIT
---------------------------------

                                                                APPROXIMATE
                                                       95 PERCENT CONFIDENCE LIMITS
  INDEX  FIXED    PARAMETER      SD OF PAR      RATIO       LOWER         UPPER

    1     NO      .76886226    .18281968E-01    42.06     .71810338     .81962114
    2     NO      3.8604056    .51726577E-01    74.63     3.7167896     4.0040216

RESIDUAL SUM OF SQUARES .4317308E-02

RESIDUAL STANDARD DEVIATION .3285311E-01
BASED ON DEGREES OF FREEDOM 6 - 2 = 4

APPROXIMATE CONDITION NUMBER 20.87491
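
The entries on this page can be checked by hand against the definitions given
for Section 5 of the output in E.2.a. The lines below are plain arithmetic on
the printed values, rounded to four digits. Two items are assumptions not
printed by STARPAC: the Student's t value 2.776 (97.5th percentile, 4 degrees
of freedom, from standard tables) and the usual definition of the residual
standard deviation as sqrt(RSS/degrees of freedom).

   SD OF PAR(1)  = sqrt(.3342304E-03)                               =  .01828
   r12           = -.9369370E-03 / sqrt(.3342304E-03*.2675639E-02)  = -.9908
   RATIO(1)      = .76886226 / .18281968E-01                        =  42.06
   LOWER(1)      = .76886226 - 2.776*.18281968E-01                  =  .7181
   UPPER(1)      = .76886226 + 2.776*.18281968E-01                  =  .8196
   RSD           = sqrt(.4317308E-02 / 4)                           =  .03285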

Program:

      PROGRAM EXAMPL
C
C     DEMONSTRATE NLSD USING SINGLE PRECISION VERSION OF STARPAC
C
C     N.B. DECLARATION OF Y, XM, PAR AND RES MUST BE CHANGED TO DOUBLE
C          PRECISION IF DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL Y(10), XM(10,5), PAR(5), RES(10)
      DOUBLE PRECISION DSTAK(200)
C
      EXTERNAL NLSMDL, NLSDRV
C
      COMMON /CSTAK/ DSTAK
C
C     SET UP INPUT AND OUTPUT FILES
C     [CHAPTER 1, SECTION D.4, DESCRIBES HOW TO CHANGE OUTPUT UNIT.]
C
      CALL IPRINT(IPRT)
      OPEN (UNIT=IPRT, FILE='FILENM')
      OPEN (UNIT=5, FILE='DATA')
C
C     SPECIFY NECESSARY DIMENSIONS
C
      LDSTAK = 200
      IXM = 10
C
C     READ NUMBER OF PARAMETERS
C          STARTING VALUES FOR PARAMETERS
C          NUMBER OF OBSERVATIONS AND NUMBER OF INDEPENDENT VARIABLES
C          INDEPENDENT AND DEPENDENT VARIABLES
C
      READ (5,100) NPAR
      READ (5,101) (PAR(I), I=1,NPAR)
      READ (5,100) N, M
      READ (5,101) ((XM(I,J), I=1,N), J=1,M), (Y(I), I=1,N)
C
C     PRINT TITLE AND CALL NLSD TO PERFORM NONLINEAR REGRESSION
C     WITH USER-SUPPLIED DERIVATIVES
C
      WRITE (IPRT,102)
      CALL NLSD (Y, XM, N, M, IXM, NLSMDL, NLSDRV, PAR, NPAR, RES,
     *   LDSTAK)
C
C     FORMAT STATEMENTS
C
  100 FORMAT (2I5)
  101 FORMAT (6F6.3)
  102 FORMAT ('1RESULTS OF STARPAC',
     *  ' NONLINEAR LEAST SQUARES SUBROUTINE NLSD')
      END
      SUBROUTINE NLSMDL (PAR, NPAR, XM, N, M, IXM, PV)
C
C     SUBROUTINE TO COMPUTE PREDICTED VALUES OF DEPENDENT VARIABLE
C
C     N.B. DECLARATION OF PAR, XM AND PV MUST BE CHANGED TO DOUBLE
C          PRECISION IF DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL PAR(NPAR), XM(IXM,M), PV(N)
C
      DO 10 I = 1, N
         PV(I) = PAR(1) * XM(I, 1) ** PAR(2)
   10 CONTINUE
C
      RETURN
      END
      SUBROUTINE NLSDRV (PAR, NPAR, XM, N, M, IXM, D)
C
C     SUBROUTINE TO COMPUTE THE PARTIAL DERIVATIVE (JACOBIAN) MATRIX
C
C     N.B. DECLARATION OF PAR, XM AND D MUST BE CHANGED TO DOUBLE
C          PRECISION IF DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL PAR(NPAR), XM(IXM,M), D(N,NPAR)
C
      DO 10 I = 1, N
         D(I,1) = XM(I,1) ** PAR(2)
         D(I,2) = PAR(1) * XM(I,1) ** PAR(2) * ALOG(XM(I,1))
   10 CONTINUE
C
      RETURN
      END

Data:

    2
 0.725 4.000
    6    1
 1.309 1.471 1.490 1.565 1.611 1.680
 2.138 3.421 3.597 4.340 4.882 5.660

1RESULTS OF STARPAC NONLINEAR LEAST SQUARES SUBROUTINE NLSD
STARPAC 2.08S (03/15/90)
+***********************************************************************
* NONLINEAR LEAST SQUARES ESTIMATION WITH USER-SUPPLIED DERIVATIVES *
***********************************************************************

SUMMARY OF INITIAL CONDITIONS
------------------------------

                                                DERIVATIVE
PARAMETER          STARTING VALUE     SCALE     ASSESSMENT
  INDEX    FIXED       (PAR)         (SCALE)

    1       NO       .72500000       DEFAULT       OK
    2       NO       4.0000000       DEFAULT       OK

NUMBER OF RELIABLE DIGITS IN MODEL RESULTS (NETA) 13

NUMBER OF DIGITS IN DERIVATIVE CHECKING AGREEMENT TOLERANCE (NTAU) 4

ROW NUMBER AT WHICH DERIVATIVES WERE CHECKED (NROW) 1
-VALUES OF THE INDEPENDENT VARIABLES AT THIS ROW
INDEX 1
VALUE 1.309000

NUMBER OF OBSERVATIONS (N) 6

NUMBER OF INDEPENDENT VARIABLES (M) 1

MAXIMUM NUMBER OF ITERATIONS ALLOWED (MIT) 21

MAXIMUM NUMBER OF MODEL SUBROUTINE CALLS ALLOWED 42

CONVERGENCE CRITERION FOR TEST BASED ON THE

FORECASTED RELATIVE CHANGE IN RESIDUAL SUM OF SQUARES (STOPSS) .3696E-09
MAXIMUM SCALED RELATIVE CHANGE IN THE PARAMETERS (STOPP) .8425E-07

MAXIMUM CHANGE ALLOWED IN THE PARAMETERS AT THE FIRST ITERATION (DELTA) 100.0

RESIDUAL SUM OF SQUARES FOR INPUT PARAMETER VALUES .1472E-01

RESIDUAL STANDARD DEVIATION FOR INPUT PARAMETER VALUES (RSD) .6067E-01

1
STARPAC 2.08S (03/15/90)
+NONLINEAR LEAST SQUARES ESTIMATION WITH USER-SUPPLIED DERIVATIVES, CONTINUED

ITERATION NUMBER    1
----------------------
  MODEL                                                  FORECASTED
  CALLS        RSD           RSS        REL CHNG RSS    REL CHNG RSS     REL CHNG PAR
                                                        VALUE   CHKD     VALUE   CHKD
    2       .3390E-01     .4597E-02       .6877         .7109    Y     .1790E-01  Y

     CURRENT PARAMETER VALUES
         INDEX            1               2
         VALUE         .7679852        3.859309

ITERATION NUMBER    4
----------------------
  MODEL                                                  FORECASTED
  CALLS        RSD           RSS        REL CHNG RSS    REL CHNG RSS     REL CHNG PAR
                                                        VALUE   CHKD     VALUE   CHKD
    5       .3285E-01     .4317E-02    -.3214E-13     .6352E-12  Y     .1068E-07  Y

     CURRENT PARAMETER VALUES
         INDEX            1               2
         VALUE         .7688623        3.860406

***** PARAMETER AND RESIDUAL SUM OF SQUARES CONVERGENCE *****

1
STARPAC 2.08S (03/15/90)
+NONLINEAR LEAST SQUARES ESTIMATION WITH USER-SUPPLIED DERIVATIVES, CONTINUED

RESULTS FROM LEAST SQUARES FIT
-------------------------------

                             DEPENDENT     PREDICTED     STD DEV OF                        STD
 ROW    PREDICTOR VALUES     VARIABLE        VALUE       PRED VALUE        RESIDUAL        RES

   1       1.3090000         2.1380000     2.1741175    .22079044E-01   -.36117523E-01    -1.48
   2       1.4710000         3.4210000     3.4111549    .16469585E-01    .98450648E-02      .35
   3       1.4900000         3.5970000     3.5844109    .15615321E-01    .12589135E-01      .44
   4       1.5650000         4.3400000     4.3326419    .14065814E-01    .73580808E-02      .25
   5       1.6110000         4.8820000     4.8453073    .16512112E-01    .36692709E-01     1.29
   6       1.6800000         5.6600000     5.6968365    .26183727E-01   -.36836464E-01    -1.86

1 STARPAC 2.08S (03/15/90)
+NONLINEAR LEAST SQUARES ESTIMATION WITH USER-SUPPLIED DERIVATIVES, CONTINUED

[Four character plots appear on this page of the output:
   STD RES VS ROW NUMBER                   vertical axis -3.75 to 3.75, horizontal axis 1.0 to 6.0
   STD RES VS PREDICTED VALUES             vertical axis -3.75 to 3.75, horizontal axis 2.174 to 5.697
   AUTOCORRELATION FUNCTION OF RESIDUALS   vertical axis 1 to 26, horizontal axis -1.00 to 1.00
   NORMAL PROBABILITY PLOT OF STD RES      vertical axis -3.75 to 3.75, horizontal axis -2.5 to 2.5]
1
STARPAC 2.08S (03/15/90)
+NONLINEAR LEAST SQUARES ESTIMATION WITH USER-SUPPLIED DERIVATIVES, CONTINUED

VARIANCE-COVARIANCE AND CORRELATION MATRICES OF THE ESTIMATED (UNFIXED) PARAMETERS
----------------------------------------------------------------------------------

- APPROXIMATION BASED ON ASSUMPTION THAT RESIDUALS ARE SMALL
- COVARIANCES ARE ABOVE THE DIAGONAL
- VARIANCES ARE ON THE DIAGONAL
- CORRELATION COEFFICIENTS ARE BELOW THE DIAGONAL

COLUMN            1                2

     1      .3342306E-03    -.9369379E-03
     2     -.9907719         .2675642E-02

ESTIMATES FROM LEAST SQUARES FIT
---------------------------------

                                                                APPROXIMATE
                                                       95 PERCENT CONFIDENCE LIMITS
  INDEX  FIXED    PARAMETER      SD OF PAR      RATIO       LOWER         UPPER

    1     NO      .76886229    .18281974E-01    42.06     .71810339     .81962119
    2     NO      3.8604055    .51726611E-01    74.63     3.7167894     4.0040216

RESIDUAL SUM OF SQUARES .4317308E-02

RESIDUAL STANDARD DEVIATION .3285311E-01
BASED ON DEGREES OF FREEDOM 6 - 2 = 4

APPROXIMATE CONDITION NUMBER 20.87492

Program:

      PROGRAM EXAMPL
C
C     DEMONSTRATE STPLS USING SINGLE PRECISION VERSION OF STARPAC
C
C     N.B. DECLARATION OF XM, PAR AND STP MUST BE CHANGED TO DOUBLE
C          PRECISION IF DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL XM(10,5), PAR(5), STP(5)
      DOUBLE PRECISION DSTAK(200)
C
      EXTERNAL NLSMDL, DERIV
C
      COMMON /CSTAK/ DSTAK
C
C     SET UP INPUT AND OUTPUT FILES
C     [CHAPTER 1, SECTION D.4, DESCRIBES HOW TO CHANGE OUTPUT UNIT.]
C
      CALL IPRINT(IPRT)
      OPEN (UNIT=IPRT, FILE='FILENM')
      OPEN (UNIT=5, FILE='DATA')
C
C     SPECIFY NECESSARY DIMENSIONS
C
      LDSTAK = 200
      IXM = 10
C
C     READ NUMBER OF PARAMETERS
C          STARTING VALUES FOR PARAMETERS
C          NUMBER OF OBSERVATIONS AND NUMBER OF INDEPENDENT VARIABLES
C          INDEPENDENT VARIABLES
C
      READ (5,100) NPAR
      READ (5,101) (PAR(I), I=1,NPAR)
      READ (5,100) N, M
      READ (5,101) ((XM(I,J), I=1,N), J=1,M)
C
C     PRINT TITLE AND CALL STPLS TO SELECT STEP SIZES FOR
C     APPROXIMATING DERIVATIVES
C
      WRITE (IPRT,102)
      CALL STPLS (XM, N, M, IXM, NLSMDL, PAR, NPAR, LDSTAK, STP)
C
C     FORMAT STATEMENTS
C
  100 FORMAT (2I5)
  101 FORMAT (6F6.3)
  102 FORMAT ('1RESULTS OF STARPAC',
     *  ' DERIVATIVE STEP SIZE SELECTION SUBROUTINE STPLS')
      END
      SUBROUTINE NLSMDL (PAR, NPAR, XM, N, M, IXM, PV)
C
C     SUBROUTINE TO COMPUTE PREDICTED VALUES OF DEPENDENT VARIABLE
C
C     N.B. DECLARATION OF PAR, XM AND PV MUST BE CHANGED TO DOUBLE
C          PRECISION IF DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL PAR(NPAR), XM(IXM,M), PV(N)
C
      DO 10 I = 1, N
         PV(I) = PAR(1) * XM(I, 1) ** PAR(2)
   10 CONTINUE
C
      RETURN
      END

Data:

    2
 0.725 4.000
    6    1
 1.309 1.471 1.490 1.565 1.611 1.680

1RESULTS OF STARPAC DERIVATIVE STEP SIZE SELECTION SUBROUTINE STPLS
STARPAC 2.08S (03/15/90)
+**********************************
* DERIVATIVE STEP SIZE SELECTION *
**********************************

                                        STEP SIZE FOR    OBSERVATIONS FAILING STEP SIZE SELECTION CRITERIA
        PARAMETER                       APPROXIMATING              *
            STARTING VALUE     SCALE     DERIVATIVE      COUNT   NOTES   ROW NUMBER(S)
  INDEX         (PAR)         (SCALE)       (STP)                F   C

    1         .72500000       DEFAULT   .46415888E-04      0
    2         4.0000000       DEFAULT   .38782913E-06      0

* NOTES. A PLUS (+) IN THE COLUMNS HEADED F OR C HAS THE FOLLOWING MEANING.

F - NUMBER OF OBSERVATIONS FAILING STEP SIZE SELECTION CRITERIA EXCEEDS
NUMBER OF EXEMPTIONS ALLOWED.

C - HIGH CURVATURE IN THE MODEL IS SUSPECTED AS THE CAUSE OF
ALL FAILURES NOTED.

NUMBER OF RELIABLE DIGITS IN MODEL RESULTS (NETA) 13

PROPORTION OF OBSERVATIONS EXEMPTED FROM SELECTION CRITERIA (EXMPT) .1000

NUMBER OF OBSERVATIONS EXEMPTED FROM SELECTION CRITERIA 1

NUMBER OF OBSERVATIONS (N) 6

Program:

      PROGRAM EXAMPL
C
C     DEMONSTRATE DCKLS USING SINGLE PRECISION VERSION OF STARPAC
C
C     N.B. DECLARATION OF XM AND PAR MUST BE CHANGED TO DOUBLE PRECISION
C          IF DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL XM(10,5), PAR(5)
      DOUBLE PRECISION DSTAK(200)
C
      EXTERNAL NLSMDL, NLSDRV
C
      COMMON /CSTAK/ DSTAK
C
C     SET UP INPUT AND OUTPUT FILES
C     [CHAPTER 1, SECTION D.4, DESCRIBES HOW TO CHANGE OUTPUT UNIT.]
C
      CALL IPRINT(IPRT)
      OPEN (UNIT=IPRT, FILE='FILENM')
      OPEN (UNIT=5, FILE='DATA')
C
C     SPECIFY NECESSARY DIMENSIONS
C
      LDSTAK = 200
      IXM = 10
C
C     READ NUMBER OF PARAMETERS
C          STARTING VALUES FOR PARAMETERS
C          NUMBER OF OBSERVATIONS AND NUMBER OF INDEPENDENT VARIABLES
C          INDEPENDENT VARIABLES
C
      READ (5,100) NPAR
      READ (5,101) (PAR(I), I=1,NPAR)
      READ (5,100) N, M
      READ (5,101) ((XM(I,J), I=1,N), J=1,M)
C
C     PRINT TITLE AND CALL DCKLS TO PERFORM DERIVATIVE CHECKING
C
      WRITE (IPRT,102)
      CALL DCKLS (XM, N, M, IXM, NLSMDL, NLSDRV, PAR, NPAR, LDSTAK)
C
C     FORMAT STATEMENTS
C
  100 FORMAT (2I5)
  101 FORMAT (6F6.3)
  102 FORMAT ('1RESULTS OF STARPAC',
     *  ' DERIVATIVE CHECKING SUBROUTINE DCKLS')
      END
      SUBROUTINE NLSMDL (PAR, NPAR, XM, N, M, IXM, PV)
C
C     SUBROUTINE TO COMPUTE PREDICTED VALUES OF DEPENDENT VARIABLE
C
C     N.B. DECLARATION OF PAR, XM AND PV MUST BE CHANGED TO DOUBLE
C          PRECISION IF DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL PAR(NPAR), XM(IXM,M), PV(N)
C
      DO 10 I = 1, N
         PV(I) = PAR(1) * XM(I, 1) ** PAR(2)
   10 CONTINUE
C
      RETURN
      END
      SUBROUTINE NLSDRV (PAR, NPAR, XM, N, M, IXM, D)
C
C     SUBROUTINE TO COMPUTE THE PARTIAL DERIVATIVE (JACOBIAN) MATRIX
C
C     DERIVATIVE WITH RESPECT TO FIRST PARAMETER HAS BEEN CODED
C     INCORRECTLY TO DEMONSTRATE ERROR DETECTION CAPABILITIES
C
C     N.B. DECLARATION OF PAR, XM AND D MUST BE CHANGED TO DOUBLE
C          PRECISION IF DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL PAR(NPAR), XM(IXM,M), D(N,NPAR)
C
      DO 10 I = 1, N
         D(I,1) = XM(I,1) * PAR(2)
         D(I,2) = PAR(1) * XM(I,1) ** PAR(1) * ALOG(XM(I,1))
   10 CONTINUE
C
      RETURN
      END

Data:

    2
 0.000 4.000
    6    1
 1.309 1.471 1.490 1.565 1.611 1.680

1RESULTS OF STARPAC DERIVATIVE CHECKING SUBROUTINE DCKLS
STARPAC 2.08S (03/15/90)
+***********************
* DERIVATIVE CHECKING *
***********************

                                             *
        PARAMETER                       DERIVATIVE
            STARTING VALUE     SCALE    ASSESSMENT
  INDEX         (PAR)         (SCALE)

    1         0.              DEFAULT   INCORRECT
    2         4.0000000       DEFAULT   QUESTIONABLE (1)

* NUMBERS IN PARENTHESES REFER TO THE FOLLOWING NOTES.

(1) USER-SUPPLIED AND APPROXIMATED DERIVATIVES AGREE, BUT
BOTH ARE ZERO. RECHECK AT ANOTHER ROW.

NUMBER OF RELIABLE DIGITS IN MODEL RESULTS (NETA) 14

NUMBER OF DIGITS IN DERIVATIVE CHECKING AGREEMENT TOLERANCE (NTAU) 4

ROW NUMBER AT WHICH DERIVATIVES WERE CHECKED (NROW) 1
-VALUES OF THE INDEPENDENT VARIABLES AT THIS ROW
INDEX 1
VALUE 1.309000

NUMBER OF OBSERVATIONS (N) 6
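
The two assessments can be traced by hand at the checked row, x(1,1) = 1.309,
with PAR = (0., 4.). This is a reading of the example under the agreement test
of section E.1.c, with figures rounded to four digits; it is not additional
STARPAC output.

   PAR(1): the model is linear in PAR(1), so the correct derivative is
           x**PAR(2) = 1.309**4 = 2.936, and a forward difference quotient
           should reproduce it closely. The coded value is
           x*PAR(2) = 1.309*4 = 5.236. The two differ by about 2.3, far more
           than 10**(-4)*5.236, hence INCORRECT.

   PAR(2): with PAR(1) = 0 the model is identically zero, so both the coded
           derivative and the difference quotient are zero, hence
           QUESTIONABLE (1).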

G. Acknowledgments

The subroutines used to compute the nonlinear least squares solution are
those referenced in Dennis et al. [1981]. The algorithms used to select
optimum step sizes for numerical derivatives, and to check analytic
derivatives were developed by Schnabel [1982]. The printed output for the
nonlinear least squares subroutines has been modeled on the linear least
squares output used by OMNITAB II [Hogben et al., 1971].

----- CHAPTER 10 -----

DIGITAL FILTERING

A. Introduction

STARPAC contains 16 subroutines for digital filtering time series. These
include subroutines which: compute a least squares approximation to an ideal
low-pass filter; perform symmetric linear filter operations; sample values
from a series; perform autoregressive (or difference) filter operations; and
compute the gain function of any symmetric linear filter and the gain and
phase functions of any autoregressive (or difference) filter.

Users are directed to section B for a brief description of the
subroutines. The declaration and CALL statements are given in section C and
the subroutine arguments are defined in section D. The algorithms used and
the output produced by these subroutines are discussed in section E. Sample
programs and their output are shown in section F.

B. Subroutine Descriptions

B.1 Symmetric Linear Filter Subroutines

Subroutine LPCOEF computes symmetric linear low-pass filter coefficients
using a least squares approximation to an ideal low-pass filter that has
convergence factors which reduce overshoot and ripple [Bloomfield, 1976].
This low-pass filter has a transfer function which changes from approximately
one to zero in a transition band about the ideal cutoff frequency FC, that is
from (FC - 1/K) to (FC + 1/K), as discussed in section 6.4 of Bloomfield
[1976]. The user must specify the cutoff frequency in cycles per sample
interval and the number of filter coefficients, which must be odd. The user
must also choose the number of filter terms, K, so that (FC - 1/K) > 0 and (FC
+ 1/K) < 0.5. In addition, K must be chosen as a compromise between:

1) A sharp cutoff, that is, 1/K small; and

2) Minimizing the number of data points lost by the filtering operations
((K-1)/2 data points will be lost from each end of the series).

The subroutine returns the normalized low-pass filter coefficients. There is
no printed output.
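
As an illustration (the values of FC and K here are assumed for the example
and are not taken from the STARPAC sample programs), suppose FC = 0.1 cycles
per sample interval. The condition (FC - 1/K) > 0 requires K > 10, and the
condition (FC + 1/K) < 0.5 requires only K > 2.5, so the smallest admissible
odd value is K = 11. With K = 11 the transition band runs from about 0.009 to
about 0.191 and (K-1)/2 = 5 data points are lost from each end of the series.
With K = 41 the transition band narrows to about 0.076 to 0.124, but 20 data
points are lost from each end.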

For any low-pass filter there is a corresponding high-pass filter
equivalent to subtracting the low-pass filtered series from the original
series. Subroutine HPCOEF returns symmetric linear high-pass filter
coefficients computed from user supplied low-pass symmetric linear filter
coefficients. The number of filter coefficients must be odd. There is no
printed output.

<10-1>

Subroutine MAFLT performs a simple moving average filter operation on the
input series using the simple moving average filter defined by

HMA(J) = 1/K for J = 1, ..., K.

The user must specify the number of filter coefficients, K, which must be odd;
the subroutine returns the filtered series and the number of observations in
the filtered series. There is no printed output.

Subroutine SLFLT performs a symmetric linear filter operation with user
supplied coefficients and returns the filtered series to the user. The filter
coefficients must be normalized on input to SLFLT. The filtered series and
the number of observations in the filtered series are returned. There is no
printed output.

Subroutine SAMPLE samples every NSth observation from an input series.
If the input series was obtained using an NS term low-pass filter, this
sampling rate removes the autocorrelation introduced by the filtering
operation. This subroutine returns the series of sampled observations and the
number of observations in the series. There is no printed output.

Subroutine LOPASS computes low-pass filter coefficients as described for
subroutine LPCOEF and then performs the filtering operation described for
subroutine SLFLT. The user must specify the cutoff frequency in cycles per
sample interval and the number of filter terms, which must be odd. The
subroutine returns the normalized filter coefficients, the filtered series and
the number of observations in the filtered series. There is no printed
output.

Subroutine HIPASS computes the high-pass filter coefficients equivalent
to using HPCOEF with the input low-pass filter coefficients supplied by LPCOEF
and performs the filtering operation described for subroutine SLFLT. The user
must specify the cutoff frequency in cycles per sample interval and the number
of filter terms, which must be odd. The subroutine returns the filter
coefficients, the filtered series and the number of observations in the
filtered series. There is no printed output.

B.2 Autoregressive or Difference Linear Filter Subroutines

Subroutine ARFLT subtracts the series mean from each observation and
performs an autoregressive linear filter operation with user supplied filter
coefficients. This subroutine returns the filtered series and the number of
observations in the filtered series. There is no printed output.

Subroutine DIF performs a first difference filter operation on the input
series. It returns the differenced series and the number of observations in
the differenced series. There is no printed output.

Subroutine DIFC performs a user controlled differencing operation. It
returns the order of the difference filter specified, the coefficients of the
difference filter, the differenced series and the number of observations in
the differenced series. This subroutine can be used as a high-pass filter or
for differencing series in the style of Box and Jenkins [1976]. There is no
printed output.

<10-2>
Subroutines DIFM and DIFMC are the same as DIF and DIFC, respectively,
except that the input series may contain missing data. A missing value code
must be used within the input series to specify time points without an
observed value. The difference between a missing value and an observed value,
or between two missing values, will result in a missing value in the
differenced series; the missing value code used in the differenced series is
also returned to the user. Users should note that the number of missing
values may be significantly increased by the differencing operation.

B.3 Gain and Phase Function Subroutines

Subroutine GFSLF computes the gain function of a symmetric linear filter.
The printed output consists of a plot of the gain function versus frequency.

Subroutine GFSLFS is the same as GFSLF but allows the user to specify
various options which are preset in GFSLF, including the frequency range, the
number of frequencies for which the gain function is to be computed and the
type of plot to be used. In addition, the gain function is returned to the
user, permitting the use of other methods of displaying the results.

Subroutine GFARF computes the gain and phase functions of either
autoregressive or difference filters. The output consists of a plot of the
gain and phase functions versus frequency.

Subroutine GFARFS is the same as GFARF but provides the user with the
same options as are available for subroutine GFSLFS.

C. Subroutine Declaration and CALL Statements

Subroutines Supporting Symmetric Linear Filter Operations

LPCOEF: Compute symmetric linear low-pass filter coefficients; return filter
coefficients (no printed output)

HLP(k)
:
:
CALL LPCOEF (FC, K, HLP)

===

HPCOEF: Compute symmetric linear high-pass filter coefficients; return
coefficients (no printed output)

HLP(k), HHP(k)
:
:
CALL HPCOEF (HLP, K, HHP)

===

<10-3>
MAFLT: Perform simple moving average; return filtered series (no printed
output)

Y(n), YF(n)
:
:
CALL MAFLT (Y, N, K, YF, NYF)

===

SLFLT: Perform symmetric linear filter operation with user-supplied filter
coefficients; return filtered series (no printed output)

Y(n), H(k), YF(n)
:
:
CALL SLFLT (Y, N, K, H, YF, NYF)

===

SAMPLE: Sample (extract) every NSth observation from a series; return
sampled series (no printed output)

Y(n), YS(n)
:
:
CALL SAMPLE (Y, N, NS, YS, NYS)

===

LOPASS: Filter series with symmetric linear low-pass filter; return filtered
series (no printed output)

Y(n), HLP(k), YF(n)
:
:
CALL LOPASS (Y, N, FC, K, HLP, YF, NYF)

===

HIPASS: Filter series with symmetric linear high-pass filter; return
filtered series (no printed output)

Y(n), HHP(k), YF(n)
:
:
CALL HIPASS (Y, N, FC, K, HHP, YF, NYF)

===

<10-4>
Subroutines for Autoregressive or Difference Linear Filters

ARFLT: Perform autoregressive filter operation with user-supplied filter
coefficients; return filtered series (no printed output)

Y(n), PHI(iar), YF(n)
:
:
CALL ARFLT (Y, N, IAR, PHI, YF, NYF)

===

DIF: Perform first-difference filter operation; return differenced series
(no printed output)

Y(n), YF(n)
:
:
CALL DIF (Y, N, YF, NYF)

===

DIFC: Perform user-specified difference filter operation; return
differenced series (no printed output)

INTEGER ND(nfac), IOD(nfac)
Y(n), YF(n), PHI(iar)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL DIFC (Y, N,
+ NFAC, ND, IOD, IAR, PHI, LPHI,
+ YF, NYF, LDSTAK)

===

DIFM: Perform first-difference filter operation on series with missing
data; return differenced series (no printed output)

Y(n), YF(n)
:
:
CALL DIFM (Y, YMISS, N, YF, YFMISS, NYF)

===

<10-5>
DIFMC: Perform user-specified difference filter operation on series with
missing data; return differenced series (no printed output)

INTEGER ND(nfac), IOD(nfac)
Y(n), YF(n), PHI(iar)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL DIFMC (Y, YMISS, N,
+ NFAC, ND, IOD, IAR, PHI, LPHI,
+ YF, YFMISS, NYF, LDSTAK)

===

Subroutines for Computing the Gain and Phase Functions

GFSLF: Compute and plot gain function of symmetric linear filter

H(k)
:
:
CALL GFSLF (H, K)

===

GFSLFS: Compute and optionally plot gain function of symmetric linear filter
with user-supplied control values; return gain function values and
corresponding frequency values

H(k), GAIN(nf), FREQ(nf)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL GFSLFS (H, K,
+ NF, FMIN, FMAX, GAIN, FREQ, NPRT, LDSTAK)

===

GFARF: Compute and plot gain and phase functions of autoregressive or
difference filter

PHI(iar)
:
:
CALL GFARF (PHI, IAR)

===

<10-6>
GFARFS: Compute and optionally plot gain and phase functions of
autoregressive or difference filter; with user-supplied control
values; return gain and phase function values and corresponding
frequency values

PHI(iar), GAIN(nf), PHAS(nf), FREQ(nf)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL GFARFS (PHI, IAR,
+ NF, FMIN, FMAX, GAIN, PHAS, FREQ, NPRT, LDSTAK)

===

D. Dictionary of Subroutine Arguments and COMMON Variables

DSTAK ... The DOUBLE PRECISION vector in COMMON /CSTAK/ of dimension at
least LDSTAK. DSTAK provides workspace for the computations. The
first LDSTAK locations of DSTAK are overwritten during subroutine
execution.

FC --> The cutoff frequency for the filter, in cycles per sample
interval. FC must lie between 0.0 and 0.5.

FMAX --> The maximum frequency, in cycles per sample interval, at which the
gain and phase functions are computed (0.0 <= FMIN < FMAX <= 0.5).
The default value is 0.5. If FMAX is outside the range FMIN to
0.5 or is not an argument in the CALL statement the default value
is used.

FMIN --> The minimum frequency, in cycles per sample interval, at which the
gain and phase functions are computed (0.0 <= FMIN < FMAX <= 0.5).
The default value is 0.0. If FMIN is outside the range 0.0 to
FMAX or is not an argument in the CALL statement the default value
is used.

FREQ <-- The vector of dimension at least NF containing the NF frequencies
at which the gain and phase functions are computed.

<10-7>
GAIN <-- The vector of dimension at least NF containing the NF gain
function values over the range FMIN to FMAX.

For symmetric linear filters:

The gain function of a symmetric linear filter is

GAIN(I) = | SUM(J=1,...,K) H(J) * cos[2*pi*|Km-J|*(FMIN+DELi)] |

for I = 1, ..., NF, where

Km is the midpoint of the symmetric filter, Km = (K+1)/2, and

DELi is the frequency increment, defined as

DELi = (I-1)*(FMAX-FMIN)/(NF-1)

There is no phase change in a symmetric linear filter.

For autoregressive (or difference) filters:

The gain and phase functions of an autoregressive (or
difference) filter are

GAIN(I) = | 1 - SUM(J=1,...,IAR) PHI(J) * cos[2*pi*J*(FMIN+DELi)]

            - i * SUM(J=1,...,IAR) PHI(J) * sin[2*pi*J*(FMIN+DELi)] |

and

PHAS(I) = Arctan{ ( SUM(J=1,...,IAR) PHI(J) * sin[2*pi*J*(FMIN+DELi)] ) /

                  ( 1 - SUM(J=1,...,IAR) PHI(J) * cos[2*pi*J*(FMIN+DELi)] ) }

for I = 1, 2, ..., NF, where

i is the complex value sqrt(-1); and

DELi is the frequency increment, defined as

DELi = (I-1)*(FMAX-FMIN)/(NF-1).
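
As a worked illustration of the autoregressive case, consider the first
difference filter, for which IAR = 1 and PHI(1) = 1. Writing f for the
frequency FMIN+DELi, the gain formula above reduces to

GAIN(I) = | 1 - cos[2*pi*f] - i*sin[2*pi*f] |

        = sqrt( (1 - cos[2*pi*f])**2 + sin[2*pi*f]**2 )

        = sqrt( 2 - 2*cos[2*pi*f] )

        = 2*|sin[pi*f]|,

so the gain is zero at f = 0, sqrt(2) (about 1.414) at f = 0.25, and reaches
its maximum of 2 at f = 0.5. A first difference therefore acts as a
high-pass filter.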

H --> The vector of dimension at least K containing filter coefficients,
which must be symmetric about H[(K+1)/2].

<10-8>
HHP <-- The vector of dimension at least K containing the K high-pass
filter coefficients, which are symmetric about HHP[(K+1)/2]. The
high-pass filter coefficients are computed from the low-pass
coefficients by

HHP(J) = 1 - HLP(J) / SUM(L=1,...,K) HLP(L) for J = Km,

HHP(J) = - HLP(J) / SUM(L=1,...,K) HLP(L) for J = 1, ..., Km-1, Km+1, ..., K,

where Km is the midpoint of the filter, Km = (K+1)/2.
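
As a small illustration (the coefficient values are assumed for the
example), take K = 3 and HLP = (0.25, 0.50, 0.25), so that Km = 2 and the
low-pass coefficients sum to one. Then HHP(2) = 1 - 0.50 = 0.50 and
HHP(1) = HHP(3) = -0.25. The high-pass coefficients sum to zero, and adding
the low-pass and high-pass filtered series recovers the original series.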

HLP --- The vector of dimension at least K containing the K low-pass
filter coefficients, which must be symmetric about HLP[(K+1)/2].
HLP must be input to HPCOEF; it is returned by LPCOEF and LOPASS.

For LPCOEF and LOPASS, HLP is defined by

HLP(J) = hJ / SUM(I=1,...,K) hI for J = 1, ..., K,

where

hJ is computed by

hJ = 2*FC for J = Km,

hJ = ( sin[2*pi*|Km-J|*FC] / (2*pi*|Km-J|) )
     * ( sin[2*pi*|Km-J|/K] / (2*pi*|Km-J|/K) )
                              for J = 1, ..., Km-1, Km+1, ..., K,

with Km the midpoint of the filter, Km = (K+1)/2.

This low-pass filter has a transfer function which changes from
approximately one to zero in a transition band about the ideal
cutoff frequency FC, that is from (FC - 1/K) to (FC + 1/K), as
shown in figure 6.7 of Bloomfield [1976].

IAR --- The number of coefficients in the autoregressive (or difference)
filter, including zero coefficients. Equivalently, IAR is the
maximum order of the backward shift operator. IAR must be input
to ARFLT, GFARF and GFARFS; it is returned by DIFC and DIFMC. IAR
is defined by

IAR = SUM(J=1,...,NFAC) IOD(J) * ND(J).

<10-9>
IERR ... An error flag returned in COMMON /ERRCHK/ [see chapter 1,
section D.5]. Note that using (or not using) the error flag will
not affect the printed error messages that are automatically
provided.

IERR = 0 indicates that no errors were detected.

IERR = 1 indicates that improper input was detected.

IOD --> The vector of dimension at least NFAC containing the NFAC values
designating the order of each difference factor.

K --> The number of coefficients in the symmetric linear filter. K must
be odd. For LPCOEF, LOPASS, and HIPASS, the user must choose the
number of filter terms, K, so that (FC - 1/K) > 0 and
(FC + 1/K) < 0.5. In addition, K must be chosen as a compromise
between:

1) A sharp cutoff, that is, 1/K small; and

2) Minimizing the number of data points lost by the filtering
operations. ((K-1)/2 data points will be lost from each end of
the series.)

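LDSTAK --> The length of the DOUBLE PRECISION workspace vector DSTAK.
LDSTAK must equal or exceed the appropriate value given below.

For DIFC and DIFMC:

LDSTAK >= 7 + 2 * SUM(J=1,...,NFAC) ND(J)*IOD(J)*P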

For GFSLFS and GFARFS:

LDSTAK >= (11+IO*(9+NF))/2 + IO*2*NF*P

where IO = 0 if NPRT = 0, and IO = 1 if NPRT <> 0.

LPHI --> The length of the vector PHI. LPHI must equal or exceed IAR.

N --> The number of observations in the time series. The minimum number
of observations is three.

ND --> The vector of dimension at least NFAC containing the NFAC values
designating the number of times each difference factor is to be
applied.

NF --> The number of frequencies at which the gain and phase functions
are to be computed. The default value is 101. If NF is not an
argument of the subroutine CALL statement the default value is
used.

NFAC --> The number of difference factors.

<10-10>
NPRT --> The argument controlling printed output.

If NPRT < 0, the output consists of a plot of the gain function
versus frequency, where the gain function is
expressed in decibels and is adjusted so that the
peak is at zero. For GFARFS only the output also
includes a plot of the phase function versus
frequency.

If NPRT = 0, the automatic printout is suppressed.

If NPRT > 0, the output consists of a log-linear plot of the gain
function versus frequency. For GFARFS only, the
output also includes a plot of the phase function
versus frequency.

The default value is -1. If NPRT is not an argument of the
subroutine CALL statement the default value is used.

NS --> The sample rate, 1 <= NS <= N.

NYF <-- The number of observations in YF.

NYS <-- The number of observations in YS.

PHAS <-- The vector of dimension at least NF containing the NF phase
function values over the range FMIN to FMAX.

PHI --- The vector of dimension at least IAR containing IAR autoregressive
or difference filter coefficients. PHI must be input to ARFLT,
GFARF, and GFARFS; it is returned by DIFC and DIFMC.

For DIFC and DIFMC the difference filter coefficients are obtained
by expanding the difference operator

PRODUCT(J=1,...,NFAC) (1-B[IOD(J)])**ND(J) =

1 - PHI(1)*B[1] - PHI(2)*B[2] - ... - PHI(IAR)*B[IAR]

where

B[i] is the backward shift operator, defined by

B[i]*y(t) = y(t-i);

PHI(i) is the ith difference filter coefficient, which will be a
positive or negative integer if the ith order backward
shift operator B[i] is used, and zero if the ith order
backward shift operator is unused.
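
As an illustration (the choice of factors is assumed for the example),
take two difference factors, a first difference and a difference at lag
12, each applied once: NFAC = 2, IOD = (1, 12) and ND = (1, 1). Expanding
the operator gives

(1-B[1]) * (1-B[12]) = 1 - B[1] - B[12] + B[13],

so that IAR = 1*1 + 12*1 = 13, PHI(1) = 1, PHI(12) = 1, PHI(13) = -1, and
PHI(2) through PHI(11) are zero. LPHI must therefore be at least 13, and a
series of N = 144 observations yields NYF = N - IAR = 131 differenced
values.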

Y --> The vector of dimension at least N containing the N observations
of the time series.

<10-11>
YF <-- The vector of dimension at least N containing the NYF values of
the filtered series. The filtered series will start in
YF(1); YF(NYF+1) through YF(N) will be set to zero.

For symmetric linear filters:

The filtered series obtained by applying a moving average filter
is defined by

YF(I) = SUM(J=1,...,K) H(J) * Y(I+K-J) for I = 1, ..., NYF,

where

NYF is the number of values in the filtered series,
NYF = N - (K-1), reflecting the (K-1)/2 data points lost
from each end of the original series by the filtering
operation.

For autoregressive or difference filters:

The filtered series obtained using an autoregressive (or
difference) filter is computed by

YF(I) = Z(I+IAR) - SUM(J=1,...,IAR) PHI(J)*Z(I+IAR-J) for I = 1, ..., NYF,

where

Z is the N observation time series being filtered which, for
ARFLT is the input series Y minus its mean and, for DIF,
DIFC, DIFM and DIFMC is the input series Y;

NYF is the number of observations in the filtered series,
NYF = N-IAR, reflecting the IAR data points lost from the
beginning of the original series by the filtering operation.
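
The autoregressive (or difference) formula above can be checked by hand
with a short program. The following is a sketch only: it evaluates the
formula directly rather than calling STARPAC, and the program name ARSKCH,
the work variable S and the data values are assumptions made for the
illustration. The series Z is used as given (as for DIF); ARFLT would first
subtract the series mean.

Program:

      PROGRAM ARSKCH
C
C     SKETCH ONLY.  EVALUATES THE AUTOREGRESSIVE (OR DIFFERENCE)
C     FILTER FORMULA DIRECTLY, WITHOUT CALLING STARPAC.
C     THE NAMES ARSKCH AND S AND THE DATA ARE ILLUSTRATIVE.
C
      INTEGER N, IAR
      PARAMETER (N = 5, IAR = 1)
      INTEGER I, J, NYF
      REAL Z(N), PHI(IAR), YF(N), S
      DATA Z /1.0, 4.0, 9.0, 16.0, 25.0/
      DATA PHI /1.0/
C
C     IAR POINTS ARE LOST FROM THE BEGINNING OF THE SERIES
C
      NYF = N - IAR
      DO 20 I = 1, NYF
         S = 0.0
         DO 10 J = 1, IAR
            S = S + PHI(J)*Z(I+IAR-J)
   10    CONTINUE
         YF(I) = Z(I+IAR) - S
   20 CONTINUE
C
C     ZERO THE UNUSED LOCATIONS YF(NYF+1) THROUGH YF(N)
C
      DO 30 I = NYF+1, N
         YF(I) = 0.0
   30 CONTINUE
      WRITE (*,*) NYF, (YF(I), I=1,NYF)
      END

With PHI(1) = 1.0 and IAR = 1 the filter is a first difference, so the
program prints NYF = 4 followed by the differences 3, 5, 7 and 9.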

YFMISS <-- The missing value code used within the filtered series YF to
indicate that a value could not be computed due to missing data.

YMISS --> The missing value code used within the input series Y to indicate
that an observation is missing.

YS <-- The vector of dimension at least N containing the series formed by
sampling every NSth element of Y,

YS(J) = Y((J-1)*NS + 1) for J = 1, ..., NYS,

where NYS, the number of observations in the sampled series, is
returned by subroutine SAMPLE. The series will start in YS(1);
YS(NYS+1) through YS(N) will be set to zero.

<10-12>
E. Computational Methods

E.1 Algorithms

The code for computing the low-pass filter coefficients is based on
subroutine LOPASS, listed on page 149 of Bloomfield [1976]. The transforms
used to compute the gain function of symmetric filters and the gain and phase
functions of autoregressive (or difference) filters are based on the algorithms
shown on pages 311 and 420, respectively, of Jenkins and Watts [1968].

E.2 Computed Results and Printed Output

Except for the gain and phase function subroutines, STARPAC digital
filtering subroutines do not produce printed output. For the gain and phase
function subroutines, the argument controlling the printed results is NPRT and
is discussed in section D; the output from the gain and phase function
subroutines consists of line printer plots of the gain and phase function of
the input filter.

F. Examples

In the example program below, DIF is used to filter the input series Y;
VP (documented in chapter 2) is used to display the log of the original series
and the differenced series; and GFARF is used to plot the gain and phase
functions of the first difference filter. The data used are the natural
logarithm of Series G, the airline data, listed on page 531 of Box and Jenkins
[1976]. The formulas for the gain and phase functions of a first difference
filter and a plot of the corresponding gain function are shown on pages 296
and 9, respectively, of Jenkins and Watts [1968].

<10-13>
Program:

PROGRAM EXAMPL
C
C DEMONSTRATE DIF AND GFARF USING SINGLE PRECISION VERSION OF
C STARPAC
C
C N.B. DECLARATION OF Y, YF AND PHI MUST BE CHANGED TO DOUBLE
C PRECISION IF DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
REAL Y(200), YF(200), PHI(5)
C
C SET UP INPUT AND OUTPUT FILES
C [CHAPTER 1, SECTION D.4, DESCRIBES HOW TO CHANGE OUTPUT UNIT.]
C
CALL IPRINT(IPRT)
OPEN (UNIT=IPRT, FILE='FILENM')
OPEN (UNIT=5, FILE='DATA')
C
C READ NUMBER OF OBSERVATIONS
C OBSERVED SERIES
C
READ (5,100) N
READ (5,101) (Y(I), I=1,N)
C
C COMPUTE LOG OF DATA
C
DO 10 I = 1, N
Y(I) = ALOG(Y(I))
10 CONTINUE
C
C CALL DIF TO PERFORM DIFFERENCING OPERATION
C
CALL DIF (Y, N, YF, NYF)
C
C PRINT TITLE AND CALL VP TO DISPLAY LOG OF ORIGINAL SERIES
C
WRITE (IPRT,102)
CALL VP (Y, N, 1)
C
C PRINT TITLE AND CALL VP TO DISPLAY DIFFERENCED SERIES
C
WRITE (IPRT,103)
CALL VP (YF, NYF, 1)
C
C SET PARAMETERS FOR FIRST DIFFERENCE FILTER
C
PHI(1) = 1.0
IAR = 1
C
C PRINT TITLE AND CALL GFARF TO COMPUTE GAIN AND PHASE OF
C FIRST DIFFERENCE FILTER
C
WRITE (IPRT,104)
CALL GFARF (PHI, IAR)
C
C FORMAT STATEMENTS
C

<10-14>
100 FORMAT (4I5)
101 FORMAT (12F6.1)
102 FORMAT ('1LOG OF ORIGINAL SERIES DISPLAYED WITH STARPAC PLOT',
1 ' SUBROUTINE VP')
103 FORMAT ('1RESULTS OF STARPAC FIRST DIFFERENCE DIGITAL FILTERING',
1 ' SUBROUTINE DIF DISPLAYED WITH STARPAC PLOT SUBROUTINE VP')
104 FORMAT ('1RESULTS OF STARPAC',
* ' GAIN AND PHASE FUNCTION SUBROUTINE GFARF')
END

Data:

144
112.0 118.0 132.0 129.0 121.0 135.0 148.0 148.0 136.0 119.0 104.0 118.0
115.0 126.0 141.0 135.0 125.0 149.0 170.0 170.0 158.0 133.0 114.0 140.0
145.0 150.0 178.0 163.0 172.0 178.0 199.0 199.0 184.0 162.0 146.0 166.0
171.0 180.0 193.0 181.0 183.0 218.0 230.0 242.0 209.0 191.0 172.0 194.0
196.0 196.0 236.0 235.0 229.0 243.0 264.0 272.0 237.0 211.0 180.0 201.0
204.0 188.0 235.0 227.0 234.0 264.0 302.0 293.0 259.0 229.0 203.0 229.0
242.0 233.0 267.0 269.0 270.0 315.0 364.0 347.0 312.0 274.0 237.0 278.0
284.0 277.0 317.0 313.0 318.0 374.0 413.0 405.0 355.0 306.0 271.0 306.0
315.0 301.0 356.0 348.0 355.0 422.0 465.0 467.0 404.0 347.0 305.0 336.0
340.0 318.0 362.0 348.0 363.0 435.0 491.0 505.0 404.0 359.0 310.0 337.0
360.0 342.0 406.0 396.0 420.0 472.0 548.0 559.0 463.0 407.0 362.0 405.0
417.0 391.0 419.0 461.0 472.0 535.0 622.0 606.0 508.0 461.0 390.0 432.0

<10-15>
1LOG OF ORIGINAL SERIES DISPLAYED WITH STARPAC PLOT SUBROUTINE VP
STARPAC 2.08S (03/15/90)
4.6444 4.8232 5.0021 5.1810 5.3598 5.5387 5.7175 5.8964 6.0752 6.2541 6.4329
-I---------I---------I---------I---------I---------I---------I---------I---------I---------I---------I-
1.0000 I + I 4.7185
2.0000 I + I 4.7707
3.0000 I + I 4.8828
4.0000 I + I 4.8598
5.0000 I + I 4.7958
6.0000 I + I 4.9053
7.0000 I + I 4.9972
8.0000 I + I 4.9972
9.0000 I + I 4.9127
10.000 I + I 4.7791
11.000 I+ I 4.6444
12.000 I + I 4.7707
13.000 I + I 4.7449
14.000 I + I 4.8363
15.000 I + I 4.9488
16.000 I + I 4.9053
17.000 I + I 4.8283
18.000 I + I 5.0039
19.000 I + I 5.1358
20.000 I + I 5.1358
21.000 I + I 5.0626
22.000 I + I 4.8903
23.000 I + I 4.7362
24.000 I + I 4.9416
25.000 I + I 4.9767
26.000 I + I 5.0106
27.000 I + I 5.1818
28.000 I + I 5.0938
29.000 I + I 5.1475
30.000 I + I 5.1818
31.000 I + I 5.2933
32.000 I + I 5.2933
33.000 I + I 5.2149
34.000 I + I 5.0876
35.000 I + I 4.9836
36.000 I + I 5.1120
37.000 I + I 5.1417
38.000 I + I 5.1930
39.000 I + I 5.2627
40.000 I + I 5.1985
41.000 I + I 5.2095
42.000 I + I 5.3845
43.000 I + I 5.4381
44.000 I + I 5.4889
45.000 I + I 5.3423
46.000 I + I 5.2523
47.000 I + I 5.1475
48.000 I + I 5.2679
49.000 I + I 5.2781
50.000 I + I 5.2781
51.000 I + I 5.4638
52.000 I + I 5.4596
53.000 I + I 5.4337
54.000 I + I 5.4931
55.000 I + I 5.5759
<10-16>
56.000 I + I 5.6058
57.000 I + I 5.4681
58.000 I + I 5.3519
59.000 I + I 5.1930
60.000 I + I 5.3033
61.000 I + I 5.3181
62.000 I + I 5.2364
63.000 I + I 5.4596
64.000 I + I 5.4250
65.000 I + I 5.4553
66.000 I + I 5.5759
67.000 I + I 5.7104
68.000 I + I 5.6802
69.000 I + I 5.5568
70.000 I + I 5.4337
71.000 I + I 5.3132
72.000 I + I 5.4337
73.000 I + I 5.4889
74.000 I + I 5.4510
75.000 I + I 5.5872
76.000 I + I 5.5947
77.000 I + I 5.5984
78.000 I + I 5.7526
79.000 I + I 5.8972
80.000 I + I 5.8493
81.000 I + I 5.7430
82.000 I + I 5.6131
83.000 I + I 5.4681
84.000 I + I 5.6276
85.000 I + I 5.6490
86.000 I + I 5.6240
87.000 I + I 5.7589
88.000 I + I 5.7462
89.000 I + I 5.7621
90.000 I + I 5.9243
91.000 I + I 6.0234
92.000 I + I 6.0039
93.000 I + I 5.8721
94.000 I + I 5.7236
95.000 I + I 5.6021
96.000 I + I 5.7236
97.000 I + I 5.7526
98.000 I + I 5.7071
99.000 I + I 5.8749
100.00 I + I 5.8522
101.00 I + I 5.8721
102.00 I + I 6.0450
103.00 I + I 6.1420
104.00 I + I 6.1463
105.00 I + I 6.0014
106.00 I + I 5.8493
107.00 I + I 5.7203
108.00 I + I 5.8171
109.00 I + I 5.8289
110.00 I + I 5.7621
111.00 I + I 5.8916
112.00 I + I 5.8522
113.00 I + I 5.8944
114.00 I + I 6.0753
<10-17>
115.00 I + I 6.1964
116.00 I + I 6.2246
117.00 I + I 6.0014
118.00 I + I 5.8833
119.00 I + I 5.7366
120.00 I + I 5.8201
121.00 I + I 5.8861
122.00 I + I 5.8348
123.00 I + I 6.0064
124.00 I + I 5.9814
125.00 I + I 6.0403
126.00 I + I 6.1570
127.00 I + I 6.3063
128.00 I + I 6.3261
129.00 I + I 6.1377
130.00 I + I 6.0088
131.00 I + I 5.8916
132.00 I + I 6.0039
133.00 I + I 6.0331
134.00 I + I 5.9687
135.00 I + I 6.0379
136.00 I + I 6.1334
137.00 I + I 6.1570
138.00 I + I 6.2823
139.00 I +I 6.4329
140.00 I + I 6.4069
141.00 I + I 6.2305
142.00 I + I 6.1334
143.00 I + I 5.9661
144.00 I + I 6.0684

<10-18>
1RESULTS OF STARPAC FIRST DIFFERENCE DIGITAL FILTERING SUBROUTINE DIF DISPLAYED WITH STARPAC PLOT SUBROUTINE VP
STARPAC 2.08S (03/15/90)
-.2231 -.1785 -.1339 -.0893 -.0446 .0000 .0446 .0893 .1339 .1785 .2231
-I---------I---------I---------I---------I---------I---------I---------I---------I---------I---------I-
1.0000 I + I .52186E-01
2.0000 I + I .11212
3.0000 I + I -.22990E-01
4.0000 I + I -.64022E-01
5.0000 I + I .10948
6.0000 I + I .91937E-01
7.0000 I + I 0.
8.0000 I + I -.84557E-01
9.0000 I + I -.13353
10.000 I + I -.13473
11.000 I + I .12629
12.000 I + I -.25752E-01
13.000 I + I .91350E-01
14.000 I + I .11248
15.000 I + I -.43485E-01
16.000 I + I -.76961E-01
17.000 I + I .17563
18.000 I + I .13185
19.000 I + I 0.
20.000 I + I -.73203E-01
21.000 I + I -.17225
22.000 I + I -.15415
23.000 I + I .20544
24.000 I + I .35091E-01
25.000 I + I .33902E-01
26.000 I + I .17115
27.000 I + I -.88033E-01
28.000 I + I .53744E-01
29.000 I + I .34289E-01
30.000 I + I .11152
31.000 I + I 0.
32.000 I + I -.78369E-01
33.000 I + I -.12734
34.000 I + I -.10399
35.000 I + I .12838
36.000 I + I .29676E-01
37.000 I + I .51293E-01
38.000 I + I .69733E-01
39.000 I + I -.64193E-01
40.000 I + I .10989E-01
41.000 I + I .17501
42.000 I + I .53584E-01
43.000 I + I .50858E-01
44.000 I + I -.14660
45.000 I + I -.90061E-01
46.000 I + I -.10478
47.000 I + I .12036
48.000 I + I .10257E-01
49.000 I + I 0.
50.000 I + I .18572
51.000 I + I -.42463E-02
52.000 I + I -.25864E-01
53.000 I + I .59339E-01
54.000 I + I .82888E-01
55.000 I + I .29853E-01
<10-19>
56.000 I + I -.13774
57.000 I + I -.11620
58.000 I + I -.15890
59.000 I + I .11035
60.000 I + I .14815E-01
61.000 I + I -.81678E-01
62.000 I +I .22314
63.000 I + I -.34635E-01
64.000 I + I .30371E-01
65.000 I + I .12063
66.000 I + I .13448
67.000 I + I -.30254E-01
68.000 I + I -.12334
69.000 I + I -.12311
70.000 I + I -.12052
71.000 I + I .12052
72.000 I + I .55216E-01
73.000 I + I -.37899E-01
74.000 I + I .13621
75.000 I + I .74627E-02
76.000 I + I .37106E-02
77.000 I + I .15415
78.000 I + I .14458
79.000 I + I -.47829E-01
80.000 I + I -.10632
81.000 I + I -.12988
82.000 I + I -.14507
83.000 I + I .15956
84.000 I + I .21353E-01
85.000 I + I -.24957E-01
86.000 I + I .13488
87.000 I + I -.12699E-01
88.000 I + I .15848E-01
89.000 I + I .16220
90.000 I + I .99192E-01
91.000 I + I -.19561E-01
92.000 I + I -.13177
93.000 I + I -.14853
94.000 I + I -.12147
95.000 I + I .12147
96.000 I + I .28988E-01
97.000 I + I -.45462E-01
98.000 I + I .16782
99.000 I + I -.22728E-01
100.00 I + I .19915E-01
101.00 I + I .17289
102.00 I + I .97032E-01
103.00 I + I .42919E-02
104.00 I + I -.14491
105.00 I + I -.15209
106.00 I + I -.12901
107.00 I + I .96799E-01
108.00 I + I .11834E-01
109.00 I + I -.66894E-01
110.00 I + I .12959
111.00 I + I -.39442E-01
112.00 I + I .42200E-01
113.00 I + I .18094
114.00 I + I .12110
<10-20>
115.00 I + I .28114E-01
116.00 I+ I -.22314
117.00 I + I -.11809
118.00 I + I -.14675
119.00 I + I .83511E-01
120.00 I + I .66021E-01
121.00 I + I -.51293E-01
122.00 I + I .17154
123.00 I + I -.24939E-01
124.00 I + I .58841E-01
125.00 I + I .11672
126.00 I + I .14930
127.00 I + I .19874E-01
128.00 I + I -.18842
129.00 I + I -.12891
130.00 I + I -.11717
131.00 I + I .11224
132.00 I + I .29199E-01
133.00 I + I -.64379E-01
134.00 I + I .69163E-01
135.00 I + I .95527E-01
136.00 I + I .23581E-01
137.00 I + I .12529
138.00 I + I .15067
139.00 I + I -.26060E-01
140.00 I + I -.17640
141.00 I + I -.97083E-01
142.00 I + I -.16725
143.00 I + I .10228

<10-21>
1RESULTS OF STARPAC GAIN AND PHASE FUNCTION SUBROUTINE GFARF
STARPAC 2.08S (03/15/90)
GAIN FUNCTION OF 1 TERM AUTOREGRESSIVE, OR DIFFERENCE, FILTER
[Line printer plot of the gain function versus frequency. Vertical axis:
gain, labeled from .0000 at the top to -18.0390 at the bottom in steps of
1.8039. Horizontal axis: FREQ, from .0000 to .5000 in steps of .0500, with
the corresponding PERIOD running from INF to 2. The plotted gain rises from
the bottom of the plot at the lowest frequencies to .0000 at FREQ = .5000.]
<10-22>
1
STARPAC 2.08S (03/15/90)
PHASE FUNCTION OF 1 TERM AUTOREGRESSIVE, OR DIFFERENCE, FILTER
[Line printer plot of the phase function versus frequency. Vertical axis:
phase, labeled from 3.1416 at the top to -3.1416 at the bottom in steps of
.6283. Horizontal axis: FREQ, from .0000 to .5000 in steps of .0500, with
the corresponding PERIOD running from INF to 2. All plotted points lie
between .0000 and the rows just below -1.2566.]
<10-23>
G. Acknowledgments

The code for computing the low-pass filter coefficients is based on
subroutine LOPASS, listed on page 149 of Bloomfield [1976]. The transforms
used to compute the gain function of symmetric filters and the gain and phase
functions of autoregressive (or difference) filters are based on the
algorithms shown on pages 311 and 420, respectively, of Jenkins and Watts
[1968].

<10-24>
----- CHAPTER 11 -----

COMPLEX DEMODULATION

A. Introduction

STARPAC contains two subroutines which find the amplitude and phase
functions of a demodulated series as described in Bloomfield [1976].

The demodulated series w(t) is formed by multiplying the observed series
by a complex sinusoid at the demodulation frequency. If the observed series Y
is a sinusoid of the nominal demodulation frequency FD with varying amplitude
and phase plus noise, that is,

Y(t) = R(t) * cos[2*pi*FD*t + phi(t)] + a(t)

= 0.5 * R(t) * (exp[i*(2*pi*FD*t+phi(t))] +

exp[-i*(2*pi*FD*t+phi(t))]) + a(t)

then the demodulated series may be represented by

w(t) = exp[-i*2*pi*FD*t] * Y(t)

= 0.5 * R(t) * exp[i*phi(t)] +

0.5 * R(t) * exp[-i*(4*pi*FD*t+phi(t))] + exp[-i*2*pi*FD*t] * a(t)

for t = 1, ..., N, where

N is the number of time points in the series;

i is the complex value sqrt(-1);

FD is the demodulation frequency in cycles per sample interval;

R(t) is the amplitude component of the observed series at time t;

phi(t) is the phase component of the observed series at time t;

a(t) is the noise at time t.
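
As an illustration of the definition above, the following sketch forms the
demodulated series for a synthetic cosine with constant amplitude and phase
and no noise. It is not STARPAC code; the program name DMSKCH and all
variable names are hypothetical. Because N = 110 covers a whole number of
cycles of the term at frequency 2*FD, the mean of w(t) reduces to
0.5*R*exp[i*phi], so the program should print values close to R = 3.0 and
phi = 0.5.

      PROGRAM DMSKCH
C
C     ILLUSTRATIVE SKETCH ONLY (NOT PART OF STARPAC).
C     FORM W(T) = EXP(-I*2*PI*FD*T) * Y(T) FOR A NOISE-FREE COSINE
C     AND AVERAGE IT OVER A WHOLE NUMBER OF CYCLES.
C
      INTEGER N, T
      PARAMETER (N = 110)
      REAL Y(N), PI, FD, R, PHI, ARG
      COMPLEX W(N), WBAR
C
      PI = 4.0*ATAN(1.0)
      FD = 1.0/11.0
      R = 3.0
      PHI = 0.5
C
C     SYNTHETIC SERIES WITH CONSTANT AMPLITUDE R AND PHASE PHI
C
      DO 10 T = 1, N
         Y(T) = R*COS(2.0*PI*FD*T + PHI)
   10 CONTINUE
C
C     DEMODULATE AND ACCUMULATE THE MEAN OF W
C
      WBAR = CMPLX(0.0, 0.0)
      DO 20 T = 1, N
         ARG = 2.0*PI*FD*T
         W(T) = CMPLX(COS(ARG), -SIN(ARG)) * Y(T)
         WBAR = WBAR + W(T)
   20 CONTINUE
      WBAR = WBAR / N
C
C     PRINT 2*ABS(WBAR) (AMPLITUDE) AND THE ARGUMENT OF WBAR (PHASE)
C
      WRITE (*,*) 2.0*ABS(WBAR), ATAN2(AIMAG(WBAR), REAL(WBAR))
      END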

Users are directed to section B for a brief description of the
subroutines. The declaration and CALL statements are given in section C and
the subroutine arguments are defined in section D. The algorithms used and
output produced by these subroutines are discussed in section E. Sample
programs and their output are shown in section F.

B. Subroutine Descriptions

Subroutine DEMOD computes the smoothed amplitude and phase components of
the demodulated series. The user must specify the demodulation frequency,
along with the number of filter terms and the cutoff frequency which define
the low-pass filter utilized to smooth the demodulated series. Output from
DEMOD consists of plots of the amplitude and phase functions. The phase
function plot reduces discontinuities using the method suggested by Bloomfield

<11-1>
[1976]. As shown in the example provided in section F, this method displays
both the principal phase value, which is defined to lie in the range -pi to
pi, and the principal phase value plus or minus 2*pi, where the sign is chosen
such that the second value lies in the range -2*pi to 2*pi.

Subroutine DEMODS is the same as DEMOD except that the computed amplitude
and phase functions are returned to the user and the printed output described
for DEMOD is optional.

C. Subroutine Declaration and CALL Statements

DEMOD: Compute and plot the results of a complex demodulation of the input
series

      <real> Y(n)
      DOUBLE PRECISION DSTAK(ldstak)
      COMMON /CSTAK/ DSTAK
      :
      :
      CALL DEMOD (Y, N, FD, FC, K, LDSTAK)

===

DEMODS: Compute and optionally plot the results of a complex demodulation of
the input series; return amplitude and phase functions of demodulated
series

      <real> Y(n), AMPL(n), PHAS(n)
      DOUBLE PRECISION DSTAK(ldstak)
      COMMON /CSTAK/ DSTAK
      :
      :
      CALL DEMODS (Y, N, FD, FC, K,
     +             AMPL, PHAS, NDEM, NPRT, LDSTAK)

===
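
A hypothetical calling program for DEMODS, modeled on the DEMOD example in
section F, may help show how the returned vectors are used. It assumes the
single precision version of STARPAC and the same data file layout as that
example; the program name EXDMDS, the array sizes and FORMAT 102 are
illustrative choices rather than part of the STARPAC distribution.

      PROGRAM EXDMDS
C
C     HYPOTHETICAL SKETCH: CALL DEMODS AND PRINT THE RETURNED VALUES
C
      REAL Y(300), AMPL(300), PHAS(300)
      DOUBLE PRECISION DSTAK(500)
      COMMON /CSTAK/ DSTAK
C
      LDSTAK = 500
      OPEN (UNIT=5, FILE='DATA')
      READ (5,100) N
      READ (5,101) (Y(I), I=1,N)
C
      FD = 1.0 / 11.0
      FC = 1.0 / 22.0
      K = 41
C
C     NPRT = 0 SUPPRESSES THE AUTOMATIC PRINTOUT
C
      NPRT = 0
      CALL DEMODS (Y, N, FD, FC, K,
     +             AMPL, PHAS, NDEM, NPRT, LDSTAK)
C
C     AMPL(1) TO AMPL(NDEM) AND PHAS(1) TO PHAS(NDEM) HOLD THE RESULTS
C
      WRITE (*,102) (I, AMPL(I), PHAS(I), I=1,NDEM)
  100 FORMAT (I5)
  101 FORMAT (10F7.2)
  102 FORMAT (1X, I5, 2F12.4)
      END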

D. Dictionary of Subroutine Arguments and COMMON Variables

AMPL <-- The vector of dimension at least N-(K-1) that contains the NDEM
values of the smoothed amplitude function of the observed series,

<11-2>
AMPL(I) = R(t), where R(t) is defined in section A and the index I
is computed as I = t - (K-1)/2 for t = (K+1)/2 to N-(K-1)/2. The
stored values of the amplitude function will start in AMPL(1);
AMPL(NDEM+1) to AMPL(N) will be set to zero.

FC --> The cutoff frequency for the low-pass filter in cycles per sample
interval. FC must lie between 1/K and FD.

FD --> The demodulation frequency in cycles per sample interval. FD must
lie between 0.0 and 0.5.

IERR ... The error flag, which is a COMMON variable rather than an
argument of the CALL statement.

IERR = 0 indicates that no errors were detected.

IERR = 1 indicates that improper input was detected.

K --> The number of terms in the low-pass filter used to extract the
amplitude and phase functions. K must be odd. The user must
choose the number of filter terms, K, so that (FC - 1/K) > 0 and
(FC + 1/K) < 0.5. In addition, K must be chosen as a compromise
between:

1) A sharp cutoff, that is, 1/K small; and

2) Minimizing the number of data points lost by the filtering
operations. ((K - 1)/2 data points will be lost from each end
of the series.)

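LDSTAK --> The length of the DOUBLE PRECISION vector DSTAK declared in
section C. LDSTAK must equal or exceed the value given below,
where P = 0.5 for the single precision version of STARPAC and
P = 1.0 for the double precision version.

For DEMOD:

LDSTAK >= 10 + (3*N+K)*P

For DEMODS:

LDSTAK >= 9 + (IO*2*N+K)*P

where IO = 0 if NPRT = 0 and IO = 1 if NPRT <> 0.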

N --> The number of observations, which must equal or exceed 17.

NDEM <-- The number of observations in AMPL and PHAS, NDEM = N - (K-1).

<11-3>
NPRT --> The variable controlling printed output.

If NPRT = 0, the automatic printout is suppressed.

If NPRT <> 0, the automatic printout is provided.

The default value is 1. If NPRT is not an argument of the
subroutine CALL statement the default value is used.

PHAS <-- The vector of dimension at least NDEM = N-(K-1) that contains the
NDEM primary values of the smoothed phase function of the observed
series, PHAS(I) = phi(t), where phi(t) is defined in section A and
the index I is computed as I = t - (K-1)/2 for t = (K+1)/2 to N-
(K-1)/2. The stored values of the phase function will start in
PHAS(1); PHAS(NDEM+1) to PHAS(N) will be set to zero.

Y --> The vector of dimension at least N that contains the observations
of the time series.
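
As a worked check of the constraints on FC, FD and K given above, take the
values used in the example of section F (N = 261, FD = 1/11, FC = 1/22,
K = 41):

    K = 41 is odd
    1/K = 0.0244
    FC - 1/K = 0.0455 - 0.0244 = 0.0211 > 0
    FC + 1/K = 0.0455 + 0.0244 = 0.0698 < 0.5
    1/K = 0.0244 < FC = 0.0455 < FD = 0.0909
    (K-1)/2 = 20 points lost from each end of the series
    NDEM = N - (K-1) = 261 - 40 = 221

The value NDEM = 221 agrees with the 221 index values in the sample output
of section F.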

E. Computational Methods

E.1 Algorithms

The STARPAC code for performing complex demodulation was adapted from the
subroutines given on pages 147 to 150 of Bloomfield [1976]. As noted in
Bloomfield, the first term of the demodulated series defined in section A is
centered about zero frequency while the remaining two terms are centered at
frequencies FD and 2*FD. Thus, the first term can be separated from the
others using the low-pass filter described in chapter 10 (with FC
approximately FD/2), resulting in the complex filtered series

                K
       YF(t) = SUM HLP(J)*w(t+Km-J)
               J=1

= alpha(t) + i*beta(t)

which is approximately

0.5*R(t)*exp[i*phi(t)] for t = Km, Km+1, ..., (N-Km+1),

where

K is the number of filter terms;

Km is the midpoint of the filter, Km = (K+1)/2;

HLP(J) is the Jth low-pass filter coefficient, defined in chapter 10,
section D;

alpha(t) is the real part of the filtered series;

beta(t) is the imaginary part of the filtered series.

The smoothed estimates of the amplitude, Rhat, and phase, phihat, functions
can then be extracted from the filtered series using

<11-4>
Rhat(t) = 2*sqrt[alpha(t)**2 + beta(t)**2]

and

phihat(t) = arctan[beta(t)/alpha(t)].

Note that (K-1)/2 points have been lost from each end of the demodulated
series by the filtering operation.
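
The filtering and extraction steps can be sketched as follows. This is not
the STARPAC implementation: equal weights 1/K stand in for the low-pass
coefficients HLP(J) of chapter 10 (an assumption made only to keep the
sketch self-contained), the test series is a noise-free cosine with R = 3.0
and phi = 0.5, and the program and variable names are hypothetical. The
phase is taken as the argument of alpha(t) + i*beta(t) using ATAN2, which
also resolves the quadrant. Each printed line should show values close to
3.0 and 0.5.

      PROGRAM EXTSKC
C
C     ILLUSTRATIVE SKETCH ONLY (NOT PART OF STARPAC).
C     SMOOTH A DEMODULATED SERIES WITH A K-TERM FILTER AND EXTRACT
C     THE AMPLITUDE AND PHASE.  EQUAL WEIGHTS 1/K STAND IN FOR THE
C     LOW-PASS COEFFICIENTS HLP(J) OF CHAPTER 10 (AN ASSUMPTION).
C
      INTEGER N, K, KM, T, J
      PARAMETER (N = 110, K = 11, KM = (K+1)/2)
      REAL PI, FD, ARG, H(K), RHAT, PHIHAT
      COMPLEX W(N), YF
C
      PI = 4.0*ATAN(1.0)
      FD = 1.0/11.0
C
C     DEMODULATED TEST SERIES FROM Y(T) = 3*COS(2*PI*FD*T + 0.5)
C
      DO 10 T = 1, N
         ARG = 2.0*PI*FD*T
         W(T) = CMPLX(COS(ARG), -SIN(ARG)) * 3.0*COS(ARG + 0.5)
   10 CONTINUE
      DO 20 J = 1, K
         H(J) = 1.0/K
   20 CONTINUE
C
C     YF(T) = SUM OF H(J)*W(T+KM-J), J = 1 TO K, THEN
C     RHAT = 2*ABS(YF) AND PHIHAT = ARGUMENT OF YF
C
      DO 40 T = KM, N-KM+1
         YF = CMPLX(0.0, 0.0)
         DO 30 J = 1, K
            YF = YF + H(J)*W(T+KM-J)
   30    CONTINUE
         RHAT = 2.0*ABS(YF)
         PHIHAT = ATAN2(AIMAG(YF), REAL(YF))
         WRITE (*,*) T, RHAT, PHIHAT
   40 CONTINUE
      END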

E.2 Computed Results and Printed Output

The argument controlling the printed output, NPRT, is discussed in
section D. The output consists of plots of the smoothed amplitude and phase
functions, and a list of the demodulation frequency, cutoff frequency and
number of terms in the low-pass filter used to smooth the demodulated series.

F. Example

In the example program below, DEMOD is used to estimate the amplitude and
phase functions corresponding to the input series Y. The data used are the
Wolf sunspot numbers for the years 1700 to 1960 as tabulated by Waldmeier
[1961]. Further discussion of this example can be found on pages 137 to 141
of Bloomfield [1976].

<11-5>
Program:

      PROGRAM EXAMPL
C
C     DEMONSTRATE DEMOD USING SINGLE PRECISION VERSION OF STARPAC
C
C     N.B. DECLARATION OF Y MUST BE CHANGED TO DOUBLE PRECISION
C          IF DOUBLE PRECISION VERSION OF STARPAC IS USED.
C
      REAL Y(300)
      DOUBLE PRECISION DSTAK(500)
C
      COMMON /CSTAK/ DSTAK
C
C     SET UP INPUT AND OUTPUT FILES
C     [CHAPTER 1, SECTION D.4, DESCRIBES HOW TO CHANGE OUTPUT UNIT.]
C
      CALL IPRINT(IPRT)
      OPEN (UNIT=IPRT, FILE='FILENM')
      OPEN (UNIT=5, FILE='DATA')
C
C     SPECIFY NECESSARY DIMENSIONS
C
      LDSTAK = 500
C
C     READ NUMBER OF OBSERVATIONS
C          OBSERVED SERIES
C
      READ (5,100) N
      READ (5,101) (Y(I), I=1,N)
C
C     SET DEMODULATION FREQUENCY
C         CUTOFF FREQUENCY
C         NUMBER OF TERMS IN THE LOW-PASS FILTER
C
      FD = 1.0 / 11.0
      FC = 1.0 / 22.0
      K = 41
C
C     PRINT TITLE AND CALL DEMOD FOR COMPLEX DEMODULATION ANALYSIS
C
      WRITE (IPRT,102)
      CALL DEMOD (Y, N, FD, FC, K, LDSTAK)
C
C     FORMAT STATEMENTS
C
  100 FORMAT (I5)
  101 FORMAT (10F7.2)
  102 FORMAT ('1RESULTS OF STARPAC',
     *  ' COMPLEX DEMODULATION SUBROUTINE DEMOD')
      END

<11-6>
Data:

  261
   5.00  11.00  16.00  23.00  36.00  58.00  29.00  20.00  10.00   8.00
   3.00   0.00   0.00   2.00  11.00  27.00  47.00  63.00  60.00  39.00
  28.00  26.00  22.00  11.00  21.00  40.00  78.00 122.00 103.00  73.00
  47.00  35.00  11.00   5.00  16.00  34.00  70.00  81.00 111.00 101.00
  73.00  40.00  20.00  16.00   5.00  11.00  22.00  40.00  60.00  80.90
  83.40  47.70  47.80  30.70  12.20   9.60  10.20  32.40  47.60  54.00
  62.90  85.90  61.20  45.10  36.40  20.90  11.40  37.80  69.80 106.10
 100.80  81.60  66.50  34.80  30.60   7.00  19.80  92.50 154.40 125.90
  84.80  68.10  38.50  22.80  10.20  24.10  82.90 132.00 130.90 118.10
  89.90  66.60  60.00  46.90  41.00  21.30  16.00   6.40   4.10   6.80
  14.50  34.00  45.00  43.10  47.50  42.20  28.10  10.10   8.10   2.50
   0.00   1.40   5.00  12.20  13.90  35.40  45.80  41.10  30.10  23.90
  15.60   6.60   4.00   1.80   8.50  16.60  36.30  49.60  64.20  67.00
  70.90  47.80  27.50   8.50  13.20  56.90 121.50 138.30 103.20  85.70
  64.60  36.70  24.20  10.70  15.00  40.10  61.50  98.50 124.70  96.30
  66.60  64.50  54.10  39.00  20.60   6.70   4.30  22.70  54.80  93.80
  95.80  77.20  59.10  44.00  47.00  30.50  16.30   7.30  37.60  74.00
 139.00 111.20 101.60  66.20  44.70  17.00  11.30  12.40   3.40   6.00
  32.30  54.30  59.70  63.70  63.50  52.20  25.40  13.10   6.80   6.30
   7.10  35.60  73.00  85.10  78.00  64.00  41.80  26.20  26.70  12.10
   9.50   2.70   5.00  24.40  42.00  63.50  53.80  62.00  48.50  43.90
  18.60   5.70   3.60   1.40   9.60  47.40  57.10 103.90  80.60  63.60
  37.60  26.10  14.20   5.80  16.70  44.30  63.90  69.00  77.80  64.90
  35.70  21.20  11.10   5.70   8.70  36.10  79.70 114.40 109.60  88.80
  67.80  47.50  30.60  16.30   9.60  33.20  92.60 151.60 136.30 134.70
  83.90  69.40  31.50  13.90   4.40  38.00 141.70 190.20 184.80 159.00
 112.30

<11-7>
1RESULTS OF STARPAC COMPLEX DEMODULATION SUBROUTINE DEMOD
STARPAC 2.08S (03/15/90)

TIME SERIES DEMODULATION

DEMODULATION FREQUENCY IS .09090909
CUTOFF FREQUENCY IS .04545455
THE NUMBER OF TERMS IN THE FILTER IS 41

PLOT OF AMPLITUDE OF SMOOTHED DEMODULATED SERIES

LOCATION OF MEAN IS GIVEN BY PLOT CHARACTER M
[Plot not reproduced. The horizontal axis runs from 19.7732 to 62.7680.
The smoothed amplitude values printed at the right-hand edge of the plot
for index values 1 to 221 are listed below, ten per row, preceded by the
index of the first value in the row.]

   1   30.652  32.848  35.191  37.582  39.925  42.146  44.216  46.153  47.948  49.529
  11   50.786  51.633  52.052  52.071  51.754  51.178  50.429  49.599  48.776  48.011
  21   47.287  46.515  45.587  44.441  43.055  41.462  39.721  37.931  36.245  34.853
  31   33.877  33.306  33.020  32.861  32.690  32.398  31.913  31.235  30.500  29.924
  41   29.721  30.052  30.936  32.234  33.719  35.148  36.375  37.438  38.520  39.817
  51   41.445  43.391  45.493  47.509  49.231  50.580  51.625  52.548  53.573  54.903
  61   56.621  58.605  60.549  62.053  62.768  62.470  61.053  58.498  54.856  50.271
  71   44.992  39.363  33.810  28.878  25.219  23.401  23.440  24.701  26.348  27.725
  81   28.463  28.457  27.798  26.694  25.380  24.053  22.852  21.861  21.114  20.585
  91   20.228  19.997  19.856  19.780  19.773  19.881  20.172  20.663  21.322  22.121
 101   23.043  24.074  25.170  26.225  27.089  27.651  27.926  28.094  28.456  29.302
 111   30.831  33.091  35.983  39.318  42.866  46.383  49.656  52.532  54.905  56.647
 121   57.612  57.717  56.979  55.517  53.529  51.257  48.939  46.789  44.991  43.671
 131   42.836  42.376  42.143  41.978  41.754  41.392  40.857  40.153  39.348  38.625
 141   38.249  38.470  39.409  41.011  43.087  45.406  47.742  49.872  51.603  52.815
 151   53.486  53.642  53.276  52.344  50.815  48.720  46.180  43.393  40.591  37.983
 161   35.722  33.921  32.684  32.106  32.232  32.994  34.207  35.618  36.974  38.080
 171   38.791  39.002  38.724  38.055  37.128  36.047  34.859  33.594  32.295  31.060
 181   29.998  29.205  28.723  28.547  28.668  29.109  29.919  31.119  32.649  34.381
 191   36.171  37.886  39.408  40.621  41.430  41.815  41.858  41.681  41.339  40.812
 201   40.056  39.052  37.838  36.522  35.275  34.325  33.954  34.416  35.740  37.716
 211   40.017  42.334  44.438  46.206  47.597  48.667  49.589  50.609  51.878  53.421
 221   55.145

<11-12>
STARPAC 2.08S (03/15/90)
PLOT OF PHASE OF SMOOTHED DEMODULATED SERIES

LOCATION OF ZERO IS GIVEN BY PLOT CHARACTER 0
[Plot not reproduced. The horizontal axis runs from -6.2832 to 6.2832 in
steps of 1.2566 and the vertical axis gives the index values 1.0000 to
221.00. Each row carries the two plot characters A and B together with
the character 0 at zero.]

<11-16>
G. Acknowledgments

The code for performing the complex demodulation was adapted from the
subroutines given on pages 147 to 150 of Bloomfield [1976].

<11-17>
----- CHAPTER 12 -----

CORRELATION AND SPECTRUM ANALYSIS

A. Introduction

STARPAC contains 50 subroutines for time series correlation and spectrum
estimation. Both univariate and bivariate series can be analyzed. Included
are subroutines that compute the correlation function using the fast Fourier
transform and that accept time series with missing observations. The user may
choose from spectrum analysis subroutines implementing the classical Fourier
transformed covariance function techniques presented in Jenkins and Watts
[1968], the autoregressive or rational spectrum techniques described by Jones
[1971] or the direct Fourier transform (periodogram) techniques discussed in
Bloomfield [1976].

Users are directed to section B for a brief description of the
subroutines. The declaration and CALL statements are given in section C and
the subroutine arguments are defined in section D. The algorithms used and
output produced by these subroutines are discussed in section E. Sample
programs and their output are shown in section F.

B. Subroutine Descriptions

STARPAC correlation and spectrum analysis subroutines are divided into
seven families. For correlation analysis of univariate and bivariate series
there are two families of subroutines supporting
1. Autocorrelation Analysis and
2. Cross Correlation Analysis.
For spectrum estimation there are four families of subroutines for univariate
series and one family for bivariate series supporting
3. Univariate Spectrum Estimation Using the Fourier Transform of the
Autocorrelation Function,
4. Univariate Spectrum Estimation Using Autoregressive Models,
5. Univariate Spectrum Estimation Using the Direct Fourier Transform,
6. Univariate Series Utilities and
7. Bivariate Spectrum Estimation Using the Fourier Transform of the
Cross Correlation Function.

In general, each family of subroutines has one basic subroutine which
performs the desired computations with a minimum of user input. The other
subroutines in each family provide greater flexibility to the user at the
price of more input. The features of these subroutines are indicated by the
suffix letter(s) on the subroutine (e.g., ACFM and BFSFS). Not all features
are available for each family. Features which are common to more than one
family are described here. Features which are unique to a specific family are
described in the subsections below.

* Suffix S indicates that the user is allowed to specify various options
which are preset in the simplest call and that certain results are
returned to the user via the subroutine CALL statement. In the
subsections that follow, the specific details of this feature are
discussed individually for each family of subroutines.

* Suffix M indicates that the series contains missing data. A missing
value code must be used within the series to specify time points
without an observed value. There is no limit on the percentage of data

<12-1>
that can be missing. However, because the correlation matrix computed
from a series with missing values is not necessarily positive definite,
the partial autocorrelation function estimates and autoregressive order
selection statistics are not computed and caution must be used in
interpreting those results which are provided. Analysis of time series
with missing values is discussed in Jones [1971].

* Suffix F indicates that the covariances are computed using the
Singleton [1969] fast Fourier transform (FFT). When the number of
observations in the series is large this method of computation is more
efficient than the direct computation normally used by STARPAC.
Subroutines with an F suffix reduce the amount of workspace needed by
using the vector originally containing the data as workspace; the data
must be copied into another vector prior to calling these subroutines
if the data are to be preserved. These subroutines automatically
extend the length of the input series by appending enough zeros to meet
the requirements of this FFT code; the length of the vector used to
pass the data to these subroutines must therefore equal or exceed the
extended series length, NFFT, as discussed in section D.

* Suffix V indicates that the user inputs the covariances rather than the
original series, thus avoiding a time-consuming recomputation of the
covariance function if it is already available, for example, from
subroutines ACFS, ACFFS, ACFMS, CCFS, CCFFS or CCFMS.

B.1 Correlation Analysis

B.1.a Univariate Series

Autocorrelation Analysis. STARPAC's autocorrelation function (acf)
subroutines compute and plot the autocorrelation function estimates; compute
the large lag standard error of the estimates; perform a chi-squared test of
the null hypothesis that the series is white noise; compute the partial
autocorrelation function coefficient estimates; and, using the modified
Akaike information criterion [Akaike, 1974], select the order of an
autoregressive process which models the series and estimate the parameters of
this autoregressive model. The user should note that a purely autoregressive
model may approximate the true structure of the series with an unnecessarily
large number of terms. Such an autoregressive model must be used with
discretion since the true structure might actually be more complex including
moving average components, harmonic terms or some mixture of deterministic and
stochastic elements. For some purposes, a purely autoregressive approximation
may be useful. In other cases, careful model identification can lead to the
discovery of more detailed structure of the data or to a more parsimonious
model.

The simplest of the Auto Correlation Function subroutines is ACF, which
performs the basic analysis described in the preceding paragraph. The other
autocorrelation analysis subroutines provide the same basic analysis as ACF
while adding the features indicated above by suffixes S, M, F, MS and FS.

For the ACF family of subroutines, the suffix S feature allows the user
to indicate:
1) the maximum lag value for which the correlation function is to be
computed; and
2) the amount of printed output.
The acf subroutines with suffix S also return the autocovariance function

<12-2>
estimates and the coefficients of the selected autoregressive model to the
user via the subroutine CALL statement.

The ACF family of subroutines also includes subroutine ACFD, where the
suffix D indicates that the autocorrelation analysis will be performed for a
sequence of differenced series. The difference factors are provided by the
user. If the number of difference factors, NFAC, is greater than one,
difference factors beyond the first are applied to the input series Y(t) to
yield a series Z(t) given by

                   NFAC
        Z(t) = [ PRODUCT (1 - B[IOD(J)])**ND(J) ] * Y(t)
                   J=2

where the B[k] indicates the backward shift operator defined by

B[k]*Y(t) = Y(t-k)

and IOD and ND are defined in section D. If the number of difference factors
is equal to one, Z(t) = Y(t). In either case, the autocorrelation analysis is
performed first on the series Z and then on series Z with the difference
factor (1-B[IOD(1)]) applied 1 to ND(1) times. This produces ND(1) + 1 passes
of the basic ACF analysis.
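
The action of the difference factors can be sketched as follows. This is
an illustration only, not the ACFD code: the program name DIFSKC, the test
data and the in-place update are hypothetical choices. The sketch applies
every factor in full, including the first one ND(1) times, which
corresponds to the series analyzed in the last of the ND(1) + 1 passes.
With Y(t) = t**2, the factor (1 - B[1]) applied twice and the factor
(1 - B[2]) applied once, the four remaining values are all zero.

      PROGRAM DIFSKC
C
C     ILLUSTRATIVE SKETCH ONLY (NOT PART OF STARPAC).
C     APPLY THE DIFFERENCE FACTORS (1 - B[IOD(J)])**ND(J) TO A SERIES.
C
      INTEGER N, NFAC
      PARAMETER (N = 8, NFAC = 2)
      INTEGER IOD(NFAC), ND(NFAC), NZ, I, J, L
      REAL Z(N)
      DATA Z /1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0, 64.0/
      DATA IOD /1, 2/
      DATA ND /2, 1/
C
      NZ = N
      DO 30 J = 1, NFAC
         DO 20 L = 1, ND(J)
C           ONE PASS OF (1 - B[IOD(J)]), THAT IS, Z(T) - Z(T-IOD(J)).
C           EACH PASS SHORTENS THE SERIES BY IOD(J) POINTS.
            NZ = NZ - IOD(J)
            DO 10 I = 1, NZ
               Z(I) = Z(I+IOD(J)) - Z(I)
   10       CONTINUE
   20    CONTINUE
   30 CONTINUE
      WRITE (*,*) (Z(I), I = 1, NZ)
      END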

B.1.b Bivariate Series

Cross Correlation Analysis. STARPAC's cross correlation analysis
subroutines compute and plot the cross correlation function coefficients and
provide the large lag standard error of these estimates. Subroutine CCF is
the simplest of the Cross Correlation Function subroutines. The other five
cross correlation analysis subroutines provide the same basic analysis as CCF
while adding the features indicated above by suffixes S, M, F, MS and FS.

For the CCF family of subroutines, suffix S indicates that the analysis
is provided for each pair of series of a multivariate time series. The suffix
S feature also allows the user to indicate:
1) the maximum lag value for which the correlation function is to be
computed; and
2) the amount of printed output.
In addition, the cross covariance function estimates are returned to the
user.

B.2 Spectrum Estimation

B.2.a Univariate Series

Univariate Spectrum Estimation Using the Fourier Transform of the
Autocorrelation Function. The UFS (Univariate Fourier Spectrum) subroutine
family computes the estimated spectrum from the Fourier transform of the
autocovariance function (acvf) as discussed in Jenkins and Watts [1968]. The
spectrum is smoothed using Parzen windows with the bandwidth controlled either
by the user or a window-closing algorithm. The principal output from each of
these subroutines consists of plots of the estimated spectrum.

Subroutine UFS has the simplest CALL statement of this family of
subroutines. The printed output consists of four spectrum plots with

<12-3>
successively narrower bandwidths. Each spectrum is displayed in decibels (10
times the base 10 logarithm of the power spectrum) scaled so that the maximum
value plotted is zero. The length of the upper and lower 95-percent
confidence intervals and the bandwidth for each spectrum are shown on the
plots.
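
The decibel scaling described above can be sketched as follows. The
spectrum values, the program name DBSKCH and the variable names are
hypothetical; only the scaling rule is taken from the text. For the four
values in the DATA statement the results are approximately -16.02, 0.0,
-10.0 and -30.0 decibels.

      PROGRAM DBSKCH
C
C     ILLUSTRATIVE SKETCH ONLY (NOT PART OF STARPAC).
C     EXPRESS A SPECTRUM IN DECIBELS SCALED SO THAT THE MAXIMUM IS ZERO.
C
      INTEGER NF, I
      PARAMETER (NF = 4)
      REAL SPC(NF), DB(NF), SMAX
      DATA SPC /0.5, 20.0, 2.0, 0.02/
C
      SMAX = SPC(1)
      DO 10 I = 2, NF
         SMAX = MAX(SMAX, SPC(I))
   10 CONTINUE
      DO 20 I = 1, NF
         DB(I) = 10.0*LOG10(SPC(I)/SMAX)
   20 CONTINUE
      WRITE (*,*) (DB(I), I = 1, NF)
      END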

The other nine univariate Fourier spectrum estimation subroutines provide
the same basic analysis as UFS while adding the features indicated above by
suffixes S, M, F, V, FS, MS, VS, MV and MVS.

For the UFS family of subroutines, the suffix S feature allows the user
to indicate:
1) the number of different window bandwidths to be used and the lag
window truncation point for each;
2) the frequency range and the number of frequencies for which the
spectrum is to be computed; and
3) whether the plot is to be in decibels or on a logarithmic scale.
In addition, the spectrum values are returned to the user.

Univariate Spectrum Estimation Using Autoregressive Models. STARPAC
Univariate Autoregressive Spectrum estimation subroutines (UAS family)
approximate an input series with an autoregressive model and compute the
corresponding theoretical spectrum for that model. For comparative purposes,
the plot of the autoregressive spectrum is superimposed on a Fourier
spectrum plot.

Subroutine UAS is the simplest of the autoregressive spectrum estimation
subroutines. It uses the modified Akaike information criterion [Akaike, 1974]
to select the order of the autoregressive model to be used. The
autoregressive coefficients are then computed from the autocovariance function
using the Levinson-Durbin recursive method [Box and Jenkins, 1976] for solving
the Yule-Walker equations. The lag window truncation point used for the Fourier
spectrum is half the maximum truncation point which would have been selected
by subroutine UFS [see section D, argument LAGS]. The output consists of
several autoregressive order selection statistics and a plot of the
autoregressive and Fourier spectra in decibels (10 times the base 10 logarithm
of the power spectrum) scaled such that the maximum value plotted is zero.
The bandwidth and the length of the 95-percent confidence interval for the
Fourier spectrum are shown on the plot. This Fourier spectrum and its
confidence intervals should be used in interpreting the autoregressive
spectrum since confidence intervals are not computed by STARPAC for the
autoregressive spectrum. (The bandwidth is not relevant to the autoregressive
spectrum.)
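
The Levinson-Durbin recursion mentioned above can be sketched in its
textbook form as follows. This is not the STARPAC implementation; the
program name LDSKCH and the variable names are hypothetical. With the
autocovariances 1.0, 0.5 and 0.25 of a first order autoregressive process
with coefficient 0.5, the recursion gives A(1) = 0.5, A(2) = 0.0 and a
residual variance of 0.75.

      PROGRAM LDSKCH
C
C     ILLUSTRATIVE SKETCH ONLY (NOT THE STARPAC IMPLEMENTATION).
C     LEVINSON-DURBIN RECURSION FOR THE YULE-WALKER EQUATIONS.
C     C(0:IP) ARE AUTOCOVARIANCES, A(1:IP) THE AR COEFFICIENTS.
C
      INTEGER IP, M, J
      PARAMETER (IP = 2)
      REAL C(0:IP), A(IP), ANEW(IP), V, ACC, REFL
      DATA C /1.0, 0.5, 0.25/
C
      V = C(0)
      DO 40 M = 1, IP
         ACC = C(M)
         DO 10 J = 1, M-1
            ACC = ACC - A(J)*C(M-J)
   10    CONTINUE
C        REFL IS THE LAG-M PARTIAL AUTOCORRELATION
         REFL = ACC / V
         DO 20 J = 1, M-1
            ANEW(J) = A(J) - REFL*A(M-J)
   20    CONTINUE
         ANEW(M) = REFL
         DO 30 J = 1, M
            A(J) = ANEW(J)
   30    CONTINUE
C        RESIDUAL VARIANCE OF THE ORDER-M MODEL
         V = V*(1.0 - REFL**2)
   40 CONTINUE
      WRITE (*,*) (A(J), J = 1, IP), V
      END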

The other five autoregressive spectrum subroutines provide the same basic
analysis as subroutine UAS while adding the features indicated above by
suffixes S, F, V, FS and VS.

For the autoregressive spectrum family of subroutines, the suffix S
feature allows the user to indicate:
1) the order of the autoregressive model to be used for the
autoregressive spectrum;
2) the lag window truncation point to be used for the Fourier spectrum;
3) the frequency range and the number of frequencies within this range
at which the spectrum is to be computed; and
4) whether the plot is to be in decibels or on a logarithmic scale.
In addition, the autoregressive and Fourier spectra are returned to the user.

<12-4>
The user should be cautious about using high order models without checking
order selection statistics since such models can produce spurious peaks in the
spectrum.

Univariate Spectrum Estimation Using the Direct Fourier Transform. The
STARPAC direct Fourier transform subroutines (PGM family) implement the
PeriodoGraM approach to time series analysis discussed in Bloomfield [1976].
Subroutines are included for computing the raw periodogram and for computing
and plotting the integrated periodogram (or cumulative spectrum).

Subroutine PGM computes the periodogram of the series as described for
argument PER in section D, using zeros to extend the length of the input
series. Output consists of a plot of the computed periodogram in decibels (10
times the base 10 logarithm of the periodogram estimates) scaled so that the
maximum value plotted is zero. The input series must be either centered or
tapered by the user before PGM is called.

PGMS provides the same basic analysis as PGM but allows the user to
indicate:
1) whether zeros or the series mean is used to extend the series;
2) the length of the extended series; and
3) the amount of printed output.
In addition, the periodogram values and their frequencies are returned to the
user via the subroutine CALL statement.

The integrated periodogram subroutine IPGM first subtracts the series
mean from the series, then extends the input series with zeros and finally
computes the normalized integrated periodogram. Output consists of a one-page
plot of the integrated periodogram, accompanied by 95-percent contours for
testing the null hypothesis of white noise. The integrated periodogram is
discussed in chapter 8 of Box and Jenkins [1976].

The other three integrated periodogram subroutines add the features
indicated by suffixes S, P and PS. The suffix S option allows the user to
control the amount of printed output; the integrated periodogram values and
their corresponding frequencies are also returned to the user via the
subroutine CALL statement. The suffix P option indicates that the user inputs
the periodogram rather than the original series, thus avoiding a
time-consuming recomputation of the periodogram if it is already available,
for example, from subroutine PGMS. The suffix PS option combines the S and P
features.

Utilities. STARPAC includes utility subroutines for centering
(subtracting the mean) and tapering the observed series, for smoothing the
periodogram, and for computing the Fourier coefficients of the series. These
routines are particularly useful when using direct Fourier techniques such as
PGM and IPGM.

Subroutine CENTER subtracts the series mean from the series and returns
the centered series. There is no printed output.

Subroutine TAPER centers the input series and applies the
split-cosine-bell taper described for argument YT in section D. The user
specifies the total proportion of the series to be tapered. The centered
tapered series is returned to the user and can be used as input to subroutine
PGM or PGMS. There is no printed output.

<12-5>
Subroutine FFTLEN computes, for an observed series of length N, the minimum
extended series length, NFFT, that meets the requirements of the Singleton
FFT code. The value of the extended series length is returned to the user.
There is no printed output.

Subroutine MDFLT smooths the input periodogram by applying a sequence of
modified Daniell filters as discussed in chapter 7 of Bloomfield [1976]. The
filtered series is returned to the user. There is no printed output.
Subroutine MDFLT takes advantage of the symmetry of the periodogram to avoid
losing values from the ends of the series. It should therefore not be used
for input series that are not symmetric about their end values. Other digital
filtering routines, such as those described in chapter 10, may also be used
for periodogram smoothing but end effect losses will be incurred.

Subroutine FFTR computes the Fourier coefficients of an input series of
single precision observations. There is no printed output.
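
A minimal sketch of how these utilities might be combined with PGM is
given below, using the CALL statements listed in section C. The series
values, the value of NDIV, the taper proportion and the sizes LYFFT and
LDSTAK are placeholders chosen only for illustration; the actual
requirements for these arguments are given in section D. REAL data are
assumed, and the program must be linked with the STARPAC library.

      PROGRAM PGMEX
C     ILLUSTRATIVE SKETCH ONLY.  ARGUMENT VALUES ARE PLACEHOLDERS.
      INTEGER N, LYFFT, LDSTAK
      PARAMETER (N = 50, LYFFT = 200, LDSTAK = 1000)
      INTEGER I, NDIV, NFFT
      REAL Y(N), YFFT(LYFFT), TAPERP
      DOUBLE PRECISION DSTAK(LDSTAK)
      COMMON /CSTAK/ DSTAK
C     PLACEHOLDER DATA (A SINE WAVE WITH A PERIOD OF 10 OBSERVATIONS)
      DO 10 I = 1, N
         Y(I) = SIN(0.62832*REAL(I))
   10 CONTINUE
C     EXTENDED SERIES LENGTH FOR THE FFT CODE (NDIV VALUE ASSUMED)
      NDIV = 2
      CALL FFTLEN (N, NDIV, NFFT)
      WRITE (*, *) NFFT
C     CENTER THE SERIES AND TAPER A TOTAL PROPORTION OF 0.1 OF IT
      TAPERP = 0.1
      CALL TAPER (Y, N, TAPERP, YFFT)
C     PERIODOGRAM OF THE CENTERED, TAPERED SERIES
      CALL PGM (YFFT, N, LYFFT, LDSTAK)
      END

The value of NFFT returned by FFTLEN can be compared with LYFFT before
PGM is called to confirm that the work vector YFFT is long enough.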

B.2.b Bivariate Series

Bivariate Spectrum Estimation Using the Fourier Transform of the Cross
Correlation Function. The BFS (Bivariate Fourier Spectrum) subroutine family
computes the estimated spectrum from the Fourier transform of the covariance
function as discussed in Jenkins and Watts [1968]. The spectrum is smoothed
using Parzen windows with the bandwidth controlled either by the user or a
window-closing algorithm. The principal output from each of these subroutines
consists of plots of the squared coherency and phase components of the cross
spectrum. The phase function plots reduce discontinuities using the method
suggested by Bloomfield [1976]. As shown in the example in section F, this
method displays both the principal phase value, which is defined to lie in the
range -pi to pi, and the principal phase value plus or minus 2*pi, where the
sign is chosen such that the second value lies in the range -2*pi to 2*pi.
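
The sign rule above can be sketched as follows. This is an illustration
only, not STARPAC source code; the names PHAS1 and PHAS2 and the sample
value are hypothetical.

      PROGRAM PHASE2
C     GIVEN A PRINCIPAL PHASE VALUE PHAS1 BETWEEN -PI AND PI, FORM
C     THE SECOND VALUE PHAS1 +/- 2*PI WITH THE SIGN CHOSEN SO THAT
C     THE RESULT LIES BETWEEN -2*PI AND 2*PI.
      REAL PI, PHAS1, PHAS2
      PI = 4.0*ATAN(1.0)
      PHAS1 = 0.5*PI
      IF (PHAS1 .GT. 0.0) THEN
C        ADDING 2*PI WOULD EXCEED 2*PI, SO SUBTRACT IT
         PHAS2 = PHAS1 - 2.0*PI
      ELSE
C        SUBTRACTING 2*PI WOULD FALL BELOW -2*PI, SO ADD IT
         PHAS2 = PHAS1 + 2.0*PI
      END IF
      WRITE (*, *) PHAS1, PHAS2
      END

For a principal value of pi/2 the second value is -3*pi/2; for a
principal value of -pi/4 it would be 7*pi/4.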

Subroutine BFS provides the basic analysis with a brief CALL statement.
The printed output consists of four spectrum plot pairs (a squared coherency
plot and a phase plot) with successively narrower bandwidths chosen by the
window-closing algorithm. The upper and lower 95-percent confidence intervals
and the 95-percent significance levels are shown on the coherency plots.

The other nine bivariate Fourier spectrum estimation subroutines provide
the basic analysis described for BFS while adding the features indicated above
by suffixes S, M, F, V, FS, MS, VS, MV and MVS.

For the BFS family of subroutines, the suffix S feature allows the user
to indicate:
1) the number of different window bandwidths to be used and the lag
window truncation point for each;
2) the frequency range and the number of frequencies for which the
spectrum is to be computed; and
3) whether the plot is to be in decibels or on a logarithmic scale.
In addition, the squared coherency and phase values are returned to the user.

C. Subroutine Declaration and CALL Statements

<12-6>

Subroutines for Autocorrelation Analysis

ACF: Compute and print a two-part auto and partial correlation analysis
of a series, select the order of an autoregressive process which
models the series, and estimate the parameters of this model

Y(n)
:
:
CALL ACF (Y, N)

===

ACFS: Compute and optionally print a two-part auto and partial correlation
analysis of a series, select the order of an autoregressive process
which models the series, and estimate the parameters of this model;
use user-supplied control values; return autocovariance function,
and order and parameter estimates of selected autoregressive model

Y(n), ACOV(lagmax+1), PHI(lagmax)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL ACFS (Y, N,
+ LAGMAX, LACOV, ACOV, IAR, PHI, NPRT, LDSTAK)

===

ACFM: Compute and print a two-part auto and partial correlation analysis
of a series with missing observations

Y(n)
:
:
CALL ACFM (Y, YMISS, N)

===

ACFMS: Compute and optionally print a two-part auto and partial correlation
analysis of a series with missing observations; use user-supplied
control values; return autocovariance function

INTEGER NLPPA(lagmax+1)
Y(n), ACOV(lagmax+1)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL ACFMS (Y, YMISS, N,
+ LAGMAX, LACOV, ACOV, AMISS, NLPPA, NPRT, LDSTAK)

===

<12-7>
ACFF: Compute and print a two-part auto and partial correlation analysis
of a series, select the order of an autoregressive process which
models the series, and estimate the parameters of this model; use
FFT for computations

YFFT(nfft)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL ACFF (YFFT, N, LYFFT, LDSTAK)

===

ACFFS: Compute and optionally print a two-part auto and partial correlation
analysis of a series, select the order of an autoregressive process
which models the series, and estimate the parameters of this model;
use FFT for computations; use user-supplied control values; return
autocovariance function, and order and parameter estimates of
selected autoregressive model

YFFT(nfft), ACOV(lagmax+1), PHI(lagmax)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL ACFFS (YFFT, N, LYFFT, LDSTAK,
+ LAGMAX, LACOV, ACOV, IAR, PHI, NPRT)

===

ACFD: Compute and print a two-part auto and partial correlation analysis
of a sequence of differenced series, select the order of an
autoregressive process which models each series, and estimate the
parameters of these models

INTEGER ND(nfac), IOD(nfac)
Y(n)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL ACFD (Y, N, LAGMAX, NFAC, ND, IOD, LDSTAK)

===

<12-8>
Subroutines for Cross Correlation Analysis

CCF: Compute and print a two-part cross correlation analysis of a pair of
series

Y1(n), Y2(n)
:
:
CALL CCF (Y1, Y2, N)

===

CCFS: Compute and optionally print a two-part cross correlation analysis
of a multivariate series using user-supplied control values; return
cross covariance function

YM(n,m), CCOV(lagmax+1,m,m)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL CCFS (YM, N, M, IYM,
+ LAGMAX, CCOV, ICCOV, JCCOV, NPRT, LDSTAK)

===

CCFM: Compute and print a two-part cross correlation analysis of a pair of
series with missing observations

Y1(n), Y2(n)
:
:
CALL CCFM (Y1, YMISS1, Y2, YMISS2, N)

===

CCFMS: Compute and optionally print a two-part cross correlation analysis
of a multivariate series with missing observations using
user-supplied control values; return cross covariance function

INTEGER NLPPC(lagmax+1,m,m)
YM(n,m), YMMISS(m), CCOV(lagmax+1,m,m)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL CCFMS (YM, YMMISS, N, M, IYM,
+ LAGMAX, CCOV, CMISS, ICCOV, JCCOV,
+ NLPPC, INLPPC, JNLPPC, NPRT, LDSTAK)

===

<12-9>
CCFF: Compute and print a two-part cross correlation analysis of a pair of
series; use FFT for computations

YFFT1(nfft), YFFT2(nfft)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL CCFF(YFFT1, YFFT2, N, LYFFT, LDSTAK)

===

CCFFS: Compute and optionally print a two-part cross correlation analysis
of a multivariate series using user-supplied control values; use FFT
for computations; return cross covariance function

YMFFT(nfft,m), CCOV(lagmax+1,m,m)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL CCFFS (YMFFT, N, M, IYMFFT,
+ LAGMAX, CCOV, ICCOV, JCCOV, NPRT, LDSTAK)

===

Subroutines for Univariate Spectrum Estimation
Using the Fourier Transform of the Autocorrelation Function

UFS: Compute and print a univariate Fourier spectrum analysis of a series

Y(n)
:
:
CALL UFS (Y, N)

===

UFSS: Compute and optionally print a univariate Fourier spectrum analysis
of a series using user-supplied control values; return Fourier
spectrum and corresponding frequencies

INTEGER LAGS(nw)
Y(n), SPCF(nf,nw), FREQ(nf)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL UFSS (Y, N,
+ NW, LAGS, NF, FMIN, FMAX, NPRT,
+ SPCF, ISPCF, FREQ, LDSTAK)

===

<12-10>
UFSF: Compute and print a univariate Fourier spectrum analysis of a series;
use FFT for computations

YFFT(nfft)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL UFSF (YFFT, N, LYFFT, LDSTAK)

===

UFSFS: Compute and optionally print a univariate Fourier spectrum analysis
of a series using user-supplied control values; use FFT for
computations; return Fourier spectrum and corresponding frequencies

INTEGER LAGS(nw)
YFFT(nfft), SPCF(nf,nw), FREQ(nf)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL UFSFS (YFFT, N, LYFFT, LDSTAK,
+ NW, LAGS, NF, FMIN, FMAX, NPRT,
+ SPCF, ISPCF, FREQ)

===

UFSM: Compute and print a univariate Fourier spectrum analysis of a series
with missing observations

Y(n)
:
:
CALL UFSM (Y, YMISS, N)

===

UFSMS: Compute and optionally print a univariate Fourier spectrum analysis
of a series with missing observations using user-supplied control
values; return Fourier spectrum and corresponding frequencies

INTEGER LAGS(nw)
Y(n), SPCF(nf,nw), FREQ(nf)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL UFSMS (Y, YMISS, N,
+ NW, LAGS, NF, FMIN, FMAX, NPRT,
+ SPCF, ISPCF, FREQ, LDSTAK)

===

<12-11>
UFSV: Compute and print a univariate Fourier spectrum analysis of a series;
input covariances rather than original series

ACOV(lagmax+1)
:
:
CALL UFSV (ACOV, LAGMAX, N)

===

UFSVS: Compute and optionally print a univariate Fourier spectrum analysis
of a series using user-supplied control values; input covariances
rather than original series; return Fourier spectrum and
corresponding frequencies

INTEGER LAGS(nw)
ACOV(lagmax+1), SPCF(nf,nw), FREQ(nf)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL UFSVS (ACOV, LAGMAX, N,
+ NW, LAGS, NF, FMIN, FMAX, NPRT,
+ SPCF, ISPCF, FREQ, LDSTAK)

===

UFSMV: Compute and print a univariate Fourier spectrum analysis of a series
with missing observations; input covariances rather than original
series

INTEGER NLPPA(lagmax+1)
ACOV(lagmax+1)
:
:
CALL UFSMV(ACOV, NLPPA, LAGMAX, N)

===

UFSMVS: Compute and optionally print a univariate Fourier spectrum analysis
of a series with missing observations using user-supplied control
values; input covariances rather than original series; return Fourier
spectrum and corresponding frequencies

INTEGER NLPPA(lagmax+1), LAGS(nw)
ACOV(lagmax+1), SPCF(nf,nw), FREQ(nf)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL UFSMVS (ACOV, NLPPA, LAGMAX, N,
+ NW, LAGS, NF, FMIN, FMAX, NPRT,
+ SPCF, ISPCF, FREQ, LDSTAK)

===

<12-12>
Subroutines for Univariate Spectrum Estimation Using Autoregressive Models

UAS: Compute and print a univariate autoregressive spectrum analysis of a
series

Y(n)
:
:
CALL UAS (Y, N)

===

UASS: Compute and optionally print a univariate autoregressive spectrum
analysis of a series using user-supplied control values; return
autoregressive and Fourier spectrum and corresponding frequencies

Y(n), PHI(lagmax), SPCF(nf), SPCA(nf), FREQ(nf)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL UASS (Y, N,
+ IAR, PHI, LAGMAX, LAG, NF, FMIN, FMAX, NPRT,
+ SPCA, SPCF, FREQ, LDSTAK)

===

UASF: Compute and print a univariate autoregressive spectrum analysis of a
series; use FFT for computations

YFFT(nfft)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL UASF(YFFT, N, LYFFT, LDSTAK)

===

UASFS: Compute and optionally print a univariate autoregressive spectrum
analysis of a series using user-supplied control values; use FFT for
computations; return autoregressive and Fourier spectrum and
corresponding frequencies

YFFT(nfft), PHI(lagmax), SPCA(nf), SPCF(nf), FREQ(nf)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL UASFS (YFFT, N, LYFFT, LDSTAK,
+ IAR, PHI, LAGMAX, LAG, NF, FMIN, FMAX, NPRT,
+ SPCA, SPCF, FREQ)

===

<12-13>
UASV: Compute and print a univariate autoregressive spectrum analysis of a
series; input covariances rather than original series

ACOV(lagmax+1)
:
:
CALL UASV (ACOV, LAGMAX, N)

===

UASVS: Compute and optionally print a univariate autoregressive spectrum
analysis of a series using user-supplied control values; input
covariances rather than original series; return autoregressive and
Fourier spectrum and corresponding frequencies

ACOV(lagmax+1), Y(n), PHI(lagmax)
SPCA(nf), SPCF(nf), FREQ(nf)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL UASVS (ACOV, LAGMAX, Y, N,
+ IAR, PHI, LAG, NF, FMIN, FMAX, NPRT,
+ SPCA, SPCF, FREQ, LDSTAK)

===

Subroutines for Univariate Spectrum Estimation Using the
Direct Fourier Transform

PGM: Compute and print a periodogram analysis of a series; use FFT for
computations

YFFT(nfft)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL PGM (YFFT, N, LYFFT, LDSTAK)

===

PGMS: Compute and optionally print a periodogram analysis of a series; use
FFT for computations; return periodogram and corresponding
frequencies

YFFT(nfft), PER(nf), FREQ(nf)
:
:
CALL PGMS (YFFT, N, NFFT, LYFFT,
+ IEXTND, NF, PER, LPER, FREQ, LFREQ, NPRT)

===

<12-14>
IPGM: Compute and print an integrated periodogram analysis of a series;
use FFT for computations

YFFT(nfft)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL IPGM (YFFT, N, LYFFT, LDSTAK)

===

IPGMS: Compute and optionally print an integrated periodogram analysis of a
series; use FFT for computations; return integrated periodogram and
corresponding frequencies

YFFT(nfft), PERI(nf), FREQ(nf)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL IPGMS (YFFT, N, LYFFT, LDSTAK,
+ NF, PERI, LPERI, FREQ, LFREQ, NPRT)

===

IPGMP: Compute and print an integrated periodogram analysis of a series;
input periodogram rather than original series

PER(nf), FREQ(nf)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL IPGMP (PER, FREQ, NF, N, LDSTAK)

===

IPGMPS: Compute and optionally print an integrated periodogram analysis of a
series; input periodogram rather than original series; return
integrated periodogram and corresponding frequencies

PER(nf), FREQ(nf), PERI(nf)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL IPGMPS (PER, FREQ, NF, N, LDSTAK, PERI, NPRT)

===

<12-15>
Utility Subroutines

CENTER: Subtract the series mean from each observation of a series; return
the centered series (no printed output)

Y(n), YC(n)
:
:
CALL CENTER (Y, N, YC)

===

TAPER: Center a series about its mean and apply a split-cosine-bell taper;
return the tapered series (no printed output)

Y(n), YT(n)
:
:
CALL TAPER (Y, N, TAPERP, YT)

===

FFTLEN: Compute the minimum extended series length for using the Singleton
FFT; return the extended series length (no printed output)

CALL FFTLEN (N, NDIV, NFFT)

===

MDFLT: Smooth a periodogram by applying a sequence of modified Daniell
filters; return the smoothed periodogram (no printed output)

INTEGER KMD(nk)
PER(nf), PERF(nf)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL MDFLT (PER, NF, NK, KMD, PERF, LDSTAK)

===

FFTR: Compute the Fourier coefficients of an input series of
observations; return the Fourier coefficients (no printed output)

YFFT(n), AB(nfft)
:
:
CALL FFTR (YFFT, N, NFFT, IEXTND, NF, AB, LAB)

===

<12-16>
Subroutines for Bivariate Spectrum Estimation Using the
Fourier Transform of the Cross Correlation Function

BFS: Compute and print a bivariate Fourier spectrum analysis of a pair of
series

Y1(n), Y2(n)
:
:
CALL BFS (Y1, Y2, N)

===

BFSS: Compute and optionally print a bivariate Fourier spectrum analysis
of a pair of series using user-supplied control values; return
squared coherency and phase components of the cross spectrum and the
corresponding frequencies

INTEGER LAGS(nw)
Y1(n), Y2(n), CSPC2(nf,nw), PHAS(nf,nw), FREQ(nf)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL BFSS (Y1, Y2, N,
+ NW, LAGS, NF, FMIN, FMAX, NPRT,
+ CSPC2, ICSPC2, PHAS, IPHAS, FREQ, LDSTAK)

===

BFSF: Compute and print a bivariate Fourier spectrum analysis of a pair of
series; use FFT for computations

YFFT1(nfft), YFFT2(nfft)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL BFSF (YFFT1, YFFT2, N, LYFFT, LDSTAK)

===

<12-17>
BFSFS: Compute and optionally print a bivariate Fourier spectrum analysis
of a pair of series using user-supplied control values; use FFT for
computations; return squared coherency and phase components of the
cross spectrum and the corresponding frequencies

INTEGER LAGS(nw)
YFFT1(nfft), YFFT2(nfft), CSPC2(nf,nw), PHAS(nf,nw), FREQ(nf)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL BFSFS (YFFT1, YFFT2, N, LYFFT, LDSTAK,
+ NW, LAGS, NF, FMIN, FMAX, NPRT,
+ CSPC2, ICSPC2, PHAS, IPHAS, FREQ)

===

BFSM: Compute and print a bivariate Fourier spectrum analysis of a pair of
series with missing observations

Y1(n), Y2(n)
:
:
CALL BFSM (Y1, YMISS1, Y2, YMISS2, N)

===

BFSMS: Compute and optionally print a bivariate Fourier spectrum analysis
of a pair of series with missing observations using user-supplied
control values; return squared coherency and phase components of the
cross spectrum and the corresponding frequencies

INTEGER LAGS(nw)
Y1(n), Y2(n), CSPC2(nf,nw), PHAS(nf,nw), FREQ(nf)
DOUBLE PRECISION DSTAK(ldstak)
COMMON /CSTAK/ DSTAK
:
:
CALL BFSMS (Y1, YMISS1, Y2, YMISS2, N,
+ NW, LAGS, NF, FMIN, FMAX, NPRT,
+ CSPC2, ICSPC2, PHAS, IPHAS, FREQ, LDSTAK)

===

BFSV: Compute and print a bivariate Fourier spectrum analysis of a pair of
series; input covariances rather than original series