File Name: README2.TXT
GAUSSX 9.0 rev 1
Last minute notes and corrections
******************************************************************************
Long Names under GAUSS
While GAUSS supports long names of up to 32 characters, it does
not support long names in matrices with character elements. Thus
long names are not supported for GAUSSX variables.
******************************************************************************
CHANGES from 8.0 (1) to 9.0 (1)
1. PANEL
2. 64 bit support
3. Normality tests - AD, SW, SF, PPC
4. Additional survival models
5. Cox Snell and martingale residuals
6. Nonparametric survival estimation - SURVIVAL
7. Cox proportional hazards model
8. Utility functions - PERMS, COMBS, INTERP2
9. Enhanced print option
10. Random number generation for any distribution - RNDGEN
11. Utility functions - Inverse hyperbolic functions
12. Transforms vector to a normal variate - NORMAL
13. Invert a function - INVERT
14. Tests of distributions - PIT (Probability Integral Transformation)
15. Krinsky-Robb standard errors in Analyz
16 STATLIB - Distribution function library
******************************************************************************
1. PANEL
PANEL estimates the coefficients of a linear regression model for panel
data - data in which there are several observations or time periods for
each individual. Both fixed effects and random effects (error components
model) for balanced and unbalanced models are supported. The Hausman test
for testing the orthogonality of the random effects and the regressors, and
the Bresuch Pagan test for random effects are implemented.
Example
PANEL (p,d) y c x1 x2 x3;
IDENT = cusip;
PANEL (p,d) y c x1 x2 x3;
IDENT = cusip;
MODE = random;
The first estimation is for a fixed effects model (default), with the panel
identifier specified in the cusip variable. The second estimation is for the
effects model.
******************************************************************************
2. 64 bit support
Windows installation and menu programming modules have been updated to support
64bit chip sets.
******************************************************************************
3. The following normality tests have been added:
AD - Anderson Darling
SW - Shapiro Wilks
SF - Shapiro Francia
PPC - Probability plot correlation
In each case, both uncensored and Type I right censored samples are supported, and p-values are reported.
Example:
test (p) vseries;
method = ad;
test (p) vseries;
method = sw;
censor = cen;
The probability plot correlation test can provide correlation estimates and p-values for 14 distributions, as well as Q-Q plots.
******************************************************************************
4. Additional survival models
Survival analysis uses models that predict failure time. Data can be full or
censored. The following additional survival models have been added:
Beta
Cox (see below)
Gompertz
Pareto
Pearson
******************************************************************************
5. Survival model residuals
The following residuals for survival models are supported using the Duration
command:
The Cox-Snell residual.
The deviance residual.
The martingale residual.
The estimation residual.
The standardized residual.
The Schoenfeld residuals.
The scaled Schoenfeld residuals.
The efficient score residuals.
For each residual, the residual, the standard error, and the lower and upper confidence band
are reported
******************************************************************************
6. Nonparametric survival estimation - SURVIVAL
The SURVIVAL command computes non parametric estimates of survival measures using either
the Kaplan-Meier or Nelson-Aalan algorithms. The survival measures are:
The survival rate.
The cumulative failure rate.
The hazard rate.
The cumulative hazard rate.
For each measure, the rate, the standard error, and the lower and upper confidence band
are reported for each observation. Censoring and grouping are supported.
Example:
SURVIVAL (p,d) cf cferr;
MODE = cumfail;
VLIST = y ;
CENSOR = cen;
This provides the Kaplan-Meier measures for the cumulative failure rate of y, using the
censoring information in cen; the failure rate and its standard error are stored in cf
and cferr respectively.
******************************************************************************
7. Cox proportional hazards model
The Cox proportional hazards model is used extensively in biomedical research. It is
implemented using the same format as other parametric survival models. Right censoring is
supported, and ties are treated using the Breslow, Efron, Exact or Discrete methodologies.
Baseline survivor, hazard and cumulative hazard are supported, as well as the ordinary
survivor measures. Residuals include Cox-Snell, martingale, deviance,Schoenfeld, scaled Schoenfeld
and score.
Example:
FRML cq0 indx = b1*arrtemp + b2*plant;
FRML cq1 llfn = cox(failuret,indx,2);
ML (p,i) cq0 cq1;
DURATION res reserr;
MODE = mgale;
DURATION sv sverr svlb svub;
MODE = survival
The example shows an ML estimation of a Cox proportional hazards model with two covariates
(arrtemp, plant), using the Efron method for ties. The first duration command estimates the
martingale residuals and their standard error; the second estimates the survival function, its
standard error and the 95% lower and upper bounds.
******************************************************************************
8. Utility functions - PERMS, COMBS, INTERP2
PERMS computes the n! permutations of a vector of length n.
COMBS computes all the k combinations of a vector of length n.
INTERP2 does a two dimensional data interpolation (table lookup)
******************************************************************************
9. Enhanced print option
The number of lines printed per page using the PRINT command can be controlled
using MAXLINES under the Gaussx OPTION command.
******************************************************************************
10. Random number generation for any distribution - RNDGEN
RNDGEN creates (pseudo) random numbers from any distribution, given the CDF for that
distribution. Argument constraints are permitted.
Example:
proc beta_cdf(x,dta,pvec);
local v1, v2, cdf;
v1=pvec[1];
v2=pvec[2];
cdf = cdfbeta(x,v1,v2);
retp(cdf);
endp;
pvec = { .3, .5 };
cns = 3;
dta = 0;
y = RNDGEN(&beta_cdf,100,1,dta,pvec,cns);
This generates a random sample of 100 observations from a beta
distribution with shape parameters .3 and .5. Since the argument for a
beta variate lies in the range {0:1}, cns is specified as 3.
******************************************************************************
11. Inverse hyperbolic functions
ARCCOSH --- Inverse cosh function
ARCSINH --- Inverse sinh function
ARCTANH --- Inverse tanh function
******************************************************************************
12 Transforms vector to a normal variate - NORMAL
NORMAL transforms a vector such that the transformed vector is distributed
standard normal, using either the BoxCox or the Johnson transformation.
Example:
NORMAL (p) ndta;
METHOD = johnson;
VLIST = dta;
This example transforms the vector DTA using the Johnson transformation to a new
vector NDTA which follows a standard normal distribution.
******************************************************************************
13 Invert a function - INVERT
INVERT computes the inverse of a function. Given a function f(x,z), INVERT returns the
value of x such that f(x,z) = K.
Example:
proc sincos(x,z);
retp(sin(x).*cos(x));
endp;
kvec = { .3, .4, .5 };
x0 = .2;
ix = invert(&sincos,x0,0,kvec);
This example inverts the sincos function for the values shown in kvec.
******************************************************************************
14 Tests of Distributions - PIT (Probability Integral Transformation)
Typically, survival analysis requires the specification of the underlying survival
distribution, such as Weibull, lognormal, etc. The PIT test evaluates the CDF (or
equivalently the survival rate) using the estimated coefficients. Under the null of
the specified distribution, the CDF will be distributed uniform. The PIT test evaluates
the probability plot correlation (PPC) in this context, as well as the associated
p-value.
An adjustment is required to the probability integral transformation to take into account
that the distributional parameters are estimated. Thus the upper and lower bounds of the
PPC coefficient are derived, along with their associated p-values.
The test is available for all duration models, and for censored or uncensored data.
Example:
FRML eq0 indx = b0 +b1*arrtemp + b2*plant;
FRML eq1 llfn = lognorm(failuret,indx,scale);
ML (p,i) eq0 eq1;
TEST (p) failuret;
METHOD = PIT;
This test will determine if the assumption of log normality can be rejected.
******************************************************************************
15. Krinsky-Robb standard errors in Analyz
Analyz estimates the value and standard errors for a non-linear function of
estimated parameters. The standard method for calculating the standard error
is the 'delta' method. As an alternative, the Krinsky-Robb simulation can be
specified - it is often used in WTP (willingness to pay) models.
FRML eq1 y = b1 + b2*x1 + b3*x2;
NLS eq1;
FRML ea1 c1 = (b2+b3)/b1;
ANALYZ ea1;
METHOD = KR;
REPLIC = 10000;
This creates the parameter c1, and estimates its standard error using
10000 simulations of the original parameters, drawn from a multivariate normal
distribution with the estimated covariance matrix.
******************************************************************************
16. STATLIB - Statistical Distribution library
For each of the distributions given below, the following functionality is
provided:
fn_llf likelihood function
fn_pdf probability density function
fn_cdf cumulative density function
fn_cdfi inverse cumulative density function
fn_rnd random number generator
Continuous functions:
Beta
Cauchy
Chisquared
Exponential
F
Gamma
Half Normal
Inverse Gaussian
Laplace
Largest Extreme Value
Levy
Logistic
Log Logistic
Log Normal
Normal
Pareto
Pearson
Smallest Extreme Value
Students t
Triangular
Uniform
Von Mises
Weibull
Discrete functions:
Bernoulli
Binomial
Geometric
Hypergeometric
Negative Binomial
Poisson
Uniform
******************************************************************************
CHANGES from 7.0 (2) to 8.0 (1)
1. ANOVA
2. FREQ and CROSSTAB
3. TABULATE enhancement
4. Variable case respected
5 Local Whittle process
6. FILTER enhancements
7. Survival processes
8. DURATION command (survival and hazard)
9 Count processes
10 FORCST enhancements
11. Batch mode
******************************************************************************
1. ANOVA
ANOVA implements an N-Way analysis of variance providing an adjusted (Type III)
sum of squares for fixed, random or mixed models, which can be balanced or
unbalanced. Covariates are permitted, and for random or mixed models, variance
components are reported. The model is specified in terms of nested effects and
interaction effects.
Example:
ANOVA (p) score noise subject etime;
VALUE = 0 0 1;
MODEL = { 1 1 0, 1 1 1 };
In this ANOVA model, the dependent variable is score, followed by three
categorical variables - noise subject and etime. The VALUE field specifies
whether a categorical variable is random or fixed - so etime is random. The
MODEL field specifies two interaction effects - noise*subject, and
noise*subject*etime.
******************************************************************************
2. FREQ and CROSSTAB
These two commands are now implemented as part of the Gaussx package
Example:
freq (p) zcat;
crosstab (p,s) Activity Gender;
mode = num fit min max row%;
catname = Slight Moderate High Female Male ;
******************************************************************************
3. TABULATE enhancement
The following additional modes are supported for tabular output:
FIT Expected cell count
RESID Raw residual
STDRES Standardized residual
ADJRES Adjusted residual
CHISQ Chi-sq contribution for each cell
******************************************************************************
4. Variable case respected
In previous versions, variables were displayed in upper case. For
the most part, case is now respected.
******************************************************************************
5. Local Whittle process
The local Whittle estimator is a semi-parametric estimator of the
degree of differencing in a fractionally integrated process, based on the
periodogram. The local Whittle, exact local Whittle, and feasible exact
local Whittle are supported.
Example:
FRML eq1 llf = WHITTLE(y, d);
ML (p,i) eq1 ;
PERIODS = 80;
OPLIST = elw;
******************************************************************************
6. FILTER enhancements
The FILTER command now supports the following filters:
arma Autoregressive Moving Average
dediff Inverse difference
detrend Demean and detrend
diff Difference
hp Hodrick-Prescott
linear Linear recursive (IIR)
standard Standardize
******************************************************************************
7. SURVIVAL processess
Survival analysis uses models that predict failure time. Data can be full or
censored. The following survival models are supported:
Exponential
Gamma
Gumbel (Largest Extreme Value)
Inverse Gaussian
Logistic
Loglogistic
Lognormal
Normal
Smallest Extreme Value
Weibull
Syntax:
FRML eq0 indx = b0 +b1*arrtemp + b2*plant;
FRML eq1 llfn = logistic(failuret,indx,scale);
ML (p,i) eq0 elg1;
TITLE = "Logistic Model";
This would estimate the structural parameters b0, b1 and b2, and the
scale parameter using ML.
******************************************************************************
8. DURATION command
The DURATION command computes duration measures, based on the last survival
model estimation. The duration measures are:
The survival rate.
The inverse survival rate.
The hazard rate.
The cumulative hazard rate.
For each measure, the rate, the standard error, and the lower and upper confidence band
are reported for each observation.
Example:
FRML eq0 indx = b0 +b1*arrtemp + b2*plant;
FRML eq1 llfn = lognorm(failuret,indx,scale);
FRML ec1 scale >= 0;
PARAM b0 b1 b2 scale
VALUE = 15 0 0 1;
ML (p,i) eq0 eq1;
EQCON = ec1;
DURATION hz hzerr hzlb hzub;
OPLIST = hazard;
PRINT (p) failuret hz hzerr hzlb hzub;
This would evaluate the lognormal survival model, and then evaluate the
hazard rate, its standard error, and the lower and upper band for each
observation.
******************************************************************************
9. Count processes
Models that require integer date - such as number of events - require
specific processes. Data can be full or truncated. The following count
models are supported:
Negative Binomial
Poisson
Example
FRML eq1 indx = b0 +b1*age + b2*party + b3*unem;
FRML eq2 llf = negbin(wars,indx,gamma0,0);
ML (p,d,i) eq1 eq2;
TITLE = "Negative Binomial Model";
This would estimate the structural parameters b0, b1, b2 and b3 using ML.
Note that the Poisson process now requires 3 arguments.
******************************************************************************
10. FORCST enhancements
After having estimated a set of equations, FORCST can now produce the
predicted value and standard errors for variables that are non-linear functions
of the estimated parameters
PARAM b0 b1 b2 sigma;
VALUE = 1 1 1 .5;
FRML eq1 qhat = b0*(K^b1).*(L^b2);
NLS (p) eq1;
FRML eq2 q2 = exp(ln(b0) + (b1+b2)*ln(Z));
FORCST q2;
EQNS = eq2;
FORCST q2se;
EQNS = eq2;
MODE = stderr;
A Cobb-Douglas production function is estimated using NLS. These parameters are then
utilized in a second production function (eq2). Predicted values are derived in q2
using the first forcst, and the standard error in q2se for each observation in the
second forcst. The standard errors are predicated on the covariance matrix produced by
the NLS estimation.
If the estimation method uses ML, then the equation list can consist of a number of
equations.
************************************************************************************************************************************************************
11. Batch mode
Both Windows and UNIX now can operate in batch mode:
tgauss -b gaussxb
The Gaussx options and project is read from the Gaussx configuration file.
******************************************************************************
CHANGES from 7.0 (1) to 7.0 (2)
1. GAUSSPlot support
2. Ordered Logit/Probit process
3. Discrete choice summary statistics
4. Additional Tests
5. Non Parametric Tests
6. Constraint enhancement
7. Maple 10 support
8. Additional forecast modes
9. Durbin Watson probabilities
10. Gini coefficients
11. Additional support function (ischar)
12. Command file comments
******************************************************************************
1. GAUSSPlot Support
Gaussx now supports both Publication Quality Graphics (default), and
GAUSSPlot, if the latter is installed. The choice can be made in the
Gaussx configuration file, or as an option.
GAUSSPlot is powerful, but has a significant learning curve. Gaussx
makes using GAUSSPlot easy. For example, to plot a couple of curves
in Gaussx, one would use the syntax:
OPTION gplot;
PLOT (p) gnp export;
This would provide the graphic under GAUSSPlot. To customize the
graphic, and have this customized version available in future runs,
one would do the following from the GAUSSPlot window:
1. From the menu, select files/macro/record, and give the macro
a name - say test1
2. Customize the graph, using the interactive GAUSSPlot GUI.
3. When done, click the "stop macro" button.
To run the plot from Gaussx, use the command:
OPTION gplot;
PLOT (p) gnp export;
FNAME = test1;
Graphics without tears. And since the Gaussx source files are available,
one can always customize GAUSSPlot graphics using the command language if
one needs additional control.
******************************************************************************
2. Ordered Logit/Probit process
ML estimation of the linear or nonlinear ordered logit or probit model is
facilitated by the ORDLGT and ORDPRBT commands respectively.
Example:
param alow a1 a2 a3 ahigh;
value = -30 2 3 4 +30;
const alow ahigh;
frml eq1 xb = (alow~a1~a2~a3~ahigh) - (b1*x1 + b2*x2);
frml eq2 llf = ordlgt(ycat,xb);
ml (p,i) eq1 eq2 ;
This example estimates a four choice ordered logit model.
An example is given in test08.prg
******************************************************************************
3. Discrete choice summary statistics
The following summary statistics are reported for discrete choice models
(logit, probit, ordered)
Likelihood ratio test
Percent Correctly Predicted
Pseudo R-Squared measures
Madalla
McFadden
Cragg and Uhler
Ben-Akiva and Lerman
Information criteria
Akaike
Bayesian
Hannan-Quinn
Pearson Residual Chi-sq test
Deviance Chi-Sq test
Bera, Jarque and Lee normality test (probit)
Hosmer-Lemeshow test (binomial logit, probit)
Concordance/Discordance measures
Somers' D
Goodman-Kruskal Gamma
Kendall's Tau-a
******************************************************************************
4. Additonal Tests
The following parametric tests have been implemented.
ANOVA - Analysis of Variance
Welch's Test for equality of means, unequal variance
Bartlett's Test for equality of variances:
******************************************************************************
5. Non Parametric Tests
The following non-parametric tests have been implemented.
Test of randomness
RUNS one sample
Tests of location (2 sample, paired)
SIGN
WILCOXON
WALSH
Tests of location (2 sample, unpaired)
MW Mann Whitney U test
Tests of location (K sample, blocked)
FRIEDMAN
CONOVER
Tests of location (K sample, unblocked)
KW Kruskal-Wallis (3 or more treatments)
MOOD ( K medians)
Tests of scale
LEVENE
BF (Brown-Forsythe)
OBRIEN
Example:
TEST x1 x2;
METHOD = WILCOXON;
This tests for the equality of the medians of x1 and x2.
******************************************************************************
6. Constraint enhancement
A constraint equation of the form:
frml ed1 x11 + x21 >= 325;
required a value on the RHS. This has been relaxed.
Example:
const b325;
value = 325;
frml ed1 x11 + x21 >= b325;
or
c325 = 325;
frml ed1 x11 + x21 >= c325;
******************************************************************************
7. Maple 10 support
Automatic differentiation can now be carried out under Maple 9,0, 9.5
or Maple 10.
******************************************************************************
8. Additional forecast modes
The following are availble for OLS:
Cook's D measure using MODE = COOK.
The Hat vector using MODE = HAT.
Prediction limits using MODE = BOUNDS. (Default 95%)
******************************************************************************
9. Durbin Watson probabilities
The Durbin Watson probabilities for the null hypothesis of
no positive serial correlation and no negative serial correlation
are provided with the diagnostic statistics.
OLS (p,s) y c x1 x2;
******************************************************************************
10. Gini Goefficients
Gini coefficeints are used to measure the degree of income inequality.
If ymat is an nxk matrix of incomes by n groups for k countries, then
library gaussx;
g = gini(ymat);
will return a kx1 vector of Gini coefficients.
This is pure GAUSS code; the source in src\gxprocs.src
******************************************************************************
11. Additional support function (ischar)
ischar(x) returns 1 if x is a character vector, else zero.
******************************************************************************
12. Command file comments
Gaussx had extended the command file language to include the GAUSS
syntax for comments. Thus:
// - rest of the line is a comment.
/* .. */ - text between these markers is a comment
The Gaussx interpetation of '?' remains the same, and is synonymous
with '//'
******************************************************************************
CHANGES from 6.0 (2) to 7.0 (1)
New Estimation Routines:
1 PLS - Partial Least Squares
2 RSM - Response Surface Methodology
New Tools:
3 Consumer surplus and deadweight loss
4 Linear Programming
5 Murphy Topel two step estimation
6 Interpolation
Optimization Enhancements
7 Global Optimization algorithm
8 Nelder-Meade algorithm
9 Other enhancements
New diagnostics and tests
10 Bayesian convergence diagnostics
11 CORDIM - Corrlelation dimension
12 LYAPUNOV - Lyapunov exponent
13 Hansen test
14 Jarque-Bera test
15 KPSS test
16 Newey-West test
Other
17 Bayesian estimation examples
18 Maple 9.5 support
19 Normal process
20 DGP - Wiener and Fractional Brownian processes
21 Matrix I/O enhancements
22 Cluster enhancement
23 Spreadsheet processing enhancement
******************************************************************************
1. PLS - partial least squares
Partial least squares is especially appropriate when the data consist of
many predictors relative to the number of observations - PLS can
indicate the few underlying predictors that account for most of the variation
in the response variable.
Gaussx generates the regression statistics, the percentage variation
explained by successive factors, the VIP statistic, and also generates
cross-validated PRESS statistics.
Example:
LIST xlist;
SYMBOL = x;
RANGE = 1 15;
PLS (p,d,s) y xlist;
ORDER = 4;
MAXIT = 8;
This undertakes PLS using explanatory variables x1 - x15, with a maximum
of 4 factors. A PRESS table based on up to 8 factors is also produced.
******************************************************************************
2. RSM - response surface methodology
RSM is a methodology that is used extensively in engineering to solve an
optimization problem using simulation. Initially an experimental design is
used to fit a set of observed responses to a set of factors; typically this can
be modeled by NLS or OLS on polynomial expansion of the factors. A desirability
measure or distance metric is specified as a function of the (fitted) responses,
and in the second step the optimal factor choice that maximizes this metric is
determined by optimization. Since the response surface is not smooth, and
typically has many local optima, the default process is to use the genetic
algorithm.
The following desirability measures/distance metrics are supported
Derringer and Suich desirability function
Harrington's desirability function.
Euclidian distance.
Standardized Euclidian distance.
Mahalanobis distance.
Chebyshev metric.
Manhattan metric.
Minkowski metric.
Example:
frml eqy1 y1 = a0 + a1*x1 + a2*x2 + a3*x3;
frml eqy2 y2 = b0 + exp(b1*x1+b2*x2+b3*x3 + b4*x1^2);
let pmat[2,6] = 120 170 0 1 1 max
400 500 600 1 1 target;
param a0 a1 a2 a3;
param b0 b1 b2 b3 b4;
nls (q) eqy1;
nls (q) eqy2;
const a0 a1 a2 a3;
cosnt b0 b1 b2 b3 b4;
param x1 x2 x3;
value = 0 0 0;
lowerb = -1 -1 -1;
upperb = 1 1 1;
rsm (p,i) eqy1 eqy2 ;
maxit = 40;
mode = ds;
vlist = pmat;
This example demonstrates a 3 factor 2 response RSM example. The two
responses are first estimated using NLS. The parameters of the estimation
equation are held as constants, and the factors (x1, x2, x3) now become
the parameters. The second step involves the optimization of the desirability
measure, in this case the default (ds) Derringer and Suich. A matrix (pmat),
which defines the values required for the specified desirability measure, is
specified in vlist. Optimization occurs in the RSM command similarly to
ML. In the default, the genetic algorithm is used as the main method for \
locating the optimum.
******************************************************************************
3. Consumer surplus and deadweight loss
The 'Welfare' command evaluates three measures of consusmer surplus -
compensating variation, equivalent variation and Marshallian surplus -
for demand systems given initial and final prices. The standard error
of the calculated consumer surplus and the deadweight loss are also
evaluated.
Example:
library gaussx;
p0 = .75; @ starting price @
p1 = 1.5; @ finish price @
y = 720; @ income @
let b = 4.95 -14.22 .082; @ parameters in demand function @
omega = 0;
{cv,secv,dwl} = welfare("cv",&qgas,p0~p1,y,b,omega);
proc qgas(p,y,b); @ demand function @
local z;
z = b[1] + b[2]*p + b[3]*y;
retp(z);
endp;
This demonstrates the evaluation of the compensating variation and deadweight
loss for a single good with a linear demand function. An example with multiple goods
and a non-linear demand system is given in test46.prg
******************************************************************************
4. Linear Programming
A liner programming (lp) module has been added.
Example:
param x11 x12 x13 x21 x22 x23;
lowerb = 0 0 0 0 0 0;
unitcost = 0.09;
frml eq1 totcost = unitcost*( 2.5*x11 + 1.7*x12 + 1.8*x13
+ 2.5*x21 + 1.8*x22 + 1.4*x23) ;
frml es1 x11 + x12 + x13 <= 350;
frml es2 x21 + x22 + x23 <= 600;
frml ed1 x11 + x21 >= 325;
frml ed2 x12 + x22 >= 300;
frml ed3 x13 + x23 >= 275;
lp (p) eq1;
eqcon = es1 es2 ed1 ed2 ed3;
mode = minimize
******************************************************************************
5. Murphy Topel two step estimation
The use of a predicted value derived from parameters derived in the
first step of a two step process, and then used as a variate in the second
step, results in biased estimates of the parameter standard errors. The Murphy
Topel correction at the second step takes into account that the variate is a
function of the first step parameters, in order to obtain appropriate standard
errors.
Gaussx implements the M-T two step process for any two estimation processes
using NLS and/or ML. The only requirement is to specify each process by including
a MODE statement indicating the step.
frml ml1 xb1 = a0 + a1*exper + a2*educ + a3*white;
frml ellf1 llfp = probit(employ,xb1,0);
frml ml2 mr2 =ln( pdfn(-xb1)./cdfn(-xb1));
frml ml3 resid = wage - (b0 +b1*educ + b2*exper + b3*fprof + b4*mr2);
frml ellf2 llfn = normal(resid);
param a0 a1 a2 a3;
ml (p,i) ml1 ellf1;
method = nr nr nr;
mode = step1;
title = "First step - Probit";
const a0 a1 a2 a3;
param b0 b1 b2 b3 b4 ;
ml (p,i) ml1 ml2 ml3 ellf2;
method = nr nr nr;
mode = step2;
title = "Second step - linear, MT correction";
In this example, a probit is estimated in the first step, and a hazard
function is derived and used as a variate in the second step, which is
a linear regression. The first step is characterized by specifying
mode = step1,
and the second step by
mode = step2.
The Murphy Topel corrected standard errors are displayed in the second
estimation.
An example is given in test47.prg
******************************************************************************
6. Interpolation
Interpolation has been added as a support function:
Example:
let xtarg = .3 .4 .5;
yint = INTERP(y,x,xtarg);
This computes the univariate cubic interpolation, yint for the target
points xtarg, given a grid x with associated function values y.
******************************************************************************
7. Global Optimization
Global optimization (GO) has been added to the methods available for NLS
ML. This is a modified Lipschitzian optimization, but does not require
an estimate of a Lipschitz constant. It uses a modified global search
strategy over potentially optimal hyper cubes. It successfully found the
global optimum for a suite of tests from the Maple Global Optimization
toolbox. Unlike hill climbing methods, GO will find the global optimum
in a context where there are many local optima (such as kernel estimation),
and is not affected by initial starting conditions.
******************************************************************************
8. Nelder-Meade algorithm
The Nelder-Meade downhill simplex algorithm has been added:
METHOD = bfgs nm nr;
******************************************************************************
9. Non-Linear Estimation enhancements
Constrained optimization can now be used with all the non-hill climbing search
methods - genetic algorithm, global optimization, Nelder-Meade and simulated
annealing.- the parameters are constrained to the feasible region through
a penalty function.
MAXIT can now consist of two elements - The first element specifies the
maximum number of iterations. The second element, if specified, determines
the frequency of the printing of the iterations.
Constrained optimization has been enhanced resulting in faster convergence.
The quadratic programming method can be specified for the step type:
STEP = QP;
(The QP algorithm is used for the step method when constraints are specified;
this is called SQP - sequential quadratic programming).
******************************************************************************
10. Bayesian convergence diagnostics
Convergence diagnostics are included with the `s' option. These include
Geweke's NSE (Numerical Standard Error), RNE (Relative Numerical Efficiency),
and a Chi-squared test for parameter stability.
******************************************************************************
11. CORDIM - Correlation dimension
The correlation dimension - an approximation of the fractal dimension -
provides the dimension of a time series using the Grassberger and Procaccia
method.
Example:
CORDIM (p,g) z;
ORDER = 2;
This generates the correlation dimension of z assuming an embedded dimension
of 2.
******************************************************************************
12. LYAPUNOV - Lyapunov exponent
The Lyapunov exponent, which provides a measure of how quickly close
trajectories diverge over time, is implemented using the Wolf algorithm.
Example:
LYAPUNOV (p) x;
ORDER = 3;
This generates the correlation dimension of x with an embedded dimension
of 3.
******************************************************************************
13. KPSS test
The KPSS test for level and trend stationarity.
Example:
test (p) gnp;
method = kpss;
order = 7;
******************************************************************************
14. Hansen Test
The Hansen test of Overidentifying Restrictions is used in an instrumental
variables context.
Example:
test (p) qf;
method = hansen;
order = 2;
******************************************************************************
15. Jarque-Bera test
The Jarque Bera test for normality have been added to the "test" command -
it already exists as an OLS diagnostic.
Example:
test (p) vseries;
method = jb;
******************************************************************************
16. Newey West D Test
The Newey West D test permits restriction testing in an instrumental variables
context.
Example:
test (p) qf1 qf0;
method = nw;
order = 3;
******************************************************************************
17. Bayesian estimation
A number of new examples using MCMC have been added. These include
POISSON, SURE and MNP.
******************************************************************************
18. Maple 9.5 support
Automatic differentiation can now be carried out under either Maple 9,0
or Maple 9.5.
******************************************************************************
19. Normal Process
Permits estimation of an equation or system of equations using ML instead
of NLS
frml eq1 mr2 = pdfn(x)./cdfn(x);
frml es1 rs1 = y1 - (a0 + a1*age + a2*educ + a3*mr2);
frml es2 rs2 = y2 - (b0 + b1*sex + b2*mr2);
frml ellf llf = normal(rs1~rs2);
ML eq1 es1 es2 ellf;
The source is gsx\src\gxprocs.src.
******************************************************************************
20. DGP - data generating process
Gaussian, Wiener and fractional Brownian (fBm) processes have been
added to the DGP command.
struct DGPS gs;
gs.diff = .8;
gs.process = "brownian";
y = dgp(gs);
This generates a fractional Brownian process with a Hurst parameter of 0.8.
******************************************************************************
21. Matrix I/O
While the Gauss loadm and save commands allow matrices to be loaded/saved
in a binary format, the .fmt format is specific to GAUSS. To use
matrices in other applications, it is useful to be able to do matrix I/O as
in pure binary or ascii. GETM returns a matrix from a binary or ascii
data file of type double, while PUTM writes a matrix to a binary or ascii
data file of type double. Source is on gauss\gsx\src\gxprocs.src
EXAMPLE
library gaussx;
x = rndu(4,3);
fname = "c:\\temp\\mydata.bin";
ret = putm(fname,x,0,0); // write to file as binary, no append
xx = getm(fname,3); // read matrix from file
******************************************************************************
22. Cluster enhancement
The Chebyshev metric (mode = CHEB;) has been added to the distance metric
methods
******************************************************************************
23. Spreadsheet processing enhancement
Gaussx uses the data exchange tool to read and write spreadsheets. The
advantage is that it is fast, and does not require any spreadsheet to be
installed. However, it does not support the most recent version of Excel.
An option in the Gaussx configuration file (gaussx.cfg) provides a choice
between using data exchange or the Excel process to handle .xls files. The
Excel process is slower, and requires that Excel be installed.
******************************************************************************
CHANGES from 6.0 (1) to 6.0 (2)
1 ARFIMA estimation
2 ARMA and ARIMA (ML)
3 VARMA enhancement
4 Additonal forecast options
5 DGP - data generating process
6 FILTER command
7 FEVAL - FRML evaluation
8 Miscellaneous commands
9 DENOISE changes
******************************************************************************
1. ARFIMA estimation
The Autoregressive Fractionally Integrated Moving Average process permits the
estimation of long memory models. These models are estimated in Gaussx using
NLS or ML. Stationary and invertibility conditions are satisfied using constrained
optimization.
The ARFIMA (p,d,q) process is given by
phi(L) (1-L)^d y = theta(L) e
where L is the backward shift operator, y is the detrended zero mean time series,
and e is the vector of zero mean white noise innovation.
Example:
param d phi1 phi2 theta1 theta2;
value = .5 .5 0 .5 0;
frml eq1 y = arfima(y, d, phi1|phi2, theta1|theta2);
frml ec1 mroot(phi1|phi2) <= .9999;
frml ec2 mroot(theta1|theta2) <= .9999;
nls (p,d,i) eq1
eqcon = ec1 ec2;
oplist = constant;
forcst yhat;
method = fit blp;
range = 1990 1999;
This example demonstrates how a (2,d,2) ARFIMA model is estimated. d is the fractional
dimension, and there are 2 AR coefficients (phi1,phi2) and 2 MA coefficients
(theta1, theta2). A constant is specified in the oplist command. The parameters
are estimated using constrained NLS, where the constraints are specified in ec1
and ec2, and where mroot is a Gaussx routine for returning the value of the largest
root in the complex plane. In addition, besides the normal AR and MA requirements
for stationarity and invertibility, requirements on d include d > -1 for invertibility,
and -.5 < d < .5 for stationarity.
Estimated values are available using the forcst command. If a range is given,
actual values are used up to the first date in the range, and forecast values for
the dates up to the second date. Two methods are available - Naive and Best Linear
Predictor (BLP). Forecast standard errors are avaiable using METHOD = STDERR.
The source is gsx\src\arfimax.src, and an example is given in test43.prg
******************************************************************************
2. ARMA and ARIMA estimation
See discussion of ARFIMA above. Both ARMA and ARIMA models can be estimated
using ML or NLS. The syntax for ARMA models is:
frml eq1 y = arma(y, phi, theta);
and for ARIMA models:
frml eq1 y = arima(y, d, phi, theta);
where phi is set of AR parameters, and theta is the set of MA parameters. Note that
d, the degree of differencing, is a constant integer.
Examples are provided in test43.prg
******************************************************************************
3. VARMA enhancement
The presence of a constant is now specified using the oplist option, in the
same manner as ARIFIMA
******************************************************************************
4. FORCST enhancements
The standard error of the forecast is available for OLS, ARIMA and ARFIMA
using MODE = STDERR.
Step ahead modes are available for ARFIMA, ARIMA and ARMA using
MODE = NAIVE and MODE = BLP (best linear predictor)
The log likelihood vector is available for ML procedures using MODE = LLF.
The probabilities and category forecasts are available for MNL, MNP, PROBIT(ML)
and LOGIT(ML) using MODE = PROB and MODE = CAT respectively.
The conditional variances are available for all GARCH processes, using
MODE=CONDVAR.
******************************************************************************
5 DGP - data generating process
DGP provides a method for creating a data vector or matrix of a particular
type of process, used either as a GAUSS or Gaussx command. The following
processes are supported:
arfima
arima
arma
arch
arch_t
garch
garch_t
logit
probit
tobit
poisson
linear
linear_t;
Syntax:
y = DGP(vs)
or
genr y = DGP(vs)
where vs is a structure of type DGPS, which is defined as
struct DGPS{
matrix ar;
matrix arch;
scalar constant;
scalar df;
scalar diff;
string distribution;
matrix garch;
matrix index;
matrix ma;
scalar nobs;
matrix param;
scalar prob;
string process;
matrix scale;
scalar stderr;
matrix variance;
matrix vlist;
};
For any particular data generating process, only those elements of vs that are
relevant need to be completed. Thus a typical process could be:
struct DGPS gs;
gs.arch = .4 |.15 ;
gs.garch = .3;
gs.index = 10 + 2*x;
gs.process = "garch";
y = dgp(gs);
This would create a vector y consisting of a structural component (10 + 2*x)
and a residual with a Garch distribution, with the parameters for the arch process
specified in gs.arch (the first element is the constant), and the parameters for
the garch process specified in gs.garch.
A brief description of each process follows. In each case a structure of the
form struct DGPS vs; is assumed.
arfima: vs.ar autoregressive parameter vector
vs.ma moving average parameter vector
vs.diff fractional differencing parameter
vs.stderr residual standard error
vs.constant process constant
vs.process = "arfima";
arima: vs.ar autoregressive parameter vector
vs.ma moving average parameter vector
vs.diff integer differencing parameter
vs.stderr residual standard error
vs.constant process constant
vs.process = "arima";
arma: vs.ar autoregressive parameter vector
vs.ma moving average parameter vector
vs.stderr residual standard error
vs.constant process constant
vs.process = "arma";
arch: vs.index structural component
vs.arch ARCH parameter vector
vs.process = "arch";
arch_t: vs.index structural component
vs.arch ARCH parameter vector
vs.df degrees of freedom for t distribution
vs.process = "arch_t";
garch: vs.index structural component
vs.arch ARCH parameter vector
vs.garch GARCH parameter vector
vs.process = "garch";
garch_t: vs.index structural component
vs.arch ARCH parameter vector
vs.garch GARCH parameter vector
vs.df degrees of freedom for t distribution
vs.process = "garch_t";
logit vs.index structural utility variable list
vs.prob % data in alternative #1 (binomial only)
vs.variance disturbance variance vector or matrix
vs.process = "logit";
probit vs.index structural utility variable list
vs.prob % data in alternative #1 (binomial only)
vs.variance disturbance variance matrix
vs.process = "probit";
tobit vs.index structural component
vs.stderr residual standard error
vs.process = "tobit";
poisson vs.index structural component (lambda)
vs.process = "poisson";
linear vs.index structural component variable list
vs.variance residual variance
vs.vlist Gaussx variable list
vs.process = "linear";
linear_t vs.index structural component variable list
vs.variance residual variance
vs.df degrees of freedom for t distribution
vs.vlist Gaussx variable list
vs.process = "linear_t";
Notes:
Binomial logit and probit is implied if a single index variable is specified.
vs.prob is the mean probability of alternative 1. The logit and probit models
assume the disturbances are distributed with a Weibull or normal distribution
respectively, which requires scaling by specifying either vs.scale (logit),
vs.stderr or vs.variance. This DGP returns a categorical vector with elements
of 0 and 1
Multinomial logit and probit are implied if there is more than a single index
variable - each variable represents the structual utility associated with that
alternative. Fom MNL, only the diagonal elements of vs.variance are used.
This DGP returns a categorical vector with elements {1..k}, where k is the
number of alternatives.
Arch and Garch processes automatically gnenerate the conditional variance as
a GAUSS global stored under the name _ht.
In the linear model, a single equation is implied if a single index variable
is specified. For the normally distributed error, either vs.stderr or vs.variance.
must be specified, while for the t_distribution, vs.df is required.
Multivariate normal or t distribution are implied if there is more than a
single index variable - each variable represents the structual form for that
equation. Both distributions require a residual variance specified in vs.variance,
while for the t_distribution, vs.df is required. This DGP returns an endogenous
variable for each equation, and so the GENR statement cannot be used in this
case. The results are returned in the variable list specified in vs.vlist.
Examples
1. struct DGPS qrs;
genr xb = 4 +5*x1-3*x2;
let qrs.index = xb;
qrs.prob = .4;
qrs.stderr = 1;
qrs.process = "probit";
genr y = dgp(qrs);
2. struct DGPS qrls;
genr x0 = 0*c;
genr x1 = 2-4*z1;
genr x2 = 3+5*ln(z3);
let qrls.index = x0 x1 x2;
qrls.scale = .25;
qrls.process = "logit";
genr ycat = dgp(qrls);
3. let vmat[2,2] = .5 .2 .2 .8;
struct DGPS ls;
let ls.index = xb1 xb2;
ls.variance = vmat;
let ls.vlist = y1 y2;
ls.process = "linear";
call dgp(ls);
print (p) y1 y2;
The first example shows how a binomial process is specified using DGP; 40% of the
generated data will fall in category 1. The second example demonstates a multinomial logit DGP, with 3 alternatives. Example 3 shows how a linear system
is generated with correlated error structure. The structural components (the RHS)
are in xb1 and xb2, and the endogenous variables - y1 and y2 - are created in
the Gaussx workspace, and subsequently printed.
The source is gsx\src\dgpx.src, and examples are given in test07.prg, test43.prg and
test 44.prg
******************************************************************************
6. FILTER command
This command filters data using a variety of filters. Current filters are:
linear
diff
arma
Syntax:
y = FILTER(ftype,x,p1,p2,y0)
ftype - string - filter type
x - NxK data to be filtered
p1 - PxK matrix - first parameter
p2 - QxK matrix - second parameter
y0 - PxK matrix - initial values
ftype = linear:
y = filter("linear",x,a,b,y0);
This is a one dimensional recursive digital filter. The data in vector x is
filtered by vectors a and b to create y. The linear filter is an IIR
(infinite impulse response) or recursive filter. y0 provides the initial
conditions.
y(n) = b(1)*x(n) + b(2)*x(n-1) + ... + b(nb+1)*x(n-nb)
- a(1)*y(n-1) - ... - a(na)*y(n-na)
ftype = diff:
y = filter("diff",x,d,0,y0);
This is a difference filter, where d is the degree of differencing - both
integer and fractional differencing are supported. The first elements of y
are prespecified with y0, which should be of order ceil(d).
y = (1-L)^d x
ftype = arma
y = filter("arma",x,phi,theta,y0);
This is an autoregressive moving average filter. phi is the AR component, and
theta is the MA component. The first elements of y are prespecified with y0,
which should have the same order as phi.
(1-phi(L)) y = (1+theta(L)) x
This is pure GAUSS code; the source in src\filterx.src
******************************************************************************
7. FEVAL - FRML evaluation
The FEVAL statement evaluates a type 2 FRML, and stores the result in the
Gaussx workspace.
Example
FRML eq1 y1 = a0 + a1*x1 + a2*ln(x2);
FRML eq2 y2 = b0 + b1*x1 + b2*x2;
FEVAL yh1 yh2;
eqns = eq1 eq2;
GENR x1 = x1^2;
FEVAL y1;
eqns = eq1;
This evaluates y1 and y2 for each equation respectively, and stores
the result in yh1 and yh2. After the GENR statement, eq1 is reevaluated,
and the result stored in y1.
******************************************************************************
8. Miscellaneous commands
polyinv - polynomial inversion
psi = polyinv(phi,n);
Computes the polynomial inversion psi(B) = 1/phi(B), where
psi is (n+1 x 1) polynomial coefficient vector
polydiv - polynomial division
psi = polydiv(theta,phi,n);
Computes the polynomial division psi(B) = theta(B)/phi(B)
where theta and phi are input polynomial coefficient vectors
and psi is (n+1 x 1) polynomial coefficient vector.
deconv - inverse convolution
x1 = deconv(z,x2);
Computes the deconvolution of a vector. If
z = conv(x1,x2);
then:
x2 = deconv(z,x1);
x1 = deconv(z,x2);
xgamma - Extended gamma function
y = xgamma(x);
Computes gamma(x), where x takes all real values -inf < x < inf
_lngamma - natural log of the gamma function
y = _lngamma(x);
Computes lngamma(x), for x > 0; this effectively extends the lnfact function.
This function is available as lngamma in GAUSS 6.0.17 and above. The manual
shows this function as lngamma, which is now a reserved word.
mroot - returns the modulus of the largest root
r = mroot(phi);
For stationarity and invertibility, the largest root of phi must have a
modulus < 1. Note that for the single equation, this is
equivalent to the roots of the characteristic equation
C(z) = 1 - phi(1)z - phi(2)z^2 - ... phi(p)z^p = 0
having modulus greater than 1, or "lie ouside the unit circle". This
is required for an AR process to be stationary. Similarly, an MA
process is invertible if the largest root of theta must have a modulus
less than unity.
mroot can be used for both vector (single equation) or matrix (VARMA)
arguments.
Example:
frml eq1 y = arma(y,phi1|phi2,theta1|theta2);
frml ec1 mroot(phi1|phi2) <= .9999;
frml ec2 mroot(theta1|theta2) <= .9999;
nls (p,i) eq1;
oplist = constant;
eqcon = ec1 ec2 ;
The ARMA process is estimated using constrained NLS, where the constraints
impose both stationarity and invertibility on the AR amd MA coefficients
respectively.
pdroot - returns the smallest root
r = pdroot(rho);
For positive definiteness, the roots must all be positive.
******************************************************************************
9. DENOISE changes
The FILTER option has been renamed WAVELET
******************************************************************************
CHANGES from 5.0 (1) to 6.0 (1)
1. Improved constrained optimization
2 Automatic differentiation
3 Loadproc and Saveproc
4 Gaussx Configuration file
5. VARMA estimation
6. Non linear binomial univariate, bivariate and trivariate Probit Model
7. Conditional variance forecasts for MGarch models
8. Installation enhancement
******************************************************************************
1. Improved constrained optimization
Under Gaussx, constrained optimization uses the sequential quadratic
programming algorithm. For non-linear constraints, SQP could violate
inequality assumptions, especially during initial estimates when parameters
are far from the optimum. This has been corrected.
******************************************************************************
2. Automatic differentiation.
The default mode of estimating gradients and Hessians for non-linear
optimization is to use finite differencing: eg. d sin/dx = (sin(x+h)-sin(x))/h.
This is a numerical solution, and while it is easy to implement, it is slow especially
when estimating the Hessian with a large number of parameters. An analytic solution
d sin/dx = cos(x) is much faster and more accurate. Gaussx uses the new Maple kernel
available in Maple 9 to provide the AD code - in effect, Gaussx now includes the
full Symbolic Tools package.
AD under Gaussx requires minimal work by the user. The command
OPTION ad;
will result in all gradients and Hessians to be evaluated analytically. Alternatively,
individual analytic gradients or Hessains can be invoked using the existing syntax:
gradient = &symgrad;
hessian = &symhess;
A complete debug mode is available by including the print option 's':
ML (p,i,s) eq1 eq2;
method = nr bhhh nr;
gradient = &symgrad;
AD works with ML, NLS and FIML. Depending of the size and type of problem
analytic gradients and Hessians can result in speed increases for optimization
problems of between 3 and 10 fold.
AD requires that Maple 9 is installed on your machine.
An example is shown in Test42.prg
******************************************************************************
3. Loadproc and Saveproc (AD)
The gradient and Hessian procdures created using symgrad and symhess can be
stored and reused without having to recreate them.
ML (p,i,s) eq1 eq2;
method = nr bhhh nr;
gradient = &symgrad;
SAVEPROC ;
gradient = "garchp";
This example stores the symbolic gradient as garchp.prc on the data path.
proc garchp; endp;
LOADPROC;
gradient = "garchp";
ML (p,i,s) eq1 eq2;
method = nr bhhh nr;
gradient = &garchp;
This example retrieves the previously stored procedure, "garchp", and uses it
for the analytic gradient in the ML estimation. Note that the model structure
and parameter space must remain unchanged for the gradient to be valid.
******************************************************************************
4. Gaussx configuration file
The Gaussx configuration file (gsx\gaussx.cfg) has always been used to set
menu options for Unix platforms. It is now also used to set options for
Windows.
**************************************************** **************************
5. VARMA estimation
Vector Autoregressive Moving Average models are estimated using NLS.
Stationary and invertibility conditions are satisfied using constrained
optimization.
Example:
frml eqw w := varma(y1~y2, p_mat, q_mat);
frml eq21 y1 = c1 + submat(w,0,1);
frml eq22 y2 = c2 + submat(w,0,2);
frml ec1 mroot(p_mat) <= .9999;
frml ec2 mroot(q_mat) <= .9999;
nls (p,d,i) eq21 eq22;
eqsub = eqw;
eqcon = ec1 ec2;
eqw returns the matrix of fitted values based on AR coefficient matrix
p_mat and MA coefficient matrix q_mat. These are estimated using
constrained NLS, where the constraints are specified in ec1 and ec2,
and where mroot is a Gaussx routine for returning the value of the
largest root in the complex plane.
The source is gsx\src\varmax.src, and an example is given in test39.prg
******************************************************************************
6. Non linear binomial univariate, bivariate and trivariate Probit Model
ML estimation of the linear or nonlinear binomial probit model is
facilitated by the PROBIT command. Models include univariate, bivariate
and trivariate probit.
Example:
frml eq1 xb1 = a0 + a1*x1 + a2*x2;
frml eq2 xb2 = b0 + b1*x3 + b2*x4;
frml ellf2 llf = probit(y1~y2,xb1~xb2,r12);
frml ec1 r12^2 <= .9999;
ml (p,i) eq1 eq2 ellf2;
method = bhhh bhhh nr;
eqcon = ec1 ;
This example estimates a bivariate probit model, restricting the
correlation coefficient to lie in the prescribed range.
An example is given in test40.prg
******************************************************************************
7. Conditional variance forecasts for MGarch models
The conditional variance forecast for a multi equation Garch model can be
derived using the FORCST command. A range should be specified -- the
estimated conditional variance is returned up to the first date of the
range, and the forecast based on the information up to the first date
is returned for the period specified.
See test19.prg for an example.
******************************************************************************
8. Installation enhancement
Following changes made to GAUSS, Gaussx can now be installed to a
GAUSS folder with an embedded space - for example c:\program files\gauss50.
******************************************************************************
CHANGES from 4.0 (2) to 5.0 (1)
1. Genetic Algorithm
2. Tabular output
3. Census X12 smoothing
4. Matrix I/O
5. Denoising using wavelets
6. Evaluation of strings
7. Enhanced spreadsheet handling
8. Maple 8 support
******************************************************************************
1. Genetic Algorithm
The genetic algorithm is a search algorithm that attempts to find a
global optimum by simulating an evolutionary process. Each member of
a population (a chromosome) has a set of genes (parameter values). The
members breed, and the genes can mutate. Successful chromosomes - ie
those that are most fit, as evaluated by the optimization process - survive,
while the rest die.
This algorithm can be very useful for testing if one is at a global optimum,
since as in real evolution, change occurs in bursts. It can also be used
for situations when one gets a "failure to improve likelihood'
error message. It is also very robust when parameter upper or lower
bounds are encountered. The GAUSSX implementation allows GA to be used as
one of the step methods in non-linear estimation (NLS or ML);
one can use GA to find the parameter values, and the Hessian can be
derived using one of the other stepsize algorithms for the final method.
An example of the use of GA is given in TEST21.PRG. Most of the
options that are available for GA can be left at the default value.
Control over these options is provided by a 4 element vector called
GENALG; these elements are:
1. The population size (ie number of chromosomes). Default = 30.
2. Number of matings. Default = 4.
3. Probability of mutation. Default = 0.4.
4. Degree of mutation. Default = 0.25;
Unlike the other optimization algorithms, GA is not smooth - change occurs
sporadically. If there is no change in the best fitness after MAXSQZ iterations,
the degree of mutation is reduced by 50%. Convergence is determined by testing
when the standard deviation of the fitness is less than TOL[2]. Since each
breeding consists of an iteration, MAXIT should be set fairly high.
******************************************************************************
2. Tabular output
The TABULATE command now produces a global matrix called STATS which
consists of the numeric portion of the tabular output.
******************************************************************************
3. Census X12 smoothing
X-12 is the most recent seasonal adjustment program developed by the
US Census Bureau. It is based on the X-11 seasonal adjustment method,
which is widely used by statistical agencies throughout the world. X12
also provides the ability to forecast seasonally adjusted time series.
Gaussx provides support for X12 from within the SAMA procedure, and
uses the code made available by the Census Bureau. The user can
either define a specification file (sama_x12.spc), or use defaults
provided by Gaussx. Unadjusted quarterly or monthly data is provided
as an input, and the seasonally adjusted data is created.
Examples of the use of X12 are given in TEST37.PRG. Documentation for
Census X12 is linked in the file gauss\gsx\doc\x12_doc.htm.
Example 1.
sama (p,b) equip_s1;
method = x12;
vlist = equip;
Using the unadjusted data, equip, x12 selects the best ARIMA model, and
creates the seasonally adjusted data in equip_s1. The (b) option generates
brief output.
Example 2.
sama (q) equip_s2;
method = x12;
mode = log;
nar = 2; ndiff =1; nma = 1;
nsar = 2; nsdiff = 1; nsma = 2;
vlist = equip;
In this example, the user defines the ARIMA model to be used, as well as a
data transformation (mode = log). The (q) option generates no (quiet) output.
Example 3.
sama sales_s factors ;
fname = g:\\gauss50\\gsx\\x12\\s11.spc;
catname = d11 d10;
vlist = sales;
In the third example, the user defines a specification file (.spc) which
provides the required information to X12 - thus the full power of the Census
X12 is available. Series d11 and d10 are to be saved; they are specified
in the catname option, and are saved under sales_s and factors respectively.
******************************************************************************
4. Matrix I/O
While the Gauss loadm and save commands allow matrices to be loaded/saved
in a binary format, the .fmt format is specific to GAUSS. To use
matrices in other applications, without having to go via ASCII, it is
useful to be able to do matrix I/O as pure binary. GETM returns an nx1
vector from a binary data file of type double, while PUTM writes a matrix
to a binary data file of type double.
Source is on gauss\gsx\src\gxprocs.src
EXAMPLE
library gaussx;
x = rndu(4,3);
ret = putm("c:\\temp\\mydata.bin",x,0);
vx = getm("c:\\temp\\mydata.bin");
nc = 6;
nr = rows(vx)/nc;
z = reshape(vx,nr,nc);
******************************************************************************
5. Denosing using wavelets
Removing the noise from a noisy signal or timeseries can be accomplished
by first deriving the wavelet coefficients, and then removing or shrinking
those coefficients below a certain value, since low value coefficients will be
associated with noise, and then reconstructing the signal from the adjusted
wavelet coefficients. Gaussx uses the following syntax:
DENOISE xnew;
VLIST = xold;
FILTER = wfilter;
ORDER = wlevel;
MODE = threshmode;
VALUE = threshvalue;
PERIODS = exclobs;
xnew is the name of the denoised series, derived from the original
noisy series xold.
wfilter is the wavelet filter to be used - the available filters are Daubechies
(daub1 to daub10), Symlets (sym1 to sym10) and Coiflets (coif1 to coif5).
wlevel specifies the order of the wavelet decomposition tree. The number of
observations must be exactly divisible by 2^wfilter. Setting wlevel to 0
results in the maximum order possible.
threshmode specifies how the threshold is set - it can consist of three values:
STD [MAD] - method of deriving the standard deviation of the noise
SOFT [HARD] - soft or hard thresholding
[UNIV] MINIMAX SURE - thresholding method
threshvalue permits a user specified value to be used as a threshold
exclobs is the number of initial wavelet coefficients to be excluded from
thresholding. The default is 5.
An example of the DENOISE command is given in test38.prg.
Example:
DENOISE (p) gnpc;
VLIST = gnp;
FILTER = daub8;
MODE = MAD SURE SOFT;
In this example, the original series gnp is denoised to form the
new series gnpc using the Daubechies filter with 8 vanishing moments. The
threshold is derived using the Stein unbiased estimator with soft thresholding
and a MAD estimator of the noise standard deviation. The wavelet level used is
maximal.
******************************************************************************
6. Evaluation of strings
EVAL executes a string consisting of a GAUSS expression(s).
This can be useful when an external application can supply GAUSS
with a string.
Example:
library gaussx;
s = "x=14; x^2;"
eval(s);
GAUSS creates the variable x, and displays 196.
******************************************************************************
7. Enhanced spreadsheet handling
The OPEN statement can use the headers from an Excel spreadsheet
as variable name if no variable list is specified. The SAVE command
will save an Excel file with headers if an extension of .xls is specified.
Excel 95, 97, 2000 and 2002 files are supported.
Example:
open; fname = mydata.xls;
genr x2 = x1^2;
save x1 x2; fname = newdata.xls;
******************************************************************************
8. Maple support
Symbolic mathematics using Maple is now available for Maple 4 through
Maple 8. Note that Maple 8 supports VectorCalculus, and thus the
analytic gradient and hessians are available (see test23.prg).
This option is only available on Gaussx for Windows.
******************************************************************************
CHANGES from 4.0 (1) to 4.0 (2)
1. Command file line length
2. Temp file correction
3. Support for GAUSS 5.0
******************************************************************************
1. Command file line length
The maximum length of a line in a Gaussx command file has been increased
from 80 characters to 256 characters.
******************************************************************************
2. Temp file correction
Variable descriptions in non-linear estimation could print out junk. This
has been corrected.
******************************************************************************
3. Support for GAUSS 5.0
GAUSS 5.0 is supported under this revision.
******************************************************************************
CHANGES from 3.8 (4) to 4.0 (1)
1. GAUSS compatibility
2. Source file location
3. Data processing mode
4. Garch enhancements
5. Feasible MNP and Ranked MNP
6 EM estimation of Gaussian Mixture
7 Double-Bounded Dichotomous Choice Model
8 Conditional variance forecasts for Garch models
9 Error trapping under GAUSS 4.0
10 Markov Switching Models
******************************************************************************
1. Gauss compatibility
Gaussx 4.0.1 is forward compatible with GAUSS 4.0, and backward compatible
with GAUSS 3.5 and 3.6.
******************************************************************************
2. Source file location
In previous versions, non-compiled procs used by Gaussx were placed on
the gauss\src folder. This sometimes causes conflict, since everyone else
places their procs in the same place. To avoid such conflicts, these procs
are now stored on the gauss\gsx\src folder.
******************************************************************************
3. Data processing mode
Gaussx was designed from the outset to be able to process huge data files -
such as census files with millions of observations, and hundereds of variables.
Intermediate storage for data sets of this magnitude has to occur on disk. For
most jobs, however, with typically 500 observations and 10 variables, the
data storage requirements are trivial compared to the RAM available on most
machines. While in the past Gaussx had one data processing mode (on disk),
Gaussx 4.0 permits the user to specify which mode to use. The default is RAM,
which also results in a significant improvement in speed.
The data processing mode is specified in the CREATE statement at the beginning
of a Gaussx command file:
CREATE 1956 1995;
oplist = [RAM]/DISK
RAM, the default, implies that the Gaussx workspace is stored in RAM, while
DISK implies that the Gaussx workspace is stored on disk using the pathnames
specified in the Gaussx Option Screen.
******************************************************************************
4. GARCH enhancements
The following GARCH model has been added:
FIGARCH - Fractionally integrated Garch
The source is gsx\src\garchx.src, and examples using
constrained ML are provided in test07.prg.
******************************************************************************
5. Feasible MNP
The number of parameters in an MNP estimation increases quadratically with
the number of alternatives, because of the elements of the covariance matrix.
Feasible MNP (FMNP) permits an MNP estimation without specifying the
covariance parameters, which are derived in-process using an SEM algorithm.
This procedure is superior to conventional MNP, since it is faster, involves
fewer parameters, and generates estimates with smaller variance. If ranked
data is available, it can be utilized (RMNP).
The likelihood is specified in an FRML:
FRML EQLLF llf = FMNP(ycat,x);
where
ycat Nx1 vector of alternatives chosen for each
observation.(FMNP)
ycat Nxk matrix of ranking alternatives chosen for each
observation.(RMNP)
x NxK matrix, utility evaluation for each observation for
each choice
Source: Jon Breslaw, "Multinomial Probit Estimation without Nuisance
Parameters", Econometrics Journal, Nov, 2002.
The source is gsx\src\fmnp.src, and examples using
MNP, FMNP and RMNP are provided in test33.prg.
******************************************************************************
6 EM estimation of Gaussian Mixture
Estimation of the parameters of a mixed Gaussian distribution using EM.
Based on Hamilton, Time Series Analysis 685-689.
See Test34.prg for implementation.
******************************************************************************
7 Double-Bounded Dichotomous Choice Model
Estimation of the parameters of a DBDC model using ML
Example:
FRML eq1 xb = b0 + b1*sex + b2*educ;
FRML eq2 llf = dbdc(r,t,xb,sig);
PARAM b0 b1 b2 sig;
VALUE = 4 0 1 0;
ML (p,i) eq1 eq2;
TITLE = "DBDC estimation";
The first equation (eq1) specifies the estimate for the latent variable,
typically the willingness to pay. It can be linear or non-linear. The
second equation specifies the likelihood function for the DBDC process:
r is the Nx2 response variable for the first and second questions
t is either:
3x1 vector t_0, = the WTP amount initially
t_u = upper (for yes to 1st question)
t_l = lower (for no to 1st question)
or :
Nx2 vector, the WTP value for the 1st and 2nd question
xb is the vector of fitted value for willingness to pay
sigma is the parameter for the standard error
Source documentation in gsx\src\gxprocs.src
See Test35.prg for implementation
******************************************************************************
8 Conditional variance forecasts for Garch models
The conditional variance forecast for a single equation Garch model can be
derived using the FORCST command. A range should be specified -- the
estimated conditional variance is returned up to the first date of the
range, and the forecast based on the information up to the first date
is returned for the period specified.
frml eq1 e = y - g0 - g1*x ;
frml eq3 llfn = garch(e, a0|a1|a2, b1|b2);
ml (p,i) eq1 eq3;
method = nr bfgs nr;
eqcon = aq0 aq1 bq1 sabq;
forcst hfitg;
mode = condvar;
range = 196701 196712;
See test07.prg for an example.
******************************************************************************
9 Error trapping under GAUSS 4.0
The error log file for GAUSS 4.0 differs from previous versions. Error
trapping following a GAUSS error has been reworked. After a GAUSS error,
type
gaussx; <enter>
from the GAUSS command prompt to view a summary of the cause of the error.
******************************************************************************
10 Markov Switching Models
Markov Switching Models (MSM) allows a given variable to follow different
time series processes over different subsamples. The choice of subsample is
determined by a Markov process. Gaussx enables the estimation of the parameters
in the structural equations, as well as the parameterization of the Markov
process. AR components are permitted. The estimation is undertaken using ML.
Example:
FRML eq1 res1 = y - m1;
FRML eq2 res2 = y - m2 - b1*delgnp;
FRML eq3 lllf = msm(res1~res2, sig1|sig2, p11~p12, phi1|phi2|phi3);
ML (p,i) eq4 eq5 eq6;
title = "Markov Switching Model (2)";
method = nr bfgs nr;
The first two equations specify the two state structural process, and return
the residuals from each state. The third equation evaluates the likelihood
from the MSM process. The arguments for MSM are:
res - Nxs matrix of residuals from s alternative states
sig - sx1 parameter vector (or scalar) of residual standard deviation
p - (s-1)*s parameter matrix of probability loadings for Markov process
phi - narx1 parameter vector for AR process.
In this case, a two state model is estimated with an AR(3) structure. The first
regime is simply a constant, while the second regime uses delgnp as a predictor
of y.
MSM produces the following outputs:
Coefficient values and standard errors
Markov transition probabilities
Ergodic probability for full state vector
Ergodic probability for primitive states
Filtered probabilities
Smoothed probabilities
Source: James Hamilton (1994), "TIME SERIES ANALYSIS" pp 685-689
The source is gsx\src\msm.src, and examples are provided in test36.prg.
******************************************************************************
CHANGES from 3.8 (2) to 3.8 (4)
1. Gauss 3.6 compatible
2. Speed up graphic display
******************************************************************************
1. Gauss 3.6 compatible
Gaussx 3.8.3 is compatible with Gauss 3.6; earlier versions of
Gaussx, which were compatible with Gauss 3.5, crashed on 3.6 because of
changes in the Gauss kernel.
*****************************************************************************
2. Speed up graphic display
PQG is now displayed much more quickly than in earlier versions.
******************************************************************************
CHANGES from 3.8 (1) to 3.8 (2)
1. Support for Gauss when GAUSSHOME not specified
******************************************************************************
Support for Gauss when GAUSSHOME not specified
GAUSS 3.5.35 and later no longer require the GAUSSHOME environment
variable. This is now supported under 3.8.2
******************************************************************************
CHANGES from 3.7 (1) to 3.8 (1)
1. New interface for Windows version
2. Maple support
3. Frontier Production Function
4. Additional statistical distributions
5. Trust regions
6. DW quasi-Newton method for non-linear optimization
7. Multivariate Normal distribution
8. Additional Diagnostics
9. A timer control
10. Rename update
11. Bayesian estimation using MCMC
12. Additonal forecast modes
13. Additonal Kalman Filter options
14. Stochastic Volatility estimation
15 Cluster analysis
16 Robust Estimation
*****************************************************************************
1. New interface for Windows version
Gaussx now is fully integrated in the new Gauss 3.5 interface. All the
editing, execution and help features of the new Gauss interface is
available for Gaussx, as well as additional menu items for specific
Gaussx components.
*****************************************************************************
2. Maple support
Symbolic mathematics using Maple is now available for MapleV, rev 4 & 5,
and Maple 6. This option is only available on Gaussx for Windows.
*****************************************************************************
3. Frontier Production Function
Estimate a frontier production function using ML. The residual, e, consists of
two components: e = v-u, where v is N(0,s^2_v), and u >= 0.
OLS y c x;
param s; value = ser;
param b0, b1; value = coeff;
param lam; value = .1;
FRML eq1 resid = y - b0 - b1*x;
FRML eq2 lf = fpf(resid,s,lam);
ML eq1 eq2;
In this example, OLS is used to get starting values. The frontier production
function takes three arguments - resid, the vector of residuals, and two
parameters - s, the standard error of resid, and lam, the ratio of s_u to
s_v
*****************************************************************************
4. Additional statistical distributions
The following distribution has been added to the pdf, cdf, cdfi
and rnd statements
Laplace: Takes two parameters, p1, the mean, and p2, a positive
scale parameter.
Wishart: Takes two parameters, p1, the degrees of freedom, and
p2, a positive definite covariance matrix of order k.
Normtl: Normal truncated left; takes 3 parameters, p1, the mean,
p2, the variance, and p3 the truncation point.
Normtr: Normal truncated right; takes 3 parameters, p1, the mean,
p2, the variance, and p3 the truncation point.
Mulitivariate t: Takes two parameters, p1, the degrees of freedom,
and p2, the PD covariance matrix (rnd only).
A number of other distributions have been reworked - much faster cdfi
and rnd.
*****************************************************************************
5. Trust Regions
Trust region methods in optimization define a region around the current
iterate, and then choose the step to be the approximate minimizer of
the model in this trust region. The trust region is defined as
||db|| <= del
where ||db|| is the norm of the change in the parameter estimate,
and del is a scalar defining the size of the trust region. If the
quadratic approximation of the function is close to the actual, then
is the constraint is binding, del will be increased. On the other hand,
if the quadratic approximation is poor, del will be decreased. See
J. Nocedal and S Wright, Numerical Optimization (Springer) for
details.
The GaussX implementation for non-constrained non-linear optimization
(NLS, FIML, GMM, ML) use the following commands:
STEP = steptype ;
TRUST = controllist;
STEP The default for step is LS (line search); the alternative
is TR (trust region).
TRUST controllist is a 4 element vector providing control over
the trust region options. These are:
1. Initial size of region - s. del is then calculated as
sqrt(k*s^2), where k is the number of parameters.
2. Maximum size of region - sm. delmax is then calculated as
sqrt(k*sm^2), where k is the number of parameters.
3. Tolerance for Newton estimate of Lagrange multiplier
0.001 is suggested.
4. Maximum number of Newton iterations - 3 is suggested.
Example:
ml (p,i) eq1;
step = tr;
trust = .5 2 .001 5;
method = nr nr nr;
Remarks:
For a problem with a large number of parameters, creating the
Hessian is timeconsuming. For the trust region method, the
bfgs methodology can create an initial step direction and find
the optimum position in the trust region without having to do
a Cholesky factorization - thus this can be an efficient method
for large problems.
The Trust Region step methodology is especially useful for "hard"
problems. In econometrics, finding reasonable starting values for
parameters is often difficult, and poor starting values often
result in a "Failure to Improve Objective Function" error. Trust
Region step methodology can often provide a resolution to this type
of problem.
Example -- see test28.prg.
*****************************************************************************
6. DW quasi-Newton method for non-linear optimization
Recent literature suggests that the Dennis-Wolkowicz (DW) method
may be superior to BFGS or DFP. This algorithm is used when DW is
encountered in the method option:
NLS eq1;
method = nr dw nr;
[J.E.Dennis Jr. and H. Wolkowicz. "Sizing and least-change secant
methods" SIAM J. Numer. Anal. 30 (1993), pp 1291-1314.
*****************************************************************************
7. Multivariate Normal distributions
The functions pdf, cdf, and rnd had problems when the elements
of the covariance matrix were not all positive. This has been corrected.
******************************************************************************
8. Additional Diagnostics
The following methods have been added to the TEST command:
ARCH (Engle's LM Arch test)
TEST vecname;
method = arch;
order = ord;
vecname is the vector to be investigated - usually residuals
ord is the lag structure. See examples in the ARCH command
BKW (Belsley, Kuh and Walsh SVD test)
TEST x1 x2 x3;
method = bkw;
vlist = x3;
x1 x2 x3 are the vectors for which the SVD is to take place.
variables which are logs need to be escaled - these are specified
in vlist.
LBQ Ljung-Box Q test for AR
TEST x1 ;
method = lbq;
periods = nlag;
The correlogram and partial autocorrelogram are displayed, using
nlag lags.
*****************************************************************************
9 A timer control
This control calls the specified function every intval milliseconds.
It is used to create real time simulations
*****************************************************************************
10. Rename update
The rename command has been updated so that it can deal with v96
format
*****************************************************************************
11. Bayesian estimation using Markov Chain Monte Carlo
Bayesian estimation of the following models has been implemented:
ols with residuals distributed t
ols with heteroschedastic residuals
binomial probit
heteroschedastic binomial probit
tobit
An example of this command is:
MCMC (p,i,s) y c x1 x2;
catname = c x1 x2 s2;
userproc = &g_ols_t;
prior = df=3;
replic = 1100 100 200;
The data is read in either as a variable list, or a type I FRML - y
is the dependent variable, and c, x1, and x2 are the independent variables.
Statistics are kept for the parameters in the structural eqn, as well as
the residual variance.
The proc that is to be simulated is g_ols_t. Diffused priors are used, except
for the degrees of freedom for the t distribution, which is set at 3. 1100
replications are carried out, but the first 100 are scrapped, as burn in.
The iteration count is printed out every 200 iterations. At the end of the
run, the statistics for each of the coefficients are printed out
*****************************************************************************
12. Additonal forecast modes
The following types of forecast output are now available
after an OLS:
MODE = FIT Fitted values
RESID Residuals
RESIDSQ Squared residuals
STUDENT Studentized residuals
DFFITS Scaled difference in fitted values
DFBETAS Scaled difference in coefficients estimates
*****************************************************************************
13. Additonal Kalman Filter options
The Kalman OPLIST option has the following additonal components:
OPLIST =
STATE = svec. The state vectors are stored using svec as the root;
Thus if there are three coefficients in the measurement
equation, then the state vectors are stored as svec1, svec2
and svec3 respectively.
PRDERR = vector of one step ahead prediciton errors.
VARPRDER = vector of the variance of the one step ahead prediciton
errors
SMOOTH/[NOSMOOTH] computes smoothed estimates of the state
vectors.
CTRANS = constant for the transtion equation (kx1). Default = 0.
CMEAS = constant for the measurement equation (scalar). Default = 1
VMEAS = constant, variance of the residual in the measurement
equation. Default value is unity.
CTRANS, CMEAS and VMEAS also apply for Kalman (ML).
*****************************************************************************
14. Stochastic Volatility models
SV models can be estimated using ML; a quasi maximum likelihood
estimate is generated, based on the Kalman Filter algorithm. The vector
to be estimated should be detrended and have zero mean.
The transition equation has the form:
log(h_t) = g0 + g1*log(h_{t-1}) + eta_t
where eta_t is N(0,vmu). Thus there are 3 parameters to be estimated,
g0, g1 and vmu.
PARAM g0 g1 vmu;
value = .5 .5 1;
FRML eq1 llf = sv(res, g0|g1|vmu);
ML eq1;
The state vector, log(h_t) is available after the estimation.
*****************************************************************************
15. Cluster analysis
This procedure creates an hierarchical cluster tree, and optionally
graphs the tree - a dendrogram. It can be used in the standard
manner of deriving categories across spatial dimensions, or can be
used to group observations that are similar based on economic or other
characteristics. Cluster information is available as a Gaussx vector
after analysis.
Five distance metrics are available:
Euclidian distance.
Standardized Euclidian distance.
Mahalanobis distance.
City Block metric.
Minkowski metric.
Four linkage methods are available:
Single or nearest neighbour
Complete or furthest neighbour
Average linkage
Centroid linkage
Example:
CLUSTER (p) xcord ycord;
TITLE = "Mahalanobis metric";
MODE = mahal;
OPLIST = plot forecast=clustvar;
VALUE = .03 1;
CATNAME = "Mtl" "Tor" "NY" "LA" "SF" "Seatle";
*****************************************************************************
16. Robust estimation
This procedure allows for the estimation of linear models when the
distribution of the residuals is unknown. OLS is a ML estimate only
when the residuals are normal.
Six methods are available:
Quantile Regression
Least Absolute Deviation
Huber's t Function
Ramsay's E Function
Andrew's Wave Function
Tukey's Biweight
Quantile regression is carried out using the interior point algorithm
by Daniel Morillo and Roger Koenker. The other methods are evaluated
using iterated weighted least squares. Since the underlying distribution
of the residuals is unknown, the parameter covariance matrix is estimated
using bootstrapping.
Example:
ROBUST (p,i,s) y c x1 x2;
METHOD = quantile;
VALUE = .25;
REPLIC = 100;
*****************************************************************************
CHANGES from 3.6 (1) to 3.7 (1)
1. Symbol enhancement
2. Bitwise arithmetic
3. Quasi random sequences
4. Efficient Frontier
5. Maximum Entropy
6. Enhanced List command
7. Updated Gxedit.exe
8. Additional statistical distributions
9. TAB enhancement
10. GARCH enhancements
11. Mathematica support
12. Monte Carlo enhancement
13. Breusch Pagan test
14. Additional single equation diagnostics.
******************************************************************************
1. Symbol enhancement
Occasionally, a value comes up which is identical to a symbol -
for example,
g = 3;
j = .9329889551081693;
t = typecv(j);
t will have the value of 6, since $j is the symbol g. This occurs
rarely, but when it does, Gaussx now correctly identifies j as a value,
and not a pointer.
******************************************************************************
2. Bitwise arithmetic
The following radix and bitwise functions are defined - these are
pure Gauss functions:
radix(x,b) Convert x from decimal to base b
radixi(x,b) Convert x from base b to decimal
iand(a,b) Bitwise a and b
ior(a,b) Bitwise a or b
ieqv(a,b) Bitwise a eqv b
ixor(a,b) Bitwise a xor b
inot(a) Bitwise complement a
ishft(a,s) Shift bits s positions (+ve or -ve)
In all of these, the default word size is 16 bit.
******************************************************************************
3. Quasi Random Sequences
The Sobol sequence generator generates quasi random sequences of
numbers upto 6 dimensions. Such sequences fill a space more uniformly
than uncorrelated random sequences - in a sense they are maximally
avoiding of each number. See Numerical Recipes, Press et al, 2nd ed.
p 300.
x = rndqrs(n,k) will create an nxk matrix of these sequences.
******************************************************************************
4. Efficient Frontier
The Markowitz efficient frontier is created with the Gaussx command
FRONTIER. Given a set of assets and associated rate of returns, the
frontier is evaluated for the number of points specified.
FRONTIER ibm intel msft hp;
vlist = 6.5 9.3 8.9 8.6;
periods = 20;
will calculate the efficient frontier at 20 points for the four stocks
based on the current sample. A plot is also created, unless
OPLIST = NOPLOT is specified. The Gauss variables _RISK, _RETURN
and _WEIGHTS are returned, with dimensions 20x1 for _RISK and
_RETURN, and 20x4 weighting element for _WEIGHTS. An example is
given in Test25.prg
******************************************************************************
5. Maximum Entropy
ME evaluates the probabilities for a Maximum Entropy class of problems.
p = ME(k,&fct);
where k is the number of alternatives, and &fct is a pointer to
a procedure that computes the moment consistency constraints. Full
details are given in the source file - \gauss\src\maxentx.src
Example: library gaussx ;
k = 6; x = seqa(1,1,k); y = 3.0;
proc mcproc(p); retp( y-x'p); endp;
p = me(k,&mcproc);
This example evaluates the probability of each fall of the
dice based on the single observed "average" mean - in this case
3.0. For a fair dice, y would be 3.5.
An example is given in Test26.prg
******************************************************************************
6. Enhanced List command
The LIST statement has been enhanced to permit the creation of a
list of variable names with a common stem. The stem is specified
using the SYMBOL command, and the range in the RANGE command.
Example:
LIST xlist ;
symbol = xx;
range = 1 20;
In this example, the list xlist consists of the twenty variable names
xx1 through xx20.
******************************************************************************
7. Updated GxEdit.exe
The GxEdit.exe file, used for launching the editor on a stand
alone basis, was not correctly updated from Vsn 3.5. This has been
corrected.
******************************************************************************
8. Additional statistical distributions
The following distributions have been added to the pdf, cdf, cdfi
and rnd statements
Cauchy: Takes two parameters, p1, the median, and p2, a positive
scale parameter.
Gumbel: (Extreme value) Takes two parameters, p1, the mode, and
p2, a positive scale parameter.
Logistic: Takes two parameters, p1, the mean, and p2, a positive
scale parameter.
Pareto: Takes two parameters, p1, a positive location parameter,
and p2, a positive scale parameter.
******************************************************************************
9. TAB enhancement
GAUSS does not recognize the TAB character as a white space, and
this often causes problems. GAUSSX files can now accept a TAB
character - this is converted to a space character during the parsing
process.
******************************************************************************
10. GARCH enhancements
The following GARCH models have been added:
AGARCH - Asymmetric Garch
PGARCH - Power Garch
IGARCH - Integrated Garch
TGARCH - Truncated Garch
Other options include:
Non-normal distributions (student-t)
Garch in the mean
Leverage option
The source is src\garchx.src, and examples using
constrained ML are provided in test07.prg.
******************************************************************************
11. Mathematica support
Symbolic algebra can be used for symbolic differentiation and
integration, exact linear algebra, and equation solving. Gaussx
can use Mathematica to permit Gauss to undertake symbolic
manipulation. [Support for Maple is already part of Gaussx].
Simply include the Mathematica statements within the command file,
mark the Mathematica statements, and click the Mathematica button.
This works only for Gauss for Windows, and requires Mathematica 3
or higher.
An example is given in Test27.prg
******************************************************************************
12. Monte Carlo enhancement
If there are k elements in the vector specified in vlist, and
there are n replications, then the nxk matrix is stored in
_mcmat, which is available after the Monte Carlo run.
The current MCS iteration is stored in _mciter.
******************************************************************************
13. Breusch Pagan test
The Breusch-Pagan test for heteroscedasticity is undertaken using
the following syntax:
frml eq1 y c x2 x3;
test (p) eq1;
method = bp;
vlist = c x2 x4;
This tests for homoscedasticity of the residuals in eq1 based on
the auxiliary regression of the squared residuals against the
explanatory variables listed in vlist.
******************************************************************************
14. Additional single equation diagnostics.
The following diagnostics have been added to the single equation
printout.
Akaike Information Criterion: AIC = ln(RSS/T) + 2*k/T
Schwarz Criterion: SC = ln(RSS/T) + k*ln(T)/T
******************************************************************************
CHANGES from 3.5 (2) to 3.6 (1)
1. Sampling with/without replacement
2. Bug fixes
3. Likelihood gradient
4. Lagrange Multiplier Test
5. Choice of perturbation size for gradients
6. Pearson distribution
7. Normality test for Probit
8. Symbolic algebra
9. Data exchange
10. Gauss interface
11. 32 bit Gaussx desktop
12. Exact high dimensional MNP
13. Spectral analysis
14. Enhanced Neural Network Model
15. Enhanced PARAM statement
******************************************************************************
1. Sampling with/without replacement
Given a population of size n, and a required sample of size m, the
command:
y = rndsmpl(m,n,c) returns an mx1 vector of unique numbers
between 1 and n if c = 0; and an mx1 vector with replacements if
c = 1.
******************************************************************************
2. Bug fixes
Logical operations in Type II FRMLs did not work correctly in 3.5(2);
this has been corrected.
Ordered probit didn't work properly when the first category wasn't
unity. This has been corrected.
******************************************************************************
3. Likelihood gradient
For unconstrained parameter estimation, the gradient of the log
likelihood at the optimum is zero. In the constrained case, this
is no longer true, and the kx1 vector of dLLF/db is available as
a GAUSS global with the name GRADVEC.
******************************************************************************
4. Lagrange Multiplier Test
The Lagrange Multiplier test is a general test for testing the
restrictions imposed on a model, when the model estimated is the
restricted model. Thus it can be easily carried out after a
constrained estimation. The LM statistic is dL/db'Inv(I)*dL/db,
where dL/db is the gradient of the likelihood with respect to
each parameter, and I is the information matrix, both evaluated
at the restricted parameter values. The syntax of the command is:
TEST gradvec;
METHOD = LM;
ORDER = order;
VALUE = vcov;
gradvec is the gradient vector available as a GAUSS global from the
restricted estimation, order is the number of restrictions, and
vcov is the restricted parameter covariance matrix.
******************************************************************************
5. Choice of perturbation size for gradients
Numerical gradients are found using the standard (f(x+h)-f(x))/h.
The default for h is 1e-06, but there are some circumstances when
the function is particularly complicated where h should be smaller.
The value of h can be changed in an OPTION statement - for example:
OPTION gradh = 1e-08
******************************************************************************
6. Pearson distribution
The Pearson distribution is a very general three parameter distribution
that includes as special cases the beta, gamma, normal and t
distributions. As such, the restricted distribution can be used for
testing normality via an LM test. The family is modeled assuming
zero mean, and the only restriction is that c1 > 0. For the standard
normal distribution, c0 = 1, c1 = 0, and c2 = 0. The pdf, cdf,
cdfi and rnd are all supported:
prob = pdf("pearson", x,c0,c1,c2);
******************************************************************************
7. Normality test for Probit
The Bera, Jarque and Lee (IER, 1984) test for normality has been
included in the Probit procedure.
******************************************************************************
8. Symbolic algebra.
Symbolic algebra can be used for symbolic differentiation and
integration, exact linear algebra, and equation solving. Gaussx
uses Maple to permit Gauss to undertake symbolic manipulation.
Simply include the Maple statements within the command file,
mark the Maple statements, and click the Maple button.
This works only for Gauss for Windows, and requires MapleV, rev 4,
or higher.
******************************************************************************
9. Data exchange
Data exchange between Gauss and spreadsheet, database and ASCII
files (in both directions) is accomplished from the Gaussx for
Windows desktop file menu. The following file formats are
supported:
"WKS" "WK1" "WK2" Lotus v1-v2
"WK3" "WK4" "WK5" Lotus v3-v5
"XLS" Excel v2.1-v7.0
"WKQ" "WB1-WB6" Quattro v1-v6
"WRK" Symphony v1.0-1.1
"DB2" dBase II
"DBF" dBase III/IV
"DB" Paradox, FoxPro, Clipper
"CSV" "TXT" "ASC" ASCII delimited and packed
"PRN" ASCII formatted
******************************************************************************
10. Gauss interface
Gaussx for Windows can be used to run either Gaussx command files, or
a file containing Gauss code. The default for the latter is to run the
whole file, and record the output to an output file. However, an
interactive mode is available, whereby a marked block or single line is
executed, and the Gauss output is returned. This is equivalent to
running Gauss for Windows from the Gauss prompt, except that execution
of marked blocks is possible. In addition, all the power of the
Gaussx editor (split screens, bookmarks, multiple pages etc) are
at one's disposal.
******************************************************************************
11. 32 bit Gaussx desktop
The Gaussx for Windows desktop has been redone as a 32 bit Windows 95
application. This results in much better multi-tasking, and smoother
operation. The downside of this is that it requires a 32 bit
environment - Windows NT 3.51 or above, or Windows 95.
******************************************************************************
12. Exact high dimensional MNP
MNP estimation requires estimates of thethe multivariate normal
cumulative density function. For low dimension (k), the standard
GAUSS functions (cdfn, cdfbvn, cdftvn) work well. For higher
dimensions, one either used restricted covariance matrices (factor
analytic), simulated methods (GHK), or the recursive GAUSSX routine
CDFMVN, which was very slow for k > 6. Using the new GAUSS CDFMVN dll,
MNP can be estimated using CDFMVN very much more rapidly. Setting
_mnpint = 1 sets the integration algorithm to CDFMVN.
******************************************************************************
13. Spectral Analysis
A gauss command - SPECTRAL - has been implemented which generates
the periodogram for a given vector. An example is given in TEST24.PRG,
showing both a periodogram and filtering. A description of this
command is given in GXPROCS.SRC on the \GAUSS\SRC directory. SPECTRAL
can be used both as a GAUSS or a GAUSSX command.
******************************************************************************
14. Enhanced Neural Network Model
The ANN command has been enhanced.
1. The augmented model can now be estimated by specifying AUGMENT in
the program control options. Under this option, the hidden layer
consists of the sum of the hidden transfer function and a linear
function of the inputs - consequently the output weight vector (beta)
will have k extra elements.
2. The CDFN transfer function has been added.
3. The OLS transfer function has been added. This only applies to
the output function when the model is being estimated using NLS.
When the output transfer function is linear and the optimization is
undertaken using {\tt NLS}, it is possible to express the
output weights (beta) in a closed form. This results in a
significant reduction in the number of parameters that need to be
estimated. To use this option, set the output transfer function
to OLS; the output weights are available as a global called _nnbeta.
See the example in test20.prg for syntax.
******************************************************************************
15. Enhanced PARAM statement
The PARAM statement can be used to create a matrix of parameters
by specifying a single matrix name, as described in the manual. The
elements of this matrix can be altered in the usual way, element by
element. A block can now be changed by specifying a block position
in a range statement.
Example:
PARAM amat ;
symbol = a;
order = 4 3;
arow2 = 1~2~3;
CONST amat;
value = arow2;
range = 2 1 2 3;
In this example, a 4x3 matrix of coefficients is created with elements
a_ij. The second row is set to a constant {1 2 3} - the range
statement giving the initial row, initial column, final row,
and final column. The order of the matrix specified in VALUE must
match the block specified in RANGE.
******************************************************************************
CHANGES from 3.5 (1) to 3.5 (2)
1. Constrained parameter estimation
2. Spreadsheet conversion error corrected
3. New GAUSS functionality
4. Windows online help
******************************************************************************
1. Constrained parameter estimation
GAUSSX now supports non-linear parameter constraints under FIML,
GMM, ML and NLS. The constrained non-linear programming is undertaken
using sequential quadratic programming. In addition, the confidence
region for any specified confidence level for each parameter is
calculated.
Constraints are specified in the GAUSSX FRML command as logical
statements, as shown below:
FRML cq1 b0 + b1 >= .5;
FRML cq2 b3 == 0;
FRML cq3 b0^2 + 2*b1 <= 2.4;
The constraints can be linear or non-linear, and the relational
operator must be >=, ==, or <=.
Within the estimating procedure, the contraints are activated by
specifying the constraint equations within an EQCON option:
nls (p,i) eq1 eq2;
maxit = 100;
eqcon = cq1 cq2;
method = nr nr nr;
This example would undertake non-linear least squares on eq1 and eq2
in the usual manner, except that the constraint equations cq1 and cq2
are invoked. Obviously, the parameters specified in the constraints
must also occur in the structural equation. A sensible procedure is
to first estimate the model in the unconstrained case, and then use
these parameter values as the starting point for the constrained case.
For each constraint, GAUSSX prints out whether the constraint is
binding or not, as well as the value of the Lagrange mul