2. Files in the package.
readme.txt - description of the package,
lsm.ico - file with the icon for the main program,
LSM23-s.exe - the main program, single precision version
LSM23-r.exe - the main program, real precision version
LSM23-d.exe - the main program, double precision version (less maximum number of points)
LSM23-demo.exe - the main program, demo and real precision version
rtm.exe - auxiliary programs and modules
dpmi16bi.ovl,
egavga.bgi,
litt.chr
ELLIPSE.50 - directories with demo data,
PARALINE.47
HYPERB_2.50
3DStLine
3. Installation.
Copy the contents of the package to a directory (named LSM, for example).
The program does not modify the Windows registry.
4. Description of the program.
The LSM program performs computations with the Least Squares Method (nonlinear regression), i.e. it fits the parameters of one or more analytically given functions to given points.
Such an approximation may be done:
- in many dimensions,
- with input uncertainties on each axis,
- using input covariances or correlation coefficients, if needed,
- with curves that are or are not functions of any variable x,y,z,... (e.g. an ellipse), or with implicit functions,
- using many approximated curves simultaneously; curves may share some of their parameters as well as some points.
Each curve must have the form of an expression obtained by transferring all terms to one side of an equation.
Every parameter must be denoted by a1,a2,a3,...,
and every point coordinate - by x1,x2,x3,...
(instead of x,y,z,...).
In a given approximation, some points or some curves (if there are many) may be omitted.
When the input covariance matrix is used, calculation time is approximately proportional to the third power of the product of the number of points and the number of dimensions.
Calculation is also slowed significantly by the nonlinearity of the method, which results from assuming input errors for all measured quantities.
There are four ways of supplying each kind of data:
a. using a keyboard,
b. using previously entered data, if an approximation with the same basic data has been run recently,
c. using data contained in a text file in a user subdirectory; each such file must have a standard name (given in the sections below); the program can also save output data automatically in this subdirectory; once selected, the user subdirectory cannot be changed while running one set of basic data,
d. using data contained in any text file.
Data in a text file must be separated with spaces, tabs or end-of-line characters.
4.1. Input data (in spreadsheet format if in a file) must follow this scheme:
4.1.1. Basic data - in the following order:
a. number of point dimensions (max. 3275),
b. number of points
(max. 8190 - LSM24-s version),
(max. 5460 - LSM24-r version),
(max. 4095 - LSM24-d version),
c. number of functions (curves) that connect point coordinates (max. 20),
d. number of parameters in all functions (max. 3275).
Additionally, the following condition must be fulfilled:
the product of the number of points and the number of dimensions must be less than
16384 - LSM24-s version,
10922 - LSM24-r version,
8192 - LSM24-d version.
The name of the corresponding data file in the user subdirectory: basedata.dat .
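The limits above can be checked before starting a run. The following is only a minimal Python sketch and is not part of the LSM package; the function name check_basedata and the version keys "s", "r", "d" are assumptions made for illustration.

```python
# Hypothetical helper, not part of LSM: checks the limits listed in Section 4.1.1.
MAX_POINTS = {"s": 8190, "r": 5460, "d": 4095}      # maximum number of points per version
MAX_PRODUCT = {"s": 16384, "r": 10922, "d": 8192}   # points * dimensions must be below this

def check_basedata(dims, points, functions, params, version="s"):
    """Return a list of violated limits; an empty list means the data fit."""
    problems = []
    if dims > 3275:
        problems.append("too many dimensions")
    if points > MAX_POINTS[version]:
        problems.append("too many points")
    if functions > 20:
        problems.append("too many functions")
    if params > 3275:
        problems.append("too many parameters")
    if points * dims >= MAX_PRODUCT[version]:
        problems.append("points * dimensions too large")
    return problems

print(check_basedata(2, 4095, 1, 3, "d"))   # within all limits
print(check_basedata(2, 4096, 1, 3, "d"))   # breaks two limits
```

Returning a list of messages rather than a single flag makes it easy to see which of the limits is broken.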
4.1.2. Entering the analytical form of functions.
Every parameter must be denoted by a1,a2,a3,...,
and every point coordinate - by x1,x2,x3,...
(instead of x,y,z,...).
Each function must have the form of an expression obtained by transferring all terms to one side of the equation.
This form makes it possible to use complicated curves that are not functions, to use curves given by an equation that cannot be solved for any variable, or to rewrite the equation so that it is always computable.
The last case may be useful when a point coordinate or a parameter value that changes during computation goes beyond the domain boundary.
Therefore, express your fitting function x3=f(x1,x2) in the implicit form F(x1,x2,x3)=0
without including the "=0" part. Here is an example:
replacing z = a1*x^2 + a2*y + a3 with x3= a1*x1^2 + a2*x2 +a3
and writing it in the form F(x1,x2,x3)=0, without the "=0" part, this becomes:
a1*x1^2 + a2*x2 + a3 - x3
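To illustrate the implicit form, the expression above can be evaluated outside LSM. This is a small Python sketch, not LSM input syntax; the function name F and the sample numbers are chosen only for the example. A point lying exactly on the surface gives the value 0, and any other point gives a nonzero value.

```python
def F(x, a):
    """Implicit form of z = a1*x^2 + a2*y + a3, i.e. a1*x1^2 + a2*x2 + a3 - x3."""
    x1, x2, x3 = x
    a1, a2, a3 = a
    return a1 * x1 ** 2 + a2 * x2 + a3 - x3

params = (2.0, 3.0, 1.0)                 # illustrative values of a1, a2, a3
on_surface = F((2.0, 1.0, 12.0), params)   # 2*4 + 3*1 + 1 - 12
off_surface = F((2.0, 1.0, 10.0), params)  # same x1, x2 but x3 is 2 lower
print(on_surface, off_surface)
```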
Every function must be continuous in the calculation interval, with respect to point coordinates as well as parameters.
Additionally, every function must be approximately linear in the neighborhood of every point (plus or minus its error) and in the neighborhood of every parameter.
Every function (curve) may share some of its parameters (as well as some of its points) with other functions.
The functions and their derivatives are calculated in extended precision (19 significant digits), but the data are stored in the precision of the version used (single, real or double) in order to save memory.
The names of the corresponding data files in the user subdirectory: function.1 , function.2 , ... .
List of accessible operations and basic functions:
+,-,*,/ - basic operations;
x^y - raising x to a power y;
pi - the value pi=3.141592653589793238;
sqr(x) - second power of x;
sqrt(x) - square root of x;
exp(x) - exponential function;
ln(x) - natural logarithm;
cos(x) - cosine function (radian);
sin(x) - sine function (radian);
tan(x) - tangent function (radian);
ctan(x) - cotangent function (radian);
arctan(x) - arc tangent (radian);
cosh(x) - hyperbolic cosine;
sinh(x) - hyperbolic sine;
tanh(x) - hyperbolic tangent;
abs(x) - absolute value;
sgn(x) - sign function (signum);
H(x) - unit step (Heaviside function); H(x)=1 when x>=0, otherwise H(x)=0;
random - random value from the interval [0,1);
rand(n) - random value from the interval [0,n); n is integer;
besj0(x) - Bessel function of first kind, 0-th order;
besj1(x) - Bessel function of first kind, 1-st order;
besj(n,x) - Bessel function of first kind, integer order;
besy0(x) - Bessel function of second kind, 0-th order; x>0, (Weber f.);
besy1(x) - Bessel function of second kind, 1-st order; x>0, (Weber f.);
besy(n,x) - Bessel function of second kind, integer order; x>0, (Weber f.);
besi0(x) - modified Bessel function of first kind, 0-th order; { besi0(x)=besj0(ix) };
besi1(x) - modified Bessel function of first kind, 1-st order; { besi1(x)=-i*besj1(ix) };
besi(n,x) - modified Bessel function of first kind, integer order; { besi(n,x)=(i^n)*besj(n,ix) };
besk0(x) - modified Bessel function of second kind, 0-th order; x>0, (MacDonald f.);
besk1(x) - modified Bessel function of second kind, 1-st order; x>0, (MacDonald f.);
besk(n,x) - modified Bessel function of second kind, integer order; x>0, (MacDonald f.);
gamln(x) - logarithm of gamma function; x>0 or x<0 and x belongs to (-2*k,-2*k+1), k is integer;
gamm(x) - gamma function; x>0;
gamma(x) - gamma function; x different from -k, k is integer;
gammp(a,x) - incomplete gamma function; a>0, x>=0;
bet(x,y) - beta function; x>0, y>0;
beta(x,y) - beta function; x,y,(x+y) different from -k, k is integer;
beti(a,b,x) - incomplete beta function; a>0, b>0, x>=0, x<=1;
erf(x) - error function;
lgndr(n,x) - Legendre polynomials; n is integer, |x|<=1;
plgndr(n,m,x) - associated Legendre polynomials (spherical harmonics); n,m are integer, |x|<=1, m>=0, m<=n;
4.1.3. Entering values of point coordinates.
When a text file is used, coordinate values should be separated with a space, a tab or an end of line and arranged in the following order (in the 3D case):
x1 x2 x3 ... (point No 1)
x1 x2 x3 ... (point No 2) etc.
... ... ... ...
Each value should be written as a real number with at most 7 (single precision), 11 (real precision) or 15 (double precision) significant digits.
To denote a power of ten, use the letter E or e, e.g. 1.2E-6 .
Each positive value must be greater than 2.9E-39 and less than 1.7E38 (5.0E-324 and 1.7E308, respectively, for double precision).
The name of the corresponding data file in the user subdirectory: points.dat .
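A file in this layout can be read with a few lines of code, for instance to inspect the data before handing them to LSM. The following Python sketch is an illustration only; the function name parse_points and the sample text are assumptions, and LSM itself may be stricter or more lenient about the format.

```python
def parse_points(text, dims):
    """Split whitespace-separated values into points of `dims` coordinates each."""
    values = [float(token) for token in text.split()]   # accepts 1.2E-6 style numbers
    if len(values) % dims != 0:
        raise ValueError("number of values is not a multiple of the dimension")
    return [values[i:i + dims] for i in range(0, len(values), dims)]

# Spaces, tabs and line ends are all treated as separators.
sample = "1.0 2.0\t3.0\n4.0 5.0 1.2E-6\n"
points = parse_points(sample, 3)
print(points)
```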
4.1.4. Entering point uncertainties.
The method assumes that uncertainties follow the Gaussian (normal) distribution. The data format should be the same as in Section 4.1.3.
Uncertainties may be entered in two ways:
1. Standard deviations for each point (and for each coordinate).
When using a keyboard, the whole process may be shortened if the deviations are the same for a given dimension of every point.
Since squared deviations are calculated, the acceptable range of deviations is from 1.E-19 to 1.E19 (1.E-162 and 1.E154, respectively, for double precision).
The order of entry is the same as in Section 4.1.3.
The name of the corresponding data file in the user subdirectory: errors.dat .
Correlation coefficients for different dimensions of every point may be added.
They can be used when the errors of different dimensions of a point are correlated.
The order of entry for every point in the four-dimensional case is as follows:
r12, r13, r14, r23, r24, r34.
The name of the corresponding data file in the user subdirectory: correls.dat .
2. Covariance matrix of points (from a text file only).
It may be used especially when the errors of different points are correlated.
Its size is n*n, where n is the product of the number of points and the number of dimensions.
Its main diagonal consists of squared standard deviations.
The name of the corresponding data file in the user subdirectory: covars.dat .
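The two ways of entering uncertainties are related by the standard definition cov_ij = r_ij*s_i*s_j, with the squared standard deviations on the diagonal. The Python sketch below builds the covariance block of a single point from its standard deviations and its correlation coefficients given in the order r12, r13, ..., r23, ... described above. The function name point_covariance is an assumption, and the sketch does not claim anything about how LSM arranges the full matrix in covars.dat.

```python
def point_covariance(sigmas, correls):
    """Covariance block of one point from standard deviations and correlations."""
    n = len(sigmas)
    if len(correls) != n * (n - 1) // 2:
        raise ValueError("expected one coefficient per pair of dimensions")
    cov = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        cov[i][i] = sigmas[i] ** 2                     # variance on the diagonal
        for j in range(i + 1, n):                      # pairs in the order r12, r13, ..., r23, ...
            cov[i][j] = cov[j][i] = correls[k] * sigmas[i] * sigmas[j]
            k += 1
    return cov

block = point_covariance([1.0, 2.0, 3.0], [0.5, 0.0, -1.0])   # r12, r13, r23
print(block)
```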
4.1.5. Entering the assignment of points to each curve.
This is not needed if there is only one function (curve) in the program.
When a text file is used, the data have the following form:
the numbers of the points are placed in the line corresponding to the given function; the order of function lines is the same as when entering functions, e.g. (if 3 functions are used):
1-10 20 21 22
0
5 7-12
In a given approximation it is possible to omit some points, to assign one point to several functions (curves), and to leave out some curves by inserting 0 into the corresponding line.
The name of the corresponding data file in the user subdirectory: whichpnt.dat .
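The line format shown in the example (single numbers, ranges such as 7-12, and 0 for a curve that is left out) can be expanded as follows. This is a minimal Python sketch for checking such a file by hand; the function name parse_point_line is an assumption and the code is not taken from LSM.

```python
def parse_point_line(line):
    """Expand one function line into a sorted list of point numbers."""
    tokens = line.split()
    if tokens == ["0"]:
        return []                                # 0 means: this curve is left out
    numbers = set()
    for token in tokens:
        if "-" in token:
            first, last = token.split("-")
            numbers.update(range(int(first), int(last) + 1))   # inclusive range
        else:
            numbers.add(int(token))
    return sorted(numbers)

lines = ["1-10 20 21 22", "0", "5 7-12"]
assignment = [parse_point_line(line) for line in lines]
print(assignment)
```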
4.1.6. Entering initial values of unknown parameters.
The method used in the program requires initial values of unknown parameters to start its iteration. If the assumed function differs considerably from a linear function, the iteration will converge only when the initial values are a good approximation of the true ones.
Values should be entered in order of increasing parameter number.
The name of the corresponding data file in the user subdirectory: params.dat .
4.2. Calculated data are output according to the following scheme:
4.2.1. Values of the function parameters found, together with their uncertainties (both to the screen and to the aprxpars.dat file).
4.2.2. Matrix of parameter covariances (if needed, but to the text file parcovar.dat only).
4.2.3. Coordinates of 'adjusted' points located on the curve found, together with their uncertainties (both to the screen and to the aprxpnts.dat file).
4.2.4. Matrix of 'adjusted' point covariances (if needed, but to the text file pntcovar.dat only).
4.2.5. Graph x2=f(x1), in the two-dimensional case only.
Either error rectangles or error ellipses can be plotted on the graph of the curve(s) found.
Each curve may be regarded either as a function of one of the variables (x1 or x2) or as an implicit function.
In the last case the graph may be unsatisfactory; extending the range of the curve may then improve it.
The possibility of selecting the range of every curve also serves to prevent an error resulting from leaving the function domain.
To save the graph in a bitmap file (bmp), press Ctrl-S.
4.2.6. General information on the approximation.
a. Chi2 value (that follows the chi2 distribution); this value should equal (within the tolerance mentioned below) the number of degrees of freedom (see below).
Too large a chi2 value may result from an inappropriate choice of the analytical function form or from assuming point uncertainties that are too small. Conversely, too low a chi2 value may indicate that the number of parameters is too large or that the assumed point uncertainties are too large.
b. Number of degrees of freedom; this number equals the number of points decreased by the number of parameters searched for. The number of degrees of freedom is also equal to the expected value of the chi2 distribution of the chi2 value.
c. Standard deviation of the distribution of the chi2 value.
d. Probability that, theoretically, the chi2 value would be greater (smaller) than the value found from the approximation.
e. Same-Side-Correlation Coefficient (SSCC) together with its standard deviation. When the measured points are located randomly on both sides of the fitted curve (surface, hypersurface), SSCC takes a value close to zero. When the points are grouped in big clusters, each located on one side of the curve, SSCC differs considerably from zero, i.e. it is greater than twice its standard deviation, or even approaches one, indicating that the curve is selected incorrectly (maybe more parameters are needed) or that the points are correlated (maybe the points in one cluster come from a biased meter). When SSCC is negative and differs considerably from zero, or even approaches minus one, the result is statistically very improbable (maybe the points have been created with "human hands").
f. Numbers of the points for which the difference between any coordinate of the measured point and the corresponding coordinate of the adjusted point is greater than three times the corresponding measurement uncertainty.
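Items a to c can be reproduced by hand to judge a reported chi2 value. The Python sketch below uses the number of degrees of freedom as defined in item b and the standard deviation sqrt(2*k) of a chi-square distribution with k degrees of freedom, which is a general property of that distribution; whether LSM reports exactly this number is an assumption. The function name chi2_summary is hypothetical.

```python
import math

def chi2_summary(chi2, n_points, n_params):
    """Return degrees of freedom, std dev of chi2, and the deviation in std devs."""
    dof = n_points - n_params            # item b: points minus parameters searched for
    sigma = math.sqrt(2.0 * dof)         # std dev of a chi-square distribution with dof
    deviation = (chi2 - dof) / sigma     # how far chi2 lies from its expected value
    return dof, sigma, deviation

dof, sigma, deviation = chi2_summary(65.0, 55, 5)
print(dof, sigma, deviation)
```

A deviation of a few units or more, in either direction, points to the causes listed in item a.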
4.2.7. Scaling uncertainties.
If you are sure of your fitting curve and of the relative proportions of the uncertainties for all points and dimensions, rather than of the uncertainties themselves, you can multiply (the Opt menu item) all input (point) and output (parameter) uncertainties by the square root of the ratio of the output chi2 value to the number of degrees of freedom. This gives the best estimates of the uncertainties in this case of incomplete information on the input uncertainties.
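The scaling factor is the square root of chi2 divided by the number of degrees of freedom. A minimal Python sketch of this arithmetic follows; the function name scale_uncertainties and the numbers are illustrative and do not come from LSM.

```python
import math

def scale_uncertainties(uncertainties, chi2, dof):
    """Multiply every uncertainty by sqrt(chi2 / dof)."""
    factor = math.sqrt(chi2 / dof)
    return [u * factor for u in uncertainties]

# With chi2 = 200 and 50 degrees of freedom the factor is sqrt(4) = 2.
scaled = scale_uncertainties([0.1, 0.2, 1.5], 200.0, 50)
print(scaled)
```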
4.3. Description of the calculation method.
The method consists in minimizing a quantity M equal to
M = dT*Gy*d
where T denotes transposition and d is the coordinate difference vector, which in the two-dimensional case is equal to
[ (X1-Xo1) , (Y1-Yo1) , (X2-Xo2) , (Y2-Yo2) , ... ] ,
where Xoi,Yoi are the coordinates of the i-th measured point and Xi,Yi are the coordinates of the i-th 'adjusted' point located on the curve found; Gy is the inverted covariance matrix (see Section 4.1.4).
In the two-dimensional case with zero covariances (i.e. errors only) the quantity M is equal to
M = SUM_i [ (Xi-Xoi)^2/(Sxi)^2 + (Yi-Yoi)^2/(Syi)^2 ]
where the sum runs over all points i and Sxi,Syi are the appropriate standard deviations of the i-th point.
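The agreement between the general form dT*Gy*d and the sum for zero covariances can be checked numerically. The Python sketch below is an illustration with made-up numbers; the function names quadratic_form and m_uncorrelated are assumptions and the code is not the LSM implementation.

```python
def quadratic_form(d, g):
    """Compute dT*G*d for a difference vector d and a square matrix g."""
    n = len(d)
    return sum(d[i] * g[i][j] * d[j] for i in range(n) for j in range(n))

def m_uncorrelated(measured, adjusted, sigmas):
    """Sum of (Xi-Xoi)^2/Sxi^2 + (Yi-Yoi)^2/Syi^2 over all points."""
    total = 0.0
    for (xo, yo), (x, y), (sx, sy) in zip(measured, adjusted, sigmas):
        total += (x - xo) ** 2 / sx ** 2 + (y - yo) ** 2 / sy ** 2
    return total

measured = [(0.0, 0.0), (1.0, 1.0)]      # Xoi, Yoi
adjusted = [(1.0, 0.0), (1.0, 3.0)]      # Xi, Yi
sigmas = [(1.0, 1.0), (1.0, 2.0)]        # Sxi, Syi

# The same data as a difference vector and a diagonal inverse covariance matrix.
d = [1.0, 0.0, 0.0, 2.0]
gy = [[1.0, 0.0, 0.0, 0.0],
      [0.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 0.0],
      [0.0, 0.0, 0.0, 0.25]]             # 1/Sy2^2 = 1/4

print(m_uncorrelated(measured, adjusted, sigmas), quadratic_form(d, gy))
```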
The chi2 value (see Section 4.2.6) is equal (in the two-dimensional case with zero covariances) to
chi2 = SUM_i { [ -F'(Xi)*(Xi-Xoi) + (Yi-Yoi) ]^2 / ( [F'(Xi)]^2*(Sxi)^2 + (Syi)^2 ) }
where F'(Xi) denotes the partial derivative of the approximating function with respect to x, calculated at the i-th point; the remaining symbols are as above.
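The chi2 sum can be evaluated directly once the adjusted points and the derivative are known. The following Python sketch does this for a straight line with slope 1; the function name chi2_value, the callable derivative and the data are assumptions made for the example, not LSM code.

```python
def chi2_value(measured, adjusted, sigmas, derivative):
    """Chi2 for the two-dimensional case with zero covariances."""
    total = 0.0
    for (xo, yo), (x, y), (sx, sy) in zip(measured, adjusted, sigmas):
        fp = derivative(x)                               # F'(Xi)
        numerator = (-fp * (x - xo) + (y - yo)) ** 2
        denominator = fp ** 2 * sx ** 2 + sy ** 2
        total += numerator / denominator
    return total

# Line y = x: the measured point (0, 1) is adjusted to (0.5, 0.5) on the line.
chi2 = chi2_value([(0.0, 1.0)], [(0.5, 0.5)], [(1.0, 1.0)], lambda x: 1.0)
print(chi2)
```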
The iteration is stopped when the difference between two values of M from successive iteration steps is lower than 0.1*sqrt(M) and when each difference between two values of each parameter taken from successive iterations is less than one tenth of the standard deviation of that parameter.
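The stopping rule can be written as a single test on two successive iteration steps. In the Python sketch below the square root is taken of the newer value of M, which is an assumption because the text does not say which of the two values is meant; the function name iteration_finished is also hypothetical.

```python
import math

def iteration_finished(m_prev, m_curr, params_prev, params_curr, param_sigmas):
    """True when both the change in M and the change in every parameter are small."""
    m_ok = abs(m_curr - m_prev) < 0.1 * math.sqrt(m_curr)   # assumption: sqrt of newer M
    params_ok = all(
        abs(curr - prev) < 0.1 * sigma                      # one tenth of the std dev
        for prev, curr, sigma in zip(params_prev, params_curr, param_sigmas)
    )
    return m_ok and params_ok

done = iteration_finished(10.0, 9.9, [1.0], [1.001], [0.1])
not_done = iteration_finished(10.0, 9.0, [1.0], [1.001], [0.1])
print(done, not_done)
```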