
All the other programs I have seen need a function and point coordinates, and from these they calculate the values of the parameters together with their uncertainties. In addition, such programs can estimate the measuring uncertainties along one coordinate of all points (usually the Y axis). To do this, however, the method must assume that all uncertainties in the other dimensions are zero, and that all uncertainties in this one dimension are equal or that the weights are known. Therefore, if you enter uncertainties (e.g. standard deviations) along the Y axis into such a program, it uses them only as weights when fitting the curve to the points. These input uncertainties are not very "real": they mean only that the smaller the uncertainty of a given point, the closer the curve will pass to that point. The magnitude of the input point uncertainties does not matter; only their mutual ratios do. These ratios affect the fitted parameter values and their uncertainties. This is why the method is called weighted least squares.
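To make this "weights only" behaviour concrete, here is a minimal sketch (not the LSM program) that uses SciPy's curve_fit as a stand-in for such an "other program"; the straight-line model and the data are invented. With sigma treated as relative weights, multiplying all Y uncertainties by 10 changes neither the fitted parameters nor their reported uncertainties.

```python
# Minimal sketch: Y uncertainties act only as relative weights here, so an
# overall rescaling of them has no effect on the results.
import numpy as np
from scipy.optimize import curve_fit

def line(x, a, b):
    return a * x + b

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 20)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)
sigma_y = np.full_like(y, 0.5)                  # nominal Y uncertainties

p1, cov1 = curve_fit(line, x, y, sigma=sigma_y, absolute_sigma=False)
p2, cov2 = curve_fit(line, x, y, sigma=10 * sigma_y, absolute_sigma=False)

print(np.allclose(p1, p2), np.allclose(cov1, cov2))   # expected: True True
```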

The method used in my LSM program (nonlinear least squares regression with input errors) needs a function and point coordinates together with their measuring uncertainties, and it assumes that all point uncertainties (for every point and every dimension) are known and may differ. From both the scatter of the points and the point uncertainties (not from weights derived from the Y uncertainties, as in the other programs) it finds the curve parameters with their uncertainties. The method consists in minimizing the sum of the squares of both the vertical and the horizontal distances of the points from the line, each divided by the appropriate x and y uncertainty (in the two-dimensional case with zero covariances). These input uncertainties are therefore more "real": they mean more than just that the smaller the uncertainties of a given point, the closer the curve will pass to that point. The values of the input uncertainties, not only their mutual ratios, affect the parameter uncertainties found by LSM. Like the aforementioned programs, my program reports the goodness of fit, but besides the scatter of the points it also takes the point uncertainties (or the whole covariance matrix) into consideration. Again, the values of the input point uncertainties, not only their mutual ratios, affect the goodness of fit.
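As an illustration of the quantity being minimized (squared x and y deviations, each divided by the corresponding uncertainty), here is a hedged sketch using SciPy's ODR module (ODRPACK). It is not the LSM program, and its conventions for reporting parameter uncertainties may differ from LSM's, but the minimized sum is of the kind described above. The linear model and the numbers are invented.

```python
# Sketch: fit with known uncertainties in both x and y by minimizing the sum
# of squared x- and y-deviations of each point from the curve, each deviation
# divided by the corresponding uncertainty.
import numpy as np
from scipy import odr

def f(beta, x):                      # model: y = a*x + b
    a, b = beta
    return a * x + b

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.9, 5.1, 7.2, 8.8, 11.1])
sx = np.full_like(x, 0.1)            # known X uncertainties
sy = np.full_like(y, 0.2)            # known Y uncertainties

data = odr.RealData(x, y, sx=sx, sy=sy)
result = odr.ODR(data, odr.Model(f), beta0=[1.0, 0.0]).run()
print(result.beta)                   # fitted parameters
print(result.sd_beta)                # their reported uncertainties
print(result.sum_square)             # the minimized weighted sum of squares
```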

The difference between the results of the two types of methods is clearly visible in the extreme case of fitting a straight line to two points. Regardless of the uncertainties entered, an "other program" reports a "perfect line" and gives both the slope and the intercept without uncertainties. It obviously cannot find these uncertainties, since it draws information from the scatter of the points only, and both points lie exactly on the fitted line. My LSM program behaves differently. After entering the point uncertainties in the Y direction (and forcing the X uncertainties to zero, to create the same conditions), LSM gives the same parameter values as the "other program", but with uncertainties. It can find parameter uncertainties even in the two-point case because the uncertainties of each point are known and, graphically speaking, the points are "spread out" over their uncertainty areas, so different positions of the line are possible.
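A small sketch of why the two-point case still yields parameter uncertainties: with zero X uncertainties the line through two points is unique, but the known Y uncertainties propagate into the slope and intercept by ordinary error propagation. The numbers are made up, and this is only an illustration, not the LSM algorithm.

```python
# Two points with known Y uncertainties: the line is fixed, but the slope and
# intercept still carry uncertainties obtained by error propagation.
import math

x1, y1, s1 = 1.0, 2.0, 0.3       # point 1 and its Y uncertainty (made up)
x2, y2, s2 = 4.0, 8.0, 0.5       # point 2 and its Y uncertainty (made up)

a = (y2 - y1) / (x2 - x1)                           # slope
b = y1 - a * x1                                     # intercept
sa = math.hypot(s1, s2) / abs(x2 - x1)              # slope uncertainty
sb = math.hypot(x2 * s1, x1 * s2) / abs(x2 - x1)    # intercept uncertainty

print(f"slope     a = {a:.3f} +/- {sa:.3f}")
print(f"intercept b = {b:.3f} +/- {sb:.3f}")
```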

When the number of points is greater than the number of parameters (i.e. the number of degrees of freedom is greater than zero), the method used in the LSM program gives exactly the same parameter uncertainties as the "other programs", provided that in LSM all input X uncertainties (or X1, X2, ... for more dimensions) are fixed at zero and the input Y uncertainties are chosen so that the quantity called chi-sqr (shown by the GInf menu item) equals the number of degrees of freedom.
This is the answer to the question about the difference: in the two-dimensional case, and only with zero X uncertainties and zero input point covariances, the two methods discussed are equivalent.

The difference lies in usage. With an "other program" you can judge the goodness of fit, among other things, by the value Sy (the standard deviation of the vertical distances of the points from the line, i.e. the estimate of the "real" measuring Y uncertainties); the closer Sy is to the real measuring Y uncertainties, the more likely it is that you have picked an appropriate model (function) (but the real uncertainties must be known). With the LSM program, after entering the "real" measuring Y uncertainties, you can judge the goodness of fit by chi-sqr: the closer its value is to the number of degrees of freedom, the more likely it is that you have picked an appropriate model.
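The two criteria fit in a few lines; here is a hypothetical helper, assuming you already have the vertical residuals of the points from the fitted curve, the entered Y uncertainties, and the number of fitted parameters.

```python
# Sketch of the two goodness-of-fit diagnostics discussed above.
import numpy as np

def goodness(residuals, sigma_y, n_params):
    dof = residuals.size - n_params
    sy = np.sqrt(np.sum(residuals**2) / dof)       # "other program": compare Sy to the real Y uncertainties
    chi_sqr = np.sum((residuals / sigma_y)**2)      # LSM: compare chi-sqr to dof
    return sy, chi_sqr, dof
```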

However, my program is not as convenient as contemporary ones, so use it especially when the point uncertainties lie along more than one axis and they are all known, or when there are point covariances, or when the problem has more than two dimensions, or when the curve is not the graph of any function (e.g. a circle), or when the curve is given by an equation from which no variable can be isolated, or when you fit a combination of functions joined at some points.

A little example.
Let's imagine that we enter a set of 24 points and a function (a second-order polynomial with 3 parameters) into an "other program", without any point uncertainties (standard deviations). We obtain the values of the parameters with their uncertainties, as well as a special quantity named Sy. This quantity estimates the y uncertainties of all points (to estimate it, the program must have assumed that all x uncertainties equal zero).

Then we feed the same set of 24 points and the same 3-parameter function into the LSM program. This time we also enter equal y uncertainties and zero x uncertainties. As a result we obtain the same parameter values as from the "other program", but their uncertainties are, let's say, 1.7 times greater. We are not disappointed; we look at the chi-sqr value (shown by the GInf menu item); it amounts to 7.27. So we think: it would be best if chi-sqr equalled its expected value, i.e. the number of degrees of freedom, which here is 21 (24 points minus 3 parameters). We guess that we did not know the exact y uncertainties and assumed them too big, so, knowing the definition of chi-sqr (readme.htm), we decrease the y uncertainties sqrt(21/7.27) = sqrt(2.89) = 1.7 times. We enter these y uncertainties into the LSM program and repeat the computation. Now chi-sqr equals 21 (the number of degrees of freedom) and the parameter uncertainties are the same as those given by the "other program". What is more, these y uncertainties decreased 1.7 times (they may be considered estimates of the real y errors) are equal to the Sy value we obtained with the "other program".
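Here is a hedged numerical sketch of this example using NumPy/SciPy instead of the LSM program, with made-up data (so the concrete numbers 7.27 and 1.7 will come out differently). It shows the mechanism: with equal Y uncertainties and zero X uncertainties, rescaling the Y uncertainties by sqrt(chi-sqr/dof) drives chi-sqr to the number of degrees of freedom, the rescaled uncertainty equals the Sy an "other program" reports, and the parameter uncertainties from the two approaches then agree.

```python
# Sketch of the 24-point example with synthetic data.
import numpy as np
from scipy.optimize import curve_fit

def poly2(x, a, b, c):
    return a * x**2 + b * x + c

rng = np.random.default_rng(1)
x = np.linspace(0.0, 5.0, 24)                        # 24 points, 3 parameters
y = 1.5 * x**2 - 2.0 * x + 0.7 + rng.normal(scale=0.4, size=x.size)
dof = x.size - 3

# "Other program": unweighted fit, Sy estimated from the scatter of points.
p, cov_other = curve_fit(poly2, x, y)
resid = y - poly2(x, *p)
sy = np.sqrt(np.sum(resid**2) / dof)

# "LSM-style": guess equal Y uncertainties (too big on purpose), compute
# chi-sqr, then rescale the uncertainties by sqrt(chi_sqr/dof).
sigma_guess = 0.7
chi_sqr = np.sum((resid / sigma_guess)**2)
sigma_new = sigma_guess * np.sqrt(chi_sqr / dof)
chi_sqr_new = np.sum((resid / sigma_new)**2)

print(chi_sqr_new, dof)          # chi-sqr now equals the degrees of freedom
print(sigma_new, sy)             # the rescaled uncertainty equals Sy

# With these "absolute" Y uncertainties, the weighted fit's covariance
# (parameter uncertainties) matches the "other program" covariance.
_, cov_lsm = curve_fit(poly2, x, y, sigma=np.full_like(y, sigma_new),
                       absolute_sigma=True)
print(np.allclose(cov_other, cov_lsm))   # expected: True
```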

To sum up, I repeat the sentence from above: the method used in the LSM program gives exactly the same parameter uncertainties as the "other programs" if in LSM all input X uncertainties are fixed at zero and the input Y uncertainties are such that the quantity called chi-sqr (from the GInf menu item) equals the number of degrees of freedom.

You could say that the "other programs" are easier to use. That is true when the x uncertainties are zero. But my program was created to handle the other cases of nonlinear regression mentioned above.
 
