  From: Borries Demeler <demeler@bioc09.v19.uthscsa.edu>
  To  : Daniel Ryan <drya9411@mail.usyd.edu.au>
  Date: Sat, 7 Oct 2000 11:52:03 -0500 (CDT)

Re: A Question

> Hi,
>     This is Daniel Ryan again. I have one question about the non-linear
> least squares analysis of sed. eq. data. I am going to use a modified
> Gauss-Newton algorithm. The only thing I don't quite understand is the
> fitting of multiple data sets simultaneously. I can't seem to find in
> the literature anything that explicitly explains it. For each data set
> do you separately apply the algorithm and obtain separate a (vector of
> the fitting parameters) for each set, then what happens. This is where I
> get lost because I can't find an explanation anywhere, (My knowledge of
> this type of stuff is not great). If anyone could shed some light on
> this topic it would be greatly appreciated.
> 
> Cheers Daniel Ryan.

Dear Daniel,

To turn a local NLS fit into a global fit, you will first have to
decide on which parameters are local and which are global. Once you know
that, you will calculate a solution for *each* dataset, keeping global
parameters the same for each dataset (for example, in an equilibrium
experiment, the global parameters may be the molecular weight and,
in the case of self-associating systems, the association constants,
while the baselines may be local). 

Next, you will calculate a solution with your initial guesses and
compare that solution to your experimental data in a least-squares
sense, for *each* dataset. In that process you will have a separate
model function for each dataset that reflects the parameters of that
dataset, keeping the global parameters the same across all model
functions. Each dataset/model function will produce a residual
(chi-square). For the NLS routine, you will need the total residual,
i.e., the sum of the residuals over all datasets, and you use that
total to adjust your parameters.
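
A minimal Python sketch of the total residual, under stated assumptions:
the model y = b_k + exp(s*x) is a hypothetical toy stand-in for a real
equilibrium function, with one global exponent s shared by all datasets
and one local baseline b_k per dataset. The names model and
total_chi_square and the parameter layout [s, b_0, b_1, ...] are made up
for illustration.

```python
import math

def model(x, s, baseline):
    # Toy stand-in for an equilibrium profile: exponential plus a baseline.
    return baseline + math.exp(s * x)

def total_chi_square(params, datasets):
    # params = [s, b_0, b_1, ...]: one global exponent, one baseline per dataset.
    # datasets = list of (xs, ys) pairs, one pair per experimental scan.
    s = params[0]                      # global: identical for every dataset
    total = 0.0
    for k, (xs, ys) in enumerate(datasets):
        baseline = params[1 + k]       # local: belongs to dataset k only
        # Sum of squared residuals for this dataset, added to the grand total.
        total += sum((y - model(x, s, baseline)) ** 2 for x, y in zip(xs, ys))
    return total
```

The optimizer only ever sees the single number returned by
total_chi_square, so changing s affects the fit to every dataset at once,
while changing b_k affects dataset k alone.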

In the case of the Gauss-Newton method (for example, Levenberg-Marquardt
or modifications of the LM method), you will have to calculate an NxM
Jacobian matrix, which is the matrix of partial derivatives of the model
with respect to each parameter at each point of all the datasets, where
N is the total number of datapoints and M is the total number of
parameters, local and global.
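
A minimal Python sketch of how such a Jacobian could be laid out,
assuming a hypothetical toy model y = b_k + exp(s*x) with one global
exponent s and one local baseline b_k per dataset. The function name
global_jacobian and the parameter ordering [s, b_0, b_1, ...] are
assumptions made for illustration. Each row is one datapoint and each
column is one parameter; a baseline column is zero in every row that
belongs to a different dataset.

```python
import math

def global_jacobian(params, datasets):
    # params = [s, b_0, b_1, ...]; datasets = list of (xs, ys) pairs.
    s = params[0]
    m = len(params)                    # M: all parameters, global and local
    rows = []
    for k, (xs, _ys) in enumerate(datasets):
        for x in xs:
            row = [0.0] * m            # local columns of other datasets stay zero
            row[0] = x * math.exp(s * x)   # d(model)/ds: filled in every row
            row[1 + k] = 1.0               # d(model)/db_k: only in dataset k's rows
            rows.append(row)
    return rows                        # N rows, N = total number of datapoints
```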

You can just combine all datasets and set the derivatives with respect
to nonglobal parameters to zero wherever those parameters aren't used in
a particular dataset, as with the baselines, for example. This results
in a sparse matrix. At some point in the GN algorithm you will invert
(or better, Cholesky decompose) the square of the Jacobian (J^T J) and
calculate the gradient, using the inverted square of the Jacobian. The
gradient will then tell you in which direction your initial guesses
should be adjusted. This process is iterated until some tolerance is
reached; for all practical purposes, this should be when the change in
the residuals is effectively zero. This is the same as with a local fit.
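
The iteration can be sketched in plain Python as follows. This is an
illustrative, undamped Gauss-Newton loop and not the sample source
offered below; it assumes a hypothetical toy model y = b_k + exp(s*x)
with global s and local baselines b_k, and all function names are made
up. It forms J^T J and J^T r, solves J^T J * step = J^T r by Cholesky
decomposition, and stops when the total chi-square stops changing.

```python
import math

def residuals_and_jacobian(params, datasets):
    # params = [s, b_0, b_1, ...]; datasets = list of (xs, ys) pairs.
    s = params[0]
    r, J = [], []
    for k, (xs, ys) in enumerate(datasets):
        for x, y in zip(xs, ys):
            e = math.exp(s * x)
            r.append(y - (params[1 + k] + e))   # data minus model
            row = [0.0] * len(params)
            row[0] = x * e                      # global column
            row[1 + k] = 1.0                    # local baseline column
            J.append(row)
    return r, J

def cholesky_solve(A, b):
    # Solve A z = b for symmetric positive-definite A via A = L L^T.
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(acc) if i == j else acc / L[j][j]
    y = [0.0] * n
    for i in range(n):                          # forward substitution: L y = b
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    z = [0.0] * n
    for i in reversed(range(n)):                # back substitution: L^T z = y
        z[i] = (y[i] - sum(L[k][i] * z[k] for k in range(i + 1, n))) / L[i][i]
    return z

def gauss_newton(params, datasets, tol=1e-12, max_iter=50):
    params = list(params)
    r, J = residuals_and_jacobian(params, datasets)
    chi2 = sum(v * v for v in r)
    m = len(params)
    for _ in range(max_iter):
        JtJ = [[sum(row[a] * row[b] for row in J) for b in range(m)]
               for a in range(m)]
        Jtr = [sum(row[a] * v for row, v in zip(J, r)) for a in range(m)]
        step = cholesky_solve(JtJ, Jtr)
        params = [p + d for p, d in zip(params, step)]
        r, J = residuals_and_jacobian(params, datasets)
        new_chi2 = sum(v * v for v in r)
        converged = abs(chi2 - new_chi2) < tol  # chi-square stopped changing
        chi2 = new_chi2
        if converged:
            break
    return params, chi2
```

A Levenberg-Marquardt variant would add a damping term to the diagonal
of J^T J before the solve; that is left out here to keep the sketch
short, so a reasonable starting guess is assumed.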

There are tons of references to the Levenberg-Marquardt/Gauss-Newton
method on the web, and they are all very similar in concept and all depend
on the calculation of a squared Jacobian (which is really an approximation
to the Hessian). If you want to use a quasi-Newton method, which would
probably be more robust and higher-performance, especially if you have
many parameters, you would use a linesearch (look for More-Thuente on the
web) and calculate the Hessian directly. This can be a pain for implicit
functions, but recent advances in automatic differentiation can be really
helpful there.
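
To give a flavor of automatic differentiation, here is a minimal
forward-mode sketch in Python using dual numbers. It is only an
illustration under assumptions: the Dual class, dexp, and the toy model
y = b + exp(s*x) are hypothetical, and the sketch yields first
derivatives only (Jacobian entries), not a Hessian. Real AD tools apply
the same idea to far more operations and to higher derivatives.

```python
import math

class Dual:
    # A value together with its derivative with respect to one chosen parameter.
    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule carried along with the value.
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)
    __rmul__ = __mul__

def dexp(u):
    # exp with the chain rule applied to the derivative part.
    e = math.exp(u.value)
    return Dual(e, e * u.deriv)

def model(x, s, baseline):
    # Same toy form as a hand-written model, but evaluated on Dual numbers.
    return baseline + dexp(s * x)

s = Dual(0.5, 1.0)           # seed derivative 1.0: differentiate with respect to s
out = model(2.0, s, 0.1)     # out.value is the model, out.deriv is d(model)/ds
```

Here out.deriv equals x*exp(s*x) evaluated at x = 2.0 and s = 0.5,
obtained without writing the derivative by hand.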

I am not aware of any publications that explicitly state what I just
wrote, but this should help you get started, because this is how it is
done. If this is *not* homework, let me know, and I'll send you some
short C++ sample source that will show you how it is done.

Hope it helps, -Borries
*******************************************************************************
* Borries Demeler                                                             *
* The University of Texas Health Science Center at San Antonio                *
* Dept. of Biochemistry, 7703 Floyd Curl Drive, San Antonio, Texas 78284-7760 *
* Voice: 210-567-6592, Fax: 210-567-4575, Email: demeler@biochem.uthscsa.edu  *
*******************************************************************************
