  From: Borries Demeler <demeler@bioc09.uthscsa.edu>
  To  : rasmb@bbri.harvard.edu
  Date: Wed, 8 Sep 1999 08:02:08 -0500 (CDT)

the significance of a fit is the heart of the matter

Joel,

I concur with everything that has been said; I just want to stress
the points that I think are the most important and that apply in general
when you fit any data:

1. Adding additional parameters in order to obtain better fits can
drastically reduce the confidence in any or all parameters. If you add
too many parameters, you might as well fit to a sum of polynomials or
a sum of sines and cosines; there is just as little meaning in that.
This is not to say that your model with all the additional parameters
may not be correct - it might very well be - but the information contained
in your raw data does not support the fitting of too many parameters, since
the solution is no longer unique (i.e., you can end up with more than one
solution that produces the same chi square, leading to a large error
range for each parameter).
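
To put some numbers on the uniqueness problem, here is a minimal Python
sketch. Everything in it is made up for illustration (the x grid, both
parameter sets and the 0.01 noise level are assumptions, not values from
any real run): it compares a single exponential with an equal mix of two
exponentials over the same range.

```python
import math

def two_exp(x, a1, k1, a2, k2):
    """Sum of two exponentials, the generic shape of a two-species model."""
    return a1 * math.exp(k1 * x) + a2 * math.exp(k2 * x)

# Hypothetical, dimensionless x grid standing in for the radial coordinate.
xs = [i / 50 for i in range(51)]

# Two quite different parameter sets (both invented for this sketch).
one_species = (1.0, 1.0, 0.0, 1.0)   # a single exponential
two_species = (0.5, 0.9, 0.5, 1.1)   # an equal mix of two exponentials

diffs = [two_exp(x, *one_species) - two_exp(x, *two_species) for x in xs]
max_diff = max(abs(d) for d in diffs)
rms_diff = math.sqrt(sum(d * d for d in diffs) / len(diffs))

assumed_noise = 0.01  # assumed noise level, in the same units as the signal
print(f"max difference {max_diff:.4f}, rms {rms_diff:.4f}, noise {assumed_noise}")
```

With these assumed numbers the rms difference between the two curves stays
below the assumed noise, so a fit would have a hard time preferring one
parameter set over the other.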
 
2. Your data may possibly not be appropriate for *any* model that can
be used with a nonlinear, global fitting procedure. If your model (containing
a number of parameters that is appropriate for the quality and
range of the data) doesn't produce random residuals, you may be out of luck.
Then you just have to accept it and try to obtain your answer with a
model-independent approach or an entirely different technique. I have
too often seen cases where either too many parameters were chosen for
the information contained in the data, or the residuals contained
runs and were non-random. For those cases, the numbers are simply
MEANINGLESS (this is not to say that you may not be able to detect certain
trends in your data). Equilibrium data are hard to fit, since sums
of exponentials are exceptionally ill-conditioned for these kinds of
methods - they are so darn featureless.
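
One simple way to check residuals for runs is a Wald-Wolfowitz runs test
on their signs. The sketch below is only an illustration of that idea, not
what any particular fitting program does: the function name is mine, the
"wavy" residuals are invented, and the z-score relies on the usual normal
approximation, which is rough for short data sets.

```python
import math

def runs_test_z(residuals):
    """Wald-Wolfowitz runs test on the signs of the residuals.

    Returns the z-score of the observed number of runs; strongly negative
    values mean too few runs, i.e. systematic (non-random) residuals.
    """
    signs = [r > 0 for r in residuals if r != 0]  # exact zeros are skipped
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    if n_pos == 0 or n_neg == 0 or len(signs) < 3:
        raise ValueError("need at least 3 residuals with both signs present")
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    n = n_pos + n_neg
    expected = 2 * n_pos * n_neg / n + 1
    variance = 2 * n_pos * n_neg * (2 * n_pos * n_neg - n) / (n * n * (n - 1))
    return (runs - expected) / math.sqrt(variance)

# Made-up residuals with a slow wave, as left by a model that misses a species.
wavy = [math.sin(2 * math.pi * (i + 0.5) / 40) for i in range(40)]
print(f"z = {runs_test_z(wavy):.2f}")
```

A z-score far below zero (say, below -2 or -3) says there are far fewer
sign changes than random noise would give, which is the "runs" problem
described above.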

3. As Jack pointed out already, the best thing you can do is to include
as much data as possible and to have this data cover as large a range of
the characteristics you are investigating (i.e., vary the concentration,
wavelength, speeds and initial loading concentration - just make sure
to get accurate extinction coefficients if you are after equilibrium
constants!). Also make sure that you don't include data in the nonlinear
or too-noisy range above 0.8-1.0 OD. If you are fitting for multiple 
species, you need to have GOOD signal from ALL species. In an association
system you can only get significant monomer at low concentration, and
higher order species at higher concentrations. Make sure your XLA gives
you good, clean data. Noisy data can render your analysis meaningless.

> That is not to say the question then comes down to statistical opinions,
> which incidentally is the only way we can ultimately judge any fitting

Statistics will draw a straight line between an unwarranted assumption
and a foregone conclusion (so be careful not to rely too heavily on stats,
though they may be useful...)
 
> I & others like to make a species plot that essentially shows fraction of
> each polymer vs total conc and allows you to judge the degree of saturation.

yes, and then look for the equilibrium concentration and see if you are
measuring in the correct range!
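
For the simplest case, a monomer-dimer system, the numbers behind such a
species plot can be sketched in a few lines of Python. The monomer-dimer
model and the association constant of 1e6 per molar are assumptions chosen
only for illustration; the function name is hypothetical.

```python
import math

def monomer_fraction(c_total, k_assoc):
    """Weight fraction of monomer in a monomer-dimer equilibrium.

    c_total is the total concentration in monomer units (molar) and
    k_assoc = [D] / [M]^2 in 1/molar, so c_total = [M] + 2 * k_assoc * [M]^2.
    """
    m = (-1 + math.sqrt(1 + 8 * k_assoc * c_total)) / (4 * k_assoc)
    return m / c_total

K_ASSOC = 1.0e6  # hypothetical association constant, 1/M (Kd = 1 uM)

for c in (1e-8, 1e-7, 1e-6, 1e-5, 1e-4):
    f = monomer_fraction(c, K_ASSOC)
    print(f"{c:8.1e} M  monomer {f:5.3f}  dimer {1 - f:5.3f}")
```

In this model the monomer and dimer weight fractions are equal when the
total concentration is 1/k_assoc, so the loading concentrations should
bracket that value if both species are to contribute good signal.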

Good luck, -Borries

*******************************************************************************
* Borries Demeler                                                             *
* The University of Texas Health Science Center at San Antonio                *
* Dept. of Biochemistry, 7703 Floyd Curl Drive, San Antonio, Texas 78284-7760 *
* Voice: 210-567-6592, Fax: 210-567-4575, Email: demeler@biochem.uthscsa.edu  *
*******************************************************************************
