  From: Peter Schuck <pschuck@helix.nih.gov>
  To  : rasmb@alpha.bbri.org
  Date: Thu, 27 Apr 2000 14:01:02 -0700

linear approximation for error estimates

With regard to Yujita's comment about the validity of the "linear
approximation", I'm very much surprised that this should be the case.
Maybe there is a misunderstanding.  I can see that with large data sets and
with Gaussian noise the statistical assumptions underlying least-squares
optimization are valid, but not that the actual confidence limits are those
from "linear" least-squares.  

The reason is that by simply mapping the error surface for an equilibrium
model, i.e. if you keep one parameter fixed at a non-optimal value,
optimize the others and observe the chi-square of the constrained fit, I
find that a symmetrical parabolic minimum is really the exception (linear
least squares implies a symmetric parabolic minimum).  I think
Michael Johnson has worked a lot on this, and in his book chapter "Comments
on the Analysis of Sedimentation Equilibrium Experiments", he explicitly
says that "[the asymptotic standard errors from the covariance matrix]
almost always significantly underestimate the true confidence intervals of
the determined parameters" (Todd Schuster and Tom Laue's book, Birkhäuser,
1994, p. 51).
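
The mapping procedure described above (fix one parameter, re-optimize the
others, record the chi-square) can be sketched roughly as follows.  This is
only a toy illustration, not anyone's production code: the single-exponential
model y = a*exp(b*x), the noise-free synthetic data, the grid of b values and
all names (constrained_chi2, b_grid, ...) are assumptions made for the example.
Because the amplitude a is a linear parameter, it can be re-optimized in closed
form for each fixed b.

```python
import math

def constrained_chi2(x, y, b_fixed):
    """Fix the exponent b, optimize the amplitude a, return (chi2, a).

    For fixed b the model a*exp(b*x) is linear in a, so the best a is
    the ordinary least-squares projection of y onto exp(b*x).
    """
    e = [math.exp(b_fixed * xi) for xi in x]
    a = sum(yi * ei for yi, ei in zip(y, e)) / sum(ei * ei for ei in e)
    chi2 = sum((yi - a * ei) ** 2 for yi, ei in zip(y, e))
    return chi2, a

# Hypothetical noise-free data from a = 1, b = 2 (illustration only).
A_TRUE, B_TRUE = 1.0, 2.0
x = [i / 49 for i in range(50)]
y = [A_TRUE * math.exp(B_TRUE * xi) for xi in x]

# Map the error surface: step b through fixed, non-optimal values.
b_grid = [B_TRUE + 0.05 * k for k in range(-10, 11)]
chi2_profile = [constrained_chi2(x, y, b)[0] for b in b_grid]

best_index = min(range(len(b_grid)), key=chi2_profile.__getitem__)
for b, c in zip(b_grid, chi2_profile):
    print(f"b = {b:5.2f}   chi2 = {c:10.5f}")
```

Comparing the printed chi-square at equal distances below and above the
minimum shows whether the trace is a symmetric parabola; for a nonlinear
parameter such as b there is no reason to expect that it is.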

My experience has been completely consistent with this.  If the parameters
you're looking for are binding constants, then these traces are frequently
fairly asymmetric, and sometimes I even get only a one-sided error limit.
I understand theoretically that a large number of data points does help,
but I usually work with about 10 long-column equilibrium scans, and there
the asymmetry can definitely be very pronounced, depending on the model.
In my hands, the comparison with the correlation coefficients shows that
they frequently seriously underestimate the errors, and I would caution
about using them for any quantitative interpretation of the errors of the
best-fit parameters.  They may be OK in some cases with good data and
well-behaved models, and certainly can give you a feel for what to expect,
but I agree with Olin about the necessity of the rigorous error analysis in
the end. 
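
To make the difference between the two kinds of error estimate concrete,
here is a small sketch that compares a symmetric width predicted from the
local curvature at the minimum (the parabolic, "linear" picture) with the
lower and upper limits found by actually following the constrained
chi-square.  Everything in it is assumed for illustration: the toy model
a*exp(b*x), the noise-free data, and in particular DELTA, a hypothetical
allowed chi-square increase standing in for whatever statistical criterion
(e.g. from F-statistics) one would really use.

```python
import math

def constrained_chi2(x, y, b_fixed):
    """Chi-square with b fixed and the linear amplitude re-optimized."""
    e = [math.exp(b_fixed * xi) for xi in x]
    a = sum(yi * ei for yi, ei in zip(y, e)) / sum(ei * ei for ei in e)
    return sum((yi - a * ei) ** 2 for yi, ei in zip(y, e))

def limit_width(x, y, b_best, delta, direction):
    """Distance from b_best at which the constrained chi2 reaches delta."""
    step = 0.05
    while constrained_chi2(x, y, b_best + direction * step) < delta:
        step *= 2.0  # expand until the threshold is bracketed
    lo, hi = 0.0, step
    for _ in range(60):  # plain bisection on the distance
        mid = 0.5 * (lo + hi)
        if constrained_chi2(x, y, b_best + direction * mid) < delta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical noise-free data, so the best fit is known to be b = 2.
B_BEST = 2.0
x = [i / 49 for i in range(50)]
y = [math.exp(B_BEST * xi) for xi in x]
DELTA = 2.0  # assumed allowed chi-square increase (illustration only)

# Parabolic prediction: chi2 ~ 0.5 * curv * (b - b_best)**2 near the minimum.
h = 1e-3
curv = (constrained_chi2(x, y, B_BEST + h)
        + constrained_chi2(x, y, B_BEST - h)
        - 2.0 * constrained_chi2(x, y, B_BEST)) / h ** 2
parabolic_width = math.sqrt(2.0 * DELTA / curv)

lower_width = limit_width(x, y, B_BEST, DELTA, -1)
upper_width = limit_width(x, y, B_BEST, DELTA, +1)
print(f"parabolic: +/-{parabolic_width:.4f}")
print(f"mapped:    -{lower_width:.4f} / +{upper_width:.4f}")
```

The parabolic estimate is a single symmetric number by construction, while
the mapped limits are free to differ on the two sides.  How large the
discrepancy is depends entirely on the model and the data; this toy case
says nothing about the size of the effect in a real binding analysis.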

If the unknown parameters are the species concentrations only,
which is the simplest case, the model actually looks like a linear
least-squares model, because the parameters are linear.  However, even
there, the non-negativity of the concentrations can cause the error surface
to be asymmetric.  It has been shown (using algebraic methods for linear
least squares with inequality constraints)  that even in this simple case,
the error surface is described by step-wise parabolic functions.  As a
consequence, even here you have to map the error surface, although in this
case the confidence intervals can actually get smaller than those predicted
from the linear approximation (this is described in Progr. Coll. Polym. Sci.
(1994) 94:1-13).
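
A minimal sketch of this constrained linear case, under loudly stated
assumptions: two species with known exponents K1 and K2 (hypothetical
values), noise-free synthetic data, and a second species whose true
concentration lies close to zero.  It is not the algebraic method of the
cited paper; with only one remaining free concentration, the non-negative
optimum is simply the unconstrained least-squares value clipped at zero.

```python
import math

def profile_chi2(x, y, c1_fixed, k1, k2, nonnegative=True):
    """Fix c1, optimize c2 (optionally with c2 >= 0), return (chi2, c2)."""
    e1 = [math.exp(k1 * xi) for xi in x]
    e2 = [math.exp(k2 * xi) for xi in x]
    rest = [yi - c1_fixed * a for yi, a in zip(y, e1)]
    # Unconstrained least-squares value of the remaining linear parameter.
    c2 = sum(r * b for r, b in zip(rest, e2)) / sum(b * b for b in e2)
    if nonnegative:
        c2 = max(c2, 0.0)  # one free parameter: clipping gives the optimum
    chi2 = sum((r - c2 * b) ** 2 for r, b in zip(rest, e2))
    return chi2, c2

# Hypothetical two-species data; the second concentration is nearly zero.
K1, K2 = 1.0, 2.0
C1_TRUE, C2_TRUE = 1.0, 0.05
x = [i / 49 for i in range(50)]
y = [C1_TRUE * math.exp(K1 * xi) + C2_TRUE * math.exp(K2 * xi) for xi in x]

c1_grid = [0.8 + 0.05 * k for k in range(13)]
constrained = [profile_chi2(x, y, c1, K1, K2)[0] for c1 in c1_grid]
unconstrained = [profile_chi2(x, y, c1, K1, K2, nonnegative=False)[0]
                 for c1 in c1_grid]

for c1, con, unc in zip(c1_grid, constrained, unconstrained):
    print(f"c1 = {c1:4.2f}   constrained = {con:9.4f}   plain = {unc:9.4f}")
```

As long as the re-optimized c2 stays positive the two traces coincide.
Once c2 is pushed to zero the constrained trace switches to a different,
steeper parabola, so the chi-square limit is reached sooner on that side
than the unconstrained linear analysis predicts.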





***********************************************************
Peter Schuck, PhD
Molecular Interactions Resource
Division of Bioengineering and Physical Science, ORS
National Institutes of Health
Bldg. 13 Rm. 3N17
13 South Drive 
Bethesda, MD 20892 - 5766
Tel: (301) 435-1950
Fax: (301) 496-6608
email: Peter_Schuck@nih.gov
***********************************************************
