  From: Tom Laue <tom.laue@unh.edu>
  To  : rasmb@bbri.harvard.edu
  Date: Thu, 10 Apr 1997 11:36:35 -0400

data formats

Dear RASMB,
I think that the file format discussion has been useful, but it seems to be
somewhat parochial. If we remain in the 1 file = 1 scan mode, and insist that the
file format must remain three columns of numbers, etc., then we might as
well stay with ASCII, too. However, I see the present discussion as being
extremely important to future developments. Think 10 years out, not where
we are today. 

The disk segment problem brought up by John Philo tells us that to switch
to binary *while retaining the exact same file information* would not gain
much disk space.  I think that this diverts attention from what might be
done if the whole issue of the file format/contents/cataloging is
addressed. What we decide now will cement where we can go. The idea of
using the present situation to forecast how file I/O should be done would
be terribly restrictive. Let's free ourselves from that, and think of how
we want it to be (with justification, of course). I am afraid that most of
the discussion so far is fairly defensive and focused on what will be lost.
Nothing has to be lost. The present ASCII output format will always be
there, unless it dies of its own accord (which will take generations, I
suspect). Now let's get on with how it could be better!

Best,
Tom Laue

PS It is significantly slower to load and write ASCII files when the
comparison is made head to head with an equivalent, well-written binary
file; it has to be. ASCII I/O has to be converted to binary for the machine
to function with the numbers. The routines to do this are considerably more
complex than those that go from binary->binary (or just stay in the IEEE
binary formats). The comparison of different machines and different
programs' implementations of ASCII I/O is not terribly meaningful, and
might mislead some. Perhaps performance issues will not be significant anyway.
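
To make the PS concrete, here is a minimal sketch, not a proposed format. It
assumes a scan is a list of rows of three numbers (the names r, a, s are only
illustrative) and that a binary file would hold little-endian IEEE 754
doubles; both choices are assumptions made for the example. The ASCII reader
has to tokenize each line and convert decimal digits to floats, while the
binary reader only reinterprets bytes.

```python
import struct


def write_ascii(rows):
    """Format each row of three floats as one line of decimal text."""
    return "".join(f"{r:.4f} {a:.5f} {s:.5f}\n" for r, a, s in rows).encode("ascii")


def read_ascii(data):
    """Split each line into tokens and convert every token from decimal text."""
    return [tuple(float(tok) for tok in line.split())
            for line in data.decode("ascii").splitlines()]


def write_binary(rows):
    """Pack all values as little-endian 8-byte IEEE 754 doubles, three per row."""
    flat = [value for row in rows for value in row]
    return struct.pack(f"<{len(flat)}d", *flat)


def read_binary(data):
    """Unpack the doubles directly and regroup them into rows of three."""
    count = len(data) // 8
    flat = struct.unpack(f"<{count}d", data)
    return [tuple(flat[i:i + 3]) for i in range(0, count, 3)]
```

Timing read_ascii against read_binary on the same rows (for example with
Python's timeit module) gives one head-to-head comparison, but the result
depends on the machine and the implementation, so it should be read with the
caveat above in mind. Note also that the binary round trip returns the stored
values exactly, whereas the ASCII round trip returns them only to the number
of digits written.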


Tom Laue
Biochemistry and Molecular Biology
University of New Hampshire
Durham, NH 03824
Ph:  603-862-2459
FAX: 603-862-4013
