  From: Walter Stafford <STAFFORD@bbri.harvard.edu>
  To  : stafford@tiac.net, stafford@bbri.harvard.edu, jkzmm@clemson.edu, david@spin6.mcb.UCONN.edu, jiawen@spin6.mcb.UCONN.edu, yujia@spin6.mcb.UCONN.edu, RLW@shcc.org, jwallace@biochem.adelaide.edu.au
  Date: Wed, 9 Apr 1997 18:45:11 -0400 (EDT)

binary files

Hi RASMBers,

	Time to add my 2 cents - since I seem to have started this debate 
about binary files by announcing the new Mac version of dcdt.

	I should explain that the main impetus for using a binary format in
my lab was to reduce the cycle time for data acquisition so that more
pictures could be taken for dcdt analysis. More data means a better
signal-to-noise ratio. The reduction in disk space usage and the increased
speed of data reading were secondary.

On the Model-E Rayleigh system, I was using a Mac II (with a 50 MHz 68030
accelerator) and a 230 MB hard disk (both kinda slow). I was able to cut
the cycle time by more than half by switching from an ASCII format to
a binary format. The Mac system will take a picture, process it, and write the
file to disk every 3.5 seconds. Now, this is plenty fast for many
applications; it means that 5 cells can be acquired every 20 seconds.
Convenient, but still not fast enough for really high-precision,
low-concentration work.

	I was hoping that the XL-I, which actually has to process a CCD
array only about 1/3 the size of the Model-E camera's, would run even faster
(what with 200 MHz machines and all). A very large fraction of the cycle
time on the XL-I is spent writing that huge ASCII file to disk each
time. It would be nice to find ways to reduce the cycle time so that the
full potential of the Rayleigh optics can be achieved (at least in
sedimentation velocity analyses at very low concentration using dcdt).

	The optics are capable of seeing 1 microgram/ml protein boundaries
in a dcdt analysis (at least on the Model-E system - I haven't tried it yet
on the XL-I), and the XL-I optics are considerably better. It seems a shame to
be hampered by a sluggish data acquisition system whose speed could be
improved without too much trouble.

Another advantage of binary files is that they are a lot smaller and
therefore not only take up less space but can be transferred from place to
place a lot faster. At any rate, binary file capability exists now in the new
version of dcdt for the Macintosh, and that should make life easier for the
Mac people who have to move and analyze the data.

It might be possible to write the data in binary format during
the experiment to get the speed, and then convert it to ASCII after the
run is over; this could be a menu item in the acquisition software. That way
we could have it both ways. The hi-byte/lo-byte order problem that John
Philo mentioned is not serious, since it would not increase the data
processing time by very much - a byte-swapping algorithm
would be pretty darned fast. The main problem with reading ASCII data is
that it is record oriented (at least in FORTRAN), so each line of the
file is accessed sequentially by executing a read statement for each line. A
binary file can be read very quickly by executing a single read statement.
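
To make those two points concrete, here is a minimal C sketch of the
post-run conversion step: a single read statement pulls in the whole
binary scan, the bytes are swapped in case the file came from a machine
with the opposite byte order, and the values are written back out as
ASCII. The file names, the 16-bit data type, and the number of points
per scan are illustrative assumptions, not the actual dcdt or XL-I file
layout.

    #include <stdio.h>

    #define NPOINTS 2048   /* assumed number of data points per scan */

    /* Swap the hi and lo bytes of a 16-bit value - a couple of shifts
       and an OR, so the per-point cost is negligible next to disk I/O. */
    static unsigned short swap16(unsigned short v)
    {
        return (unsigned short)((v << 8) | (v >> 8));
    }

    int main(void)
    {
        unsigned short data[NPOINTS];
        size_t i, n;
        FILE *in, *out;

        /* One read statement brings in the entire scan, instead of
           one formatted read per line as with an ASCII file. */
        in = fopen("scan001.bin", "rb");   /* hypothetical file name */
        if (in == NULL) { perror("scan001.bin"); return 1; }
        n = fread(data, sizeof data[0], NPOINTS, in);
        fclose(in);

        /* Byte-order fix-up for files written with the other convention. */
        for (i = 0; i < n; i++)
            data[i] = swap16(data[i]);

        /* Post-run conversion to ASCII, one value per line. */
        out = fopen("scan001.txt", "w");
        if (out == NULL) { perror("scan001.txt"); return 1; }
        for (i = 0; i < n; i++)
            fprintf(out, "%u\n", (unsigned)data[i]);
        fclose(out);
        return 0;
    }

Of course, whether the swap loop is needed at all depends on which
machine wrote the file; on a same-byte-order machine it can simply be
skipped.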


I ramble ....

I hope this clears up some things.

-Walter Stafford
