2-3:30pm, Friday, February 4, 2004
On Friday afternoon, Thaala and Ron went to visit B.CI.8 in Herrin Labs. She is a graduate student in her 5th year, and studies plant physiological ecology. She is exploring the difference in growth patterns between plants that live in nutrient-rich soils and plants (such as cacti) that live in nutrient-poor soils.
The nutrient mix is determined by whether P and N (phosphorus and nitrogen) are present at high or low levels.
Number of Experiments
She has been working on three experiments at the same time. One experiment is run in the pygmy forest in Mendocino County, California, four hours north of us (http://geoimages.berkeley.edu/GeoImages/Johnson/Biomes/BiomesSub/PygmyForest.html).
A second experiment is run in a local area where she has ~500 plants (baby pines) that she has been monitoring. These plants are in different categories of treatment, and are actually staggered in time; they are in different stages of the experiment, due to unforeseen events. Her greenhouse work lasted for one year.
The third is a census of different species, in which she records the date, site, GPS location, species seen, and any notes that she takes.
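The census fields listed above map naturally onto a simple record type. The following is only a sketch of how one entry might be represented; the class name, field names, and the example values are our own placeholders, not anything she uses.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class CensusRecord:
    """One census entry; all field names are hypothetical."""
    observed_on: date            # date of the observation
    site: str                    # site name or label
    latitude: float              # GPS location, decimal degrees
    longitude: float
    species_seen: List[str] = field(default_factory=list)  # encoded names, e.g. "PNEA"
    notes: str = ""              # free-form notes


# Placeholder values for illustration only, not real observations.
example = CensusRecord(
    observed_on=date(2004, 1, 1),
    site="example site",
    latitude=39.0,
    longitude=-123.0,
    species_seen=["PNEA", "TORR"],
)
```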
Summary: A system for keeping track of experiments and experimental progress might be useful.
The Field, the Lab, and the Office
The biologist works in three different places. On her trips
to gather specimens and measurements, she will drive four hours to Mendocino
County. The pygmy forest is “the field.” After a day of collecting, labeling,
and taking data, she will drive back to her office and lab at Stanford.
In the field, she will have a dewar filled with liquid nitrogen, her notebook,
vials, labels, and a hand lens. Samples are cut from the tree and dropped in
liquid nitrogen immediately. They are not thawed, because thawing affects the
nucleic acids (what she analyzes). She takes six field trips in one
year (four sampling trips), about 19 person-days in the field.
In the lab, there are tools for measurement (masses, gel
electrophoresis machines, etc). She will bring her notebook to the lab to
record data.
Her office is separate from the lab where she works. Her
office is where she collects research literature, plans her experiments, enters
data into Excel, analyzes data, and works on visualizations, results, and papers.
Summary: The field and
the lab seem like temporary work spaces, where she goes to get a
specific segment of work done (collection, measurement, etc). The office is
where it all comes together.
Note-Taking, Entering Data, and Analyzing It
Data is frequently encoded. For example, species names can be encoded PNEA, TORR, …
Treatments are also encoded as LNHP/HNHP… (Low Nitrogen, High Phosphorus, …).
This encoding of data could lend itself well to handwriting recognition software.
However, it currently (according to her gut feeling) leads to higher error
rates. Entering data from her notebook into her computer takes a long time and
is boring, which invites human error.
She also says that her coding scheme may have been poorly chosen. For example,
LNHP and HNHP look similar, but are totally different treatments! This would
also be an issue in any system we implement.
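One way a system could catch this kind of problem early is to flag pairs of codes that differ in only one character. The sketch below is our own illustration, not something she uses; the function names are hypothetical, and the code list contains only the two treatment codes mentioned in these notes.

```python
from itertools import combinations

# Only the treatment codes that appear in these notes; the full set is not known to us.
TREATMENT_CODES = ["LNHP", "HNHP"]


def hamming(a, b):
    """Count the positions at which two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def confusable_pairs(codes, max_distance=1):
    """Return pairs of codes that differ in at most max_distance characters."""
    return [
        (a, b)
        for a, b in combinations(codes, 2)
        if hamming(a, b) <= max_distance
    ]


# LNHP and HNHP differ only in their first letter, so they are flagged.
print(confusable_pairs(TREATMENT_CODES))
```

A check like this could run once when a coding scheme is designed, before any data is written down.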
Her experiment was also blocked out in space, as the
greenhouse had a specific layout. However, she did not take a map approach to
data entry (as one of our other biologists did, at RMBL). Since different
pieces of the map were at different stages of the experiment, she instead just
used her notebook as the log-based file system. :)
She uses software called SigmaPlot and JMP to analyze her
data. Initial data is entered from her notebook into Excel. We should sit in on a
data entry session, and possibly videotape and analyze the process
(Thaala?).
The biologist said that she wished she had a simple method
to barcode all her samples (in a human-readable format) so that she could
maintain links between her specimens, her data rows (in notebooks), and her
Excel spreadsheets. She’d love a system that could beep whenever she entered data that was suspect (for
some reason or other).
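As a rough illustration of the “beep on suspect data” idea, an entry could be checked against the list of known codes as it is typed. This is a minimal sketch under our own assumptions: the row layout, the function name, and the field names are hypothetical, and the code sets hold only the codes mentioned in these notes.

```python
# Codes taken from these notes; her real lists are longer and unknown to us.
KNOWN_SPECIES = {"PNEA", "TORR"}
KNOWN_TREATMENTS = {"LNHP", "HNHP"}


def check_row(row):
    """Return a list of warnings for one data row (empty list means no warnings)."""
    problems = []
    species = row.get("species", "").strip().upper()
    treatment = row.get("treatment", "").strip().upper()
    if species not in KNOWN_SPECIES:
        problems.append(f"unknown species code: {species!r}")
    if treatment not in KNOWN_TREATMENTS:
        problems.append(f"unknown treatment code: {treatment!r}")
    return problems


# A transposed species code ("PNAE" instead of "PNEA") is flagged.
for warning in check_row({"species": "PNAE", "treatment": "HNHP"}):
    print("\a" + warning)  # "\a" is the terminal bell, standing in for the beep
```

A check like this only catches codes that are not on the list; a valid but wrong code (LNHP entered for HNHP) would still pass.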
Her description of the notebook as the thing that organizes these links
essentially confirms our observations that the notebook is the central
organizing artifact of biology research. She said that she doesn’t share her
data with anyone. Her experiment cycle is about two years, from conception of
her hypothesis, to carrying out the experiment, to paper publication.
Summary: Transferring data from notebooks is tedious, and may lead to errors. Much of notebook
content is encoded (e.g., LNHP) and could lend itself well to automation.
Other Stuff
We asked her “what if we had some magic computer system that
could do all your tedious work for you?”
She said that she would keep up with literature more, and probably spend
more time asking questions, designing experiments, and identifying
problems to solve.
A quote from her: “tedium breeds inaccuracy…” Her lab work is tedious… so she is
worried that the data isn’t all correct. A system could both help with accuracy
and relieve her of the tedium by making the work take less time.