This morning, out of the blue, I got a call from my best high-school buddy, David, who these days is a bio-informatics Ph.D. student at Baylor. We got into a conversation about the nature of information in the sort of biological research he does, which involves a lot of pattern-matching with proteins and DNA. He's frustrated because he feels that the kind of information system he wants, which would basically index the occurrences of certain patterns in genes which are known to be related, should exist but doesn't.
Seen through my information-colored glasses, this is an interesting problem, and one which is completely different from everything we've been talking about in my classes. I'm so used to thinking of cataloging and indexing in the context of libraries and library materials that it was kind of a shock to remind myself that these same principles have applications in other settings. It's also interesting to think about the similarities and differences between natural language and DNA as ways to encode and store information. Both are governed by rules which are hard or impossible to build into a dumb computer, but with natural language, humans have the advantage of already knowing the grammar. David compared decoding DNA to reading an unknown foreign language with nothing but a (partial) thesaurus.
I don't know how David's needs would best be served, but I'm very pleased that he recognizes, as it seems many do not, the need for information professionals to help in the digestion of raw scientific data.
David's also given me another reason to take Boiko's XML class next quarter; he wants me to learn it and then teach him about it. I don't know how well I'd be able to do that, but at least I could do a better job than the half-baked explanation I attempted today.