The Bio-Information Explosion

The challenge of keeping up with the latest research confronts the FDA,
NCTR, NIH, and numerous other public and private agencies that
regulate and participate in the drug discovery process,
as well as the growing number of companies using bio-information
to create new technologies. In the past, print journals were
the primary repository of new research results. But
today, the omnipresent digital domain has enabled new forms
of communication that demand intelligent language processing
technologies capable of isolating the latest information that
is most important to the researcher or regulator.

An objective of bioinformatics is to extract useful knowledge from the
flood of data, including biological texts, for
further analysis leading ultimately to drug discovery;
in short, to turn the flood of new bio-information into usable
knowledge. But the data mining and knowledge management technologies
being deployed today to assist researchers and regulators
are unsuited to this task, for four reasons:
- Ever-expanding information: The results of groundbreaking research
  are being published every day, at a rate faster than any
  researcher or regulator can keep up with.
- Ever-expanding sources: To make matters worse, updated bio-information
  resides in many different locations: print journals, web-based
  journals, chat rooms, and various intranet and internet data
  resources. With so much new information available, it is more
  imperative than ever that searches return relevant results,
  so that time is not wasted sorting through irrelevant ones.
- Bio-information language: The language used in bio-information texts
  presents unique challenges to any information technology
  because of its proliferation of special terminology and
  symbols.
- Conventional search technology: The technology underlying typical
  knowledge management applications is not up to the task
  of dealing with these unique challenges, much less recognizing
  information that is relevant to a particular researcher
  or research program.

If there were a way to customize a knowledge management application
to handle the challenges unique to bio-information texts and
return only information relevant to the researcher's interests,
then researchers could spend more time applying the knowledge
they have gained than searching for it.

Clearly, what is needed is a next-generation technology that is intelligent
enough to read unstructured text and return only relevant
information: in short, a technology that is intelligent enough
to turn simple information into knowledge that researchers
can use confidently in the development of new bio-technologies.