In robotics, Brooks' idea that “The world is its own best model”,
and that maintaining an internal model of the world is neither
possible nor necessary, has proven an extremely fertile approach.
Today I'm wondering whether the same idea might apply to the
Semantic Web.
Take TEA (term extraction analysis), for instance. The robot would
be a web agent in charge of acting on a document (the world), and
its model would be all the external ontologies, reference corpora,
and models we use to represent language and domains.
Following Brooks' idea, new Web 2.0 applications are being built
today that rely solely on analysis of the document itself, without
external sources. The Red Panda contextual browser is one such
example: software that analyzes web pages on the fly.
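To make "document self-contained analysis" concrete, here is a minimal
sketch in Python. It is not how Red Panda or any particular product
works; the function name candidate_terms, its parameters, and the
length-based filter are all assumptions made for illustration. It
ranks repeated words and word pairs using nothing but the text itself:
no ontology, no reference corpus, no stopword list.

```python
import re
from collections import Counter


def candidate_terms(text, min_count=2, min_len=4):
    """Rank repeated words and adjacent word pairs found in `text`.

    Hypothetical helper: only the document itself is consulted.
    """
    words = re.findall(r"[a-z]+", text.lower())
    # Assumption: a crude length filter stands in for a stopword list,
    # since a real stopword list would be an external resource.
    counts = Counter(w for w in words if len(w) >= min_len)
    # Count adjacent pairs where both words pass the length filter.
    counts.update(
        f"{a} {b}"
        for a, b in zip(words, words[1:])
        if len(a) >= min_len and len(b) >= min_len
    )
    # Keep only terms the document itself repeats.
    return [(term, n) for term, n in counts.most_common() if n >= min_count]


if __name__ == "__main__":
    sample = "Semantic tags help. Semantic tags are hard. The web is big."
    for term, n in candidate_terms(sample):
        print(n, term)
```

The point of the sketch is the trade-off rather than the quality of the
terms: it is cheap and needs no model of the world, but it knows
nothing about what the terms mean.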
The question that then comes to mind is the seemingly irreconcilable
difference between the “heavy CPU”, “heavy model” approach, which
tries to extract some abstract knowledge representation (and
hopefully meaning) from within the document, and the much more
lightweight and apparently highly operational approach of analyzing
the document in a self-contained way.
In the end, one would imagine that structuring web pages with semantic
tags would make acting on those pages much easier. But a flurry of
new applications shows us that in order to extract value from those
documents you might not need to access this elusive meaning at all.
For instance, it's of little help to name all the concepts in a web
page if the goal is to determine whether a given text is criticizing
or promoting a given brand.
How could we help those two
approaches to converge?