In robotics, Brooks' idea that “The world is its own best model”, and that maintaining an internal model of the world is neither possible nor necessary, has proven an extremely fertile approach.
Today I'm wondering whether the same idea applies to the Semantic Web.
Let's apply this approach to TEA (term extraction analysis), for instance. The robot would be a web agent in charge of acting on a document (the world), whose model would be all those external ontologies, reference corpora, and models we have for representing language and domains.
Following Brooks' idea, new Web 2.0 applications are built today on analysis of the document alone, without external sources. The Red Panda contextual browser is one such example of software that quickly analyzes web pages on the fly.
The question that then comes to mind is the seemingly irreconcilable difference between the “heavy CPU”, “heavy model” approach, which tries to extract some abstract knowledge representation (and hopefully meaning) from within the document, and the much more lightweight, and apparently highly operational, approach of self-contained document analysis.
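To make the contrast concrete, here is a minimal sketch (in Python, with hypothetical names of my own) of what a “the world is its own best model” term extractor might look like: it ranks candidate terms using nothing but in-document frequency, with no ontology or reference corpus in sight.

```python
import re
from collections import Counter

def extract_terms(text, top_n=10, min_len=5):
    """Rank candidate terms using only the document itself:
    no ontology, no reference corpus, just in-document frequency."""
    words = re.findall(r"[a-z][a-z\-]+", text.lower())
    # Crude self-contained filter: drop short tokens instead of
    # consulting an external stop-word list.
    candidates = [w for w in words if len(w) >= min_len]
    return Counter(candidates).most_common(top_n)

page_text = ("The subsumption architecture avoids world models; "
             "subsumption layers react directly.")
print(extract_terms(page_text))
# e.g. [('subsumption', 2), ('architecture', 1), ('avoids', 1), ...]
```

It is obviously crude, but that is the point: it runs on-the-fly over any page, which is exactly what the heavy-model pipeline cannot do.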
In the end, one would imagine that structuring web pages with semantic tags would make acting on those pages much easier. But a flurry of new applications shows us that, in order to extract value from those documents, you might not need to access this elusive meaning at all. For instance, it is of little help to name all the concepts in a web page if the goal is to determine whether a given text is criticizing or promoting a given brand.
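A toy illustration of that last point (a sketch with a made-up word list, not a production sentiment analyzer): deciding whether a text promotes or criticizes a brand can be attempted with shallow lexical cues alone, with no concept naming and no semantic tags.

```python
import re

POSITIVE = {"great", "love", "excellent", "recommend", "reliable"}
NEGATIVE = {"terrible", "hate", "broken", "avoid", "disappointing"}

def brand_stance(text, brand):
    """Classify a brand mention from shallow lexical cues only."""
    words = re.findall(r"[a-z]+", text.lower())
    if brand.lower() not in words:
        return "no mention"
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "promoting"
    if score < 0:
        return "criticizing"
    return "neutral"

print(brand_stance("I love my Acme phone; the battery is excellent.", "Acme"))
# -> promoting
```

No abstract knowledge representation anywhere, and yet the output is directly actionable.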
How could we help those two approaches to converge?