9 June 2004

Questions about Longhorn

Three excellent postings on Jon Udell's weblog (as usual) on the issues of Semantic Web (abbreviated to SemWeb), RDF and Microsoft's new file persistence layer in Longhorn.

Questions about Longhorn, part 1: WinFS

Questions about Longhorn, part 2: WinFS and semantics

Questions about Longhorn, part 3: Avalon's enterprise mission

(from part2 ....)
"It seems that the point being argued is that with RDF you can get more understanding of the information in the document than with just XML. Being that one could consider RDF as just a logical model layered on top of an XML document (e.g. RDF/XML) I find it hard to understand how viewing some XML document through RDF colored glasses buys one so much more understanding of the data. [Dare Obasanjo] Dare aims this critique at RDF/SemWeb, not WinFS, but I'll take the liberty of extending it to both. And I'll argue that in theory, an information system based on explicit knowledge representation -- using triples, or relationships, or whatever flavor of item-linking you prefer -- is way more powerful than a system in which the same knowledge is available only implicitly. But in practice, I wonder if anybody, whether it's Tim Berners-Lee or the Longhorn architects, can mandate such an approach given the chaotic messiness of reality. My favorite Joshua Allen quote, for example, is this one -- which I also used in my XML 2003 keynote: The lesson, of course, is that real-world information is chaotic. In any but the smallest "proof of concept" systems, the best that one can hope for is to be able to recognize small pockets of structure within a sea of otherwise unstructured information. [Joshua Allen]

Maybe it depends how you construe "small pockets of structure." I've been getting decent mileage using nothing fancier than unschematized XML fragments. Microsoft, meanwhile, has taken a great leap forward in Office 2003 with support for schematized XML documents. The first glimmer of this stuff came almost two years ago. It shipped last fall. If asked to paraphrase the Office XML strategy then, I'd have put it this way:

Let's get schematized information out into the open, where any XML-aware tool can see it and touch it and work with it -- locally and globally, on Windows or any platform -- and then let's see what happens. If we play our cards right we'll broadly legitimize schematization, and we'll be able to use Windows to layer semantic value on top of it.
If asked to paraphrase the WinFS strategy now, I'd put it this way:
Let's put schematized information into Windows, where any CLR-aware Windows application can see it and touch it and work with it.

The first strategy envisions a plurality of schemas arising from the grassroots. You won't often hear support for this strategy from Microsoft, but I heard it last fall at the Enterprise Architect Summit from Jean Paoli, who appeared (with Sun's Jon Bosak) on my panel Schemas in the wild.

The second strategy envisions a canonical set of schemas woven tightly into Longhorn. Years from now it'll ship. Years later, it'll reach critical mass, developers will have mastered its APIs, and schema-aware Windows apps could start to make a "semantic" way of organizing and finding information real for lots of people.

Why wait? Microsoft is telling us to disregard the grassroots Office XML strategy, which is here now and doesn't lock us in, in favor of the ivory-platform WinFS strategy, which is years away and does lock us in. If a compelling argument can be made for the second approach, I haven't seen it yet."

Posted by mofoghlu at June 9, 2004 10:27 AM | TrackBack