Knowledge Sharing Hero

Ontologies, Semantic Web and Knowledge Sharing Technology

In the early 1990s, when Tom was a researcher at Stanford University, a number of AI research groups got together to pave a road that would bring AI out of the lab and into the mainstream. Their goal: define a standard architecture stack that would allow intelligent systems to interoperate over a knowledge bus, sharing data, models, and other knowledge without having to share data schemas or formats. The idea was to let independently developed AI systems “talk” to each other, working cooperatively or on behalf of their users, and to build common knowledge repositories for the world, something like a Wikipedia for AIs. This could not be done with data-level agreements, like the database schemata that define enterprise databases. It required a new kind of knowledge sharing technology: a semantic, or knowledge-level, agreement.

Why did this matter? At the time, AI was in a funding “winter” and progress was largely confined to research institutions. Research on intelligent systems produced impressive standalone demonstrations (such as virtual documents that explain how things work), but these relied on proprietary representations of knowledge and specialized reasoning mechanisms. The Knowledge Sharing group wanted to enable a more mature engineering discipline for building AI systems, in which progress could build on reusable parts with well-understood properties, and larger, more sophisticated systems could be assembled from coalitions of separately designed AI systems. DARPA, which funded much of the work on AI and needed robust applications, was eager to facilitate the effort.

Essential to the effort was the definition of the terms that AI systems would use when exchanging data or operating on common knowledge repositories. Tom co-led the effort for this part of the stack, which proposed the use of human- and machine-readable ontologies to define those terms. He proposed a definition of ontology as the term is used in computer science, along with a methodology for ontology engineering. Tom’s foundational papers for the field of ontological engineering were for a time among the most-cited articles in computer science, and one became the most highly cited article of all time in the journal in which it appeared.
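
To make the idea concrete, here is a minimal sketch of what a small, machine-readable ontology can look like. It is written in Python with the rdflib library and today’s RDFS/OWL vocabularies, which are descendants of that early work rather than the notations of the era; the namespace and class names are hypothetical, chosen only for illustration.

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import OWL, RDF, RDFS

    # Hypothetical namespace for a small shared vocabulary about devices.
    EX = Namespace("http://example.org/device-ontology#")

    g = Graph()
    g.bind("ex", EX)

    # Classes: every Sensor is a kind of Device.
    g.add((EX.Device, RDF.type, OWL.Class))
    g.add((EX.Sensor, RDF.type, OWL.Class))
    g.add((EX.Sensor, RDFS.subClassOf, EX.Device))

    # A relation between classes, with its domain and range made explicit.
    g.add((EX.monitors, RDF.type, OWL.ObjectProperty))
    g.add((EX.monitors, RDFS.domain, EX.Sensor))
    g.add((EX.monitors, RDFS.range, EX.Device))

    # Human-readable documentation lives alongside the machine-readable structure.
    g.add((EX.Sensor, RDFS.comment, Literal("A device that measures a physical quantity.")))

    print(g.serialize(format="turtle"))  # rdflib 6+ returns the Turtle text as a string

Two systems that agree on these term definitions can exchange statements about sensors and devices without ever seeing each other’s internal data schemas, which is the knowledge-level agreement described above.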

While at Stanford, Tom established the DARPA Knowledge Sharing Library, a web-based public exchange for ontologies, software, and knowledge bases. He also created open-source tools for working with ontologies as software artifacts, tools that could convert standard ontological definitions into the various formats used by knowledge bases. His work on ontologies was foundational for what later became the standards of the Semantic Web, proposed by Sir Tim Berners-Lee, creator of the original World Wide Web standard. The Semantic Web proposed that knowledge be shared on the Web in a form that both humans and machines could read; the machine-readable part was defined by standard ontologies, as Tom had proposed.
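
As a rough illustration of that kind of translation (a sketch, not the original tools, reusing the made-up device vocabulary from above), a few ontology definitions written in one standard syntax can be re-emitted in the other interchange formats a given knowledge base might expect:

    from rdflib import Graph

    # A hypothetical two-statement ontology fragment written in Turtle.
    TTL = """
    @prefix ex: <http://example.org/device-ontology#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    ex:Sensor rdfs:subClassOf ex:Device .
    ex:Sensor rdfs:comment "A device that measures a physical quantity." .
    """

    g = Graph()
    g.parse(data=TTL, format="turtle")

    # The same definitions, re-serialized into other standard formats.
    print(g.serialize(format="xml"))  # RDF/XML
    print(g.serialize(format="nt"))   # N-Triples, one triple per line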

Welcome to the Machine: AI and the Evolution of Web 3.0

This was an interesting point in the evolution of AI and the Internet. Tom was evangelizing knowledge sharing and ontology engineering during the era of Web 1.0. The Semantic Web was proposed at the end of Web 2.0. John Markoff, Pulitzer Prize-winning journalist for the New York Times, suggested that we were entering the era of Web 3.0, defined by semantic computing. By the time the Semantic Web and Web 3.0 had reached the mainstream conversation, the Social Web was on the rise as the dominant form of online information exchange.

The Social Web began as Web 2.0 players such as Flickr, Digg, WordPress, and Wikipedia showed the power of user-contributed content. The problem: these sites aggregated user-contributed content but did not embrace AI or standards for sharing machine-readable data. Tom proposed a synthesis of the Social Web and the Semantic Web in which the “folksonomies” of user-contributed tagging could be made machine-understandable with ontologies, and a blend of unstructured and structured information could let sites like Wikipedia serve as a source of human-generated information that is also understandable by machines. Today, Wikipedia and its structured counterparts Wikidata and DBpedia are the basis for the “general knowledge” used by Google search and all of the major virtual assistants.
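
A small sketch of what that looks like in practice: the query below asks the public Wikidata SPARQL endpoint (its URL and response format are assumptions about the current service) for the capital of France, the sort of general-knowledge question a search engine or virtual assistant can answer from structured, human-curated data rather than from raw text.

    import requests

    # France is Wikidata item Q142; "capital" is property P36.
    QUERY = """
    SELECT ?capitalLabel WHERE {
      wd:Q142 wdt:P36 ?capital .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    """

    resp = requests.get(
        "https://query.wikidata.org/sparql",
        params={"query": QUERY, "format": "json"},
        headers={"User-Agent": "knowledge-sharing-example/0.1"},  # hypothetical client name
    )
    resp.raise_for_status()

    for row in resp.json()["results"]["bindings"]:
        print(row["capitalLabel"]["value"])  # expected: "Paris"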

However, with the rise of social media platforms like Facebook, Twitter, and YouTube, larger economic forces came to dominate the next phase of the Social Web. These sites are driven by advertising, which values content only for its power to attract human attention. In ad-based systems there is no incentive for machines to understand the factual information or meaning of the content being shared. Instead, these companies employ AI technology to optimize how humans interact with content at scale, to keep them clicking and scrolling. Today Tom advocates for a different use of AI, one that counters the exploitation of users with AI that augments people’s intelligence and well-being.

Illustration attribution: “ontology” graphic: cc: The Linked Open Data Cloud from lod-cloud.net