Innovations

If you use Siri on your phone, learn from a threaded discussion on the web, collaborate with colleagues in a virtual workspace, or use a computer to help you speak, it is likely that Tom helped invent it, influenced its development, or was designing an early version of it before it entered the mainstream. Tom has a strong background in research, experience building companies, and a knack for product design, but his real gift has always been to identify a problem, make connections across disciplines that haven’t been made before, and then set about inventing a solution that doesn’t yet exist. Many of the innovations Tom created or influenced are taken for granted today, woven into the pattern of our digitally mediated lives.

Innovations arise in the context of applications. A driving force in Tom’s work has been applications that augment human intelligence, individually and collectively. People need better ways to communicate, to remember, to learn from each other, and to access the world’s knowledge. These are the problems that his innovations address.

Intelligent UI: The Virtual Assistant Paradigm

In 2007, Tom co-founded the company that created Siri, which established the virtual assistant interaction paradigm for the mainstream. The Siri vision required innovation on several fronts, including speech recognition, natural language processing (NLP), and conversational UI. Tom’s role was to lead the design of the intelligent user interface, which included typed and spoken language input, a conversational dialog integrating query and response, and semantic autocomplete — a personalized, predictive language interface that helps users formulate sophisticated queries in natural language. Siri was purchased by Apple in 2010 and remains central to the user experience of all major Apple products. Today, virtual assistants are used by billions of people around the world, and the interaction paradigm is taken for granted.

Collaborative Knowledge Management: Intelligent Organizations That Learn

In 1995, at the dawn of the Web, Tom invented a product to help realize the potential of the internet to foster collective intelligence. The problem is simple and profound: Humans are intelligent beings as individuals, but as they gather in larger and larger organizations, the intelligence of those organizations does not increase. Individuals have brains and the ability to learn from other individuals through language and by working together. Organizations do not have analogous cognitive skills. For example, it is common for large organizations not to know what their people know. As a result, decision makers are unable to take advantage of the best knowledge and experience, wheels get reinvented, and organizational knowledge is not shared and reused.

Tom set out to equip organizations with basic cognitive skills so that they could learn from their experience. He created a product that brought the new technologies of the day (email, web, search, and online collaboration) into the enterprise. The result was an organizational memory that enabled the organization, and everyone in it, to learn from the collective experience of people doing their work together online. Today, we see the ecosystem of products such as Slack, Google Docs, Microsoft Teams, and Dropbox serving a similar function for group collaboration, but lacking the unified search capabilities that create the organization-scale memory.

Collective Knowledge Systems: Learning from Everyone Who’s Been There, Done That

Following his early attempt at collective intelligence for the enterprise in the mid-’90s, in 2005 Tom became excited by the possibility of aggregating the knowledge of Internet consumers. Once again, the principle he applied was to find ways to collect the products of people working online into a common knowledge repository. The domain he chose was travel, a personal favorite as well as a dynamic and abundant source of input. The resulting site, RealTravel, provided an environment for a community of travel enthusiasts to create beautiful travel journals of their adventures, share them with friends and family, and find other like-minded travelers.

People looking for information about where to go, where to stay, or what to do could learn from the authentic experiences of those who have been there. The site was optimized for organic search, driving millions of users to well-indexed travel stories. It also used machine learning to discover latent dimensions of travel blogs, and offered an interactive travel advisor that recommended travel and itineraries based on multidimensional matches to the experience of people who have been there. The lessons learned from this project led to a major conference keynote, an invited journal article explaining the principles of Collective Knowledge Systems, and a T-shirt.

Ontologies, Semantic Web, and Knowledge Sharing Technology

In the early 1990s, as a research scientist at Stanford University, Tom was part of a coalition of AI researchers who wanted to create a new set of standards for AI. Their vision: enable independently developed AI systems to “talk” to each other, in order to work cooperatively or on behalf of their users, and to build common knowledge repositories for the world. Think of it like a Wikipedia for AIs. Tom’s role was to define the Ontology layer, a way for AI systems and the people who built them to agree on the meaning of terms. The technology of common ontologies made it possible to define terms in a machine-understandable way, so that programs could read and write sentences using those terms that represented knowledge to be shared and reused like data. Tom became accidentally famous for providing a theoretical foundation for (and technical definition of) Ontology as used in computer science, and he worked to promote a principled discipline of ontology engineering. His work on ontologies was foundational for what later became the standards of the Semantic Web, proposed by Sir Tim Berners-Lee, the creator of the original standard of the World Wide Web.

Knowledge Acquisition: Machines Learning from Humans

In the 1980s, the dominant AI technology was called expert systems. Never heard of it? The main reason expert systems did not become ubiquitous is the knowledge acquisition bottleneck: It’s hard to build systems that model and act on the knowledge of human experts. Tom worked on this problem for his doctoral research. Like others in the knowledge acquisition community, Tom argued that the programming approach should be replaced with a learning approach. Instead of programmers learning from experts and then writing computer programs, knowledge systems should be created by machines learning directly from experts. His dissertation research created a very early kind of supervised machine-learning system in which people taught machines how to do things by demonstrating for them how to make strategic decisions in context.

Virtual Documents: Conversational Interfaces that Generate Natural Language Explanations

Over the centuries we humans have developed an incredibly effective knowledge sharing technology called writing. As wonderful as the written word is for the advancement of science and technology, it has its limits as a medium for recording the collective knowledge of lots of people working together on large projects. For example, when a complex technological product like an airplane or space vehicle is designed, tons of paper documents are written to try to capture the design rationale: why it was designed that way and how it is supposed to work. Years later, someone who needs to operate, maintain, or upgrade the product may not get what they need from what was captured in words by the original engineers. Tom believed that AI and knowledge representation technology could do better. As part of a project for representing the design rationale and operation of very large electromechanical systems such as NASA’s Hubble Space Telescope, he built intelligent interfaces to model-driven simulations, which could explain how things work in natural language. The interfaces were like hypertext documents connected to AI systems that could dynamically answer questions asked by the reader, like “what happens if I push this button?”

Knowledge-based Communication Prosthesis: Enabling Speech for Every Human

In 1983, Tom wrote his first practical AI application, a system designed to augment human speech, just as eyeglasses do for vision and hearing aids do for hearing. The problem was not generating synthetic speech; even then, there were computer programs that could generate comprehensible spoken output. The problem was user interface. Someone without the ability to articulate the muscles required for speech cannot typically operate a keyboard. They typically can operate a single switch, with limited speed and accuracy. They need a better interface than cycling through all the letters on a keyboard. Tom built a way for a personalized language model to drive completion of these precious one-button inputs, to accelerate the generation of speech for people with cerebral palsy. It enabled people to speak for themselves by themselves, using computers strapped to their wheelchairs. A similar idea, using language models for predictive typing, which we all use on our phones today, showed up in patents filed 10 years later. About twenty-five years later, insights from this work were influential in the invention of Semantic Autocomplete for Siri. Today Tom is advising a company that is developing this application using AI interpretation of brain waves, allowing people to literally speak with their brains.
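
To make the idea of a language model driving single-switch input concrete, here is a minimal sketch. It is not the 1983 system, whose internals are not described here. It assumes a toy bigram model built from a user's past sentences and a scanning interface that highlights candidates one at a time until the switch is pressed. The names PersonalLanguageModel, ranked_candidates, and scan_steps are hypothetical, as is the sample history.

```python
from collections import Counter, defaultdict


class PersonalLanguageModel:
    """Toy bigram model trained on one user's past sentences (illustrative only)."""

    def __init__(self, sentences):
        self.unigrams = Counter()
        self.bigrams = defaultdict(Counter)
        for sentence in sentences:
            words = sentence.lower().split()
            self.unigrams.update(words)
            for prev, word in zip(words, words[1:]):
                self.bigrams[prev][word] += 1

    def ranked_candidates(self, prev_word=None):
        """Return every known word, most likely first given the previous word."""
        context = self.bigrams.get(prev_word, Counter())
        # Order by bigram count, then overall frequency, then alphabetically.
        return sorted(
            self.unigrams,
            key=lambda w: (-context[w], -self.unigrams[w], w),
        )


def scan_steps(candidates, target):
    """Number of highlight steps before one switch press can select the target."""
    return candidates.index(target) + 1


# Hypothetical history of things this user has said before.
history = [
    "i want water",
    "i want to go outside",
    "i want water please",
    "please call my sister",
]
model = PersonalLanguageModel(history)
predicted = model.ranked_candidates(prev_word="want")
alphabetical = sorted(model.unigrams)

# Compare scanning effort for the word "water" after the user has said "want".
print(scan_steps(predicted, "water"), scan_steps(alphabetical, "water"))  # prints "1 10"
```

In this toy setup the predicted ordering reaches the intended word on the first highlight, while a plain alphabetical scan of the same vocabulary needs ten. A real system would also have to handle words the user has never said before, which this sketch ignores.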

Hypermail: Creating a Shared Memory of Knowledge Created in Conversation

In 1994, at the dawn of the World Wide Web, Tom saw the potential for sharing knowledge among people as well as AI programs. He noted that a lot of knowledge was created and shared via email lists, but this was only available to people who were parties to the conversation. Even in those early days it was clear that if knowledge could be published as Web pages, it could be picked up by search engines and become part of the world’s online knowledge repository. Tom connected these two sources of knowledge, creating one of the first integrations of the Web with email: Hypermail. Hypermail turned standard mailing lists into published web accounts and institutional records of conversations, so that after the conversation, others could learn from it. Hypermail had thread-finding and link-finding algorithms of the kind found in most modern mail and threaded discussion systems today. It established the convention that each message should have its own URL, so that one could discover and link to the message itself as archival content and follow links to related pages. It was distributed freely and widely in the early years. Its progeny are part of standard open-source distributions and used all over the Web. Its surprising success as a simple group-memory application influenced the development of other projects in collective memory.
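
The two conventions described above, one URL per message and replies grouped into threads, can be sketched in a few lines. This is an illustration only, not Hypermail's actual algorithm. It assumes that replies carry a standard In-Reply-To header pointing at the parent's Message-ID, and the function names, sample messages, and URL scheme (lists.example.org) are all hypothetical.

```python
from collections import defaultdict
from email import message_from_string

# Hypothetical mailing-list messages in raw header/body form.
RAW_MESSAGES = [
    "Message-ID: <a@example.org>\nSubject: Meeting notes\n\nNotes attached.",
    "Message-ID: <b@example.org>\nIn-Reply-To: <a@example.org>\n"
    "Subject: Re: Meeting notes\n\nOne correction.",
    "Message-ID: <c@example.org>\nIn-Reply-To: <b@example.org>\n"
    "Subject: Re: Meeting notes\n\nThanks!",
]


def build_archive(raw_messages, base_url="https://lists.example.org/archive"):
    """Give each message its own URL and group replies under their parents."""
    urls = {}                     # Message-ID -> archive URL
    subjects = {}                 # Message-ID -> subject line
    replies = defaultdict(list)   # parent Message-ID -> child Message-IDs
    roots = []                    # messages that start a thread
    for number, raw in enumerate(raw_messages, start=1):
        msg = message_from_string(raw)
        msg_id = msg["Message-ID"]
        urls[msg_id] = f"{base_url}/{number:04d}.html"
        subjects[msg_id] = msg["Subject"]
        parent = msg["In-Reply-To"]
        if parent in urls:        # parent is already in the archive
            replies[parent].append(msg_id)
        else:                     # no known parent: treat as a thread root
            roots.append(msg_id)
    return urls, subjects, replies, roots


def render_thread(msg_id, urls, subjects, replies, depth=0):
    """Return indented 'subject -> URL' lines for a message and its replies."""
    lines = [f"{'  ' * depth}{subjects[msg_id]} -> {urls[msg_id]}"]
    for child in replies.get(msg_id, []):
        lines.extend(render_thread(child, urls, subjects, replies, depth + 1))
    return lines


urls, subjects, replies, roots = build_archive(RAW_MESSAGES)
for root in roots:
    print("\n".join(render_thread(root, urls, subjects, replies)))
```

Because every message gets a stable address, a reply two levels deep can be linked to directly, which is what lets a search engine or a later reader land on the exact message rather than on the whole conversation.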

Patents

Tom has been the primary author on dozens of patents, many of which are starting to be issued after years of working their way through the patent offices. The original Siri patents and follow-on patents document inventions relevant to building virtual assistants, including fundamental techniques for managing context, multi-modal interaction, and conversational interaction. Patents are also pending and issued for several inventions motivated by the need to offer hands-free and eyes-free interfaces for people who are permanently or situationally impaired. Another family of inventions is related to intelligently reminding and notifying users, helping with memory and sleep, as well as other augmentative services.

Publication Archive

As a professional researcher, Tom chose not to pursue the tenure track and instead focused on building systems that did new things. However, he managed to knock out a few academic publications and achieve a modest influence in the literature (over 60,000 citations). Here are a few pieces of writing that were important at the time or may be of interest today. They are organized by general topic, although they typically span several of these areas of interest.
