Big Think Small Screen: How Semantic Computing in the Cloud Will Revolutionize the Consumer Experience on the Phone | Keynote | Web 3.0 Conference | January 27, 2010
A week before Siri was released to the Apple App Store, Tom gave a keynote address at the Web 3.0 conference, which was about intelligent applications on the web. In this forward-looking talk, he explains why intelligent apps like Siri were about to burst onto the scene. Using the metaphor of the “perfect storm”, he explains how the “big think” of computing in the cloud, combined with broadband connectivity to the new class of “small screen” computers (smartphones), would enable intelligence at the interface and a whole new way for consumers to interact with the rapidly expanding universe of services on the web. He explains how Siri takes advantage of this moment in history and relates it to other vanguard technologies of the time, including Yelp, Pandora, Shazam, Google Goggles, and Wolfram Alpha.
A journalist for Wired saw the talk and wrote this insightful piece: Web 3.0: Rosie, Jeeves & That Thing in Your Pocket
Citation: Tom Gruber. Big Think Small Screen: How semantic computing in the cloud will revolutionize the consumer experience on the phone. Keynote presentation at Web 3.0 Conference, January 27, 2010.
legacy link to Web 3.0 Conference: http://www.mediabistro.com/web3/
Transcript
Thanks for the opportunity to speak. I will be speaking about the future, but also the present. I’m Tom Gruber, the CTO and co-founder of Siri. Siri is one of the applications I’ll be speaking about today. We’re going to be talking about a new generation of intelligent applications that are enabled by a kind of sea change in the computing environment, what we call “big think, small screen”. That’s the environment where really big computation in the cloud is conducted over pretty big pipes, all the way down to brilliant interfaces.
Let’s get started. I was thinking about what we would have if we could design the kind of computing system we really want. Well, for most of our careers, a lot of us in AI and related communities have been thinking about making computers natural, like humans.
It should be natural in the sense that you can talk [00:01:00] to it in your language, not its language. It should be natural in the sense that, like you, it has a sensation of the world: the time, the place, the location, even the temperature. And it should help you actually solve your problems, not just do information crunching, but solve your problems.
That’s what it’s there for. And in fact, it’s not just solving problems. A computer system should really be at your service; it should be able to do things for you. That vision has been motivating great work throughout the decades. And I’m here to announce today that I think not only will we see this in our lifetimes, we’re beginning to see it today.
I’ll start with —
wait, wait, wait. That’s not the metaphor. That’s a hurricane, and we don’t want to see that. But the idea is kind of [00:02:00] there. It’s a perfect storm. The perfect storm is a combination of things happening in the macro environment that are making intelligent applications possible. What we call intelligence at the interface: applying all these great semantic and artificial intelligence technologies to the human need at the interface.
There are four elements of this perfect storm I’m going to talk about. And then I’m going to show you a series of examples of applications.
The first is the fact that the cloud is an amazing place to compute today. The next is the pipe: how does that compute get down to the interface? The third is the interface itself. And the fourth is the ecosystem in which all of this lives. So let’s talk about the cloud. Now, everybody knows about cloud computing; what I mean by cloud computing is the thing that Google and Yahoo run on. Huge server farms, parallel processing, lots and lots and lots of compute. That’s basically the [00:03:00] basis of Web 2.0. It wouldn’t be possible to exchange attention for money if cloud computing weren’t an economic force. When you do a Google query, hundreds of machines are harnessed to your needs within a few milliseconds. Other applications, as we’ll see, have even more extreme ratios. In fact, I think one of the things that has been keeping natural intelligence from occurring at the interface is the basic problem that the stuff that’s hard to do takes computation, some machine muscle. And it’s not just raw compute: the kind of computing we’re doing in Web 3.0 requires some innovation, which means that normal, small companies need to be able to afford to do it. Maybe the cloud is ready for this. Let’s talk about the properties of it.
What is it about the cloud today that makes it possible that we might be able to see this new level of computation, this new level of innovation? Well, is it just cheap computing? I actually started my research by getting out an [00:04:00] old copy of the Moore’s Law curve. As Joy Mountford has shown in her talks, the original curve that Gordon Moore drew actually runs backwards, the way a mathematician would draw it. Most of us show it going up and to the right, like a business person. But in any case, it is, in fact, an exponential curve caused by a doubling of density. But you know what, that’s not the secret behind cloud computing. In fact, the actual price of cloud computing is not following Moore’s Law; other things have been going on in cloud computing.
What the Berkeley study in 2007 said about it is that the cloud gives you the illusion of infinite scalability. That is, by using parallel processing and on-demand provisioning, you can make it seem to the user that you have all the compute power in the world. And you don’t have to be Google to do that.
The hint here about why this works: it’s kind of like lightning. Lightning is [00:05:00] dramatic and powerful, though it only happens for a very short amount of time. When lightning strikes, all that power is concentrated in a single bolt in a fraction of a second. That’s kind of extreme, but that’s the idea behind elastic computing.
That’s why the cloud is so powerful: we can bring to bear enormous amounts of computation dynamically, and then take it back again, economically. So it’s not just cheap computing, it’s how you deploy it.
The second thing is lots and lots of data. We talk about search engines with huge indexes of the world. Is that the secret behind cloud computing? Well, I think we should dig just a little bit below the surface. It’s not just that there’s big data, but that there’s the illusion of omniscience. People now say, let’s just Google it, let’s just check Wikipedia, let’s just search Facebook, whatever. Data is taken for granted to be living in the cloud now, and one machine in one co-location facility can access [00:06:00] another machine in another co-lo with no effort whatsoever. That low friction of cloud-to-cloud computing,
and the fact that you can scale horizontally, means that you can pretend as if all the world’s information were available to you, if you live in the cloud. And the hint here is that it’s like a weather system. A cloud doesn’t exist by itself; it exists as part of a larger climate.
And that is why cloud computing is exciting: every cloud is really a server farm somewhere, and no one even cares where they are anymore. But the backbone interconnectivity between them has been growing enormously; the fiber between them is almost friction-free now.
So the illusion of omniscience is possible, if we connect them together.
A third thing, which isn’t really about the geeky Moore’s Law techno-bandwidth story, has to do with the fact that people actually live in the cloud now. What I mean by [00:07:00] that is, people assume that they can keep their data there, and they do put their data there.
And you can mine that data, all the data: Facebook and Flickr and all the rest, the blogs, all of that is in the cloud. So if we live in the cloud, we can harvest the cloud.
The next thing is the pipe. This beautiful photograph is actually an image generated by a computer by an artist named Aaron Koblin: IP traffic between New York and other places.
I believe it was cell phone IP traffic, but I’m not sure; in any case it makes the point that every cell phone, and every laptop of course, is connected effortlessly by this pipe. And the pipe has been growing.
Eye chart number one is that the bandwidth to the mobile has reached a point where 3G enables a whole class of applications. Duh, right? You can do video and YouTube and all that kind of thing.
It’s also reaching a plateau of penetration, as they call it affectionately in the business, which is about the number of people who [00:08:00] actually buy into 3G.
But I found another way of looking at these numbers that kind of put it together for me. This blue chart here on the left, the blue ramp, is what happened with DSL in the home.
Around 2002, 2003, DSL went from IDSL, the fast phone line version, to the first one-megabit-per-second DSL, and then to three-to-five megabits per second. And when that happened, there was a tipping point, not only in penetration, as everybody got DSL, but the DSL was fast enough to enable a whole bunch of applications. YouTube would not have happened otherwise. YouTube was launched in 2005, right where three-to-five megabit ADSL was starting. By the time YouTube actually got mature, most of North America had it. Now the second chart is on the same calendar. We see a similar curve happening in wireless bandwidth. It started out slow. We have [00:09:00] 3G. We have kind of a tipping point in 3G that’s really going to take off at 4G. And all those applications and all those users are going to make mobile broadband the new broadband.
The third piece of the perfect storm is the interface. Okay. Again, we’re not living under a rock or in a cage, but we have noticed that there’s a really smart phone out there and there are a few more coming.
The Nexus One is a brilliant phone, and there are a few others coming on this new Android platform that’s enabling innovation and competition in that space. And it turns out that they’re so cheap, because they’re subsidized in the U.S., that a whole big chunk of all phones sold from now on will be smartphones. I don’t know exactly when it hits 51%; I think it’s estimated to happen early next year. But a lot of people, hundreds of millions of people, will have smartphones, not just phones, in their pockets. So the question is, for this ecosystem of [00:10:00] intelligent applications, what makes the smartphone smart?
And again, if we dig below the surface, it isn’t what you might think. It isn’t Moore’s Law making that ARM processor a little faster. That kind of helps, but there’s a much bigger thing going on here. It’s the sensation. It’s the senses. These phones have senses. They have a sense of touch. They have a sense of hearing.
They listen, they hear. They have sight now, video as well as still images. They have proprioception: they know where they are and they know when they’re being shaken. That’s an amazing amount of sensation, which really is an extension of you. It’s sensors on you. You’re wearing that phone.
It’s your phone. So it’s really the sensors on you. And they’re tasteful. The only thing they don’t really have is a brain. Now, they have a processor, but as those of us trying to build software [00:11:00] for them know, it’s not that much of a processor yet, mainly because of the battery. But let’s not wait for the processor on the phone to be the same as the processor in the co-lo farm.
That’s not how the new wave is going to work. It’s the sensation and the ability to put a brilliant interface, a tasteful interface, in front of the user, combined with that broadband to the phone, combined with the cloud, that is starting to create this perfect storm.
Now, the fourth piece of this is especially relevant to this group, to this conference, Web 3.0. Because the ecosystem is not just a bunch of pipes and bits and processors and so on; it’s what’s in that ecosystem that can be the substrate for the growth of these intelligent applications. And in Web 2.0, that food chain started with search, which enabled a huge crop of innovation based on the exchange of attention for money,
which was really driven by the organic search advertising afforded by Google and [00:12:00] others. So the food chain, the krill of that ocean, was search. We think that in the Web 3.0 environment, the food chain will be based on a different kind of information, maybe a step up from pages. In fact, it’s not really just about data at all.
Yes, the semantic web and structured data are part of the story. But again, it’s the connectivity of that data that’s going to be the tipping point, I think, for these intelligent applications. What we call the Gigantic Join (I’ll show you a picture of that in a second) is when you take structured data from one source and structured data from another source and combine them to produce a new service that was never there before.
Speaking of services, it’s not just data. I was very happy to be part of the early days of the semantic web, and I think it’s fantastic, but I think in 2010 the next level is here. It’s the Semantic Web with [00:13:00] APIs on top, and it’s the APIs that deliver services that are going to make these intelligent applications happen.
And in fact, just like data needs to be connected, services need to be combined. So the intelligent applications of the future, of today, are going to be masters of mashup, what we call the mother of all mashups. So, I mentioned this gigantic join idea. Think about the linked data cloud. As we know in this group, that’s all of the structured data that has been tagged by standards and so on.
It’s already linked nicely by semantic web standards. But even ordinary applications that aren’t using those standards are part of this: there’s a kind of gigantic join that’s already happening without those standards, on top of them or in spite of them. And that is the join across the fundamental dimensions of living in the world.
Things like: you, your identity, the identity of a business, the identity of a location in space. The [00:14:00] who: who are you, what site are you going to. The where: geolocation. The how: which service you’re using, how to achieve a task. If you join these things, time, place, and identity, identity of a person, identity of an entity in the world, you get enormous power. And we see that as really the key to taking advantage of the connectivity of the data. And the services ecosystem has exploded. I’m not going to say it’s exponential; people throw that word around and it’s not actually exponential, very few things in life really are. But it’s not linear. The growth of APIs is really, really, really fast. Over the last couple of years we’ve gotten to some number over 1,600 APIs.
These are public APIs. There is an enormously larger number of dark, deep-web APIs, private APIs, that will become public or can be made available through business relationships. This means that the ecosystem that intelligent applications are building on now not only has access to the whole linked world of data, but also the whole [00:15:00] combined world of services.
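To make this “gigantic join” idea concrete, here is a minimal sketch in Python. Everything in it is invented for illustration — the two data sources, the field names, and the coordinates are hypothetical, not Siri’s schemas or any real API — but it shows the point: joining structured records from two sources on a shared dimension (here, place) yields a combined service that neither source offers on its own.

```python
from math import radians, sin, cos, asin, sqrt

# Two hypothetical structured sources with different shapes.
restaurants = [
    {"name": "La Pastaia", "cuisine": "Italian", "lat": 37.333, "lon": -121.894},
    {"name": "Taqueria Roja", "cuisine": "Mexican", "lat": 37.335, "lon": -121.889},
]
events = [
    {"title": "Jazz Trio", "venue_lat": 37.334, "venue_lon": -121.893, "start": "19:30"},
    {"title": "Open Mic", "venue_lat": 37.400, "venue_lon": -121.950, "start": "20:00"},
]

def km_between(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def dinner_and_a_show(cuisine, max_walk_km=0.5):
    """Join the two sources on place: pair each matching restaurant with
    events happening within walking distance of it."""
    plans = []
    for r in restaurants:
        if r["cuisine"].lower() != cuisine.lower():
            continue
        for e in events:
            if km_between(r["lat"], r["lon"], e["venue_lat"], e["venue_lon"]) <= max_walk_km:
                plans.append((r["name"], e["title"], e["start"]))
    return plans

print(dinner_and_a_show("Italian"))
# [('La Pastaia', 'Jazz Trio', '19:30')] -- a combined answer neither source provides alone
```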
It turns out there’s also kind of a network effect of services: when you combine them, they have even more power than they do individually. These graphs are easy to get. If you go to ProgrammableWeb, you can see some visualizations of APIs. These are graphs of ways that APIs have been combined or connected together.
And they look an awful lot like social networks. That’s not a coincidence: there’s kind of a social life of services. They can get better when they get into small clusters together to form groups. And this connectivity of services is part of what is really new about this perfect storm.
So we think that storm is now the ecology in which a new generation of big think, small screen apps will come. And when we’re thinking about how to characterize these, there are really four new capabilities that this paradigm gives us that really weren’t there before. And I’m [00:16:00] going to give examples of applications in each of these.
First, I’m going to show you Siri, which shows examples of all of that. Let’s see, I recorded a demo the other day,
[some volume please…] …restaurant for, say, tomorrow night. “I’d like a romantic place for Italian food near my office.” I can just say it to Siri in English. Siri can take my speech, turn it into text, and interpret my words in terms of things that it can call services for. In this case, it was able to call several services to find Italian restaurants.
It knew the cuisine type, and it also knew a way of solving the problem of finding a romantic one. Let’s take a look at the first entry. Here’s La Pastaia. It is close to my office, and as the reviews here say, it’s a romantic place.
Let’s say I want to get a reservation at my favorite [00:17:00] restaurant. “I’d like a table for two at Il Fornaio in San Jose tomorrow night at 7:30.” I can say what I want directly to Siri, and in one click I can get the reservation that I requested. Let’s say I wanted to see a movie first. I’ve always wanted to see Avatar, and people tell me it’s really good in IMAX and 3D. “Where can I see Avatar in 3D IMAX?” Siri turns those words, again, into something that it can operationalize in services. In this case, it was able to know that these were parameters to a movie search, and it found the available times showing tonight.
It’s late, but I might be able to catch a show. Let’s see where it’s playing. I see it’s playing in a couple of places. Oh, here’s a place where I can get tickets. So with Siri, I can one-click and reserve the tickets, so I’ll have a seat. Here it offers them at a [00:18:00] local theater. This is using a service partner named MovieTickets.com.
And I can use a credit card and Siri will remember the entire transaction and get me tickets, waiting for me at the door.
“What’s happening this weekend around here?” Siri can understand the words, turn them into appropriate service calls, and find a set of events that are ranked by popularity and proximity. Here I see there are all kinds of things happening this weekend. Oh, I notice there’s a comedy show nearby at the Flint Center, near my current location. I can get tickets. I can also just ask Siri to modify the query, using speech. “How about San Francisco?” It understands the context and can interpret this as a request for what’s happening in San Francisco this weekend. And I’ll probably find out that there are even more things to do there this weekend.
Oh, look, let’s check out the music. I see there’s music there and there’s a musical [00:19:00] called Wicked. Oh, I can, maybe I can get tickets. Let’s check it out. And it can show the actual theater, the seating chart in the theater and the ability to get tickets, by clicking through.
Siri’s helped me find a lot of things to do. Let’s say I’ve had a big night and I need to get home. “Take me drunk I’m home.” Siri can interpret my language as a request to get a taxi. It can call an online taxi service and send one directly to my… okay, so you get the idea. We’ve been working on this for a while.
And some of you have been following this. And it actually works. That’s not faked. That’s the real application. You talk to your phone, it’s the new way to get things done. Talk to your phone. And it’s not finding information. It’s not searching, it’s solving your problem. I need to go… I wanted to see Avatar. I wanted to see it in IMAX. It knew where I was. It knew what time it was. It knew how many shows were left the same day. It knew how to get tickets for them. It solved my problem. That’s what assistants should do. In fact, that’s what we call this. We call it a [00:20:00] virtual personal assistant.
Now, here’s a little bit of a diagram about how it works. Let’s relate it back to the themes of the perfect storm. The service ecology is on the left. Siri has dozens and dozens of APIs connected to a lot of these brand-name services. When we did the restaurant search, it’s talking to four or five sources of information: the quality of the restaurant, the location of the restaurant, a service through which I can reserve tables at the restaurants, and so on.
When you’re talking about events, several event sources are combined and integrated to help solve the problem: hey, what’s happening in San Francisco? Those services are connected at the cloud level, cloud to cloud, from the big data centers to Siri’s servers, which then connect to the phones on the right.
And since the phone’s job is basically to be a smart interface, there’s no knowledge in the phone, so we can put it onto many different kinds of phones and keep the intelligence in the cloud. I’m not going to [00:21:00] go into detail in this talk, but one of the key technologies inside Siri, the reason we can pull this off, is that we have knowledge about human life in the domain models and task models. We have models of what it means to go to a movie, to do something on the weekend, to have a meal once in a while. So we think Siri is kind of a poster child for these new big think, small screen applications. But there are a few others I’d like to share today.
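As a rough illustration of what “domain models and task models” might look like, here is a hypothetical sketch — a toy representation for this transcript, not Siri’s actual models — of a dining-out task whose parameters get filled partly from the words and partly from context like the clock and the phone’s location.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DineOutTask:
    """A hypothetical task model: the parameters an assistant needs filled
    before it can act on 'a romantic Italian place near my office'."""
    cuisine: Optional[str] = None    # from the utterance ("Italian")
    ambiance: Optional[str] = None   # from the utterance ("romantic")
    near: Optional[str] = None       # from the utterance or the phone's GPS
    when: Optional[str] = None       # from the utterance or the clock
    party_size: int = 2              # a sensible default, overridable

    def missing(self):
        """Which required slots still need filling before calling services?"""
        return [slot for slot in ("cuisine", "near", "when")
                if getattr(self, slot) is None]

task = DineOutTask(cuisine="Italian", ambiance="romantic", near="my office")
task.when = task.when or "tonight"   # filled from context (the clock), not the words
print(task.missing())                # [] -- ready to dispatch to restaurant services
```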
Let’s talk about those four things. I didn’t really go into them. I’ll now go into them in detail. What are the four kinds of things? Well, language as interface is first. And I’ll show you each one in turn.
Language as interface. It’s the way humans mostly talk, you know, face to face. Now, why haven’t computers done it until now?
Well, they’re just starting to now. So remember the example you just saw; I just said, “Where can I see Avatar in 3D IMAX?” And did you notice that you even got back the proper names, IMAX, the acronyms, the capitalization, all that stuff? It got that because Nuance, which we’re using as a [00:22:00] speech-to-text converter, is working in conjunction with the data about movies and about theater parameters.
It’s got really good speech models in there that can do the speech-to-text, and it’s doing it in the cloud. Many of us have Dragon Dictate on our computers, on our PCs. On the old PC it was kind of a struggle to do voice.
It finally got to where you can do it fine on a computer. And now we’re doing it in the cloud, even faster, because there are really bigger servers and more parallelism. And that’s available on this puny little processor on the phone because we’re not doing anything smart on the phone, just sending the voice up to the cloud.
This is not a new idea. Vlingo and Google are both doing this as well. And this is a sea change in capabilities. We’re going to see that people talk to their phones. And so the race is on to see who can make the most of that information. You saw the difference in the demo between just talking to your phone and having the phone actually understand your language.
Because speech-to-text [00:23:00] is one step of natural language understanding. But in order to make sense of “3D IMAX Avatar tonight, near me”, well, I didn’t even say “tonight, near me”, because that was implicit in the sensors, all these other pieces have to be assembled together.
Think of it as a match against the kinds of services one might offer. So the natural language understanding is actually quite robust; it’s not about parsing and grammar. It’s looking for semantics in that utterance and converting it into an operational form. You see on the right that it’s paraphrasing: “I looked for IMAX movies named Avatar that are in 3D.” That kind of stilted language is a way of paraphrasing back the parameters that the machine understood: what Chomsky would call the deep structure of the utterance. And once we have the deep structure, we can dispatch to services and get a nice, beautiful mashup of all the things we know about movies, like trailers, maps, and so on.
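Here is a toy sketch of that utterance-to-deep-structure step, using a simple keyword-and-pattern interpreter. It is not Siri’s parser; the intent names, context fields, and patterns are all invented. The point is the shape of the output: a structured intent with parameters, some of which come from the sensors rather than the words, plus a paraphrase back to the user.

```python
import re

# Toy interpreter: map an utterance to a structured "deep structure"
# (an intent plus parameters), then paraphrase it back to the user.
MOVIE_FORMATS = {"3d": "3D", "imax": "IMAX"}

def interpret(utterance, context):
    words = utterance.lower()
    if "see" in words or "movie" in words:
        title = re.search(r"see (\w+)", words)
        return {
            "intent": "find_showtimes",
            "title": title.group(1).title() if title else None,
            "formats": [label for key, label in MOVIE_FORMATS.items() if key in words],
            "near": context["location"],   # implicit: from the phone's sensors
            "when": context["time"],       # implicit: from the clock
        }
    return {"intent": "unknown"}

def paraphrase(ds):
    return (f"I looked for {' '.join(ds['formats'])} movies named {ds['title']} "
            f"near {ds['near']} {ds['when']}.")

ds = interpret("Where can I see Avatar in 3D IMAX?",
               context={"location": "San Jose", "time": "tonight"})
print(ds["intent"])    # find_showtimes -- ready to dispatch to showtime services
print(paraphrase(ds))  # I looked for 3D IMAX movies named Avatar near San Jose tonight.
```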
Now, it doesn’t always work. The thing about new technology is that it’s wrong a fair amount of [00:24:00] the time, both speech-to-text and natural language understanding. A funny example happened when we were doing some testing and one of the subjects said, “tell my husband I’ll be late.” And here’s what it said [click].
It was thinking and said… And then watch down here: Siri bugs you if you don’t hit okay after a while. It says, “not getting what you want? For best results…” Um, I don’t know, it was a joke. The subject thought it was pretty funny. So we’re going to find that to be true in general about speech-to-text. We’re going to have fun with this, because it’s wrong so much of the time, and it’s wrong in some really hilarious ways.
And not only is it wrong in speech-to-text, it’s wrong in natural language understanding, too. Because, you know, despite all the highfalutin artificial intelligence and semantic data, et cetera, et cetera, it’s still just a computer. What it’s really doing is trying to make sense from its world models of what you have to say.
So it’s kind of a Forrest Gump that way. But we’re really excited [00:25:00] about it, because it’s really the first time it’s ever been done for the mass consumer.
The next new class of applications are those that take advantage of that sensory awareness on the phone, send it out over that pipe to the server farms in the cloud, and do something really exciting with it. This used to be a really innovative, cool idea. Two years ago there was a whole conference about this that I went to: location awareness. Now you can get, you know, dozens and dozens of iPhone apps, because the SDK basically made it a commodity.
In other words, you can pick up your phone, it knows where you are, and then it can do a join with some location-indexed database to find things that are near you, around you. Now, applications like Siri do go beyond just where your phone is. You can also talk about places, like “send a taxi to my home” and so on.
That location awareness is a key ante to play the game here in intelligent systems. And this is a picture of Yelp, which is very good at this, and AroundMe and these other [00:26:00] applications. Since this wave came out, they’ve gotten even better.
I’m really excited about this Google Goggles idea.
It’s called visual search, and there are a few other examples of this. Google Goggles is available on Android now. You aim your phone at something; it takes a picture and then does its Googlish best to OCR and image-match it, to find something in the cloud database that matches what you’re pointing at.
An example that works really well: you can point it at a book cover. Google has indexed the pictures and titles of all the books it has scanned, and so it can pull the book up in its results. Now, if we’re thinking about the new ecosystem, there’s a broader play here. It’s not just that Google has this cool visual input and this cool database.
You don’t have to be Google to play this game. If you have the visual input, you can scan anything and combine it with your data. So just as an example, the Open Library at the Internet Archive: they also have a large database of books, and they also built a portal to all the books in the world.
And if you combine the visual search with the open data like this, [00:27:00] imagine the possibilities.
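A toy sketch of that broader play, with entirely invented data: any visual input reduced to a compact fingerprint can be joined against whatever open catalog you have. Here the “images” are tiny grayscale grids and the catalog is a single made-up entry, but the flow (photo, to fingerprint, to lookup) is the same idea; a real system would use a robust perceptual hash over photos and a large scanned-book index.

```python
# Toy visual search: reduce an "image" (a 4x4 grayscale grid) to an
# average-hash fingerprint and look it up in an open catalog.
def average_hash(pixels):
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)

cover_scan = [[200, 200, 30, 30],
              [200, 200, 30, 30],
              [10, 10, 220, 220],
              [10, 10, 220, 220]]
catalog = {average_hash(cover_scan): {"title": "Moby-Dick", "source": "an open book catalog"}}

# A phone photo of the same cover: different pixel values, same light/dark
# pattern, so it reduces to the same fingerprint and matches the catalog.
phone_photo = [[190, 210, 25, 40],
               [205, 195, 35, 20],
               [15, 5, 210, 230],
               [5, 20, 225, 215]]
print(catalog.get(average_hash(phone_photo), "no match"))
```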
This is also very cool. This is called Shazam. A lot of you have seen this. You don’t know what song is playing and it’s driving you crazy? With the Shazam app from the App Store, you just hold up your iPhone to the song and in seconds you’ll know who sings it and where to get it.
Pretty good, and it conveniently has a particular store that’s easy to get to. So, you know, what’s exciting about this? If Google Goggles is “look up by looking”, this is “look up by listening”, and we’re seeing a lot of potential here. Think of all the possibilities. It’s called voice fingerprinting, or sound fingerprinting.
If you can fingerprint a sound signal and compare it against your data, you can do an amazing mashup like this.
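Here is a minimal sketch of that idea (assuming NumPy is available). The fingerprint here is deliberately crude — the loudest frequency bin per frame — whereas Shazam-style systems use robust constellations of spectral peaks, but the pipeline is the same: signal, to compact fingerprint, to lookup against a catalog.

```python
import numpy as np

# Toy sound fingerprinting: slice the signal into frames, keep the loudest
# frequency bin per frame, and use that sequence of peaks as the fingerprint.
def fingerprint(signal, frame=1024):
    peaks = []
    for start in range(0, len(signal) - frame, frame):
        spectrum = np.abs(np.fft.rfft(signal[start:start + frame]))
        peaks.append(int(np.argmax(spectrum[1:]) + 1))   # skip the DC bin
    return tuple(peaks)

def tone(freq_hz, seconds, rate=8000):
    t = np.arange(int(seconds * rate)) / rate
    return np.sin(2 * np.pi * freq_hz * t)

# A hypothetical catalog of known "songs" (here, just simple tone sequences).
catalog = {
    fingerprint(np.concatenate([tone(440, 0.5), tone(600, 0.5)])): "Song A",
    fingerprint(np.concatenate([tone(330, 0.5), tone(500, 0.5)])): "Song B",
}

# A noisy "recording" of Song A still reduces to the same sequence of peaks.
heard = np.concatenate([tone(440, 0.5), tone(600, 0.5)]) + 0.05 * np.random.randn(8000)
print(catalog.get(fingerprint(heard), "no match"))   # Song A
```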
The third strain of new applications in this realm of big think and small [00:28:00] screen is problem solving. Now, this is kind of a general term; a lot of us in AI have talked about this for years and years and years. But now we’re starting to see applications where there’s a real difference between browsing and problem solving.
Browsing is a means to an end. It’s a thing you do when you’re trying to solve a problem. Problem solving is different. And when intelligent systems become more active in the problem solving, you get more: you get the result. For instance, in Siri, if you’re trying to find something to do, Siri has more of a back-and-forth dialogue.
You just said something vague, like “what’s happening”. Siri says, here are some categories of things. You drill down there, you can get some more. And you can say, change my location, change my time. It’s a dialogue and it’s a drill-down. It’s a refinement interaction process. It’s a kind of problem solving.
There’s also a notion of a linear sequence of this kind of problem solving, where you might plan a movie, plan a dinner, do the taxi, and so on, in a sequence. Well, there are other applications doing kinds of problem solving, too. One of the most exciting is Wolfram Alpha. You probably can’t read anything on the screen, but I’ll talk you through it. [00:29:00] Wolfram is one of those really great examples of big think.
It applies a just gigantic amount of computation to every user utterance, and it has huge structured data that’s been very carefully curated. It’s a fantastic piece of work. What I was doing here: I’m kind of a nutrition geek, and it’s always fun to talk about the factoids of nutrition. Do you know which has more calories, a bagel or a donut? And of course the answer is a bagel, but you know what, they’re within a statistical margin of each other. They’re basically the same amount of calories. The difference is where you get the calories from, fat or wheat and so on.
So I asked Wolfram; I just said, “bagel versus donut.” And Wolfram actually drills down into all the facts, so that if you’re actually trying to solve the problem “should I eat a donut or a bagel”, it gives you the real information: not a search-engine-optimized page to sell you bagels, but the actual structured information, and a kind of hypothesis space that you can explore. And this is what our users deserve, this kind of information.
You saw that we did similar [00:30:00] things in Siri. The trick to doing it in Siri is partly the task models and domain models, but also, in large part, the fact that we’re actually applying lots of information sources. In the logo field over on the right, those are the services that were involved in the one or two seconds it took to find a romantic restaurant near my office.
This is a dynamic mashup; it’s not hardwired that way. Here’s the query: it does the deep structure analysis, maps it to an ontology, then the ontology maps it to a set of possible services. It dynamically delegates to the services, gets the results, aggregates them back, and unifies them. That’s important because in this world, as you appreciate, the restaurant in database one may not be represented the same way as the restaurant in database two.
You have to line them all up, get the dates and times and everything else, and then give the results back. That is the modern way of doing mashups. And that’s what we’ve been having fun with, with Siri.
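Here is a minimal sketch of that aggregate-and-unify step, with invented services and data rather than Siri’s actual pipeline: two hypothetical sources are queried concurrently, their records are lined up by a crude normalized key, and the rating from one source ends up on the same entity as the reservation link from the other.

```python
import asyncio

# Two hypothetical "services" with overlapping but differently-shaped records.
async def review_service(cuisine):
    await asyncio.sleep(0.01)   # stand-in for a network call
    return [{"name": "La Pastaia", "city": "San Jose", "rating": 4.5},
            {"name": "Trattoria Nord", "city": "San Jose", "rating": 4.0}]

async def booking_service(cuisine):
    await asyncio.sleep(0.01)
    return [{"restaurant": "LA PASTAIA ", "city": "san jose",
             "reserve_url": "https://example.com/book/1"}]

def key(name, city):
    """Crude entity resolution: the same restaurant, spelled differently."""
    return (name.strip().lower(), city.strip().lower())

async def find_restaurants(cuisine):
    reviews, bookings = await asyncio.gather(review_service(cuisine),
                                             booking_service(cuisine))
    merged = {}
    for r in reviews:
        merged[key(r["name"], r["city"])] = dict(r)
    for b in bookings:   # unify: attach booking info to the matching record
        record = merged.setdefault(key(b["restaurant"], b["city"]),
                                   {"name": b["restaurant"].strip()})
        record["reserve_url"] = b["reserve_url"]
    return sorted(merged.values(), key=lambda r: -r.get("rating", 0))

print(asyncio.run(find_restaurants("Italian")))
# La Pastaia comes back once, with both its rating and a way to reserve a table.
```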
The final thread of applications, I think, is that it’s [00:31:00] really not about information retrieval anymore.
It’s about solving human problems. That sounds maybe pretentious or something, but there are some examples of this that are really exciting.
Think about, is it about finding a song, or is it about learning, hearing, discovering new music that you like? You can’t go to a database or a search engine and say, “uh, songs that I like?” It can’t possibly answer that question with a query.
But Pandora says: well, there’s a structured database, designed and built by musicologists, that indexes all the music it can find. And then there is a matching algorithm that can take examples, points in that space, and find similarity or proximity in that space.
And that’s how it works. You can give it a set of examples, it even starts with just one, and it will build you a radio station of songs like that. And the similarity metric, you don’t need to program it yourself. You just give examples and then you tune it with positive and negative examples. [00:32:00] This is an amazingly powerful service.
When that thing landed on the iPhone, it revolutionized FM radio. Radio is never going to be the same again, because now, everywhere you go, you can have the intelligence of the cloud building your stream of music for you.
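A tiny sketch of that kind of example-driven matching, with invented feature vectors rather than Pandora’s actual Music Genome data: songs are points in a hand-curated feature space, one seed song builds a taste profile, and thumbs up or down nudge the profile instead of asking the listener to program a similarity metric.

```python
import math

# Hypothetical musicologist-style features: (tempo, acoustic, vocal, distortion)
songs = {
    "Song A": (0.80, 0.10, 0.70, 0.90),
    "Song B": (0.75, 0.15, 0.65, 0.85),
    "Song C": (0.30, 0.90, 0.20, 0.05),
    "Song D": (0.35, 0.85, 0.30, 0.10),
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def station(profile, exclude=()):
    """Rank the catalog by similarity to the listener's taste profile."""
    return sorted((s for s in songs if s not in exclude),
                  key=lambda s: cosine(songs[s], profile), reverse=True)

def feedback(profile, song, liked, step=0.3):
    """Thumbs up pulls the profile toward the song; thumbs down pushes it away."""
    sign = step if liked else -step
    return tuple(p + sign * (f - p) for p, f in zip(profile, songs[song]))

profile = songs["Song A"]                          # seed the station with one example
print(station(profile, exclude={"Song A"}))        # Song B ranks first
profile = feedback(profile, "Song C", liked=True)  # a thumbs-up broadens the taste
print(station(profile, exclude={"Song A"}))
```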
Now, I’d also point out Livekick. It’s not really a radio station; it’s a way of getting tickets and other things. But what’s interesting about the way they do this is, if you upload your iTunes library, it will analyze it against all the music and artists it knows, and it’ll tell you when the artists you care about are going to play.
And of course you can go wild with the ideas. You can look at playlists, how often you play something, and so on. You can think of all the ways you might mine data about how people play music to find new opportunities to turn you on to some new music.
Siri is also going deep in personalization and personal services.
We saw this silly little example of sending you a taxi. Yes, there are services that say, send a taxi to my [00:33:00] phone, but in Siri you can also say send it to my house, to my office, to my boathouse, whatever. And you can have Siri start to learn your world. You can imagine other kinds of personalization: tell my partner something, remind me to do something, remind my partner to do something, send a message to Dag.
And it says, you mean the one in your contacts whose first name is Dag, and so on. All this kind of personalization is not brain surgery, but if you are a big think application, you have access to that personal data that’s in the cloud and you can connect the dots. Furthermore, you can start doing Pandora-like things and learning preferences, again just by looking at examples: if you have people using, discovering, finding, and learning things, you can start to generalize their preferences from that.
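A minimal sketch of that connect-the-dots personalization, with a hypothetical user profile (the places, contacts, and phone numbers are invented): personal references like “my office” or a bare first name get resolved against the user’s own data living in the cloud.

```python
# A hypothetical slice of one user's cloud-side profile (all data invented).
profile = {
    "places": {"home": "123 Elm St, Palo Alto", "office": "456 University Ave, Palo Alto"},
    "contacts": [{"name": "Dag", "phone": "+1-555-0100"},
                 {"name": "Daniela", "phone": "+1-555-0101"}],
}

def resolve_place(phrase, profile):
    """Turn 'my office' or 'my home' into a concrete address from the user's world."""
    for label, address in profile["places"].items():
        if label in phrase.lower():
            return address
    return None   # fall back to asking, or to the phone's current location

def resolve_contact(name, profile):
    """Resolve a bare first name against the user's contacts."""
    matches = [c for c in profile["contacts"] if c["name"].lower() == name.lower()]
    return matches[0] if len(matches) == 1 else matches   # ambiguity: ask a follow-up

print(resolve_place("send a taxi to my office", profile))   # 456 University Ave, Palo Alto
print(resolve_contact("Dag", profile))                      # {'name': 'Dag', 'phone': '+1-555-0100'}
```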
So finally, let me conclude by going back to those four themes with one of those Tired/Wired charts. I never tire of those; it’s just a great format. If the new way is language as interface, the old way was [00:34:00] keywords. If the new way is sensory awareness, the old way was typing: don’t type in where you are, have it know where you are. It’s not browsing, it’s problem solving. And it’s not just information retrieval anymore. It’s [unintelligible].
One final thought here, and then a springboard to Q&A. I described kind of a perfect storm, a macro situation in which these trends, which are unarguable trends, are converging on a point where we’re going to have intelligent applications. Now, we’re at the beginning of this.
We at Siri are putting our best bet on this one. We’re thinking it’s going to be a good one; we’ll see how it turns out. There’s a whole space of these. Think of it as an ecology. What will it take to thrive in this ecology? I think that one of the core competencies here is not accumulating large indexes or large data sets. That’s nice, but what really matters is connecting and combining them. Connecting and combining services, not just having a service; connecting those 1,600 APIs.
[00:35:00] Addressing human tasks. If we focus applications on the human tasks, not only will we make more satisfied users, we’ll make more money, because that’s what you get paid for. It’s all about human tasks, not information retrieval. And finally, this is sort of a personal bet for our company. We’re saying that all that fancy AI and semantic stuff, we’re aiming it like a laser at the user experience.
We’re saying it has value when it shows up in the user experience. We call that intelligence at the interface. And we think that’s a good fitness function for this new ecology as well. Okay, thank you very much. And I’m open for questions now.