Wednesday, March 28, 2012

SIRI usage, and the future of voice interfaces

Business Insider just published an interesting tidbit on SIRI usage and how it's 'low'.

Given SIRI's limited scope and capabilities, I think those numbers are amazingly high, and they are a clear indicator that people want this technology. Today SIRI is the digital equivalent of a lobotomized assistant who can perform only the simplest of tasks.

My personal favorite example of how SIRI falls short is the statement "SIRI, call me an ambulance" -- if you tell her to do that, she'll respond with "Okay, from now on I'll call you /an ambulance/".

Apple needs to let developers extend the SIRI interface with our own capabilities. That's when the real power of voice-enabled phones will be unleashed. We can only assume this is coming eventually, but nothing has been announced yet. Prior to launch, Apple had Nuance (the company reportedly behind SIRI's speech recognition) remove features like the ability to update Foursquare and Facebook.

There are lots of things that future versions of SIRI will likely have. Besides extensibility, people should expect to see new capabilities such as:
  • Personalization: expect SIRI to learn your voice, and to be able to manage more than just secure access to your phone. That also means being able to understand your commands while the radio in the car is on.
  • Initiate a conversation/provide assistance - examples include:
    siri: "you seem to be lost, perhaps I can help."
    siri: "it looks like you're running late to your important business meeting, should I send an email for you to let them know when to expect you?"
    siri: "i just checked and your flight has been delayed, want me to search yelp and find a place to grab some coffee?"
  • Maintain context in a conversation: currently siri treats one statement as one command; it cannot remember context in a conversation. For example:
    me: "where can i get a healthy snack?"
    siri: "how about subway."
    me: "no i don't really feel like a sandwich, how about a smoothie?"
    Today that second statement would confuse siri: once she's provided an answer, she's done. Context can also involve personalization, for example me: "dammit siri, i hate subway" - siri: "got it boss, no more subway". Context is important here too, because obviously i meant subway the restaurant chain, not the mass transit system.
  • Hold/queue information until it's appropriate:
    siri: "you seem to be stopped at a red light, you got a text from xyz that says ..."
    me: "i'm going to the gym for an hour, greet callers, ask them if it's an emergency otherwise tell them i'll call them back"
  • Interact with other siri agents: this will be extremely useful, for example letting my friends and family update my social calendar or activities automatically. Think of a shared grocery/shopping list across multiple members of a family, or letting certain people proxy my personal calendar and my list of likes/dislikes. For example - "see if matt wants to get together for a drink on either wed or thurs night" - if Matt is also a siri user, the exact date and location can be

At this point SIRI is out in front, but it will need to innovate to become really useful. The computing power is there, and there are certainly enough people (myself included) who would be more than happy to pay extra for the services listed above.

Even if SIRI doesn't get extensibility, an equivalent Android app will eventually be extensible by native apps on the device, and that's when the really interesting applications can start being created.

On a personal note - I can't wait to voice-enable shopping (and purchasing) by extending SIRI (or an equivalent application). It's unlikely that one vendor (e.g. Nuance) will be able to deliver every capability; instead, it should focus on letting apps register to handle certain types of statements.
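
To make the "apps register to handle certain types of statements" idea a bit more concrete, here is a rough sketch in Python. Everything in it is hypothetical: the IntentRegistry class, the register decorator, the two example apps and the regex patterns are invented for illustration, and none of it reflects an actual SIRI or Nuance API. It also assumes the speech has already been turned into text.

```python
import re


class IntentRegistry:
    """Hypothetical registry where apps claim the statements they can handle."""

    def __init__(self):
        # List of (compiled pattern, handler) pairs, checked in registration order.
        self._handlers = []

    def register(self, pattern):
        """Decorator: route statements matching `pattern` to the decorated function."""
        compiled = re.compile(pattern, re.IGNORECASE)

        def decorator(handler):
            self._handlers.append((compiled, handler))
            return handler

        return decorator

    def dispatch(self, statement):
        """Send a transcribed statement to the first app whose pattern matches."""
        for compiled, handler in self._handlers:
            match = compiled.search(statement)
            if match:
                # Named groups in the pattern become keyword arguments.
                return handler(**match.groupdict())
        return None  # no registered app understands this statement


registry = IntentRegistry()


@registry.register(r"\bbuy (?P<quantity>\d+) (?P<item>.+)")
def shopping_app(quantity, item):
    # A made-up shopping app that handles purchase requests.
    return f"shopping app: ordering {int(quantity)} x {item}"


@registry.register(r"\bcheck in at (?P<place>.+)")
def checkin_app(place):
    # A made-up check-in app, in the spirit of the Foursquare example.
    return f"check-in app: checked in at {place}"


if __name__ == "__main__":
    print(registry.dispatch("Buy 2 smoothies"))
    print(registry.dispatch("please check in at the gym"))
    print(registry.dispatch("what's the weather"))
```

The point of the design is that the voice vendor only owns the recognition and the routing; each app owns the meaning of its own statements. A real system would need something smarter than regular expressions to match statements, plus a way to resolve conflicts when two apps claim the same phrase.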
