Walt Mossberg on Siri ⇥ theverge.com

Walt Mossberg:

If you try and treat Siri like a truly intelligent assistant, aware of the wider world, it often fails, even though Apple presentations and its Siri website suggest otherwise. (And I’m not talking about getting your voice wrong. In my recent experience, Siri has become quite good at transcribing what I’m asking, just not at answering it.)

In recent weeks, on multiple Apple devices, Siri has been unable to tell me the names of the major party candidates for president and vice president of the United States. Or when they were debating. Or when the Emmy awards show was due to be on. Or the date of the World Series. When I asked it “What is the weather on Crete?” it gave me the weather for Crete, Illinois, a small village which — while I’m sure it’s great — isn’t what most people mean when they ask for the weather on Crete, the famous Greek island.

Mossberg isn’t alone. Earlier today, Neven Mrgan asked Siri to play music recently added to his library. No variation of that request was successful. Last week, I asked Siri what the weather would be like in Banff the next day, and it provided me with a weekly forecast, not the hourly forecast anyone would expect for that question.

These requests are not complex. Mossberg says that Apple fixed many of them in the weeks after he tweeted about them, which suggests to me that reprogramming a given query is trivial. I tested some of Mossberg’s questions about the 2016 U.S. election and found that many were answered, but not consistently or reliably.

This comes down to two key gaps in the Siri development chain. First, Apple says they update Siri every other week; I maintain that it should be updated far more frequently than that. Second, Apple told Mossberg that they don’t prioritize trivia:

It puts much less emphasis on what it calls “long tail” questions, like the ones I’ve cited above, which in some cases, Apple says, number in only the hundreds each day.

My hunch is that questions like these are asked less frequently not because people don’t want to ask them, but because users have tried and Siri failed to answer. Over time, users teach themselves that Siri simply isn’t good for this kind of information, and they stop trying.

Update: John Gruber:

These sort of glaring inconsistencies are almost as bad as universal failures. The big problem Apple faces with Siri is that when people encounter these problems, they stop trying. It feels like you’re wasting your time, and makes you feel silly or even foolish for having tried. I worry that even if Apple improves Siri significantly, people will never know it because they won’t bother trying because they were burned so many times before. In addition to the engineering hurdles to actually make Siri much better, Apple also has to overcome a “boy who cried wolf” credibility problem.

Entirely agreed, with one minor exception: I think the inconsistencies are worse than outright failure. The inability to answer a query implies a limitation which, while not ideal, is understandable. Inconsistency, on the other hand, makes Siri feel untrustworthy. If I can’t expect the same result from basic queries that are almost identical, I’m much less likely to rely on it.