Siri’s Complicated and Fraught Life So Far ⇥ wsj.com

Tripp Mickle reports for the Wall Street Journal (work around the paywall via Twitter) on Siri’s stumbles within Apple:

Siri’s capabilities have lagged behind those of rivals elsewhere, as well. In tests across 5,000 different questions, it answered accurately 62% of the time, lagging the roughly 90% accuracy rate of Google Assistant and Amazon’s Alexa, according to Stone Temple, a digital marketing firm.

A separate study by Loup Ventures, a market-research firm, shows Siri performs better than rivals on core iPhone functions, so-called command-related queries — making calendar appointments, placing phone calls, sending text messages — but doesn’t do as well answering questions accurately from the web.

Apple has tried to close the gap through acquisitions. In 2015, it purchased VocalIQ, a Cambridge, England-based startup that designed a system to improve a virtual assistant’s conversational ability.

It’s not Siri’s inability to process complex conversational queries that worries me; it’s Siri’s lack of rudimentary contextual understanding. A simple example: Tuesday night, at about 11:00, I asked Siri on my Watch “is it going to rain tomorrow?”; Siri responded by displaying a ten-day forecast. This is wrong for two reasons:

1. My query was binary, and displaying a forecast does not answer it. Asking Siri the same thing on my iPhone produced a direct answer, so I would expect a yes or no in the more time-constrained context of the Watch.
2. If I’m asking what the weather will be like “tomorrow”, it makes far more sense to show me the hourly forecast.

My second objection is, I admit, subjective — a couple of people replied to my tweet asking why the hourly forecast would make sense if no rain is expected. But I think the use of “tomorrow” should supersede that and show me a more fine-grained forecast.¹

My first objection, though, seems entirely obvious to me: I’m asking a question, and it should provide an answer. I think it’s fair to limit that expectation to avoid Google’s “one true answer” problem, but this is a question already answered on the iPhone in plain terms.

This is just one example; I’m sure you can think of your own instances of baffling inconsistencies and total disobedience. My experiences with Siri over the years have been mixed, and it’s stuff like this that drives me up the wall. I would love it if Siri could start understanding more complex and nuanced questions; I can’t understand why, nearly six years later, it fails to do the right thing with the most basic kinds of queries.

¹ I also think that the hourly forecast should begin at the time I’m usually awake instead of midnight. Siri knows what time my alarm is set for, and I use a sleep tracking app that feeds into HealthKit, so it has more than enough entirely local information to be able to figure that out.