AI: Crossing the Chasm One Clock Radio at a Time
Things set in motion acquire inertia and with luck inertia acquires purpose and with purpose accomplishments follow.
At least that’s the theory.
I was reminded of this the other day while reconfiguring the settings and linkage between Alexa and my smart home hub for the umpteenth time. This usually happens after a change of seasons, software updates, power outages and coincidental nuances in the prevailing lunar cycle. Or, put another way, all the freaking time!
I remember when we first brought Alexa home. It was like bringing home a pet, a new member of the family, prominently placed to afford it the best earshot, in the beating heart of our home, the kitchen. Then we integrated it with our other devices, lights, sensors and media. We even introduced it to our friends and immediately proceeded to play stump the band.
“Alexa, play Sons of Champlin, ‘1982-A’.” No? Can’t find it? Oh, well.
We then proceeded to buy a bunch of Dots and Taps and put them everywhere. And after a couple of years and several unprovoked, or rather “un-invoked”, Tourette’s-like outbursts, numerous Siri interruptions and the occasional radio-commercial-induced spoof, we discovered something quite unexpected. The only time we used Alexa was to set a wake-up alarm followed by flipping it over to music.
But we were by no means the first to discover this.
In fact, a hack for the clock radio “value proposition” on Echo devices, if that’s what you deem it to be, first appeared in 2015 and was so popular that Amazon codified the functionality in its December 2017 release. But if this value proposition occupies the pole position in the power curve of Alexa skills, and it appears that it does, it raises a much bigger question.
A billion here, a billion there, pretty soon you’re talking real money
By several analyst estimates, Amazon has so far spent at least a billion dollars in pursuing the Alexa dream, an investment predicated on the notion that “occupancy” in the home will provide an unprecedented privileged position when it comes to capturing subsequent consumer transactions. The idea that “occupancy” commands value has been around for a long time, especially in tech circles. Once a technology is placed in service, it accrues a number of immediate and potential revenue-bearing opportunities. Software and hardware need to be maintained. Platforms accumulate value by attracting partners through lower integration costs. Adjacent capabilities can be easily monetized and sold. And suddenly buyers are faced with circumstances they previously hadn’t considered: opportunity, sunk, switching and exit costs. Decisions have numerous trailing costs and consequences, most of which don’t factor in until sometime after “occupancy” commences.
For instance, at what price point would a consumer consider replacing Alexa with Google Home or Apple HomePod? Well, that might in part depend upon the expected value of occupancy. If, as some analysts have suggested, the follow-on value of consumer voice-initiated transactions is somewhere in the neighborhood of $400 per household per year, voice occupancy vendors should be willing to give the technology away merely to capture customers. But even if they did, would that actually be an attractive offer for your average consumer? That depends. If the consumer is going to spend the next couple of days reconfiguring the settings on their smart home hub, probably not.
Recent estimates have put the value of consumer voice-initiated commerce at $40b by 2022 for the US alone. If this proves to be the case, then Amazon’s current household occupancy of nearly 70% of all voice deployments more than justifies its current rate of Alexa investment. It also justifies all the other investments by all the other technology players who plan to get a piece of this action. But voice-based transactions won’t necessarily be tethered to the home or home-deployed devices. Over the next several years voice-initiated commerce will be integrated into almost every device and application imaginable. So the question becomes: what kind of voice transactions will be initiated by what kind of consumers over what kind of devices?
Here things begin to get a little more fuzzy.
The Information recently reported that, according to sources familiar with Amazon, only 2% of Alexa users had made a voice-initiated purchase in 2018, and of those who had made such a transaction, 90% did not make a subsequent one. This led VentureBeat to opine that Alexa is little more than a Dash Button with speakers, which supports the idea that, so far, voice-initiated transactions are essentially delegated purchasing decisions, ones that consumers don’t consider to be significant enough to merit deeper engagement. So we’re talking detergent and dog food and not much else. If this becomes ingrained consumer behavior, instead of capturing $40b by 2022 we’re looking at a substantially smaller number.
At the same time, it appears that the creation of new Alexa skills, due to weakening monetization, is beginning to tap out. Thanks to $100m in incentive funding, Amazon has managed to accumulate about 30,000 skills on its Alexa platform, far in excess of Google’s mere 1,800. But anecdotally, the skills play appears to be exhausted, and if you looked at the power law by invocation you would more than likely find that the most popular skill is the one that was hacked back in 2015, the Alexa clock radio.
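To put those two percentages side by side, here is a back-of-the-envelope sketch in Python. It uses only the 2% and 90% figures quoted above; the cohort of 1,000 users is a hypothetical number chosen purely to make the result easier to picture, and the function name is made up for this illustration.

```python
from fractions import Fraction

# Figures quoted above, as reported by The Information.
purchase_rate = Fraction(2, 100)    # share of Alexa users who made a voice purchase
no_repeat_rate = Fraction(90, 100)  # share of those buyers who never bought again


def repeat_buyer_share(purchase: Fraction, no_repeat: Fraction) -> Fraction:
    """Share of ALL users who went on to make a second voice purchase."""
    return purchase * (1 - no_repeat)


share = repeat_buyer_share(purchase_rate, no_repeat_rate)

# Hypothetical cohort size, for illustration only (not from any report).
users = 1000
buyers = int(purchase_rate * users)
repeat_buyers = int(share * users)

print(f"Repeat buyers: {float(share):.1%} of all users")
print(f"Per {users} users: {buyers} buyers, {repeat_buyers} repeat buyers")
```

In other words, if both reported figures hold, roughly one user in five hundred came back for a second voice purchase.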
Don’t Look Down…
A recent spate of announcements would seem to indicate that, when it comes to AI, the tech industry has recognized it is entering a new and somewhat delicate market phase. The early adopters, those with big money, big skills and big opportunities to pursue, the IBMs, Amazons, and Googles of the world, have shifted their focus to monetizing the logistics of merchant-based, enterprise-driven AI adoption. Look, nobody said this was going to be easy, so why not make a bunch of money getting all these non-tech pioneers over to the promised land? However, if the only way the big early adopters can make money in AI is in hosting services for large data sets and processing fees on proprietary deep learning architectures, then it’s going to be a long, tough slog to profitability (see AI: The Next Cambrian Explosion – February 2016).
Google’s recent release of its AutoML suite seems to be a clear reflection of AI’s current circumstances. AutoML provides templatized learning facilities that can accelerate the acquisition of Machine Learning based artifacts for deployment in image recognition and natural language processing applications. Likewise, firms like Labelbox have created models that accelerate supervised learning over unique data sets, and DataRobot asserts that it can provide similar capabilities by “fitting” data profiles to the most effective analytic algorithms. Nvidia and NetApp jointly announced ONTAP AI, a combination of capabilities to automate the data pipeline for AI deployments. And Amazon recently announced its Kinesis Streaming, a service focused on eliminating the need to collect and store large amounts of data prior to submitting it to complex analysis. All of these represent instances of automating Machine Learning logistics to speed adoption and accelerate time to value, which is the chasm that AI currently needs to cross.
Editor’s Note: Crossing the Chasm was a concept first popularized by Geoffrey Moore in the early ’90s. It was only after publishing this post that it dawned on me that some readers may not be familiar with this concept.
Back in 2016, in a post entitled “Platform Strategies in the Age of AI”, we suggested that Natural Language Processing would likely be the “thin edge of the wedge” which established tech companies would use to engage an ever-widening pool of prospects, both of the consumer and business variety. And the race to dominate the smart speaker market would seem to be bearing this out. At this point, the enterprise market for voice-based interfaces has barely been touched, and one of the impediments to its adoption is Machine Language Understanding, the ability of the interface to handle the ontological challenges unique to every knowledge domain and business environment (see The Dawn of Agency – May 2018). So far, not even the early adopters, with all their skills and money, have ventured to install business-based voice user interfaces in their very own offices. However, the technology required to build it is out there. Cruise on over to GitHub and check out vui-ad-hoc-alexa-recognizer, a tantalizing skill and potential killer app, the original Heathkit of business voice interfaces.
Which brings us back to the primary question that Amazon’s Alexa raises. What if, after consuming a billion dollars or more in technological prowess, Alexa’s primary value proposition IS the reinvention of the clock radio? And that’s it. Not that this would bother Amazon much. A billion dollars is just walking around money. However, by this point, most global enterprises are eyeing a list of AI projects as long as your arm and none of them represent a $40b potential opportunity in 2022. Not even close. And if, on average, it’s going to cost a billion dollars to get one of these apps out of the garage and onto the road, the intended value proposition had better have a greater than 2% probability of realization.
It could well be that for many AI applications there is no shortcut when it comes to time to value. It’s the nature of the beast. Stochastic convergence can’t be rushed; it needs to eat, and you need to feed the beast if it’s ever going to produce even modest amounts of value, let alone earth-shattering insights. Just ask IBM (see AI’s Inconvenient Truth – July 2018). Even after years of refinement, voice user interfaces are only 95% accurate, and most practitioners would tell you that they will never be 100% reliable.
So one of two things needs to happen. Either the expectations of all these AI pioneers need to be tamped down a bit, or AI needs to find its inner Evel Knievel and a brand new flux capacitor.
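For anyone who wants to see why 2% doesn’t cut it, here is a deliberately crude sketch in Python. It treats an AI project as a single all-or-nothing bet, which is an assumption made here for illustration and not a real valuation model. The $1b cost and the 2% probability come from the paragraph above; the $40b payoff is borrowed as a generous upper bound, since the text says most enterprise projects come nowhere near it. The function names are made up for this example.

```python
from fractions import Fraction

cost = 1_000_000_000          # $1b to get one app "out of the garage"
payoff = 40_000_000_000       # $40b, a generous upper bound (assumption)
p_success = Fraction(2, 100)  # 2% probability of realization


def expected_payoff(probability: Fraction, payoff: int) -> Fraction:
    """Expected payoff of a single all-or-nothing bet."""
    return probability * payoff


def break_even_probability(cost: int, payoff: int) -> Fraction:
    """Probability at which the expected payoff just covers the cost."""
    return Fraction(cost, payoff)


expected = expected_payoff(p_success, payoff)
needed = break_even_probability(cost, payoff)

print(f"Expected payoff at 2%: ${float(expected) / 1e9:.1f}b against a ${cost / 1e9:.0f}b cost")
print(f"Break-even probability: {float(needed):.1%}")
```

Under these assumptions, even a $40b prize leaves the bet underwater at 2%, and any smaller prize pushes the required probability higher still.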
For a more recent take on the issues behind these topics please go to “AI’s Inconvenient Truth” in the Epilogues tab.
Cover graphic courtesy of Tested.com; all other images, statistics, illustrations and citations, etc. derived and included under fair use/royalty free provisions.