Kirk Klasson

AI: The Next Cambrian Explosion

Will open-sourced AI prove productive or problematic?

Science generally agrees that life here on the rock began about 3.8B years ago, loitering on the edges of Gondwana, trying to figure out how to exchange genetic material for fun and profit. Then, some 3.2 or 3.3B years later, something happened. Nobody yet agrees on exactly what, but in the short span of tens of millions of years life exploded into nearly all the modern phyla we see today. Behold! The Cambrian Explosion. Now imagine for a moment that all the data and code sloshing around the planet is little more than the genetic soup washing up on Gondwana, and that Artificial Intelligence (AI) becomes the trigger that ignites the next explosion. You might want to sit down for this one.

Invisible Ubiquity

In previous posts we discussed the notion and importance of a confluence of events, a serendipitous gathering of technological, cultural, economic and social moments that propels the world in a new direction (see Prophets on the SILK Road – October 2012 and Hacking the Future 2.0 – January 2015). With the burgeoning practical application of AI we seem to be on the cusp of such a moment. If you live in the developed world, you have probably already encountered AI, but it was so deeply embedded in the encounter that you likely didn’t even notice. Say, for instance, a credit card company believes it has detected fraudulent use of your card. It autodials your mobile phone, where a pleasant-sounding female voice asks you a few simple questions to ensure your identity hasn’t been compromised and then offers you a new service from one of its partners whose office just happens to be three blocks from where you are standing. You have just experienced a boatload of AI, and you may be a bit annoyed at the intrusion, but you are certainly no worse off for having entertained it. These days fraud detection, speech synthesis, natural language processing, conversational queries, device/user concatenation, and offer identification and promotion are all based on various forms of AI.

The Economics of Innovation

A few posts back we examined the impact that open source might have on a phenomenon called the economy of innovation (see Requiem for a Business Model – January 2011). The hypothesis of that post, originally put forward by C. Gordon Bell, was that special-purpose architectures were not as economically viable as general-purpose architectures: both required relatively expensive and complex business ecosystems to succeed, and special-purpose instances could not threaten enough opportunities to afford to build sustainable ecosystems. Open source changed these circumstances by causing code-based solutions to coalesce into a common source base supported by a costless community or ecosystem. That gave open source solutions the opportunity to go after new categories of products and new markets that entrenched providers could not afford to tackle. Android is a case in point: a low-cost/no-cost foundation that spurs innovation in a host of new smart devices that enterprise technology players couldn’t bring to market with their existing operating and economic models (see The Ghost in the Machine – November 2011).

A similar set of circumstances seems to be emerging in the AI space. Over the last six months most of the big players have open-sourced their primary AI platforms or significant portions of their AI capabilities. Google released TensorFlow, Microsoft open-sourced its Computational Network Toolkit, Amazon did the same with its machine learning engine, Facebook released several key components of the Torch project and IBM turned its SystemML code over to Apache. All of this was in addition to more established AI initiatives such as DARPA’s DeepDive, Apache UIMA, OpenCog and the Open Advancement of Question Answering (OAQA). So here are at least nine established platforms vying to be the primary focus of future AI development. But wait, there’s more.

A few years ago Linas Vepstas, a research engineer at Hanson Robotics, pulled together a taxonomy of the AI landscape at the time. He identified 12 distinct categories covering 106 active projects, most of them open source, including 22 natural language processing projects alone. The sheer number of projects and their relative degree of specialization suggest that AI will begin to undergo the same transformation predicted by the economy of innovation: coalescence toward a more general-purpose foundation along with an increased ability to threaten more opportunities at much lower cost. All the ingredients required for an explosion.

How to Train Your Dragon

Much of the hype and promise of AI currently rests in distilling enormous amounts of data to produce specific, verifiable and reliable assertions and insights. The data involved could be transactional, visual, aural or sensory; the list is almost endless and the data almost infinite, up to and including the AI code itself. Reliable assertions and verifiable insights are a bit harder to come by; hence the current push for automatic, machine-based learning. By looking at a million photographic instances a machine might assert that it is looking at a cat. Several million instances more and it might conclude that cats like mice. The nature of the problem and the techniques employed to solve it require enormous amounts of data storage and processing power and therefore currently lend themselves to proprietary, cloud-like architectures. At first glance this seems like good news for Microsoft, Amazon, Google and IBM, who just happen to own huge data centers. Host your transaction-based web site in their clouds, let them vacuum up all that data, let loose a bunch of recurrent neural networks and voila! you’ll soon discover that, in addition to mice, cats like catnip. But as with any network-based solution, the economics surrounding the logistics of data and processing might seek out differently configured, distributed solutions.
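To make the “million photographic instances” idea concrete, here is a deliberately tiny, hypothetical sketch of supervised learning: a one-hidden-layer neural network trained by gradient descent on synthetic “cat vs. not cat” feature vectors. Nothing in it comes from any vendor’s platform; real systems use convolutional networks over millions of actual images and vastly more compute, but the basic mechanics of “look at labelled examples, adjust weights, assert a label” are the same.

```python
# Minimal illustrative sketch (not any vendor's system): a classifier that
# "asserts it is looking at a cat" after seeing many labelled examples.
# The data here is synthetic; a real pipeline would use convolutional
# networks over millions of images and far more compute.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for image feature vectors: label 1 = "cat", 0 = "not cat".
X = rng.normal(size=(1000, 64))
y = (X[:, :8].sum(axis=1) > 0).astype(float).reshape(-1, 1)

# One hidden layer, sigmoid output.
W1 = rng.normal(scale=0.1, size=(64, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.1, size=(16, 1));  b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.1
for epoch in range(200):
    # Forward pass: score every example.
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)

    # Backward pass for cross-entropy loss.
    grad_out = (p - y) / len(X)
    grad_W2 = h.T @ grad_out
    grad_b2 = grad_out.sum(axis=0)
    grad_h = grad_out @ W2.T * (1 - h ** 2)
    grad_W1 = X.T @ grad_h
    grad_b1 = grad_h.sum(axis=0)

    # Gradient-descent update: each pass nudges the weights so "cat"
    # examples score closer to 1 and everything else closer to 0.
    W2 -= lr * grad_W2; b2 -= lr * grad_b2
    W1 -= lr * grad_W1; b1 -= lr * grad_b1

accuracy = ((p > 0.5) == y).mean()
print(f"training accuracy after 200 passes: {accuracy:.2f}")
```

Scale this toy loop up by many orders of magnitude in data, parameters and passes and the appeal of proprietary, cloud-like architectures becomes obvious; that is precisely the economic leverage the big data-center owners are counting on.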

Sometime in mid-2015 Siri began processing more than 1B requests per week, and it’s not unimaginable that it could pass 1B requests per day by 2017. There are lots of ways one could architect a solution that processes these requests. Siri’s own architecture does a significant amount of processing on the device to determine exactly how to satisfy the request. It does additional processing in the cloud to establish what resources might need to be invoked to satisfy requests that cannot be serviced by the device independently and locally (see Prophets on the SILK Road – October 2012). When it was originally introduced, the only way to make Siri successful was as an end-to-end solution; configurable, commercial, Lego-like components weren’t generally available. Since Siri’s introduction, however, voice-integration APIs have been normalized and are currently being provisioned in a host of applications. At the same time, advances in speech recognition have allowed for the separation of the device from the request and the request from the service provider(s) who ultimately handle it. Recently, products from SoundHound and MindMeld intimate that Siri’s value proposition is about to be aggressively unbundled and parsed into several robust constituent elements. MindMeld envisions on-premises, enterprise-based, voice-provisioned applications operating over semantically unique proprietary collections of data. Meanwhile, MIT just announced a neural network processor called Eyeriss that could easily be embedded in smart devices, obviating the need for back-end server processing of voice requests.
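The split described above — decide on the device, reach for the cloud only when you must — can be sketched in a few lines. The intent names and routing rule below are invented purely for illustration; this is not Apple’s implementation or anyone’s published API.

```python
# Hypothetical sketch of a device-vs-cloud split for voice requests.
# All names here are assumptions made up for illustration.
from dataclasses import dataclass

# Intents the device can satisfy without leaving the handset.
LOCAL_INTENTS = {"set_alarm", "play_music", "open_app"}

@dataclass
class Request:
    intent: str        # produced by on-device speech recognition / NLU
    payload: dict      # slots extracted from the utterance

def handle_locally(req: Request) -> str:
    return f"handled on device: {req.intent} {req.payload}"

def dispatch_to_cloud(req: Request) -> str:
    # In a real system this would call out to whichever back-end provider
    # (weather, maps, knowledge base) is registered for the intent.
    return f"routed to cloud provider for: {req.intent}"

def route(req: Request) -> str:
    """Decide on-device vs. back-end handling for a recognized request."""
    if req.intent in LOCAL_INTENTS:
        return handle_locally(req)
    return dispatch_to_cloud(req)

print(route(Request("set_alarm", {"time": "7:00"})))
print(route(Request("restaurant_search", {"cuisine": "thai"})))
```

Once speech recognition lives on the device (the kind of thing an embedded processor like Eyeriss points toward), the interesting question becomes who sits on the other end of that `dispatch_to_cloud` call, and whether it needs to be a single proprietary provider at all.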

Taken together, this suggests that AI-like capabilities will be parsed and distributed throughout the internet rather than necessarily handled by a handful of proprietary cloud providers. One could easily envision an embedded, network-based AI capability that handles requests on a highly distributed basis, where cooperative peer-to-peer brokers operating from the edge of the network bid on and satisfy requests specific to various providers’ domain expertise (a minimal sketch follows below). That raises the question: why would a knowledge-based business turn over all of its data and domain expertise to Watson or Google or Amazon or Microsoft, rewarding them as gatekeepers and toll takers, when the value added might be marginal at best? (see Digital Strategies…Part I – February, 2016)
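Here is a hypothetical sketch of that edge-broker pattern. The broker names, keyword matching and award rule are all invented simply to make the architecture concrete; a real implementation would involve discovery, trust and settlement mechanisms well beyond this.

```python
# Illustrative-only sketch: peer brokers at the network edge score an
# incoming request against their own domain expertise, bid, and the
# best bid wins. Names and scoring are assumptions, not a real protocol.
from dataclasses import dataclass

@dataclass
class Bid:
    broker: str
    confidence: float  # how well the broker's domain matches the request
    price: float       # what it would charge to satisfy it

class DomainBroker:
    def __init__(self, name, keywords, price):
        self.name, self.keywords, self.price = name, set(keywords), price

    def bid(self, request: str) -> Bid:
        words = set(request.lower().split())
        overlap = len(words & self.keywords) / max(len(self.keywords), 1)
        return Bid(self.name, confidence=overlap, price=self.price)

def award(request: str, brokers) -> Bid:
    """Collect bids from peer brokers; award to the most confident, cheapest one."""
    bids = [b.bid(request) for b in brokers]
    return max(bids, key=lambda b: (b.confidence, -b.price))

brokers = [
    DomainBroker("radiology-peer", {"scan", "mri", "radiology"}, price=2.0),
    DomainBroker("legal-peer", {"contract", "clause", "liability"}, price=1.5),
]
print(award("flag risky liability clause in this contract", brokers))
```

The point of the sketch is simply that domain expertise stays with the party that owns it; the network coordinates the match, and no central gatekeeper has to aggregate everyone’s data to extract a toll.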

High Stakes, Big Players 

IBM has postulated that Watson represents a $20B-per-year revenue opportunity within the next couple of years, although it makes a point of saying it has no intention of breaking out Watson-specific revenues. Google, Amazon and Microsoft have made similar noises when it comes to the prospects of the proprietary commercialization of AI. One has to wonder whether that will actually materialize if open source manages to reorganize the AI industry into freely licensed, configurable, distributed, networked components. If the bulk of Watson’s, or for that matter Amazon’s, AI revenue is simply additional hardware from hosting data for use by customer-owned neural networks, then it becomes a yawn. If, on the other hand, IBM can structure AI revenues as a kind of intellectual property, where insights garnered are monetized on the basis of royalties or licensed revenues, things might get a bit more interesting. However, if IBM is producing those insights using open-sourced AI software, chances are customers won’t entertain that revenue model for very long.

A while back IBM released Bluemix, its Watson-based developer platform. By signing on, developers gain access to a robust set of IBM’s tools and can use them to create unique instances of AI code to run against data stored in IBM’s cloud offering. The code so created doesn’t leave IBM’s facilities, so one has to guess that the revenue model is to increase the customers’ hosted-data footprint. It seems kind of short-sighted and cynical to go to all that trouble just to enlarge PaaS revenues.

Within a generation it is increasingly likely that AI will infuse every aspect of modern life, from the moment we get up until we go to sleep. As today, there will be instances where we won’t even know that we have encountered it, but increasingly we will be conscious of its place in our lives. For the moment, entrusting significant AI code to open source distribution and community management is a big bet, one that could exponentially increase commercialization and adoption, but only if more than a few big players get to enjoy it. Either way, an explosion is about to ensue.

Graphic courtesy of the Alpha Omega Institute

