By Larry Velez, Sinu co-founder and CTO
IBM's Watson supercomputer chats with researchers and other professionals about natural solutions to big problems (Photo Credit: Georgia Tech as reported in Gizmag article, "IBM's Watson gets chatty to act as a sounding board," 11/13/15)
Seamless transitions from connectivity to no connectivity are the key to the future of software. Siri, Google Now and Amazon Alexa – all audio-based interfaces, by the way – are possible today because the software is connected to a very powerful and smart Artificial Intelligence (AI) cloud from these companies. Consumers can tap into that always-on intelligence directly and through third-party apps that harness its power.
One such app made me realize the value of being seamlessly connected to audio. I recently began using the NPR One app, which seems to cache data so that I can keep listening to content even when I’m on the subway.
I have long thought that cached data is an important part of any software, especially for an app, so a person can continue to access content without worrying about being connected or not; no more waiting to access that service or data. In some ways, it creates more time – a commodity people report they are beginning to value even more than money. And, unlike money, time (we only have 24 hours in a day) is inherently scarce. So if we cannot buy more time, we can buy devices, apps and services that save us time and/or make us feel like we have more time.
Cached data is data stored on a device either so that future requests for it can be served faster (as with stored passwords) or so that the device holds a duplicate of data stored elsewhere (as with NPR One, which let me listen to audio content offline). While apps running on cached data are convenient, they can take up quite a bit of memory on mobile devices, and the cache needs to be cleared regularly. The real future lies in the development of apps that can run locally on devices that have limited storage space.
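For readers who like to see the idea in code, here is a minimal Python sketch of a cache that serves stored copies first, keeps working when the network is gone, and clears out its oldest entries to stay within a storage budget. It is an illustration under my own assumptions, not how NPR One or any other app actually works: the class name `OfflineCache`, the `fetch` callback and the `max_bytes` budget are all hypothetical.

```python
from collections import OrderedDict
from typing import Callable, Optional


class OfflineCache:
    """Size-bounded, cache-first store (illustrative sketch only)."""

    def __init__(self, fetch: Callable[[str], bytes], max_bytes: int) -> None:
        # `fetch` is a hypothetical network call supplied by the caller;
        # it is assumed to raise ConnectionError when the device is offline.
        self._fetch = fetch
        self._max_bytes = max_bytes
        self._items: "OrderedDict[str, bytes]" = OrderedDict()
        self._size = 0

    def get(self, key: str) -> Optional[bytes]:
        """Return content for `key`, or None if it is neither cached nor reachable."""
        if key in self._items:
            # Cache hit: no network needed, and mark the entry as recently used.
            self._items.move_to_end(key)
            return self._items[key]
        try:
            data = self._fetch(key)
        except ConnectionError:
            # Offline and nothing stored locally for this key.
            return None
        self._store(key, data)
        return data

    def _store(self, key: str, data: bytes) -> None:
        if key in self._items:
            self._size -= len(self._items.pop(key))
        self._items[key] = data
        self._size += len(data)
        # Evict least recently used entries until we are back under budget.
        while self._size > self._max_bytes and self._items:
            _, evicted = self._items.popitem(last=False)
            self._size -= len(evicted)
```

An app might wrap its download function in `OfflineCache(download_episode, max_bytes=50_000_000)`, where `download_episode` is again a made-up name. The eviction loop stands in for the "clear it regularly" chore: the cache trims itself instead of growing until the user has to step in.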
Google is currently developing such an app. According to ZDNet, “the company has developed a speech-recognition system that's small enough to run ‘faster than real time’ on a Nexus 5 without an internet connection.”
The new system, which was developed for smartphone dictation and voice commands, promises to “overcome the need for a reliable network connection to use speech recognition on a smartphone, smartwatch or any other memory-constrained gadget.”
Another company, IBM, introduced its first voice-driven native mobile apps with IBM Watson and IBM MobileFirst last summer. The apps are available for both iOS and Android, and are designed to simplify and streamline integration with speech-to-text and text-to-speech services.
This is just the beginning. Much like the push to add mobile versions of software a few years back, I believe we will see a big push toward software and app development that leads with audio and speech recognition. Why? The human race has spent tens of thousands of years fine-tuning the ability to communicate in two ways: audio and facial gestures. Huge parts of our brain are dedicated to these two communication methods, so it seems logical that technology will work best for humans when it communicates the way we have evolved to communicate. Adding audio and speech recognition interfaces (APIs) to software also allows us to complete complex actions seamlessly with minimal interaction with the device, something we humans will likely prefer over the tedious, time-consuming and often clumsy process of typing, touching and swiping. One day, we will look back and wonder why we even used these archaic devices called keyboards!