The Ultimate Smartphone and PC Killer
Are you also disappointed by all the recent product launches? The technical differences between $250 smartphones and $900 smartphones are only marginal. There are plenty of smartphones for less than $400 that have super fast octa-core processors, 6GB of RAM, fingerprint sensors, and high-definition displays.
So what really sets the brand-new flagship phones from Samsung, Apple, Google, and other brands apart? The price premium you pay for an iPhone 7 Plus, a Google Pixel, or a Samsung Galaxy S7 Edge is often ridiculous. For an additional $500 you get a slightly better camera, a pretty expensive smart assistant, and a few other tiny software improvements. You are not paying this premium for innovation but for minor features.
With every product launch, I am hoping for a breakthrough. I am not talking about bendable displays, a superior camera, or a USB-D port. I am talking about taking the communication experience to a completely new level.
The same is true for laptops and computers. The latest innovations were a Touch Bar from Apple and a knob controller from Microsoft. The only reason to upgrade to or buy a Mac is its superior software. But Windows 10 is about to close even this software gap. Where is the big breakthrough in the computer industry?
Computers and Smartphones Will Become One Device
It is about time to disrupt the smartphone and computer industry. Here are two things that annoy me with computers and phones:
- Fixed Display: Whether you are using a phone or a computer, you need a screen to interact with your device. You have to look at it in order to control it. This is the reason why so many people walk down the street staring down at their phone display.
- Typing: Typing is the most annoying thing – especially on a phone. Most of our current digital communication is based on text. In order to communicate with text messages you still need to type what you want to say – at least in most cases.
Alternatives to Displays and Typing
Every phone is already equipped with a personal assistant that is already pretty good at understanding speech. If you own an iPhone, this assistant is called Siri. Google Now offers the same service on Android phones (and even iPhones). You can use these speech recognition assistants to write and send messages, or to ask for the weather, the time, or your calendar appointments. But we still cannot control the whole device with them. The same assistants are available on computers. Microsoft introduced an assistant called Cortana with Windows 10, and the latest macOS update brought Siri to the Mac. But you still cannot work on your PC or Mac without using the keyboard, mouse, or touch screen.
What we need to do to replace phones and computers (in the traditional sense) is to combine three (or four) powerful technologies: augmented reality (and virtual reality), artificial intelligence, and speech recognition.
Virtual reality headsets are already replacing the need for a TV or a PC display. Augmented reality will replace the need for any screen within the next ten years. Advanced augmented reality devices will allow you to create a screen anywhere at any time. Just place a virtual display anywhere in your field of view and adjust the size yourself. This will reduce or eliminate display devices such as TVs, PCs, phones, and laptops.
When AR and VR headsets make display devices unnecessary, how will we interact with our new AR devices? There are several technologies that will take the way we interact with computers to the next level. The solution for the next 5 years will be speech recognition and motion recognition. We will simply use our native language to give commands to computers. Amazon Alexa is already a great example of that. Instead of using a knob or an app to turn on the lights, we can simply say: “Hey Alexa, please turn on the lights.” To interact with AR displays, we may use our hands or eyes to browse and operate the user interface. In order to make this experience as great as possible, we will need to combine all these technologies with artificial intelligence. This will ensure that we experience 100% accurate speech input and motion control.
First Use Cases
Exactly a week ago, Microsoft announced the launch of its “Custom Speech Service”. It is part of Microsoft’s “Cognitive Services APIs”, which allow developers to use some of the most powerful machine learning technologies out there. Custom Speech Service, in particular, is a more advanced version of Siri or Google Assistant. It combines two technologies: CRIS and LUIS. Here is how it differs from current technologies:
Google Now and Siri rely on specific words or phrases that a programmer has intentionally mapped to a given action: “Restaurant nearby” or “Best burger restaurant” will show you some restaurants nearby. LUIS, however, does not just understand what you say; it is trained to understand what you actually mean. You don’t necessarily need to say “Restaurant nearby”. You can also say “find lunch”, “I need a burger”, “I can’t concentrate anymore”, “I am too hungry”, and so on.
The Custom Speech Service will simplify the life of programmers, and it will make human-computer interactions feel real.
While Microsoft is not building a product around the program itself (but who knows?), it is offered as software to clients such as virtual reality content studios.
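To make the contrast a bit more concrete, here is a small Python sketch. It is not LUIS and does not use any Microsoft API. The phrase table, the intent names, and the cue words are all made up for illustration. The first function only reacts to exact phrases a programmer typed in; the second is a very crude stand-in for intent recognition that scores an utterance against hand-picked cue words.

```python
import re

# Approach 1: exact phrases mapped by hand to an action (hypothetical table).
PHRASE_TO_ACTION = {
    "restaurant nearby": "find_restaurant",
    "best burger restaurant": "find_restaurant",
}


def match_phrase(utterance):
    """Return an action only if the utterance is exactly a mapped phrase."""
    return PHRASE_TO_ACTION.get(utterance.lower().strip())


# Approach 2: toy intent guessing. Each intent has hand-picked cue words
# (an assumption for this sketch; a real service learns this from examples).
INTENT_CUES = {
    "find_restaurant": {"restaurant", "lunch", "dinner", "burger", "hungry", "eat", "food"},
    "get_weather": {"weather", "rain", "sunny", "temperature", "forecast"},
}


def guess_intent(utterance):
    """Return the intent whose cue words overlap most with the utterance."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    scores = {intent: len(words & cues) for intent, cues in INTENT_CUES.items()}
    best = max(scores, key=scores.get)
    # No cue word matched at all: admit that we do not know.
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    for text in ["Restaurant nearby", "find lunch", "I need a burger", "I am too hungry"]:
        print(text, "| phrase table:", match_phrase(text), "| intent guess:", guess_intent(text))
```

The phrase table only recognizes the first sentence, while the cue-word scorer also catches the other three. It still fails on something like “I can’t concentrate anymore”, because none of its cue words appear there. Closing that gap is exactly what a trained model is supposed to do.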
One example is the game Starship Commander. This virtual reality movie-game was built using CRIS and LUIS. As a result, the characters in the game are able to understand and respond to unique vocabulary. You can have real conversations with the characters in the game and control the game/movie accordingly.
Microsoft’s Custom Speech Service is starting with a free model. With CRIS and LUIS, Microsoft is on a mission to make “AI available to every organization and every person”.