The future of machine learning is mobile.

By contextere

Machine learning and artificial intelligence have been making their way onto mobile devices for some time now. Their potential is undeniable, with virtual assistants and conversational user interfaces expected to be used by 97% of all iOS and Android users in 2020[1].

Despite the massive interest in using machine learning models to enhance user experiences and workflows, particularly for natural language processing and image recognition, nearly all of the significant computational architecture required to develop and train these models remains cloud-based.

For the uninitiated, the term ‘artificial intelligence’ (AI) refers to a field of computer science whose aim is to create machines that can think intelligently. ‘Machine learning’ (ML) is a subset of AI that itself contains dozens of different methodologies. One of these is ‘deep learning’ (DL), a critical component of solutions that focus on speech recognition and natural language processing.

Deep learning is a very active area of research and development, and it typically requires large data sets and high-performance computing hardware. As a result, when deep learning models are being developed and refined, the computation is usually done on server farms via cloud service providers like AWS or Azure.

Impressive Potential

What users really need are machine learning models that are deployed on mobile devices, allowing for offline functionality. Currently, virtual assistants such as Siri, Alexa, Google Assistant and Cortana require active network connectivity to fully function. Deploying machine learning models to mobile devices for edge inference offers several advantages:

  • Offline functionality – In a tunnel? In a remote area? Down a mine shaft? No problem. All ML functions occur on the mobile device itself rather than on a central server.
  • User-specific ML output – ML models work directly on an individual user’s data, which allows for greater fidelity and optimized personalization.
  • Reduced network bandwidth requirements – ML models’ computations are performed using a mobile device’s hardware, which reduces bandwidth and cloud resource costs because no data is transferred between edge devices and the cloud.
  • Greatly reduced latency – Because the ML models run locally, no delays occur due to varying network transfer speeds.
  • Increased privacy & security – By eliminating the reliance on network communications, potential bad actors will have fewer avenues for acquiring sensitive data.



The potential of ML deployment on edge devices is certainly tantalizing. However, there remain numerous barriers to widespread deployment of mobile machine learning models, not least of which are the limitations of mobile CPU and GPU hardware. In order to train ML models, significant GPU performance is required. Yet less than 20% of mobile chips have a GPU that is more powerful than the CPU and, according to 2018 data, only 25% of all smartphones used CPU cores designed in 2013 or later[2].

Along with increased processing performance requirements, training ML models on mobile would require devices with large-capacity storage. True offline ML training capability necessitates that both the ML models and any associated datasets be deployed to, and stored locally on, the device. Depending on the size of the dataset, this may be an impractical requirement.
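To make the storage concern concrete, here is a rough back-of-the-envelope sketch. Every figure in it (parameter count, bytes per parameter, sample count, bytes per sample) is a hypothetical value chosen purely for illustration and does not describe any real model or dataset; the function name `footprint_bytes` is likewise made up for this example.

```python
def footprint_bytes(num_parameters, bytes_per_parameter, num_samples, bytes_per_sample):
    """Estimate on-device storage for a model and its training dataset."""
    model_bytes = num_parameters * bytes_per_parameter
    dataset_bytes = num_samples * bytes_per_sample
    return model_bytes, dataset_bytes


# Hypothetical inputs: a 5-million-parameter model stored as 32-bit floats
# (4 bytes each) and 100,000 training images of roughly 150 kB each.
model_bytes, dataset_bytes = footprint_bytes(
    num_parameters=5_000_000,
    bytes_per_parameter=4,
    num_samples=100_000,
    bytes_per_sample=150_000,
)

total_gb = (model_bytes + dataset_bytes) / 1e9
print(f"model:   {model_bytes / 1e9:.2f} GB")
print(f"dataset: {dataset_bytes / 1e9:.2f} GB")
print(f"total:   {total_gb:.2f} GB")
```

Under these assumed numbers the model itself is small, and it is the training dataset that dominates the storage requirement. This is the part that on-device inference alone would not need.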

Exacerbating both of these barriers is the fact that there is no standardized mobile chipset to develop for. There are 25 mobile chipset vendors, each producing a mix-and-match of custom-designed components and IP blocks licensed from other companies. The fragmentation of hardware is particularly apparent within the Android ecosystem, where there are over two thousand different chips, compared to a little more than a dozen on iOS.

A hybrid online/offline approach, training in the cloud and inferencing on mobile, may be a better fit, because ML inferencing is achievable on edge devices. The key difference is that ML inferencing does not re-evaluate or adjust the overall ML model, but rather applies a pre-trained model to new data to infer output.
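The split between training and inferencing can be sketched in a few lines. This is a minimal illustration only, assuming a toy logistic-regression model written in plain Python; the function names `train` and `infer`, the toy data and the JSON hand-off are all assumptions made for this example, not a description of any particular mobile ML framework.

```python
import json
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=500, lr=0.5):
    """'Cloud' side: repeatedly adjusts the weights using labelled data."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = pred - y
            # Gradient step: this is the part that changes the model.
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return {"weights": weights, "bias": bias}


def infer(model, x):
    """'Device' side: reads the frozen parameters and never modifies them."""
    z = sum(w * xi for w, xi in zip(model["weights"], x)) + model["bias"]
    return sigmoid(z)


# Toy training set (hypothetical): label 1 when both features are large.
samples = [[0.0, 0.1], [0.2, 0.0], [0.9, 1.0], [1.0, 0.8]]
labels = [0, 0, 1, 1]

# Train "in the cloud", then serialize the parameters for delivery.
payload = json.dumps(train(samples, labels))

# "On the device": load the pre-trained parameters and run inference only.
on_device_model = json.loads(payload)
snapshot_before = json.dumps(on_device_model)
score_low = infer(on_device_model, [0.1, 0.05])
score_high = infer(on_device_model, [0.95, 0.9])
snapshot_after = json.dumps(on_device_model)

print(score_low < 0.5, score_high > 0.5)
print(snapshot_before == snapshot_after)
```

Only `train` needs the labelled dataset and the repeated passes over it. `infer` is a single weighted sum per input, which is why it is far less demanding of device hardware and storage.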

What the future holds

Despite these barriers, edge-based ML modelling remains an area of extreme interest and activity for R&D, all the more so because the pressure of consumer demand is causing mobile CPU OEMs to produce chips with ever-greater processing power. OEMs are also undertaking significant research and investment towards the development of hardware dedicated specifically to machine learning, such as the Qualcomm Artificial Intelligence Engine on Snapdragon[3].

Long-form question answering has long been a goal for developers of natural language generation (NLG) solutions. So far, NLG has been restricted to short phrases or single sentences, and any long-form answering capabilities that currently exist rely on context extraction and verbatim text-to-speech solutions. True long-form NLG has been a hotbed of development in recent years, and research papers and R&D projects are beginning to emerge into the public spotlight.

The future of edge-based ML solutions is quite promising and, with 5G peeking over the horizon, we can expect streamlined over-the-air deployment of ML models as well as superior online/offline hybridization. With the right application of expertise and ambition, discrete machine learning solutions could be applied and trained autonomously, entirely on a user’s device, allowing for more personalized ML output as well as reduced reliance on network connectivity.