Durga Malladi, SVP and GM for technology planning and edge solutions with Qualcomm Technologies, inevitably made the case for AI being placed on devices rather than the cloud during the company’s analyst and media workshop in June, as he argued the current focus on the technology is not another false dawn.
As a chip company, Qualcomm’s argument for AI to be on device is not unexpected: Malladi explained the company is on a “relentless mission” to shift processing from the cloud “towards the edge running directly onto devices”.
He noted the huge leap in computational processing capabilities on today’s devices, along with advances in connectivity technology, makes such a shift entirely possible, although he conceded the same high-speed networks which enable on-device AI could equally provide decent cloud-based service, particularly as the number of parameters in the models used begins to range in the billions.
Malladi noted questions of scaling AI on device remain common, despite the “computing horsepower” available.
Running AI on device may be challenging, but the Qualcomm executive argued the pros outweigh the cons of relying on the cloud, citing cost and the growing complexity of tasks the technology is being asked to handle.
Malladi explained the cost of inference “scales exponentially if you run everything solely on the cloud”, stating this would prove problematic in future.
He elaborated by referring to research published by Reuters in 2023 into the level of generative AI processing needed to run a proportion of Google Search queries through the cloud, which showed the cost is “mind boggling”. It would offset any gain made by reducing the price of the hardware involved, Malladi said.
“The second thing is that the kinds of applications are getting very rich now. This is no longer a chatbot.”
Services involve more multi-modality, Malladi said, pointing to images, voice and other additions he explained make it “tougher” and harder to scale. Throw in the sheer number of actual users and the numbers involved in “token generation or image processing” become even more overwhelming.
Malladi highlighted environmental concerns associated with the growing demand for cloud computing, citing predictions the amount of power AI will require could amount to 3.5 per cent of the world’s total energy consumption by 2030.
A new dawn
Malladi referred to the current hype around AI as the third spring for a technology he explained had existed since at least the middle of the last century.
He noted the development in the 1950s of the Turing Test, which Encyclopaedia Britannica states is an assessment of a computer’s ability to reason in a way people would, as one of the early moves in what he called the first “spring” of AI.
This spring was characterised by a lot of original concepts, including the development of ELIZA, referred to by the New Jersey Institute of Technology as a natural language processing programme written in the mid-1960s by Massachusetts Institute of Technology Professor Joseph Weizenbaum.
An interesting aside is ELIZA was first called a chatterbot, a term now slightly abbreviated to chatbot.
Malladi said this initial spring quickly turned to winter, as research later in the 1960s proved the amount these chatterbots could learn was nowhere near the “lofty goals” expected.
It took until the early 1980s for the second spring of AI to begin, with expert systems, deep convolutional networks and parallel distributed processing capabilities paving the way. Malladi explained factors including human expertise and the start of PCs becoming mainstream led to the collapse of this round of interest in the technology by the early 1990s, the second winter.
Despite this second breakdown, Malladi noted progress in concepts around handwriting and number recognition, citing the potential for ATMs to recognise numbers on cheques being deposited.
Ironically, it was developments later in the 1990s which gave Malladi the confidence that the current AI spring will not peter out again.
He pointed to the birth of the consumer internet, which brought access to a vast amount of data, the lack of which had been a hindrance in the preceding two decades. The second factor was a dramatic increase in the amount of computing power available. Malladi noted desktops and laptops gained more processing capabilities, changing the foundations of AI.
“So we are in this third spring of AI and our prediction is there’s no going back now”, Malladi said, explaining the processing power of devices and amount of data available from public and enterprise sources mean there is “tonnes of automation that can be done already” concerning consumer and productivity use cases.
Security
Malladi brought this back to the case for on-device AI by looking at the type of data involved today.
The executive noted a growing demand for more personalised responses from AI-powered consumer services, but also higher levels of security. Using medical records as an example, Malladi explained an AI voice assistant must offer personalised information rather than rely on details sourced from the public domain, arguing this presents a risk when cloud processing is involved.
“Do you want access to that data and then send it to the cloud so that it can run inference over there and come back? Why would I want to do that if I can run it directly on device?”
Another potential use was demonstrated during Qualcomm’s Snapdragon Summit in 2023, when a person sought information on what they were looking at by pointing their phone at it. Malladi explained context is required to generate a response, including deriving the user’s position from various sensors, a task involving a “lot of information” which is “very local and contextual”.
Malladi argued these examples of the need for data privacy are the reason why on device “is the way to go”.
For enterprise scenarios, he explained there may be a need to access data off-site, noting access to corporate servers or cloud services may vary depending on where the employee is.
“But regardless of connectivity, you want to have a uniform AI experience”, he explained, noting if you can run the technology directly on a device “you actually have that capability to get the responses with absolutely no bearing on how the connectivity is”.
Common goals
As with many recent high-level discussions about AI, Malladi noted the importance of partnerships and ethics.
He highlighted Qualcomm does not create genAI models, meaning the development of standardised approaches to assessing these is increasingly important because developers tend to employ their own rules regarding what is fair or safe.
Qualcomm is contributing towards developing those standards, with Malladi referencing work on the AI safety working group of MLCommons, an engineering consortium focused on the technology.
“This has been a really good initiative which is recognised, at least within the US, as a starting point”, Malladi said.
The company’s partnerships play into its role in developing ethics and principles: Malladi said alongside device OEMs, Qualcomm works with governments and regulators, in part to explain what AI is “and what it is not”, while also engaging with developers, work which includes offering access to testing through a hub centred around the company’s various compatible silicon.
“Our job is not to explain to them the intricacies of our NPU and our CPU, but to make it much more easy for them to be able to access” the chips “without knowing all of the details”.
Malladi argued keeping data local rather than employing the cloud could also play a key role in AI ethics, though he acknowledged security remained an important consideration even when information is stored on device. “This has nothing to do with AI per se, but I think in the context of AI it becomes even more important”.
The executive noted increasing concerns among regulators about deepfakes, explaining a big part of the issue is what actually constitutes fakery. He asked if performing some simple edits to a picture counts as falsification, adding Qualcomm considers this as an original element, augmentation as another and totally synthetic images a third.
He said Qualcomm is working with secure content transparency tools provider Truepic to verify metadata covering all three elements, providing a “certificate of authenticity” to offer some degree of transparency.
Along with the fact many flagship smartphones are incorporating AI directly, Malladi noted the pace of development in language models is also playing to Qualcomm’s mission, because companies are doing more with fewer parameter options.
He pointed to Meta Platforms’ Llama 3, which comes with 8 billion and 70 billion parameter options compared with the 7 billion, 13 billion and 70 billion of its predecessor, as an example.
“Bottom line, what we call smaller models are way more superior than yesterday’s larger models”, Malladi said, in turn enabling richer use cases on mainstream devices.
While Malladi’s presentation was of course oriented towards Qualcomm’s core competencies and its pro-device push, his views carry weight due to his background as a technologist who studied neural networks, among other fields, at university.
His presentation adds to a growing consensus on the core challenges around implementing AI, along with an emerging understanding of the need for collaboration, education and, of course, data.