{"id":5182,"date":"2023-05-31T22:10:49","date_gmt":"2023-05-31T22:10:49","guid":{"rendered":"https:\/\/www.tantraanalyst.com\/ta\/?p=5182"},"modified":"2023-06-07T04:40:55","modified_gmt":"2023-06-07T04:40:55","slug":"industry-voices-heres-why-generative-ai-will-be-distributed-across-cloud-and-edge","status":"publish","type":"post","link":"https:\/\/www.tantraanalyst.com\/ta\/industry-voices-heres-why-generative-ai-will-be-distributed-across-cloud-and-edge\/","title":{"rendered":"Here&#8217;s why generative AI will be distributed across cloud and edge"},"content":{"rendered":"<div class=\"wpb-content-wrapper\"><p>[vc_row][vc_column][vc_column_text]<\/p>\n<figure id=\"attachment_5252\" aria-describedby=\"caption-attachment-5252\" style=\"width: 702px\" class=\"wp-caption alignright\"><a href=\"https:\/\/bit.ly\/3oI6BVh\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-5252 size-full\" src=\"https:\/\/www.tantraanalyst.com\/ta\/wp-content\/uploads\/2023\/05\/230531_TantraAnalyst_Insights_Heres_why_generative_AI_wil_-be_distributed.jpg\" alt=\"AI, Tantra Analyst\" width=\"702\" height=\"336\" srcset=\"https:\/\/www.tantraanalyst.com\/ta\/wp-content\/uploads\/2023\/05\/230531_TantraAnalyst_Insights_Heres_why_generative_AI_wil_-be_distributed.jpg 702w, https:\/\/www.tantraanalyst.com\/ta\/wp-content\/uploads\/2023\/05\/230531_TantraAnalyst_Insights_Heres_why_generative_AI_wil_-be_distributed-300x144.jpg 300w, https:\/\/www.tantraanalyst.com\/ta\/wp-content\/uploads\/2023\/05\/230531_TantraAnalyst_Insights_Heres_why_generative_AI_wil_-be_distributed-700x336.jpg 700w\" sizes=\"auto, (max-width: 702px) 100vw, 702px\" \/><\/a><figcaption id=\"caption-attachment-5252\" class=\"wp-caption-text\">Silverlinings, May 31, 2023<\/figcaption><\/figure>\n<h6><span style=\"color: #808080;\">It would be an understatement to say that Generative AI (GenAI) is having its day in the sun. 
Most of today&#8217;s GenAI, powered by Large Language Models (LLMs), runs in the centralized cloud on power-hungry processors. However, it will soon have to be distributed across different parts of the network and value chain, including devices such as smartphones and laptops, as well as the edge cloud. The main drivers of this shift will be privacy, security, hyper-personalization, accuracy, and better power and cost efficiency.<\/span><\/h6>\n<h6><span style=\"color: #808080;\">AI model &#8220;training,&#8221; which occurs less often and requires extreme processing, will remain in the cloud. However, the other part, &#8220;inference,&#8221; where the trained model makes predictions based on live data, will be distributed. Some model &#8220;fine-tuning&#8221; will also happen at the edge.<\/span><\/h6>\n<h4><span style=\"color: #000000;\"><strong>Challenges of today&#8217;s cloud-based GenAI<\/strong><\/span><\/h4>\n<h6><span style=\"color: #808080;\">There is no question that AI will touch every part of human and even machine life. GenAI, which is a subset of AI, will also be\u00a0<span style=\"color: #800000;\"><a style=\"color: #800000;\" href=\"https:\/\/bit.ly\/3IqSGtr\">very pervasive<\/a>.<\/span> That means the privacy and security of the data GenAI\u00a0processes will be critically important, and unfortunately, there is no easy or guaranteed way to ensure that in the cloud.<\/span><\/h6>\n<h6><span style=\"color: #808080;\">Equally important is GenAI&#8217;s accuracy. For example, ChatGPT&#8217;s answers are often riddled with demonstrable factual errors (Google &#8220;ChatGPT hallucinations&#8221; for details). There are many reasons for this behavior. One of them is that GenAI is derived intelligence. For example, it knows 2+2=4 because more people than not have said so. The GenAI models are trained on enormous generic datasets. 
So, when that training is applied to specific use cases, there is a high chance that some results will be wrong.<\/span><\/h6>\n<h4><strong><span style=\"color: #000000;\">Why GenAI needs to be distributed<\/span><\/strong><\/h4>\n<h6><span style=\"color: #808080;\">There are many reasons for distributing GenAI, including privacy, security, personalization, accuracy, power efficiency and cost. Let&#8217;s look at each of them from both consumer and enterprise perspectives.<\/span><\/h6>\n<h6><span style=\"color: #808080;\"><strong><span style=\"color: #000000;\">Privacy:<\/span>\u00a0<\/strong>As GenAI plays a more meaningful role in our lives, we will share even more confidential information with it. That might include personal, financial and health data, emotions, and many details that even your family and closest friends may not know. You do not want all that information to be sent and stored perpetually on a server you have no control over. But that&#8217;s precisely what happens when GenAI is run entirely in the cloud.<\/span><\/h6>\n<h6><span style=\"color: #808080;\">One might ask: we already store so much personal data in the cloud, so why is GenAI any different? That&#8217;s true, but most of that data is segregated, and in many cases, access to it is regulated by law. For example, health records are protected by\u00a0<span style=\"color: #800000;\"><a style=\"color: #800000;\" href=\"https:\/\/bit.ly\/3Wm8WS3\">HIPAA regulations<\/a>.<\/span> But giving all the data to GenAI running in the cloud and letting it aggregate is a disaster waiting to happen. So, it is apparent that most privacy-sensitive GenAI use cases should run on devices.<\/span><\/h6>\n<h6><span style=\"color: #808080;\"><strong><span style=\"color: #000000;\">Security:<\/span><\/strong>\u00a0GenAI will have an even more meaningful impact on the enterprise market. Data security is a critical consideration when utilizing GenAI for enterprises. 
Even today, concern about data security is leading many companies to opt for on-prem processing and storage. In such cases, GenAI has to run on the edge, specifically on devices and the enterprise edge cloud, so that data and intelligence stay within the secure walls of the enterprise.<\/span><\/h6>\n<h6><span style=\"color: #808080;\">Again, one might ask, since enterprises already use the cloud for their IT needs, why would GenAI be any different? As in the consumer case, GenAI&#8217;s understanding of a company will be so deep that even a small leak anywhere will threaten that company&#8217;s existence. In times when industrial espionage and ransomware attacks are prevalent, sending all the data and intelligence to a remote server for GenAI will be extremely risky. An eye-opening early example was the<span style=\"color: #800000;\">\u00a0<a style=\"color: #800000;\" href=\"https:\/\/bit.ly\/437Xyvs\">recent case<\/a>\u00a0<\/span>of Samsung engineers leaking trade secrets when using ChatGPT for processing company-confidential data.<\/span><\/h6>\n<h6><span style=\"color: #808080;\"><span style=\"color: #000000;\"><strong>Personalization:<\/strong>\u00a0<\/span>GenAI has the potential to automate and simplify many things in life for you. To achieve that, it has to learn your preferences and apply appropriate context to personalize the whole experience. Instead of hauling, processing and storing all that data and optimizing a large power-hungry generic model in the cloud, a local model running on the device would be super-efficient. That will also keep all those preferences private and secure. 
Additionally, the local model can utilize sensors and other information in the device to better understand the context and hyper-personalize the experience.<\/span><\/h6>\n<h6><span style=\"color: #808080;\"><strong><span style=\"color: #000000;\">Accuracy and domain specificity:<\/span>\u00a0<\/strong>As mentioned, using generic models trained with generic data for specific tasks will result in errors. For example, a model trained on financial industry data can hardly be effective for medical or healthcare use cases. GenAI models must be trained for specific domains and further fine-tuned locally for enterprise applications to achieve the highest accuracy and effectiveness. These domain-specific models can also be much smaller with fewer parameters, making them ideal for running at the edge. So, it is evident that running models on devices or edge cloud is a basic need.<\/span><\/h6>\n<h6><span style=\"color: #808080;\">Since GenAI is derived intelligence, the models are vulnerable to hackers and adversaries trying to derail or bias their behavior. A model within the protected environment of an enterprise is less susceptible to such acts. Although hacking large models with billions of parameters is extremely hard, with the high stakes involved, the chances are non-zero.<\/span><\/h6>\n<h6><span style=\"color: #808080;\"><span style=\"color: #000000;\"><strong>Cost and power efficiency:\u00a0<\/strong><\/span>It is estimated that a simple exchange with GenAI costs<span style=\"color: #800000;\">\u00a0<a style=\"color: #800000;\" href=\"https:\/\/reut.rs\/3o9TX18\">10x more<\/a><\/span>\u00a0than a keyword search. With the enormous interest in GenAI and the forecasted exponential growth, running all that workload in the cloud seems expensive and inefficient. It&#8217;s even more so when we know that many use cases will need local processing for the reasons discussed earlier. 
Additionally, AI processing on devices is much more power-efficient.<\/span><\/h6>\n<h6><span style=\"color: #808080;\">Then the question becomes, &#8220;Is it possible to run these large GenAI models on edge devices like smartphones, laptops, and desktops?&#8221; The short answer is YES. There are already examples like<span style=\"color: #800000;\">\u00a0<\/span><a style=\"color: #808080;\" href=\"https:\/\/bit.ly\/3MF3OFk\"><span style=\"color: #800000;\">Google Gecko<\/span><\/a>\u00a0and\u00a0<span style=\"color: #800000;\"><a style=\"color: #800000;\" href=\"https:\/\/bit.ly\/3OiGSx4\">Stable Diffusion<\/a><\/span>\u00a0optimized for smartphones.<\/span><\/h6>\n<h6><span style=\"color: #808080;\">Meanwhile, if you want to read more articles like this and get an up-to-date analysis of the latest mobile and tech industry news, sign up for our monthly newsletter at\u00a0<span style=\"color: #800000;\"><a style=\"color: #800000;\" href=\"https:\/\/bit.ly\/TA-Newsletter\" target=\"_blank\" rel=\"noopener\">TantraAnalyst.com\/Newsletter<\/a>,<\/span> or listen to our\u00a0<span style=\"color: #800000;\"><a style=\"color: #800000;\" href=\"https:\/\/www.tantraanalyst.com\/ta\/podcast\/\" target=\"_blank\" rel=\"noopener\">Tantra\u2019s Mantra podcast<\/a>.<\/span><\/span><\/h6>\n<p>[\/vc_column_text][\/vc_column][\/vc_row]<\/p>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>[vc_row][vc_column][vc_column_text] It would be an understatement to say that Generative AI (GenAI) is having its day in the sun. Most of today&#8217;s GenAI powered by Large Language Models (LLMs) is run in the centralized cloud, built with power-hungry processors. 
However, it will soon have to be distributed across different parts of the network and value [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":5185,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"image","meta":{"mc4wp_mailchimp_campaign":[],"footnotes":""},"categories":[58],"tags":[],"class_list":["post-5182","post","type-post","status-publish","format-image","has-post-thumbnail","hentry","category-ai-compute-iot","post_format-post-format-image"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.tantraanalyst.com\/ta\/wp-json\/wp\/v2\/posts\/5182","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.tantraanalyst.com\/ta\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tantraanalyst.com\/ta\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tantraanalyst.com\/ta\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tantraanalyst.com\/ta\/wp-json\/wp\/v2\/comments?post=5182"}],"version-history":[{"count":0,"href":"https:\/\/www.tantraanalyst.com\/ta\/wp-json\/wp\/v2\/posts\/5182\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.tantraanalyst.com\/ta\/wp-json\/wp\/v2\/media\/5185"}],"wp:attachment":[{"href":"https:\/\/www.tantraanalyst.com\/ta\/wp-json\/wp\/v2\/media?parent=5182"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tantraanalyst.com\/ta\/wp-json\/wp\/v2\/categories?post=5182"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tantraanalyst.com\/ta\/wp-json\/wp\/v2\/tags?post=5182"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}