Draft:Edge AI
Submission declined on 7 November 2025 by Aesurias (talk).
Submission declined on 5 November 2025 by Samoht27 (talk). Your draft shows signs of having been generated by a large language model, such as ChatGPT.
Edge AI (also called local AI, on-device AI, or edge intelligence) is the combination of edge computing and artificial intelligence (AI).[1] It refers to running AI models locally on edge devices such as smartphones, IoT devices, and other embedded systems, as well as on more capable hardware at the edge of the network (such as PCs and notebooks), so that AI computation occurs close to where data is produced.[2][3] Compared with cloud-based AI, reported benefits include lower latency, lower bandwidth use, offline availability, and improved data privacy.[1][2][3][4] Where Edge AI can be used instead of cloud AI, it has also been described as more resource-efficient and therefore more sustainable.[5]
Core concepts
In Edge AI, pre-trained models are typically deployed to edge devices and inference is executed locally. The defining feature is reduced reliance on centralised cloud resources rather than any particular hardware.[2] Key technical approaches include:
Hardware acceleration: use of GPUs, NPUs and other accelerators integrated into edge devices for efficient inference.[6]
Model optimisation: compression techniques such as quantisation and pruning to reduce size and computation.[6]
Local data processing: on-device pre-/post-processing and feature extraction to reduce network traffic.[1][3][4]
Connectivity for updates: optional data synchronisation for periodic model updates, data aggregation or workload offload.[2]
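As a minimal illustration of the model optimisation item above, the following Python sketch applies symmetric 8-bit quantisation and magnitude-based pruning to a short list of weights. The function names and the weight values are hypothetical and chosen only for illustration; production toolchains apply these techniques per layer or per channel and usually calibrate or fine-tune afterwards.

```python
def quantise_int8(weights):
    """Symmetric per-tensor quantisation of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantised = [round(w / scale) for w in weights]
    return quantised, scale


def dequantise(quantised, scale):
    """Recover approximate float weights from the integer codes."""
    return [q * scale for q in quantised]


def prune_by_magnitude(weights, fraction):
    """Zero out the given fraction of weights, smallest magnitudes first."""
    n_prune = int(len(weights) * fraction)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


weights = [0.82, -0.03, 0.40, -1.27, 0.005, 0.66]  # hypothetical values
codes, scale = quantise_int8(weights)      # small integers plus one float scale
restored = dequantise(codes, scale)        # approximation of the original weights
sparse = prune_by_magnitude(weights, 0.5)  # half of the weights set to zero
```

Each quantised weight needs one byte instead of four, at the cost of a rounding error of at most half the scale per weight, while pruning produces zeros that sparse formats or kernels can skip.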
Architecture
While hardware platforms vary, Edge AI deployments commonly include:[2]
Edge hardware: devices with CPUs and optional accelerators (e.g. NPUs) for inference.
AI models: trained models packaged for the device runtime (e.g. TensorFlow Lite, ONNX Runtime); models are often optimised via distillation, quantisation or pruning.[6]
Local data pipeline: acquisition, filtering and normalisation of data performed on the device prior to inference, often supported by an on-device vector database. Such a database provides local long-term memory and supports on-device similarity search (typically approximate nearest neighbour, ANN) and retrieval-augmented generation (RAG) workflows independent of a network connection.[7][8][9][10][11]
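The similarity search performed by an on-device vector database can be sketched in a few lines of Python. The example below is an exact brute-force search using cosine similarity, not an ANN index; ANN methods such as HNSW approximate the same ranking with less work on large collections. The store contents, identifiers and function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, store, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical embeddings held entirely in local memory.
store = {
    "note-a": [0.9, 0.1, 0.0],
    "note-b": [0.0, 1.0, 0.0],
    "note-c": [0.7, 0.7, 0.1],
}
results = top_k([1.0, 0.0, 0.0], store, k=2)
```

In a RAG workflow, the entries returned by such a search would be passed to a local model as context, so that no query or document leaves the device.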
Challenges and limitations
Edge deployments are constrained by compute, memory, storage, and power, which limits the size of the models that can be run. This motivates techniques such as quantisation and pruning, as well as the use of specialised models, such as small language models (SLMs), rather than general-purpose ones.
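The effect of the memory constraint can be shown with simple arithmetic: the storage needed for model weights is roughly the number of parameters multiplied by the bits per weight, divided by eight. The Python sketch below uses a hypothetical model of one billion parameters and a hypothetical budget of 2 GB; it counts weights only and ignores activations, caches and runtime overhead.

```python
def weight_memory_bytes(num_parameters, bits_per_weight):
    """Approximate bytes needed to store the weights alone."""
    return num_parameters * bits_per_weight // 8


def fits_budget(num_parameters, bits_per_weight, budget_bytes):
    """True if the weights alone fit within the given memory budget."""
    return weight_memory_bytes(num_parameters, bits_per_weight) <= budget_bytes


PARAMS = 1_000_000_000   # hypothetical model size
BUDGET = 2 * 10**9       # hypothetical 2 GB available for weights

for bits in (32, 8, 4):
    size_gb = weight_memory_bytes(PARAMS, bits) / 10**9
    print(f"{bits}-bit weights: {size_gb:.1f} GB, fits: {fits_budget(PARAMS, bits, BUDGET)}")
```

Under these assumptions the 32-bit model needs 4 GB and does not fit, whereas the 8-bit and 4-bit versions need 1 GB and 0.5 GB respectively.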
Applications
Edge AI is applied across multiple sectors, typically where latency, connectivity constraints, data privacy, or data-locality requirements are significant.[12]
Automotive: in-vehicle perception and driver-assistance rely on edge processors for real-time decision-making; industry analyses expect continued growth of in-car AI workloads.[13]
Healthcare and wearables: local AI is used for monitoring and diagnostics where functions are mission-critical or where safety or regulatory constraints apply.[13][14][15]
Manufacturing and industrial IoT: on-premises AI for quality control, anomaly detection and robotics where connectivity to the cloud may be intermittent, slow, or unavailable.[13]
Consumer devices: on-device speech, vision and summarisation in smartphones, PCs and home devices (for example, on current Apple and Android smartphones).[16][17]
See also
Edge computing
Internet of things
Machine learning
Vector database
References
- ^ a b c Su, Weixing; Li, Linfeng; Liu, Fang; He, Maowei; Liang, Xiaodan (2022). "AI on the edge: a comprehensive review". Artificial Intelligence Review. 55 (8): 6125–6183. doi:10.1007/s10462-022-10141-4. Retrieved 4 November 2025.
- ^ a b c d e "ETSI GS MEC 003 V3.2.1: Multi-access Edge Computing (MEC); Framework and reference architecture" (PDF). ETSI. April 2024. Retrieved 4 November 2025.
- ^ a b c Hoffpauir, Kyle; Simmons, Jacob; Schmidt, Nikolas; Pittala, Rachitha; Briggs, Isaac; Makani, Shanmukha; Jararweh, Yaser (2023). "A Survey on Edge Intelligence and Lightweight Machine Learning Support for Future Applications and Services". ACM Computing Surveys. 55 (14): 1–38. doi:10.1145/3581759. Retrieved 4 November 2025.
- ^ a b Zhou, Zhi; Chen, Xu; Li, En; Zeng, Liekang; Luo, Ke; Zhang, Junshan (2019). "Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge Computing". Proceedings of the IEEE. 107 (8): 1738–1762. arXiv:1905.10083. doi:10.1109/JPROC.2019.2921977.
- ^ "Efficient AI – TinyML". Fraunhofer Institute for Integrated Circuits IIS. 25 November 2024. Retrieved 6 November 2025.
- ^ a b c Wang, Xubin; Tang, Zhiqing; Guo, Jianxiong; Meng, Tianhui; Wang, Chenhao; Wang, Tian; Jia, Weijia (2025). "Empowering Edge Intelligence: A Comprehensive Survey on On-Device AI Models". ACM Computing Surveys. 57 (9): 1–39. arXiv:2503.06027. doi:10.1145/3724420. Retrieved 4 November 2025.
- ^ Park, Taehwan; Lee, Geonho; Kim, Min-Soo (2025). "MobileRAG: A Fast, Memory-Efficient, and Energy-Efficient Method for On-Device RAG". arXiv:2507.01079 [cs.DB].
- ^ Pound, Jeffrey; Chabert, Floris; Bhushan, Arjun; Goswami, Ankur; Pacaci, Anil; Chowdhury, Shihabur Rahman (2025). "MicroNN: An On-device Disk-resident Updatable Vector Database". arXiv:2504.05573. Retrieved 6 November 2025.
- ^ Pan, Junjie (2024). "Survey of vector database management systems". The VLDB Journal. 33 (5): 1591–1615. doi:10.1007/s00778-024-00864-x. Retrieved 4 November 2025.
- ^ Malkov, Yu. A.; Yashunin, D. A. (2018). "Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs". IEEE Transactions on Pattern Analysis and Machine Intelligence. 42 (4): 824–836. doi:10.1109/TPAMI.2018.2889473. PMID 30602420.
- ^ Aumüller, Martin; Bernhardsson, Erik; Faithfull, Alexander (2020). "ANN-Benchmarks: A benchmarking tool for approximate nearest neighbor algorithms". Information Systems. 87 101374. arXiv:1807.05614. doi:10.1016/j.is.2019.02.006. Retrieved 4 November 2025.
- ^ "Leading with edge: How to reinvent with data and AI". Accenture. 18 July 2023. Retrieved 6 November 2025.
- ^ a b c "The rise of edge AI in automotive". McKinsey & Company. 25 August 2025. Retrieved 4 November 2025.
- ^ "The widespread use of AI in medical devices: Utopia or merely a question of the procedure?" (PDF). PwC Germany. Retrieved 6 November 2025.
- ^ Rocha, A.; Monteiro, M.; Mattos, C.; Dias, M.; Soares, J.; Magalhães, R.; Macedo, J. (2024). "Edge AI for Internet of Medical Things: A literature review". Computers & Electrical Engineering. 116 109202. doi:10.1016/j.compeleceng.2024.109202. Retrieved 6 November 2025.
- ^ "Apple Intelligence". Apple. Retrieved 6 November 2025.
- ^ "AI on Android". Android Developers. Retrieved 6 November 2025.
Further reading
"Exploring the Intersection of AI and Edge Computing in IoT Applications: A Comprehensive Review and Future Directions" (PDF). International Journal of Research Publication and Reviews. 2025. Retrieved 4 November 2025.
External links
"The Critical Role of Databases for Edge AI". ObjectBox. 11 November 2024. Retrieved 4 November 2025.