A holistic Multi-service Multi-user Collaborative Inference Framework for EdgeAI


Edge AI, a combination of Artificial Intelligence (AI) and Edge Computing, aims to bring intelligence to heterogeneous devices at the edge in order to enable Ubiquitous Intelligence. Edge AI inherits the benefits of Edge Computing, such as lower data transmission latency than Cloud Computing, better utilised computing power, and higher privacy. Under this paradigm, a model is first trained on a cloud server and then downloaded to an edge server; user end devices at the edge then interact directly with the edge server to run model inference, and the execution results are transmitted back to the user end device. Edge AI brings model-based services close to users, but edge servers are commonly resource-constrained. This project aims to design a holistic Multi-service Multi-user Collaborative Inference framework for resource-constrained edge environments, which facilitates Edge AI via:

  • Novel algorithms for edge server allocation to multiple decision-making procedures
  • Novel algorithms for partitioning large serviceable deep learning models to support Collaborative Inference
  • Novel algorithms for efficient batch inference
  • Novel algorithms for resource utilization that increase availability and throughput, and
  • Integration of all necessary components to deliver a system that can fully support Multi-service Multi-user Collaborative Inference
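
To make the idea of partitioning a model for Collaborative Inference more concrete, here is a minimal illustrative sketch in Python. It is not the project's algorithm: it assumes a purely sequential model, known per-layer latencies on the user device and on the edge server, a fixed link bandwidth, and it ignores the cost of sending the final result back. The names `Layer`, `best_split`, and all numbers are hypothetical. The sketch tries every cut point and keeps the one with the lowest end-to-end latency.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Layer:
    name: str
    device_ms: float  # assumed latency of this layer on the user end device
    edge_ms: float    # assumed latency of this layer on the edge server
    output_kb: float  # size of the activation sent if we cut after this layer


def best_split(layers: List[Layer], input_kb: float,
               bandwidth_kb_per_ms: float) -> Tuple[int, float]:
    """Return (k, latency): layers[:k] run on the device, layers[k:] on the edge."""
    best_k, best_latency = 0, float("inf")
    for k in range(len(layers) + 1):
        device = sum(layer.device_ms for layer in layers[:k])
        edge = sum(layer.edge_ms for layer in layers[k:])
        if k == len(layers):
            transfer = 0.0  # everything runs locally, nothing is uploaded
        else:
            # Upload the raw input (k == 0) or the activation at the cut point.
            sent_kb = input_kb if k == 0 else layers[k - 1].output_kb
            transfer = sent_kb / bandwidth_kb_per_ms
        total = device + transfer + edge
        if total < best_latency:
            best_k, best_latency = k, total
    return best_k, best_latency


# Hypothetical three-layer model and link parameters.
layers = [
    Layer("conv1", device_ms=8.0, edge_ms=2.0, output_kb=400.0),
    Layer("conv2", device_ms=12.0, edge_ms=3.0, output_kb=100.0),
    Layer("fc", device_ms=20.0, edge_ms=1.0, output_kb=1.0),
]
k, latency = best_split(layers, input_kb=600.0, bandwidth_kb_per_ms=10.0)
print(k, latency)  # 2 31.0
```

With these made-up numbers, running the first two layers on the device and the last one on the edge server is cheapest, because the activation after `conv2` is much smaller than the raw input. A multi-service, multi-user setting would additionally have to account for contention on the shared edge server, which this sketch leaves out.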

People

Patents

  • Shen Jingran (沈静然), Georgios Theodoropoulos (乔治斯·泽奥多洛保罗斯), Nikolaos Tziritas (尼古劳斯·特斯里塔斯), “Collaborative optimization method, system, equipment and media based on edge computing”, Title in Chinese: “基于边缘计算的协同优化方法、系统、设备及介质”. CNIPA Application number: P23SZ1NN07509CN, Application date: December 26, 2023, Filed

Publications

  • Jingran Shen, Nikos Tziritas, Georgios Theodoropoulos, “Towards A Flexible Deep Learning Module Inference Latency Prediction Framework for Adaptive Optimization Algorithms”, 13th IFIP International Conference on Intelligent Information Processing (IIP2024), 3-6 May 2024, Shenzhen, China, doi: 10.48550/arXiv.2312.06440