My thesis focuses on developing DynaSplit, a framework designed to optimize energy efficiency and latency for deploying deep neural networks in edge-cloud environments. The growing demand for intelligent applications on edge devices, such as IoT sensors and smartphones, poses significant challenges due to limited computational resources, energy constraints, and strict latency requirements.
DynaSplit addresses these challenges by jointly optimizing the split point of neural networks (deciding which layers run on edge devices and which in the cloud) and hardware parameters such as CPU frequency and accelerator utilization. This dual optimization enables efficient resource utilization while ensuring performance meets Quality of Service (QoS) requirements.
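As a rough illustration of this joint search space, a candidate configuration can be thought of as a tuple combining a software-level split point with hardware-level settings. The sketch below is only a simplification with illustrative field names, not DynaSplit's actual parameter set:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SplitConfig:
    """One candidate point in the joint software/hardware search space (names are illustrative)."""
    split_layer: int          # layers [0, split_layer) run on the edge device, the rest in the cloud
    edge_cpu_freq_mhz: int    # CPU frequency setting chosen for the edge device
    use_accelerator: bool     # whether the edge accelerator is engaged for the edge-side layers
```

Each such configuration trades off energy against latency: earlier split points offload more work to the cloud, while lower CPU frequencies save energy at the cost of slower edge-side inference.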
The framework employs a two-phase methodology: an Offline Phase that uses multi-objective optimization to identify Pareto-optimal configurations, and an Online Phase that dynamically selects the best configuration for incoming inference requests. Experimental results on real-world AI models show that DynaSplit can reduce energy consumption by up to 72% compared to cloud-only solutions, while meeting 90% of user-defined latency constraints.
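To make the online selection step concrete, the following is a minimal sketch under the assumption that each Pareto-optimal configuration carries latency and energy estimates obtained during the Offline Phase. The data structure, function names, and fallback policy are my own simplification rather than DynaSplit's exact decision rule:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class MeasuredConfig:
    config_id: int
    latency_ms: float   # latency estimated/profiled in the Offline Phase
    energy_j: float     # energy estimated/profiled in the Offline Phase

def select_config(pareto_front: list[MeasuredConfig],
                  latency_budget_ms: float) -> Optional[MeasuredConfig]:
    """Pick the most energy-efficient Pareto configuration that meets the request's latency budget."""
    feasible = [c for c in pareto_front if c.latency_ms <= latency_budget_ms]
    if not feasible:
        # No configuration meets the budget; fall back to the fastest known configuration.
        return min(pareto_front, key=lambda c: c.latency_ms, default=None)
    return min(feasible, key=lambda c: c.energy_j)
```

In this sketch, the expensive multi-objective search happens entirely offline; at request time the selection reduces to a cheap filter-and-minimize over the precomputed Pareto front, which is what keeps the Online Phase fast enough for per-request decisions.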
By enabling efficient and sustainable deployment of AI models in resource-constrained environments, DynaSplit demonstrates the potential of combining advanced optimization techniques with the flexibility of split computing.