Funding year 2018 / Scholarship Call #13 / Project ID: 3793 / Project: Data Management Strategies for Near Real-Time Edge Analytics
The paper titled “Architecturing Elastic Edge Storage Services for Data-Driven Decision Making” is presented at the ECSA 2019 conference in Paris, France, 9-13 September.
Following the blog series from the April-June quarter, this post gives a tooling overview for the proposed engineering principles (Figure 1).
Based on comprehensive research and an analysis of requirements from three important aspects (see the blog post Investigating Elasticity for Edge Storage Services), namely edge data/system characterization, application context, and edge system operations, we describe the tooling needed for each principle: existing solutions, required modifications, or new approaches, considered from different viewpoints:
P1: Many tools have been proposed for monitoring cloud systems, for example Prometheus and Fluentd, but few are able to monitor edge data metrics. New tools should offer additional features, including pluggable components for edge-deployed systems. Fluent Bit, which is suitable for highly distributed environments with limited capacities, is a promising solution for collecting the necessary end-to-end metrics, from data collection to storage services and data analytics processes.
P2: A repository of available and pluggable microservices can speed up the DevOps of storage services by supplying the needed utilities. Different microservices can be used to enable elastic activities such as data cleaning, normalization, and data integration. Approaches for keeping the most relevant and complete data in space-limited storage nodes might incorporate an adaptive algorithm for efficient edge storage management and an efficient mechanism for multi-technique recovery of incomplete datasets.
P3: The Fogger tool, which features blockchain technology, could be used to support dynamic allocation and contextual location awareness of distributed storage resources in edge environments. Microservices-based design concepts, such as the EdgeX open source platform, might enable decentralized and independent data handling as well as reliable data integration supported by on-demand data services.
P4: Existing deployment tools such as Docker Compose, Ansible, and Terraform allow us to bundle and deploy a stack of services, but they lack optimization for edge environments. This requires us to build on existing work and develop novel algorithms based on edge node characteristics.
P5: To combine different inputs, especially from distributed storage nodes, we need to provide approaches for dynamic configuration, runtime code changes (like models@run.time), and service meshes. In this context, software-defined means automation of actions and elasticity management at runtime, in particular by utilizing core APIs for storage services.
P6: This principle requires considering novel mechanisms from (1) a data viewpoint, allowing IoT sensors to securely receive and perform actuation requests from edge nodes, and (2) a programmability viewpoint, supporting actuation capabilities for remote IoT device programmability (for example, building standard actuation APIs).
P7: Push and pull approaches for on-demand data delivery can be investigated for edge-cloud data transfer. The impact of different data representations is a good starting point for avoiding excessive data traffic. There is a need for a model that supports secure data migration among multi-location data stores. Just as important, from an analytics viewpoint, we need to incorporate algorithms and metrics for scheduling and synchronizing ML model updates, since streaming data will evolve over time, posing challenges for ML-based systems.
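As a rough illustration of the data representation point in P7, the sketch below compares the size of the same batch of sensor readings encoded as JSON and as a fixed-width binary layout. The reading format (a Unix timestamp plus one float value), the function names, and the sample data are assumptions made for this example; they are not taken from the paper.

```python
import json
import struct

RECORD_SIZE = 8  # 4-byte unsigned timestamp + 4-byte float, no padding


def encode_json(readings):
    """Encode (timestamp, value) pairs as a UTF-8 JSON array of objects."""
    payload = [{"ts": ts, "value": value} for ts, value in readings]
    return json.dumps(payload).encode("utf-8")


def encode_binary(readings):
    """Pack each pair as little-endian uint32 + float32 (8 bytes per reading)."""
    return b"".join(struct.pack("<If", ts, value) for ts, value in readings)


def decode_binary(blob):
    """Inverse of encode_binary: returns a list of (timestamp, value) tuples."""
    return [
        struct.unpack_from("<If", blob, offset)
        for offset in range(0, len(blob), RECORD_SIZE)
    ]


# Hypothetical batch: 100 readings, one per second, values in steps of 0.5
readings = [(1567987200 + i, 20.0 + 0.5 * i) for i in range(100)]

json_size = len(encode_json(readings))
binary_size = len(encode_binary(readings))
print(f"JSON: {json_size} bytes, binary: {binary_size} bytes")
```

The binary layout trades self-description and precision (32-bit floats) for a smaller payload, so whether it is the better choice depends on the application context.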
Which services to use, adapt, or develop, and how to utilize them to achieve elastic edge storage services, will depend on the application context, the strictness of the Quality of Service requirements, and the underlying infrastructure capacities. The evaluation requires a deep dive into a specific application, taking the revealed dependencies into account.
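One way to picture choosing services per application context is a small registry of pluggable processing steps, in the spirit of the microservice repository from P2. The following is a minimal in-process sketch, not the approach from the paper: the registry, the step names, and the forward-fill recovery are hypothetical stand-ins for real microservices and for the multi-technique recovery mechanism mentioned above.

```python
from typing import Callable, Dict, List, Optional

Series = List[Optional[float]]
Step = Callable[[Series], Series]

# Hypothetical registry standing in for a repository of pluggable microservices
REGISTRY: Dict[str, Step] = {}


def register(name: str) -> Callable[[Step], Step]:
    """Decorator that adds a processing step to the registry under a name."""
    def decorator(func: Step) -> Step:
        REGISTRY[name] = func
        return func
    return decorator


@register("fill_gaps")
def fill_gaps(series: Series) -> Series:
    """Replace missing readings (None) with the last seen value.

    Gaps before the first known value stay None.
    """
    filled: Series = []
    last: Optional[float] = None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


@register("normalize")
def normalize(series: Series) -> Series:
    """Min-max scale known values to [0, 1]; None entries are kept."""
    known = [v for v in series if v is not None]
    if not known:
        return list(series)
    low, high = min(known), max(known)
    span = high - low
    return [
        None if v is None else (0.0 if span == 0 else (v - low) / span)
        for v in series
    ]


def build_pipeline(step_names: List[str]) -> Step:
    """Compose registered steps, in order, into a single callable."""
    steps = [REGISTRY[name] for name in step_names]

    def run(series: Series) -> Series:
        for step in steps:
            series = step(series)
        return series

    return run


# An application selects the steps it needs by name
pipeline = build_pipeline(["fill_gaps", "normalize"])
result = pipeline([10.0, None, 20.0, 30.0])
print(result)
```

Selecting steps by name keeps the choice of services a configuration decision, so it can differ per application without changing the steps themselves.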
This work is published in the ECSA 2019 proceedings (see the reference below), as part of the Lecture Notes in Computer Science book series (LNCS, volume 11681). The conference presentation is listed under the project results (Projektergebnisse) section of the main page of the project.
Lujic I., Truong HL. (2019) Architecturing Elastic Edge Storage Services for Data-Driven Decision Making. In: Bures T., Duchien L., Inverardi P. (eds) Software Architecture. ECSA 2019. Lecture Notes in Computer Science, vol 11681. Springer, Cham.