Virtual Knowledge Graphs for Federated Log Analysis (ARES Conference 2021)

Paper @ARES Conference 2021 (29.08.2021)

Funding year 2017 / Science Call #1 / Project ID: / Project: SEPSES

We are happy that our paper titled “Virtual Knowledge Graphs for Federated Log Analysis” has been accepted at the ARES conference 2021.

The paper introduces a novel approach that dynamically constructs virtual log knowledge graphs directly from heterogeneous raw log files across multiple hosts. It then contextualizes the query results with internal and external background knowledge.

The advantage is that log files can remain on their respective hosts, with no a priori centralized aggregation, processing, or materialization of log data. Only when a query is issued is the relevant log data processed, combined, and shipped to the analyst.

The architecture of the approach is visualized in the following figure:

VKG for federated log analysis

Our approach comprises two main components:

1. Query Processor, a component that provides an interface to formulate SPARQL queries and distributes the queries among individual endpoints.

2. Log Parser, a component on each host, which receives and translates queries, processes log data, and sends the results back to the Query Processor.
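The interplay of the two components can be sketched roughly as follows. This is a minimal illustration, not the implementation from the paper: the class and method names are hypothetical, a plain keyword stands in for a SPARQL query, and in-memory lists stand in for log files on remote hosts.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class LogParser:
    """Stand-in for the per-host Log Parser (hypothetical interface)."""
    host: str
    lines: list  # raw log lines that stay on this host

    def answer(self, keyword: str) -> list:
        # Only the lines relevant to the query are processed and returned.
        return [(self.host, line) for line in self.lines if keyword in line]


class QueryProcessor:
    """Stand-in for the Query Processor: fans a query out and merges results."""

    def __init__(self, parsers):
        self.parsers = {p.host: p for p in parsers}

    def run(self, keyword: str, hosts=None) -> list:
        # Query all known hosts unless the caller restricts the set.
        targets = [self.parsers[h] for h in (hosts or sorted(self.parsers))]
        with ThreadPoolExecutor() as pool:
            # pool.map preserves the order of `targets`.
            per_host = list(pool.map(lambda p: p.answer(keyword), targets))
        return [row for rows in per_host for row in rows]


parsers = [
    LogParser("host-a", ["sshd: Failed password for root", "cron: job started"]),
    LogParser("host-b", ["sshd: Accepted password for alice",
                         "sshd: Failed password for bob"]),
]
qp = QueryProcessor(parsers)
results = qp.run("Failed password")
```

In this sketch nothing is aggregated ahead of time: each host filters its own lines when a query arrives, and only the matching rows travel back to the caller.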

The following figure visualizes the query translation mechanism. A SPARQL query is translated and mapped to the respective log properties of the specific log source, host, and time range defined in the query.

VKG SPARQL translation
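To make the translation step more concrete, here is a simplified sketch. It assumes the query has already been parsed into triple patterns, and it uses a made-up vocabulary (`log:timestamp`, `log:hasUser`, `log:message`) and a hypothetical `ExtractionPlan` structure. The actual vocabulary and translation logic are described in the paper and may differ.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical mapping from vocabulary properties to raw log fields.
PROPERTY_TO_FIELD = {
    "log:timestamp": "timestamp",
    "log:hasUser": "user",
    "log:message": "message",
}


@dataclass(frozen=True)
class ExtractionPlan:
    """What a host has to extract in order to answer one query."""
    source: str
    host: str
    start: datetime
    end: datetime
    fields: tuple


def translate(patterns, source, host, start, end):
    """Map the properties used in triple patterns to the log fields to extract."""
    fields = []
    for _subject, predicate, _object in patterns:
        field = PROPERTY_TO_FIELD.get(predicate)
        if field and field not in fields:
            fields.append(field)
    return ExtractionPlan(source, host, start, end, tuple(fields))


patterns = [
    ("?event", "log:timestamp", "?time"),
    ("?event", "log:hasUser", "?user"),
    ("?event", "rdf:type", "log:AuthEvent"),  # no raw log field needed
]
plan = translate(patterns, "auth.log", "host-a",
                 datetime(2021, 8, 1), datetime(2021, 8, 2))
```

The point of such a plan is that a host only touches the log source, time window, and fields that the query actually asks for.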

Next, as depicted in the figure below, the selected log lines and properties are parsed and mapped into RDF based on the respective log vocabulary. The constructed RDF log graphs are enriched with background knowledge and compressed into a compact RDF format (i.e., HDT) for further processing.

RDF log graph construction
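The parsing and mapping step might look roughly like the sketch below, which turns one syslog-style line into N-Triples. The line format, the `http://example.org/log#` namespace, and the property names are assumptions made for illustration only. Enrichment with background knowledge and HDT compression are not shown.

```python
import re

LOG_NS = "http://example.org/log#"  # hypothetical vocabulary namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FIELDS = ("timestamp", "host", "program", "message")

# Assumed line layout: "<timestamp> <host> <program>: <message>"
LINE_RE = re.compile(
    r"^(?P<timestamp>\S+) (?P<host>\S+) (?P<program>[^:\s]+): (?P<message>.*)$"
)


def escape(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def line_to_ntriples(line, event_id):
    """Map one raw log line to a list of N-Triples statements."""
    match = LINE_RE.match(line)
    if match is None:
        return []  # line does not follow the assumed layout
    subject = f"<{LOG_NS}event/{event_id}>"
    triples = [f"{subject} <{RDF_TYPE}> <{LOG_NS}LogEvent> ."]
    for name in FIELDS:
        value = escape(match.group(name))
        triples.append(f'{subject} <{LOG_NS}{name}> "{value}" .')
    return triples


triples = line_to_ntriples(
    '2021-08-29T10:15:00Z host-a sshd: Failed password for "root"', 1
)
```

Each extracted property becomes one statement about the event node, so the graph for a query grows with the number of relevant log lines rather than with the size of the log file.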

Our evaluation shows that the log processing time is primarily a function of the number of extracted (relevant) log lines and queried hosts. For future work, we plan to improve the query analysis and extend the approach for streaming scenarios.

Kabul Kurniawan
