Data provenance and the modern web.
Challenge and future application. (29.11.2017)
Funding year 2017 / Scholarship Call #12 / Project ID: 2418 / Project: Decentralised Data Provenance based on the Blockchain

In complex, loosely coupled systems such as the one depicted in Figure 1, it is often hard to reproduce how a certain decision was made or how certain data was generated. In our example, the input data (depicted in green) passes through a hybrid system of human experts and Web services and produces some output data (depicted in violet). It is hard to build trust in this output data because it is no longer easy to reproduce what happened to the input data or what influence the human experts had on it. Nor is the path the data took during its lifetime easy to trace.

Figure 1: Complex, loosely coupled system.

To solve these issues, data provenance can be used to provide reliable information about those processes, decisions and outcomes (depicted in orange). By collecting provenance information about the process and the changes that occurred to the input data, we can build trust in the output data, as depicted in Figure 2. However, one major disadvantage of data provenance solutions is that you have to trust the provenance data store and its maintainers. This leads back to the initial problem of how to ensure trust in the data that is provided.

Figure 2: System with data provenance.
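
To make the idea more concrete, here is a minimal Python sketch of what collecting provenance information could look like. It is not the design used in this project; the names ProvenanceRecord, ProvenanceStore, record and trace, as well as the sample actors and data, are hypothetical and chosen only for illustration. Each step notes who acted on the data and the hashes of the data before and after, so the path from output back to input can be reconstructed.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest that identifies a piece of data."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class ProvenanceRecord:
    # Hypothetical record layout, assumed for this sketch only.
    actor: str        # human expert or Web service that acted on the data
    action: str       # short description of what was done
    input_hash: str   # fingerprint of the data before the step
    output_hash: str  # fingerprint of the data after the step
    timestamp: str    # when the step was recorded (UTC, ISO 8601)


class ProvenanceStore:
    """Hypothetical in-memory store; a real system would persist records."""

    def __init__(self) -> None:
        self.records: List[ProvenanceRecord] = []

    def record(self, actor: str, action: str,
               before: bytes, after: bytes) -> ProvenanceRecord:
        entry = ProvenanceRecord(
            actor=actor,
            action=action,
            input_hash=fingerprint(before),
            output_hash=fingerprint(after),
            timestamp=datetime.now(timezone.utc).isoformat(),
        )
        self.records.append(entry)
        return entry

    def trace(self, output: bytes) -> List[ProvenanceRecord]:
        """Walk backwards from an output to the steps that produced it."""
        by_output = {r.output_hash: r for r in self.records}
        path: List[ProvenanceRecord] = []
        seen = set()
        current = fingerprint(output)
        # Stop when no earlier step is known or a hash repeats (no-op step).
        while current in by_output and current not in seen:
            seen.add(current)
            step = by_output[current]
            path.append(step)
            current = step.input_hash
        return list(reversed(path))


# Usage: one Web service and one human expert touch the data in turn.
store = ProvenanceStore()
raw = b"temperature=21.4"
cleaned = b"temperature=21"
approved = b"temperature=21;approved"

store.record("rounding-service", "round value", raw, cleaned)
store.record("expert:alice", "manual approval", cleaned, approved)

path = store.trace(approved)
print([step.actor for step in path])  # ['rounding-service', 'expert:alice']
```

Note that nothing in this sketch prevents whoever maintains the store from editing or dropping entries in store.records after the fact. That is exactly the trust problem described above: the trace is only as trustworthy as the store and its maintainers.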

In our next blog entry, we will see how the Blockchain can help to solve the trust issue.


web services data provenance blockchain

Svetoslav Videnov

I am a master's student in software engineering at TU Wien. My research interests are distributed systems and microservice architectures. I am currently working at TU Wien as a research assistant in the distributed systems group.

My master's thesis aims to combine the advantages of the blockchain with data provenance. The blockchain is a distributed ledger that allows data to be persisted in an unchangeable way. Data provenance is an approach to tracking what happened to data, which makes it possible to build trust in that data.