Supervised Distributed Computing

We introduce a new framework for distributed computing that extends and refines the standard master-worker approach of scheduling multi-threaded computations. In this framework, there are different roles: a supervisor, a source, a target, and a collection of workers. Initially, the source stores some instance I of a computational problem, and at the end, the target is supposed to store a correct solution S(I) for that instance. We assume that the computation required for S(I) can be modeled as a directed acyclic graph G=(V,E), where V is a set of tasks and (v,w) ∈ E if and only if task w needs information from task v in order to be executed. Given G, the role of the supervisor is to schedule the execution of the tasks in G by assigning them to the workers.

If all workers are honest, information can be exchanged between the workers, and the workers have access to the source and target, the supervisor only needs to know G to successfully schedule the computations. This means that the supervisor does not have to handle any data itself like in standard master-worker approaches, which has the tremendous benefit that tasks can be run massively in parallel in large distributed environments without the supervisor becoming a bottleneck. We also consider a setting, where a constant fraction of the workers is adversarial. Interestingly, we show that under certain assumptions a data-agnostic scheduling approach works in an adversarial setting without (asymptotically) increasing the work required for communication and computations.

Research Objectives

Overview of a supervised distributed computation

The goal of the proposed project is to demonstrate the power of supervised distributed computations under malicious behavior. The research within this project is divided into three parts:

Families of task graphs
Specific problems
Extensions

The objective of part 1 is to investigate important families of task graphs. We are particularly interested in the case that the task graph forms a list or a general DAG. Given some black-box verification mechanisms for the tasks, we intend to provide bounds for the runtime as well as the work the supervisor and the peers need to invest in order to execute the tasks in the given task graph.

The objective of part 2 is to come up with supervised solutions for specific problems. In the following, we list several problems that seem promising to consider in our model:

Matrix Multiplication
Sorting
LP-type probles
Problems considered in the k-machine model
Problems considered in the NCC model
Problems with efficient Divide-and-Conquer solutions

Parts 1 and 2 focus on supervised distributed computation under the assumption that the fraction of adversarial peers is sufficiently small. The objective of part 3 is to look beyond that. First, we will consider the case where the adversarial peers are in the majority. This reflects the situation that an attacker might start Sybil attacks. Second, we will consider the problem of emulating a supervised distributed computation in a pure peer-to-peer setting. Recall that if the fraction of adversarial peers is sufficiently small, it is possible to group the peers into quorums of logarithmic size, where each quorum has a sufficiently small fraction of adversarial peers. Thus it would be possible to emulate a reliable supervisor with a single quorum or a collection of quorums for better scalability.

Publications

Supervised Distributed Computing

J. Augustine, C. Scheideler, J. Werthmann, in: Lecture Notes in Computer Science, Springer Nature Switzerland, Cham, 2025.

DOI

Alle Publikationen anzeigen