The goal of this project is to create a distributed architecture with a single scheduler and multiple workers. The scheduler accepts HTTP requests and translates them into gRPC requests to the workers, so any application outside the gRPC cluster can trigger jobs on the cluster, including distributed jobs, by making HTTP requests.
The scheduler is the gateway to the worker cluster. Requests reach the scheduler over HTTP, are translated into gRPC requests, and are proxied to the corresponding workers. The scheduler also maintains a record of known workers.
General Overview
- Support a single scheduler and multiple workers.
- The scheduler must be aware of every worker's state (this requires workers to register with the scheduler and to deregister when going offline).
- The scheduler and the workers authenticate each other via SSL.
- Both the scheduler and the workers expose gRPC APIs.
The scheduler gRPC API supports:
- Register a worker
- Deregister a worker
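The register and deregister operations imply that the scheduler keeps some record of known workers. A minimal in-memory sketch of such a record is shown here; the names `WorkerRecord`, `WorkerRegistry`, `lookup`, and the `host:port` address format are assumptions for illustration, not part of the project. The gRPC handlers themselves are omitted; they would simply call into an object like this.

```python
import threading
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class WorkerRecord:
    worker_id: str
    address: str  # assumed "host:port" of the worker's gRPC endpoint


class WorkerRegistry:
    """Hypothetical in-memory record of known workers, guarded by a lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._workers: Dict[str, WorkerRecord] = {}

    def register(self, worker_id: str, address: str) -> WorkerRecord:
        record = WorkerRecord(worker_id, address)
        with self._lock:
            # Registering an already known ID overwrites the old record.
            self._workers[worker_id] = record
        return record

    def deregister(self, worker_id: str) -> bool:
        # Returns True if the worker was known, False otherwise.
        with self._lock:
            return self._workers.pop(worker_id, None) is not None

    def lookup(self, worker_id: str) -> Optional[WorkerRecord]:
        with self._lock:
            return self._workers.get(worker_id)
```

The lock matters because a gRPC server typically handles calls on several threads, so registrations and lookups can overlap.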
The worker gRPC API supports:
- Start a job
- Stop a job
- Return the status of a job
- Return a stream of output for a running job
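The four worker operations can be sketched on top of local subprocesses. This is only an illustration under assumptions: the `JobManager` class, its method names, and the status strings are hypothetical, jobs are assumed to be ordinary OS processes, and the gRPC service layer that would wrap these methods is left out.

```python
import subprocess
import uuid
from typing import Dict, Iterator, List


class JobManager:
    """Hypothetical worker-side job table backed by subprocess.Popen."""

    def __init__(self) -> None:
        self._jobs: Dict[str, subprocess.Popen] = {}

    def start(self, command: List[str]) -> str:
        # Launch the command and return a generated job ID.
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the output stream
            text=True,
        )
        return job_id

    def stop(self, job_id: str) -> None:
        proc = self._jobs[job_id]
        if proc.poll() is None:  # still running
            proc.terminate()
        proc.wait()

    def status(self, job_id: str) -> str:
        code = self._jobs[job_id].poll()
        if code is None:
            return "running"
        return "succeeded" if code == 0 else f"exited({code})"

    def stream_output(self, job_id: str) -> Iterator[str]:
        # Yield output line by line until the process closes its stdout.
        proc = self._jobs[job_id]
        for line in proc.stdout:
            yield line.rstrip("\n")
        proc.wait()
```

A generator is a natural fit for the output stream because a server-streaming gRPC handler in Python is itself written as a generator that yields one message at a time.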
The scheduler also has an HTTP API:
- Start a job on a specific worker (with worker ID, command and path to the job)
- Stop a job on a specific worker (with worker ID and job ID)
- Query a job on a specific worker (with worker ID and job ID)
- Return the output stream of a job on a specific worker (with worker ID and job ID)
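One way to picture the HTTP-to-gRPC translation is as a routing table that maps each HTTP request to a worker RPC plus the IDs it needs. The sketch below is hypothetical throughout: the URL paths, the HTTP methods, and the RPC names (`StartJob`, `StopJob`, `QueryJob`, `StreamJob`) are assumptions chosen for illustration, since the project text does not specify them.

```python
import re
from typing import NamedTuple, Optional


class WorkerCall(NamedTuple):
    rpc: str                 # hypothetical name of the worker RPC to invoke
    worker_id: str
    job_id: Optional[str]    # None when the request does not carry a job ID


_WORKER = r"/workers/(?P<worker_id>[^/]+)"
_JOB = _WORKER + r"/jobs/(?P<job_id>[^/]+)"

# (HTTP method, path pattern, worker RPC). All paths and names are assumed.
ROUTES = [
    ("POST", re.compile("^" + _WORKER + "/jobs$"), "StartJob"),
    ("DELETE", re.compile("^" + _JOB + "$"), "StopJob"),
    ("GET", re.compile("^" + _JOB + "$"), "QueryJob"),
    ("GET", re.compile("^" + _JOB + "/output$"), "StreamJob"),
]


def translate(method: str, path: str) -> Optional[WorkerCall]:
    """Map an HTTP request to the worker RPC it should be proxied to."""
    for route_method, pattern, rpc in ROUTES:
        match = pattern.match(path)
        if route_method == method and match:
            groups = match.groupdict()
            return WorkerCall(rpc, groups["worker_id"], groups.get("job_id"))
    return None  # no matching route; the scheduler would answer 404
```

For example, `translate("GET", "/workers/w1/jobs/j9/output")` resolves to the streaming RPC for job `j9` on worker `w1`; the scheduler would then look up the worker's address in its records and issue the gRPC call.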