Task monitor daemon process may limit scalability #194

**Is your feature request related to a problem? Please describe.**

Currently, updating the run status in the database involves sending a Celery signal that is picked up by a single task monitor daemon process spawned by the main application. Status updates can be numerous when many workflow runs are managed in parallel, and they may also contain long log messages, so this architecture may become a serious bottleneck for scaling up run throughput.

**Describe the solution you'd like**

To improve scalability, status updates could be handled by worker processes instead. A status update could be posted to the broker queue and picked up by a worker rather than the task monitor in order to update the database. To ensure that ongoing workflow runs do not block status updates (effectively causing the service to be stuck indefinitely), a dedicated worker pool of at least size would need to be set aside for this purpose.
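To make the proposal concrete, here is a minimal sketch of the idea using only the Python standard library. It is not the project's code: the in-memory queues stand in for the broker, the dictionary stands in for the database, and all names (`process_runs`, `run_worker`, `status_worker`) are hypothetical. It only illustrates that status updates are consumed by their own pool, so they are never queued behind long-running workflow runs.

```python
import queue
import threading


def process_runs(run_ids, n_run_workers=2, n_status_workers=1):
    """Simulate run workers posting status updates to a dedicated pool."""
    run_queue = queue.Queue()     # stand-in for the broker queue of runs
    status_queue = queue.Queue()  # stand-in for a separate status queue
    database = {}                 # stand-in for the run status table
    db_lock = threading.Lock()

    def run_worker():
        # Executes runs and posts status updates instead of writing them.
        while True:
            run_id = run_queue.get()
            if run_id is None:  # sentinel: no more runs
                break
            for state in ("RUNNING", "COMPLETE"):
                status_queue.put((run_id, state))

    def status_worker():
        # Dedicated worker: only ever handles status updates.
        while True:
            item = status_queue.get()
            if item is None:  # sentinel: no more updates
                break
            run_id, state = item
            with db_lock:
                database[run_id] = state

    run_threads = [threading.Thread(target=run_worker)
                   for _ in range(n_run_workers)]
    status_threads = [threading.Thread(target=status_worker)
                      for _ in range(n_status_workers)]
    for thread in run_threads + status_threads:
        thread.start()

    for run_id in run_ids:
        run_queue.put(run_id)
    for _ in run_threads:
        run_queue.put(None)
    for thread in run_threads:
        thread.join()

    # All updates are queued by now; stop the status pool after them.
    for _ in status_threads:
        status_queue.put(None)
    for thread in status_threads:
        thread.join()
    return database
```

The sketch defaults to a single status worker because, with several of them, two updates for the same run could be applied out of order; a real implementation would need to account for that. In Celery, the separation might be achieved by routing a status update task to its own queue via the `task_routes` setting and starting a worker that consumes only that queue (`-Q`), but the task and queue names would be up to the implementation.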

**Describe alternatives you've considered**

As an alternative to setting aside a dedicated worker pool for status updates, status updates could also be handled directly by the worker processes that are already handling the workflow runs.
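For comparison, the alternative could look roughly like the following standard-library sketch, in which each worker writes status updates itself and no separate queue or pool is involved. As before, the dictionary is only a stand-in for the database and every name (`process_runs_inline`, `update_status`) is hypothetical.

```python
import threading


def process_runs_inline(run_ids, n_workers=2):
    """Simulate workers that update the run status themselves."""
    database = {}  # stand-in for the run status table
    db_lock = threading.Lock()

    def update_status(run_id, state):
        # Written directly by whichever worker is handling the run.
        with db_lock:
            database[run_id] = state

    def worker(assigned_run_ids):
        for run_id in assigned_run_ids:
            update_status(run_id, "RUNNING")
            # The workflow itself would execute here.
            update_status(run_id, "COMPLETE")

    # Distribute the runs round-robin across the workers.
    chunks = [run_ids[i::n_workers] for i in range(n_workers)]
    threads = [threading.Thread(target=worker, args=(chunk,))
               for chunk in chunks]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return database
```

Because a run's updates are written by the one worker that owns the run, they are applied in order without extra coordination. The trade-off is that every worker needs database access.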

**Additional context**

It is important that the chosen solution will be conceptually compatible with a future callback mechanism for status updates (see #57, https://github.com/ga4gh/task-execution-schemas/issues/121, https://github.com/ga4gh/workflow-execution-service-schemas/issues/133 & https://github.com/ga4gh/cloud-interop-testing/issues/98#issuecomment-645485554).

