Sharding/clustering : ISH-141

Created by Laura Hausmann

Updated by Laura Hausmann

This issue lists the known problems with clustering, that is, running multiple instances behind a load balancer, or even just running multiple queue processors. More remain to be investigated.

- The queue system's job retry handler (RecoverOrPrepareForExitAsync) doesn't handle crashes properly, especially with multiple queue processor workers
- WebSocket connections only receive events processed by the node they're connected to
- So far, there is no option to start a node in pure "queue processor" mode
- Cron tasks shouldn't be executed by every worker, and should be executed in a way that's resilient to a worker going down
- Push notifications are duplicated once per full/web worker (queue workers don't contribute to this problem); maybe push notification delivery should be a background-task job in cluster mode?
- Duplicate work prevention (AsyncKeyedLocker) is not easily adaptable to multiple workers
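
To illustrate the cron and duplicate-work items above: an in-process keyed lock only excludes callers inside one process, so a cluster needs the lock state in a store that all workers share. The following is a minimal sketch of one common approach, a lease with an expiry time, and not what Iceshrimp.NET implements. The `leases` table, the function names, and the use of SQLite as a stand-in for the shared store are all assumptions made for illustration.

```python
import sqlite3


def init_store(conn: sqlite3.Connection) -> None:
    # Hypothetical table: one row per lock key, shared by every worker.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS leases ("
        "key TEXT PRIMARY KEY, owner TEXT NOT NULL, expires_at REAL NOT NULL)"
    )


def try_acquire(conn: sqlite3.Connection, key: str, owner: str,
                ttl: float, now: float) -> bool:
    """Return True if `owner` holds the lease on `key` after this call."""
    with conn:  # commits on success, rolls back on an unhandled exception
        # Renew our own lease, or take over one whose holder let it expire
        # (for example because that worker crashed).
        cur = conn.execute(
            "UPDATE leases SET owner = ?, expires_at = ? "
            "WHERE key = ? AND (expires_at <= ? OR owner = ?)",
            (owner, now + ttl, key, now, owner),
        )
        if cur.rowcount == 1:
            return True
        try:
            # No usable row was updated: either nobody holds the key yet,
            # or another worker holds a live lease (the insert then fails).
            conn.execute(
                "INSERT INTO leases (key, owner, expires_at) VALUES (?, ?, ?)",
                (key, owner, now + ttl),
            )
            return True
        except sqlite3.IntegrityError:
            return False


def release(conn: sqlite3.Connection, key: str, owner: str) -> None:
    # Only the current holder may release the lease.
    with conn:
        conn.execute("DELETE FROM leases WHERE key = ? AND owner = ?", (key, owner))
```

A worker would call `try_acquire` before running a cron task or a deduplicated job and skip the work when it returns False. Because the lease expires, a task held by a crashed worker becomes available again after `ttl` seconds, which is the resilience property the cron item asks for.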

Laura Hausmann

Type: Feature → Epic

Laura Hausmann

State: Triaged → Won't fix

Laura Hausmann

Target version: v2025.1

Tyler

Maybe instead of the traditional clustering setup, we could just have remote runners for image conversion (libvips) and ffmpeg processes. Those would be the most CPU-demanding tasks, and being able to run them remotely would help scaling immensely.

For instances that don't use S3, you'd need a way to send the content to the remote runner and then receive the result back on the source server.
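
A rough sketch of that round trip, assuming a plain HTTP POST between the source server and the runner. Everything here is hypothetical: the handler, the `/convert` path, and the function names are invented for illustration, and `convert` is a placeholder that reverses the bytes where a real runner would invoke libvips or ffmpeg.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def convert(data: bytes) -> bytes:
    # Placeholder for the CPU-heavy step (libvips or ffmpeg on a real runner).
    return data[::-1]


class RunnerHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the uploaded content, convert it, and send the result back
        # in the response body, so no shared storage (such as S3) is needed.
        length = int(self.headers.get("Content-Length", "0"))
        result = convert(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(result)))
        self.end_headers()
        self.wfile.write(result)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start_runner(host: str = "127.0.0.1", port: int = 0) -> HTTPServer:
    # Port 0 lets the OS pick a free port; read it from server.server_address.
    server = HTTPServer((host, port), RunnerHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def convert_remotely(runner_url: str, data: bytes, timeout: float = 30.0) -> bytes:
    # Called on the source server: upload the content, wait for the result.
    request = urllib.request.Request(
        runner_url,
        data=data,
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )
    # Talk to the runner directly, ignoring any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=timeout) as response:
        return response.read()
```

A real setup would also need authentication between the server and its runners, size limits, and a fallback to local processing when no runner is reachable; none of that is shown here.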

Laura Hausmann

Duplicated by: ISH-467 Multi-process web workers

Project: Iceshrimp.NET
Priority: Normal
Type: Epic
State: Won't fix
Assignee: Laura Hausmann
Subsystem: Backend
Component: Core services
Target version: Unscheduled
Released in version: Unreleased