(This can be applied to more categories than just "federation" and "not-federation", but to illustrate a prominent problem, I'll only talk about those two.)
Laura mentioned the federation semaphore, which I'll assume is just a simple "only 40 active federation requests at once" mechanism. That wouldn't be an entirely "efficient" way of dealing with federation traffic, though.
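
For reference, the plain semaphore I'm assuming would look roughly like this. It's only a sketch: `FEDERATION_LIMIT`, `federation_semaphore` and `handle_federation_request` are made-up names, and I'm using the stdlib `threading` module purely for illustration, not whatever the server actually uses.

```python
import threading

# Assumed cap, taken from the "only 40 active federation requests" guess.
FEDERATION_LIMIT = 40

# Hypothetical global semaphore guarding all federation work.
federation_semaphore = threading.BoundedSemaphore(FEDERATION_LIMIT)


def handle_federation_request(handler):
    """Run `handler` while holding one of the federation slots."""
    with federation_semaphore:
        return handler()
```

The cap here is fixed: it doesn't look at how much other traffic the server is handling at that moment.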
Inspired by the traffic prioritisation that I applied on a large Mastodon server, it's also possible to implement a "bucket" system, similar to semaphores, but:
- where multiple categories can take tokens from the same bucket
- different rules apply to the different categories
In a "regular" and "federation" traffic situation, the "regular" traffic can take tokens from the bucket with no limit on how many it holds at once.
The "federation" traffic can only take a token if X or fewer tokens are currently taken.
This lets user traffic override federation traffic during an influx, without overloading the system.
The "X" here is a tunable variable, used to find the average maximum number of concurrent requests the system can take (let's say it's 2x the CPU count).
To prevent complete outages, a certain number of tokens (1 or 3 or so) stay reserved for the federation category, so that federation processing doesn't halt completely.
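
A minimal sketch of the bucket described above, to make the rules concrete. Everything here is hypothetical (the `SharedBucket` class, the `threshold` and `reserved_federation` parameters, the category names), and it uses a non-blocking `try_acquire` rather than a waiting queue, which is an assumption on my part.

```python
import threading


class SharedBucket:
    """One bucket shared by 'regular' and 'federation' traffic."""

    def __init__(self, threshold: int, reserved_federation: int = 1) -> None:
        self.threshold = threshold  # the tunable "X"
        self.reserved_federation = reserved_federation  # tokens kept for federation
        self.taken = {"regular": 0, "federation": 0}
        self._lock = threading.Lock()

    def try_acquire(self, category: str) -> bool:
        """Take a token for `category`; return False if the rules forbid it."""
        with self._lock:
            if category == "federation":
                total = sum(self.taken.values())
                # Rule 1: federation may proceed while X or fewer tokens are taken.
                within_threshold = total <= self.threshold
                # Rule 2: federation always keeps its small reserve.
                within_reserve = self.taken["federation"] < self.reserved_federation
                if not (within_threshold or within_reserve):
                    return False
            # 'regular' traffic is never refused.
            self.taken[category] += 1
            return True

    def release(self, category: str) -> None:
        """Give a token back once the request is done."""
        with self._lock:
            self.taken[category] -= 1
```

With `SharedBucket(threshold=2, reserved_federation=1)` and five regular requests in flight, one federation request still gets through (the reserve) and the next one is refused until regular traffic drains.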
----
This can be applied to more than just "regular" and "federation" traffic; it can be subdivided into "fetching federation" and "pushing federation", or "fetching user requests" and "pushing user requests". The last of these, in the traffic-prioritisation method I applied that inspired this, has absolute priority over everything else: a user pressing favourite or posting something can feel "slow" incredibly quickly, so it must be the next request served when the server is under load.
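
A rough sketch of the "next request served" part, assuming waiting requests sit in a priority queue. The `PRIORITY` table and the `PriorityWaitQueue` class are made up for illustration. Only "pushing user requests" being first comes from what I described; the order of the other three categories is just a guess.

```python
import heapq
import itertools

# Hypothetical ordering: lower number is served first.
# Only "user_push" being on top is given; the rest is an assumption.
PRIORITY = {
    "user_push": 0,
    "user_fetch": 1,
    "federation_push": 2,
    "federation_fetch": 3,
}


class PriorityWaitQueue:
    """Requests waiting for a free token, ordered by category priority."""

    def __init__(self) -> None:
        self._heap = []
        # Tie-breaker so requests of the same category stay first-in, first-out.
        self._counter = itertools.count()

    def enqueue(self, category: str, request) -> None:
        entry = (PRIORITY[category], next(self._counter), category, request)
        heapq.heappush(self._heap, entry)

    def next_request(self):
        """Return (category, request) for the next request to serve, or None."""
        if not self._heap:
            return None
        _, _, category, request = heapq.heappop(self._heap)
        return category, request
```

Whenever a token frees up, the server would call `next_request()`, so a waiting favourite or post always jumps ahead of any queued federation work.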