ISH-767
Created by Mia Herkt, Mar 23 2025, 17:59 GMT+1
Updated by Mia Herkt, Nov 25 2025, 01:55 GMT+1
Evaluate: Perform all media processing (including images) via FFmpeg libraries

Pros:

  • Widely used, available basically everywhere
  • Fast
  • Covers image, video and audio formats, including metadata
  • Exposes things like rotation in stream side data so it can be handled explicitly
  • Lots of options for filtering, including GPU acceleration where available
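
To illustrate the filtering point, here is a rough sketch of how one decoded input could feed two scaled outputs through a single filter graph, shown as an ffmpeg command line rather than library calls. The function name, the output widths and the choice of rgb24 as the shared intermediate format are placeholders for illustration, not anything decided in this issue.

```python
def build_ffmpeg_args(src, thumb_out, web_out, thumb_width=320, web_width=2048):
    """Build an ffmpeg command that decodes `src` once and writes two images.

    Hypothetical helper: the widths and the rgb24 intermediate format are
    assumptions made for this sketch only.
    """
    graph = (
        # Convert the pixel format once, then duplicate the result.
        "[0:v]format=rgb24,split=2[t][w];"
        # Each branch only does its own scaling (-2 keeps the aspect ratio
        # and rounds the height to an even number).
        f"[t]scale={thumb_width}:-2[thumb];"
        f"[w]scale={web_width}:-2[web]"
    )
    return [
        "ffmpeg", "-i", src,
        "-filter_complex", graph,
        "-map", "[thumb]", "-frames:v", "1", thumb_out,
        "-map", "[web]", "-frames:v", "1", web_out,
    ]
```

The resulting list could be handed to something like subprocess.run; with the libraries, the same graph description would be parsed by libavfilter instead.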

Cons:

  • Somewhat poorly documented
  • API changes rather often
  • No good high-level bindings

Other thoughts:

  • Many MP4/ISOBMFF files are not muxed for streaming, so it is worth considering remuxing them to move the moov atom to the start of the file (like passing -movflags +faststart to ffmpeg). Clients can then start decoding immediately instead of making extra range requests or downloading the entire file.
  • When generating multiple outputs (e.g. thumbnail + webpublic), because of how much control FFmpeg gives us over the filter graph, we can most likely avoid having to perform some work like colorspace conversion twice.
  • libavfilter has support for image classification via DNNs, with results exposed through side data. This is very useful for moderation because it can be used to automatically flag potentially unwanted media for review. With 0x0.st I have been relying on automatic classification so much that I now consider it essential.
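
For the moov atom point above, a minimal sketch of how one might detect whether a file needs remuxing: walk the top-level ISOBMFF boxes and check whether moov comes before mdat. The function names are hypothetical, and the sketch works on bytes already in memory for brevity; a real check would more likely seek through the file and read only the box headers.

```python
import struct


def top_level_boxes(data: bytes):
    """Yield (type, offset, size) for each top-level ISOBMFF box in `data`."""
    offset = 0
    while offset + 8 <= len(data):
        # Box header: 32-bit big-endian size followed by a 4-byte type.
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box extends to the end of the file.
            size = len(data) - offset
        if size < header:
            break  # malformed box, stop rather than loop forever
        yield box_type.decode("latin-1"), offset, size
        offset += size


def is_faststart(data: bytes) -> bool:
    """Return True if moov appears before mdat at the top level."""
    for box_type, _, _ in top_level_boxes(data):
        if box_type == "moov":
            return True
        if box_type == "mdat":
            return False
    return False
```

Files for which this returns False would be the candidates for remuxing.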

libmpv is another option if maintaining FFmpeg wrappers is too much work. Transcoding is not its primary use case, but I’ve been using it for exactly that via a LuaJIT wrapper that took a couple of hours to write. It’s quite high-level and doesn’t expose some features that might be of interest, but on the plus side the API is very easy to use and also very stable (I haven’t seen breakage in the parts I use in I don’t even know how many years).

Project: Iceshrimp.NET
Priority: Normal
Type: Feature
State: Untriaged
Assignee: Laura Hausmann
Subsystem: Backend
Component: No component
Target version: Unscheduled
Released in version: Unreleased