I have four different pipelines I use to generate video. This is the simplest one, which I actually call the ‘Simple Pipeline’.
Oddly enough, there are no videos on the channel that use the simple pipeline. There is a single video I posted on my personal channel:
Step 1 - MIDI
If my source file is not already a MIDI file, I use MuseScore to convert the source from MuseScore’s native .mscz format to MIDI.
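For reference, a conversion like this can be scripted from MuseScore's command line. The sketch below is an assumption about how that might look, not necessarily the exact invocation used here: the helper name `musescore_export_command` is hypothetical, and the executable name (`mscore` by default) varies by platform and MuseScore version.

```python
from pathlib import Path


def musescore_export_command(source: str, executable: str = "mscore") -> list[str]:
    """Build a MuseScore command line that exports `source` to MIDI.

    The executable name is an assumption; depending on the install it may be
    mscore, musescore, mscore3, or a full path to the application binary.
    MuseScore chooses the export format from the extension of the -o target.
    """
    midi_path = Path(source).with_suffix(".mid")
    return [executable, "-o", str(midi_path), source]


# To actually run the export:
#   import subprocess
#   subprocess.run(musescore_export_command("song.mscz"), check=True)
print(musescore_export_command("song.mscz"))
```

Building the command as a list keeps file names with spaces intact when it is handed to `subprocess.run`.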
Step 2 - FLAC audio
MIDI files do not store audio, just instructions to play specific notes on specific instruments. I use TiMidity++ to convert the MIDI file into a FLAC file containing real audio.
FLAC (the Free Lossless Audio Codec) is somewhat akin to MP3, but lossless. For “normal” audio, FLAC files are really big, but since the audio generated from MIDI files is unnaturally consistent, it tends to compress very well.
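A minimal sketch of how the MIDI-to-FLAC step could be scripted. The helper name `timidity_flac_command` is hypothetical, and the sketch assumes a TiMidity++ build compiled with FLAC support, since the `-OF` output mode is only available in such builds. Soundfont and configuration choices are left at whatever the local TiMidity++ setup provides.

```python
def timidity_flac_command(midi_path: str, flac_path: str) -> list[str]:
    """Build a TiMidity++ command line that renders a MIDI file to FLAC.

    -OF selects FLAC as the output mode (requires a FLAC-enabled build);
    -o names the output file.
    """
    return ["timidity", midi_path, "-OF", "-o", flac_path]


# To actually render the audio:
#   import subprocess
#   subprocess.run(timidity_flac_command("song.mid", "song.flac"), check=True)
print(timidity_flac_command("song.mid", "song.flac"))
```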
Step 3 - Visualization
I use the previously mentioned MIDI visualizer to generate the visualization. It takes a settings file and a MIDI file as input and produces a series of PNG files as output.
The settings file is mostly the same every time, but I do tweak the color selection where I think it would be appropriate.
The output can be quite… extreme. The visualization runs at sixty frames per second, which means I get a lot of 4K images. A ten-minute video produces 36,000 of them, for instance.
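The frame count is just duration times frame rate. A small sketch of the arithmetic, with a hypothetical helper name:

```python
def frame_count(minutes: float, fps: int = 60) -> int:
    """Number of still images needed for a video of the given length."""
    seconds = minutes * 60
    return round(seconds * fps)


print(frame_count(10))  # 10 min * 60 s/min * 60 frames/s = 36000
```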
Step 4 - Final video
Finally, I bring it all together using FFmpeg. It takes the source images and source audio and generates a video file. The format details are largely dictated by YouTube’s suggestions:
- Video - H.264
- Audio - AAC
- Container - MP4
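A sketch of an FFmpeg invocation that matches the formats listed above. Several details here are assumptions rather than the settings actually used: the helper name `ffmpeg_encode_command` is hypothetical, the frame naming pattern `frames/%05d.png` depends on how the visualizer numbers its output, and `-pix_fmt yuv420p` is added only because it is a common choice for broad player compatibility.

```python
def ffmpeg_encode_command(
    frame_pattern: str,
    audio_path: str,
    output_path: str,
    fps: int = 60,
) -> list[str]:
    """Build an FFmpeg command: PNG sequence + audio -> H.264/AAC in MP4."""
    return [
        "ffmpeg",
        "-framerate", str(fps),   # rate at which the image sequence is read
        "-i", frame_pattern,      # numbered PNG frames (pattern is assumed)
        "-i", audio_path,         # rendered FLAC audio
        "-c:v", "libx264",        # H.264 video
        "-pix_fmt", "yuv420p",    # assumption: widely compatible pixel format
        "-c:a", "aac",            # AAC audio
        output_path,              # .mp4 extension selects the MP4 container
    ]


# To actually encode:
#   import subprocess
#   subprocess.run(ffmpeg_encode_command("frames/%05d.png", "song.flac", "song.mp4"), check=True)
print(" ".join(ffmpeg_encode_command("frames/%05d.png", "song.flac", "song.mp4")))
```

The `-framerate` option is placed before the first `-i` because it applies to the image sequence input, not to the output.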