My open-source machine learning toolbox

Introduction

I recently got interested in what's possible with machine learning programs, and it has been an exciting journey. Let me share a few programs I added to my toolbox.

They all work well on NixOS. Upscayl and whisper are packaged in nixpkgs; the others need specific instructions to get running. It's not that hard, but it may not be accessible to everyone.

Whisper

This program analyzes the audio track of an audio or video file and makes a transcript of it. It supports many languages; I tried it with English, French and Japanese, and it worked very reliably.

Not only does it create a transcript text file, it also generates a subtitles (.srt) file, so you can create video subtitles automatically. It also has a translation function, handled by the model itself rather than by an external service, which gives you the transcript in English.
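To make the subtitles file less mysterious, here is a minimal sketch of what an .srt file contains: numbered blocks, each with a time range and a line of text. It assumes the transcript is available as a list of segments with "start", "end" (in seconds) and "text" keys, which is the shape whisper's Python API returns; the function names are mine, not whisper's.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the HH:MM:SS,mmm stamp used by .srt files."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Turn a list of {"start", "end", "text"} dicts into SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # Each block already ends with a newline, so joining with "\n"
    # leaves one blank line between consecutive subtitles.
    return "\n".join(blocks)


# Hypothetical segments, only here to show the output format.
example = [
    {"start": 0.0, "end": 2.5, "text": " Hello"},
    {"start": 2.5, "end": 4.0, "text": " World"},
]
print(segments_to_srt(example))
```

In practice whisper writes this file for you; the sketch is only useful if you want to post-process the segments yourself.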

It's quite slow on a CPU, but it definitely works. Using a GPU gives an 80 times speed boost.
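For reference, here is a small sketch of how a whisper command line could be assembled from Python. The options used (--model, --language, --task, --device, --output_format) are from the whisper command-line tool; the helper function and the file name "interview.mp4" are made up for the example, and the function only builds the command without running it.

```python
import shlex


def whisper_command(media_file, model="base", language=None,
                    translate=False, device=None, output_format="srt"):
    """Build (but do not run) an argument list for the whisper CLI."""
    cmd = ["whisper", media_file, "--model", model,
           "--output_format", output_format]
    if language:
        cmd += ["--language", language]
    if translate:
        # Ask for an English translation instead of a plain transcript.
        cmd += ["--task", "translate"]
    if device:
        # For example "cuda" to use a GPU, "cpu" otherwise.
        cmd += ["--device", device]
    return cmd


# Hypothetical input file, French audio, transcribed on a GPU.
cmd = whisper_command("interview.mp4", model="small",
                      language="French", device="cuda")
print(shlex.join(cmd))
```

Passing the resulting list to subprocess.run(cmd, check=True) would start the transcription, assuming whisper is installed and in the PATH.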

It requires a weight file (a model) to work, which exists in different sizes: tiny, base, small, medium and large. All but large also come in an English-only variant. The weights are downloaded automatically on demand into the ~/.cache/whisper/ directory.
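If you wonder which weights are already on disk and how much space they take, a sketch like the following can list them. It assumes the downloaded weights are stored as .pt files directly in the cache directory; the function name is mine.

```python
from pathlib import Path


def cached_weights(cache_dir=None):
    """Return {file name: size in MiB} for every .pt file in the cache."""
    if cache_dir is None:
        # Default location mentioned above.
        cache_dir = Path.home() / ".cache" / "whisper"
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return {}
    return {
        path.name: round(path.stat().st_size / 2**20, 1)
        for path in sorted(cache_dir.glob("*.pt"))
    }


for name, size in cached_weights().items():
    print(f"{name}: {size} MiB")
```

Deleting a file from that directory should simply make whisper download it again the next time that model is requested.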

whisper GitHub project page

Stable-diffusion

This program generates pictures from a sentence, and it's actually very effective. You need a weight file, which is like a database of how to interpret the words in the sentence.

You need an account on https://huggingface.co/CompVis/stable-diffusion-v-1-4-original to download the free weight file (4 GB).

a man on a horse, black and white

Solid Snakes on a unicorn in a cyberpunk style
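The two captions above are the sentences used to generate the pictures. As a rough sketch of how such a sentence is handed to the program, the snippet below assumes the CompVis reference repository, whose README runs scripts/txt2img.py with --prompt and expects the weight file at models/ldm/stable-diffusion-v1/model.ckpt. Other forks may differ, and the helper names are hypothetical.

```python
from pathlib import Path

# Where the CompVis scripts look for the weights by default (assumption
# based on that repository's README; adjust for other forks).
DEFAULT_CKPT = Path("models/ldm/stable-diffusion-v1/model.ckpt")


def weights_present(repo_dir="."):
    """Tell whether the weight file is where the scripts expect it."""
    return (Path(repo_dir) / DEFAULT_CKPT).is_file()


def txt2img_command(prompt, seed=None, n_samples=1):
    """Build (but do not run) a txt2img.py command for one prompt."""
    cmd = ["python", "scripts/txt2img.py", "--prompt", prompt,
           "--plms", "--n_samples", str(n_samples)]
    if seed is not None:
        # Same prompt and same seed should reproduce the same picture.
        cmd += ["--seed", str(seed)]
    return cmd


print(txt2img_command("a man on a horse, black and white", seed=7))
```

The command is meant to be run from the root of the repository checkout, once weights_present() returns True.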

stable-diffusion GitHub project page

stable-diffusion GitHub project page with openvino support for CPU based rendering

DeOldify.NET

This program can be used to colorize a picture. The weights are provided. This works well without a GPU.

I tried it on manga. It works to some extent: it adds some shading and marks distinct things with colors, but the colorization isn't reliable and the colors may be weird. Still, it improves readability for me 👍🏻.

a man on a horse, black and white but colorized with DeOldify

DeOldify.NET GitHub project page

Upscayl

This program upscales a picture to 4 times its resolution. The result can be very impressive, but in some situations it gives a "plastic" and unnatural feeling.

I've been very impressed by it: I was able to improve some old pictures taken with a poor phone.
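To give an idea of what "4 times" means in practice, here is a tiny worked example. It assumes the factor applies to both width and height, so the number of pixels grows 16 times; the 640x480 input size is just an illustration.

```python
def upscaled_size(width, height, factor=4):
    """Return the (width, height) of a picture after upscaling."""
    return width * factor, height * factor


old_w, old_h = 640, 480
new_w, new_h = upscaled_size(old_w, old_h)
print(new_w, new_h)                          # 2560 1920
print((new_w * new_h) // (old_w * old_h))    # 16 times more pixels
```

This is also why the output files get much heavier than the originals.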

a man on a horse, black and white but colorized with DeOldify and upscaled with Upscayl

Upscayl GitHub project page

Going further

If you know other tools of this kind that could interest me, please share! :) Especially if it's something to colorize manga 😁.