
GEMINILOGGBOOKOBERDADAISTICUS

Minimal audio toolkit

To make music, you don't need any tools or instruments at all: think of all the music for choir, or Steve Reich's Clapping Music. But assuming you are working in a computer environment and making some variant of electronic or electroacoustic music, a common working model revolves around a digital audio workstation (DAW) with a number of plugins. The plugins are mostly effects that process sound signals in realtime. There may be MIDI instruments that generate sounds, controlled by a sequencer, perhaps displayed in piano roll notation. DAW programs tend to be big and bloated with functionality. They clutter the screen with tracks, mixer channels, numerous tiny boxes, buttons, menus and submenus. Dual screens may be a requirement for a reasonably smooth workflow. Getting something done often involves digging deep into manuals to see where everything is hiding, and trying to understand fancy terms for what often amounts to simple signal processing operations. But it doesn't have to be that way.

An opposite extreme dates back to the early days of computer music, when everything was done at the console. Some of these old programs are still available, and many of them remain immensely useful. For certain tasks, working in a shell is the most efficient way. For editing, precise waveform display is indispensable, and spectral display may be an advantage. Mixing is so much easier with graphical sliders to move up and down than by entering series of numbers representing gain factors. But most of what a modern DAW does was already being done in the pre-GUI stone age. I'll briefly describe some useful sound tools.

Csound

This is a domain-specific language which would be awkward to use for general purpose programming. Its Hello World is to produce a one-second sine at 440 Hz, and once you see how to do that you know the essentials of the language. I'm not dismissing its learning curve, which is steeper today than when the language was in its early stages (it evolved from the MUSIC N family of languages that were around in the 1960s). The evolution many programming languages suffer from is feature creep, with more and more keywords and syntactic sugar added with each iteration. The basic structure of Csound is simple: an orchestra file specifies the sound synthesis mechanism, and a score file specifies events in time. Signals of various kinds are given names, and the signals are routed between opcodes with inputs, outputs and parameters. The trouble with Csound is its ever expanding list of opcodes. I have found it sufficient to learn and use about a dozen of them, which means I don't have to look up their syntax and parameter specs each time. But now there are hundreds of opcodes which do very specialised things, including what you could achieve with combinations of more basic opcodes. When I first learnt Csound in the 1990s, many of the current language features had not yet been implemented. I don't envy those who try to learn Csound today.
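
For illustration, here is that Hello World in the classic two-file style. This is a sketch, not taken from any manual; details such as the old 16-bit amplitude scale assumed here vary between Csound versions.

```
; hello.orc -- the orchestra: one instrument playing a sine wave
sr     = 44100
kr     = 4410
ksmps  = 10
nchnls = 1

instr 1
  a1 oscili 10000, 440, 1 ; amplitude, frequency, function table 1
  out a1
endin
```

```
; hello.sco -- the score: a sine table and a single event
f1 0 16384 10 1 ; function table 1: one period of a sine wave
i1 0 1          ; play instrument 1 at time 0 for 1 second
```

Rendering it to a soundfile is then a single command: csound -o hello.wav hello.orc hello.sco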

Since Csound is not a general purpose programming language, certain things such as creating custom loop structures or using arrays of arbitrary size may be possible, but the syntax makes them rather awkward. When Csound is used for algorithmic composition, one really wants to write a score-generating program in some other language. Csound is excellent for creating the equivalent of an effect plugin that reads a soundfile and outputs a processed version. Realtime processing is possible, although I have rarely used it.
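
As a sketch of that division of labour, a few lines of Python can write the score file; this hypothetical generator assumes an instrument that reads its frequency from p4:

```python
# genscore.py -- a sketch: write a Csound score of random sine events
import random

with open("gen.sco", "w") as sco:
    sco.write("f1 0 16384 10 1\n")        # the sine table the orchestra expects
    for k in range(20):
        start = k * 0.25                  # one event every quarter second
        dur = random.uniform(0.1, 1.0)
        freq = random.choice([220, 275, 330, 440, 550])
        sco.write(f"i1 {start} {dur:.3f} {freq}\n")  # p2 start, p3 dur, p4 freq
```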

CDP

Composers Desktop Project (CDP) is a collection of hundreds of small routines that process sound files. Trevor Wishart has been the main developer, and his aesthetic as a composer often shines through. One main idea is modularity: the output of one program can be fed into another. The routines can be run from the command line, which I recommend, although two separate GUIs are also available. Modern DAWs have parameter automation, which means that you specify time-varying parameters by drawing lines between parameter values at various points in time. CDP uses something called breakpoint files: text files consisting of one column of points in time in increasing order, and one or more columns of parameter values. The routines take a list of command line arguments which may be numerical values, but in many cases you can substitute the name of a breakpoint file for a number. Arbitrary time-varying parameters greatly expand the range of possibilities.
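
For example, a breakpoint file sweeping a parameter from 1 up to 4 and back down might look like this (a made-up illustration; times in the first column, values in the second):

```
0.0   1.0
2.5   4.0
5.0   1.0
```

Passing the name of this file where a routine expects a constant turns that parameter into a time-varying one.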

CDP has many time-domain processes, including those that stretch or shrink the duration of a sound by granulation or other means. One of Wishart's specialities is the set of wavecycle processes, which segment the signal into portions delimited by zero crossings and wreak havoc on the sound in various interesting ways. There are also spectral processing routines, which unfortunately I have not been able to get working.
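
The wavecycle idea itself is easy to sketch. The following toy Python fragment (my illustration, not CDP's code) splits a signal at its upward zero crossings, after which the individual cycles can be repeated, omitted or reshuffled:

```python
import numpy as np

def wavecycles(x):
    # indices where the signal crosses zero going upward
    up = np.flatnonzero((x[:-1] < 0) & (x[1:] >= 0)) + 1
    edges = np.concatenate(([0], up, [len(x)]))
    return [x[a:b] for a, b in zip(edges[:-1], edges[1:])]

# e.g. reverse the order of the cycles and glue them back together
x = np.sin(np.linspace(0, 80, 8000)) * np.linspace(1, 0, 8000)
y = np.concatenate(wavecycles(x)[::-1])
```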

CDP is notorious for having been conceived for old or small computers. It doesn't depend on a fast computer, since nothing runs in realtime. There are even text-based mixing routines, which may be too hardcore for most users. Wishart's inspiring book Audible Design explains most of the ideas implemented in the programs. The user manual is indispensable for understanding what each routine does and what all the options and command line arguments mean. Remembering the names of all the routines, and what they do, is the main challenge with CDP.

Miscellaneous software

Audacity needs no presentation. I have mixed and mastered a few tracks with it, but would absolutely not recommend it for more complex mix projects. It has been the one and only Linux sound editor since time immemorial, and also one of the few cross-platform ones, so you have to learn to live with it. Yes, there are alternatives, some of which hardly deserve to be called sound editors, and others which aren't free and open source or available on all distributions.

FScape allows for some useful processing in the frequency domain. I haven't used it in a few years, but it is still being maintained and developed.

Praat is meant as a toolbox for phonologists, capable of synthesising vocal utterances. If you are going to use it for speech synthesis you'd have to be very patient, and intimately familiar with the anatomy of the vocal tract. This software has been around for decades; in the old days you had to write an email, politely ask for a copy, and explain why you wanted it.

It's not so important exactly what software or audio programming languages you end up using. Efficiency comes from having a small set of versatile tools that you get to know intimately. The most extreme approach, of course, is to turn to a general-purpose programming language. All you need is a compiler (or interpreter) and a few good ideas.

As a case in point, I once wrote a small program that takes one mono soundfile and computes its FFT in a single long window. The result is saved to a stereo soundfile with the real and imaginary parts of the spectrum in the left and right channels. Now you can listen to the spectrum as a sound signal, beginning at DC and ending at Nyquist, or you can transform this soundfile in any number of ways and then transform it back to the time domain. You may also create a spectrum from scratch, write it to a soundfile, and convert it to the time domain. The possibilities are endless, albeit absolutely counterintuitive.
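
A minimal sketch of what such a program could look like, assuming numpy and the soundfile library:

```python
# spectrum.py -- one-window FFT, spectrum stored as a stereo soundfile
import numpy as np
import soundfile as sf

def forward(infile, outfile):
    x, sr = sf.read(infile)          # mono time-domain signal
    spec = np.fft.rfft(x)            # one long FFT: bins from DC to Nyquist
    spec /= np.max(np.abs(spec))     # normalise so it fits in [-1, 1]
    stereo = np.column_stack((spec.real, spec.imag))
    sf.write(outfile, stereo, sr)    # left = real part, right = imaginary part

def inverse(infile, outfile):
    s, sr = sf.read(infile)
    spec = s[:, 0] + 1j * s[:, 1]    # rebuild the complex spectrum
    x = np.fft.irfft(spec)
    x /= np.max(np.abs(x))           # renormalise after any editing
    sf.write(outfile, x, sr)
```

Because of the normalisation the round trip is only correct up to an overall gain, which hardly matters for this kind of experimentation.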
