This article is a cheatsheet of sorts for foundational digital audio concepts, intended to help people without a background in audio technology understand the rest of this section of the knowledgebase. As such it needs to be as clear and concise as possible, so if you have any suggestions for improvements I'm all ears!
Computers generate sound by moving a speaker's cone back and forth over time. In most cases the cone's position is represented as a floating point number between -1.0 and 1.0, with 0.0 being the cone's resting position.
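As a minimal sketch (in Python, purely illustrative), a burst of audio is just a sequence of those cone positions, and anything that strays outside the valid range is typically pinned back into it ("hard clipping"):

```python
# A short burst of audio: speaker-cone positions as floats in [-1.0, 1.0].
samples = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]

def clamp(x: float) -> float:
    """Pin a sample back inside the valid [-1.0, 1.0] range (hard clipping)."""
    return max(-1.0, min(1.0, x))

print(clamp(1.3))    # 1.0 -- overshoots get pinned to the nearest extreme
print(clamp(-0.25))  # -0.25 -- in-range values pass through untouched
```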
It's possible (and in fact extremely common) to represent audio using integers instead, by treating the lowest value the integer type can store as -1.0 and its highest value as 1.0. However, to perform any signal processing the integer representation is typically converted back to floating point first, because DSP maths relies heavily on multiplication by fractional values.
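Here's a rough sketch of that conversion, assuming signed 16-bit samples and the common convention of dividing by 32768 (different codebases handle the asymmetric int16 range in slightly different ways):

```python
INT16_MIN, INT16_MAX = -32768, 32767  # range of a signed 16-bit integer

def int16_to_float(sample: int) -> float:
    # Map the int16 range onto roughly [-1.0, 1.0).
    return sample / 32768.0

def float_to_int16(sample: float) -> int:
    # Clamp first so an out-of-range float can't overflow the int16 range.
    clamped = max(-1.0, min(1.0, sample))
    return int(round(clamped * INT16_MAX))

print(int16_to_float(INT16_MIN))  # -1.0
print(float_to_int16(0.5))        # 16384
```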
Uncompressed audio is simply a long sequence of speaker position values, referred to as samples, in chronological order. There are two dimensions of resolution to take into account with audio files: bit-depth (the size of an individual sample in bits) and sample-rate (how many samples are played back per second). The former dictates how accurately you can store speaker positions, and the latter dictates the highest frequency you can store (we'll get into why in a bit). The sketch below ties the two together.
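This sketch generates one second of a 440 Hz sine tone at a 44,100 Hz sample-rate (both values are just illustrative). The sample-rate decides how many cone positions we emit per second; the resulting floats would then be quantised down to whatever bit-depth the file format stores:

```python
import math

SAMPLE_RATE = 44_100  # samples per second (illustrative; the CD-audio rate)
FREQUENCY = 440.0     # pitch of the test tone in Hz (illustrative)
DURATION = 1.0        # length of the clip in seconds

# One float sample per tick of the sample clock, tracing out a sine wave.
samples = [
    math.sin(2.0 * math.pi * FREQUENCY * n / SAMPLE_RATE)
    for n in range(int(SAMPLE_RATE * DURATION))
]

print(len(samples))  # 44100 -- one second of audio at this sample-rate
```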