Playing with the FFT

02/03/2012

A long time ago (a really long time ago) I studied the Fourier Transform (http://en.wikipedia.org/wiki/Fourier_transform) at university, and I even used the FFT (Fast Fourier Transform, http://en.wikipedia.org/wiki/Fast_Fourier_transform) algorithm in a digital image processing application written in C. Since then I had not given the FFT much thought.

Recently my interest in the FFT came back. With the idea of learning audio programming techniques, I wanted to code a tuner or a power spectrum analyzer on my Linux box (which means it should be a JACK client).

The first step is to find out which FFT implementation can be used, preferably GPL licensed. After googling a little I realised that FFTW (http://www.fftw.org/) is the obvious choice:

FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine transforms or DCT/DST). We believe that FFTW, which is free software, should become the FFT library of choice for most applications.

I compiled a few examples and became familiar with the FFT calculations. What I want to compute is the power spectrum of an audio signal, so the input will be real samples (one dimension) coming from my microphone, and the FFT output will be complex values. Both the real part and the imaginary part are needed to compute the power spectrum: the power in each frequency bin is the squared magnitude, re^2 + im^2.

fftPlan = fftwf_plan_dft_r2c_1d(fftSize, fftIn, fftOut, FFTW_MEASURE); //r2c_1d: real to complex, one dimension

Searching for open source tuners, preferably CLI and JACK compatible, I found two projects that use precisely the FFTW library:

The capture_client project was also useful for my own code:

I have divided the problem into two parts:

  • First, I capture the signal coming from the microphone. The result is two files: a wav file, and a text file where I write a sufficient number of samples.
  • Second, I open the sample file, apply a Hanning (Hann) window to the samples to reduce spectral leakage, compute the FFT, compute the power spectrum of the signal, and write an output file ready to be plotted with the graphics utility gnuplot (http://www.gnuplot.info/).

I had never worked with gnuplot, which plays a role in the Linux world similar to Matlab's plotting. After running the gnuplot demo, which quickly shows what gnuplot can do, I was really impressed. In this case drawing the graph is very easy:

$ gnuplot
gnuplot> plot "data_440_trumpet_output.txt"

The sample shown here was recorded with a 1980s Casio SK-8 mini keyboard, using the trumpet sound. Here is the result:

Casio SK-8 440Hz (A) trumpet, 44100 fps by joanillo

Now I want to compare my results with the ones obtained with Audacity and Ardour, to verify that the calculations are correct:

My calculations and Gnuplot Audacity Ardour

Perfect! I got it!

jcapture is a small program written in C++ that records the signal from the microphone and writes an audio file in wav format. It is basically a JACK client that automatically connects to system:capture_1, the physical port that represents the mic input of your sound interface, and, thanks to the libsndfile library, writes the buffered input audio data to the audio file.

jcapture also displays the input signal level graphically on the console (it is a console program with no graphical interface). If you are interested in JACK API programming, you are welcome to download jcapture:
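A console level meter of this kind can be sketched as follows. This is not jcapture's actual code, just one plausible way to do it: the helper names (peak_level, render_meter) are hypothetical, and the samples are assumed to be floats in the range -1.0 to 1.0, as JACK delivers them.

```c
#include <math.h>
#include <stddef.h>

/* Largest absolute sample value in the block (0.0 for an empty block). */
float peak_level(const float *samples, size_t n)
{
    float peak = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float a = fabsf(samples[i]);
        if (a > peak)
            peak = a;
    }
    return peak;
}

/* Fill `bar` (which must hold width + 1 chars) with '#' characters in
 * proportion to the peak level, padded with spaces and NUL-terminated.
 * Levels outside 0.0 .. 1.0 are clamped. Returns the number of '#'. */
size_t render_meter(float peak, char *bar, size_t width)
{
    if (peak < 0.0f)
        peak = 0.0f;
    if (peak > 1.0f)
        peak = 1.0f;
    size_t filled = (size_t)(peak * (float)width + 0.5f);
    for (size_t i = 0; i < width; i++)
        bar[i] = (i < filled) ? '#' : ' ';
    bar[width] = '\0';
    return filled;
}
```

Printing the bar with a leading carriage return, for example printf("\r[%s]", bar), redraws the meter on the same console line for every block of samples.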

Today is the 155th anniversary of the birth of Heinrich Rudolf Hertz (155 is not a special number… didn't they have anything better to celebrate?). To mark the event there is a Flash movie on the Google.com front page: a really disastrous sine wave.

Everyone knows that the derivative of the sine function is the cosine, and therefore the slope at the origin is 1:

sin'(0) = cos(0) = 1

If the x and y axes have the same scale, the sine curve must cross the horizontal axis at an angle of 45 degrees. And since the cosine function only takes values between -1 and 1, the slope can never be vertical at any point!

Well, there are people who draw sine waves just by joining semicircles… some mathematical background, please…

jplay-sndfile is an application I have been programming this week, and it has a didactic purpose. It is basically an audio player that can also apply a frequency shift, and can also make a continuous sweep of frequencies. It is a JACK client (JACK is the Linux audio server) and it takes sndfile-jackplay as its starting point: you can find that piece of software among the utilities of the libsndfile library (http://www.mega-nerd.com/libsndfile/tools/#jackplay).

Everybody knows that if you play an audio file at twice the original frequency, it lasts half the time. This is equivalent to taking one sample out of every two and playing them at the original sample rate.

Now take the opposite case and divide the frequency by two: the playback time should double. The easiest way to do this is to repeat each sample twice and play back at the original sampling rate.
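These two cases fit in a few lines of C. The sketch below is only an illustration of the idea, not the code of jplay-sndfile; the function names are hypothetical and the caller is assumed to provide an output buffer that is large enough.

```c
#include <stddef.h>

/* Keep one sample out of every two. Played at the original sample rate
 * the result sounds one octave higher and lasts half the time.
 * `out` needs room for (n + 1) / 2 samples. Returns the output length. */
size_t double_pitch(const float *in, size_t n, float *out)
{
    size_t m = 0;
    for (size_t i = 0; i < n; i += 2)
        out[m++] = in[i];
    return m;
}

/* Repeat every sample twice. Played at the original sample rate the
 * result sounds one octave lower and lasts twice as long.
 * `out` needs room for 2 * n samples. Returns the output length. */
size_t halve_pitch(const float *in, size_t n, float *out)
{
    for (size_t i = 0; i < n; i++) {
        out[2 * i] = in[i];
        out[2 * i + 1] = in[i];
    }
    return 2 * n;
}
```

Dropping samples without low-pass filtering first can fold high frequencies back into the audible range (aliasing), so this is fine for learning but not a production-quality resampler.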

These two special cases, multiplying by two and dividing by two, are the easiest and most obvious. Analysing them while studying the source code provided can clarify many concepts related to Digital Signal Processing. It is harder, however, to handle intermediate factors between 0.5 and 2, and harder still to make a continuous sweep of frequencies over a given range. This is what the application does, and what you can study in it.
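One common way to handle intermediate factors is to read the input at a fractional position and interpolate between neighbouring samples. The sketch below uses linear interpolation, which is a different technique from the sample selection and repetition used in jplay-sndfile; the function name and the way the sweep follows the position in the input are assumptions made for this example.

```c
#include <stddef.h>

/* Read `in` at a fractional position that advances by the pitch ratio
 * for every output sample, interpolating linearly between neighbours.
 * With ratio_start == ratio_end the shift is fixed; otherwise the ratio
 * glides linearly from ratio_start to ratio_end as the read position
 * moves through the input. Both ratios must be positive.
 * Returns the number of samples written (at most out_cap). */
size_t resample_sweep(const float *in, size_t n, float *out, size_t out_cap,
                      double ratio_start, double ratio_end)
{
    if (ratio_start <= 0.0 || ratio_end <= 0.0)
        return 0;

    double pos = 0.0;
    size_t m = 0;
    while (m < out_cap) {
        size_t i = (size_t)pos;
        if (i + 1 >= n)          /* two neighbours are needed */
            break;
        double frac = pos - (double)i;
        out[m++] = (float)((1.0 - frac) * in[i] + frac * in[i + 1]);

        /* Progress through the input, from 0.0 to 1.0. */
        double t = pos / (double)(n - 1);
        pos += ratio_start + t * (ratio_end - ratio_start);
    }
    return m;
}
```

A call with ratios 0.5 and 2 sweeps from one octave below to one octave above the original pitch, similar in spirit to the "0.5 2" command-line case of the application.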

The output of the help option provides the following information:

$ jplay-sndfile -h

jplay-sndfile 1.00
Created by Joan Quintana Compte (joanillo)
joanqc arroba gmail.com – www.joanillo.org
Licensed under GPL v.3

jplay-sndfile is a JACK client intended for playing audio files (wav) and pitch shifting, written basically for learning, testing and educational purposes, and the first stage for future developments. A part of playing an audio file, you can change the pitch (between *0.5 and *2) of your audio file, you can play the audio file combing the pitch between two ranges. For testing is useful a sine wave, but remember that you can use any mono (one channel) audio files.
This Jack Audio client connects automatically to system:playback_1 and system:playback_2

usage: jplay-sndfile [-h] [[pitch-shift] | [pitch-shift-start pitch-shift-end]] wav-file

-h --help: this usage information
[pitch-shift] (0.5,2): shifting pitch
[pitch-shift-start] (0.5,2): shifting pitch start range
[pitch-shift-end] (0.5,2): shifting pitch end range
wav file: mono channel audio file

Examples:
./jplay_sndfile samples/hellosine.wav
./jplay_sndfile 0.65 samples/hellosine.wav
./jplay_sndfile 0.5 2 samples/hellosine.wav

The audio sample that follows is intended as a summary of what you can do with jplay-sndfile, and includes the following cases:

  • ./jplay_sndfile samples/test_44100.wav
  • ./jplay_sndfile .5 samples/test_44100.wav
  • ./jplay_sndfile .8 samples/test_44100.wav
  • ./jplay_sndfile 1.4 samples/test_44100.wav
  • ./jplay_sndfile 2 samples/test_44100.wav
  • ./jplay_sndfile .5 2 samples/test_44100x3.wav
  • ./jplay_sndfile samples/sine_440_44100.wav
  • ./jplay_sndfile .5 1 samples/square_440_44100.wav
  • ./jplay_sndfile 1 2 samples/saw_440_44100.wav
  • ./jplay_sndfile .5 samples/waves.wav
  • ./jplay_sndfile 2 samples/waves.wav

jplay-sndfile-examples by joanillo

One of the interesting things to study is the callback function of the JACK API, which turned out to be quite short. Here is the core of this application. The callback, process(), is called by the audio server each time the audio interface (sound card hardware) needs its buffer filled with more data.

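/* JACK process callback. The globals used here (output_port, outs,
   outs_original, buffer2, frames_counter, frames_counter_original,
   shift_pitch_equivalent) and DOUBLE_SAMPLES are defined elsewhere
   in the program. */
static int process (jack_nframes_t nframes, void * arg)
{
    unsigned int i;
    info_t *info = (info_t *) arg;

    outs = (jack_default_audio_sample_t *) jack_port_get_buffer (output_port, nframes);

    /* Copy the next chunk of source samples from buffer2 into the work buffer. */
    memcpy (outs_original, buffer2 + frames_counter_original,
            sizeof (jack_default_audio_sample_t) * nframes * DOUBLE_SAMPLES * 2);

    int k = 0;              /* index used by both branches below */
    float var, var2 = 0;    /* var2 accumulates the fractional extra samples */
    int part_entera = 0;    /* integer part of var2 already handled */
    var = 1 / info->shift_pitch - 1;

    for (i = 0; i < nframes; i++)
    {
        /* Stop when the end of the audio data has been reached. */
        if (info->shift_pitch < 1) {
            /* shift_pitch_equivalent = info.shift_pitch when not doing a sweep */
            if (frames_counter + i >= info->numFrames / shift_pitch_equivalent) {
                info->play_done = 1;
                return 0;
            }
        } else { /* >= 1 */
            if (frames_counter_original + i >= info->numFrames * DOUBLE_SAMPLES) {
                info->play_done = 1;
                return 0;
            }
        }

        if (info->shift_pitch > 1) {
            /* Speeding up: pick the nearest source sample at the faster read rate. */
            k = (int) (i * DOUBLE_SAMPLES * info->shift_pitch + .5);
            outs[i] = outs_original[k];
        } else if (info->shift_pitch <= 1) {
            /* Slowing down: copy each source sample and insert an extra one
               every time the accumulated fraction reaches a new integer. */
            /* Guard added: k advances faster than i, so stop once the
               nframes-long output buffer is full. */
            if ((jack_nframes_t) k >= nframes)
                break;
            outs[k] = outs_original[i * DOUBLE_SAMPLES];
            if ((int) var2 != part_entera) {
                if ((jack_nframes_t) (k + 1) < nframes)
                    outs[k + 1] = outs_original[i * DOUBLE_SAMPLES + 1];
                part_entera = (int) var2;
                k++;
            }
            var2 = var2 + var;
            k++;
        }
    }

    frames_counter += nframes;
    frames_counter_original += nframes * DOUBLE_SAMPLES * info->shift_pitch;

    return 0;
} /* process */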

And what is all this about? As we have said, this is a didactic application, used primarily to learn:

  • Digital Signal Processing
  • the libsndfile API
  • the JACK API, with a typical example of a callback function

But apart from learning, we have an idea in mind for the next project: using a Wacom tablet, of the kind used by graphic designers, to produce realistic sounds that respond to movement and pen pressure. But that is another project: Wacom gesture recognition and audio synthesis

Wacom Theremin

09/02/2012

The Theremin is a musical instrument invented around 1920 by Léon Theremin, with the particularity that it is played with hand gestures. The theremin is considered one of the first electronic instruments, and its distinctive synthetic sound has been used many times in science fiction movies and special effects.

In this project we implement the playing technique and the sound of a theremin with an unconventional controller: a Wacom tablet, one of those tablets used by graphic designers. If you know how a theremin is played, the implementation is evident: the X axis changes the tone (Pitch Bend MIDI message); the Y axis changes the volume; and of course pressing the pen produces sound (NoteOn MIDI message) and releasing the pen stops it (NoteOff). Besides the XY position, Wacom technology provides extra sensor data such as pressure and tilt. Although not implemented in this project, pressure and tilt could be mapped to other CC (Continuous Controller) MIDI messages, such as modulation (vibrato effect).
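The mapping from pen events to MIDI bytes can be sketched as below. This is not the project's source code: the function names, the coordinate ranges (0 to x_max, 0 to y_max) and the choice of controller 7 (channel volume) for the Y axis are assumptions made for the example. The status bytes and the 14-bit pitch bend layout are those of the MIDI standard.

```c
#include <stdint.h>

/* Pitch Bend: map x in 0..x_max to the 14-bit range 0..16383
 * (8192 is the centre, i.e. no bend). */
void pitch_bend_message(int x, int x_max, uint8_t channel, uint8_t msg[3])
{
    uint32_t bend = 8192u;
    if (x_max > 0) {
        if (x < 0)
            x = 0;
        if (x > x_max)
            x = x_max;
        bend = (uint32_t)(((uint64_t)x * 16383u) / (uint64_t)x_max);
    }
    msg[0] = (uint8_t)(0xE0 | (channel & 0x0F));
    msg[1] = (uint8_t)(bend & 0x7F);         /* 7 low bits  */
    msg[2] = (uint8_t)((bend >> 7) & 0x7F);  /* 7 high bits */
}

/* Volume: map y in 0..y_max to 0..127 and send it as Control Change 7.
 * Assumption: a larger y means louder; a real tablet may need inverting. */
void volume_message(int y, int y_max, uint8_t channel, uint8_t msg[3])
{
    int value = 0;
    if (y_max > 0) {
        if (y < 0)
            y = 0;
        if (y > y_max)
            y = y_max;
        value = (int)(((long)y * 127L) / (long)y_max);
    }
    msg[0] = (uint8_t)(0xB0 | (channel & 0x0F));
    msg[1] = 7;
    msg[2] = (uint8_t)value;
}

/* NoteOn when the pen touches the tablet, NoteOff when it is lifted. */
void note_message(int pen_down, uint8_t note, uint8_t velocity,
                  uint8_t channel, uint8_t msg[3])
{
    msg[0] = (uint8_t)((pen_down ? 0x90 : 0x80) | (channel & 0x0F));
    msg[1] = (uint8_t)(note & 0x7F);
    msg[2] = (uint8_t)(pen_down ? (velocity & 0x7F) : 0);
}
```

How far the pitch actually moves for a given bend value depends on the pitch bend range configured in the synthesizer.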

We used a Roland JV-2080 synthesizer to produce the theremin sound, although I don't think I managed to reproduce the sound of a real theremin. The good thing about the JV-2080 is that it has user patches you can play with, and lots of parameters, oscillators and effects to tweak if you have enough time. And the best thing about the JV-2080 is that you can configure pitch bend (tone shifting) with a very large range, which is not usual in other synths (especially soft synths).

One of the things I liked in this project is the integration achieved between the application developed in C (source code is available for download) and Gimp (the standard image editor in the Linux world). This opens up many possibilities and ideas for combining Gimp with sound effects and interactive applications.

Obviously, a Wacom Theremin is not a new idea: other people have already made theremins with Wacom tablets and other controllers like a Wii… but this long-term project (50 Ways to Play Una Plata d'Enciam) could not do without a Wacom Theremin made with Open Source tools.


jplayfine (http://wiki.joanillo.org/index.php/Jplayfine) is my current musical project in development. It is a play-along JACK client driven by a midi file. After performing the track you get a mark that tells you whether you played well or need more practice. To do so, the application must know on which midi channel the lead track is playing, and on which midi channel your controller is playing. jplayfine is a JACK client that lives in the Linux Audio and MIDI ecosystem. I will say more about jplayfine when it is released.

jplayfine uses an external sequencer to play back the midi file (tests have been made with jack-smf-player). While developing the project I needed to parse the midi file that contains the song (lead and accompaniment). I could have used the smf.h library (the one jack-smf-player uses). But looking for a) simplicity, b) control of the source code, and c) a deep understanding of the MIDI protocol, I found myself coding a C++ SMF parser. Here is version 1.01, if anyone wants to test or use it. There is a standalone application, and a test application that uses the library. Also, inside the midi/ folder there are several midi files that I used for testing. To use this library you need a C++ compiler on Linux (g++).
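To give an idea of what parsing an SMF involves, here is a minimal sketch of the two basic building blocks: reading the "MThd" header chunk and decoding a variable-length quantity (used for delta times). It is not taken from smf_parser; the type and function names are hypothetical. The byte layout follows the Standard MIDI File format, where all multi-byte values are big-endian.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint16_t format;    /* 0, 1 or 2 */
    uint16_t ntracks;   /* number of track chunks that follow */
    uint16_t division;  /* ticks per quarter note when the top bit is 0 */
} smf_header;

/* Parse the 14-byte header chunk at the start of a Standard MIDI File.
 * Returns 0 on success, -1 if the data is too short or not a header. */
int smf_parse_header(const uint8_t *data, size_t len, smf_header *out)
{
    if (len < 14 || memcmp(data, "MThd", 4) != 0)
        return -1;
    uint32_t chunk_len = ((uint32_t)data[4] << 24) | ((uint32_t)data[5] << 16) |
                         ((uint32_t)data[6] << 8) | (uint32_t)data[7];
    if (chunk_len < 6)
        return -1;
    out->format = (uint16_t)((data[8] << 8) | data[9]);
    out->ntracks = (uint16_t)((data[10] << 8) | data[11]);
    out->division = (uint16_t)((data[12] << 8) | data[13]);
    return 0;
}

/* Decode a variable-length quantity: 7 data bits per byte, with the top
 * bit set on every byte except the last (at most 4 bytes).
 * Returns the number of bytes consumed, or 0 on error. */
size_t smf_read_vlq(const uint8_t *data, size_t len, uint32_t *value)
{
    uint32_t v = 0;
    for (size_t i = 0; i < len && i < 4; i++) {
        v = (v << 7) | (uint32_t)(data[i] & 0x7F);
        if ((data[i] & 0x80) == 0) {
            *value = v;
            return i + 1;
        }
    }
    return 0;
}
```

After the header, each track is an "MTrk" chunk made of delta time and event pairs, which is where the variable-length decoder is used over and over.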

Project link: http://wiki.joanillo.org/index.php/Fitxers_MIDI_(SMF)._Format#smf_parser
MIDI protocol: http://www.sonicspot.com/guide/midifiles.html
Download smf_parser1.01


A composition by my father: an Ave Maria. We are talking about religious music played on a church organ. The score was transcribed in lilypond, played with fluidsynth using a decent organ sound, and recorded in Ardour. Later on I will convince Rita to sing over it, since it has lyrics (in Latin, as it should be). You can download the zip, where you will find all the material, including the lilypond file, so you can see what this music transcription format looks like:

Download Ave Maria (mp3, lilypond, midi, pdf)

The melody part looks like this in lilypond:

melodia = \relative c'' {
\set Staff.midiInstrument = #"church organ"
\clef treble
r2 r r r r r r r r r r r r \mBreak
bes2 g4. g8 c2 bes ees4 d8 c bes4( g) bes2 c4 bes8 aes g4( ees) g2 bes4 aes8 g f2 g4 r ees' d8 c \mBreak
bes4( g) bes2 c4 bes8 aes g4( ees) g2 bes4 aes8 g f2 ees \bar "||" g4.\f g8 a4 f bes2 a4 a8 a bes4 a d2 d4. d8 \mBreak
d4 cis c8 c c c c bes bes4 bes bes8 bes bes4 a~ a gis16( a bes) a e'4~ e16 cis a g g8( f) f4~ f e16 g bes( g) e2~ e4 cis'16( d e d) \mBreak
cis2~ cis4 r \bar "||" r2 r r r r r r r r r \mBreak
r2 r r \bar "||" bes2\mf g4. g8 c2 bes ees4( d8) c bes4( g) bes2 c4( bes8) aes g4( ees) g2 bes4 aes8 g f2 \mBreak
g4 r ees'4( d8) c bes4( g) bes2 c4 bes8 aes g4( ees) g2 bes4( aes8) g f2 ees ges4 ges8 ges f4 f ges4. ges8 \mBreak
f8. f16 f4 g aes8 bes c4 c a( bes8) c d4 d ees bes8 g c4 bes \mBreak
aes4( g8) f aes4 g g( f8 ees) f2 f4( ees8 d) ees2~ ees4 r \bar "|."
}

Ave maria by joanillo

These days I am singing Pere and María a lullaby that has a very special meaning. It is a lullaby that my grandmother Montserrat used to sing to my father, and it is really quite original and not at all well known. My grandmother was born in a farmhouse (masía) in La Guàrdia, in the Lluçanès sub-region (Catalonia), an area with its own identity between Puigreig and Vic. As it happens, all the folklore of the area was documented back in the 1940s by Josep M. Vilarmau, who was a cousin of my grandmother as well as a colleague of the folklorist Joan Amades. He produced a reference work for studying the songs of the region, Folklore del Lluçanès, and this untitled lullaby happens not to be in it.

It is a very simple and sweet song, as songs for putting children to sleep should be. My father arranged it for piano, capturing this sweetness and harmonic simplicity. I only transcribed it with lilypond and recorded it with a good piano sound. So, my gratitude goes to my grandmother (may she rest in heaven), to my father, and to all the intangible heritage that the folklore of the Catalan lands represents. The lyrics (translated from Catalan):

The angel of sleep has white wings,
has golden hair and a dress of silver.
The angel of sleep who descends from heaven.

The angel of sleep accompanies the child,
rocks him in the cradle and closes his little eyes.
The angel of sleep kisses him and sings to him.

Lilypond is a very complete music engraving system that produces high-quality scores. It is, admittedly, a bit complicated to use, although there is plenty of information available. I am a firm supporter. I transcribed the handwritten score into lilypond, and here you can see the result:

Cançó de Bressol (download)

Besides pdf output, Lilypond also produces midi output. I played this midi file with the fluidsynth synth, which can run from the command line, loaded with a good piano sound. As simple as this:

$ fluidsynth -l -a jack -m alsa_seq -g 1 /var/soundfonts/Musica_Theoria_v2_GM.sf2 /home/joan/lilypond/canco_bressol_v3.midi

The sound produced by fluidsynth was recorded in Ardour (the reference DAW in the Linux world) and, without adding any effects, converted to wav (and to mp3 with lame). Here you can hear the result.
Cançó de bressol by joanillo

I hope that in the future this song can be handed down from generation to generation and remain a treasured possession in my family.

I found the score of the Toc de Castells on the Internet. I don't know exactly which colla it belongs to, since I see that each colla plays the Toc de Castells a little differently. I don't have a gralla (roughly the equivalent of the dulzaina; I am not allowed one at home), but I did get a flabiol de gralla, which sounds like a recorder but has the same fingering as the gralla. That way you can practise as if it were the gralla without bothering the neighbours. Some schools use the flabiol de gralla instead of the recorder.

This short recording let me practise some things I had pending: the snare and the drum rolls are made with Hydrogen (it was hard to find information on how to do the rolls, but in the end I managed). The Hydrogen sequence is recorded directly into Ardour (choosing a good drum sound library), and I had to get to grips with the Transport (Ardour acts as Time Master, and JACK and Hydrogen follow it). Everything is recorded in Ardour, on four tracks: snare, roll, flabiol 1 and flabiol 2. I also learned how to record different takes (takes, playlists) in Ardour so I can choose the best one, and how to punch in on a specific section if I think it can be improved. On the other hand, I have finally stopped making the connections in QJackCtl and made all of them in Ardour, which is much more practical and faster.

The effects are a reverb and compression, which give the recording depth and, I think, improve it (and incidentally hide the odd imperfection in the performance). In short, another recording made with free tools (JACK, Hydrogen, Ardour, GNU/Linux). Free software at the service of traditional music.

There is an entry for the Castells in the Spanish Wikipedia, but not in the English one. It would be interesting if someone from the casteller world wrote an entry for the English Wikipedia.

Toc de Castells by joanillo

Pere is now 2 years and three months old. Taking advantage of the momentum of La Castañada and All Saints' Day, I made an impromptu recording of him singing La Castanyera.

We are very used to archiving photos, and videos too, and keeping them more or less organised. With photos we can see how children grow (and how they grow!). It could be a good idea to collect, archive and classify audio files as well. They also have documentary value, although we give more importance to photos and videos.

For example, I spent 5 minutes searching the sound archive of the Biblioteca Nacional de Catalunya and found this recording from the 1950s by the Cobla de Barcelona, the Coral Sant Jordi and the conductor Oriol Martorell: Vacaciones en la Costa Brava [sound archive]

Here is La Castanyera sung by Pere, at 2 years and three months:

gravació Pere, 2 anys i tres mesos by joanillo
