wav2png.py, son of wav2png
May 15th, 2008
Last week I decided that for nightingale we needed a new wav2png, preferably one written in Python, using the awesome Python Imaging Library (PIL). After talking a bit to Ricard it was clear that using numpy and audiolab it would be a piece of cake. Well, a big piece of cake, but still. Once I got going, I went a bit overboard and decided that it would be nice to have a spectrogram of the sound as well, perhaps displayed when you move the mouse over the large image in the sound page.
It took me about two and a half days of coding and testing to make it robust (it needs to work for 5-sample wave files and 5-million-sample wave files) and good-looking. Some sensible feedback from the guys at oneDot.only made me decide that we had to cut back on the number of colors in the waveform view. The current one looks really ugly in my opinion, so… that was changed as well. It’ll take a while for people to become accustomed to the new colors, but it makes sense to me. I threw in some vertical anti-aliasing for that extra slick look.
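For readers wondering how one function can cope with both 5-sample and 5-million-sample files, here is a minimal sketch of the usual approach: reduce the signal to one (min, max) pair per pixel column. This is not the code from the repository; the function name `column_peaks` and the assumption of a mono signal scaled to [-1, 1] are mine.

```python
import numpy as np

def column_peaks(samples, image_width):
    """Return one (min, max) pair per pixel column for a mono signal."""
    samples = np.asarray(samples, dtype=float)
    if samples.size == 0:
        raise ValueError("need at least one sample")
    # Column i covers samples[edges[i]:edges[i + 1]].
    edges = np.linspace(0, samples.size, image_width + 1).astype(int)
    peaks = []
    for start, stop in zip(edges[:-1], edges[1:]):
        # Files shorter than the image width would give empty chunks,
        # so every column is forced to contain at least one sample.
        stop = max(stop, start + 1)
        start = min(start, samples.size - 1)
        chunk = samples[start:stop]
        peaks.append((float(chunk.min()), float(chunk.max())))
    return peaks
```

Each pair can then be drawn as a vertical line in its column, so the drawing cost depends on the image width rather than on the file length.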
For those who don’t know what a spectrogram is, have a look at the wikipedia entry for it.
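In code terms, a spectrogram is a sequence of short, windowed FFTs laid side by side. The sketch below shows the idea with numpy only; the function name `spectrogram_db`, the FFT size and the hop size are assumptions for illustration, not the values used in the repository.

```python
import numpy as np

def spectrogram_db(samples, fft_size=512, hop=256):
    """Return magnitudes in dB, shaped (num_frames, fft_size // 2 + 1)."""
    samples = np.asarray(samples, dtype=float)
    # Zero-pad very short files so they still produce one full frame.
    if samples.size < fft_size:
        samples = np.pad(samples, (0, fft_size - samples.size), mode="constant")
    window = np.hanning(fft_size)
    starts = range(0, samples.size - fft_size + 1, hop)
    frames = np.array([samples[s:s + fft_size] * window for s in starts])
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    # A small floor avoids taking log10(0) on silent frames.
    return 20.0 * np.log10(np.maximum(magnitudes, 1e-10))
```

Each row of the result is one time slice; mapping the dB values to colors and rotating the array gives the familiar image with time on the horizontal axis and frequency on the vertical axis.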
Without further ado, I present you some results. First of all my own “test” file, a sinusoid sweep:

and its spectrogram:

An FM percussion loop from walkerbelm:

and its spectrogram:

A bell sequence from ERH:

and its spectrogram:

You can find the full source code to generate these images in the nightingale repository ( http://github.com/bram/freesound/tree/master ), in particular look in the directory /freesound/utils/audioprocessing/
You’ll need to install python, numpy, PIL and audiolab to make it work. See above for the links.
Let me know what you think!
May 15th, 2008 at 3:39 pm
I followed your link to Wikipedia and ended up at a link for the short time fourier transform.
http://en.wikipedia.org/wiki/Short-time_Fourier_transform
There was an image there that I liked better than the spectrograms you showed, though it has the same information. It is probably more compute-intensive though.
http://en.wikipedia.org/wiki/Image:Short_time_fourier_transform.PNG
I find the existing thumbnails just fine, though the new ones are slicker. The idea of being able to get a gestalt of a sound at a glance is excellent.
May 15th, 2008 at 4:23 pm
In response to wisslegisse:
Although the 3D images are cool looking, I find the 2D spectrogram much easier to read.
Nice work!
May 26th, 2008 at 12:34 am
Thanks for this!
I’ve been playing with some scripts but the fastest one took at least 50 seconds per file (one minute). This one just takes 5 seconds!
A question: from what I understand, this script assumes that the wav is a mono 44.1kHz?
May 26th, 2008 at 2:18 pm
ljvillanueva: as far as I know, it should work just fine with other sampling rates. The waveform display: definitely. The spectral view, that might fail, it hasn’t been tested. Experiment, and let me know, I’d say…
June 6th, 2008 at 5:21 pm
Why re-invent the wheel ?
See http://www.linuxbandwagon.com/image2wav/ for a python script (use http://psyco.sourceforge.net/ to make it run faster)
The BEST pictures come from The_vOICe http://www.seeingwithsound.com/javoice.htm if you are willing to run a Java Applet instead of using Python … This page shows a low-res view but you can adjust the parameters so it looks like the new picture at the bottom of http://en.wikipedia.org/wiki/Spectrogram
June 6th, 2008 at 5:51 pm
SumGuy: those programs do the inverse (convert an image into sound). wav2png does the opposite: it plots a sound.
Psyco doesn’t speed up wav2png: as it already uses numpy, it’s pretty hard to make it faster.
The author of image2wav could do the same (use numpy for his FFT) and would get the results a LOT faster.
June 14th, 2008 at 9:48 am
To follow up my previous comment, it assumes the wav has a sample rate of 44.1 kHz, otherwise the scale is wrong. I made some changes to get the scale in arithmetic (vs log) scale and to select the maximum frequency to draw. I’ve posted the script in my wiki.
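The dependence on the sample rate comes from how FFT bins map to frequencies: bin k of an N-point FFT sits at k * sample_rate / N Hz, so a scale hard-coded for 44.1 kHz is off for any other rate. Here is a small sketch of a row-to-frequency mapping that takes the rate as a parameter and supports both linear and log scales plus a maximum frequency. It is not the script from the wiki; all names and defaults are assumptions.

```python
import math

def row_to_frequency(row, height, sample_rate, f_min=100.0, f_max=None,
                     log_scale=True):
    """Frequency in Hz shown at image row `row` (row 0 is the top)."""
    nyquist = sample_rate / 2.0
    f_max = nyquist if f_max is None else min(f_max, nyquist)
    # 0.0 at the bottom row, 1.0 at the top row.
    position = 1.0 - row / float(height - 1)
    if log_scale:
        return f_min * math.pow(f_max / f_min, position)
    return f_min + (f_max - f_min) * position

def frequency_to_bin(frequency, fft_size, sample_rate):
    """Nearest FFT bin index for a frequency in Hz."""
    return int(round(frequency * fft_size / float(sample_rate)))
```

For example, with a 512-point FFT, 1000 Hz falls in bin 64 at 8 kHz but in bin 12 at 44.1 kHz, which is why the rate has to be read from the file rather than assumed.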
June 19th, 2008 at 12:33 pm
i think it’s awesome !
wouldn’t you be able to make some simple apps for those who aren’t programmers ?
June 20th, 2008 at 2:20 am
I get a strange error running this.
[root@server ~]# ./wav2png.py
File “./wav2png.py”, line 54
will_read = num_frames_left if num_frames_left < frames_to_read else frames_to_read
^
SyntaxError: invalid syntax
This is on a CentOS 5.1 x64 box, python 2.4.3 default RPM installed. Any ideas?
June 20th, 2008 at 2:20 am
Edit to above: the ^ character is right below “if” in the “_left if num_frames……” line.
June 24th, 2008 at 12:20 pm
the ternary expression is a feature of python 2.5… you’ve got 2.4.3 installed. Just rewrite the ternary expression:
a = A if C else B
is the same as:
if C:
    a = A
else:
    a = B
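For the specific line in the traceback there is an even shorter rewrite that works on Python 2.4: picking the smaller of two numbers is exactly what the built-in `min` does. The wrapper function name below is hypothetical and only there to make the snippet testable.

```python
def frames_for_next_read(num_frames_left, frames_to_read):
    # Python 2.5+ form, as quoted in the traceback:
    #   will_read = num_frames_left if num_frames_left < frames_to_read else frames_to_read
    # Equivalent form that also works on older versions:
    will_read = min(num_frames_left, frames_to_read)
    return will_read
```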
September 29th, 2008 at 7:59 pm
[…] has been up to a bevy of good work since we last checked in with them. This includes developing a beautiful successor to wav2png, changing their name to freesound.org, teaming up with Happy New Ears to develop an […]
December 17th, 2008 at 1:53 am
I found the code very helpful for a project where I needed some basic sound analysis, thanks a lot.
January 18th, 2009 at 11:08 pm
Hi
The SVN URL doesn’t work any more since you have moved to git. I couldn’t find the latest version of wav2png in the git repository - is there any chance you could send me a link to it?
Thanks,
Mark
January 19th, 2009 at 2:40 pm
Mark, please see http://github.com/bram/freesound/tree/master
In particular: http://github.com/bram/freesound/tree/c71aa75126c06d87651c833b134dd2f7f4b2f137/freesound/utils/audioprocessing
May 26th, 2009 at 5:17 pm
This also depends on django and for me it MUST be launched with an uneven height (eg -h 257, NOT -h 256). Otherwise I get errors.
Inexperienced people like me should get audiolab from here: http://pypi.python.org/pypi/scikits.audiolab
May 26th, 2009 at 5:52 pm
The django dependencies can be removed quite easily as far as I know…
Let me know what kind of errors you get with even height!
May 26th, 2009 at 5:54 pm
Actually, I just checked: are you sure you used the LATEST version, and not the checkin I was referring to in the last post? Go here: http://github.com/bram/freesound/tree/master and then browse to freesound > utils > audioprocessing, or alternatively, just use git to clone the repository!
May 26th, 2009 at 6:22 pm
Yes, I just realised that I won’t need django (commented out “from django.utils import simplejson” in processing.py).
The error is
==================
$ python wav2png.py somefile.wav
processing file somefile.wav:
Traceback (most recent call last):
File “wav2png.py”, line 46, in <module>
create_wave_images(*args)
File “/home/hannes/ramdisk/freesound/utils/audioprocessing/processing.py”, line 440, in create_wave_images
waveform = WaveformImage(image_width, image_height)
File “/home/hannes/ramdisk/freesound/utils/audioprocessing/processing.py”, line 280, in __init__
raise AudioProcessingException, “wavefile images look much better at uneven height”
processing.AudioProcessingException: wavefile images look much better at uneven height
==========================
It’s the AudioProcessingException bit it does not like. If I replace it with a 'print "error"' it works fine.
I am using Python 2.6 (I think), maybe that’s the culprit. It’s not like I know Python at all.
I am definitely using the latest version (grabbed a .tar.gz off GitHub).
Wonderful script. Thank you!
May 26th, 2009 at 8:18 pm
Ah, my bad, this error is raised by myself, as (as the error says) “wavefile images look much better at uneven height”! You only need the simplejson if you’re doing other things, like getting audio file information via those functions…
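To illustrate the check being discussed: presumably an odd height is preferred because it leaves exactly one center row for zero amplitude, with the same number of rows above and below. The sketch below mirrors the error message from the traceback; the exception class here is a stand-in, and `nearest_odd_height` is a hypothetical helper that is not part of the script.

```python
class AudioProcessingException(Exception):
    """Stand-in for the exception class named in the traceback."""

def check_image_height(image_height):
    # Refuse even heights, as the script does.
    if image_height % 2 == 0:
        raise AudioProcessingException(
            "wavefile images look much better at uneven height")
    return image_height

def nearest_odd_height(image_height):
    """Hypothetical helper: bump an even height up by one pixel."""
    if image_height % 2 == 1:
        return image_height
    return image_height + 1
```

A caller that prefers not to fail could pass the requested height through `nearest_odd_height` first, so 256 becomes 257.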
If you make any changes to the script, or use it somewhere public, please let me know!
May 26th, 2009 at 8:57 pm
Heh, well there you caught an amateur. I overlooked the “Exception” bit and thought it was supposed to simply print it as a warning. Thanks.
So far I made it convert any files I throw at it to WAV (in a ramdisk and yes, at the moment it would convert WAV to WAV…) and only output the waveform. That’s pretty ok for my copy'n'paste'n'fix approach.
This is so great to quickly scan an album for its loudness/dynamics.
Is the scale of the waveform graph always the same?
My goal would be to make it render clipping in red (like Audacity can do). Whether I will ever manage to do it is questionable. Well, it’s for fun only.
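One possible starting point for the clipping idea, as a rough sketch: flag every pixel column that contains a sample at or near full scale, then draw the flagged columns in red. This assumes samples normalised to [-1, 1]; the function name and the threshold value are made up for illustration and nothing like this is in the script.

```python
def clipped_columns(samples, image_width, threshold=0.999):
    """Return one bool per pixel column: True if any sample hits full scale."""
    n = len(samples)
    flags = []
    for col in range(image_width):
        start = col * n // image_width
        # Keep at least one sample per column for very short files.
        stop = max((col + 1) * n // image_width, start + 1)
        flags.append(any(abs(s) >= threshold for s in samples[start:stop]))
    return flags
```

Audacity's own clipping display may use a different rule (for instance, requiring several consecutive full-scale samples), so treat the single-sample test here as a simplification.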
November 10th, 2009 at 4:12 pm
I have been searching the net for ages now trying to find some kind of script that I can run on a website that will scan uploaded files and create a waveform that can be used by a flash player.
can this be used in a php environment?
for an example, just listen to any track on http://www.djdownload.com and then check out the player.
looks great by the way!
Brad
November 10th, 2009 at 8:32 pm
If you have Python on your server and can install some additional modules this should work fine…
November 24th, 2009 at 5:28 am
thank you for sharing this!
January 14th, 2010 at 1:47 pm
Hi,
Are all the files in the audioprocessing dir (http://github.com/bram/freesound/blob/master/freesound/utils/audioprocessing/) required to make this work?
I am getting this error:
Traceback (most recent call last):
File “wav2png.py”, line 4, in ?
from processing import create_wave_images, AudioProcessingException
File “/tmp/processing.py”, line 55
will_read = num_frames_left if num_frames_left < frames_to_read else frames_to_read
^
SyntaxError: invalid syntax
January 14th, 2010 at 3:01 pm
If you are getting that error, it most likely is because you are using an older version (2.3/2.4) of python. Try updating to 2.6…
January 15th, 2010 at 4:45 am
Thanks Bdejong for your prompt reply.
I got python 2.6 installed, reinstalled PIL, Audiolab and Numpy because they didn’t work anymore..
Now stuck on this error:
python wav2png.py input.wav
Traceback (most recent call last):
File “wav2png.py”, line 4, in <module>
from processing import create_wave_images, AudioProcessingException
File “/tmp/processing.py”, line 29, in <module>
import scikits.audiolab as audiolab
File “/usr/local/lib/python2.6/site-packages/scikits.audiolab-0.10.2-py2.6-linux-i686.egg/scikits/audiolab/__init__.py”, line 25, in <module>
from pysndfile import formatinfo, sndfile
File “/usr/local/lib/python2.6/site-packages/scikits.audiolab-0.10.2-py2.6-linux-i686.egg/scikits/audiolab/pysndfile/__init__.py”, line 1, in <module>
from _sndfile import Sndfile, Format, available_file_formats, available_encodings
ImportError: libsndfile.so.1: cannot open shared object file: No such file or directory
I installed libsndfile from source. No go.
Your help would be appreciated.
January 15th, 2010 at 5:23 am
Hi, I got it solved by:
export LD_LIBRARY_PATH=/usr/local/lib/
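For anyone hitting the same ImportError, a quick way to check whether the dynamic loader can see libsndfile before importing audiolab is to ask `ctypes` from the standard library. This is only a diagnostic sketch; the function names are mine, and the default library file name is taken from the error message above.

```python
import ctypes
import ctypes.util

def find_libsndfile():
    """Return the name the system resolves for libsndfile, or None."""
    return ctypes.util.find_library("sndfile")

def can_load_libsndfile(name="libsndfile.so.1"):
    """Try to load the shared library the same way an extension module would."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False
```

If `can_load_libsndfile()` returns False in a fresh shell but True after the `export` line above, the library is installed and only the search path was missing.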