Archive for the 'Tech Talk' Category

Follow Freesound 2 development on twitter

Thursday, May 14th, 2009

Hello all,

you can now follow freesound 2 development on twitter, just follow the user freesounddev: http://twitter.com/freesounddev/

If you’re not a developer you might not understand everything, as there are so-called “commit messages”, but you might see things flying past that you recognize ;)

- Bram

Freesound iphone/ipod touch application!

Wednesday, May 13th, 2009

Hello all,

Eric from nibblesoft wrote me a while back that he was working on a freesound application for the iphone / ipod touch. We’re quite happy with the application, it’s fun to see the first custom application on a handheld device! Have a look at the page on nibblesoft’s website, or alternatively, go directly to the itunes store. The application isn’t free, but Eric has promised us that he will donate part of the income to Freesound, which makes us very happy!

Some information about the app… Features Include:

  • Search freesound.org and listen to sounds.
  • Request any freesound.org sound to be emailed to you as a ringtone.
  • Record your own sounds and submit recordings to freesound.org over WiFi.
  • “Loop” sounds during playback.
  • Rate a sound.
  • Email a link to a sound.
  • Over 150 sounds come “pre-marked” as favorites.
  • Search your favorites list or recordings list.

Here are some screenshots:

Github and Lighthouse.

Sunday, January 25th, 2009

Quite happy with the switch from subversion to git, and quite happy with github. One of the big advantages of all the new “do one thing well” websites is that they, well, do one thing very well. So, looking for bug tracking, I found Lighthouse, another wonderful website that has “Ruby on Rails” written all over it. The fanboy overload of RoR is horrible, but some people are truly doing some really amazing things with it.

Without further ado, here is the freesound 2 ticketing service: http://freesound.lighthouseapp.com/

Please only use it for future freesound 2 tickets, not freesound 1 tickets.

From subversion to git.

Friday, January 16th, 2009

Following the overall exodus where people change from using subversion to git, we -much like lemmings- could not stay far behind.

All source code for freesound “2.0” a.k.a. nightingale can now be found over at github:

http://github.com/bram/freesound/tree/master/

Update your links!

Meet Chiapas, the new database server.

Friday, December 12th, 2008

Meet chiapas:

HP Proliant DL360 G5
Processor: Xeon Quad Core E5440
Memory: 14 GB RAM
Dual ethernet
Dual power supply
Redundant ventilation system
Hard drives: 3 x 15000RPM, 72GB, SAS
Price: 4.797,42€

Chiapas is now happily running the Freesound.org database. It wouldn’t have been possible to start using this machine without the help of Letusa ( http://www.letusa.es ). They kindly sponsored the acquisition of this machine and of another, even pricier one I will introduce once we start using it.

The previous server was completely out of breath when we installed this machine. This one is keeping up just fine:

http://iua-share.upf.es/ganglia/?c=Freesound.org&h=chiapas.upf.es

As you can see for yourself, Freesound.org has become snappy again and quickly responds to all your requests. Hurray for big, bad server machines! Oh and here’s one for people with hardware fetishes:

Chiapas on the left, Mystery Machine on the right.

Slow, slower, slowest…

Friday, October 17th, 2008

Hello all,

We know, freesound has been getting slower these days. But help is on the way. Next week we will start initial testing of our brand new shining database server (more about that later!). For those who like little graphics to spice up their daily lives, how about these. They represent the “server load” (how much the server is suffering) in the last year:

Our web server, doing relatively well:
http://iua-share.upf.edu/ganglia/?r=year&c=IUA&h=iua-freesound.upf.es

Our poor database server, slowly but very surely drowning:
http://iua-share.upf.edu/ganglia/?r=year&c=IUA&h=freesound-db.upf.es

You don’t need to be a genius to tell that that line is going up too fast for our own good. There are just too many of you who want to use Freesound! Of course all of these problems will be solved with Freesound 2 a.k.a. Nightingale, but… we’re still hard at work there. For now, switching to a new, shiny, 4 core, 14GB RAM machine will probably help. Again, more about that later! ;)

Your host for tonight,

- Bram

Testing Solr…

Monday, June 2nd, 2008

After looking around for a search engine for Nightingale, and comparing features between all the various ones (from using tsearch2 on postgres to Sphinx to Solr to …) I’ve settled on Solr. Configuring and running Solr was (much) easier than expected at the start. After about a day of hacking around, I got a nice tag-browser running, with Ajax-ified searching through the tags. A bit more hacking around and I decided I would write a mash-up of all existing Python Solr wrappers. SolPython and solr.py, the one that’s included with Solr, seemed very unpythonic and little developed. PySolr on the other hand looked very nice, but there were some things in it I thought could be better. Particularly, I wondered why the authors (two known Python/Django devs) used the XML parsing and didn’t use the JSON output. When you search in Solr, you can tell it to reply in a number of output formats. They chose the XML output; I rewrote it to use the JSON output, and allowed for more output parsers to be written / plugged in.

Neither solr.py nor PySolr has classes for wrapping the search parameters. After reading through the docs I added a lightweight wrapper for a lot of the parameters.
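To make the JSON-versus-XML point a bit more concrete, here is a minimal sketch. It is not the solr.py from the repository: `SearchParams`, `build_select_url` and `parse_json_response` are hypothetical names made up for illustration, and the base URL assumes a local Solr install. The request parameters themselves (`q`, `rows`, `start`, `wt`, `facet`, `facet.field`) are standard Solr query parameters.

```python
import json
from urllib.parse import urlencode


class SearchParams:
    """Hypothetical lightweight wrapper around common Solr query parameters."""

    def __init__(self, query, rows=10, start=0, facet_fields=()):
        self.query = query
        self.rows = rows
        self.start = start
        self.facet_fields = list(facet_fields)

    def as_pairs(self):
        # A list of pairs (not a dict) because facet.field may repeat.
        pairs = [("q", self.query), ("rows", self.rows),
                 ("start", self.start), ("wt", "json")]
        if self.facet_fields:
            pairs.append(("facet", "true"))
            pairs.extend(("facet.field", f) for f in self.facet_fields)
        return pairs


def build_select_url(base_url, params):
    """Build the URL for Solr's /select handler."""
    return base_url.rstrip("/") + "/select?" + urlencode(params.as_pairs())


def parse_json_response(text):
    """Turn a Solr JSON reply into (num_found, docs, facets)."""
    data = json.loads(text)
    response = data["response"]
    facets = {}
    raw = data.get("facet_counts", {}).get("facet_fields", {})
    for field, flat in raw.items():
        # By default Solr's JSON writer returns facets as a flat list:
        # [value1, count1, value2, count2, ...]
        facets[field] = dict(zip(flat[0::2], flat[1::2]))
    return response["numFound"], response["docs"], facets


# A canned reply, so the example runs without a Solr server.
SAMPLE_REPLY = json.dumps({
    "response": {"numFound": 2, "start": 0,
                 "docs": [{"id": "1"}, {"id": "2"}]},
    "facet_counts": {"facet_fields": {"tag": ["bass", 2, "drum", 1]}},
})

url = build_select_url("http://localhost:8983/solr",
                       SearchParams("bass drum", facet_fields=["tag"]))
num_found, docs, facets = parse_json_response(SAMPLE_REPLY)
```

With JSON the reply maps straight onto Python dicts and lists, so the "parser" is little more than `json.loads` plus some reshaping of the facet lists.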

We keep track of searches in freesound, so we can “replay” those searches for testing purposes, and after a bit of testing I found out some interesting things. Using Solr and a relatively heavy set of output features (I want to see a lot of “faceting”), I tested a batch of 100K searches. It looks like I can run 50 queries per second on my macbook pro. As the set of documents in Freesound is relatively small (“only” 50K sounds), everything fits very nicely in a very small cache (only 128MB), including all faceting data.
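Replaying a search log boils down to a timing loop. The sketch below is not the freesound_test.py used for the numbers above; `replay` and `fake_search` are hypothetical names, and `fake_search` stands in for a real request to Solr so the example runs anywhere.

```python
import time


def replay(queries, search):
    """Run every logged query through `search` and return queries per second."""
    started = time.perf_counter()
    count = 0
    for query in queries:
        search(query)
        count += 1
    elapsed = time.perf_counter() - started
    return count / elapsed if elapsed > 0 else float("inf")


def fake_search(query):
    # Stand-in for a real Solr request; it only touches the query string.
    return query.lower().split()


logged_queries = ["bass drum", "rain", "door slam"] * 1000
qps = replay(logged_queries, fake_search)

# Rough wall-clock time for the batch mentioned above at the measured rate.
batch_seconds = 100_000 / 50   # 100K searches at 50 queries per second
```

At 50 queries per second a batch of 100K searches takes 2000 seconds, so a bit over half an hour per run.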

As before this source code is also open source, but -as Xavier gave me the go-ahead- this one is BSD instead of GPL. I will continue to release “support code” under the BSD license.

The code can be found here: http://iua-share.upf.edu/svn/nightingale/trunk/sandbox/solr/solr.py
The example code I used for benchmarking here: http://iua-share.upf.edu/svn/nightingale/trunk/sandbox/solr/freesound_test.py

Python/Solr people, feel free to send me any feedback!

wav2png.py, son of wav2png

Thursday, May 15th, 2008

Last week I decided that for nightingale we need a new wav2png, and preferably one written in python, using the awesome Python Imaging Library (PIL). After talking a bit to Ricard it was clear that using numpy and audiolab it would be a piece of cake. Well, a big piece of cake, but still. Once I got going, I went a bit overboard and decided that it would be nice to have a spectrogram of the sound as well, perhaps displayed when you move the mouse over the large image in the sound page.

It took me about 2 and a half days of coding and testing to make it robust (it needs to work for 5-sample wave files and 5-million-sample wave files) and looking good. Some sensible feedback from the guys at oneDotOnly made me decide that we had to cut back on the number of colors in the waveform view. The current one looks really ugly in my opinion, so… that was changed as well. It’ll take a while for people to become accustomed to the new colors, but it makes sense to me. I threw in some vertical anti-aliasing for that extra slick look.
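The robustness requirement (5 samples or 5 million) mostly comes down to how samples are mapped onto pixel columns. The sketch below shows just that reduction step in plain Python. It is an illustration under my own assumptions, not the code from wav2png.py, which uses numpy, PIL and audiolab and also does the drawing and the spectrogram; `column_peaks` is a hypothetical name.

```python
def column_peaks(samples, width):
    """Return one (min, max) pair per pixel column for a waveform image."""
    if width <= 0:
        raise ValueError("width must be positive")
    total = len(samples)
    peaks = []
    for x in range(width):
        # Integer arithmetic maps each column to a slice of the samples.
        start = x * total // width
        # Always look at one sample at least, so short files still fill
        # every column (neighbouring columns then repeat a sample).
        end = max((x + 1) * total // width, start + 1)
        chunk = samples[start:end]
        if chunk:
            peaks.append((min(chunk), max(chunk)))
        else:
            # Empty file: draw a flat line.
            peaks.append((0.0, 0.0))
    return peaks


tiny = [0.0, 0.5, -0.5, 1.0, -1.0]     # a 5-sample "file", fewer samples than columns
tiny_peaks = column_peaks(tiny, 10)

ramp = list(range(1000))               # more samples than columns
ramp_peaks = column_peaks(ramp, 10)
```

Each (min, max) pair becomes one vertical line in the image, which is why the same code path can serve both very short and very long files.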

For those who don’t know what a spectrogram is, have a look at the wikipedia entry for it.

Without further ado, I present you some results. First of all my own “test” file, a sinusoid sweep:

and its spectrogram:

An FM percussion loop from walkerbelm:

and its spectrogram:

A bell sequence from ERH:

and its spectrogram:

You can find the full source code to generate these images in the nightingale repository ( http://github.com/bram/freesound/tree/master ), in particular look in the directory /freesound/utils/audioprocessing/

You’ll need to install python, numpy, PIL and audiolab to make it work. See above for the links.

Let me know what you think!

Hey! Where did my bandwidth go to??

Thursday, March 6th, 2008

Bandwidth…

Freesound is using around 3 to 5 terabytes per month these days, being capped at 2 megabytes/sec. This upper limit is imposed by the university. UPF still runs a 100 Mbps network as far as I know, which means that with 2MB/sec (=16 Mbps) we’re using 16% of the whole university bandwidth if we are pushing the upper boundary! That’s pretty much all the time now.
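The numbers in that paragraph can be checked with a few lines of arithmetic. This is only a sketch: the 30-day month and decimal units (1 TB = 1,000,000 MB) are my assumptions.

```python
CAP_MB_PER_SEC = 2          # cap imposed by the university
UNIVERSITY_MBPS = 100       # network speed quoted above

cap_mbps = CAP_MB_PER_SEC * 8                      # megabytes -> megabits
share_of_network = cap_mbps / UNIVERSITY_MBPS      # fraction of the network

seconds_per_month = 30 * 24 * 60 * 60              # assuming a 30-day month
# Most we could possibly transfer in a month at the cap, in decimal terabytes.
max_tb_per_month = CAP_MB_PER_SEC * seconds_per_month / 1_000_000
```

A constant 2MB/sec adds up to roughly 5.2TB in a 30-day month, so a monthly total of 3 to 5TB means the cap is being hit most of the time.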

But, where does this bandwidth go to? Today I did an analysis of the last full 7 days to see which of the 4 would take up the most: sound packs, sound images (those colorful waveform displays), sound previews (the mp3 you hear when you hit play) or actual sounds! The total bandwidth used by these categories is 694.37GB, or split up:

  1. sounds 40.4% (280.63GB)
  2. previews 36.1% (250.63GB)
  3. packs 22.4% (155.61GB)
  4. images 1.1% (7.51GB)

First of all, we can completely ignore the images. They are not our problem. PNG is a very nice format for images with few colors, and our waveform displays are just that. Other than that it looks like sounds, previews and packs are… well, more or less using the same amount of bandwidth. I’m a bit surprised that packs are only 22%, I thought people liked packs a lot more than files. Let’s see what happens if we split up the sound category:

  1. wav 77.1% (216.34GB)
  2. aif 9.2% (25.96GB)
  3. mp3 8.2% (22.88GB)
  4. flac 5.3% (14.98GB)
  5. ogg 0.2% (0.47GB)

People really like uncompressed sound and we have a lot of it at Freesound: wave files lead the way with a huge 77%!

Flac…

Free Lossless Audio Codec (FLAC) is a file format for audio data compression. Being a lossless compression format, FLAC does not remove information from the audio stream, as lossy compression formats such as MP3, AAC, and Vorbis do. (taken from http://en.wikipedia.org/wiki/Flac)

What would happen if we compressed all wave files to flac (thanks to Nico for bringing this up so I had an excuse to do this analysis)? Flac has an average compression ratio of about 0.6 (a 1MB wave file becomes a 0.6MB flac file) so we would save about 550GB per month which is quite a bit!
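The post does not say which traffic the 550GB estimate is based on, so here is a hedged sketch of how such an estimate can be put together from the 7-day figures in the list above. `monthly_saving_gb` is a hypothetical helper, and the 30-day month is my assumption.

```python
def monthly_saving_gb(weekly_wav_gb, flac_ratio=0.6, days_per_month=30):
    """Estimate the bandwidth saved per month by serving FLAC instead of WAV."""
    weekly_saving = weekly_wav_gb * (1 - flac_ratio)
    return weekly_saving * days_per_month / 7


# Individually downloaded wav files in the 7-day sample (from the list above).
wav_only = monthly_saving_gb(216.34)

# Weekly wave traffic that would be needed to reach a 550GB monthly saving.
implied_weekly_gb = 550 / (1 - 0.6) * 7 / 30
```

Counting only the individually downloaded wav files gives about 371GB per month. A saving of 550GB corresponds to roughly 321GB of wave traffic per week, so the estimate presumably also counts wave files downloaded inside packs. That is a guess on my part, not something stated in the analysis.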

On the other hand flac is a bit of a pain to support: many people don’t know flac and like to be able to use their wave files immediately after downloading. There are no visual guides showing people how to use flac, nor is there a cross-platform unified interface for using flac. A great format, but not very user-friendly if you don’t like the command line! Flac developers, prove me wrong and I will make freesound the biggest proponent of flac!

Freesound 2.0 - A.k.a. Nightingale

Monday, March 3rd, 2008

So, we have finally succumbed to the pressure of web 2.0, we have been tagged, we tagged, we dugg, opened a flickr.com account and … created a blog for Freesound. Or should I say Nightingale? Well, let’s only use the Nightingale codename for what I devised it for: the new and improved Freesound two point ooooh.

Xavier and I have talked multiple times about rewriting parts of Freesound, but when Google gave us a research grant, it was a godsend! He asked me if I would like to come “back” (not being able to move back to Spain - staying in the same place - but in a digital kind of way anyway) to work for MTG on a freelance basis on Freesound. I thought about it, but being the good father to Freesound I try to be, how could I say no??

Mind you, that version of Freesound is still very far away! But we already know some of the things we really want in it. Here are the things I had said in the forum with some additional information added where needed.

  • All the code will be public and released under the GNU General Public License (v2 or 3, not sure yet). The code will be hosted on MTG’s code repository. You could ask, why not google code or sourceforge… well, I guess we could have done that but have opted for an environment we can easily control. As Jordi (our fearless sysadmin) said: what if you want to change to GIT (another “version control” tool for development) in a few months? Before anyone asks, the open source release will not include the “search similar” features of the website: this is a closed source technology, developed by MTG and sold by BMAT, MTG’s first spinoff.
  • Obviously we will add all the highly wanted features that we have been talking about for so long. This includes the support of 3 different licenses (public domain, attribution and attribution non-commercial), embedding of the sounds in other pages (add your freesound files to … your blog), adding images to your sounds etc etc. We will do another pass for “officially wanted features”, I’ll also go through the forum to collect everything in there.
  • The forum will be replaced by a hand-coded forum. This is good and bad: bad because it means more work, good because we don’t need to depend on phpBB anymore. PhpBB is great if you run a forum, but we run a website, and the forum is taking too prominent a place right now, it governs the whole site (the login system in freesound is based on phpBB). PhpBB being as omnipresent as it is, it is also highly plagued by spam. Another problem is that if you add hacks to phpBB (like… antispam) you need to re-implement your changes or do a very careful merge of the code on every upgrade. You could say, yes, but if you make a forum you will not have as many features and not as thoroughly tested software as phpBB. True, but … we don’t need all those features of phpBB and a smaller code base is easier to bugfix and maintain!! :)
  • All development will be done using the python framework called django. Aah, this is a difficult one. I’ve had many discussions with many people saying plone or typo3 or insert-favorite-cms-here is better. Frameworks versus content-management-systems is not my favorite subject, but I will say this: if you try to do things with a CMS that it wasn’t really made for you will end up fighting the CMS. And fighting that CMS will take you just as long as writing something from zero, in a framework with plenty of features (think RoR, django, turbogears, …). I just happen to know django, like python and I’ve got some experience with it.
  • Freesound has grown out of its initial parameters, so we will try to make the website more easily scalable. Freesound is taking up 2TB of bandwidth per month, and our database server is… suffering. We need some caching and we need it now. Greg has been working on some changes in the current freesound code to have more distributed bandwidth consumption over various squid servers all over the web. The plan is to roll out that changeset tomorrow (March 4th)! Dobroide has set up a test squid server in Sevilla. We will be blasting his computer into high gear in a few days. Poor CPU :)
  • A new and fresh skin/design most likely created in collaboration with oneDotOnly but this is still in discussion. Whatever the skin is, it will need to be cleaner than the mess we have right now. When my friend Javi did the initial skin for Freesound, who would have known that we would abuse it so badly in the future. 3 years of adding features with a skin that was designed to last 1 year. Ouch, sorry Javi!
  • Sound playback with soundmanager2. About 2 and a half years ago I found a bug in Mozilla’s flash plugin. I submitted the bug. 2.5 years have passed. Time to move on. Using this javascript + 1 flash instance per page plugin solves the problem (and in a very nice way!). It allows us to add as many sounds per page as we want without the flash player stopping playback. Have you ever noticed all sound goes nuts when you have 2 Freesound tabs open? That will be fixed.
  • Search capability with Apache Solr. Me and lately Gerard as well have been experimenting with Solr. Jordi wants to damn me to hell and back for wanting to use a java based server (“you’re going to make me install Tomcat? Grrr!” - maybe he is not so fearless!), but this thing is just too nice. A highly scalable search engine that will take care of all the annoying bits for us. In splicemusic we used SQL triggers to update data in a big search-table (which was probably not the smartest way either), but writing sql triggers can sometimes be… annoying. With django we can make a simple priority queue for documents that need updating (“hey, this document changed, you need to look at it again, Mr. search engine”). I’ve been looking around for other solutions but none seem as powerful (while staying easy-to-use) as Solr. Especially the faceting seems to be well worth it! Faceting is what you get in some shops: you searched for camera, we found digital cameras (150), analog cameras (2), tripods (15), etc. Imagine this for freesound: you searched for bass drum, we found samples with samplerate 44100 (15), 48000 (10), samples in packs Bram’s Bassdrum Pack (28), Jovica Heavy Hitters Pack (58), …
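The “priority queue for documents that need updating” from the Solr item could look something like the sketch below. This is only an in-memory illustration of the idea, not Nightingale code: in a django project the queue would more likely live in a database table, and `ReindexQueue`, `mark_changed` and `pop_batch` are hypothetical names. Sending the popped documents to Solr is left out.

```python
import heapq
import itertools


class ReindexQueue:
    """Hypothetical in-memory priority queue of documents that need re-indexing."""

    def __init__(self):
        self._heap = []
        self._pending = {}                  # doc_id -> most urgent priority queued
        self._counter = itertools.count()   # tie-breaker keeps insertion order

    def mark_changed(self, doc_id, priority=10):
        """Queue a document; a lower number means more urgent."""
        best = self._pending.get(doc_id)
        if best is not None and best <= priority:
            return  # already queued at the same or a higher urgency
        self._pending[doc_id] = priority
        heapq.heappush(self._heap, (priority, next(self._counter), doc_id))

    def pop_batch(self, size):
        """Return up to `size` document ids, most urgent first."""
        batch = []
        while self._heap and len(batch) < size:
            priority, _, doc_id = heapq.heappop(self._heap)
            # Skip stale entries superseded by a more urgent one.
            if self._pending.get(doc_id) == priority:
                del self._pending[doc_id]
                batch.append(doc_id)
        return batch


queue = ReindexQueue()
queue.mark_changed(1)               # sound 1 was edited
queue.mark_changed(2, priority=1)   # sound 2 is more urgent
queue.mark_changed(1)               # edited again: still queued only once
queue.mark_changed(3)
queue.mark_changed(3, priority=0)   # bumped to the front
batch = queue.pop_batch(10)
```

A background job would call `pop_batch` every so often and hand the result to the search engine, so a sound that is edited five times in a minute is only re-indexed once.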

So, to conclude: a lot of plans, a “todo” from here to the moon and back, but… it will be well worth it!