Tales of software craftsmanship

A summer of transcoding for Amarok

It’s been GSoC season for over a month now and I haven’t blogged, so now I’m going to try to fix that. After last year’s Multilevel playlist sorting project, one of my proposals has been accepted again for GSoC 2010: I’m going to implement on-the-fly transcoding in Amarok.

Amarok is a music player and manager built around very general concepts of tracks and media sources. The collection tries to decouple the format from the data itself and presents the music as tracks (with metadata) rather than files. In some cases, music isn’t even stored in local files. These concepts, and others, allow one to truly rediscover music through seamless integration of internet sources and media devices, and the user doesn’t have to care where the actual data comes from. The many sources at one’s fingertips are accessible in a consistent way and playable from the playlist.

However, even in this day and age of stuff in the cloud, there are situations in which the user still has to worry about media formats, e.g. when acquiring new music, or copying existing music from one collection to another or from the collection to a portable music player. That’s where transcoding kicks in.

For example, one might have a quantity of Windows Media Audio files that should be transcoded to a more Free format in order to be usable in the future, or a quantity of Monkey’s Audio files, which, while lossless, are not well supported everywhere, especially in PMPs. And then of course, even if someone has a collection full of FLAC files, which is a reliable and Free codec, a conversion into a lossy format such as Ogg Vorbis or MP3 might be necessary for use with a PMP simply for reasons of storage capacity.

So my idea is this: whenever the user can copy files, give him or her the choice to either just copy, just transcode or transcode with custom options. That way, we cover both of the following use cases:

  • “I’m running late for a 4 hour train ride and I haven’t updated the music collection on my portable player, I need to quickly copy over my tunes while making sure they will be compatible with the portable player”
  • “if I tweak the quality rating of the Vorbis encoder exactly the way I want it I’m going to save 1% of the space on my portable music player and still get the audio quality my sensitive ears deserve”.

The current situation is that the transcoding operation (in the strictest possible sense) works, so the next thing I have to do is integrate it nicely with Amarok’s existing collections framework. The current implementation uses FFmpeg, but I’ve placed FFmpeg-specific stuff in a wrapper class so something else could quite easily be used in the future if the need arises.
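
Purely as an illustration of the wrapper idea, here is a minimal sketch in Python (not Amarok’s actual code, which is C++). The names `TranscodingBackend`, `FFmpegBackend` and `plan_copy`, as well as the small codec table, are hypothetical. The FFmpeg options used (`-y`, `-i`, `-vn`, `-c:a`, `-q:a`) are those of current FFmpeg command lines, and the sketch assumes the listed encoders are compiled into the local FFmpeg build.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class TranscodingBackend(ABC):
    """Hypothetical interface: everything tool-specific lives behind it."""

    @abstractmethod
    def command(self, source: str, target: str, codec: str,
                quality: Optional[int] = None) -> List[str]:
        """Return the argument list that transcodes source into target."""


class FFmpegBackend(TranscodingBackend):
    # Assumed mapping from a user-facing format name to an FFmpeg encoder.
    ENCODERS = {"vorbis": "libvorbis", "mp3": "libmp3lame", "flac": "flac"}

    def command(self, source, target, codec, quality=None):
        if codec not in self.ENCODERS:
            raise ValueError(f"unsupported codec: {codec}")
        # -y overwrites the target, -vn drops any video/cover-art stream.
        args = ["ffmpeg", "-y", "-i", source, "-vn",
                "-c:a", self.ENCODERS[codec]]
        if quality is not None:
            # Encoder-specific quality scale, passed through unchanged.
            args += ["-q:a", str(quality)]
        return args + [target]


def plan_copy(backend, source, target, mode, codec="vorbis", quality=None):
    """Map the three user choices onto an action.

    mode is "copy" (plain file copy), "transcode" (encoder defaults)
    or "custom" (user-supplied quality). Returns None for a plain copy,
    otherwise the command line the backend would run.
    """
    if mode == "copy":
        return None
    if mode == "transcode":
        return backend.command(source, target, codec)
    if mode == "custom":
        return backend.command(source, target, codec, quality)
    raise ValueError(f"unknown mode: {mode}")
```

Because callers only ever see `TranscodingBackend`, swapping FFmpeg for another tool would mean adding one more subclass rather than touching the copy logic.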

The following screenshot represents the current state of the still quite unfinished transcoding GUI.

On a somewhat unrelated note, I’ve been to the KDE Multimedia+Edu sprint in Randa, Switzerland.

It was a lot of fun and very productive too. I wish to thank the whole organizing team. Special thanks go to Mario Fux for his mad organizational skills, to the cooking team, with whom I had the pleasure of sharing the kitchen while preparing vegan stuff, and to Knut Yrvin for arranging a much needed meeting with the Brisbane office of Nokia, Qt Development Frameworks regarding QtMultimedia and the future of Phonon. Finally, thanks to Anne-Marie Mahfouf for a gift she gave me which allowed me to taste again something I like very much but haven’t been able to eat because of a nickel allergy.

13 Comments

  • Reply Burke |

    That would be a nice feature, assuming Amarok detects the portable music device. But most of the time Amarok does not work with those, not to mention music CDs. Almost every time, I have to use another tool to play a CD or to refill my PMP. Horrible experience.

  • Reply Jack |

    You should encourage people to transcode between lossy formats. Most of the time transcoding is unnecessary. Users are convinced that they need to transcode some file to or from mp3 when in fact their player will play some other codecs just fine. Perhaps there could be a database of player capabilities and a warning telling the user that transcoding isn’t necessary.

    The average user is convinced that converting between lossy formats (mp3, aac, wma, ogg) is lossless.

    In the example screenshot you describe transcoding from a lossy format to another (Ogg Vorbis) as “High Quality”. That’s a cruel joke.

    You should check out what the guys in Gnome are doing with GStreamer and Rygel about transcoding.

    See: http://blogs.gnome.org/uraeus/2010/06/14/dlna-gstreamer-and-gst-convenience/

    Quote from that article: “No matter if you are doing automatic DLNA transcoding or manual transcoding to your device, you normally want to transcode as little as possible. Dumb transcoding (which is what my current Transmageddon profiles do, just transcoded the audio and video to a known working target, regardless of if either the incoming video or audio already was in an acceptable state, thus taking more resources and decreasing audio and video quality more than needed. With smart transcoding you instead cross check between input and output the possibilities and you figure out an optimal remuxing and or transcoding strategy. Thanks to gst-convenience this is now easy to do in Rygel and it will be easy to do in Transmageddon.”

  • Reply Jack |

    In the above, “should” should read “shouldn’t” in the following phrase: “You should encourage people to transcode between lossy formats.”

    Sorry about that.

  • Reply Timo |

    I’d rather have a persistent setting so I won’t have to fiddle with it every time I move files to my PMP. Maybe a middle-ground solution would be good, something like a dialog that asks if you want to use your stored settings or something else. It’d also be nice if it could be configured to only transcode certain types of files, i.e. only transcode if the format is unsupported by the PMP or if the file is lossless. Great to hear this is being worked on though, it’s pretty much the only thing I’m missing in Amarok 2.

  • Reply Pygmalius |

    The way Banshee implements this is that moving to a Portable device automatically transcodes into a suitable format if needed.
    It has sensible default encoding options (Kinda “mid-quality”, good compromise between size and quality), but user can edit these on a per format/device (not sure which as I only have one) basis.

    Which is similar to how the dialog box makes it look, but Téo also gave the option to just copy without transcoding :P.

    It might be good if it warned you about transcoding from lossy to lossy formats due to quality loss, but this might end up being annoying.
    One possibility would be to “rank” different formats and bitrates by quality and select a good default, for example just lower the output quality if the input isn’t good enough to merit it, and add a small warning if user sets it higher.

    Good luck with the project Téo anyway, I was actually tempted to apply for this myself but I was too young to go this year :P, nice to see someone is doing it, especially someone with experience working on Amarok

  • Reply zach |

    That would be excellent. Thank you very much for this player, I have tried all the others and none come close.

  • Reply dave |

    This is THE killer feature that I’ve been waiting for! I’ve spent years managing multiple copies of my collection in both flac and aac. This, along with a few scripts, is how I’ve been managing my iPod in Amarok since I switched to the 2.x series.

    I’d encourage you to make the transcoding process multi-threaded. I’m sure many Amarok users are using multi-core processors. I’ve an i7 which can happily transcode 8 songs concurrently. You can imagine the speed boost this would give copying 4000 songs to your iPod.

    Any idea when we can expect to see this feature appear in an Amarok release?

    • Reply Téo |

      We’re using FFmpeg for transcoding, so it’s already out of process with respect to Amarok, and the threading is obviously handled by FFmpeg itself. For now we’re calling FFmpeg instances sequentially because it simplifies the code *a lot* in some places, but there might be room for improvement. When it gets released feel free to test how it behaves on multi-core CPUs and let us know 🙂
      I haven’t merged the transcoding branch yet but I plan to do it soon so it should be released with Amarok 2.4 in a few months.

  • Reply dave |

    Although ffmpeg certainly is capable of threading, it ultimately comes down to the underlying library, and in the case of MP3 (libmp3lame) I’m pretty sure ffmpeg will not multi-thread. In gtkpod I can specify a maximum number of threads and it will transcode up to that number of files concurrently.

    I have no experience with KDE programming but I rather like working with Qt Threads. Being able to interface with threads through signals and slots removes a lot of minutiae.

    I just ran a few tests:
    flac -> mp3
    ffmpeg -y -i test.flac test.mp3 -threads 1 10.39s user 0.10s system 99% cpu 10.519 total
    ffmpeg -y -i test.flac test.mp3 -threads 2 10.52s user 0.09s system 99% cpu 10.637 total
    ffmpeg -y -i test.flac test.mp3 -threads 3 10.50s user 0.07s system 99% cpu 10.588 total
    ffmpeg -y -i test.flac test.mp3 -threads 4 10.46s user 0.09s system 99% cpu 10.579 total
    ffmpeg -y -i test.flac test.mp3 -threads 5 10.55s user 0.07s system 99% cpu 10.643 total
    ffmpeg -y -i test.flac test.mp3 -threads 6 10.44s user 0.05s system 99% cpu 10.509 total
    ffmpeg -y -i test.flac test.mp3 -threads 7 10.66s user 0.11s system 99% cpu 10.793 total
    ffmpeg -y -i test.flac test.mp3 -threads 8 10.69s user 0.09s system 99% cpu 10.800 total

    On a machine with 8 available cores, there is no advantage to be found from specifying threads to ffmpeg. On each of the above tests, htop showed one core pegged and the others idle. I’d venture that flac to mp3 conversions will be a large use case when this feature is added.

    Another test:
    flac -> aac
    ffmpeg -y -i test.flac test.aac -threads 1 6.74s user 0.07s system 99% cpu 6.826 total
    ffmpeg -y -i test.flac test.aac -threads 2 6.58s user 0.06s system 99% cpu 6.654 total
    ffmpeg -y -i test.flac test.aac -threads 3 6.58s user 0.08s system 99% cpu 6.671 total
    ffmpeg -y -i test.flac test.aac -threads 4 6.56s user 0.06s system 99% cpu 6.635 total
    ffmpeg -y -i test.flac test.aac -threads 5 6.56s user 0.06s system 99% cpu 6.640 total
    ffmpeg -y -i test.flac test.aac -threads 6 6.63s user 0.07s system 99% cpu 6.713 total
    ffmpeg -y -i test.flac test.aac -threads 7 6.55s user 0.09s system 99% cpu 6.652 total
    ffmpeg -y -i test.flac test.aac -threads 8 6.54s user 0.09s system 99% cpu 6.644 total

    It seems we can’t thread when converting to aac either. I’ll bet mp3 and aac are the two most popular formats on an iPod.

    Last one:
    flac -> ogg
    ffmpeg -y -i test.flac test.ogg -threads 1 2.01s user 0.03s system 99% cpu 2.045 total
    ffmpeg -y -i test.flac test.ogg -threads 2 1.98s user 0.05s system 99% cpu 2.043 total
    ffmpeg -y -i test.flac test.ogg -threads 3 1.97s user 0.08s system 99% cpu 2.056 total
    ffmpeg -y -i test.flac test.ogg -threads 4 1.99s user 0.04s system 99% cpu 2.042 total
    ffmpeg -y -i test.flac test.ogg -threads 5 1.98s user 0.06s system 99% cpu 2.043 total
    ffmpeg -y -i test.flac test.ogg -threads 6 1.95s user 0.08s system 99% cpu 2.039 total
    ffmpeg -y -i test.flac test.ogg -threads 7 1.99s user 0.04s system 99% cpu 2.039 total
    ffmpeg -y -i test.flac test.ogg -threads 8 1.98s user 0.06s system 99% cpu 2.054 total

    I suspect we’ll see this among most audio formats. Here’s a better way to do it. The following script:
    #! /bin/bash
    time (
    for n in $(seq 1 8)
    do
    ffmpeg -y -i test.flac test$n.mp3 > /dev/null 2>&1 &
    done

    wait
    )

    produces the following output:
    real 0m19.311s
    user 2m26.664s
    sys 0m0.787s

    (it even gets the fans spinning)

    I suspect the reason the above test takes about 20s instead of 10 (the speed of individual conversions) is that the machine isn’t truly an 8 core machine, just hyper-threaded. In summary, I think threading by spawning concurrent ffmpeg processes is a worthwhile endeavor, possibly at some point in the future.
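
The same idea, capping the number of concurrent ffmpeg processes instead of launching them all at once, could be sketched in Python roughly as follows. This is only an illustration, not anything Amarok or gtkpod actually does: `run_ffmpeg` and `transcode_all` are hypothetical names, the `runner` parameter exists just so the scheduling can be tried without ffmpeg installed, and the ffmpeg invocation is the plain `ffmpeg -y -i input output` form used in the tests above.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def run_ffmpeg(source: str, target: str) -> int:
    """Run one ffmpeg process, discard its output, return its exit code."""
    result = subprocess.run(
        ["ffmpeg", "-y", "-i", source, target],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


def transcode_all(jobs: Sequence[Tuple[str, str]], max_workers: int = 4,
                  runner: Callable[[str, str], int] = run_ffmpeg) -> List[int]:
    """Transcode (source, target) pairs with at most max_workers
    ffmpeg processes alive at once.

    Exit codes are returned in the same order as the input jobs.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each worker thread just waits on its child process, so the
        # encoding work itself is spread across cores by the OS.
        return list(pool.map(lambda job: runner(*job), jobs))
```

Calling `transcode_all([("test.flac", f"test{n}.mp3") for n in range(1, 9)], max_workers=8)` would mirror the shell loop above, while a lower `max_workers` keeps some cores free.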

    • Reply Téo |

      Thanks for the tests, this is good to know. Simply launching multiple ffmpeg instances shouldn’t be too hard and I’ll definitely look into it at some point.

  • Reply Cyril |

    Thanks for the feature. It has been out for a while now, but it is still one way only. I wouldn’t transcode anything from my PMP (those files are already vorbis) to my collection. On the other hand, it would be wonderful to be able to transcode the flac files from my local collection to vorbis on my PMP, but that’s still not possible. Do you plan to extend the feature to make it bidirectional?

    • Reply Teo |

      Right now I’m focused on some other Amarok related tasks, but I’m not excluding such a feature from being implemented in the future.
