Playback UI mockup

23 January, 2008

Playback widgetry (Draft 1)

Here is a first draft mockup of a potential playback part of the UI. Notes:

  • Most of the search & browse etc. magic will appear in a sidebar when the button labelled “Browse >” is clicked.
  • I haven’t thought through the listview text and formatting yet, but I’m open to suggestions.

Also: as I post more of these I’m gonna want comments, dammit! 🙂


For the sake of organizing my own thoughts, here is an attempt to distill the wordy audio player-related posts I made last year into a draft of a clearer set of goals.  It is still fairly undigested for now, but I hope it will help bring focus to the next stage of brainstorming and prototyping.

Here is a draft, in no particular order:

  • Change assumptions about the roles of specific metadata fields:
    • Tracks ought to be incidental signposts within more interesting metadata groupings
      • Current players tend to [over]emphasize tracks as the only playback atoms in the UI
    • Use accurate terms in UI, and not 1337 h1pst0r mp3 slang
      • Please, ‘song’ -> ‘track’.  ‘Song’ is a logical subset of ‘track’.  Music tracks aren’t necessarily songs.  Audio tracks aren’t necessarily music.
  • Minimize disruptions to the user workflow with smarter shuffling, less time/fewer interruptions spent requesting music
    • More flexibly take advantage of metadata as simple basis for user-configured track grouping
      • “Random track” prevents appropriate track grouping, and “random album” is too rigid
      • Shuffled track grouping needs to be configurable, based at least on metadata rules; e.g., group Pink Floyd albums and classical pieces, but not Britney Spears
      • This would require good metadata to work well.
    • Try to imitate patterns in user playback requests; when shuffling, find track groupings with metadata similar to current playback [to avoid jarring style shifts] and/or to recent requests [to try to match user ‘mood’].
  • Be a team player as a desktop component – make it easy (e.g. through d-bus) for other applications to request playback
    • There ought to be a freedesktop.org “standard” for this..
  • Don’t reimplement general desktop infrastructure features, notably w.r.t.
    • Indexing; use beagle, tracker, strigi et al through xesam
    • File playback; use gstreamer
  • Maintain directed scope of application.  Monolithic do-everything applications are difficult to integrate nicely into a desktop.  Dammit, Jim, I’m a music playback app, not:
    • A tagger
    • An alarm clock
    • A portable music player manager
    • A wikipedia browser
    • A CD ripper
    • But not really sure about web radio etc…

I’ve been thinking by myself for too long about this, so please leave feedback and set me straight 🙂

I’m pleased to see there has remained some residual interest about the music player posts I made last year. Last summer, my friend Evan and I discussed the idea further and wrote some prototype code, but found that certain desirable pieces of the desktop infrastructure (notably xesam, but also improvements to existing components such as gvfs) were not yet mature enough for serious use.

Rest assured that I haven’t forgotten about this project, and more or less still hate iTunes. Progress since the summer on those desktop infrastructure projects gives me hope that it might become possible to begin more serious prototyping work later this year. If that happens, then I will dump my ideas here and hope for some guidance from everybody who reads this. Yes, that’s right – all 4 of you 🙂

I’ve gotten all sorts of good feedback since my friend ChipX86 was nice enough to add visibility to my post! 🙂 I had intended to avoid talking about any implementation details in that post, but many comments made me realize I had lost sight of that goal, so unfortunately my limited discussion of those details gave a few of you an understandably skewed view of how I’d like to approach this. Still, your comments were a helpful mix of suggestions, constructive criticism, and support. Thanks everyone, I appreciate it! There were enough comments that I’m not going to try answering all of them individually. My next steps will be writing post 2/2 about architecture (as promised) with many of y’all’s ideas in mind, possibly revising post 1/2 (or putting it on a wiki) to reflect my reaction to many of your concerns about my initial ideas, and then finally putting together my SoC application 🙂

As a college student studying engineering and music performance, and as a loonix geek, I’ve spent far too much time thinking about music players. Some of the first hacking projects I worked on in high school were tools to sensibly shuffle and play my music library. Now I’m more interested in the GNOME desktop, user interface design, and integration; I’d like to combine some of these ideas to take music playback on the GNOME desktop beyond iTunes.

In this frighteningly long post/article, I’m going to lay those ideas out in hopes of getting some helpful feedback. 🙂

State of the music playback and management: some context
First generation music players dutifully do their thing when the file manager tells them to. They show you the awesome metadata for each “song”¹. A point arrived a few years ago when people started realizing that a track’s metadata can be more useful than its file name in a single library, because file names are typically an inconsistent jumble that is redundant with the tagged metadata anyway. Searching through directories full of this sort of file is a pain unless you’re obsessive enough (like, say, me or David) to go through and sort things using some sort of recursive renaming tool and then tweak by hand.

Apple’s iTunes took a step toward rectifying the situation. It integrated nicely with the iPod and that crazy internet shopping thing, but it also took the metadata feature a step further, and made the jumbled file names invisible to the user. It favored metadata instead. This turned out pretty nicely, and a whole slew of GNOME applications have imitated this idea to varying extents: rhythmbox is the most obvious (since its goal seems to be to replicate iTunes as closely as possible), but banshee, quodlibet, listen, and muine do too, somewhat.

Those projects have done a good job with that, but I think we can do better. Here’s why:

  • Beagle and Tracker already index music metadata. Why index it separately? Let’s not.
  • None of the existing GNOME music players integrates all that well into the desktop infrastructure (nautilus, menus, the panel..) yet. Let’s make this better.
  • Shuffling has always been horribly sloppy in every music player I’ve ever seen. This is not an exaggeration. We can make track selection more intelligent, and improve transitions.

¹ Pet peeve: this term is perpetually inaccurate in music-playing software. I don’t care how hip you’re trying to be, or what your target denominator is. A much less annoying word choice is “track.” A “song” is a specific term with a specific meaning. That meaning is not “the stuff in this sound file” any more than it is a measure of disk space. A symphony is not a “song,” a jazz tune is not necessarily a “song,” and anything that’s a dance is not a “song,” so please don’t write software that assumes every track in my library is a song. Thank you.

Bone to Pick #1: Metadata indexing
The second generation of music players was ahead of the desktop curve in terms of the metadata/tagging craze. Desktops are slowly evolving in that direction; in GNOME this is driven by the beagle and tracker indexing tools. If this trend continues, file names may no longer be relevant on desktops within a few years. Awesomely, these services also index music. Yay! This means we shouldn’t need to do that separately in music players any more, right? Jamie McCracken has stated plans to start implementing this via a tracker backend into rhythmbox soon, but in the meantime consider this (very nonscientific) survey of system resources consumed by some GNOME music players for my ~5000 track library:

  • Memory consumption on startup:
    • banshee: 62M
    • listen: 78M (more than firefox with 4 tabs open to fancy JS pages!)
    • quodlibet: 46M
    • rhythmbox: 32M
  • Disk space consumption of library index:
    • banshee: 7M
    • rhythmbox: 3M
    • quodlibet: 2M
    • listen: 7M

Bone to Pick #2: Shuffling
As I said above, this is horribly sloppy in every single player I have ever seen or used. Music players have supported this feature since they first appeared, so it’s kind of impressive to me that nobody does it right. A proper shuffler should be able to semi-intelligently group tracks together, and insert an appropriate brief delay between tracks and groups, pretty much like a DJ on the radio who might play more than one genre. Yet I’ve only ever observed exactly two approaches to grouping files for shuffling, and both are rigid and simplistic. I also haven’t seen audio transition options in any of the GNOME players.

The most common approach (a la iTunes, rhythmbox, banshee, listen, and quodlibet) is to select a random track out of the library’s long list of tracks. With this scheme, it is inevitable (in my library, certainly) that the shuffler picks something from the middle of a symphony, or that 30-second “Stop” track from The Wall, or a spoken-word segue in a Kanye West album, or some other track that generally doesn’t make sense out of context. This drives me crazy. Unless your music library is made up of Britney Spears-alikes and random singles (which is admittedly actually kind of common), this approach to shuffling will not be very satisfying, and will demand frequent intervention by the user.

The second common approach has been to select and play a random album (as in muine and quodlibet). This approach has drawbacks too: any tracks I have that are missing an album tag tend to get ignored. Further, if I start listening to certain sets end-to-end (e.g., any Beck albums, or Bach or Shostakovich Preludes and Fugues) I’m likely to go slightly crazy and move on.

It seems clear to me that rigidly enforcing either of these shuffling schemes doesn’t work well. I want to be able to group some albums (like The Wall) entirely together, split some into smaller groups (e.g., each prelude paired with its fugue), or even play every track separately (such as Beck albums). How to go about doing this? Most generally, different genres ought to be grouped and treated differently by default, according to the traditions of that genre. Multi-movement classical pieces, modal jazz albums (hoo-ray Miles Davis!), and progrock typically require the context of the rest of the group in order to make sense. Folk music, most rock, big band jazz, and many ethnic musics are sometimes placed on an album for convenience rather than artistic cohesion, and users may be used to hearing them as singles on the radio, so there is often little reason for them to be grouped together. Clearly these defaults are rough stereotypes; it would still be necessary to provide a mechanism to tweak grouping individually, but it would be a start.
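To make the grouping idea a bit more concrete, here is a rough Python sketch (not code from any existing player). The `Track` fields, the `GENRE_DEFAULTS` table, and the two policy names "album" and "single" are all made up for illustration; a real version would pull its metadata from an indexer and would need at least one more rule for in-between groupings like prelude-plus-fugue pairs.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    title: str
    album: str   # empty string when the album tag is missing
    genre: str
    number: int  # position within the album


# Hypothetical per-genre defaults: "album" keeps an album together,
# anything not listed falls back to "single" (every track on its own).
GENRE_DEFAULTS = {"classical": "album", "progrock": "album", "modal jazz": "album"}


def group_tracks(tracks, overrides=None):
    """Split a library into playback groups.

    overrides maps an album name to "album" or "single" and wins over
    the genre default, so individual albums can be tweaked by the user.
    """
    overrides = overrides or {}
    albums = defaultdict(list)
    groups = []
    for track in tracks:
        policy = overrides.get(track.album, GENRE_DEFAULTS.get(track.genre, "single"))
        if policy == "album" and track.album:
            albums[track.album].append(track)
        else:
            # Singles and tracks with no album tag stand alone, so they
            # are never ignored by the shuffler.
            groups.append([track])
    for members in albums.values():
        groups.append(sorted(members, key=lambda t: t.number))
    return groups


def shuffled_playlist(tracks, overrides=None, rng=random):
    """Shuffle whole groups, then flatten them back into a track list."""
    groups = group_tracks(tracks, overrides)
    rng.shuffle(groups)
    return [track for group in groups for track in group]
```

With this, The Wall always plays front to back, a Beck album gets scattered track by track, and passing something like `overrides={"Mellow Gold": "album"}` flips the default for one album.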

The last shuffling issue I’d like to address is that of track transitions. This is another situation where traditions vary by musical genre and artist. Typically in classical music, for example, you don’t want to add any extra delay in between movements, because it will fubar the flow, but every player I’ve used adds a delay while it loads the next file. On the other hand, it’s traditional in many popular idioms (and can sound rather nice) to let a finished song hang in the air for a second or two between tracks, or to crossfade (I understand this is possible in Amarok.. why shouldn’t GNOME players be able to do this?).
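As a rough illustration of per-genre transitions, the sketch below only decides which transition to use; it does not touch any audio backend. The genre names, the timings, and the idea of identifying a group by a plain string id are assumptions picked for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    gap_seconds: float = 0.0        # silence inserted between two tracks
    crossfade_seconds: float = 0.0  # overlap between two tracks


GAPLESS = Transition()
DEFAULT = Transition(gap_seconds=1.0)
BETWEEN_GROUPS = Transition(gap_seconds=2.0)

# Hypothetical defaults for transitions *inside* a group, keyed by genre.
GENRE_TRANSITIONS = {
    "classical": GAPLESS,                            # don't break up movements
    "rock": Transition(gap_seconds=1.5),             # let the song hang in the air
    "electronic": Transition(crossfade_seconds=3.0),
}


def pick_transition(prev, nxt):
    """Choose a transition between two tracks.

    prev and nxt are (genre, group_id) tuples. Tracks in the same group
    use their genre's default; a change of group gets a longer pause.
    """
    prev_genre, prev_group = prev
    _next_genre, next_group = nxt
    if prev_group == next_group:
        return GENRE_TRANSITIONS.get(prev_genre, DEFAULT)
    return BETWEEN_GROUPS
```

So two movements of the same symphony come out gapless, while going from a symphony to a rock single gets the longer between-groups pause.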

Bone to Pick #3: Desktop Integration
And now the fun part. Music playback has become a staple task for desktop computers. Desktop instant messaging developers recognized the same fact about their arena, and started the telepathy project, yet I haven’t seen anything like this for music playback.

I think we can avoid the monolithic (and frankly often sluggish) iTunes-alike music playback approach by adopting a decentralized dbus-based music infrastructure reminiscent of the telepathy design:

  • A playback application with a minimal interface. Its jobs could be to
    • Prioritize playback requests from other components (or let PulseAudio do this?)
    • Handle *all* aspects of playback using gstreamer
    • Higher-priority playback requests preempt lower-priority playback requests by pausing, fading or muting to background
    • Lower-priority playback requests queued for playback after higher-priority requests
    • Not need to know anything about music libraries
    • React intelligently to preemption requests (for new files or from PulseAudio)
    • Sit (in the panel?) and respond to control and information requests over dbus.
  • A shuffling app to configure and generate pseudorandom/ambient playback (like a radio DJ) appropriately for the relevant style of music
    • Play at lowest priority; requests probably preempted in the player when specific files are requested by other components
  • A metadata search app to find and play specific sets/tracks – why not do this in nautilus using beagle or tracker?
    • Send requests to player when user opens a set or track
    • Inside music:/// URI in nautilus?
  • A podcast/internet radio browsing app – could even be firefox+google reader..
  • A tagging/metadata management app?
  • An iTunes store app?
  • A portable music player handling app?
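The request-prioritizing part of the player could be sketched roughly like this. It is only the scheduling logic: there is no dbus or gstreamer in it, and the class name, the priority constants, and the URIs are invented for the example.

```python
import heapq
import itertools

# Hypothetical priority levels; a bigger number wins.
SHUFFLE = 0   # ambient playback generated by the shuffling app
USER = 10     # a specific set/track the user asked for


class PlaybackQueue:
    """Tracks what is playing now and what is waiting to play."""

    def __init__(self):
        self._heap = []                    # entries: (-priority, seq, uri)
        self._counter = itertools.count()  # tie-breaker: first come, first served
        self.current = None                # (priority, uri) or None

    def _push(self, priority, uri):
        heapq.heappush(self._heap, (-priority, next(self._counter), uri))

    def request(self, uri, priority):
        """Ask for playback. Returns True if this preempted the current item."""
        if self.current is None:
            self.current = (priority, uri)
            return False
        if priority > self.current[0]:
            # Park the interrupted item so it comes back later. For
            # simplicity it rejoins behind anything already waiting at
            # its own priority.
            self._push(*self.current)
            self.current = (priority, uri)
            return True
        self._push(priority, uri)
        return False

    def finished(self):
        """Call when the current item ends; returns the next one or None."""
        if self._heap:
            neg_priority, _seq, uri = heapq.heappop(self._heap)
            self.current = (-neg_priority, uri)
        else:
            self.current = None
        return self.current
```

A real player would pause, fade, or mute the preempted stream rather than just re-queue it, but the ordering rule is the same: specific requests cut in front of the shuffler, and the shuffler picks up again afterwards.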

Conclusions
Mostly in this post I’ve focused on what I’d like to do. In Part 2, I’ll discuss more about how I plan to do it. I’d love feedback from anyone who happens to read this. My secret plan is to apply to GNOME to start part of this as a Google SoC project!