Adding Replay Gain in Linux Automatically

UPDATE February 1, 2010: This post is now somewhat obsolete, as mp3gain supports writing to ID3v2 tags instead of APE tags. Read the warning below.

UPDATE July 17, 2010: WARNING: mp3gain version 1.5.1 (latest stable command line version) seems to corrupt ID3v2 tag information with the “-s i” option (write info into ID3v2 tags instead of default APE tags). For me, it seems to corrupt the JPEG image data inside the ID3v2 tag for some reason. If you care about the existing ID3v2 tags in your mp3 files, do NOT use mp3gain on them with the “-s i” option (the “-s a” option which is the default and only writes to APE tags, does not do any harm if you only care about APE tags). A good tool to check your tag info (ID3v2, APE, etc.) is with MP3 Diags (if you’re on Arch Linux, get it here). So for now, the method below is still my preferred way of doing things (mp3gain with APE and then ape2id3.py).

Replay gain tags in mp3, flac, and ogg files lets the player adjust the volume accordingly to make the song sound not too loud and not too soft. The Music Player Daemon (aka MPD), my favorite music player, recognizes replay gain tags if present. However, for mp3 files, the popular APE format for replay gain tags are not supported; MPD can only read ID3 tags for replay gain for mp3’s. Luckily, an mp3 file can have both APE and ID3 tags. This means that we can use mp3gain (a cute, simple command-line app available in pretty much every Linux distro) to add APE tags into our mp3s with:

mp3gain [file(s)]

This will add APE replay gain tags into the mp3 file(s) chosen. Then, we can use a simple script to read these APE replay gain tags, convert them into ID3 tags, and put these ID3 tags into the same mp3s. Such a script, thankfully, already exists here. Now, the mp3 file will have both APE and ID3 tags for replay gain values! And MPD will happily use the ID3 values.

If you’re in a hurry, you’d automate this process for entire directories, recursively. That’s what this most excellent Linux page on replay gain page suggests. (You should REALLY bookmark that page, as it has the best information hands down about replay gain in Linux.) I’ve modified the scripts for metaflac there to make it work with mp3’s instead. Here are the two scripts:

Script A:

#!/bin/bash
# Define error codes
ARGUMENT_NOT_DIRECTORY=10
FILE_NOT_FOUND=11
# Check that the argument passed to this script is a directory.
# If it's not, then exit with an error code.
if [ ! -d "$1" ]
then
    echo -e "33[1;37;44m Arg "$1" is NOT a directory!33[0m"
    exit $ARGUMENT_NOT_DIRECTORY
fi
echo -e "33[1;37m********************************************************33[0m"
echo -e "33[1;37mCalling tag-mp3-with-rg.sh on each directory in:33[0m"
echo -e "33[1;36m"$1"33[0m"
echo ""
find "$1" -type d -exec ~/syscfg/shellscripts/replaygain/mp3/tag-mp3-with-rg.sh '{}' \;

Script B (the ‘tag-mp3-with-rg.sh’ script referenced above in Script A):

#!/bin/bash
# Error codes
ARGUMENT_NOT_DIRECTORY=10
FILE_NOT_FOUND=11
# Check that the argument passed to this script is a directory.
# If it's not, then exit with an error code.
if [ ! -d "$1" ]
then
    echo -e "33[1;37mArg "$1" is NOT a directory!33[0m"
    exit $ARGUMENT_NOT_DIRECTORY
fi
# Count the number of mp3 files in this directory.
mp3num=`ls "$1" | grep -c \\.mp3`
# If no mp3 files are found in this directory,
# then exit without error.
if [ $mp3num -lt 1 ]
then
    echo -e "33[1;33m"$1" 33[1;37m--> (No mp3 files found)33[0m"
    exit 0
else
    echo -e "33[1;36m"$1" 33[1;37m--> (33[1;32m"$mp3num"33[1;37m mp3 files)33[0m"
fi
# Run mp3gain on the mp3 files in this directory.
echo -e ""
echo -e "33[1;37mForcing (re)calculation of Replay Gain values for mp3 files and adding them as APE2 tags into the mp3 file...33[0m"
echo -e ""
# first delete any APE replay gain tags in the files
mp3gain -s d "$1"/*.mp3
# add fresh APE tags back into the files
mp3gain "$1"/*.mp3
echo -e ""
echo -e "33[1;37mDone.33[0m"
echo -e ""
echo -e "33[1;37mAdding ID3 tags with the same calculated info from above...33[0m"
echo -e ""
# the -d is for debug messages if there are any errors, and the -f is for overwriting any existing ID3 replay gain tags
~/syscfg/shellscripts/replaygain/mp3/ape2id3.py -df "$1"/*.mp3
echo -e ""
echo -e "33[1;37mDone.33[0m"
echo -e ""
echo -e "33[1;37mReplay gain tags (both APE and ID3) successfully added recursively.33[0m"
echo -e ""

And here is the APE to ID3 conversion script from the link above (the ‘ape2id3.py’ script called from Script B):

#! /usr/bin/env python

import sys
from optparse import OptionParser

import mutagen
from mutagen.apev2 import APEv2
from mutagen.id3 import ID3, TXXX

def convert_gain(gain):
   if gain[-3:] == " dB":
       gain = gain[:-3]
   try:
       gain = float(gain)
   except ValueError:
       raise ValueError, "invalid gain value"
   return "%.2f dB" % gain

def convert_peak(peak):
   try:
       peak = float(peak)
   except ValueError:
       raise ValueError, "invalid peak value"
   return "%.6f" % peak

REPLAYGAIN_TAGS = (
   ("mp3gain_album_minmax", None),
   ("mp3gain_minmax", None),
   ("replaygain_album_gain", convert_gain),
   ("replaygain_album_peak", convert_peak),
   ("replaygain_track_gain", convert_gain),
   ("replaygain_track_peak", convert_peak),
)

class Logger(object):
   def __init__(self, log_level, prog_name):
       self.log_level = log_level
       self.prog_name = prog_name
       self.filename = None

   def prefix(self, msg):
       if self.filename is None:
           return msg
       return "%s: %s" % (self.filename, msg)

   def debug(self, msg):
       if self.log_level >= 4:
           print self.prefix(msg)

   def info(self, msg):
       if self.log_level >= 3:
           print self.prefix(msg)

   def warning(self, msg):
       if self.log_level >= 2:
           print self.prefix("WARNING: %s" % msg)

   def error(self, msg):
       if self.log_level >= 1:
           sys.stderr.write("%s: %s\n" % (self.prog_name, msg))

   def critical(self, msg, retval=1):
       self.error(msg)
       sys.exit(retval)

class Ape2Id3(object):
   def __init__(self, logger, force=False):
       self.log = logger
       self.force = force

   def convert_tag(self, name, value):
       pass

   def copy_replaygain_tag(self, apev2, id3, name, converter=None):
       self.log.debug("processing '%s' tag" % name)

       if not apev2.has_key(name):
           self.log.info("no APEv2 '%s' tag found, skipping tag" % name)
           return False
       if not self.force and id3.has_key("TXXX:%s" % name):
           self.log.info("ID3 '%s' tag already exists, skpping tag" % name)
           return False

       value = str(apev2[name])
       if callable(converter):
           self.log.debug("converting APEv2 '%s' tag from '%s'" %
                          (name, value))
           try:
               value = converter(value)
           except ValueError:
               self.log.warning("invalid value for APEv2 '%s' tag" % name)
               return False
           self.log.debug("converted APEv2 '%s' tag to '%s'" % (name, value))

       id3.add(TXXX(encoding=1, desc=name, text=value))
       self.log.info("added ID3 '%s' tag with value '%s'" % (name, value))
       return True

   def copy_replaygain_tags(self, filename):
       self.log.filename = filename
       self.log.debug("begin processing file")

       try:
           apev2 = APEv2(filename)
       except mutagen.apev2.error:
           self.log.info("no APEv2 tag found, skipping file")
           return
       except IOError:
           e = sys.exc_info()
           self.log.error("%s" % e[1])
           return

       try:
           id3 = ID3(filename)
       except mutagen.id3.error:
           self.log.info("no ID3 tag found, creating one")
           id3 = ID3()

       modified = False
       for name, converter in REPLAYGAIN_TAGS:
           copied = self.copy_replaygain_tag(apev2, id3, name, converter)
           if copied:
               modified = True
       if modified:
           self.log.debug("saving modified ID3 tag")
           id3.save(filename)

       self.log.debug("done processing file")
       self.log.filename = None

def main(prog_name, options, args):
   logger = Logger(options.log_level, prog_name)
   ape2id3 = Ape2Id3(logger, force=options.force)
   for filename in args:
       ape2id3.copy_replaygain_tags(filename)

if __name__ == "__main__":
   parser = OptionParser(version="0.1", usage="%prog [OPTION]... FILE...",
                         description="Copy APEv2 ReplayGain tags on "
                                     "FILE(s) to ID3v2.")
   parser.add_option("-q", "--quiet", dest="log_level",
                     action="store_const", const=0, default=1,
                     help="do not output error messages")
   parser.add_option("-v", "--verbose", dest="log_level",
                     action="store_const", const=3,
                     help="output warnings and informational messages")
   parser.add_option("-d", "--debug", dest="log_level",
                     action="store_const", const=4,
                     help="output debug messages")
   parser.add_option("-f", "--force", dest="force",
                     action="store_true", default=False,
                     help="force overwriting of existing ID3v2 "
                          "ReplayGain tags")
   prog_name = parser.get_prog_name()
   options, args = parser.parse_args()

   if len(args) < 1:
       parser.error("no files specified")

   try:
       main(prog_name, options, args)
   except KeyboardInterrupt:
       pass

# vim: set expandtab shiftwidth=4 softtabstop=4 textwidth=79:

It’s pretty simple. Script A just calls Script B recursively on every directory found inside the designated directory. Script B finds mp3 files, and first tags them with APE replay gain tags with mp3gain. Then, it calls the APE to ID3 conversion script above to add equivalent ID3 tags into those same mp3s. I’ve modified Script B so that it first deletes any APE replay gain tags already present in the mp3 file before doing the replay gain calculations — but this is optional. I also added a bunch of ANSI color escape codes to Script A and B so that they look prettier. These three scripts work beautifully well together with mp3 files inside directories. However, the directories MUST be album directories, as all mp3 files found in a directory are treated as having come from the same album for album replay gain tags (track replay gain tags are always independent on a per-file basis).

I should probably rewrite Script A and B in Python to make it easier to maintain — but everything is pretty simple as it is. If you have 1 big folder full of mp3’s from different artists/albums, then you could change the

mp3gain "$1"/*.mp3

in Script B into something like

for file in $mp3files
do
    mp3gain "$file"
done

This way, mp3gain is called separately for each mp3 file (instead of being called once for all mp3 files in the folder). Now you don’t have to worry about the mp3’s in that folder being treated as having come from 1 album for those album gain tags. To top things off, you should edit your shell’s config (e.g., .zshrc), and alias Script A to something easy, like rgmp3, so that you can just do

rgmp3 [directory]

to get this whole thing to work. Now run this command on your master mp3 folder, take a nap, and come back. All your mp3’s will now have both APE and ID3 replay gain tags!

I hope people find this useful. I’ve googled and googled countless times about replay gain in the past, and until I discovered the excellent link mentioned above a couple days ago, I could never really get replay gain tags working for my mp3’s.

Advertisements

How to quickly make a (sorted) playlist for mplayer

See my update on July 10, 2010, below (making use of the neat “-iregex” option)!

Mplayer uses a simple kind of playlist: a text file with the path and name of each file to be played. The path to the new file is relative to the location of the playlist file itself. So, you can do:

find -maxdepth 1 -type f -name \*.\* | sort > playlist

This will find all files in the current directory that have a “.” character in it, sort them, and put their paths into playlist (a text file). We have to use the “-type f” argument because we want files, not directories. If you only want flac files, then you can use \*.flac instead. If you have files named “FLAC” as well as “flac”, you can use the “-iname” option to match case-insensitively. If you want to search deeper down directory levels, just increase the maxdepth value, or leave out the maxdepth parameter altogether to search recursively.

But let’s say your music directory is a bit messy with different filetypes. You can find multiple filetypes by just appending “-o -[i]name expression“, like this:

find -maxdepth 1 -type f -iname \*.flac -o -iname \*.ogg -o -iname \*.wav | sort > playlist

This will find all flac, ogg, and wav files case-insensitively in the current directory. You might even want to make a shell alias for this command (in your ~/.zshrc, ~/.bashrc, etc.), so that you don’t forget:

alias fa='find -maxdepth 1 -type f -iname \*.mp3 -o -iname \*.flac -o -iname \*.ogg -o -iname \*.wav -o -iname \*.aac | sort > playlist'

Now you can type fa (find audio) to generate a playlist for (most, if not all) audio files in the current directory. The only risk here is that playlist already exists, but that likelihood is small enough to ignore. This is because the “>” operator replaces whatever content that was inside the “playlist” file with the output of the previous command. If you just wanted to append the results to an existing playlist file, you could just use “>>” instead of “>”.

To play the playlist, just do:

mplayer -playlist playlist

…where “playlist” is the file with all the songs in it. Be sure to include the full path to the playlist if you are currently not inside the same directory. To make the entire playlist loop forever, type:

mplayer -playlist playlist -loop 0

and it will loop the entire playlist forever. Unfortunately, putting the “loop=0” info in my ~/.mplayer/config file makes mplayer read that paramter first, and thus, only repeat the first file in my playlist forever. There seems to be no workaroud to this, except manually appending the loop paramter after the playlist parameter, as shown above.

More info here.

UPDATE November 12, 2009: I discovered a simple hack around the above mentioned problem about trying to loop the entire playlist. The solution is to remove the “loop=0” line from your ~/.mplayer/config, and instead make use of shell aliases. I use zsh, and this is what I have in my ~/.zshrc (the second alias is the key):

alias m='mplayer -loop 0'
alias mp='mplayer -loop 0 -playlist'

Now, to play any single file infinitely, simply use “m [file]”. To infinitely loop an entire playlist, simply do “mp “. You can use the “>” and “<” hotkeys to move around the playlist.

UPDATE December 31, 2009: Silently updated the post to include the sort command. Without piping things to sort, the playlist would be somewhat random in order (find probably neglects sorting in exchange for faster results). Also added some more advanced techniques.

UPDATE July 17, 2010: Instead of typing a bunch of “-o -iname”  flags, a simpler way is to just use a combined regular expression:

find -maxdepth 1 -type f -iregex ".*\.\(aac\|flac\|mp3\|ogg\|wav\)$" | sort > playlist