Sorting in MySQL - Ignoring the ‘The’

At dagensskiva.com we don’t necessarily want to list artists by the first letter of their name, e.g. we want The White Stripes to be listed under W, not T. The following MySQL ORDER clause will perform such as sort.

SELECT artist FROM bands ORDER BY IF(LEFT(LOWER(artist), 4) = 'the ', LOWER(RIGHT(artist, LENGTH(artist)-4)), LOWER(artist)) ASC

Which will also ignore the case of the first letter after the The when sorting by transposing down to lowercase. I’m forcing to a hex conversion to deal with åäö, not sure how that will work in character sets other than UTF8.

My feeling is that there is a cleaner way to do this - do let me know if there is a prettier way of doing this.

Edit (01/01/2008): As Johan Dahlin pointed out, locale based sorting is not the work of the developer, but the database. Thanks for the help!

Tags: , ,

One Comment to “Sorting in MySQL - Ignoring the ‘The’”

  1. Johan Dahlin Says:

    Relying on the hexadecimal value to do locale specific sorting is actually wrong. In the case of swedish, which is the most (only?) relevant language on dagensskiva.com you would like to sort a..zåäö.
    The hexadecimal value for å in utf-8 is 229, but ä is 228. Add to that that upper case letter have lower hex values than lower case letter.

    The proper way to solve this is to specify a character set, it appears to be possible to do this on a global, database and table basis:
    http://mirrors-r-us.net/doc/refman/5.0/en/character-sets.html

    Hope this helps!

Leave a Reply