
2.5 Sorting by One Field, Then by Another

Credit: José Sebrosa

2.5.1 Problem

You need to sort a list by more than one field of each item.

2.5.2 Solution

Passing a comparison function to a list's sort method is slow for lists of substantial size, but it can still be quite handy when you need to sort lists that are reasonably small. In particular, it offers a rather natural idiom to sort by more than one field:

import string

star_list = ['Elizabeth Taylor', 'Bette Davis', 'Hugh Grant', 'C. Grant']

star_list.sort(lambda x,y: (
   cmp(string.split(x)[-1], string.split(y)[-1]) or  # Sort by last name...
   cmp(x, y)))                                       # ...then by full name (first name breaks ties)

print "Sorted list of stars:"
for name in star_list:
    print name
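For readers on Python 3, where the built-in cmp is gone and sort no longer accepts a comparison function, the same idiom can be sketched roughly as follows. This is only an illustration under stated assumptions: the local cmp helper is a stand-in we define ourselves, by_last_then_full is a hypothetical name, and functools.cmp_to_key adapts the comparison function into a sort key.

```python
import functools

def cmp(a, b):
    # Local stand-in for the old built-in: negative, zero, or positive.
    return (a > b) - (a < b)

star_list = ['Elizabeth Taylor', 'Bette Davis', 'Hugh Grant', 'C. Grant']

def by_last_then_full(x, y):
    # Hypothetical helper name; same logic as the lambda in the recipe.
    return (cmp(x.split()[-1], y.split()[-1])   # Sort by last name...
            or cmp(x, y))                       # ...then by the full name

# cmp_to_key wraps the two-argument comparison so sort can use it as a key.
star_list.sort(key=functools.cmp_to_key(by_last_then_full))

print("Sorted list of stars:")
for name in star_list:
    print(name)
```

With this data, the two Grants end up next to each other, with 'C. Grant' ahead of 'Hugh Grant'.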

2.5.3 Discussion

This recipe uses the properties of the cmp built-in function and the or operator to produce a compact idiom for sorting a list over more than one field of each item.

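cmp(X, Y) returns 0, which counts as false, when X and Y compare equal, and a nonzero (true) value otherwise. The or operator therefore evaluates the next call to cmp only when every earlier comparison was a tie. To reverse the sorting order on a field, simply swap X and Y as arguments to that field's cmp call.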
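The way or chains the comparisons can be checked in isolation. The following is a minimal sketch, not part of the recipe itself; the cmp function here is a local stand-in (an assumption, so that the snippet also runs where the built-in no longer exists), and the variable names are purely illustrative.

```python
def cmp(a, b):
    # Local stand-in for the built-in cmp: -1, 0, or 1.
    return (a > b) - (a < b)

# Equal last names: the first cmp gives 0 (false), so `or` moves on
# and the result comes from the second comparison.
tie_broken = cmp('Grant', 'Grant') or cmp('C. Grant', 'Hugh Grant')

# Different last names: the first cmp is nonzero (true), so `or` returns
# it at once and the second comparison is never evaluated.
decided_early = cmp('Davis', 'Grant') or cmp('Zed Davis', 'Al Grant')

# Swapping the arguments flips the sign, which reverses the order.
swapped = cmp('Grant', 'Davis')

print("%d %d %d" % (tie_broken, decided_early, swapped))   # prints: -1 -1 1
```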

The fundamental idea of this recipe can also be used if other sorting criteria are associated with the elements of the list. We simply build an auxiliary list of tuples to pack the sorting criteria together with the main elements, then sort and unpack the result. This is more akin to the decorate-sort-undecorate (DSU) idiom:

def sorting_criterion_1(data):
    return string.split(data)[-1]   # This is again the last name

def sorting_criterion_2(data):
    return len(data)                # This is some fancy sorting criterion

# Pack an auxiliary list:
aux_list = map(lambda x: (x,
                          sorting_criterion_1(x),
                          sorting_criterion_2(x)),
               star_list)

# Sort:
aux_list.sort(lambda x,y: (
   cmp(x[1], y[1])  or       # Sort by criterion 1 (last name)...
   cmp(y[2], x[2])  or       # ...then by criterion 2 (in reverse order)...
   cmp(x, y)))               # ...then by the value in the main list

# Unpack the resulting list:
star_list = map(lambda x: x[0], aux_list)

print "Another sorted list of stars:"
for name in star_list:
    print name

Of course, once we are decorating, sorting, and undecorating anyway, it may be worth a little extra trouble to arrange things so that the sort step needs no comparison function at all (the true DSU idiom), which speeds up the whole thing quite a bit for lists of substantial size. After all, packing the fields to be compared in the right order in each decorated tuple, and plucking out the right field again in the undecorate step, is pretty simple:

# Pack a better-ordered auxiliary list:
aux_list = map(lambda x: (sorting_criterion_1(x),
                          sorting_criterion_2(x),
                          x),
               star_list)

# Sort in a much simpler and faster way:
aux_list.sort()

# Unpack the resulting list:
star_list = map(lambda x: x[-1], aux_list)
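To get a feel for the speed difference, one could time both approaches on a larger list. The sketch below is only an illustration: the random test names, the local cmp stand-in, and the function names sort_with_comparison and sort_with_dsu are all assumptions of ours, and the timings it prints will vary from machine to machine.

```python
import functools
import random
import timeit

def cmp(a, b):
    # Local stand-in for the built-in cmp.
    return (a > b) - (a < b)

def last_name(name):
    return name.split()[-1]

# Hypothetical test data: 2,000 random two-word "names".
random.seed(0)
def random_word():
    return ''.join(random.choice('abcdefghij')
                   for _ in range(random.randint(2, 6)))
names = ['%s %s' % (random_word(), random_word()) for _ in range(2000)]

def sort_with_comparison():
    # Comparison-function approach: last name, then length, then full name.
    def compare(x, y):
        return (cmp(last_name(x), last_name(y))
                or cmp(len(x), len(y))
                or cmp(x, y))
    return sorted(names, key=functools.cmp_to_key(compare))

def sort_with_dsu():
    # DSU approach: decorate, sort with no comparison function, undecorate.
    aux_list = [(last_name(x), len(x), x) for x in names]
    aux_list.sort()
    return [item[-1] for item in aux_list]

same_order = sort_with_comparison() == sort_with_dsu()
t_cmp = timeit.timeit(sort_with_comparison, number=3)
t_dsu = timeit.timeit(sort_with_dsu, number=3)
print("same order: %s" % same_order)
print("comparison function: %.3fs   DSU: %.3fs" % (t_cmp, t_dsu))
```

Both functions apply the same criteria in the same priority, so they should produce identical lists; only the time taken should differ.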

However, this doesn't deal with the reverse order, which you can easily obtain when passing a comparison function to sort by just switching arguments to cmp. To use DSU instead, you need to pack a suitably altered value of the criterion field. For a numeric field, changing the sign is fine. In this example, the sorting_criterion_2 that needs reverse sorting is indeed a number, so our task is easy:

# Pack a better-ordered auxiliary list yielding the desired order:
aux_list = map(lambda x: (sorting_criterion_1(x),
                          -sorting_criterion_2(x),
                          x),
               star_list)
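Put together with the sort and unpack steps, the sign-flipping trick might look like the sketch below. It uses list comprehensions rather than map and lambda, and the str.split method rather than the string module, purely as an illustrative choice; sorted_stars is a name we introduce for the result.

```python
star_list = ['Elizabeth Taylor', 'Bette Davis', 'Hugh Grant', 'C. Grant']

def sorting_criterion_1(data):
    return data.split()[-1]         # The last name

def sorting_criterion_2(data):
    return len(data)                # The length of the full name

# Decorate: negate the numeric criterion so that larger values sort first.
aux_list = [(sorting_criterion_1(x), -sorting_criterion_2(x), x)
            for x in star_list]

# Sort with no comparison function, then undecorate.
aux_list.sort()
sorted_stars = [item[-1] for item in aux_list]

for name in sorted_stars:
    print(name)
```

Among the two Grants, 'Hugh Grant' (10 characters) now comes before 'C. Grant' (8 characters), because -10 sorts ahead of -8.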

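For reverse sorting on a string field with DSU, you need a string-translation operation that maps each chr(x) into chr(255-x), or an even wider translation table for Unicode strings. It is a bit of a bother, but you only have to write it once. Be aware that the translation reverses the order only of strings that differ at some character position: when one string is a prefix of another (say, 'Grant' and 'Grants'), the shorter one still sorts first. For example, for plain old strings: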

import string
all_characters = string.maketrans('','')
all_characters_list = list(all_characters)
all_characters_list.reverse()
rev_characters = ''.join(all_characters_list)
rev_trans = string.maketrans(all_characters, rev_characters)

Now, if we want to reverse the first sorting criterion:

# Pack a better-ordered and corrected auxiliary list:
aux_list = map(lambda x: (string.translate(sorting_criterion_1(x), rev_trans),
                          sorting_criterion_2(x),
                          x),
               star_list)

# Sort in a much simpler and faster way AND get just the desired result:
aux_list.sort()

# Unpack the resulting list:
star_list = map(lambda x: x[-1], aux_list)
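The same chr(x) to chr(255-x) mapping can also be sketched without a translation table, which may be handy where string.maketrans is not available (as in current Python 3). The helper name reversed_text is hypothetical, and the sketch assumes every character has a code below 256. The last lines illustrate a limitation of the character-mapping trick: a string that is a prefix of another still sorts first.

```python
def reversed_text(s):
    # Hypothetical helper: map each character c to chr(255 - ord(c)).
    # Assumes all character codes are below 256 (for example, plain ASCII).
    return ''.join(chr(255 - ord(c)) for c in s)

star_list = ['Elizabeth Taylor', 'Bette Davis', 'Hugh Grant', 'C. Grant']

# Decorate with the reversed last name, then the (ascending) length.
aux_list = [(reversed_text(x.split()[-1]), len(x), x) for x in star_list]
aux_list.sort()
sorted_stars = [item[-1] for item in aux_list]

for name in sorted_stars:
    print(name)

# Limitation: a prefix keeps sorting first even after the mapping.
mapped_order = sorted(['Grant', 'Grants'], key=reversed_text)
true_descending = sorted(['Grant', 'Grants'], reverse=True)
print(mapped_order)       # ['Grant', 'Grants']
print(true_descending)    # ['Grants', 'Grant']
```

Last names now run from Taylor down to Davis, while the two Grants stay in ascending order of length.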

2.5.4 See Also

The Reference Manual section on sequences, and the subsection on mutable sequences (such as lists).
