
5.12 Making a Fast Copy of an Object

Credit: Alex Martelli

5.12.1 Problem

You need to implement the special method __copy__ so your class can cooperate with the copy.copy function. If the __init__ method of your class is slow, you need to bypass it and get an empty object of the class.

5.12.2 Solution

Here's a solution that works for both new-style and classic classes:

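def empty_copy(obj):
    # A throwaway subclass of obj's class whose __init__ does nothing
    class Empty(obj.__class__):
        def __init__(self): pass
    newcopy = Empty()
    # Switch the new instance back to obj's real class
    newcopy.__class__ = obj.__class__
    return newcopy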

Your classes can use this function to implement _ _copy_ _ as follows:

class YourClass:
    def __init__(self):
        print "assume there's a lot of work here"
    def __copy__(self):
        newcopy = empty_copy(self)
        print "now copy some relevant subset of self's attributes to newcopy"
        return newcopy

Here's a usage example:

if __name__ == '__main__':
    import copy
    y = YourClass()      # This, of course, does run __init__
    print y
    z = copy.copy(y)     # ...but this doesn't
    print z

5.12.3 Discussion

Python doesn't implicitly copy your objects when you assign them. This is a great thing, because it gives fast, flexible, and uniform semantics. When you need a copy, you explicitly ask for it, ideally with the copy.copy function, which knows how to copy built-in types, has reasonable defaults for your own objects, and lets you customize the copying process by defining a special method __copy__ in your own classes. If you want instances of a class to be noncopyable, you can define __copy__ and raise a TypeError there. In most cases, you can let copy.copy's default mechanism work, and you get free clonability for most of your classes. This is quite a bit nicer than languages that force you to implement a specific clone method for every class whose instances you want to be clonable.
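As a minimal sketch of the two behaviors just described, the following contrasts copy.copy's default handling with a class that refuses to be copied. The class names Point and Unique are hypothetical, and the code is written to run under Python 3 (an assumption; the recipe itself targets Python 2.2).

```python
import copy

class Point(object):
    """Hypothetical class that relies on copy.copy's default behavior."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Unique(object):
    """Hypothetical class whose instances refuse to be copied."""
    def __copy__(self):
        raise TypeError("Unique instances cannot be copied")

p = Point(1, 2)
q = copy.copy(p)           # default mechanism: a new Point with the same attributes
assert q is not p and (q.x, q.y) == (1, 2)

try:
    copy.copy(Unique())
except TypeError as err:
    refused = str(err)     # the TypeError raised in __copy__ reaches the caller
```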

__copy__ often needs to start with an empty instance of the class in question (i.e., the class of self), bypassing __init__ when that is a costly operation. The simplest way to do this is to use Python's ability to change an instance's class on the fly: create a new object of a local empty class, then set its __class__ attribute, as the recipe's code shows. Note that inheriting class Empty from obj.__class__ is redundant (but quite innocuous) for old Python versions (up to Python 2.1), but in Python 2.2 it becomes necessary to make the empty_copy function compatible with all kinds of objects of classic or new-style classes (including built-in and extension types). Once you choose to inherit from obj's class, you must override __init__ in class Empty, or else the whole purpose of the recipe is lost.
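One way to check that the technique really skips __init__ is to count how often it runs. The sketch below repeats the recipe's empty_copy function so that it is self-contained. The Costly class and its init_calls counter are hypothetical, and Python 3 is assumed.

```python
def empty_copy(obj):
    # Same function as in the Solution, repeated to keep this sketch self-contained
    class Empty(obj.__class__):
        def __init__(self):
            pass
    newcopy = Empty()
    newcopy.__class__ = obj.__class__
    return newcopy

class Costly(object):
    """Hypothetical class with an expensive __init__."""
    init_calls = 0                    # class-level counter of __init__ runs

    def __init__(self, size):
        Costly.init_calls += 1
        self.data = list(range(size))

original = Costly(1000)
blank = empty_copy(original)

assert Costly.init_calls == 1         # __init__ ran only for `original`
assert type(blank) is Costly          # the temporary Empty class is no longer visible
assert not hasattr(blank, "data")     # no state was copied; that is left to __copy__
```

The last assertion is the point to remember: empty_copy hands back a blank instance, so whatever state the copy needs must still be filled in by the caller.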

Once you have an empty object of the required class, you typically need to copy a subset of self's attributes. If you need all of the attributes, you're better off not defining __copy__ explicitly, since copying all instance attributes is copy.copy's default, unless, of course, you need to do a little more than copy instance attributes. If you do need to copy all of self's attributes into newcopy, here are two techniques:

newcopy.__dict__.update(self.__dict__)
newcopy.__dict__ = self.__dict__.copy()

An instance of a new-style class doesn't necessarily keep all of its state in __dict__ (for example, when the class defines __slots__), so you may need to do some class-specific state copying.
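A class that defines __slots__ is one case where state lives outside __dict__. The following sketch shows one possible way to copy such state attribute by attribute. The copy_state helper and the Pixel class are hypothetical, Python 3 is assumed, and slot names that start with two underscores (which are name-mangled) are not handled. How the empty target object is obtained is outside the scope of this sketch, so a normally constructed instance stands in for it.

```python
def copy_state(source, target):
    """Copy instance state from source to target (hypothetical helper)."""
    # State kept in the per-instance dictionary, if the instance has one
    if hasattr(source, "__dict__"):
        target.__dict__.update(source.__dict__)
    # State kept in slots declared by any class in the inheritance chain
    for cls in type(source).__mro__:
        slots = cls.__dict__.get("__slots__", ())
        if isinstance(slots, str):
            slots = (slots,)          # a single slot may be given as a bare string
        for name in slots:
            if name in ("__dict__", "__weakref__"):
                continue              # bookkeeping slots, not user state
            if hasattr(source, name):  # a declared slot may still be unset
                setattr(target, name, getattr(source, name))

class Pixel(object):
    """Hypothetical class that keeps its state in slots, not in __dict__."""
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

src = Pixel(3, 4)
dst = Pixel(0, 0)                     # stand-in for an empty target object
copy_state(src, dst)
assert (dst.x, dst.y) == (3, 4)
```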

Alternatives based on the standard module new can't be made transparent across classic and new-style classes in Python 2.2 (at least, I've been unable to do this). Besides, the new module is often thought of as dangerous black magic (a view that rather exaggerates its dangers). Anyway, this recipe lets you avoid using the new module for this specific purpose.

Note that so far we have been talking about shallow copies, which is what you want most of the time. With a shallow copy, your object is copied, but the objects it refers to (attributes or items) are not, so the copy and the original refer to the same attribute or item objects. A deep copy is a heavyweight operation, potentially duplicating a large graph of objects that refer to each other. You get a deep copy by calling copy.deepcopy on an object. If you need to customize how instances of your class are deep-copied, you can define the special method __deepcopy__ and follow its somewhat complicated memoization protocol. The technique shown in this recipe (getting empty copies of objects by bypassing their __init__ methods) can sometimes still come in handy, but there is a lot of other work you need to do.
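To make the shallow/deep distinction and the memoization protocol more concrete, here is a minimal sketch, assuming Python 3 and a hypothetical Node class. It repeats the recipe's empty_copy function so that it is self-contained. Storing the clone under memo[id(self)] before recursing is a common idiom rather than a documented guarantee: it relies on the standard copy module keying its memo dictionary by object id, so treat that line as an assumption.

```python
import copy

def empty_copy(obj):
    # Repeated from the Solution so that this sketch is self-contained
    class Empty(obj.__class__):
        def __init__(self):
            pass
    newcopy = Empty()
    newcopy.__class__ = obj.__class__
    return newcopy

class Node(object):
    """Hypothetical graph node with a custom deep copy."""
    def __init__(self, name):
        self.name = name
        self.neighbors = []

    def __deepcopy__(self, memo):
        clone = empty_copy(self)
        # Register the clone before recursing, so that cycles leading back
        # to self reuse it instead of recursing forever (assumed idiom)
        memo[id(self)] = clone
        clone.name = self.name
        # Pass the same memo along so shared objects are copied only once
        clone.neighbors = copy.deepcopy(self.neighbors, memo)
        return clone

a, b = Node("a"), Node("b")
a.neighbors.append(b)
b.neighbors.append(a)                 # a and b refer to each other

shallow = copy.copy(a)                # new Node, but the same neighbors list
deep = copy.deepcopy(a)               # the whole two-node graph is duplicated

assert shallow.neighbors is a.neighbors
assert deep.neighbors[0] is not b
assert deep.neighbors[0].neighbors[0] is deep   # the cycle is preserved
```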

5.12.4 See Also

The Library Reference section on the copy module.
