5.12 Making a Fast Copy of an Object
Credit: Alex Martelli
5.12.1 Problem
You need to implement the special method __copy__ so your class can
cooperate with the copy.copy function. If the __init__ method of your
class is slow, you need to bypass it and get an empty object of the class.
5.12.2 Solution
Here's a solution that works for both new-style and
classic classes:
def empty_copy(obj):
    class Empty(obj.__class__):
        def __init__(self): pass
    newcopy = Empty()
    newcopy.__class__ = obj.__class__
    return newcopy
Your classes can use this function to implement __copy__ as follows:
class YourClass:
    def __init__(self):
        print("assume there's a lot of work here")
    def __copy__(self):
        newcopy = empty_copy(self)
        print("now copy some relevant subset of self's attributes to newcopy")
        return newcopy
Here's a usage example:
if __name__ == '__main__':
    import copy
    y = YourClass()    # This, of course, does run __init__
    print(y)
    z = copy.copy(y)   # ...but this doesn't
    print(z)
5.12.3 Discussion
Python doesn't implicitly copy your objects when you
assign them. This is a great thing, because it gives fast, flexible,
and uniform semantics. When you need a copy, you explicitly ask for
it, ideally with the
copy.copy
function, which knows how to copy built-in types, has reasonable
defaults for your own objects, and lets you customize the copying
process by defining a special method __copy__ in
your own classes. If you want instances of a class to be noncopyable,
you can define __copy__ and raise a
TypeError there. In most cases, you can let
copy.copy's default mechanism
work, and you get free clonability for most of your classes. This is
quite a bit nicer than languages that force you to implement a
specific clone method for every class whose
instances you want to be clonable.
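As a minimal sketch of the two behaviors just described (the class names Point and Unique are hypothetical, chosen only for illustration): copy.copy clones an ordinary instance with no help from the class, while a __copy__ that raises TypeError makes a class noncopyable.

```python
import copy

class Point(object):
    # Hypothetical class with no __copy__: copy.copy's default applies.
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Unique(object):
    # Hypothetical class that refuses to be copied.
    def __copy__(self):
        raise TypeError("Unique instances cannot be copied")

p = Point(1, 2)
q = copy.copy(p)    # a distinct object carrying the same attributes

try:
    copy.copy(Unique())
    refused = False
except TypeError:
    refused = True  # copy.copy let the TypeError propagate
```

The TypeError raised inside __copy__ reaches the caller of copy.copy unchanged, so the message you choose there is what users of the class will see.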
__copy__ often needs to start with an empty instance of the class in
question (that is, of self's class), bypassing __init__ when that is a
costly operation. The simplest way to do this is to use Python's ability
to change an instance's class on the fly: create a new object in a local
empty class, then set its __class__ attribute, as the recipe's code shows.
Note that inheriting class Empty from obj.__class__ is redundant (but
quite innocuous) for old Python versions (up to Python 2.1), but in
Python 2.2 it becomes necessary to make the empty_copy function
compatible with all kinds of objects of classic or new-style classes
(including built-in and extension types). Once you choose to inherit from
obj's class, you must override __init__ in class Empty, or else the whole
purpose of the recipe is lost.
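One way to check that the bypass really happens is to count __init__ calls. The sketch below repeats empty_copy from the Solution so that it is self-contained. The class Costly and its init_calls counter are hypothetical, introduced only for this check.

```python
def empty_copy(obj):
    # Same function as in the Solution above.
    class Empty(obj.__class__):
        def __init__(self):
            pass
    newcopy = Empty()
    newcopy.__class__ = obj.__class__
    return newcopy

class Costly(object):
    # Hypothetical class that counts how often its __init__ runs.
    init_calls = 0
    def __init__(self, size):
        Costly.init_calls += 1
        self.data = list(range(size))

original = Costly(1000)        # runs Costly.__init__ once
blank = empty_copy(original)   # runs only Empty.__init__, which does nothing
```

After this runs, Costly.init_calls is still 1, blank.__class__ is Costly, and blank has no data attribute yet. Filling it in is the job of __copy__.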
Once you have an empty object of the required class, you typically
need to copy a subset of self's attributes. If you need all of the
attributes, you're better off not defining __copy__ explicitly, since
copying all instance attributes is copy.copy's default, unless, of
course, you need to do a little bit more than copying instance
attributes. If you do need to copy all of self's attributes into
newcopy, here are two techniques:
newcopy.__dict__.update(self.__dict__)
newcopy.__dict__ = self.__dict__.copy()
An instance of a new-style class doesn't necessarily
keep all of its state in __dict__, so you may
need to do some class-specific state copying.
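As an illustration of state that lives outside __dict__, here is a sketch in which a base class stores one attribute in a slot. The classes Tagged and Record are hypothetical. empty_copy is repeated from the Solution so that the block is self-contained.

```python
import copy

def empty_copy(obj):
    # Same function as in the Solution above.
    class Empty(obj.__class__):
        def __init__(self):
            pass
    newcopy = Empty()
    newcopy.__class__ = obj.__class__
    return newcopy

class Tagged(object):
    # Hypothetical base class: 'tag' is stored in a slot, not in __dict__.
    __slots__ = ('tag',)

class Record(Tagged):
    # No __slots__ here, so instances also get a __dict__.
    def __init__(self, tag, payload):
        self.tag = tag            # goes into the inherited slot
        self.payload = payload    # goes into __dict__
    def __copy__(self):
        newcopy = empty_copy(self)
        newcopy.__dict__.update(self.__dict__)   # copies payload only
        newcopy.tag = self.tag                   # slot state, copied by hand
        return newcopy

r = Record('a', [1, 2])
c = copy.copy(r)
```

Here r.__dict__ holds only payload, so the __dict__.update call alone would leave c.tag unset. The explicit assignment is the class-specific part.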
Alternatives based on the standard module called new
can't be made transparent across classic and
new-style classes in Python 2.2 (at least, I've been
unable to do this). Besides, the new module is
often thought of as dangerous black magic (a view that rather
exaggerates its dangers). Anyway, this recipe lets you avoid using the
new module for this specific purpose.
Note that so far we have been talking about shallow
copies, which is what you want most of the time. With a shallow copy,
your object is copied, but objects it refers to (attributes or items)
are not, so the new copied object and the original object refer to
the same attribute or item objects. A deep copy is a heavyweight operation,
potentially duplicating a large graph of objects that refer to each
other. You get a deep copy by calling
copy.deepcopy on an object. If you need to
customize how instances of your class are deep-copied, you can define
the special method __deepcopy__ and follow its
somewhat complicated memoization protocol. The technique shown in
this recipe (getting empty copies of objects by bypassing their
__init__ methods) can sometimes still come
in handy, but there is a lot of other work you need to do.
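The following sketch contrasts the two kinds of copy and shows one plausible way to combine empty_copy with __deepcopy__. The Node class is hypothetical, and empty_copy is repeated from the Solution. The memo argument is the dictionary that copy.deepcopy passes along to map id(original) to the copy already made; registering the new object there before recursing is what keeps cycles and shared references intact.

```python
import copy

def empty_copy(obj):
    # Same function as in the Solution above.
    class Empty(obj.__class__):
        def __init__(self):
            pass
    newcopy = Empty()
    newcopy.__class__ = obj.__class__
    return newcopy

class Node(object):
    # Hypothetical class used only to illustrate the protocol.
    def __init__(self, name, children):
        self.name = name
        self.children = children
    def __deepcopy__(self, memo):
        newcopy = empty_copy(self)
        # Register the copy before recursing, so that references back
        # to self resolve to newcopy instead of recursing forever.
        memo[id(self)] = newcopy
        newcopy.name = self.name
        newcopy.children = copy.deepcopy(self.children, memo)
        return newcopy

leaf = Node('leaf', [])
root = Node('root', [leaf, leaf])   # leaf is shared twice
leaf.children.append(root)          # and refers back to root: a cycle

shallow = copy.copy(root)       # new Node, same children list
deep = copy.deepcopy(root)      # new Node, new list, new leaf
```

The shallow copy shares its children list with root, while the deep copy has its own list whose two entries are the same new leaf, and that leaf points back at deep rather than at root.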
5.12.4 See Also
The Library Reference section on the
copy module.