
Before you jump into your code, grep it, and change every instance of [:] into list or copy, know that it isn't that easy. Most Python projects will use all four common variations of copying a list or object. Here they are with benchmark times[1]:

    b = a[:]           0.039ms
    b = list(a)        0.085ms
    b = copy(a)        0.187ms
    b = deepcopy(a)   10.592ms

Use the first method for short lists, e.g. function or system args, where you know you have a list. The Python manual suggests this method as the fastest/best way to copy a sequence[2].
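A rough way to reproduce numbers like the ones above on your own machine is the standard timeit module. This is only a sketch: the 1000-element list and the helper name time_copies are my assumptions, not the setup used for the benchmark in [1], so expect different absolute times.

```python
import timeit
from copy import copy, deepcopy

# Hypothetical test data; the list size used in [1] is not given here.
a = list(range(1000))


def time_copies(data, number=200):
    """Return the average seconds per call for each copying method."""
    methods = {
        "a[:]": lambda: data[:],
        "list(a)": lambda: list(data),
        "copy(a)": lambda: copy(data),
        "deepcopy(a)": lambda: deepcopy(data),
    }
    return {
        name: timeit.timeit(fn, number=number) / number
        for name, fn in methods.items()
    }


timings = time_copies(a)
for name, seconds in timings.items():
    # Report in milliseconds, like the table above.
    print("%-12s %.3f ms" % (name, seconds * 1000))
```

The lambdas add a small, equal call overhead to every method, so read the output as a ranking rather than as exact costs.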

The type constructor list will convert any sequence into a list and will preserve order. If you pass it a list, it just makes a copy, the same as the slice operator would[3]. It is slower because of the type checking, but it is implemented in C. So you can think of list() as just [:] with a type cast - no need to call it again if you know you have a list.
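To make the "[:] with a type cast" idea concrete, here is a small sketch. The generator numbers() is a made-up example; the point is that list() accepts things that have no slice operator, and that for a real list both spellings give a new, equal list.

```python
def numbers():
    # A generator has no slice operator, so numbers()[:] would raise TypeError.
    yield 1
    yield 2
    yield 3


from_generator = list(numbers())   # materialize an iterable, order preserved
from_tuple = list((1, 2, 3))       # convert another sequence type

a = [1, 2, 3]
b = list(a)   # new list object with the same items
c = a[:]      # same result when you already know a is a list

b.append(4)   # changing the copy leaves the original alone
```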

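copy and deepcopy are implemented in Python and are generic functions that attempt to sniff the type of the object to be copied. copy will use the __copy__ magic[4] of the object if it exists (deepcopy looks for __deepcopy__), so you can override it in your objects with return self[:]. You need these when you have a custom object or a list of non-basic types (such as lists of lists, or lists of objects); note that copy duplicates only the outer container, so reach for deepcopy when the nested items must be independent too. Neither will copy a generator; pass it to list() instead. Both functions look types up in a module-level dispatch table, and deepcopy recurses into the contents and copies each item.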
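A minimal sketch of both points: overriding __copy__ on your own type, and the difference between copy and deepcopy on a list of lists. Playlist is a hypothetical class invented for this example.

```python
import copy


class Playlist(object):
    """Hypothetical container type used only for illustration."""

    def __init__(self, tracks):
        self.tracks = tracks

    def __copy__(self):
        # copy.copy() calls this; give the new object its own list.
        return Playlist(self.tracks[:])


p = Playlist(["intro", "outro"])
q = copy.copy(p)
q.tracks.append("bonus")   # does not touch p.tracks

nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)      # new outer list, same inner lists
deep = copy.deepcopy(nested)     # new outer list and new inner lists
nested[0].append(99)             # visible through shallow, not through deep
```

Without the __copy__ override, copy.copy(p) would create a new Playlist whose tracks attribute is the same list object as p.tracks.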

There is very little performance degradation from aliasing copy to deepcopy and using it everywhere, and it could save you time by catching bugs. (Edit: scratch that, I got my benchmark wrong - deepcopy will still be slow even if you pass it a shallow list, see comment below, thanks tedunangst)

Read the source of copy and deepcopy so you can understand them and can implement your own custom version for more advanced types. Find the file:

    >>> import copy
    >>> copy.__file__
    '/usr/local/python/2.6/lib/copy.pyc'

Each of these methods has its own use case. If you grep through a well implemented project such as Werkzeug[5] you can find how each is used efficiently. For example, [:] is used when you know you have a list, such as template variables. list() is used to force something into a list, e.g. before these vars get to other objects. copy() is used on custom data types: making a copy of environ (which can contain almost anything) and copying the routing table (which cannot be trusted to be a list).

[1] Benchmark times taken from: http://stackoverflow.com/questions/2612802/how-to-clone-a-li... which I had bookmarked as a reference

[2] http://docs.python.org/faq/programming.html#how-do-i-copy-an...

[3] http://docs.python.org/faq/programming.html#how-do-i-convert...

[4] http://www.brpreiss.com/books/opus7/html/page85.html

[5] https://github.com/mitsuhiko/werkzeug



I'm puzzled by your comment that there's very little degradation using deepcopy everywhere. Your numbers demonstrate quite the opposite.


Thanks for noticing - I got my numbers completely off. When I ran the benchmark on my machine it turns out it was still using the original copy.

One way to catch deepcopy bugs might be to create an autocopy function which can detect if it is a 'shallow' object and use copy, or if not use deepcopy.

I am going to try and write an implementation that doesn't slow it down too much. It might be worthwhile since copy bugs are so common in Python projects.
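A first rough sketch of what such an autocopy could look like. Everything here is an assumption on my part: the name autocopy, the set of types treated as "basic", and the choice to only fast-path the exact built-in containers. Scanning the items is O(n), so this trades one cheap pass for avoiding deepcopy on flat data.

```python
import copy

# Types treated as "basic" (immutable, safe to share). This list is an assumption.
_ATOMIC = (int, float, complex, bool, str, bytes, type(None))
_FLAT_CONTAINERS = (list, tuple, set, frozenset)


def _all_atomic(items):
    return all(isinstance(x, _ATOMIC) for x in items)


def autocopy(obj):
    """Use copy() for 'shallow' built-in containers, deepcopy() otherwise."""
    # Exact type checks: subclasses may carry extra state, so they go to deepcopy.
    if type(obj) in _FLAT_CONTAINERS and _all_atomic(obj):
        return copy.copy(obj)
    if type(obj) is dict and _all_atomic(obj.keys()) and _all_atomic(obj.values()):
        return copy.copy(obj)
    return copy.deepcopy(obj)


flat = [1, 2, 3]
nested = [[1], [2]]
flat_copy = autocopy(flat)       # takes the cheap copy() path
nested_copy = autocopy(nested)   # falls back to deepcopy()
```

Whether the scan pays for itself depends on how often the data really is flat, so it needs benchmarking before anyone relies on it.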


I wonder whether it would be possible to optimise the Python interpreter to make deep copies copy-on-write. I suppose that would involve a lot of work for relatively little gain.


I remember that being mentioned in a PEP somewhere but it never got implemented. It might be worth implementing copy in C with copy-on-write to bring some of those benchmark numbers down.


Can you explain why `[:]` is faster for a small list (10 elements) but `list()` is faster for a larger list (100000 elements)?

    ~$ python -S -mtimeit -s "a = list(range(10))" "a[:]"
    1000000 loops, best of 3: 0.198 usec per loop
    ~$ python -S -mtimeit -s "a = list(range(10))" "list(a)"
    1000000 loops, best of 3: 0.453 usec per loop
    ~$ python -S -mtimeit -s "a = list(range(100000))" "a[:]"
    1000 loops, best of 3: 675 usec per loop
    ~$ python -S -mtimeit -s "a = list(range(100000))" "list(a)"
    1000 loops, best of 3: 664 usec per loop
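The same comparison as a script, in case anyone wants to sweep more sizes. The helper name compare and the chosen sizes are mine. One plausible reading of the numbers is that list(a) pays a fixed cost for the name lookup and call, which matters at 10 elements and disappears into the copying cost at 100000, leaving the large-list difference within noise; the script only measures and does not prove that.

```python
import timeit


def compare(size, number=50, repeat=3):
    """Return (slice_seconds, list_seconds) per call for a list of this size."""
    setup = "a = list(range(%d))" % size
    # Take the best of several runs, as python -mtimeit does.
    slice_t = min(timeit.repeat("a[:]", setup=setup, repeat=repeat, number=number))
    list_t = min(timeit.repeat("list(a)", setup=setup, repeat=repeat, number=number))
    return slice_t / number, list_t / number


results = {size: compare(size) for size in (10, 1000, 100000)}
for size in sorted(results):
    slice_t, list_t = results[size]
    print("%7d elements: a[:] %.3f usec, list(a) %.3f usec"
          % (size, slice_t * 1e6, list_t * 1e6))
```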



