
> I don't understand. Isn't this going to happen if you have multiple threads running even if the GIL is blocking them from running?

I don't think it would. If there is a GIL (Global Interpreter Lock), only one thread of the process can be scheduled to run at any time. As the poster (Sturla) says, Python threads are native OS threads, so they should be scheduled by the OS kernel (right?). A good scheduler would use affinity scheduling and place all threads of the Python program on the same processor/core every time, to benefit from cached data and code. I believe modern kernels (Linux, MacOS, Solaris, probably Windows as well) use this kind of affinity scheduling, so if we're lucky the Python process gets scheduled on the same processor every time and there will be no need for any cache synchronization.
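
For anyone who wants to poke at this, here is a minimal sketch of how to see which CPUs the kernel is allowed to place a Python process on. Note the assumptions: `os.sched_getaffinity` only exists on some Unix platforms (Linux, for example), the helper name `allowed_cpus` is made up for illustration, and this reports the hard affinity mask, not the scheduler's own preference for keeping a thread on a warm core, which is what I was describing above.

```python
import os

def allowed_cpus():
    """Return the set of CPU indices this process may run on, or None if unsupported."""
    # os.sched_getaffinity is not available on every platform, so guard the call.
    if hasattr(os, "sched_getaffinity"):
        return os.sched_getaffinity(0)  # 0 means "the calling process"
    return None

cpus = allowed_cpus()
print("allowed CPUs:", cpus, "out of", os.cpu_count())
```

If the mask covers every core, the kernel is free to move threads around, and whether it keeps them together is up to its scheduling heuristics.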

> I'm not a hardware expert, but I'm not sure how constant locking would prevent cache synchronization just because they weren't truly running in parallel.

I'm not sure if you misunderstood the mail. The constant locking would only be used if they were running in parallel.

Anyway, if you have a GIL you don't need the kind of locking described in the mail. You only need explicit locking on shared data structures when you read or update their contents. If you have reference counting, threads that run in parallel, and no GIL, you would have to lock even when you are just assigning a reference to such a data structure to a new variable. If you have a GIL, you are certain that only one thread at a time is updating the reference count. That is indeed what the GIL is: one coarse lock for all data (and the interpreter) instead of fine-grained locks for every data structure.
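
A rough sketch of that distinction in Python, assuming CPython with a GIL. The names (`shared`, `counter`, `counter_lock`, `add_many`) are invented for the example. Binding a new name to the shared list needs no lock of your own, because the interpreter protects the reference count update, while the read-modify-write on the counter still gets an explicit lock:

```python
import threading

shared = [1, 2, 3]               # a shared data structure
counter = 0                      # shared state that threads update
counter_lock = threading.Lock()  # explicit lock for the update

def add_many(n):
    global counter
    # Binding another name to the shared object only bumps its reference
    # count; under the GIL this needs no explicit lock from us.
    alias = shared
    for _ in range(n):
        # "counter += 1" is a read-modify-write on shared data, so it is
        # guarded explicitly, as described above.
        with counter_lock:
            counter += len(alias) - 2  # adds 1 per iteration

threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 4 threads * 10000 increments = 40000
```

Without a GIL, even the `alias = shared` line would need some form of synchronization around the reference count.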

(I don't know Python very well, I just answer from general knowledge of computer architecture and language implementation. But I've read about the Python GIL several times, since it's the most discussed GIL of any language.)



I don't think there's anything to prevent more than one thread from being scheduled at any time. They just block when trying to run concurrently because of the GIL.

> I'm not sure if you misunderstood the mail. The constant locking would only be used if they were running in parallel.

No, I'm saying that the GIL is constant locking. You still have two threads being run concurrently on (possibly) two separate cores, accessing the same cache lines. They just cannot actually run in parallel. I have no idea how the GIL time-slices between the two threads, so what I'm saying is completely possible.
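
On the time-slicing question, a hedged pointer rather than an answer: in CPython 3.2 and later (which may not be the version this discussion is about), the interpreter exposes a "switch interval", the time after which a running thread is asked to give up the GIL so another can take it. A minimal sketch for inspecting it:

```python
import sys

# Time in seconds after which the running thread is asked to release the GIL
# (CPython 3.2+; older interpreters used a bytecode-count check instead).
interval = sys.getswitchinterval()
print("GIL switch interval: %.4f s" % interval)
```

The value can be changed with `sys.setswitchinterval`, which would affect how often the lock, and the cache lines behind it, move between cores.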

However, below my original post, meastham correctly pointed out that the GIL does prevent cache thrashing, where updates to shared memory might go back and forth multiple times unnecessarily. So it's not as bad as I was imagining.


Yes, you are right that the OS can schedule two Python threads to run at the same time. It's just that one of them will only run a few instructions, none of them Python instructions, and then block. I hadn't really thought that through, thanks for pointing it out.

Ah, I see now what you meant by constant locking. I interpreted "constant" as happening all the time, as would be the case with fine-grained locks, rather than as one long-lasting global lock.



