It only loops if you modify the string in certain ways partway through the loop. Is that a significant problem? As soon as you stop your indefinite loop of race-condition writes, this loop is guaranteed to finish.
Well it's trading "bad code can populate the program's string intern table with invalid string objects" for "bad code can instantly deadlock the program", which is not much of an upgrade. And wouldn't you need to do this in every function that uses more than 2 or more related mutable objects 1 time each, or uses 1 mutable object more than 1 time? Do you know of any systems that work like this?
This is basically a very poor man's version of software transactional memory. Noticing this is one step on the road to realizing that shared memory concurrency needs cooperative synchronization (and locks are just one way to achieve that), and most important of all, you should strictly limit the number of functions that need to synchronize at all, by strictly limiting the number of shared data objects.
I think the article OP and many in the comments here have taken the wrong lessons from this. I think the real lessons are:
1. In a program containing data races, one cannot assume objects obey their stated invariants.
2. Therefore, security/correctness in a shared memory concurrent system cannot be achieved if there is untrusted/unverified code (i.e., code that may introduce data races).
3. Regardless, it may pay dividends to try harder to learn from 1 and do better at validating input, especially in silently pervasively used shared state (e.g., the string intern table). Unfortunately, I think this will always be best effort.
I get it, you find the infinite loop to be distateful. Here's an alternative: Remove the loop wrapper. Add `throw new ConcurrentModificationException();` at the end. Don't modify the two helper functions. This way, the constructor only gets one chance to try doing the encoding. Change the Javadoc to say that if the array contents are modified while the constructor is executing, the function may (but is not guaranteed to) throw ConcurrentModificationException.
> it's trading "bad code can populate the program's string intern table with invalid string objects" for "bad code can instantly deadlock the program", which is not much of an upgrade.
A lot more harm can come from code doing the wrong thing or producing wrong answers than from deadlocks.
Are you really going to put while(true) around every nearly every access of mutable state in your programs? That is the implication I get from your comment. I really think you should think hard before trying to dig your way out of a hole like that.
It's not very different from grabbing a mutex on the string. Except that someone is more likely to hang a mutex forever than to make this code hang forever. So yes that's fine. Why wouldn't it be?
And "every time a string is copied" is pretty far from "nearly every access of mutable state".
Also unless the string-changing code has a very elaborate timing attack, won't it only make the constructor loop again about half the time at most?
This thread is honestly crazy to me. I've never seen so many people advocate for hot spin loops as a patchwork attempted solution to data races. Never mind have I never seen this recommended in any serious literature, I've never even seen it suggested in a random blog post, or in the worst code bases I've ever worked with (I have worked on some not great code in my professional experience in the past).
And the reason I am talking about mutable state, is because these issues are not specific to strings! Why do you think there's something special about these kinds of data races that make them specific to strings?
I really, genuinely hope you take some time to read more deeply into this topic, and talk with your peers about this some more, before you implement any of these ideas.
The fundamental lesson I think is that data races must be eliminated, and there is no solution that tolerates them or works around them. You must eliminate them from your code.
A key related lesson is that there is no unilateral way to do this for shared mutable state. If there is shared mutable state, 2 or more concurrent threads must co-operate in order to avoid data races (all threads must lock, or use atomics, or fences, etc.). "There is no unilateral solution", means there is no way to loop or lock or fence in just one thread, in order to fix or resolve or work around some other thread that does not.