Bug #914
x64 mutex_enter only clears half of %rax, may block unnecessarily
Description
Looking further at the x86 mutex_enter, I notice that we clear %eax before issuing a locked cmpxchgq. cmpxchgq compares with %rax (of which we only cleared the low 32 bits, via %eax), so we may not have actually zeroed the register before the comparison.
I think this will push us into the slow path if any of the high 32 bits of %rax are set: the compare fails, and we go through vector_enter just long enough to find that the lock is actually free (and always has been).
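To make the suspected failure mode concrete, here is a small C model of the compare-and-exchange step. It is only a sketch, not the kernel code: model_cmpxchgq and model_try_enter are hypothetical names, and it assumes a free lock word is 0, which is what comparing against a zeroed %rax implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of `lock cmpxchgq %src, (mem)`.
 * %rax is the implicit comparand. Returns true when ZF would be set,
 * i.e. when the exchange happened.
 */
bool model_cmpxchgq(uint64_t *mem, uint64_t *rax, uint64_t src)
{
    if (*mem == *rax) {
        *mem = src;     /* match: store the new value */
        return true;
    }
    *rax = *mem;        /* mismatch: current value is loaded into rax */
    return false;
}

/*
 * Try to take a free lock (assumed to be word == 0) with a given
 * initial %rax. `owner` stands in for the thread pointer.
 */
bool model_try_enter(uint64_t rax, uint64_t owner)
{
    uint64_t lock_word = 0;     /* free lock (assumption) */
    return model_cmpxchgq(&lock_word, &rax, owner);
}
```

In this model, model_try_enter(0, owner) succeeds, while any nonzero bit left in the upper half of rax makes the fast path fail on a lock that is free. That is the scenario this report worries about.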
Updated by Rich Lowe almost 10 years ago
When searching for the processor manuals to confirm this theory, one of the hits was an old bugs.opensolaris.org bug (6958602) stating basically what I said above, and containing a DTrace script to test the unnecessary block theory.
The Google cached copy is: http://webcache.googleusercontent.com/search?q=cache:_TH8yRL3TvYJ:bugs.opensolaris.org/bugdatabase/view_bug.do%3Bjsessionid%3D4cbbe2b16c2f2f9397445cead61c%3Fbug_id%3D6958602
Updated by Rich Lowe almost 10 years ago
- Status changed from New to Rejected
Ah! xorl %eax, %eax zero-extends its result and so clears all of %rax. (This differs from the 16-bit case: a write to %ax leaves the upper bits of %eax, and of %rax, untouched.)
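The difference between the two operand sizes can be sketched as a portable C model. This is an illustration of the register-write rules in 64-bit mode, not something executed on the hardware; all function names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit destination write in 64-bit mode: bits 63:32 are zeroed. */
uint64_t model_write32(uint64_t reg, uint32_t val)
{
    (void)reg;                  /* the old contents do not survive */
    return (uint64_t)val;
}

/* 16-bit destination write: bits 63:16 are preserved. */
uint64_t model_write16(uint64_t reg, uint16_t val)
{
    return (reg & ~0xffffULL) | val;
}

/* Model of `xorl %eax, %eax`. */
uint64_t model_xorl_self(uint64_t rax)
{
    uint32_t lo = (uint32_t)rax;
    return model_write32(rax, lo ^ lo);
}

/* Model of `xorw %ax, %ax`. */
uint64_t model_xorw_self(uint64_t rax)
{
    uint16_t lo = (uint16_t)rax;
    return model_write16(rax, (uint16_t)(lo ^ lo));
}
```

Under this model, model_xorl_self always returns 0 whatever the upper half of rax held, whereas model_xorw_self only clears the low 16 bits. The first is why the cmpxchgq comparand is fully zeroed.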