Adds handling for AArch64 exclusive loads whose destination register is the same as the result register of the paired exclusive store. Without this fix, the optimized same-block handling loops forever: its value comparison uses the store result register as a temporary on the assumption that it is dead, which in this case clobbers the originally loaded value.
The fix is to use a scratch register for the comparison instead. This does not work for the stolen register cases, so we bail to the slowpath for those. We also have to re-order the compare so that the matching case is checked first, as the compare must write to the same result register.
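As a rough illustration of the decision described above, here is a minimal sketch in C. It is not the code from this change: `cmp_plan_t`, `choose_compare_plan`, and the `uses_stolen_reg` flag are hypothetical names, and the exact condition that counts as a "stolen register case" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes for the same-block value comparison.
 * These names are illustrative only and are not part of the real code base.
 */
typedef enum {
    CMP_USE_STORE_RESULT, /* Reuse the dead store result register as the temporary. */
    CMP_USE_SCRATCH,      /* Use a separate scratch register as the temporary. */
    CMP_SLOWPATH          /* Give up on the optimized same-block handling. */
} cmp_plan_t;

/* Problematic pattern (illustrative):
 *     ldxr w1, [x0]        ; load destination is w1
 *     ...
 *     stxr w1, w2, [x0]    ; store result register is also w1
 * Using w1 as the compare temporary would overwrite the loaded value.
 *
 * load_dst and store_res are register numbers 0..30 (assumption: general
 * purpose registers only). uses_stolen_reg is an assumed flag saying that the
 * pair touches the stolen register in a way the scratch approach cannot handle.
 */
static cmp_plan_t
choose_compare_plan(int load_dst, int store_res, bool uses_stolen_reg)
{
    assert(load_dst >= 0 && load_dst <= 30);
    assert(store_res >= 0 && store_res <= 30);
    if (load_dst != store_res)
        return CMP_USE_STORE_RESULT; /* No overlap: the existing approach is safe. */
    if (uses_stolen_reg)
        return CMP_SLOWPATH; /* Scratch register approach fails here. */
    return CMP_USE_SCRATCH;
}
```

For example, `choose_compare_plan(1, 1, false)` selects the scratch register, while the same overlap with `uses_stolen_reg` set falls back to the slowpath.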
Adds test cases to the ldstex test. Tested manually on an ARM machine.
Fixes #5247