Incorrect disassembly of AArch64 SHA instructions?
Filing an issue for https://github.com/DynamoRIO/dynamorio/pull/4988/files#r785332308, in case a comment on an old PR gets lost. I think the AArch64 SHA instructions are mistranscribed. They use d registers for several parameters. Based on https://dynamorio.org/page_aarch64_port.html, it sounds like d means the lower 64 bits, while q means the whole register. However, most of these parameters are the full width. Checking just the first few:
SHA1C
The PR decodes Rd as q, Rn as s, and Rm as d. However, Rm is described as <Vm>.4S, not <Vm>.2S, which totals 128 bits, not 64 bits.
SHA1H
This looks correct. The PR decodes Rd and Rn as s, which matches the documentation. Indeed, the pseudocode only reads 32 bits.
SHA1M
The PR decodes Rd as q, Rn as s, and Rm as d. However, Rm is described as <Vm>.4S, not <Vm>.2S, which totals 128 bits, not 64 bits.
SHA1P
The PR decodes Rd as q, Rn as s, and Rm as d. However, Rm is described as <Vm>.4S, not <Vm>.2S, which totals 128 bits, not 64 bits.
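The width argument used for each instruction above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not DynamoRIO code: the helper names `arrangement_bits` and `prefix_matches` are made up for this issue, and the prefix widths assume that d means 64 bits and q means 128 bits, as the port documentation seems to say.

```python
import re

# Lane widths in bits for the AArch64 SIMD arrangement suffixes.
LANE_BITS = {"B": 8, "H": 16, "S": 32, "D": 64}

# Widths implied by the register prefixes (assumption: d = 64 bits, q = 128 bits).
PREFIX_BITS = {"b": 8, "h": 16, "s": 32, "d": 64, "q": 128}


def arrangement_bits(arrangement: str) -> int:
    """Return the total width in bits of an arrangement such as '4S' or '2S'."""
    match = re.fullmatch(r"(\d+)([BHSD])", arrangement.upper())
    if match is None:
        raise ValueError(f"unrecognized arrangement: {arrangement!r}")
    lanes, lane = int(match.group(1)), match.group(2)
    return lanes * LANE_BITS[lane]


def prefix_matches(prefix: str, arrangement: str) -> bool:
    """Check whether a register prefix is as wide as the given arrangement."""
    return PREFIX_BITS[prefix] == arrangement_bits(arrangement)


print(arrangement_bits("4S"))     # 4 lanes * 32 bits
print(arrangement_bits("2S"))     # 2 lanes * 32 bits
print(prefix_matches("d", "4S"))  # the Rm decode questioned above
print(prefix_matches("q", "4S"))
```

Under these assumptions, a d operand is only wide enough for <Vm>.2S, while <Vm>.4S needs q.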
If this is right, the other instructions probably also need a re-check. Is it possible all the V registers were mistranscribed in this and other PRs?
Looking at dis-a64.txt, these tests seem to confirm this. Notice how the second column agrees that this is a 4s, not 2s, instruction. Meanwhile the test asserts a %d2.
5e020020 : sha1c q0, s1, v2.4s : sha1c %s1 %d2 -> %q0
In contrast, the add decode uses %d for 2s and %q for 4s.
0ea28420 : add v0.2s, v1.2s, v2.2s : add %d1 %d2 $0x02 -> %d0
[...]
4ea28420 : add v0.4s, v1.4s, v2.4s : add %q1 %q2 $0x02 -> %q0
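
To re-check the other instructions, something like the following Python sketch could scan test lines in the format shown above and flag vector operands whose arrangement width disagrees with the register prefix in the third column. It is a rough heuristic, not part of DynamoRIO: the function name `width_mismatches` is hypothetical, it assumes the three columns are separated by " : ", and it pairs operands purely by register number, so it only handles 64-bit and 128-bit arrangements.

```python
import re

# Lane widths in bits for the lowercase arrangement suffixes used in the tests.
LANE_BITS = {"b": 8, "h": 16, "s": 32, "d": 64}

# Register prefix expected for a given total width (assumption: d = 64, q = 128).
PREFIX_FOR_BITS = {64: "d", 128: "q"}

VECTOR_OPERAND = re.compile(r"\bv(\d+)\.(\d+)([bhsd])\b")  # e.g. v2.4s
DR_REGISTER = re.compile(r"%([bhsdq])(\d+)\b")             # e.g. %d2


def width_mismatches(line: str) -> list[str]:
    """Return one message per vector operand whose width disagrees with column 3."""
    fields = [field.strip() for field in line.split(" : ")]
    if len(fields) != 3:
        return []  # not an "encoding : asm : operands" line, e.g. "[...]"
    _, asm, operands = fields
    seen = {(int(num), prefix) for prefix, num in DR_REGISTER.findall(operands)}
    mismatches = []
    for num, lanes, lane in VECTOR_OPERAND.findall(asm):
        expected = PREFIX_FOR_BITS.get(int(lanes) * LANE_BITS[lane])
        if expected is not None and (int(num), expected) not in seen:
            mismatches.append(f"v{num}.{lanes}{lane}: expected %{expected}{num}")
    return mismatches


tests = [
    "5e020020 : sha1c q0, s1, v2.4s : sha1c %s1 %d2 -> %q0",
    "0ea28420 : add v0.2s, v1.2s, v2.2s : add %d1 %d2 $0x02 -> %d0",
    "4ea28420 : add v0.4s, v1.4s, v2.4s : add %q1 %q2 $0x02 -> %q0",
]
for test in tests:
    print(test.split(" : ")[1], "->", width_mismatches(test) or "ok")
```

On the three lines quoted in this issue, only the sha1c line is flagged; both add lines pass.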