ARM unallocated encodings are not handled transparently
Created by: egrimley
Test program:
        .arm
        .global sigill
        .type   sigill, %function
sigill:
        .inst   0xf2800f00      @ the encoding under test; raises SIGILL natively
        b       .               @ spin if the instruction did not trap
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void sigill(void);
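/* Report the signal number, the faulting instruction word and si_code. */
void handler(int signum, siginfo_t *info, void *ucontext_)
{
    /* For SIGILL, si_addr is the address of the faulting instruction. */
    uint32_t inst = *(uint32_t *)info->si_addr;
    (void)ucontext_;
    printf("%d 0x%08x %d\n", signum, inst, info->si_code);
    exit(0);
}

int main(void)
{
    struct sigaction act;
    memset(&act, 0, sizeof(act));
    act.sa_sigaction = handler;
    act.sa_flags = SA_SIGINFO;
    sigaction(SIGILL, &act, 0);
    sigill(); /* Executes the .inst word above. */
    return 1; /* Not reached: the handler exits and sigill() never returns. */
}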
Native output (signal number, faulting instruction word, si_code):
4 0xf2800f00 1
DynamoRIO with debug:
<Invalid opcode encountered>
<Application .../udf.exe (17571) DynamoRIO usage error : instr_decode: raw bits are invalid>
<Usage error: instr_decode: raw bits are invalid (.../core/arch/instr_shared.c, line 1527)
DynamoRIO without debug:
<Application .../udf.exe (17572). DynamoRIO internal crash at PC 0xab0578f6. Please report this at http://dynamorio.org/issues/. Program aborted.
Received SIGSEGV at pc 0xab0578f6 in thread 17572
When DynamoRIO thinks it has encountered an unallocated encoding there are several things that might have happened:
- The app was supposed to do this.
- The app has gone wrong, perhaps because of an unrelated bug or limitation in DynamoRIO, or perhaps the app is just buggy.
- There is a bug or omission in DynamoRIO's decoder.
- An instruction has been added to the architecture and DynamoRIO is out of date.
Therefore, probably the right thing for DynamoRIO to do is:
- With DEBUG, print a warning that an unallocated encoding has been encountered, giving the address and the encoding (but not if it is an officially undefined UDF instruction).
- Assume optimistically that the instruction is not a branch and does not use the stolen register so it can be safely copied into the fragment cache and executed. It may or may not cause a SIGILL.
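A rough sketch of that policy for the A32 case, to make the two steps concrete. This is not DynamoRIO code: `is_a32_udf`, `emit_unknown_a32`, the `debug` flag and the `cache` buffer are all hypothetical names invented for illustration. The only architectural fact assumed is the A32 UDF encoding (`cond` = 1110, bits [27:20] = 0x7f, bits [7:4] = 0xf).

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: true for the A32 UDF encoding, which is undefined
 * by design and so should not trigger a warning. */
bool is_a32_udf(uint32_t enc)
{
    return (enc & 0xfff000f0u) == 0xe7f000f0u;
}

/* Hypothetical helper: handle an A32 word the decoder does not recognise.
 * Warns in debug mode (except for UDF), then copies the raw bits unchanged
 * into the cache buffer on the optimistic assumption that the word is not
 * a branch and does not touch the stolen register.
 * Returns the number of bytes written. */
size_t emit_unknown_a32(uint32_t enc, uintptr_t pc, uint8_t *cache, bool debug)
{
    if (debug && !is_a32_udf(enc)) {
        fprintf(stderr, "warning: unallocated encoding 0x%08" PRIx32
                        " at 0x%" PRIxPTR "\n", enc, pc);
    }
    memcpy(cache, &enc, sizeof(enc));
    return sizeof(enc);
}
```

With the test program's word, `emit_unknown_a32(0xf2800f00, pc, buf, true)` would warn once and copy four bytes. Whether the copied instruction then raises SIGILL is left to the hardware, as it would be natively.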
To implement this behaviour we could use some new opcodes (OP_unknown_a32, OP_unknown_t32n, OP_unknown_t32w; or just OP_unknown) that work a bit like AArch64's OP_xx (which should probably be renamed at some point).
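If the three-opcode variant were chosen, picking between them might look roughly like the following. The enum and function names are hypothetical and only mirror the names proposed above; none of them exist in DynamoRIO today. The one architectural rule relied on is that a T32 instruction is 32 bits wide when bits [15:11] of its first halfword are 0b11101, 0b11110 or 0b11111, and 16 bits wide otherwise.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical opcodes, named after the proposal in this issue. */
typedef enum {
    OP_unknown_a32,  /* 4-byte A32 word */
    OP_unknown_t32n, /* 2-byte (narrow) T32 instruction */
    OP_unknown_t32w  /* 4-byte (wide) T32 instruction */
} unknown_opcode_t;

/* True if the first halfword starts a 32-bit T32 instruction. */
bool t32_is_wide(uint16_t first_halfword)
{
    return (first_halfword >> 11) >= 0x1d;
}

/* Choose the placeholder opcode from the ISA mode and, for T32, the
 * first halfword of the unrecognised instruction. */
unknown_opcode_t pick_unknown_opcode(bool thumb, uint16_t first_halfword)
{
    if (!thumb)
        return OP_unknown_a32;
    return t32_is_wide(first_halfword) ? OP_unknown_t32w : OP_unknown_t32n;
}

/* Number of raw bytes to copy for each placeholder opcode. */
unsigned unknown_length(unknown_opcode_t op)
{
    return op == OP_unknown_t32n ? 2u : 4u;
}
```

A single OP_unknown would work equally well if the instruction length is carried alongside the raw bits instead of being implied by the opcode.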