make unknown instruction handling more robust

Today DR's handling of an instruction that its decode tables consider invalid is not ideal. On Linux, DR isolates the instruction in its own bb and executes that bb, expecting it to fault right away. On Windows, DR forges an illegal instruction exception.

If the instruction is in fact valid (e.g., a new ISA extension not yet added to DR, or a non-public instruction encoding), DR does not necessarily do the right thing for continued execution. This is especially problematic with variable-length instructions: DR doesn't know the length of the instruction, so it copies the maximum (17 bytes on x86). Those 17 bytes may contain control flow or other instructions that need mangling, in which case DR could lose control or crash.

How do we improve this? Maybe we should decode the 17 bytes from every offset and ensure there are no control flow instructions, so that we at least do not lose control (the wrong thing will still happen in terms of a client not seeing instructions). We could also look for other key things like segment references, rip-rel references, or (for ARM) stolen register use. It's not clear what to do if we see any of these. One idea is to single-step, via debugger methods or via running progressively longer sequences and counting instructions with perfctrs (just thinking out loud here).
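To make the "decode from every offset" idea concrete, here is a minimal sketch, not DR code. It walks the copied bytes and reports the first offset at which a control transfer could begin. Everything named here is an assumption for illustration: `MAX_COPY_LEN`, `toy_is_cti_opcode`, and `first_possible_cti_offset` are hypothetical, and the toy classifier only checks a few one-byte x86 opcodes where a real version would run the full decoder at each offset. Offset 0 is skipped because that is the instruction we already failed to decode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for the number of bytes copied for an unknown instr. */
#define MAX_COPY_LEN 17

/* Hypothetical stand-in for "decode at this offset and ask whether it is a
 * control transfer". It recognizes only a few one-byte x86 opcodes and
 * ignores prefixes and multi-byte opcodes. */
static bool
toy_is_cti_opcode(uint8_t op)
{
    if (op >= 0x70 && op <= 0x7f)
        return true; /* jcc rel8 */
    switch (op) {
    case 0xc2:
    case 0xc3: /* ret */
    case 0xe8: /* call rel32 */
    case 0xe9: /* jmp rel32 */
    case 0xeb: /* jmp rel8 */
    case 0xff: /* group 5, which includes indirect call/jmp */
        return true;
    default:
        return false;
    }
}

/* Returns the first offset in [1, len) at which a control transfer might
 * start, or -1 if none is found. The scan is conservative: a matching byte
 * may really be an immediate or displacement of the unknown instruction. */
static int
first_possible_cti_offset(const uint8_t *buf, size_t len)
{
    assert(buf != NULL && len <= MAX_COPY_LEN);
    for (size_t off = 1; off < len; off++) {
        if (toy_is_cti_opcode(buf[off]))
            return (int)off;
    }
    return -1;
}
```

A result of -1 would mean the copied bytes are safe to execute as-is with respect to losing control. Any other result is where the open question above kicks in: fall back to single-stepping, forge the fault, or something else.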

Or maybe we should acknowledge that there is no good solution and change Linux to also explicitly forge SIGILL up front. There are tradeoffs here with new ISA support.

Note that for core DR with the fast decoder we deliberately gloss over invalid entries within opcode classes: if we can guess at the length, we go with it. When a client is present this may not hold, as the full decoder might reject the instruction. We should double-check the difference between the two decoders and see if we can do better for the full decoder.

Xref #57 Xref #431