x86 nop filling should use multi-byte nops
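On x86, DR's SET_TO_NOPS routine currently does a memset of 0x90, i.e., it fills the region with single-byte nops. It was written back when multi-byte nops were not readily available. We should update it to emit multi-byte nop sequences for lengths >1 byte, so that the filled region decodes and executes as fewer instructions.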
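
As a rough illustration of the proposed change, here is a minimal sketch of a multi-byte nop filler. It is not DR's code: `fill_nops`, `nop_table`, and `MAX_NOP_LEN` are hypothetical names, and the real SET_TO_NOPS interface may take different arguments. The byte sequences are the 1- to 9-byte forms recommended in the Intel SDM entry for NOP. The sketch assumes the target processor supports the `0F 1F /0` encoding, which very old processors do not, so a real patch would likely need a fallback to 0x90.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Longest single nop in the table below (hypothetical constant). */
#define MAX_NOP_LEN 9

/* Recommended multi-byte nop encodings, indexed by (length - 1).
 * Unused trailing bytes in each row are zero and are never copied.
 */
static const unsigned char nop_table[MAX_NOP_LEN][MAX_NOP_LEN] = {
    { 0x90 },
    { 0x66, 0x90 },
    { 0x0f, 0x1f, 0x00 },
    { 0x0f, 0x1f, 0x40, 0x00 },
    { 0x0f, 0x1f, 0x44, 0x00, 0x00 },
    { 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00 },
    { 0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00 },
    { 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00 },
    { 0x66, 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00 },
};

/* Fills buf[0..len) with nops, greedily using the longest encoding
 * that still fits, so an 11-byte region becomes a 9-byte nop followed
 * by a 2-byte nop instead of eleven 1-byte nops.
 */
void
fill_nops(unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    while (len > 0) {
        size_t n = len < MAX_NOP_LEN ? len : MAX_NOP_LEN;
        memcpy(buf, nop_table[n - 1], n);
        buf += n;
        len -= n;
    }
}
```

The greedy split keeps the instruction count minimal for this table. Whether a different split (for example, avoiding the prefixed forms) performs better on a given microarchitecture is an open question that this sketch does not address.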