Refactor the unit tests for the get_[ymm|zmm]_caller functions, which were failing on CDash in release builds for several reasons:
- ymm and zmm registers are call-clobbered on x86, so in C it is impossible to keep them live across a function return. Fixed by inlining all of the test code.
- gcc inserts vzeroupper instructions at AVX(-512)/SSE transitions for performance reasons. These zero the upper register bits that this test checks, so the test cannot work with them. Fixed by compiling with -mno-vzeroupper (or -mllvm -x86-use-vzeroupper=0 for clang).
- It is almost impossible to cleanly avoid SSE instructions in C code that is mixed with AVX inline asm. For example, the compiler may emit an xmm register assignment if there is too much C code between the write and the read of the AVX registers in the test. This patch simplifies the tests to avoid this and adds some barriers as well.
- GCC 4.8.4 on Travis does not fully support AVX; e.g., "ymm%d" register names are not recognized in asm volatile clobber lists or in register asm("") statements. The AVX and AVX-512 proc checks now test for this as well.
- This patch moves the unit tests from the x86_code module into a new x86_code_test module that gets its own clang/gcc options for testing (-mno-vzeroupper), since we want this option to apply only to code compiled for unit_test. In addition, because of #3458, clang-specific target options break the clang build.
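
To illustrate the inlining point above, here is a minimal sketch (not the code from this patch) that writes and reads ymm0 inside a single asm statement, so the compiler cannot schedule any SSE code or a vzeroupper in between. The names ymm_val_t and ymm0_roundtrip are hypothetical. Clobbering "xmm0" instead of "ymm0" is an assumption made here so that compilers that do not know the ymm names still accept the clobber list.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: one ymm register's worth of bytes. */
typedef struct {
    uint8_t bytes[32];
} ymm_val_t;

/* Returns 1 if the pattern survives the round trip through ymm0,
 * 0 on mismatch, and -1 if the check is skipped (no AVX, or not x86-64).
 */
static int
ymm0_roundtrip(void)
{
#if defined(__x86_64__) && defined(__GNUC__)
    ymm_val_t in, out;
    if (!__builtin_cpu_supports("avx"))
        return -1;
    memset(&in, 0xab, sizeof(in));
    memset(&out, 0, sizeof(out));
    /* Write and read happen in one asm statement: no C code, and hence
     * no compiler-generated xmm use, can land between the two moves.
     */
    __asm__ __volatile__("vmovdqu %1, %%ymm0\n\t"
                         "vmovdqu %%ymm0, %0\n\t"
                         : "=m"(out)
                         : "m"(in)
                         : "xmm0"); /* xmm0 aliases the low half of ymm0 */
    return memcmp(&in, &out, sizeof(in)) == 0 ? 1 : 0;
#else
    return -1;
#endif
}
```

A caller would treat a return value of 0 as a failure and -1 as a skipped check.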
The unit "test" in this context essentially replicates the hand-encoded get_[ymm|zmm]_caller saved functions in C code and inline asm with mnemonics.
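
As a rough sketch of what a mnemonic-based register dump might look like (an assumption for illustration, not the actual test), the following fills two ymm registers with distinct patterns and stores them back to back into a buffer, all within one asm statement. The names dump_two_ymm, YMM_BYTES, and NUM_REGS are hypothetical, and the sketch covers only two registers to keep it short.

```c
#include <stdint.h>
#include <string.h>

#define YMM_BYTES 32
#define NUM_REGS 2 /* hypothetical: only ymm0 and ymm1 in this sketch */

/* Returns 1 if the dumped buffer matches the reference patterns,
 * 0 on mismatch, and -1 if the check is skipped (no AVX, or not x86-64).
 */
static int
dump_two_ymm(void)
{
#if defined(__x86_64__) && defined(__GNUC__)
    uint8_t ref[NUM_REGS * YMM_BYTES];
    uint8_t buf[NUM_REGS * YMM_BYTES];
    if (!__builtin_cpu_supports("avx"))
        return -1;
    memset(ref, 0x11, YMM_BYTES);             /* pattern for ymm0 */
    memset(ref + YMM_BYTES, 0x22, YMM_BYTES); /* pattern for ymm1 */
    memset(buf, 0, sizeof(buf));
    /* Load both registers, then store them at consecutive 32-byte slots.
     * The "memory" clobber acts as a compiler barrier around the asm.
     */
    __asm__ __volatile__("vmovdqu (%0), %%ymm0\n\t"
                         "vmovdqu 32(%0), %%ymm1\n\t"
                         "vmovdqu %%ymm0, (%1)\n\t"
                         "vmovdqu %%ymm1, 32(%1)\n\t"
                         :
                         : "r"(ref), "r"(buf)
                         : "xmm0", "xmm1", "memory");
    return memcmp(ref, buf, sizeof(ref)) == 0 ? 1 : 0;
#else
    return -1;
#endif
}
```

Comparing the buffer against per-register reference patterns is what lets a mismatch point at a specific register slot.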
Fixes #3446
Issue: #1312