Illegal instructions, invalid opcodes, and Slackware

Last modified: 2024-01-21 09:34

When you run a program that was built for a later generation or different flavor of x86 CPU than the one you have, eventually the CPU hits an instruction that it doesn't know.  At that point it raises the invalid opcode hardware exception.  The Linux kernel handles this exception by delivering the illegal instruction signal to the offending process.  A command line user sees an "illegal instruction" error message and the program crashes.  On a supposedly user-friendly desktop, error messages go into a black hole and the user sees only the unexplained crash.

Although the examples on this page are for older-generation hardware, it is just as easy to get an illegal instruction on a 64-bit CPU as on a 32-bit one.  You just fail on a different instruction set: at the moment, probably some fragment of AVX-512.
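To see which instruction sets your own CPU reports, an x86 Linux system lists them on the "flags" line of /proc/cpuinfo.  The following is a minimal sketch and not part of the procedure described on this page; has_cpu_flag is a hypothetical helper name, and the flag names in the loop (mmx, sse, sse2, avx512f) are the spellings the kernel uses on x86.

```shell
# has_cpu_flag FLAG [FILE]
# Succeeds if FLAG is listed on the first "flags" line of FILE
# (default /proc/cpuinfo).  Hypothetical helper, x86 Linux only.
has_cpu_flag() {
    # Keep only the text after the colon on the first line starting with "flags".
    _flags=$(sed -n '/^flags/{s/^[^:]*://;p;q;}' "${2:-/proc/cpuinfo}" 2>/dev/null)
    # Pad with spaces so that "sse" does not match inside "sse2".
    case " $_flags " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

for flag in mmx sse sse2 avx512f; do
    if has_cpu_flag "$flag"; then
        echo "$flag: supported"
    else
        echo "$flag: not reported"
    fi
done
```

A program that needs a flag reported as missing here is a candidate for the crash described above.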

Instruction sets from MMX to SSE2

Figure 1 shows significant instruction sets added to x86 CPUs in the range from MMX to SSE2.  The names of instruction sets are shown plainly while the names of CPU types are shown in parentheses.  The instruction sets supported by a CPU type include all of those named on a path upward to the top node in the lattice.  So, Pentium (at the top) supports none of the added instruction sets while Athlon 64 (at the bottom) supports all of them.

i586, i686, P2, P3, and P4 are the commonly used nicknames for Pentium, Pentium Pro, Pentium 2, Pentium 3, and Pentium 4 CPUs respectively.

Figure 1.  CPU support for instruction sets from MMX to SSE2 expressed as a concept lattice with reduced labelling (Formal Concept Analysis).

GCC's behavior

The -march switch enables added instruction sets like SSE and 3DNow! consistent with the CPU type.  Added instruction sets can also be enabled individually with switches like -msse.  Tables 1 and 2 summarize GCC 12.2 documentation for the MMX to SSE2 range.

As a general rule, when the GCC command line contains contradictory switches, the later switch takes precedence.  My expectation, then, was that -msse -march=i586 would be equivalent to just -march=i586 and would not generate SSE code.  Instead, GCC merges the i586 instruction set with the SSE instruction set and generates code that won't even run on a Pentium 2 or Athlon, much less a Pentium.
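One way to check what a combination of switches really enables is to ask GCC for its predefined macros, since __SSE__ is defined whenever SSE code generation is on.  This is a minimal sketch, assuming an x86 GCC that accepts -m32; sse_enabled is a hypothetical helper name.

```shell
# sse_enabled GCC_FLAGS...
# Prints "yes" if GCC predefines __SSE__ for the given flags, else "no".
# Only the preprocessor runs, so no 32-bit libraries are needed.
sse_enabled() {
    if gcc -m32 "$@" -dM -E -x c /dev/null 2>/dev/null | grep -q ' __SSE__ '; then
        echo yes
    else
        echo no
    fi
}

echo "-march=i586                 : $(sse_enabled -march=i586)"
echo "-msse -march=i586           : $(sse_enabled -msse -march=i586)"
echo "-msse -march=i586 -mno-sse  : $(sse_enabled -msse -march=i586 -mno-sse)"
```

If the merging behavior described above holds, the second line should differ from the first.  The third line shows one way to cancel a stray -msse: a later -mno-sse.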

Table 1.  GCC options for specifying x86 CPU type.
CPU | GCC option
Pentium | -march=i586 or -march=pentium
Pentium MMX | -march=pentium-mmx
Pentium Pro | -march=i686 or -march=pentiumpro
P2 Deschutes | -march=pentium2
Pentium 3 | -march=pentium3
Pentium 4 | -march=pentium4
K6 | -march=k6
K6-2 | -march=k6-2
K6-2+ | None
Athlon | -march=athlon
Athlon XP | -march=athlon-xp
Athlon 64 | -march=k8 or -march=athlon64

Table 2.  GCC options for enabling instruction sets.
Instruction set | GCC option
MMX | -mmmx
CMOV | None (determined by -march)
FXSR | -mfxsr
Extended MMX | Subset of -msse and -m3dnowa*
SSE | -msse
SSE2 | -msse2
3DNow! | -m3dnow
Extended 3DNow! | Subset of -m3dnowa*

* The 'a' in -m3dnowa presumably stands for Athlon.  -m3dnowa enables the 5 instructions of Extended 3DNow! plus the 19 instructions of Extended MMX.  Extended MMX is also included in SSE.

Impact on Slackware

Most packages in the 32-bit version of Slackware 15.0 are built with -march=i586 and specify i586 in the package name.  A few are built for i686 instead.  But the upstream configuration and build scripts for many packages add -msse, -msse2 or suchlike to the GCC command line.  Slackware's SlackBuild scripts don't remove or cancel out those switches, so packages that are nominally built for the i586 arch actually require a Pentium 3 or Pentium 4 to run.

Some packages furthermore include SSE or SSE2 assembly code.  To produce an i586-compatible build for these, it's necessary to use package-specific configuration options to disable SSE and/or asm code, and Slackware's SlackBuild scripts don't do that.
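A crude way to audit an installed binary or library for this problem is to disassemble it and count the instructions that touch XMM registers.  The sketch below is only a heuristic under stated assumptions: count_xmm is a hypothetical helper name, the input is objdump's default AT&T-syntax disassembly, and a nonzero count does not say whether the code is SSE, SSE2, or something later, nor whether it is guarded by a runtime CPU check.

```shell
# count_xmm
# Reads "objdump -d" output on stdin and prints the number of
# disassembly lines that reference an XMM register (%xmm0, %xmm1, ...).
count_xmm() {
    grep -c '%xmm[0-9]'
}

# Usage: pass the file to inspect as the first script argument.
if [ -n "${1:-}" ]; then
    objdump -d "$1" | count_xmm
fi
```

A count of zero for a package that is nominally i586 is what you would hope to see.  MMX registers are written %mm0 and so on in this syntax, so they are not counted.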

Finally, Rust has a unique build system with a unique problem resulting from somebody's decision to enable SSE2 in builds targeting i686.  The release notes for Rust version 1.10.0 (2016-07-07) say "This release includes std binaries for the i586-unknown-linux-gnu, i686-unknown-linux-musl, and armv7-linux-androideabi targets.  The i586 target is for old x86 hardware without SSE2...."  Since Slackware uses the i686 target for Rust, not only does rustc itself fail to run on the target architecture, every other package containing Rust code becomes contaminated with SSE2 as a second-order effect.

Following is a summary of the remediations I had to make for Slackware 15.0 to be fit for purpose on an Athlon T-Bird CPU.  Since there are some packages that I never use and some that I always replace with my own build, this is not a complete list of every affected package.

Table 3.  Fixes to Slackware 15.0 SlackBuild scripts.
Package | Who needs it | i586 SlackBuild fix | T-Bird tuning | Notes
Mesa | Every GL app | Add -Dsse2=false to meson setup switches | SLKCFLAGS="-O3 -march=athlon-tbird" | For Nvidia GL to work, you must build (or rebuild) the Nvidia driver after replacing the Mesa package with a non-SSE version.
Qt5 | KDE | Add -no-sse2 -no-sse3 -no-ssse3 -no-sse4.1 -no-sse4.2 -no-avx -no-avx2 -no-avx512 to configure switches | SLKCFLAGS="-O3 -march=athlon-tbird" |
SDL2 | Games, DOSBox-X, ffplay | Add --disable-mmx --disable-3dnow --disable-sse --disable-ssemath --disable-sse2 --disable-sse3 to configure switches | --enable-mmx --enable-3dnow and SLKCFLAGS="-O3 -march=athlon-tbird" |
Rust | Emacs (indirectly) | Set ARCH=i586 and add docs = false in the [build] section of the build configuration | Not applicable | It's a blivet.  Disabling docs avoids a stupid FTB.
Librsvg | Emacs | Rebuild after replacing Rust | Not applicable | SSE2 contamination from Rust
OpenCV | GStreamer | Add -DCPU_BASELINE= -DCPU_DISPATCH= to cmake switches | SLKCFLAGS="-O3 -march=athlon-tbird" | After replacing OpenAL and OpenCV, wipe out ~/.cache/gstreamer-1.0 and run gst-inspect-1.0 to rescan plugins

Here are corresponding patches for the SlackBuilds (i586 not T-Bird).

Bonus:  Proving that it was SSE

When rustc crashed, the kernel logged this at level info:

Jan 22 20:37:29 abit kernel: traps: rustc[1642] trap invalid opcode ip:b36e4400 sp:bfe36230 error:0 in[b3639000+1bf000]

Retrieve the offending instruction:

bash-5.1$ printf "%x\n" $((0xb36e4400 - 0xb3639000))
ab400
bash-5.1$ objdump -d /usr/lib/ | fgrep ab400
   ab400:       0f 57 c0                xorps  %xmm0,%xmm0

XORPS is an SSE instruction and XMM0 is an SSE register.
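The subtraction above can be wrapped in a small function that takes the kernel log line directly.  This is a minimal sketch; fault_offset is a hypothetical helper name, and it assumes the log line has the same "ip:... in ...[base+size]" shape as the one shown here.  It also assumes, as was the case here, that the offset from the start of the mapping is the address objdump prints; that will not hold for every binary or linker layout.

```shell
# fault_offset 'KERNEL LOG LINE'
# Prints (in hex) the faulting instruction pointer minus the base
# address of the mapping it fell in.  Hypothetical helper.
fault_offset() {
    # The value after "ip:".
    _ip=$(printf '%s\n' "$1" | sed -n 's/.* ip:\([0-9a-f]*\) .*/\1/p')
    # The base address inside the trailing [base+size].
    _base=$(printf '%s\n' "$1" | sed -n 's/.*\[\([0-9a-f]*\)+[0-9a-f]*].*/\1/p')
    if [ -z "$_ip" ] || [ -z "$_base" ]; then
        echo "fault_offset: no ip or mapping found" >&2
        return 1
    fi
    printf '%x\n' $(( 0x$_ip - 0x$_base ))
}

# The line logged when rustc crashed; prints ab400.
fault_offset 'traps: rustc[1642] trap invalid opcode ip:b36e4400 sp:bfe36230 error:0 in[b3639000+1bf000]'
```

The result is the value to search for in the disassembly of whichever file the kernel named in the log line.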