Mastering the CPUID Instruction in x86 Assembly Language

The CPUID (CPU Identification) instruction is the gateway to understanding the underlying hardware of an x86 system. Introduced in 1993 with the Pentium and late-model Intel 486 processors, it allows software to query the processor’s vendor, family, model, and supported feature sets. Mastering CPUID is essential for low-level systems engineers, game developers optimizing for specific SIMD extensions, and OS developers building robust kernels.

This article covers the mechanics of the CPUID instruction, how to use it in x86 assembly, and practical applications for feature detection.

Understanding the Mechanics

The CPUID instruction is a zero-operand instruction, meaning it does not take explicit arguments in the code text (e.g., you simply write cpuid). Instead, it relies on implicit inputs and outputs stored in the processor’s general-purpose registers.

Input: The EAX register acts as the “leaf” or primary function selector. For leaves that define sub-leaves (such as leaves 4, 7, 0Bh, and 0Dh), the ECX register acts as the “sub-leaf” or sub-function selector.

Output: After execution, the CPU overwrites the EAX, EBX, ECX, and EDX registers with the requested hardware information.
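The input and output convention described above can be sketched as a small wrapper. This is a minimal sketch, assuming 64-bit NASM and the System V AMD64 calling convention; the name cpuid_query and the four-dword output layout are illustrative choices, not part of any standard API.

```nasm
; void cpuid_query(uint32_t leaf, uint32_t subleaf, uint32_t out[4]);
; Hypothetical helper. Assumes the System V AMD64 ABI:
;   EDI = leaf, ESI = sub-leaf, RDX = pointer to four dwords.
section .text
global cpuid_query

cpuid_query:
    push rbx                ; RBX is callee-saved, and CPUID overwrites EBX
    mov  r8, rdx            ; CPUID overwrites EDX, so keep the pointer in R8
    mov  eax, edi           ; Leaf selector
    mov  ecx, esi           ; Sub-leaf selector (ignored by leaves without sub-leaves)
    cpuid
    mov  [r8], eax          ; out[0] = EAX
    mov  [r8 + 4], ebx      ; out[1] = EBX
    mov  [r8 + 8], ecx      ; out[2] = ECX
    mov  [r8 + 12], edx     ; out[3] = EDX
    pop  rbx                ; Restore the caller's RBX
    ret
```

Under these assumptions, a C caller would declare the routine with the prototype shown in the first comment and pass leaf 0 to read the maximum leaf and vendor registers.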

Because CPUID alters these four primary registers, developers must carefully save any critical data residing in them before executing the instruction.

Checking for CPUID Support

Before executing CPUID, a robust program must verify that the processor actually supports it. On older 32-bit x86 processors, support is determined by checking if bit 21 of the EFLAGS register (the ID flag) can be toggled. If software can flip this bit, CPUID is supported. Here is how to perform this check in 32-bit x86 assembly:

    section .text
    global check_cpuid

    check_cpuid:
        pushfd                  ; Push original EFLAGS to stack
        pop eax                 ; Pop EFLAGS into EAX
        mov ecx, eax            ; Save original EFLAGS in ECX for comparison
        xor eax, 1 << 21        ; Toggle bit 21 (ID flag)
        push eax                ; Push modified EFLAGS value to stack
        popfd                   ; Load modified value into EFLAGS
        pushfd                  ; Push EFLAGS back to stack
        pop eax                 ; Pop it into EAX to see if the bit stayed flipped
        push ecx                ; Restore original EFLAGS
        popfd
        xor eax, ecx            ; Compare original and new EFLAGS
        test eax, 1 << 21       ; If bit 21 changed, CPUID is supported
        jnz cpuid_supported
        mov eax, 0              ; Return 0 (False)
        ret

    cpuid_supported:
        mov eax, 1              ; Return 1 (True)
        ret

On modern 64-bit x86-64 processors, this check is generally unnecessary as CPUID support is a mandatory baseline requirement for the architecture.

Leaf 0: Querying the Vendor ID String

The most basic use of CPUID is calling Leaf 0 by setting EAX to 0. This returns the highest supported standard leaf number in EAX, along with a 12-character ASCII vendor string split across EBX, EDX, and ECX (in that specific order). Each register holds four characters, lowest byte first. For example, Intel processors return “GenuineIntel”:

EBX = 756e6547h (“Genu”)
EDX = 49656e69h (“ineI”)
ECX = 6c65746eh (“ntel”)

Below is an implementation in 64-bit NASM assembly that retrieves this vendor string:

    default rel                 ; Use RIP-relative addressing for memory operands

    section .bss
    vendor_str resb 13          ; Allocate 12 bytes for string + 1 for null terminator

    section .text
    global get_cpu_vendor

    get_cpu_vendor:
        push rbx                ; RBX is callee-saved, and CPUID overwrites EBX
        mov eax, 0              ; Set leaf to 0
        cpuid                   ; Execute CPUID (fills EAX, EBX, ECX, EDX)
        ; Register order for vendor string is EBX, EDX, ECX
        mov dword [vendor_str], ebx
        mov dword [vendor_str + 4], edx
        mov dword [vendor_str + 8], ecx
        mov byte [vendor_str + 12], 0   ; Null terminator
        lea rax, [vendor_str]   ; Return pointer to the string
        pop rbx                 ; Restore the caller's RBX
        ret

Leaf 1: Processor Info and Feature Bits

Setting EAX to 1 returns the processor’s stepping, model, and family information in EAX, while ECX and EDX are populated with feature flags. These flags indicate whether the CPU supports specific hardware enhancements like SIMD extensions (SSE, AVX) or virtualization capabilities. Crucial feature flags in Leaf 1 include:

EDX Bit 25: SSE support
EDX Bit 26: SSE2 support
ECX Bit 0: SSE3 support
ECX Bit 28: AVX support
ECX Bit 30: RDRAND (Hardware Random Number Generator) support

Example: Dynamic AVX Feature Detection

To prevent application crashes resulting from executing unsupported instructions, software uses CPUID to check for feature flags at runtime:

    section .text
    global check_avx_support

    check_avx_support:
        push rbx                ; CPUID overwrites EBX; RBX is callee-saved (64-bit build assumed)
        mov eax, 1              ; Leaf 1
        cpuid                   ; Execute CPUID
        ; AVX support is indicated by bit 28 of the ECX register
        bt ecx, 28              ; Bit Test instruction: copies bit 28 to the Carry Flag (CF)
        jc avx_supported        ; Jump if Carry Flag is set (1)
        mov eax, 0              ; AVX not supported, return 0
        pop rbx
        ret

    avx_supported:
        mov eax, 1              ; AVX supported, return 1
        pop rbx
        ret

Advanced Leaves: Structured Extended Features

As processors grew more complex, Intel and AMD introduced extended features that exceeded the capacity of Leaf 1. This led to Structured Extended Features, accessed via Leaf 7.

To query Leaf 7, set EAX to 7 and clear ECX to 0 (Sub-leaf 0). This sub-leaf returns flags for modern instructions like AVX2 (EBX bit 5), the AVX-512 family (starting with AVX-512 Foundation, EBX bit 16), and security features like SHA extensions (EBX bit 29).
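A Leaf 7 query might look like the following. This is a minimal sketch, assuming 64-bit NASM and a C-style integer return in EAX; the function name has_avx2 is hypothetical. It confirms that Leaf 7 exists before querying it, then tests EBX bit 5.

```nasm
; int has_avx2(void);   (hypothetical name, 64-bit build assumed)
; Returns 1 if CPUID Leaf 7, Sub-leaf 0 reports AVX2 (EBX bit 5), else 0.
section .text
global has_avx2

has_avx2:
    push rbx                ; CPUID overwrites EBX; RBX is callee-saved
    xor  eax, eax           ; Leaf 0: EAX returns the maximum standard leaf
    cpuid
    cmp  eax, 7
    jb   .no                ; Leaf 7 is not available on this processor
    mov  eax, 7             ; Leaf 7
    xor  ecx, ecx           ; Sub-leaf 0
    cpuid
    bt   ebx, 5             ; CF = EBX bit 5 (AVX2)
    jnc  .no
    mov  eax, 1             ; AVX2 reported
    pop  rbx
    ret

.no:
    xor  eax, eax           ; AVX2 not reported
    pop  rbx
    ret
```

This only reports what the processor advertises. It does not check whether the operating system has enabled the AVX register state, which is a separate requirement before AVX2 instructions can be used safely.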

Always ensure that the maximum leaf value returned by Leaf 0 is equal to or greater than 7 before attempting to query Leaf 7.

Best Practices and Serialization

Preserve Registers: CPUID overwrites EAX, EBX, ECX, and EDX. In 64-bit mode, it also clears the upper 32 bits of RAX, RBX, RCX, and RDX. RBX (EBX in 32-bit code) is callee-saved in the common calling conventions, so save and restore it around CPUID, and declare all four registers as clobbered when writing inline assembly.

Instruction Serialization: CPUID is a serializing instruction. The processor guarantees that all instructions preceding CPUID are fully executed before CPUID executes, and no subsequent instructions are fetched until CPUID finishes. Because it stops out-of-order execution engines, it carries a relatively high clock-cycle penalty and should not be placed inside performance-critical loops.

Kernel Considerations: Some feature flags (like AVX) require additional checks. Even if CPUID reports AVX support, the OS must enable the extended register state by setting CR4.OSXSAVE and the SSE and AVX bits (bits 1 and 2) of XCR0; otherwise, executing an AVX instruction triggers an Invalid Opcode (#UD) exception. Application code can confirm this by checking the OSXSAVE flag (Leaf 1, ECX bit 27) and then reading XCR0 with XGETBV.

Conclusion

Mastering the CPUID instruction empowers low-level developers to build highly adaptable software. By querying the hardware at runtime, programs can scale gracefully—running legacy fallback routines on older machines while unlocking maximum performance through advanced SIMD extensions on modern hardware.
