Notes from “Breaking Kernel Address Space Layout Randomization (KASLR) with Intel TSX”

As a Georgia Tech OMSCS student and working software professional, I always want to learn more about advanced security topics. Georgia Tech’s Institute for Information Security & Privacy is presenting a weekly Cybersecurity Lecture Series on Fridays this fall, and since I live locally I’ve started attending. Here are my quick (and not necessarily complete) notes from this week’s presentation by Yeongjin Jang, a PhD student at Georgia Tech.

Introduction

The goal of many kernel exploits is to invoke commit_creds(), prepare_kernel_cred() and similar functions through code reuse.

KASLR changes the kernel symbol addresses at every boot, so any code-reuse exploit must first circumvent KASLR.
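To make the point above concrete, here is a minimal Python sketch of why a single leaked address is enough. It assumes the kernel image is shifted as a whole, so symbols keep fixed offsets from the image base. The offsets, base addresses and the `resolve_symbols` helper are all made up for illustration; real values depend on the kernel build.

```python
# Hypothetical symbol offsets relative to the kernel image base.
# Real offsets differ per kernel build; these are illustrative only.
SYMBOL_OFFSETS = {
    "commit_creds": 0x0A2B40,
    "prepare_kernel_cred": 0x0A2F10,
}


def resolve_symbols(kernel_base):
    """Return absolute symbol addresses for a given (recovered) kernel base."""
    return {name: kernel_base + off for name, off in SYMBOL_OFFSETS.items()}


# Two boots with different, made-up randomized base addresses.
boot_a = resolve_symbols(0xFFFFFFFF81000000)
boot_b = resolve_symbols(0xFFFFFFFF9C000000)

# Absolute addresses differ between boots, so hard-coded addresses fail.
print(hex(boot_a["commit_creds"]), hex(boot_b["commit_creds"]))

# The distance between symbols is unchanged, so recovering the base
# address is enough to locate every symbol again.
delta_a = boot_a["prepare_kernel_cred"] - boot_a["commit_creds"]
delta_b = boot_b["prepare_kernel_cred"] - boot_b["commit_creds"]
print(delta_a == delta_b)
```

This is why the rest of the talk focuses on recovering where the kernel and its modules are mapped: once the layout is known, the randomization no longer protects the symbols.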

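Popular OSs that have adopted address space randomization (the versions below mostly mark the first ASLR release; kernel ASLR arrived later on some, e.g. Linux 3.14 and OS X 10.8):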

Linux 2.6.12+
OSX 10.5+
Android 4.0+
iOS 6+
Windows Vista+

TLB Timing Side Channel

Measures the timing difference between a TLB hit and a TLB miss by deliberately generating page faults on the addresses being tested (Hund et al.).

The resulting map is a bit “noisy” because each measurement passes through multiple layers of memory management:

User
CPU
TLB
OS exception handling
OS noise (the largest source of error for the mapping algorithm)

DrK Attack

De-randomizing Kernel ASLR

Based on work by Rafal Wojtczuk that leverages the Transactional Synchronization Extensions (TSX) in newer Intel CPU architectures.

Abort handler of TSX:

Suppresses all synchronous exceptions (e.g. page faults)
Does not notify the OS
Drops the measured TLB timing error range from ~4000 cycles to ~180

Attack targets

DrK is a hardware side channel, so the mechanism is OS independent
Popular OSs targeted: Linux, Windows, OSX

Attack types

Type 1: Revealing mapping status of each page
Type 2: Finer grained module detection

The Type 2 attack achieves almost 100% mapping accuracy in under 2 seconds on typical CPUs running almost any OS. Even cloud systems are vulnerable if TSX is enabled on the host CPU, although virtualization adds more noise.
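The two attack types can be sketched as a small post-processing step over per-page timings. This is only an illustration, not the presenter's implementation: it assumes mapped pages probe faster than unmapped ones, and that a module can be recognized by the length of its run of mapped pages. The threshold, the timings and the module names and sizes are all made-up values.

```python
from itertools import groupby

# Illustrative cutoff in cycles; a real attack would calibrate this per CPU.
MAPPED_THRESHOLD = 220


def classify_pages(timings):
    """Type 1: mark a page as mapped (True) when its probe was fast."""
    return [t < MAPPED_THRESHOLD for t in timings]


def mapped_runs(mapped):
    """Return (start_index, length) for each run of consecutive mapped pages."""
    runs, index = [], 0
    for is_mapped, group in groupby(mapped):
        length = len(list(group))
        if is_mapped:
            runs.append((index, length))
        index += length
    return runs


def detect_modules(runs, known_sizes):
    """Type 2: label runs whose page count matches a known module size."""
    by_size = {size: name for name, size in known_sizes.items()}
    return {by_size[length]: start for start, length in runs if length in by_size}


# Made-up probe timings (cycles) for eight consecutive pages.
timings = [240, 205, 207, 241, 206, 204, 208, 243]
mapped = classify_pages(timings)
runs = mapped_runs(mapped)

# Hypothetical modules identified only by their size in pages.
modules = detect_modules(runs, {"mod_a": 2, "mod_b": 3})
print(mapped)
print(runs)
print(modules)
```

Size matching only works when the sizes are distinctive, which is the idea behind the padding countermeasure mentioned later in these notes.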

What about cache coherence?

Intel TLBs are not coherent! If the exploit is context-switched to another core, it doesn’t matter: each core either hits in its own TLB or walks the page tables, resulting in the same kinds of timings.

Controlling noise

Dynamic frequency scaling (SpeedStep, Turbo Boost, etc.) skews the cycle counts read via rdtscp
- Run busy loops ( while(1);) to max out CPU boost
Hardware interrupts and cache conflicts also abort TSX transactions
- Probe each address multiple times (e.g. 2-200) and take the minimum
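The "take the minimum" trick above works because interrupts and cache conflicts only ever add delay, so the fastest sample is the one closest to the undisturbed timing. Here is a minimal Python sketch of that filtering step. The `probe` callable stands in for the real TSX timing measurement, and `make_fake_probe` with its scripted samples is purely hypothetical test scaffolding.

```python
def probe_min(probe, address, attempts=100):
    """Time `address` repeatedly and keep the fastest measurement."""
    return min(probe(address) for _ in range(attempts))


def make_fake_probe(samples):
    """Build a stand-in probe that replays scripted timings (in cycles)."""
    remaining = iter(samples)

    def probe(_address):
        return next(remaining)

    return probe


# Made-up measurements: two are inflated by simulated interrupts.
noisy = [4100, 230, 215, 3900, 209, 260]
fastest = probe_min(
    make_fake_probe(noisy),
    address=0xFFFFFFFF81000000,  # illustrative kernel address
    attempts=len(noisy),
)
print(fastest)
```

Averaging would be a poor choice here, since a single interrupted sample of several thousand cycles would swamp a difference of a few dozen cycles.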

Increasing covertness

OS never sees page faults
- TSX suppresses the exception
Possible traces: performance counters
- High dTLB/iTLB miss counts
- High tx-abort counts

Countermeasures?

Modify the CPU to eliminate timing channels
- CPUs have already shipped
Turn off TSX
- Disabling it in microcode is not an option from software
Use a more coarse-grained timer
Use separate page tables for kernel and user processes
- High overhead due to frequent TLB flushes
Fine-grained randomization
- Difficult to implement, performance degradation
Insert fake mapped/executable pages between the real mappings
- Doesn’t work as well as you’d hope; ASLR, especially on Linux, doesn’t give you enough space to work with
Pad modules to vary offsets, which might make mapping more difficult
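As a rough illustration of the coarse-grained timer idea in the list above, the Python sketch below quantizes timestamps so that two probe durations that differ by a few dozen cycles become indistinguishable. The cycle counts and granularities are made-up values, `coarse_duration` is a hypothetical helper, and the sketch ignores where a probe starts relative to a timer tick.

```python
def quantize(timestamp, granularity):
    """Round a timestamp down to the nearest multiple of `granularity`."""
    return (timestamp // granularity) * granularity


def coarse_duration(start, end, granularity):
    """Duration as observed through a timer that ticks every `granularity` cycles."""
    return quantize(end, granularity) - quantize(start, granularity)


# Made-up probe durations in cycles for a mapped and an unmapped page.
MAPPED_CYCLES, UNMAPPED_CYCLES = 209, 240

# With a cycle-accurate timer the two cases are easy to tell apart.
fine = (coarse_duration(0, MAPPED_CYCLES, 1),
        coarse_duration(0, UNMAPPED_CYCLES, 1))

# With a timer that only ticks every 1000 cycles they look identical.
coarse = (coarse_duration(0, MAPPED_CYCLES, 1000),
          coarse_duration(0, UNMAPPED_CYCLES, 1000))

print(fine, coarse)
```

The trade-off is that legitimate software relying on precise timing loses the same resolution.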

Conclusion

TSX can break the KASLR of commodity OSes
The attack is accurate, fast and covert
The timing side channel is caused by hardware, so it is OS independent
