Introduction
Most people in the game hacking community write their own kernel-mode drivers to get around kernel-level anti-cheats such as EasyAntiCheat.
However, those anti-cheats have several methods to detect cheat drivers. The most common way to load a cheat driver is to manually map it with tools like kdmapper. Unfortunately, a driver that is manually mapped this way runs its code outside of any valid module.
Recommended communication methods like IOCTLs then become unusable, because a dispatch routine outside of a valid module can be detected with a few lines of code.
While reading through Windows Internals Part 1 and reverse engineering nt!KiSystemStartup, I noticed a possible way to abuse a Windows feature to copy shellcode into a valid driver’s normally non-executable .data section and, most importantly, execute it. This concept can potentially be used to avoid detection by game anti-cheats while communicating with kernel-mode.
Check out my lpmapper repository for the finished proof-of-concept.
In this post, I’m going to demonstrate how to make use of this feature to have executable code in kernel-mode without having to allocate any memory. To understand this concept, we will have to take a short look into one of the core data structures of a processor: Page Tables.
Large Pages
Windows makes use of page tables to be able to create separate virtual memory spaces for each context.
I will not go too much into detail about how they work as there are many posts about page tables out there already, such as this one.
However, one important detail that oftentimes isn’t mentioned is the use of large pages.
A common page table structure looks something like this:
As you can see in this image, a Page Table (PT) can hold up to 512 Page Table Entries (PTEs). With each page having a size of 4096 bytes, a single page table maps 2 megabytes (512 * 4096 bytes) of physical memory. This is where large pages come into play. Large pages are supported by x64 processors. Bit 7 (counting from 0) of PDPT entries and PD entries, which is called PageSize, determines whether the entry points to another page table or directly to a physical page of the size that the next-level table would otherwise map.
For example, if the PageSize bit is set on a page-directory entry (PDE), the page frame number of this entry points to a contiguous 2 megabyte physical page instead of pointing to another page table. The R/W and NX bits of the entry that maps the page decide whether the page is writable or executable. These properties apply to the entire page, which means that for normal pages, the smallest protection region you can modify is 4 kilobytes (4096 bytes). For a large page, it is 2 megabytes (512 * 4096 bytes) and for huge pages, it’s 1 gigabyte (512 * 512 * 4096 bytes).
This aspect is going to become important later.
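To make the sizes and bits above concrete, here is a minimal user-mode sketch. It only does the arithmetic and decodes the PageSize, R/W and NX bits of a raw 64-bit entry value. The constant and function names are my own for illustration, and it deliberately ignores that permissions from higher-level entries are combined during a real page walk.

```cpp
#include <cstdint>

// x64 paging with 4 KiB base pages and 512 entries per paging structure.
constexpr std::uint64_t kEntriesPerTable = 512;
constexpr std::uint64_t kSmallPage = 4096;                           // mapped by one PTE
constexpr std::uint64_t kLargePage = kEntriesPerTable * kSmallPage;  // PDE with PageSize set
constexpr std::uint64_t kHugePage  = kEntriesPerTable * kLargePage;  // PDPTE with PageSize set

constexpr std::uint64_t kReadWriteBit = 1ull << 1;   // R/W: set means writable
constexpr std::uint64_t kPageSizeBit  = 1ull << 7;   // PageSize: only meaningful in PDPT/PD entries
constexpr std::uint64_t kNoExecuteBit = 1ull << 63;  // NX: set means not executable

static_assert(kLargePage == 2ull * 1024 * 1024, "a large page is 2 MiB");
static_assert(kHugePage == 1024ull * 1024 * 1024, "a huge page is 1 GiB");

constexpr bool isLargePage(std::uint64_t pdEntry)  { return (pdEntry & kPageSizeBit) != 0; }
constexpr bool isWritable(std::uint64_t entry)     { return (entry & kReadWriteBit) != 0; }
constexpr bool isExecutable(std::uint64_t entry)   { return (entry & kNoExecuteBit) == 0; }

// Smallest region whose protection can be changed below a given PD entry:
// the whole 2 MiB if the entry maps a large page, otherwise a single 4 KiB page.
constexpr std::uint64_t protectionGranularity(std::uint64_t pdEntry) {
    return isLargePage(pdEntry) ? kLargePage : kSmallPage;
}
```

For example, a PD entry value of 0x83 (present, writable, PageSize set, NX clear) describes a 2 MiB page that is both writable and executable as a whole.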
Large pages are used in applications that need to allocate large memory regions and want to be able to access them quicker. Due to the missing step in the virtual to physical translation, the CPU can access large pages faster.
If you want to inspect the page tables on your system live to get a better understanding of page tables I recommend my tool PTView. By default, Windows maps the ntoskrnl.exe and hal.dll images on large pages. You can get their base address from a kernel debugger and enter it into PTView to directly get the large page they are on.
LargePageDrivers
As I mentioned, Windows maps the ntoskrnl.exe image onto a large page. However, if we look into nt!MmLoadSystemImageEx, which eventually gets called in the process of Phase 1 system initialization by nt!IoInitSystem, we will see a function named MiMapSystemImageWithLargePage being called after a check. As the name says, this function is responsible for mapping system drivers on large pages.
MiUseLargeDriverPage takes the DriverName string and returns whether the driver should be loaded on a large page.
I have reverse-engineered the function, the LIST_ENTRY struct, and renamed all variables accordingly.
    struct LARGE_PAGE_DRIVER_LIST_ENTRY
    {
        LARGE_PAGE_DRIVER_LIST_ENTRY *Flink; // next entry (first half of the LIST_ENTRY)
        LARGE_PAGE_DRIVER_LIST_ENTRY *Blink; // previous entry (second half of the LIST_ENTRY)
        UNICODE_STRING Name;                 // driver file name
    };

    bool __stdcall MiUseLargeDriverPage(PCUNICODE_STRING DriverName)
    {
        LARGE_PAGE_DRIVER_LIST_ENTRY *LargePageDriversListEntry; // rbx

        if ( (MiFlags & 0x8000) != 0 || (MiFlags & 0x10000) != 0 )
            return 0;
        if ( MapAllDriversIntoLargePages != 1 )
        {
            // Walk the list until we are back at the list head
            for ( LargePageDriversListEntry = LargePageDriversList;
                  LargePageDriversListEntry != &LargePageDriversList;
                  LargePageDriversListEntry = LargePageDriversListEntry->Flink )
            {
                // case-insensitive comparison
                if ( RtlEqualUnicodeString(DriverName, &LargePageDriversListEntry->Name, 1u) )
                    return 1;
            }
            return 0;
        }
        return 1; // return true if MapAllDriversIntoLargePages is true
    }
This means that to get a driver to load on a large page during boot, we have to make sure it’s in the LargePageDriversList. By listing all cross-references in IDA we can find out that this list is populated inside of nt!MiInitializeDriverImages, which is eventually called from nt!MiInitSystem.
The procedure references the global variable MmLargePageDriverBuffer from the ntoskrnl .INIT section. This variable contains the contents of the HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\LargePageDrivers value from the registry. This value has to be of type multi-string.
It has to contain either the file names of the drivers separated by null terminators, or a *, which serves as a wildcard and sets the flag I have named MapAllDriversIntoLargePages to true. The decompilation of the function shows how the value is parsed.
    // Initialize an empty list: both links of the head point to the head itself
    FirstLargePageDriverEntry = &LargePageDriversList;
    LargePageDriversList = &LargePageDriversList;
    // Skip this part if there are no entries
    if ( MmLargePageDriverBufferLength != -1 )
    {
        StartOfBuffer = &MmLargePageDriverBuffer;
        EndOfBuffer = (&MmLargePageDriverBuffer + 2 * ((MmLargePageDriverBufferLength - 2) >> 1));
        if ( &MmLargePageDriverBuffer < EndOfBuffer )
        {
            // bitmask of separator characters: NUL, \t, \n, \r and space
            whitespaces = 0x100002601i64;
            do
            {
                currentChar = *StartOfBuffer;
                // check if the current char is a separator
                if ( currentChar <= ' '
                  && _bittest64(&whitespaces, currentChar)
                  || currentChar == 0 )
                {
                    currentChar_1 = StartOfBuffer;
                }
                else
                {
                    if ( currentChar == '*' ) // * for wildcard
                    {
                        MapAllDriversIntoLargePages = 1;
                        break;
                    }
                    // scan forward to the end of the current name
                    for ( currentChar_1 = StartOfBuffer; currentChar_1 < EndOfBuffer; ++currentChar_1 )
                    {
                        currentCharValue = *currentChar_1;
                        if ( currentCharValue <= ' ' && _bittest64(&whitespaces, currentCharValue) )
                            break;
                        if ( currentCharValue == 0 )
                            break;
                    }
                    // allocate some memory for the new entry (size 0x20, pool tag 'MmLp')
                    NewLargePageDriverEntry = MiAllocatePool(0x40, 0x20, 0x704C6D4Du);
                    if ( !NewLargePageDriverEntry )
                        break;
                    // initialize entry; the name points directly into the buffer
                    DriverNameLength = 2 * (currentChar_1 - StartOfBuffer);
                    NewLargePageDriverEntry->Name.Buffer = StartOfBuffer;
                    NewLargePageDriverEntry->Name.Length = DriverNameLength;
                    NewLargePageDriverEntry->Name.MaximumLength = DriverNameLength;
                    OldEntry = FirstLargePageDriverEntry;
                    if ( *FirstLargePageDriverEntry != &LargePageDriversList )
                        __fastfail(3u); // list corruption check
                    // Link the new entry in at the end of the list
                    NewLargePageDriverEntry->Flink = &LargePageDriversList;
                    *&NewLargePageDriverEntry->Blink = OldEntry;
                    whitespaces = 0x100002601i64; // the decompiler reloads the mask here
                    *OldEntry = NewLargePageDriverEntry;
                    // the new entry is now the last entry of the list
                    FirstLargePageDriverEntry = NewLargePageDriverEntry;
                }
                StartOfBuffer = currentChar_1 + 1;
            }
            while ( currentChar_1 + 1 < EndOfBuffer );
        }
    }
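The parsing logic is easier to follow outside of decompiler output. The following is a rough user-mode sketch of the same idea, not the kernel's code: it splits a buffer on the separator set encoded in the 0x100002601 mask (bits 0, 9, 10, 13 and 32, i.e. NUL, tab, line feed, carriage return and space), handles the * wildcard, and then answers the same question as MiUseLargeDriverPage. All type and function names are mine, the name comparison only folds ASCII case, and the MiFlags checks are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct LargePageDriverConfig {
    bool mapAllDrivers = false;         // set by the '*' wildcard
    std::vector<std::u16string> names;  // individual driver file names
};

// Same mask as in the decompilation: NUL, '\t', '\n', '\r' and ' '.
constexpr std::uint64_t kSeparatorMask = 0x100002601ull;

inline bool isSeparator(char16_t c) {
    return c <= u' ' && ((kSeparatorMask >> c) & 1) != 0;
}

LargePageDriverConfig parseLargePageDrivers(const std::u16string& buffer) {
    LargePageDriverConfig config;
    std::size_t pos = 0;
    while (pos < buffer.size()) {
        if (isSeparator(buffer[pos])) { ++pos; continue; }
        if (buffer[pos] == u'*') { config.mapAllDrivers = true; break; }
        std::size_t end = pos;  // scan to the end of the current name
        while (end < buffer.size() && !isSeparator(buffer[end])) ++end;
        assert(end > pos);      // a name always has at least one character
        config.names.emplace_back(buffer, pos, end - pos);
        pos = end + 1;
    }
    return config;
}

inline char16_t foldAscii(char16_t c) {
    return (c >= u'A' && c <= u'Z') ? static_cast<char16_t>(c + 32) : c;
}

bool equalsIgnoreCase(const std::u16string& a, const std::u16string& b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (foldAscii(a[i]) != foldAscii(b[i])) return false;
    return true;
}

// Mirrors the decision of MiUseLargeDriverPage without the MiFlags checks.
bool useLargeDriverPage(const LargePageDriverConfig& config, const std::u16string& driverName) {
    if (config.mapAllDrivers) return true;
    for (const std::u16string& name : config.names)
        if (equalsIgnoreCase(name, driverName)) return true;
    return false;
}
```

Feeding it the buffer `beep.sys\0null.sys\0` yields two names, and a lookup for `BEEP.SYS` succeeds because the comparison ignores case.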
Now we reach the important part. Previously we learned that page protection applies to the entire page. Let’s say we load beep.sys onto a large page now. Its read-only .text section only has the size of a single page.
Normally the image loader would simply map the .text section onto a single page and make it write-protected. The .data section also gets its own page, which is then writable, but not executable.
However, since the loader is now forced to place both the .text and .data sections onto the same page, those sections will be both writable and executable.
Note that all of this is achieved without any page table manipulation and is a legitimate Windows feature.
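The effect can be illustrated with a toy model. The sketch below is not how the memory manager works internally; it only assumes that every section sharing a page with the target ends up with the union of the requested protections, which is the outcome described above. The section layout is hypothetical and does not reflect the real beep.sys.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct Section {
    std::string name;
    std::uint64_t rva;   // offset from the (page-aligned) image base
    std::uint64_t size;  // in bytes
    bool writable;
    bool executable;
};

struct Protection {
    bool writable = false;
    bool executable = false;
};

constexpr std::uint64_t kSmallPage = 4096;
constexpr std::uint64_t kLargePage = 512 * kSmallPage;

// Hypothetical layout for illustration only.
std::vector<Section> hypotheticalLayout() {
    return {
        {".text",  0x1000, 0x1000, false, true},
        {".rdata", 0x2000, 0x1000, false, false},
        {".data",  0x3000, 0x1000, true,  false},
    };
}

// Union of the protections of all sections that share at least one page
// with the named section, for the given page size.
Protection effectiveProtection(const std::vector<Section>& sections,
                               const std::string& targetName,
                               std::uint64_t pageSize) {
    assert(pageSize != 0);
    const Section* target = nullptr;
    for (const Section& s : sections)
        if (s.name == targetName) target = &s;
    if (target == nullptr || target->size == 0) return {};

    const std::uint64_t firstPage = target->rva / pageSize;
    const std::uint64_t lastPage = (target->rva + target->size - 1) / pageSize;

    Protection result;
    for (const Section& s : sections) {
        if (s.size == 0) continue;
        const std::uint64_t sFirst = s.rva / pageSize;
        const std::uint64_t sLast = (s.rva + s.size - 1) / pageSize;
        if (sFirst <= lastPage && firstPage <= sLast) {  // page ranges overlap
            result.writable = result.writable || s.writable;
            result.executable = result.executable || s.executable;
        }
    }
    return result;
}
```

With 4 KiB pages, .data in this layout stays writable but not executable. With a single 2 MiB page covering the whole image, both .text and .data come out writable and executable.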
While modifying the normally read-only .text section can still easily be detected by comparing the image in memory with the file on disk, we can freely modify the .data section and write our shellcode into it, which can now be executed.
Finally, since the shellcode remains inside of the driver’s bounds, I can directly point the driver’s Device-IO dispatch to the shellcode location inside of the .data section and call it from user-mode via DeviceIoControl.
I will demonstrate this using the beep.sys driver, which is responsible for handling the Beep API function.
Implementation
You can find the full implementation of this in the lpmapper repository. The lpmapper project will map the shellcode into beep.sys by default. This can easily be modified to use any other driver. If you want to test it, run the lpmapper-test project. Make sure that you have added beep.sys to LargePageDrivers in the registry key mentioned above.
Note that this concept can be abused in many different ways. For example, you could also just place the shellcode in a third-party driver and have it jump into your manually mapped driver.
However, I wanted to demonstrate a concept that does not require any memory allocation at all. The idea is to write a simple Device-IO dispatch handler, shrink it down to the essential part and copy only the function’s instruction bytes into the .data section. Finally, I will assign the beep.sys driver dispatch to that shellcode.
First of all, we have to make sure the HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\ registry key contains a multi-string value named LargePageDrivers. We are going to set it to either beep.sys, or * if you want to load all drivers onto large pages (which would be kind of wasteful).
If we don’t do that, we are going to receive an ATTEMPTED_WRITE_TO_READONLY_MEMORY bluescreen, since lpmapper doesn’t check whether the section is on a large page and is writable.
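If you prefer to set the value programmatically, the data has to follow the multi-string (REG_MULTI_SZ) layout: every string is followed by a null terminator and the whole list ends with one additional null. The helper below is a small portable sketch of that layout; buildMultiSz and multiSzByteCount are hypothetical names of mine. On Windows, the resulting buffer and byte count are what you would hand to RegSetValueExW together with the REG_MULTI_SZ type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Builds the payload of a multi-string registry value: each entry is
// null-terminated, and one extra null terminates the list.
std::u16string buildMultiSz(const std::vector<std::u16string>& entries) {
    std::u16string out;
    for (const std::u16string& entry : entries) {
        // Entries must be non-empty and must not contain embedded nulls.
        assert(!entry.empty() && entry.find(u'\0') == std::u16string::npos);
        out += entry;
        out.push_back(u'\0');
    }
    out.push_back(u'\0');
    return out;
}

// Size in bytes of the payload, including all terminators.
std::size_t multiSzByteCount(const std::u16string& payload) {
    return payload.size() * sizeof(char16_t);
}
```

For a single beep.sys entry this produces 10 UTF-16 code units (8 characters plus two terminators), i.e. 20 bytes. A reboot is needed afterwards, since the value is only read during system initialization.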
I started by writing the dispatch handler in C++ since I have learned that compilers produce much better machine code than humans. The handler itself is very simple and supports 3 operations: reading cr3, getting a process’s main module base address, and arbitrarily reading/writing process memory.
    NTSTATUS DeviceIOControlHandler(PDEVICE_OBJECT device, PIRP irp)
    {
        PIO_STACK_LOCATION irpStack = IoGetCurrentIrpStackLocation(irp);
        // With METHOD_NEITHER the raw user-mode pointers are passed through
        auto inputBuffer = irpStack->Parameters.DeviceIoControl.Type3InputBuffer;
        auto outputBuffer = irp->UserBuffer;

        switch (irpStack->Parameters.DeviceIoControl.IoControlCode)
        {
        case IOCTL_RDCR3:
            if (outputBuffer)
            {
                *(uint64_t*)outputBuffer = __readcr3();
            }
            break;
        case IOCTL_COPY:
            if (inputBuffer)
            {
                memory_copy* data = (memory_copy*)inputBuffer;
                PEPROCESS process = 0;
                PsLookupProcessByProcessId((HANDLE)data->processId, &process);
                if (process)
                {
                    // write: caller -> target process, read: target process -> caller
                    PEPROCESS sourceProcess = data->write ? IoGetCurrentProcess() : process;
                    PEPROCESS targetProcess = data->write ? process : IoGetCurrentProcess();
                    size_t dummy = 0;
                    MmCopyVirtualMemory(sourceProcess, data->sourceAddress, targetProcess, data->targetAddress, data->size, KernelMode, &dummy);
                    ObDereferenceObject(process);
                }
            }
            break;
        case IOCTL_PROCESS_BASE:
            if (inputBuffer && outputBuffer)
            {
                HANDLE processId = *(HANDLE*)inputBuffer;
                PEPROCESS process = 0;
                PsLookupProcessByProcessId(processId, &process);
                if (process)
                {
                    auto base = PsGetProcessSectionBaseAddress(process);
                    *(PVOID*)outputBuffer = base;
                    ObDereferenceObject(process);
                }
            }
            break;
        default:
            // Everything else is forwarded to the original beep.sys handler
            return OriginalDispatch(device, irp);
        }

        irp->IoStatus.Information = 0;
        irp->IoStatus.Status = STATUS_SUCCESS;
        IofCompleteRequest(irp, IO_NO_INCREMENT);
        return STATUS_SUCCESS;
    }
I wanted to keep this handler as simple as possible, which is why it doesn’t check NTSTATUS results, for example. This is up to you to implement.
After writing the handler I used my ShellcodeBakery tool to get the shellcode from the compiled binary and display it as a C++ array ready to copy into a source file.
However, the code calls a few imports. Usually, those imports are located inside of the IAT of the driver image. I won’t be mapping the entire driver image though, because I can only fit 4096 bytes into the beep.sys .data section and the driver image would be spread out across a few pages.
This is why I made another tool that builds a “custom” import address table at the end of the shellcode and relocates all import calls in the shellcode to their appropriate IAT entry.
I did the same for the call to OriginalDispatch. You can find that function table in the source code here. This table gets populated with the import addresses at runtime, before it is copied into kernel-mode.
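To give an idea of what such a table at the end of the shellcode involves, here is a small portable sketch. It is not the code of my tool, and the layout is an assumption made for illustration: the blob ends with a run of 8-byte slots, and imports are reached through 6-byte `call qword ptr [rip+disp32]` instructions (opcode bytes FF 15), whose displacement is relative to the end of the instruction. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Rewrites the displacement of an indirect RIP-relative call located at
// callOffset so that it loads its target from the slot at slotOffset.
void retargetIndirectCall(std::vector<std::uint8_t>& code,
                          std::size_t callOffset, std::size_t slotOffset) {
    assert(callOffset + 6 <= code.size() && slotOffset + 8 <= code.size());
    assert(code[callOffset] == 0xFF && code[callOffset + 1] == 0x15);
    // disp32 is measured from the address of the next instruction.
    const std::int32_t disp = static_cast<std::int32_t>(
        static_cast<std::int64_t>(slotOffset) -
        static_cast<std::int64_t>(callOffset + 6));
    std::memcpy(&code[callOffset + 2], &disp, sizeof(disp));
}

// Fills the slots occupying the tail of the blob with resolved addresses.
void fillFunctionTable(std::vector<std::uint8_t>& code,
                       const std::vector<std::uint64_t>& addresses) {
    const std::size_t tableBytes = addresses.size() * sizeof(std::uint64_t);
    assert(tableBytes <= code.size());
    const std::size_t tableStart = code.size() - tableBytes;
    for (std::size_t i = 0; i < addresses.size(); ++i)
        std::memcpy(&code[tableStart + i * sizeof(std::uint64_t)],
                    &addresses[i], sizeof(std::uint64_t));
}

// Reads slot `index` back from a table of `slotCount` slots at the tail.
std::uint64_t readSlot(const std::vector<std::uint8_t>& code,
                       std::size_t slotCount, std::size_t index) {
    assert(index < slotCount && slotCount * sizeof(std::uint64_t) <= code.size());
    std::uint64_t value = 0;
    const std::size_t tableStart = code.size() - slotCount * sizeof(std::uint64_t);
    std::memcpy(&value, &code[tableStart + index * sizeof(std::uint64_t)], sizeof(value));
    return value;
}
```

Because the calls are position-independent relative to the table, the displacements can be fixed once at build time, and only the slot contents have to be filled in on the target machine.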
I used kdmapper’s intel_driver library to access kernel memory because it already had lots of useful functions implemented, such as GetKernelModuleExport.
First, lpmapper will try to find the beep.sys module and its DriverObject.
I get the module address from GetKernelModuleExport. To get the DriverObject, I created a new function in kdmapper called CallNtosExport. This function calls IoGetDeviceObjectPointer to get the Beep DeviceObject. The DeviceObject holds the DriverObject. This happens here.
After that, I get the original driver dispatch from DriverObject->MajorFunction[14], which is the IRP_MJ_DEVICE_CONTROL slot. This address is stored in the previously mentioned function table of the shellcode, along with all other needed imports, which I can get using GetKernelModuleExport. The code responsible for this is located here.
Finally, the shellcode is copied into the beep.sys .data section here.
In the final step, lpmapper sets the DriverObject’s Device-IO dispatch to the shellcode location here.
Testing
After running lpmapper you can now test the concept with the lpmapper-test project. This project also displays how to interact with the dispatch handler.
The following code contains a reference to a handle to the Beep driver. You can acquire this handle using CreateFile:
    HANDLE beepHandle = CreateFileW(L"\\\\.\\GLOBALROOT\\Device\\Beep", FILE_ANY_ACCESS, 0,
                                    nullptr, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
I’ll briefly go over how the 3 supported operations are implemented.
Read CR3
This operation is tested in TestReadCr3.
It writes the value of the current CPU’s cr3 register, which is the DirectoryTableBase of the current process, into the output buffer.
    const ULONG IOCTL_READCR3 = CTL_CODE(0x8000, 0x802, METHOD_NEITHER, FILE_ANY_ACCESS);

    uint64_t cr3 = 0;
    DWORD bytesReturned = 0; // documented as required when no OVERLAPPED is passed
    bool success = DeviceIoControl(beepHandle, IOCTL_READCR3,
                                   nullptr, 0,
                                   &cr3, sizeof(uint64_t),
                                   &bytesReturned, nullptr);
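If you are wondering what CTL_CODE actually produces, the following portable sketch re-implements the macro's bit layout (device type in bits 16 and up, required access in bits 14 and 15, function number in bits 2 to 13, transfer method in bits 0 and 1) and evaluates it for the three control codes used in this post. The lower-case names are mine. The values of METHOD_NEITHER (3) and FILE_ANY_ACCESS (0) match the Windows headers.

```cpp
#include <cstdint>

constexpr std::uint32_t kMethodNeither = 3;  // METHOD_NEITHER
constexpr std::uint32_t kFileAnyAccess = 0;  // FILE_ANY_ACCESS

// Same bit layout as the CTL_CODE macro.
constexpr std::uint32_t ctlCode(std::uint32_t deviceType, std::uint32_t function,
                                std::uint32_t method, std::uint32_t access) {
    return (deviceType << 16) | (access << 14) | (function << 2) | method;
}

// Helpers to take a control code apart again.
constexpr std::uint32_t deviceTypeOf(std::uint32_t code) { return code >> 16; }
constexpr std::uint32_t functionOf(std::uint32_t code)   { return (code >> 2) & 0xFFF; }
constexpr std::uint32_t methodOf(std::uint32_t code)     { return code & 3; }

constexpr std::uint32_t kIoctlProcessBase = ctlCode(0x8000, 0x800, kMethodNeither, kFileAnyAccess);
constexpr std::uint32_t kIoctlCopy        = ctlCode(0x8000, 0x801, kMethodNeither, kFileAnyAccess);
constexpr std::uint32_t kIoctlReadCr3     = ctlCode(0x8000, 0x802, kMethodNeither, kFileAnyAccess);

static_assert(kIoctlProcessBase == 0x80002003, "IOCTL_PROCESS_BASE");
static_assert(kIoctlCopy == 0x80002007, "IOCTL_COPY");
static_assert(kIoctlReadCr3 == 0x8000200B, "IOCTL_READCR3");
```

The choice of METHOD_NEITHER matters here: with this method the I/O manager does not buffer or lock anything and passes the user-mode pointers through unchanged, which is why the handler reads them from Type3InputBuffer and UserBuffer.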
Getting a process’s main module base
This operation is tested in TestProcessBase.
After passing a process ID in the input buffer, it returns that process’s main module base address in the output buffer.
    const ULONG IOCTL_PROCESS_BASE = CTL_CODE(0x8000, 0x800, METHOD_NEITHER, FILE_ANY_ACCESS);

    uint64_t processBase = 0;
    uint64_t processId = GetCurrentProcessId();
    DWORD bytesReturned = 0; // documented as required when no OVERLAPPED is passed

    bool success = DeviceIoControl(beepHandle, IOCTL_PROCESS_BASE,
                                   &processId, sizeof(uint64_t),
                                   &processBase, sizeof(uint64_t),
                                   &bytesReturned, nullptr);
Reading and writing process memory
This operation is tested in TestMemoryRead.
You have to initialize a memory_copy struct, which has to be passed in the input buffer.
The processId member of that struct holds the target process. The write member decides whether the procedure reads from the target process or writes to it. sourceAddress and targetAddress have to be set accordingly. The size member determines the number of bytes to copy.
    struct memory_copy
    {
        uint64_t processId;
        PVOID sourceAddress;
        PVOID targetAddress;
        BOOL write;
        SIZE_T size;
    };

    const ULONG IOCTL_COPY = CTL_CODE(0x8000, 0x801, METHOD_NEITHER, FILE_ANY_ACCESS);

    char buffer[3]; // buffer for "MZ" + '\0'

    memory_copy data = {};
    data.processId = GetCurrentProcessId();
    data.targetAddress = buffer;
    data.sourceAddress = (PVOID)baseAddress; // e.g. the base returned by IOCTL_PROCESS_BASE
    data.write = FALSE;                      // read from the target process
    data.size = 2;                           // only read "MZ" into the buffer
    buffer[2] = 0;                           // set null terminator after "MZ"

    DWORD bytesReturned = 0; // documented as required when no OVERLAPPED is passed
    bool success = DeviceIoControl(beepHandle, IOCTL_COPY,
                                   &data, sizeof(memory_copy),
                                   nullptr, 0,
                                   &bytesReturned, nullptr);
Detection
I have tested this on an EasyAntiCheat-protected game for a substantial amount of time and did not run into a ban, but that is not enough data, since EasyAntiCheat is known for not always banning a player when it detects something.
Since LargePageDrivers is an in-house Windows feature, the best the anti-cheat can do is to give you a little flag for having a driver mapped in such a way.
I’m going to discuss the ways this could potentially be detected, and what can be done to prevent that from happening.
If you noticed any detection vector that I missed feel free to contact me on Discord or Twitter about it.
Dispatch Hooks
In my example, I hooked the driver’s dispatch. You might be wondering why I can’t just point it to my driver residing inside of a pool allocation.
This is because most modern anti-cheats check whether the dispatch function is located within the driver’s bounds.
Such a check could look something like this:
    PDRIVER_OBJECT diskDriver = /* get the DriverObject */;
    ULONG_PTR driverDispatch = (ULONG_PTR)diskDriver->MajorFunction[IRP_MJ_DEVICE_CONTROL];
    ULONG_PTR driverStart = (ULONG_PTR)diskDriver->DriverStart;

    if (driverDispatch >= driverStart + diskDriver->DriverSize ||
        driverDispatch < driverStart)
    {
        // Take action
    }
The most common anti-cheats have been doing this for ages.
The EAC-Reversing repository showcases how EasyAntiCheat was doing it at least as far back as 2019.
For BattlEye, I analyzed the recently released full bedaisy.sys dump posted on unknowncheats.me by anypot.
I found a few of those checks. They could be self-integrity checks, but the principle is the same:
    for ( majorFunctionIndex = 0; majorFunctionIndex < 0x1C; ++majorFunctionIndex )
    {
        majorFunction = DriverObject->MajorFunction[majorFunctionIndex];
        if ( majorFunction )
        {
            DriverStart = DriverObject->DriverStart;
            if ( majorFunction < DriverStart || majorFunction >= DriverStart + DriverObject->DriverSize )
            {
                // take action
            }
        }
    }
If you take a look at those, you will notice that this PoC passes those checks, since the dispatch still points to an address inside of the driver’s bounds.
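The difference between the two situations can be shown with plain numbers. The sketch below models the bounds check in portable C++. The addresses are made up for illustration and do not come from a real system, and ImageRange and isInsideImage are my own names.

```cpp
#include <cstdint>

struct ImageRange {
    std::uint64_t start;  // DriverStart
    std::uint64_t size;   // DriverSize
};

// True if the address lies in [start, start + size). Written so that it
// cannot overflow for addresses near the top of the address space.
constexpr bool isInsideImage(std::uint64_t address, ImageRange image) {
    return address >= image.start && address - image.start < image.size;
}

// Hypothetical values for illustration only.
constexpr ImageRange kHypotheticalBeep = {0xFFFFF80012340000ull, 0x8000};
constexpr std::uint64_t kShellcodeInData = kHypotheticalBeep.start + 0x3000;  // inside .data
constexpr std::uint64_t kHandlerInPool   = 0xFFFFA00000001000ull;             // pool allocation

static_assert(isInsideImage(kShellcodeInData, kHypotheticalBeep),
              "dispatch in .data passes the bounds check");
static_assert(!isInsideImage(kHandlerInPool, kHypotheticalBeep),
              "dispatch in a pool allocation fails the bounds check");
```

A dispatch pointer into the driver's own .data section is indistinguishable from a legitimate one as far as this check is concerned.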
A way to detect this project specifically would be to parse the debug symbols for beep.sys and check whether the dispatch is pointing to the correct dispatch handler. However, you can easily modify this project to use any third-party driver that does not have symbols available.
Another way to at least flag this is to check whether the dispatch is pointing into an executable section using the PE header. The problem with that, once again, is that it is technically not illegal to have the dispatch pointing into the .data section. It’s not common, but a reputable anti-cheat should be careful about taking action based on such flags. If you use techniques to obfuscate and encrypt the shellcode, it could potentially become even harder to detect reliably.
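Such a PE-header-based flag could be sketched as follows. This is only an illustration of the idea, not code taken from any anti-cheat: it looks up the section containing the dispatch RVA and tests the IMAGE_SCN_MEM_EXECUTE characteristic (0x20000000). The section table is hypothetical, and a real implementation would read the section headers from the mapped image.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::uint32_t kImageScnMemExecute = 0x20000000;  // IMAGE_SCN_MEM_EXECUTE
constexpr std::uint32_t kImageScnMemRead    = 0x40000000;  // IMAGE_SCN_MEM_READ
constexpr std::uint32_t kImageScnMemWrite   = 0x80000000;  // IMAGE_SCN_MEM_WRITE

struct SectionInfo {
    const char* name;
    std::uint32_t virtualAddress;   // RVA of the section
    std::uint32_t virtualSize;
    std::uint32_t characteristics;
};

enum class DispatchLocation { OutsideSections, ExecutableSection, NonExecutableSection };

// Classifies an RVA by the section header that contains it.
template <std::size_t N>
constexpr DispatchLocation classifyDispatch(const SectionInfo (&sections)[N], std::uint32_t rva) {
    for (std::size_t i = 0; i < N; ++i) {
        const SectionInfo& s = sections[i];
        if (rva >= s.virtualAddress && rva - s.virtualAddress < s.virtualSize) {
            return (s.characteristics & kImageScnMemExecute) != 0
                       ? DispatchLocation::ExecutableSection
                       : DispatchLocation::NonExecutableSection;
        }
    }
    return DispatchLocation::OutsideSections;
}

// Hypothetical section table for illustration only.
constexpr SectionInfo kHypotheticalSections[] = {
    {".text", 0x1000, 0x1000, kImageScnMemExecute | kImageScnMemRead},
    {".data", 0x3000, 0x1000, kImageScnMemRead | kImageScnMemWrite},
};
```

With this table, an RVA of 0x3010 is classified as NonExecutableSection, which is exactly the kind of result an anti-cheat could treat as a flag rather than as proof.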
Stack walking
Stack walking is an often-discussed detection method used by anti-cheats. It works by delivering APCs to all threads, getting their contexts, and checking the return addresses on the stack. Those are then used to determine whether the thread has been executing code outside of a valid module.
However, in my example, the thread never leaves the beep.sys or ntoskrnl.exe module. It is running inside of the .data section though, which could lead to a flag if explicitly checked for.
NMI-Callbacks
Similar to stack walking, NMI callbacks interrupt your thread midway and check where and what it is currently executing.
This could lead to a potential detection if the anti-cheat finds your thread executing inside of the .data section.
Conclusion
It has been a pleasure researching this feature. I highly recommend reading through Windows Internals Part 1, as it is a good book and can spark a few ideas.
If you found a mistake in this post or noticed a critical fact that I missed, please contact me over Discord, Twitter, or any channel you find. I highly appreciate any critical feedback I can get.