How can I close a handle in another process?

Many of you are probably familiar with Process Explorer‘s ability to close a handle in any process. How can this be accomplished programmatically?

The standard CloseHandle function can close a handle in the current process only, and most of the time that’s a good thing. But what if you need, for whatever reason, to close a handle in another process?

There are two routes than can be taken here. The first one is using a kernel driver. If this is a viable option, then nothing can prevent you from doing the deed. Process Explorer uses that option, since it has a kernel driver (if launched with admin priveleges at least once). In this post, I will focus on user mode options, some of which are applicable to kernel mode as well.

The first issue to consider is how to locate the handle in question, since its value is unknown in advance. There must be some criteria for which you know how to identify the handle once you stumble upon it. The easiest (and probably most common) case is a handle to a named object.

Let take a concrete example, which I believe is now a classic, Windows Media Player. Regardless of what opnions you may have regarding WMP, it still works. One of it quirks, is that it only allows a single instance of itself to run. This is accomplished by the classic technique of creating a named mutex when WMP comes up, and if it turns out the named mutex already exists (presumabley created by an already existing instance of itself), send a message to its other instance and then exits.

The following screenshot shows the handle in question in a running WMP instance.

wmp1

This provides an opportunity to close that mutex’ handle “behind WMP’s back” and then being able to launch another instance. You can try this by manually closing the handle with Process Explorer and then launch another WMP instance successfully.

If we want to achieve this programmatically, we have to locate the handle first. Unfortunately, the documented Windows API does not provide a way to enumerate handles, not even in the current process. We have to go with the (officially undocumented) Native API if we want to enumerate handles. There two routes we can use:

  1. Enumerate all handles in the system with NtQuerySystemInformation, search for the handle in the PID of WMP.
  2. Enumerate all handles in the WMP process only, searching for the handle yet again.
  3. Inject code into the WMP process to query handles one by one, until found.

Option 3 requires code injection, which can be done by using the CreateRemoteThreadEx function, but requires a DLL that we inject. This technique is very well-known, so I won’t repeat it here. It has the advantage of not requring some of the native APIs we’ll be using shortly.

Options 1 and 2 look very similar, and for our purposes, they are. Option 1 retrieves too much information, so it’s probably better to go with option 2.

Let’s start at the beginning: we need to locate the WMP process. Here is a function to do that, using the Toolhelp API for process enumeration:

#include <windows.h>
#include <TlHelp32.h>
#include <stdio.h>

DWORD FindMediaPlayer() {
	HANDLE hSnapshot = ::CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
	if (hSnapshot == INVALID_HANDLE_VALUE)
		return 0;

	PROCESSENTRY32 pe;
	pe.dwSize = sizeof(pe);

	// skip the idle process
	::Process32First(hSnapshot, &pe);
	
	DWORD pid = 0;
	while (::Process32Next(hSnapshot, &pe)) {
		if (::_wcsicmp(pe.szExeFile, L"wmplayer.exe") == 0) {
			// found it!
			pid = pe.th32ProcessID;
			break;
		}
	}
	::CloseHandle(hSnapshot);
	return pid;
}


int main() {
	DWORD pid = FindMediaPlayer();
	if (pid == 0) {
		printf("Failed to locate media player\n");
		return 1;
	}
	printf("Located media player: PID=%u\n", pid);
	return 0;
}

Now that we have located WMP, let’s get all handles in that process. The first step is opening a handle to the process with PROCESS_QUERY_INFORMATION and PROCESS_DUP_HANDLE (we’ll see why that’s needed in a little bit):

HANDLE hProcess = ::OpenProcess(PROCESS_QUERY_INFORMATION | PROCESS_DUP_HANDLE,
	FALSE, pid);
if (!hProcess) {
	printf("Failed to open WMP process handle (error=%u)\n",
		::GetLastError());
	return 1;
}

If we can’t open a proper handle, then something is terribly wrong. Maybe WMP closed in the meantime?

Now we need to work with the native API to query the handles in the WMP process. We’ll have to bring in some definitions, which you can find in the excellent phnt project on Github (I added extern "C" declaration because we use a C++ file).

#include <memory>

#pragma comment(lib, "ntdll")

#define NT_SUCCESS(status) (status >= 0)

#define STATUS_INFO_LENGTH_MISMATCH      ((NTSTATUS)0xC0000004L)

enum PROCESSINFOCLASS {
	ProcessHandleInformation = 51
};

typedef struct _PROCESS_HANDLE_TABLE_ENTRY_INFO {
	HANDLE HandleValue;
	ULONG_PTR HandleCount;
	ULONG_PTR PointerCount;
	ULONG GrantedAccess;
	ULONG ObjectTypeIndex;
	ULONG HandleAttributes;
	ULONG Reserved;
} PROCESS_HANDLE_TABLE_ENTRY_INFO, * PPROCESS_HANDLE_TABLE_ENTRY_INFO;

// private
typedef struct _PROCESS_HANDLE_SNAPSHOT_INFORMATION {
	ULONG_PTR NumberOfHandles;
	ULONG_PTR Reserved;
	PROCESS_HANDLE_TABLE_ENTRY_INFO Handles[1];
} PROCESS_HANDLE_SNAPSHOT_INFORMATION, * PPROCESS_HANDLE_SNAPSHOT_INFORMATION;

extern "C" NTSTATUS NTAPI NtQueryInformationProcess(
	_In_ HANDLE ProcessHandle,
	_In_ PROCESSINFOCLASS ProcessInformationClass,
	_Out_writes_bytes_(ProcessInformationLength) PVOID ProcessInformation,
	_In_ ULONG ProcessInformationLength,
	_Out_opt_ PULONG ReturnLength);

The #include <memory> is for using unique_ptr<> as we’ll do soon enough. The #parma links the NTDLL import library so that we don’t get an “unresolved external” when calling NtQueryInformationProcess. Some people prefer getting the functions address with GetProcAddress so that linking with the import library is not necessary. I think using GetProcAddress is important when using a function that may not exist on the system it’s running on, otherwise the process will crash at startup, when the loader (code inside NTDLL.dll) tries to locate a function. It does not care if we check dynamically whether to use the function or not – it will crash. Using GetProcAddress will just fail and the code can handle it. In our case, NtQueryInformationProcess existed since the first Windows NT version, so I chose to go with the simplest route.

Our next step is to enumerate the handles with the process information class I plucked from the full list in the phnt project (ntpsapi.h file):

ULONG size = 1 << 10;
std::unique_ptr<BYTE[]> buffer;
for (;;) {
	buffer = std::make_unique<BYTE[]>(size);
	auto status = ::NtQueryInformationProcess(hProcess, ProcessHandleInformation, 
		buffer.get(), size, &size);
	if (NT_SUCCESS(status))
		break;
	if (status == STATUS_INFO_LENGTH_MISMATCH) {
		size += 1 << 10;
		continue;
	}
	printf("Error enumerating handles\n");
	return 1;
}

The Query* style functions in the native API request a buffer and return STATUS_INFO_LENGTH_MISMATCH if it’s not large enough or not of the correct size. The code allocates a buffer with make_unique<BYTE[]> and tries its luck. If the buffer is not large enough, it receives back the required size and then reallocates the buffer before making another call.

Now we need to step through the handles, looking for our mutex. The information returned from each handle does not include the object’s name, which means we have to make yet another native API call, this time to NtQyeryObject along with some extra required definitions:

typedef enum _OBJECT_INFORMATION_CLASS {
	ObjectNameInformation = 1
} OBJECT_INFORMATION_CLASS;

typedef struct _UNICODE_STRING {
	USHORT Length;
	USHORT MaximumLength;
	PWSTR  Buffer;
} UNICODE_STRING;

typedef struct _OBJECT_NAME_INFORMATION {
	UNICODE_STRING Name;
} OBJECT_NAME_INFORMATION, * POBJECT_NAME_INFORMATION;

extern "C" NTSTATUS NTAPI NtQueryObject(
	_In_opt_ HANDLE Handle,
	_In_ OBJECT_INFORMATION_CLASS ObjectInformationClass,
	_Out_writes_bytes_opt_(ObjectInformationLength) PVOID ObjectInformation,
	_In_ ULONG ObjectInformationLength,
	_Out_opt_ PULONG ReturnLength);

NtQueryObject has several information classes, but we only need the name. But what handle do we provide NtQueryObject? If we were going with option 3 above and inject code into WMP’s process, we could loop with handle values starting from 4 (the first legal handle) and incrementing the loop handle by four.

Here we are in an external process, so handing out the handles provided by NtQueryInformationProcess does not make sense. What we have to do is duplicate each handle into our own process, and then make the call. First, we set up a loop for all handles and duplicate each one:

auto info = reinterpret_cast<PROCESS_HANDLE_SNAPSHOT_INFORMATION*>(buffer.get());
for (ULONG i = 0; i < info->NumberOfHandles; i++) {
	HANDLE h = info->Handles[i].HandleValue;
	HANDLE hTarget;
	if (!::DuplicateHandle(hProcess, h, ::GetCurrentProcess(), &hTarget, 
		0, FALSE, DUPLICATE_SAME_ACCESS))
		continue;	// move to next handle
	}

We duplicate the handle from WMP’s process (hProcess) to our own process. This function requires the handle to the process opened with PROCESS_DUP_HANDLE.

Now for the name: we need to call NtQueryObject with our duplicated handle and buffer that should be filled with UNICODE_STRING and whatever characters make up the name.

BYTE nameBuffer[1 << 10];
auto status = ::NtQueryObject(hTarget, ObjectNameInformation, 
	nameBuffer, sizeof(nameBuffer), nullptr);
::CloseHandle(hTarget);
if (!NT_SUCCESS(status))
	continue;

Once we query for the name, the handle is not needed and can be closed, so we don’t leak handles in our own process. Next, we need to locate the name and compare it with our target name. But what is the target name? We see in Process Explorer how the name looks. It contains the prefix used by any process (except UWP processes): “\Sessions\<session>\BasedNameObjects\<thename>”. We need the session ID and the “real” name to build our target name:

WCHAR targetName[256];
DWORD sessionId;
::ProcessIdToSessionId(pid, &sessionId);
::swprintf_s(targetName,
	L"\\Sessions\\%u\\BaseNamedObjects\\Microsoft_WMP_70_CheckForOtherInstanceMutex", 
	sessionId);
auto len = ::wcslen(targetName);

This code should come before the loop begins, as we only need to build it once.

Not for the real comparison of names:

auto name = reinterpret_cast<UNICODE_STRING*>(nameBuffer);
if (name->Buffer && 
	::_wcsnicmp(name->Buffer, targetName, len) == 0) {
	// found it!
}

The name buffer is cast to a UNICODE_STRING, which is the standard string type in the native API (and the kernel). It has a Length member which is in bytes (not characters) and does not have to be NULL-terminated. This is why the function used is _wcsnicmp, which can be limited in its search for a match.

Assuming we find our handle, what do we do with it? Fortunately, there is a trick we can use that allows closing a handle in another process: call DuplicateHandle again, but add the DUPLICATE_CLOSE_SOURCE to close the source handle. Then close our own copy, and that’s it! The mutex is gone. Let’s do it:

// found it!
::DuplicateHandle(hProcess, h, ::GetCurrentProcess(), &hTarget,
	0, FALSE, DUPLICATE_CLOSE_SOURCE);
::CloseHandle(hTarget);
printf("Found it! and closed it!\n");
return 0;

This is it. If we get out of the loop, it means we failed to locate the handle with that name. The general technique of duplicating a handle and closing the source is applicable to kernel mode as well. It does require a process handle with PROCESS_DUP_HANDLE to make it work, which is not always possible to get from user mode. For example, protected and PPL (protected processes light) processes cannot be opened with this access mask, even by administrators. In kernel mode, on the other hand, any process can be opened with full access.

Next Windows Internals (Remote) Training

It’s been a while since I gave the Windows Internals training, so it’s time for another class of my favorite topics!

This time I decided to make it more afordable, to allow more people to participate. The cost is based on whether paid by an individual vs. a company. The training includes lab exercises – some involve working with tools, while others involve coding in C/C++.

  • Public 5-day remote class
  • Dates: April 20, 22, 23, 27, 30
  • Time: 8 hours / day. Exact hours TBD
  • Price: 750 USD (payed by individual) / 1500 USD (payed by company)
  • Register by emailing zodiacon@live.com and specifying “Windows Internals Training” in the title
    • Provide names of participants (discount available for multiple participants from the same company), company name (if any) and preferred time zone.
    • You’ll receive instructions for payment and other details
  • Virtual space is limited!

The training time zone will be finalized closer to the start date.

Objectives: Understand the Windows system architectureExplore the internal workings of process, threads, jobs, virtual memory, the I/O system and other mechanisms fundamental to the way Windows works

Write a simple software device driver to access/modify information not available from user mode

Target Audience: Experienced windows programmers in user mode or kernel mode, interested in writing better programs, by getting a deeper understanding of the internal mechanisms of the windows operating system.Security researchers interested in gaining a deeper understanding of Windows mechanisms (security or otherwise), allowing for more productive research
Pre-Requisites: Basic knowledge of OS concepts and architecture.Power user level working with Windows

Practical experience developing windows applications is an advantage

C/C++ knowledge is an advantage

  • Module 1: System Architecture
    • Brief Windows NT History
    • Windows Versions
    • Tools: Windows, Sysinternals, Debugging Tools for Windows
    • Processes and Threads
    • Virtual Memory
    • User mode vs. Kernel mode
    • Architecture Overview
    • Key Components
    • User/kernel transitions
    • APIs: Win32, Native, .NET, COM, WinRT
    • Objects and Handles
    • Sessions
    • Introduction to WinDbg
    • Lab: Task manager, Process Explorer, WinDbg
  • Module 2: Processes & Jobs
    • Process basics
    • Creating and terminating processes
    • Process Internals & Data Structures
    • The Loader
    • DLL explicit and implicit linking
    • Process and thread attributes
    • Protected processes and PPL
    • UWP Processes
    • Minimal and Pico processes
    • Jobs
    • Nested jobs
    • Introduction to Silos
    • Server Silos and Docker
    • Lab: viewing process and job information; creating processes; setting job limits
  • Module 3: Threads
    • Thread basics
    • Thread Internals & Data Structures
    • Creating and terminating threads
    • Thread Stacks
    • Thread Priorities
    • Thread Scheduling
    • CPU Sets
    • Direct Switch
    • Deep Freeze
    • Thread Synchronization
    • Lab: creating threads; thread synchronization; viewing thread information; CPU sets
  • Module 4: Kernel Mechanisms
    • Trap Dispatching
    • Interrupts
    • Interrupt Request Level (IRQL)
    • Deferred Procedure Calls (DPCs)
    • Exceptions
    • System Crash
    • Object Management
    • Objects and Handles
    • Sharing Objects
    • Thread Synchronization
    • Synchronization Primitives (Mutex, Semaphore, Events, and more)
    • Signaled vs. Non-Signaled
    • High IRQL Synchronization
    • Windows Global Flags
    • Kernel Event Tracing
    • Wow64
    • Lab: Viewing Handles, Interrupts; creating maximum handles; Thread synchronization
  • Module 5: Memory Management
    • Overview
    • Small, large and huge pages
    • Page states
    • Memory Counters
    • Address Space Layout
    • Address Translation Mechanisms
    • Heaps
    • APIs in User mode and Kernel mode
    • Page Faults
    • Page Files
    • Commit Size and Commit Limit
    • Workings Sets
    • Memory Mapped Files (Sections)
    • Page Frame Database
    • Other memory management features
    • Lab: committing & reserving memory; using shared memory; viewing memory related information
  • Module 6: Management Mechanisms
    • The Registry
    • Services
    • Starting and controlling services
    • Windows Management Instrumentation
    • Lab: Viewing and configuring services; Process Monitor

  • Module 7: I/O System
    • I/O System overview
    • Device Drivers
    • Plug & Play
    • The Windows Driver Model (WDM)
    • The Windows Driver Framework (WDF)
    • WDF: KMDF and UMDF
    • Device and Driver Objects
    • I/O Processing and Data Flow
    • IRPs
    • Power Management
    • Driver Verifier
    • Writing a Software Driver
    • Labs: viewing driver and device information; writing a software driver
  • Module 8: Security
    • Security Components
    • Virtualization Based Security
    • Hyper-V
    • Protecting objects
    • SIDs
    • User Access Control (UAC)
    • Tokens
    • Integrity Levels
    • ACLs
    • Privileges
    • Access checks
    • AppContainers
    • Logon
    • Control Flow Guard (CFG)
    • Process mitigations
    • Lab: viewing security information

Where did System Services 0 and 1 go?

System calls on Windows go through NTDLL.dll, where each system call is invoked by a syscall (x64) or sysenter (x86) CPU instruction, as can be seen from the following output of NtCreateFile from NTDLL:

0:000> u
ntdll!NtCreateFile:
00007ffc`c07fcb50 4c8bd1          mov     r10,rcx
00007ffc`c07fcb53 b855000000      mov     eax,55h
00007ffc`c07fcb58 f604250803fe7f01 test    byte ptr [SharedUserData+0x308 (00000000`7ffe0308)],1
00007ffc`c07fcb60 7503            jne     ntdll!NtCreateFile+0x15 (00007ffc`c07fcb65)
00007ffc`c07fcb62 0f05            syscall
00007ffc`c07fcb64 c3              ret
00007ffc`c07fcb65 cd2e            int     2Eh
00007ffc`c07fcb67 c3              ret

The important instructions are marked in bold. The value set to EAX is the system service number (0x55 in this case). The syscall instruction follows (the condition tested does not normally cause a branch). syscall causes transition to the kernel into the System Service Dispatcher routine, which is responsible for dispatching to the real system call implementation within the Executive. I will not go to the exact details here, but eventually, the EAX register must be used as a lookup index into the System Service Dispatch Table (SSDT), where each system service number (index) should point to the actual routine.

On x64 versions of Windows, the SSDT is available in the kernel debugger in the nt!KiServiceTable symbol:

lkd> dd nt!KiServiceTable
fffff804`13c3ec20  fced7204 fcf77b00 02b94a02 04747400
fffff804`13c3ec30  01cef300 fda01f00 01c06005 01c3b506
fffff804`13c3ec40  02218b05 0289df01 028bd600 01a98d00
fffff804`13c3ec50  01e31b00 01c2a200 028b7200 01cca500
fffff804`13c3ec60  02229b01 01bf9901 0296d100 01fea002

You might expect the values in the SSDT to be 64-bit pointers, pointing directly to the system services (this is the scheme used on x86 systems). On x64 the values are 32 bit, and are used as offsets from the start of the SSDT itself. However, the offset does not include the last hex digit (4 bits): this last value is the number of arguments to the system call.

Let’s see if this holds with NtCreateFile. Its service number is 0x55 as we’ve seen from user mode, so to get to the actual offset, we need to perform a simple calculation:

kd> dd nt!KiServiceTable+55*4 L1
fffff804`13c3ed74  020b9207

Now we need to take this offset (without the last hex digit), add it to the SSDT and this should point at NtCreateFile:

lkd> u nt!KiServiceTable+020b920
nt!NtCreateFile:
fffff804`13e4a540 4881ec88000000  sub     rsp,88h
fffff804`13e4a547 33c0            xor     eax,eax
fffff804`13e4a549 4889442478      mov     qword ptr [rsp+78h],rax
fffff804`13e4a54e c744247020000000 mov     dword ptr [rsp+70h],20h

Indeed – this is NtCreateFile. What about the argument count? The value stored is 7. Here is the prototype of NtCreateFile (documented in the WDK as ZwCreateFile):

NTSTATUS NtCreateFile(
  PHANDLE            FileHandle,
  ACCESS_MASK        DesiredAccess,
  POBJECT_ATTRIBUTES ObjectAttributes,
  PIO_STATUS_BLOCK   IoStatusBlock,
  PLARGE_INTEGER     AllocationSize,
  ULONG              FileAttributes,
  ULONG              ShareAccess,
  ULONG              CreateDisposition,
  ULONG              CreateOptions,
  PVOID              EaBuffer,
  ULONG              EaLength);

Clearly, there are 11 parameters, not just 7. Why the discrepency? The stored value is the number of parameters that are passed using the stack. In x64 calling convention, the first 4 arguments are passed using registers: RCX, RDX, R8, R9 (in this order).

Now back to the title of this post. Here are the first few entries in the SSDT again:

lkd> dd nt!KiServiceTable
fffff804`13c3ec20  fced7204 fcf77b00 02b94a02 04747400
fffff804`13c3ec30  01cef300 fda01f00 01c06005 01c3b506

The first two entries look different, with much larger numbers. Let’s try to apply the same logic for the first value (index 0):

kd> u nt!KiServiceTable+fced720
fffff804`2392c340 ??              ???
                    ^ Memory access error in 'u nt!KiServiceTable+fced720'

Clearly a bust. The value is in fact a negative value (in two’s complement), so we need to sign-extend it to 64 bit, and then perform the addition (leaving out the last hex digit as before):

kd> u nt!KiServiceTable+ffffffff`ffced720
nt!NtAccessCheck:
fffff804`1392c340 4c8bdc          mov     r11,rsp
fffff804`1392c343 4883ec68        sub     rsp,68h
fffff804`1392c347 488b8424a8000000 mov     rax,qword ptr [rsp+0A8h]

This is NtAccessCheck. The function’s implementation is in lower addresses than the SSDT itself. Let’s try the same exercise with index 1:

kd> u nt!KiServiceTable+ffffffff`ffcf77b0
nt!NtWorkerFactoryWorkerReady:
fffff804`139363d0 4c8bdc          mov     r11,rsp
fffff804`139363d3 49895b08        mov     qword ptr [r11+8],rbx

And we get system call number 1: NtWorkerFactoryWorkerReady.

For those fond of WinDbg scripting – write a script to display nicely all system call functions and their indices.

 

Windows 10 Desktops vs. Sysinternals Desktops

One of the new Windows 10 features visible to users is the support for additional “Desktops”. It’s now possible to create additional surfaces on which windows can be used. This idea is not new – it has been around in the Linux world for many years (e.g. KDE, Gnome), where users have 4 virtual desktops they can use. The idea is that to prevent clutter, one desktop can be used for web browsing, for example, and another desktop can be used for all dev work, and yet a third desktop could be used for all social / work apps (outlook, WhatsApp, Facebook, whatever).

To create an additional virtual desktop on Windows 10, click on the Task View button on the task bar, and then click the “New Desktop” button marked with a plus sign.

newvirtualdesktop

Now you can switch between desktops by clicking the appropriate desktop button and then launch apps as usual. It’s even possible (by clicking Task View again) to move windows from desktop to desktop, or to request that a window be visible on all desktops.

The Sysinternals tools had a tool called “Desktops” for many years now. It too allows for creation of up to 4 desktops where applications can be launched. The question is – is this Desktops tool the same as the Windows 10 virtual desktops feature? Not quite.

First, some background information. In the kernel object hierarchy under a session object, there are window stations, desktops and other objects. Here’s a diagram summarizing this tree-like relationship:

Sessions

As can be seen in the diagram, a session contains a set of Window Stations. One window station can be interactive, meaning it can receive user input, and is always called winsta0. If there are other window stations, they are non-interactive.

Each window station contains a set of desktops. Each of these desktops can hold windows. So at any given moment, an interactive user can interact with a single desktop under winsta0. Upon logging in, a desktop called “Default” is created and this is where all the normal windows appear. If you click Ctrl+Alt+Del for example, you’ll be transferred to another desktop, called “Winlogon”, that was created by the winlogon process. That’s why your normal windows “disappear” – you have been switched to another desktop where different windows may exist. This switching is done by a documented function – SwitchDesktop.

And here lies the difference between the Windows 10 virtual desktops and the Sysinternals desktops tool. The desktops tool actually creates desktop objects using the CreateDesktop API. In that desktop, it launches Explorer.exe so that a taskbar is created on that desktop – initially the desktop has nothing on it. How can desktops launch a process that by default creates windows in a different desktop? This is possible to do with the normal CreateProcess function by specifying the desktop name in the STARTUPINFO structure’s lpDesktop member. The format is “windowstation\desktop”. So in the desktops tool case, that’s something like “winsta0\Sysinternals Desktop 1”. How do I know the name of the Sysinternals desktop objects? Desktops can be enumerated with the EnumDesktops API. I’ve written a small tool, that enumerates window stations and desktops in the current session. Here’s a sample output when one additional desktop has been created with “desktops”:

desktops1

In the Windows 10 virtual desktops feature, no new desktops are ever created. Win32k.sys just manipulates the visibility of windows and that’s it. Can you guess why? Why doesn’t Window 10 use the CreateDesktop/SwitchDesktop APIs for its virtual desktop feature?

The reason has to do with some limitations that exist on desktop objects. For one, a window (technically a thread) that is bound to a desktop cannot be switched to another; in other words, there is no way to transfer a windows from one desktop to another. This is intentional, because desktops provide some protection. For example, hooks set with SetWindowsHookEx can only be set on the current desktop, so cannot affect other windows in other desktops. The Winlogon desktop, as another example, has a strict security descriptor that prevents non system-level users from accessing that desktop. Otherwise, that desktop could have been tampered with.

The virtual desktops in Windows 10 is not intended for security purposes, but for flexibility and convenience (security always “contradicts” convenience). That’s why it’s possible to move windows between desktops, because there is no real “moving” going on at all. From the kernel’s perspective, everything is still on the same “Default” desktop.

 

 

 

Next Windows Kernel Programming Remote Class

The next public remote Windows kernel Programming class I will be delivering is scheduled for April 15 to 18. It’s going to be very similar to the first one I did at the end of January (with some slight modifications and additions).

Cost: 1950 USD. Early bird (register before March 30th): 1650 USD

I have not yet finalized the time zone the class will be “targeting”. I will update in a few weeks on that.

If you’re interested in registering, please email zodiacon@live.com with the subject “Windows Kernel Programming class” and specify your name, company (if any) and time zone. I’ll reply by providing more information.

Feel free to contact me for questions using the email or through twitter (@zodiacon).

The complete syllabus is outlined below:

Duration: 4 Days
Target Audience: Experienced windows developers, interested in developing kernel mode drivers
Objectives: ·  Understand the Windows kernel driver programming model

·  Write drivers for monitoring processes, threads, registry and some types of objects

·  Use documented kernel hooking mechanisms

·  Write basic file system mini-filter drivers

Pre Requisites: ·  At least 2 years of experience working with the Windows API

·  Basic understanding of Windows OS concepts such as processes, threads, virtual memory and DLLs

Software requirements: ·  Windows 10 Pro 64 bit (latest stable version)
·  Visual Studio 2017 + latest update
·  Windows 10 SDK (latest)
·  Windows 10 WDK (latest)
·  Virtual Machine for testing and debugging

Instructor: Pavel Yosifovich

Abstract

The cyber security industry has grown considerably in recent years, with more sophisticated attacks and consequently more defenders. To have a fighting chance against these kinds of attacks, kernel mode drivers must be employed, where nothing (at least nothing from user mode) can escape their eyes.
The course provides the foundations for the most common software device drivers that are useful not just in cyber security, but also other scenarios, where monitoring and sometimes prevention of operations is required. Participants will write real device drivers with useful features they can then modify and adapt to their particular needs.

Syllabus

  • Module 1: Windows Internals quick overview
    • Processes
    • Virtual memory
    • Threads
    • System architecture
    • User / kernel transitions
    • Introduction to WinDbg
    • Windows APIs
    • Objects and handles
    • Summary

 

  • Module 2: The I/O System
    • I/O System overview
    • Device Drivers
    • The Windows Driver Model (WDM)
    • The Kernel Mode Driver Framework (KMDF)
    • Other device driver models
    • Driver types
    • Software drivers
    • Driver and device objects
    • I/O Processing and Data Flow
    • Accessing devices
    • Asynchronous I/O
    • Summary

 

  • Module 3: Kernel programming basics
    • Setting up for Kernel Development
    • Basic Kernel types and conventions
    • C++ in a kernel driver
    • Creating a driver project
    • Building and deploying
    • The kernel API
    • Strings
    • Linked Lists
    • The DriverEntry function
    • The Unload routine
    • Installation
    • Testing
    • Debugging
    • Summary
    • Lab: deploy a driver

 

  • Module 4: Building a simple driver
    • Creating a device object
    • Exporting a device name
    • Building a driver client
    • Driver dispatch routines
    • Introduction to I/O Request Packets (IRPs)
    • Completing IRPs
    • Handling DeviceIoControl calls
    • Testing the driver
    • Debugging the driver
    • Using WinDbg with a virtual machine
    • Summary
    • Lab: open a process for any access; zero driver; debug a driver

 

  • Module 5: Kernel mechanisms
    • Interrupt Request Levels (IRQLs)
    • Deferred Procedure Calls (DPCs)
    • Dispatcher objects
    • Low IRQL Synchronization
    • Spin locks
    • Work items
    • Summary

 

  • Module 6: Process and thread monitoring
    • Motivation
    • Process creation/destruction callback
    • Specifying process creation status
    • Thread creation/destruction callback
    • Notifying user mode
    • Writing a user mode client
    • Preventing potentially malicious processes from executing
    • Summary
    • Lab: monitoring process/thread activity; prevent specific processes from running; protecting processes

 

  • Module 7: Object and registry notifications
    • Process/thread object notifications
    • Pre and post callbacks
    • Registry notifications
    • Performance considerations
    • Reporting results to user mode
    • Summary
    • Lab: protect specific process from termination; simple registry monitor

 

  • Module 8: File system mini filters
    • File system model
    • Filters vs. mini filters
    • The Filter Manager
    • Filter registration
    • Pre and Post callbacks
    • File name information
    • Contexts
    • File system operations
    • Filter to user mode communication
    • Debugging mini-filters
    • Summary
    • Labs: protect a directory from file deletion; backup file before deletion

Fun with AppContainers

AppContainers are the sanboxes typically used to run UWP processes (also known as metro, store, modern…). A process within an AppContainer runs with an Integrity Level of low, which effectively means it has no access to almost everything, as the default integrity level of objects (such as files) is Medium. This means code running inside an AppContainer can’t do any sigtnificant damage because of that lack of access. Furthermore, from an Object Manager perspective, named objects created by an AppContainer are stored under its own object manager directory, based on an identifier known as AppContainer SID. This means one AppContainer cannot interfere with another’s objects.

For example, if a process not in an AppContainer creates a mutex named “abc”, its full name is really “\Sessions\1\BaseNamedObjects\abc” (assuming the process runs in session 1). On the other hand, if AppContainer A creates a mutex named “abc”, its full name is something like “\Sessions\1\AppContainerNamedObjects\S-1-15-2-466767348-3739614953-2700836392-1801644223-4227750657-1087833535-2488631167\abc”, meaning it can nevr interfere with another AppContainer or any process running outside of an AppContainer.

Although AppContainers were created specifically for store apps, theye can also be used to execute “normal” applications, providing the same level of security and isolation. Let’s see how to do that.

First, we need to create the AppContainer and obtain an AppContainer SID. This SID is based on a hash of the container name. In the UWP world, this name is made up of the application package and the 13 digits of the signer’s hash. For normal applications, we can select any string; selecting the same string would yield the same SID – which means we can actually use it to “bundle” several processes into the same AppContainer.

The first step is to create an AppContainer profile (error handling ommitted):

PSID appContainerSid;
::CreateAppContainerProfile(containerName, containerName, containerName, nullptr, 0, &appContainerSid);

The containerName argument is the important one. If the function fails, it probably means the container profile exists already. In that case, we need to extract the SID from the existing profile:

::DeriveAppContainerSidFromAppContainerName(containerName, &appContainerSid);

The next step is prepare for process creation. The absolute minimum is to initialize a process attribute list with a SECURITY_CAPABILITIES structure to indicate we want the process to be created inside an AppContainer. As part of this, we can specify capabilities this AppContainer should have, such as internet access, access to the documents library and any other capabilities as defined by the Windows Runtime:

STARTUPINFOEX si = { sizeof(si) };
PROCESS_INFORMATION pi;
SIZE_T size;
SECURITY_CAPABILITIES sc = { 0 };
sc.AppContainerSid = appContainerSid;

::InitializeProcThreadAttributeList(nullptr, 1, 0, &size); 
auto buffer = std::make_unique<BYTE[]>(size); 
si.lpAttributeList = reinterpret_cast<LPPROC_THREAD_ATTRIBUTE_LIST>(buffer.get()); 
::InitializeProcThreadAttributeList(si.lpAttributeList, 1, 0, &size)); 
::UpdateProcThreadAttribute(si.lpAttributeList, 0, PROC_THREAD_ATTRIBUTE_SECURITY_CAPABILITIES, &sc, sizeof(sc), nullptr, nullptr));

We specified zero capabilities for now. Now we’re ready to create the process:

::CreateProcess(nullptr, exePath, nullptr, nullptr, FALSE,
	EXTENDED_STARTUPINFO_PRESENT, nullptr, nullptr, 
        (LPSTARTUPINFO)&si, &pi);

We can try this with the usual first victim, Notepad. Notepad launches and everyhing seems OK. However, if we try to open almost any file by using Notepad’s File/Open menu item, we’ll see that notepad has no access to usual things, such as “my documents” or “my pictures”. This is because it’s runnign with Low Integrity Level and files are defaulted to Medium integrity level:

“AppContainer” in Process Explorer implies Low integirty level.

If we would want Notepad to have access to the user’s files, such as documents and pictures, we would have to set explict permissions on these objects allowing access to the AppContainer SID. Functions to use include SetNamedSecurityInfo (see the project on Github for the full code).

I’ve created a simple application to test these things. We can specify a container name, an executable path and click “Run” to execute in in AppContainer. We can add folders/files that would get full permissions:

appcontainer8

Let’s now try a more interesting application – Windows Media Player (yes, I know, who uses the old Media Player these days? But it’s an interesting example). Windows Media Player has the (annoying?) feature where you can only run a single instance of it at any given time. The way this works is that WMP creates a mutex with a very specific name, “Microsoft_WMP_70_CheckForOtherInstanceMutex“, if it already exists, it sends a message to its buddy (a previous instance of WMP) and then terminates. A simple trick we can do with Process Explorer is to close that handle and then launching another WMP.

Let’s try something different: let’s run WMP in an AppContainer. Then let’s run another one in a different AppContainer. Will we get two instances?

appcontainer4

Running WMP in this way popus up its helper, setup_wm.exe which asks for the initial settings for WMP. Clicking “Express Settings” closes the dialog but it then it comes up again! And again! You can’t get rid of it, unless you close the dialog and then WMP does not launch. Can you guess why is that?

If you guessed “permissions” – you are correct. Running Process Monitor when this dialog comes up and filtering for “ACCESS DENIED” shows something like this:

appcontainer5

Clearly, some keys need access so the settings can be saved. The tool allows adding these keys and setting full permissions for them:

appcontainer6

Now we can run WMP in two different containers (change the container name and re-run) and they both run just fine. That’s becuase each mutex now has a unique name prefixed with the AppContainer SID of the relevant AppContainer:

appcontainer7

The code can be found at https://github.com/zodiacon/RunAppContainer.

 

Public Windows Kernel Programming Class

After a short twitter questionaire, I’m excited to announce a Remote Windows Kernel Programming class to be scheduled for the end of January 2019 (28 to 31).

If you want to learn how to write software drivers for Windows (not hardware, plug & play drivers), including file system mini filters – this is the class for you! You should be comfortable with programming on Windows in user mode (although we’ll discuss some of the finer points of working with the Windows API) and have a basic understanding of Windows OS concepts such as processes, threads and virtual memory.

If you’re interested, send an email to zodiacon@live.com with the title “Windows Kernel Programming Training” with your name, company name (if any), and time zone. I will reply with further details.

Here is the syllabus (not final, but should be close enough):

Windows Kernel Programming

Duration: 4 Days (January 28th to 31st, 2019)
Target Audience: Experienced windows developers, interested in developing kernel mode drivers
Objectives: · Understand the Windows kernel driver programming model

· Write drivers for monitoring processes, threads, registry and some types of objects

· Use documented kernel hooking mechanisms

· Write basic file system mini-filter drivers

Pre Requisites: · At least 1 year of experience working with the Windows API

· Basic understanding of Windows OS concepts such as processes, threads, virtual memory and DLLs

Software requirements: · Windows 10 Pro 64 bit (latest official release)

· Virtual machine (preferable Windows 10 64 bit) using any virtualization technology (for testing and debugging)

· Visual Studio 2017 (any SKU) + latest update

· Windows 10 SDK (latest)

· Windows 10 WDK (latest)

Cost: $1950

Syllabus

  • Module 1: Windows Internals quick overview
    • Processes and threads
    • System architecture
    • User / kernel transitions
    • Virtual memory
    • APIs
    • Objects and handles
    • Summary

 

  • Module 2: The I/O System and Device Drivers
    • I/O System overview
    • Device Drivers
    • The Windows Driver Model (WDM)
    • The Kernel Mode Driver Framework (KMDF)
    • Other device driver models
    • Driver types
    • Software drivers
    • Driver and device objects
    • I/O Processing and Data Flow
    • Accessing files and devices
    • Asynchronous I/O
    • Summary

 

  • Module 3: Kernel programming basics
    • Installing the tools: Visual Studio, SDK, WDK
    • C++ in a kernel driver
    • Creating a driver project
    • Building and deploying
    • The kernel API
    • Strings
    • Linked Lists
    • Kernel Memory Pools
    • The DriverEntry function
    • The Unload routine
    • Installation
    • Summary
    • Lab: create a simple driver; deploy a driver

 

  • Module 4: Building a simple driver
    • Creating a device object
    • Exporting a device name
    • Building a driver client
    • Driver dispatch routines
    • Introduction to I/O Request Packets (IRPs)
    • Completing IRPs
    • Dealing with user space buffers
    • Handling DeviceIoControl calls
    • Testing the driver
    • Debugging the driver
    • Using WinDbg with a virtual machine
    • Summary
    • Lab: open a process for any access; zero driver; debug a driver

 

  • Module 5: Kernel mechanisms
    • Interrupt Request Levels (IRQLs)
    • Interrupts
    • Deferred Procedure Calls (DPCs)
    • Dispatcher objects
    • Thread Synchronization
    • Spin locks
    • Work items
    • Summary

 

  • Module 6: Process and thread monitoring
    • Process creation/destruction callback
    • Specifying process creation status
    • Thread creation/destruction callback
    • Notifying user mode
    • Writing a user mode client
    • User/kernel communication
    • Summary
    • Labs: monitoring process/thread activity; prevent specific processes from running; protecting processes

 

  • Module 7: Object and registry notifications
    • Process/thread object notifications
    • Pre and post callbacks
    • Registry notifications
    • Performance considerations
    • Reporting results to user mode
    • Summary
    • Lab: protect specific process from termination; hiding registry keys; simple registry monitor

 

  • Module 8: File system mini filters
    • File system model
    • Filters vs. mini filters
    • The Filter Manager
    • Filter registration
    • Pre and Post callbacks
    • File name information
    • Contexts
    • File system operations
    • Driver to user mode communication
    • Debugging mini-filters
    • Summary
    • Labs: protect a directory from write; hide a file/directory; prevent file/directory deletion; log file operations