The Power of UI Automation

What if you needed to get a list of all the open browser tabs in some browser? In the (very) old days you might assume that each tab is its own window, so you could find a main browser window (using FindWindow, for example), and then enumerate child windows with EnumChildWindows to locate the tabs. Unfortunately, this approach is destined to fail. Here is a screenshot of WinSpy looking at a main window of Microsoft Edge:

MS Edge showing only two child windows

The title of the main window hints at the existence of 26 tabs, but there are only two child windows and they are not tabs. The inevitable conclusion is that the tabs are not windows at all. They are being “drawn” with some technology that the Win32 windowing infrastructure neither knows nor cares about.

How can we get information about those browsing tabs? Enter UI Automation.

UI Automation has been around for many years, starting with an older technology called “Active Accessibility”. That technology is geared towards accessibility while providing rich information that can be consumed by accessibility clients. Although Active Accessibility is still supported for compatibility reasons, the newer UI Automation supersedes it.

UI Automation provides a tree of UI automation elements representing various aspects of a user interface. Some elements represent “true” Win32 windows (they have an HWND), some represent internal controls like buttons and edit boxes (created with whatever technology), and some elements are virtual (they have no graphical aspects) and instead provide “metadata” related to other items.

The UI Automation client API uses COM, where the root object implements the IUIAutomation interface (it has extended interfaces implemented as well). To get the automation object, the following C++ code can be used (we’ll see a C# example later):

CComPtr<IUIAutomation> spUI;
auto hr = spUI.CoCreateInstance(__uuidof(CUIAutomation));
if (FAILED(hr))
	return Error("Failed to create Automation root", hr);

The client automation interfaces are declared in <UIAutomationClient.h>. The code uses the ATL CComPtr<> smart pointers, but any COM smart or raw pointers will do.

With the UI Automation object pointer in hand, several options are available. One is to enumerate all or part of the UI element tree. To get started, we can obtain a “walker” object by calling IUIAutomation::get_RawViewWalker. From there, we can start enumerating by calling IUIAutomationTreeWalker interface methods, like GetFirstChildElement and GetNextSiblingElement.
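To make the walker’s navigation model concrete, here is a minimal, portable sketch of a depth-first traversal driven only by “first child” and “next sibling” steps. The Node type and the Walk function are hypothetical stand-ins, not part of the UI Automation API; with the real API, the two steps would be calls to GetFirstChildElement and GetNextSiblingElement on COM pointers.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an automation element: each node only exposes
// its first child and its next sibling, which is what a tree walker offers.
struct Node {
	std::string Name;
	const Node* FirstChild = nullptr;
	const Node* NextSibling = nullptr;
};

// Depth-first walk that records each name, indented by its depth in the tree.
void Walk(const Node* node, int depth, std::vector<std::string>& out) {
	for (; node != nullptr; node = node->NextSibling) {
		out.push_back(std::string(depth * 2, ' ') + node->Name);
		Walk(node->FirstChild, depth + 1, out);
	}
}
```

Calling Walk with the root node and a depth of 0 fills the vector with one line per element, in the same order a tree view would show them.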

Each element, represented by an IUIAutomationElement interface, provides a set of properties. Some are available directly on the interface (e.g. get_CurrentName, get_CurrentClassName, get_CurrentProcessId), while others hide behind a generic method, get_CurrentPropertyValue, where each property has an integer ID and the result is a VARIANT, to allow for various types of values.

Using this method, the menu item View Automation Tree in WinSpy shows the full automation tree, and you can drill down to any level, while many of the selected element’s properties are shown on the right:

WinSpy automation tree view

If you dig deep enough, you’ll find that MS Edge tabs have a UI automation class name of “EdgeTab”. This is the key to locating browser tabs. (Other browsers may have a different class name.) To find tabs, we could enumerate the full tree manually, but fortunately, there is a better way. IUIAutomationElement has a FindAll method that searches for elements based on a set of conditions. The conditions available are pretty flexible: each is based on some property or properties of elements, and they can be combined with And, Or, etc. to form more complex conditions. In our case, we need just one condition: a class name of “EdgeTab”.
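If the idea of composable conditions is unfamiliar, the following portable sketch may help. It is not the UI Automation API: Element, Condition, and the helper functions are hypothetical names, loosely modeled on what CreatePropertyCondition, CreateAndCondition, and CreateOrCondition do, and the search runs over a flat list rather than a real element tree.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified element: just the two properties we test.
struct Element {
	std::string ClassName;
	int ProcessId;
};

using Condition = std::function<bool(const Element&)>;

// Property conditions: each one tests a single property against a value.
Condition ClassNameIs(std::string name) {
	return [name](const Element& e) { return e.ClassName == name; };
}
Condition ProcessIdIs(int pid) {
	return [pid](const Element& e) { return e.ProcessId == pid; };
}

// Combinators: build a more complex condition out of two simpler ones.
Condition And(Condition a, Condition b) {
	return [a, b](const Element& e) { return a(e) && b(e); };
}
Condition Or(Condition a, Condition b) {
	return [a, b](const Element& e) { return a(e) || b(e); };
}

// Return every element that satisfies the condition.
std::vector<Element> FindAll(const std::vector<Element>& elements, const Condition& cond) {
	std::vector<Element> result;
	for (const auto& e : elements)
		if (cond(e))
			result.push_back(e);
	return result;
}
```

For example, And(ClassNameIs("EdgeTab"), ProcessIdIs(100)) would select only the tabs that belong to process 100.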

First, we’ll create the root object, and the condition (error handling omitted for brevity):

int main() {
	::CoInitialize(nullptr);

	CComPtr<IUIAutomation> spUI;
	auto hr = spUI.CoCreateInstance(__uuidof(CUIAutomation));

	CComPtr<IUIAutomationCondition> spCond;
	CComVariant edgeTab(L"EdgeTab");
	spUI->CreatePropertyCondition(UIA_ClassNamePropertyId, edgeTab, &spCond);

We have a single condition for the class name property, which has an ID defined in the automation headers. Next, we’ll fire off the search from the root element (desktop):

CComPtr<IUIAutomationElementArray> spTabs;
CComPtr<IUIAutomationElement> spRoot;
spUI->GetRootElement(&spRoot);
hr = spRoot->FindAll(TreeScope_Descendants, spCond, &spTabs);

All that’s left to do is harvest the results:

int count = 0;
spTabs->get_Length(&count);
for (int i = 0; i < count; i++) {
	CComPtr<IUIAutomationElement> spTab;
	spTabs->GetElement(i, &spTab);
	CComBSTR name;
	spTab->get_CurrentName(&name);
	int pid;
	spTab->get_CurrentProcessId(&pid);
	printf("%2d PID %6d: %ws\n", i + 1, pid, name.m_str);
}

Try it!

.NET Code

A convenient NuGet package called Interop.UIAutomationClient.Signed provides wrappers for the automation API for .NET clients. Here is the same search done in C# after adding the NuGet package reference:

static void Main(string[] args) {
    const int ClassPropertyId = 30012;
    var ui = new CUIAutomationClass();
    var cond = ui.CreatePropertyCondition(ClassPropertyId, "EdgeTab");
    var tabs = ui.GetRootElement().FindAll(TreeScope.TreeScope_Descendants, cond);
    for (int i = 0; i < tabs.Length; i++) {
        var tab = tabs.GetElement(i);
        Console.WriteLine($"{i + 1,2} PID {tab.CurrentProcessId,6}: {tab.CurrentName}");
    }
}

More Automation

There is a lot more to UI automation – the word “automation” implies some more control. One capability of the API is providing various notifications when certain aspects of elements change. Examples include the IUIAutomation methods AddAutomationEventHandler, AddFocusChangedEventHandler, AddPropertyChangedEventHandler, and AddStructureChangedEventHandler.

More specific information on elements (and some control) is also available with more specific interfaces related to controls, such as IUIAutomationTextPattern, IUIAutomationTextRange, and many more.

Happy automation!

ObjDir – Rust Version

In the previous post, I’ve shown how to write a minimal, but functional, Projected File System provider using C++. I also semi-promised to write a version of that provider in Rust. I thought we should start small, by implementing a command line tool I wrote years ago called objdir. Its purpose is to be a “command line” version of a simplified WinObj from Sysinternals. It should be able to list objects (name and type) within a given object manager namespace directory. Here are a couple of examples:

D:\>objdir \
PendingRenameMutex (Mutant)
ObjectTypes (Directory)
storqosfltport (FilterConnectionPort)
MicrosoftMalwareProtectionRemoteIoPortWD (FilterConnectionPort)
Container_Microsoft.OutlookForWindows_1.2024.214.400_x64__8wekyb3d8bbwe-S-1-5-21-3968166439-3083973779-398838822-1001 (Job)
MicrosoftDataLossPreventionPort (FilterConnectionPort)
SystemRoot (SymbolicLink)
exFAT (Device)
Sessions (Directory)
MicrosoftMalwareProtectionVeryLowIoPortWD (FilterConnectionPort)
ArcName (Directory)
PrjFltPort (FilterConnectionPort)
WcifsPort (FilterConnectionPort)
...

D:\>objdir \kernelobjects
MemoryErrors (SymbolicLink)
LowNonPagedPoolCondition (Event)
Session1 (Session)
SuperfetchScenarioNotify (Event)
SuperfetchParametersChanged (Event)
PhysicalMemoryChange (SymbolicLink)
HighCommitCondition (SymbolicLink)
BcdSyncMutant (Mutant)
HighMemoryCondition (SymbolicLink)
HighNonPagedPoolCondition (Event)
MemoryPartition0 (Partition)
...

Since enumerating object manager directories is required for our ProjFS provider, once we implement objdir in Rust, we’ll have a good starting point for implementing the full provider in Rust.

This post assumes you are familiar with the fundamentals of Rust. Even if you’re not, the code should still be fairly understandable, as we’re mostly going to use unsafe Rust to do the real work.

Unsafe Rust

One of the main selling points of Rust is its safety: memory and concurrency safety guaranteed at compile time. However, some operations cannot be checked by the Rust compiler, such as calling external C functions like OS APIs. Rust allows these through unsafe blocks or functions. Within unsafe blocks, certain operations are allowed that are normally forbidden; it’s up to the developer to make sure the invariants assumed by Rust are not violated, essentially making sure nothing leaks or is otherwise misused.

The Rust standard library provides some support for calling C functions, mostly in the std::ffi module (FFI=Foreign Function Interface). This is pretty bare bones, providing a C-string class, for example. That’s not rich enough, unfortunately. First, strings in Windows are mostly UTF-16, which is not the same as a classic C string, and not the same as the Rust standard String type. More importantly, any C function that needs to be invoked must be properly exposed as an extern "C" function, using the correct Rust types that provide the same binary representation as the C types.

Doing all this manually is a lot of error-prone, non-trivial work. It only makes sense for simple and limited sets of functions. In our case, we need to use native APIs, like NtOpenDirectoryObject and NtQueryDirectoryObject. To simplify matters, there are crates available on crates.io (the main Rust crate registry) that already provide such declarations.

Adding Dependencies

Assuming you have Rust installed, open a command window and create a new project named objdir:

cargo new objdir

This will create a subdirectory named objdir, hosting the newly created binary crate. Now we can open Cargo.toml (the manifest) and add dependencies for the following crates:

[dependencies]
ntapi = "0.4"
winapi = { version = "0.3.9", features = [ "impl-default" ] }

winapi provides most of the Windows API declarations, but does not provide native APIs. ntapi provides those additional declarations, and in fact depends on winapi for some fundamental types (which we’ll need). The feature “impl-default” indicates we would like the implementations of the standard Rust Default trait provided – we’ll need that later.

The main Function

The main function is going to accept a command line argument to indicate the directory to enumerate. If no parameters are provided, we’ll assume the root directory is requested. Here is one way to get that directory:

let dir = std::env::args().skip(1).next().unwrap_or("\\".to_owned());

(Note that unfortunately the WordPress system I’m using to write this post has no syntax highlighting for Rust, the code might be uglier than expected; I’ve set it to C++).

The args method returns an iterator. We skip the first item (the executable itself), and grab the next one with next. It returns an Option<String>, so we grab the string if there is one, or use a fixed backslash as the string.

Next, we’ll call a helper function, enum_directory, which does the heavy lifting, and get back a Result where success is a vector of tuples, each containing the object’s name and type (Vec<(String, String)>). Based on the result, we can display the objects or report an error:

let result = enum_directory(&dir);
match result {
    Ok(objects) => {
        for (name, typename) in &objects {
            println!("{name} ({typename})");
        }
        println!("{} objects.", objects.len());
    },
    Err(status) => println!("Error: 0x{status:X}")
};

That is it for the main function.

Enumerating Objects

Since we need to use APIs defined within the winapi and ntapi crates, let’s bring them into scope for easier access at the top of the file:

use winapi::shared::ntdef::*;
use ntapi::ntobapi::*;
use ntapi::ntrtl::*;

I’m using the “glob” operator (*) to make it easy to just use the function names directly without any prefix. Why these specific modules? They are where the APIs and types we’re going to need are defined (check the documentation for these crates).

enum_directory is where the real work is done. Here is its declaration:

fn enum_directory(dir: &str) -> Result<Vec<(String, String)>, NTSTATUS> {

The function accepts a string slice and returns a Result type, where the Ok variant is a vector of tuples consisting of two standard Rust strings.

The following code follows the basic logic of the EnumDirectoryObjects function from the ProjFS example in the previous post, without the capability of search or filter. We’ll add that when we work on the actual ProjFS project in a future post.

The first thing to do is open the given directory object with NtOpenDirectoryObject. For that we need to prepare an OBJECT_ATTRIBUTES and a UNICODE_STRING. Here is what that looks like:

let mut items = vec![];

unsafe {
    let mut udir = UNICODE_STRING::default();
    let wdir = string_to_wstring(&dir);
    RtlInitUnicodeString(&mut udir, wdir.as_ptr());
    let mut dir_attr = OBJECT_ATTRIBUTES::default();
    InitializeObjectAttributes(&mut dir_attr, &mut udir, OBJ_CASE_INSENSITIVE, NULL, NULL);

We start by creating an empty vector to hold the results. We don’t need any type annotation because later in the code the compiler would have enough information to deduce it on its own. We then start an unsafe block because we’re calling C APIs.

Next, we create a default-initialized UNICODE_STRING and use a helper function to convert a Rust string slice to a UTF-16 string, usable by native APIs. We’ll see this string_to_wstring helper function once we’re done with this one. The returned value is in fact a Vec<u16> – an array of UTF-16 characters.

The next step is to call RtlInitUnicodeString, to initialize the UNICODE_STRING based on the UTF-16 string we just received. Methods such as as_ptr are necessary to make the Rust compiler happy. Finally, we create a default OBJECT_ATTRIBUTES and initialize it with the udir (the UTF-16 directory string). All the types and constants used are provided by the crates we’re using.

The next step is to actually open the directory, which could fail because of insufficient access or a directory that does not exist. In that case, we just return an error. Otherwise, we move to the next step:

let mut hdir: HANDLE = NULL;
match NtOpenDirectoryObject(&mut hdir, DIRECTORY_QUERY, &mut dir_attr) {
    0 => {
        // do real work...
    },
    err => Err(err),
}

NULL here is just a constant provided by winapi: a C void pointer (*mut c_void) with a value of zero. We examine the NTSTATUS returned using a match expression: if it’s not zero (STATUS_SUCCESS), we treat it as an error and return an Err object with the status. If it’s zero, we’re good to go. Now comes the real work.
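A side note on status values, sketched in C++ to match the provider code from the previous post: NTSTATUS is a signed 32-bit value, and the usual success test (the NT_SUCCESS macro) accepts any non-negative value, not just zero. The names NtStatus and NtSuccess below are hypothetical; the constants reproduce the documented values from ntstatus.h so that the sketch compiles anywhere.

```cpp
#include <cstdint>

using NtStatus = std::int32_t;   // stand-in for NTSTATUS (a signed 32-bit value)

// Documented values from ntstatus.h, reproduced so the sketch is portable.
constexpr NtStatus StatusSuccess       = 0x00000000;
constexpr NtStatus StatusMoreEntries   = 0x00000105;
constexpr NtStatus StatusNoMoreEntries = static_cast<NtStatus>(0x8000001AU);
constexpr NtStatus StatusAccessDenied  = static_cast<NtStatus>(0xC0000022U);

// Same test as the NT_SUCCESS macro: any non-negative status is a success.
constexpr bool NtSuccess(NtStatus status) {
	return status >= 0;
}
```

Matching on exactly zero, as done above for NtOpenDirectoryObject, is stricter than this test, while comparing against zero with < treats informational values such as STATUS_MORE_ENTRIES as success.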

We need to allocate a buffer to receive the object information in this directory and be prepared for the case the information is too big for the allocated buffer, so we may need to loop around to get the next “chunk” of data. This is how the NtQueryDirectoryObject is expected to be used. Let’s allocate a buffer using the standard Vec<> type and prepare some locals:

const LEN: u32 = 1 << 16;
let mut first = 1;
let mut buffer: Vec<u8> = Vec::with_capacity(LEN as usize);
let mut index = 0u32;
let mut size: u32 = 0;

We’re allocating 64KB, but could have chosen any number. Now the loop:

loop {
    let start = index;
    if NtQueryDirectoryObject(hdir, buffer.as_mut_ptr().cast(), LEN, 0, first, &mut index, &mut size) < 0 {
        break;
    }
    first = 0;
    let mut obuffer = buffer.as_ptr() as *const OBJECT_DIRECTORY_INFORMATION;
    for _ in 0..index - start {
        let item = *obuffer;
        let name = String::from_utf16_lossy(std::slice::from_raw_parts(item.Name.Buffer, (item.Name.Length / 2) as usize));
        let typename = String::from_utf16_lossy(std::slice::from_raw_parts(item.TypeName.Buffer, (item.TypeName.Length / 2) as usize));
        items.push((name, typename));
        obuffer = obuffer.add(1);
    }
}
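// close the directory handle now that enumeration is done (NtClose is in ntapi::ntobapi)
NtClose(hdir);
Ok(items)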

There are quite a few things going on here. If NtQueryDirectoryObject fails, we break out of the loop. This happens when there is no more information to give. If there is data, buffer is cast to an OBJECT_DIRECTORY_INFORMATION pointer, and we can loop over the items that were returned. start is used to keep track of the previous number of items delivered. first is 1 (true) the first time through the loop to force NtQueryDirectoryObject to start from the beginning.

Once we have an item (item), its two members are extracted. item is of type OBJECT_DIRECTORY_INFORMATION and has two members: Name and TypeName (both UNICODE_STRING). Since we want to return standard Rust strings (which, by the way, are UTF-8 encoded), we must convert the UNICODE_STRINGs to Rust strings. String::from_utf16_lossy performs such a conversion, but we must specify the number of characters, because a UNICODE_STRING does not have to be NULL-terminated. The trick here is std::slice::from_raw_parts that can have a length, which is half of the number of bytes (Length member in UNICODE_STRING).
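The bytes-versus-characters detail is easy to get wrong, so here is the same idea as a small portable sketch, written in C++ to match the provider code from the previous post. CountedString is a hypothetical stand-in for UNICODE_STRING, and ToString is an illustrative helper, not an API from any of the crates or headers mentioned here.

```cpp
#include <cstdint>
#include <string>

// Portable stand-in for UNICODE_STRING: Length is in bytes, and the buffer
// is not required to end with a null terminator.
struct CountedString {
	std::uint16_t Length;          // size of the text in bytes
	std::uint16_t MaximumLength;   // capacity of the buffer in bytes
	const char16_t* Buffer;
};

// Copy exactly Length / 2 UTF-16 code units; never scan for a terminator.
std::u16string ToString(const CountedString& s) {
	return std::u16string(s.Buffer, s.Length / sizeof(char16_t));
}
```

A buffer holding the eight code units of “MutantXY” with a Length of 12 bytes converts to just “Mutant”, because only the first six code units belong to the string.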

Finally, Vec<>.push is called to add the tuple (name, typename) to the vector. This is what allows the compiler to infer the vector type. Once we exit the loop, the Ok variant of Result<> is returned with the vector.

The last function used is the helper to convert a Rust string slice to a UTF-16 null-terminated string:

fn string_to_wstring(s: &str) -> Vec<u16> {
    let mut wstring: Vec<_> = s.encode_utf16().collect();
    wstring.push(0);    // null terminator
    wstring
}

And that is it. The Rust version of objdir is functional.

The full source is at zodiacon/objdir-rs: Rust version of the objdir tool (github.com)

If you want to know more about Rust, consider signing up for my upcoming Rust programming masterclass.

Projected File System

A little-known feature in modern Windows is the ability to expose hierarchical data using the file system. This is called Windows Projected File System (ProjFS), available since Windows 10 version 1809. There is even a sample that exposes the Registry hierarchy using this technology. Using the file system as a “projection” mechanism provides a couple of advantages over a custom mechanism:

  • Any file viewing tool, such as Explorer or commands in a terminal, can present the information.
  • “Standard” file APIs are used, which are well-known, and available in any programming language or library.

Let’s see how to build a Projected File System provider from scratch. We’ll expose object manager directories as file system directories, and other types of objects as “files”. Normally, we can see the object manager’s namespace with dedicated tools, such as WinObj from Sysinternals, or my own Object Explorer:

WinObj showing parts of the object manager namespace

Here is an example of what we are aiming for (viewed with Explorer):

Explorer showing the root of the object manager namespace

First, support for ProjFS must be enabled to be usable. You can enable it with the Windows Features dialog or PowerShell:

Enable-WindowsOptionalFeature -Online -FeatureName Client-ProjFS -NoRestart

We’ll start by creating a C++ console application named ObjMgrProjFS; I’ve used the Windows Desktop Wizard project with a precompiled header (pch.h):

#pragma once

#include <Windows.h>
#include <projectedfslib.h>

#include <string>
#include <vector>
#include <memory>
#include <map>
#include <ranges>
#include <algorithm>
#include <format>
#include <optional>
#include <functional>

projectedfslib.h is where the ProjFS declarations reside. projectedfslib.lib is the import library to link against. In this post, I’ll focus on the main coding aspects, rather than going through every little piece of code. The full code can be found at https://github.com/zodiacon/objmgrprojfs. It’s of course possible to use other languages to implement a ProjFS provider. I’m going to attempt one in Rust in a future post 🙂

The projected file system must be rooted in a folder in the file system. It doesn’t have to be empty, but it makes sense to use such a directory for this purpose only. The main function will take the requested root folder as input and pass it to the ObjectManagerProjection class that is used to manage everything:

int wmain(int argc, const wchar_t* argv[]) {
	if (argc < 2) {
		printf("Usage: ObjMgrProjFS <root_dir>\n");
		return 0;
	}

	ObjectManagerProjection omp;
	if (auto hr = omp.Init(argv[1]); hr != S_OK)
		return Error(hr);

	if (auto hr = omp.Start(); hr != S_OK)
		return Error(hr);

	printf("Virtualizing at %ws. Press ENTER to stop virtualizing...\n", argv[1]);
	char buffer[3];
	gets_s(buffer);

	omp.Term();

	return 0;
}

Let’s start with the initialization. We want to create the requested directory (if it doesn’t already exist). If it does exist, we’ll use it. In fact, it could exist because of a previous run of the provider, so we can keep track of the instance ID (a GUID) so that the file system itself can use its caching capabilities. We’ll “hide” the GUID in a hidden file within the directory. First, create the directory:

HRESULT ObjectManagerProjection::Init(PCWSTR root) {
	GUID instanceId = GUID_NULL;
	std::wstring instanceFile(root);
	instanceFile += L"\\_obgmgrproj.guid";

	if (!::CreateDirectory(root, nullptr)) {
		//
		// failed, does it exist?
		//
		if (::GetLastError() != ERROR_ALREADY_EXISTS)
			return HRESULT_FROM_WIN32(::GetLastError());

If creation fails for any reason other than the directory already existing, bail out with an error. Otherwise, get the instance ID that may be there and use that GUID if present:

	auto hFile = ::CreateFile(instanceFile.c_str(), GENERIC_READ, 
		FILE_SHARE_READ, nullptr, OPEN_EXISTING, 0, nullptr);
	if (hFile != INVALID_HANDLE_VALUE) {
		if (::GetFileSize(hFile, nullptr) == sizeof(GUID)) {
			DWORD ret;
			::ReadFile(hFile, &instanceId, sizeof(instanceId), &ret, nullptr);
		}
		// close the handle even if the file has an unexpected size
		::CloseHandle(hFile);
	}
}

If we need to generate a new GUID, we’ll do that with CoCreateGuid and write it to the hidden file:

if (instanceId == GUID_NULL) {
	::CoCreateGuid(&instanceId);
	//
	// write instance ID
	//
	auto hFile = ::CreateFile(instanceFile.c_str(), GENERIC_WRITE, 0, nullptr, CREATE_NEW, FILE_ATTRIBUTE_HIDDEN, nullptr);
	if (hFile != INVALID_HANDLE_VALUE) {
		DWORD ret;
		::WriteFile(hFile, &instanceId, sizeof(instanceId), &ret, nullptr);
		::CloseHandle(hFile);
	}
}

Finally, we must register the root with ProjFS:

auto hr = ::PrjMarkDirectoryAsPlaceholder(root, nullptr, nullptr, &instanceId);
if (FAILED(hr))
	return hr;

m_RootDir = root;
return hr;

Once Init succeeds, we need to start the actual virtualization. To that end, a structure of callbacks must be filled so that ProjFS knows what functions to call to get the information requested by the file system. This is the job of the Start method:

HRESULT ObjectManagerProjection::Start() {
	PRJ_CALLBACKS cb{};
	cb.StartDirectoryEnumerationCallback = StartDirectoryEnumerationCallback;
	cb.EndDirectoryEnumerationCallback = EndDirectoryEnumerationCallback;
	cb.GetDirectoryEnumerationCallback = GetDirectoryEnumerationCallback;
	cb.GetPlaceholderInfoCallback = GetPlaceholderInformationCallback;
	cb.GetFileDataCallback = GetFileDataCallback;

	auto hr = ::PrjStartVirtualizing(m_RootDir.c_str(), &cb, this, nullptr, &m_VirtContext);
	return hr;
}

The callbacks specified above are the absolute minimum required for a valid provider. PrjStartVirtualizing returns a virtualization context that identifies our provider, which we need to use (at least) when stopping virtualization. The call does not block: it returns once virtualization has started, and ProjFS then invokes the callbacks on its own threads, which is why main simply waits for ENTER before calling Term. The this value passed in is a user-defined context. We’ll use that to delegate these static callback functions to member functions. Here is the code for StartDirectoryEnumerationCallback:

HRESULT ObjectManagerProjection::StartDirectoryEnumerationCallback(const PRJ_CALLBACK_DATA* callbackData, const GUID* enumerationId) {
	return ((ObjectManagerProjection*)callbackData->InstanceContext)->DoStartDirectoryEnumerationCallback(callbackData, enumerationId);
}
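This delegation pattern can be tried in isolation. The sketch below is portable and uses a hypothetical C-style callback shape (CallbackData, Callback, and Provider are illustrative names, not ProjFS types); it only shows how a static function recovers the object from an opaque context pointer and forwards the call to a member function.

```cpp
#include <string>

// Hypothetical C-style callback API: the library only knows a function
// pointer and an opaque context pointer supplied at registration time.
struct CallbackData {
	void* InstanceContext;
	const char* Path;
};
using Callback = int (*)(const CallbackData*);

class Provider {
public:
	// Static function: its type matches the plain function pointer.
	static int OnRequest(const CallbackData* data) {
		// Recover the object from the context and forward the call.
		return static_cast<Provider*>(data->InstanceContext)->DoOnRequest(data);
	}

	int RequestCount() const { return m_Requests; }
	const std::string& LastPath() const { return m_LastPath; }

private:
	// Member function: free to use the object's state.
	int DoOnRequest(const CallbackData* data) {
		m_LastPath = data->Path;
		return ++m_Requests;
	}

	int m_Requests = 0;
	std::string m_LastPath;
};
```

The static function carries no state of its own, so one function pointer can serve any number of Provider instances, each identified by its context pointer.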

The same trick is used for the other callbacks, so that we can implement the functionality within our class. The class ObjectManagerProjection itself holds on to the following data members of interest:

struct GUIDComparer {
	bool operator()(const GUID& lhs, const GUID& rhs) const {
		return memcmp(&lhs, &rhs, sizeof(rhs)) < 0;
	}
};

struct EnumInfo {
	std::vector<ObjectNameAndType> Objects;
	int Index{ -1 };
};
std::wstring m_RootDir;
PRJ_NAMESPACE_VIRTUALIZATION_CONTEXT m_VirtContext;
std::map<GUID, EnumInfo, GUIDComparer> m_Enumerations;

EnumInfo is a structure used to keep an object directory’s contents and the current index requested by the file system. A map is used to keep track of all current enumerations. Remember, it’s the file system – multiple directory listings may be happening at the same time. As it happens, each one is identified by a GUID, which is why it’s used as a key to the map. m_VirtContext is the returned value from PrjStartVirtualizing.
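For readers who want to experiment with the map outside of Windows, here is a minimal sketch under a few assumptions: Guid is a portable stand-in with the same 16-byte layout as the Windows GUID, and EnumInfo holds plain strings instead of ObjectNameAndType. The byte-wise memcmp ordering is arbitrary, but it is consistent, which is all std::map needs from a comparer.

```cpp
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Portable stand-in for the Windows GUID structure (16 bytes, no padding).
struct Guid {
	std::uint32_t Data1;
	std::uint16_t Data2;
	std::uint16_t Data3;
	std::uint8_t Data4[8];
};
static_assert(sizeof(Guid) == 16, "Guid must have no padding for memcmp");

// Byte-wise ordering: the order itself is meaningless, but it is a
// consistent strict weak ordering, so it works as a map comparer.
struct GuidComparer {
	bool operator()(const Guid& lhs, const Guid& rhs) const {
		return std::memcmp(&lhs, &rhs, sizeof(rhs)) < 0;
	}
};

// Simplified enumeration state: object names plus the next index to return.
struct EnumInfo {
	std::vector<std::string> Objects;
	int Index = -1;
};

using Enumerations = std::map<Guid, EnumInfo, GuidComparer>;
```

Two enumerations with different IDs get independent EnumInfo entries, so advancing the index of one never affects the other.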

ObjectNameAndType is a little structure that stores the details of an object, its name and type:

struct ObjectNameAndType {
	std::wstring Name;
	std::wstring TypeName;
};

The Callbacks

Obviously, the bulk of the provider’s work is centered in the callbacks. Let’s start with StartDirectoryEnumerationCallback. Its purpose is to let the provider know that a new directory enumeration of some sort is beginning. The provider can make any necessary preparations. In our case, it’s about adding a new enumeration structure to manage based on the provided enumeration GUID:

HRESULT ObjectManagerProjection::DoStartDirectoryEnumerationCallback(const PRJ_CALLBACK_DATA* callbackData, const GUID* enumerationId) {
	EnumInfo info;
	m_Enumerations.insert({ *enumerationId, std::move(info) });
	return S_OK;
}

We just add a new entry to our map, since we must be able to distinguish between multiple enumerations that may be happening concurrently. The complementary callback ends an enumeration which is where we delete the item from the map:

HRESULT ObjectManagerProjection::DoEndDirectoryEnumerationCallback(const PRJ_CALLBACK_DATA* callbackData, const GUID* enumerationId) {
	m_Enumerations.erase(*enumerationId);
	return S_OK;
}

So far, so good. The real work is centered around the GetDirectoryEnumerationCallback callback where actual enumeration must take place. The callback receives the enumeration ID and a search expression – the client may try to search using functions such as FindFirstFile / FindNextFile or similar APIs. The provided PRJ_CALLBACK_DATA contains the basic details of the request such as the relative directory itself (which could be a subdirectory). First, we reject any unknown enumeration IDs:

HRESULT ObjectManagerProjection::DoGetDirectoryEnumerationCallback(
	const PRJ_CALLBACK_DATA* callbackData, const GUID* enumerationId, 
	PCWSTR searchExpression, PRJ_DIR_ENTRY_BUFFER_HANDLE dirEntryBufferHandle) {

	auto it = m_Enumerations.find(*enumerationId); 
	if(it == m_Enumerations.end())
		return E_INVALIDARG;
	auto& info = it->second;

Next, we need to enumerate the objects in the provided directory, taking into consideration the search expression (that may require returning a subset of the items):

	if (info.Index < 0 || (callbackData->Flags & PRJ_CB_DATA_FLAG_ENUM_RESTART_SCAN)) {
		auto compare = [&](auto name) {
			return ::PrjFileNameMatch(name, searchExpression);
			};
		info.Objects = ObjectManager::EnumDirectoryObjects(callbackData->FilePathName, nullptr, compare);
		std::ranges::sort(info.Objects, [](auto const& item1, auto const& item2) { 
			return ::PrjFileNameCompare(item1.Name.c_str(), item2.Name.c_str()) < 0; 
			});
		info.Index = 0;
	}

There are quite a few things happening here. ObjectManager::EnumDirectoryObjects is a helper function that does the actual enumeration of objects in the object manager’s namespace given the root directory (callbackData->FilePathName), which is always relative to the virtualization root, which is convenient – we don’t need to care where the actual root is. The compare lambda is passed to EnumDirectoryObjects to provide a filter based on the search expression. ProjFS provides the PrjFileNameMatch function we can use to test if a specific name should be returned or not. It has the logic that caters for wildcards like * and ?.
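To get a feel for what such a filter does, here is a simplified, portable wildcard matcher. It is only a stand-in for illustration: WildcardMatch is a hypothetical function that supports just * and ? with a basic case-insensitive comparison, whereas the real PrjFileNameMatch covers more cases and should be the one used in a provider.

```cpp
#include <cstddef>
#include <cwctype>
#include <string>

// Simplified stand-in for PrjFileNameMatch: supports only '*' (any run of
// characters) and '?' (exactly one character), compared case-insensitively.
bool WildcardMatch(const std::wstring& name, const std::wstring& pattern) {
	std::size_t n = 0, p = 0;
	std::size_t starPos = std::wstring::npos;   // position of the last '*' seen
	std::size_t resume = 0;                     // where in name to retry from
	while (n < name.size()) {
		if (p < pattern.size() && pattern[p] == L'*') {
			starPos = p++;                      // remember the star; first try matching nothing
			resume = n;
		}
		else if (p < pattern.size() && (pattern[p] == L'?' ||
			std::towlower(pattern[p]) == std::towlower(name[n]))) {
			++n; ++p;                           // single character matched
		}
		else if (starPos != std::wstring::npos) {
			p = starPos + 1;                    // backtrack: the star absorbs one more character
			n = ++resume;
		}
		else {
			return false;
		}
	}
	while (p < pattern.size() && pattern[p] == L'*')
		++p;                                    // trailing stars match the empty string
	return p == pattern.size();
}
```

With this sketch, a pattern such as p* accepts PowerPort and PrjFltPort but rejects Sessions, which mirrors the dir p* style of query a client might issue.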

Once the results return in a vector (info.Objects), we must sort it. The file system expects returned files/directories to be sorted in a case insensitive way, but we don’t actually need to know that. PrjFileNameCompare is provided as a function to use for sorting purposes. We call sort on the returned vector passing this function PrjFileNameCompare as the compare function.
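The adaptation from a three-way comparison to the “less than” predicate a sort needs can be sketched portably as follows. CompareNoCase is a hypothetical approximation of PrjFileNameCompare (a simple case-insensitive comparison), and SortNames is an illustrative helper; the real function implements whatever ordering the file system expects.

```cpp
#include <algorithm>
#include <cstddef>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical stand-in for PrjFileNameCompare: returns a negative value,
// zero, or a positive value using a simple case-insensitive comparison.
int CompareNoCase(const std::wstring& a, const std::wstring& b) {
	std::size_t count = std::min(a.size(), b.size());
	for (std::size_t i = 0; i < count; i++) {
		auto ca = std::towupper(a[i]);
		auto cb = std::towupper(b[i]);
		if (ca != cb)
			return ca < cb ? -1 : 1;
	}
	if (a.size() == b.size())
		return 0;
	return a.size() < b.size() ? -1 : 1;    // the shorter name sorts first
}

// Adapt the three-way result to the boolean "less than" that std::sort needs.
void SortNames(std::vector<std::wstring>& names) {
	std::sort(names.begin(), names.end(),
		[](const std::wstring& x, const std::wstring& y) { return CompareNoCase(x, y) < 0; });
}
```

The important part is the < 0 in the lambda: returning the raw comparison result would treat every non-equal pair as “less than” and break the sort.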

The enumeration must happen if the PRJ_CB_DATA_FLAG_ENUM_RESTART_SCAN is specified. I also enumerate if it’s the first call for this enumeration ID.

Now that we have results (or an empty vector), we can proceed by telling ProjFS about the results. If we have no results, just return success (an empty directory):

if (info.Objects.empty())
	return S_OK;

Otherwise, we must call PrjFillDirEntryBuffer for each entry in the results. However, ProjFS provides a limited buffer to accept data, which means we need to keep track of where we left off because we may be called again (without the PRJ_CB_DATA_FLAG_ENUM_RESTART_SCAN flag) to continue filling in data. This is why we keep track of the index we need to use.

The first step in the loop is to fill in details of the item: is it a subdirectory or a “file”? We can also specify the size of its data and common times like creation time, modify time, etc.:

while (info.Index < info.Objects.size()) {
	PRJ_FILE_BASIC_INFO itemInfo{};
	auto& item = info.Objects[info.Index];
	itemInfo.IsDirectory = item.TypeName == L"Directory";
	itemInfo.FileSize = itemInfo.IsDirectory ? 0 : 
		GetObjectSize((callbackData->FilePathName + std::wstring(L"\\") + item.Name).c_str(), item);

We fill in two details: whether the item is a directory, based on the kernel object type being “Directory”, and a file size (for any other type of object). What is the meaning of a “file size”? It can mean whatever we want it to mean, including just specifying a size of zero. However, I decided that the “data” being held in an object would be text that provides the object’s name, type, and target (if it’s a symbolic link). Here are a few examples when running the provider and using a command window:

C:\objectmanager>dir p*
Volume in drive C is OS
Volume Serial Number is 18CF-552E

Directory of C:\objectmanager

02/20/2024 11:09 AM 60 PdcPort.ALPC Port
02/20/2024 11:09 AM 76 PendingRenameMutex.Mutant
02/20/2024 11:09 AM 78 PowerMonitorPort.ALPC Port
02/20/2024 11:09 AM 64 PowerPort.ALPC Port
02/20/2024 11:09 AM 88 PrjFltPort.FilterConnectionPort
5 File(s) 366 bytes
0 Dir(s) 518,890,110,976 bytes free

C:\objectmanager>type PendingRenameMutex.Mutant
Name: PendingRenameMutex
Type: Mutant

C:\objectmanager>type powerport
Name: PowerPort
Type: ALPC Port

Here is PRJ_FILE_BASIC_INFO:

typedef struct PRJ_FILE_BASIC_INFO {
    BOOLEAN IsDirectory;
    INT64 FileSize;
    LARGE_INTEGER CreationTime;
    LARGE_INTEGER LastAccessTime;
    LARGE_INTEGER LastWriteTime;
    LARGE_INTEGER ChangeTime;
    UINT32 FileAttributes;
} PRJ_FILE_BASIC_INFO;

What is the meaning of the various times and file attributes? They can mean whatever you want; they might make sense for some types of data. If left at zero, the current time is used.

GetObjectSize is a helper function that calculates the number of bytes needed to keep the object’s text, which is what is reported to the file system.
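The actual helper is in the repository; the sketch below is only my guess at an equivalent calculation, under the assumption that the text is UTF-16 with one labeled line per detail. ObjectText and ObjectSize are hypothetical names, and symbolic link targets are left out. The assumption is at least consistent with the sizes in the dir listing above, such as 76 bytes for PendingRenameMutex.Mutant.

```cpp
#include <cstddef>
#include <string>

// Hypothetical rendering of an object's "file" contents. The exact layout
// used by the provider is an assumption: two labeled lines of UTF-16 text.
std::u16string ObjectText(const std::u16string& name, const std::u16string& typeName) {
	return u"Name: " + name + u"\n" + u"Type: " + typeName + u"\n";
}

// The reported file size is the size of that text in bytes
// (two bytes per UTF-16 code unit).
std::size_t ObjectSize(const std::u16string& name, const std::u16string& typeName) {
	return ObjectText(name, typeName).size() * sizeof(char16_t);
}
```

Whatever the layout, the size reported during enumeration has to agree with the data later returned for the file, which is why a single helper should produce both.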

Now we can pass the information for the item to ProjFS by calling PrjFillDirEntryBuffer:

	if (FAILED(::PrjFillDirEntryBuffer(
		(itemInfo.IsDirectory ? item.Name : (item.Name + L"." + item.TypeName)).c_str(), 
		&itemInfo, dirEntryBufferHandle)))
		break;
	info.Index++;
}

The “name” of the item is composed of the kernel object’s name, and the “file extension” is the object’s type name. This is just a matter of choice – I could have passed the object’s name only so that it would appear as a file with no extension. If the call to PrjFillDirEntryBuffer fails, it means the buffer is full, so we break out, but the index is not incremented, so we can provide the next object in the next callback that does not require a rescan.
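The resume logic is easy to get wrong, so here is a small self-contained sketch of it. EntryBuffer and EnumSession are hypothetical stand-ins (the real buffer is owned by ProjFS and filled through PrjFillDirEntryBuffer); the only point is that the index advances after a successful add, never before:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the ProjFS directory entry buffer: it accepts
// entries until its capacity is reached, then rejects further ones.
struct EntryBuffer {
    std::size_t capacity;
    std::vector<std::string> entries;

    bool TryAdd(const std::string& name) {
        if (entries.size() == capacity)
            return false;   // buffer full
        entries.push_back(name);
        return true;
    }
};

// Per-enumeration state kept between callbacks.
struct EnumSession {
    std::vector<std::string> items;
    std::size_t index = 0;   // next item to hand out
};

// One "callback": adds items starting at session.index. The index advances
// only after a successful add, so a rejected item is retried next time.
void FillEntries(EnumSession& session, EntryBuffer& buffer) {
    assert(buffer.capacity > 0);   // an empty buffer could never make progress
    while (session.index < session.items.size()) {
        if (!buffer.TryAdd(session.items[session.index]))
            break;
        session.index++;
    }
}
```

Calling FillEntries repeatedly with fresh buffers hands out every item exactly once, which is the behavior the provider needs across callbacks that do not request a rescan.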

We have two callbacks remaining. One is GetPlaceholderInformationCallback, whose purpose is to provide “placeholder” information about an item, without providing its data. This is used by the file system for caching purposes. The implementation is like so:

HRESULT ObjectManagerProjection::DoGetPlaceholderInformationCallback(const PRJ_CALLBACK_DATA* callbackData) {
	auto path = callbackData->FilePathName;
	auto dir = ObjectManager::DirectoryExists(path);
	std::optional<ObjectNameAndType> object;
	if (!dir)
		object = ObjectManager::ObjectExists(path);
	if(!dir && !object)
		return HRESULT_FROM_WIN32(ERROR_FILE_NOT_FOUND);

	PRJ_PLACEHOLDER_INFO info{};
	info.FileBasicInfo.IsDirectory = dir;
	info.FileBasicInfo.FileSize = dir ? 0 : GetObjectSize(path, object.value());
	return PrjWritePlaceholderInfo(m_VirtContext, callbackData->FilePathName, &info, sizeof(info));
}

The item could be a file or a directory. We use the file path name provided to figure out if it’s a directory kernel object or something else by utilizing some helpers in the ObjectManager class (we’ll examine those later). Then the structure PRJ_PLACEHOLDER_INFO is filled with the details and provided to PrjWritePlaceholderInfo.

The final required callback is the one that provides the data for files – objects in our case:

HRESULT ObjectManagerProjection::DoGetFileDataCallback(const PRJ_CALLBACK_DATA* callbackData, UINT64 byteOffset, UINT32 length) {
	auto object = ObjectManager::ObjectExists(callbackData->FilePathName);
	if (!object)
		return HRESULT_FROM_WIN32(ERROR_FILE_NOT_FOUND);

	auto buffer = ::PrjAllocateAlignedBuffer(m_VirtContext, length);
	if (!buffer)
		return E_OUTOFMEMORY;

	auto data = GetObjectData(callbackData->FilePathName, object.value());
	memcpy(buffer, (PBYTE)data.c_str() + byteOffset, length);
	auto hr = ::PrjWriteFileData(m_VirtContext, &callbackData->DataStreamId, buffer, byteOffset, length);
	::PrjFreeAlignedBuffer(buffer);

	return hr;
}

First we check if the object’s path is valid. Next, we need to allocate a buffer for the data. There are some ProjFS alignment requirements, so we call PrjAllocateAlignedBuffer to allocate a properly-aligned buffer. Then we get the object data (a string, by calling our helper GetObjectData), and copy it into the allocated buffer. Finally, we pass the buffer to PrjWriteFileData and free the buffer. The byte offset provided is usually zero, but could theoretically be larger if the client reads from a non-zero position, so we must be prepared for it. In our case, the data is small, but in general it could be arbitrarily large.
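Being prepared for a non-zero offset mostly means being careful with the arithmetic. The following is a defensive sketch, assuming the provider wants to guard against a request that extends past the end of its data (the post does not say whether ProjFS ever issues such a request). ReadSlice is a hypothetical helper that works on plain bytes and is not part of the sample:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Returns up to `length` bytes of `data`, starting at `byteOffset`.
// Fewer bytes (possibly none) are returned when the request runs past the end.
std::vector<unsigned char> ReadSlice(const std::string& data,
                                     std::uint64_t byteOffset,
                                     std::uint32_t length) {
    if (byteOffset >= data.size())
        return {};   // nothing to read at or past the end
    const std::size_t offset = static_cast<std::size_t>(byteOffset);
    // clamp the count so we never read beyond the available bytes
    const std::size_t count = std::min<std::size_t>(data.size() - offset, length);
    const unsigned char* first =
        reinterpret_cast<const unsigned char*>(data.data()) + offset;
    return std::vector<unsigned char>(first, first + count);
}
```

The same clamping could be applied before the memcpy in the callback, with the clamped count passed on instead of the requested length.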

GetObjectData itself looks like this:

std::wstring ObjectManagerProjection::GetObjectData(PCWSTR fullname, ObjectNameAndType const& info) {
	std::wstring target;
	if (info.TypeName == L"SymbolicLink") {
		target = ObjectManager::GetSymbolicLinkTarget(fullname);
	}
	auto result = std::format(L"Name: {}\nType: {}\n", info.Name, info.TypeName);
	if (!target.empty())
		result = std::format(L"{}Target: {}\n", result, target);
	return result;
}

It calls a helper function, ObjectManager::GetSymbolicLinkTarget, in case of a symbolic link, and builds the final string by using std::format (C++20) before returning it to the caller.

That’s all for the provider, except when terminating:

void ObjectManagerProjection::Term() {
	::PrjStopVirtualizing(m_VirtContext);
}

The Object Manager

Looking into the ObjectManager helper class is somewhat out of the focus of this post, since it has nothing to do with ProjFS. It uses native APIs to enumerate objects in the object manager’s namespace and get details of a symbolic link’s target. For more information about the native APIs, check out my book “Windows Native API Programming” or search online. First, it includes <Winternl.h> to get some basic native functions like RtlInitUnicodeString, and also adds the APIs for directory objects:

typedef struct _OBJECT_DIRECTORY_INFORMATION {
	UNICODE_STRING Name;
	UNICODE_STRING TypeName;
} OBJECT_DIRECTORY_INFORMATION, * POBJECT_DIRECTORY_INFORMATION;

#define DIRECTORY_QUERY  0x0001

extern "C" {
	NTSTATUS NTAPI NtOpenDirectoryObject(
		_Out_ PHANDLE hDirectory,
		_In_ ACCESS_MASK AccessMask,
		_In_ POBJECT_ATTRIBUTES ObjectAttributes);

	NTSTATUS NTAPI NtQuerySymbolicLinkObject(
		_In_ HANDLE LinkHandle,
		_Inout_ PUNICODE_STRING LinkTarget,
		_Out_opt_ PULONG ReturnedLength);

	NTSTATUS NTAPI NtQueryDirectoryObject(
		_In_  HANDLE hDirectory,
		_Out_ POBJECT_DIRECTORY_INFORMATION DirectoryEntryBuffer,
		_In_  ULONG DirectoryEntryBufferSize,
		_In_  BOOLEAN  bOnlyFirstEntry,
		_In_  BOOLEAN bFirstEntry,
		_Inout_ PULONG  EntryIndex,
		_Out_ PULONG  BytesReturned);

	NTSTATUS NTAPI NtOpenSymbolicLinkObject(
		_Out_  PHANDLE LinkHandle,
		_In_   ACCESS_MASK DesiredAccess,
		_In_   POBJECT_ATTRIBUTES ObjectAttributes);
}

Here is the main code that enumerates directory objects (some details omitted for clarity, see the full source code in the Github repo):

std::vector<ObjectNameAndType> ObjectManager::EnumDirectoryObjects(PCWSTR path, 
	PCWSTR objectName, std::function<bool(PCWSTR)> compare) {
	std::vector<ObjectNameAndType> objects;
	HANDLE hDirectory;
	OBJECT_ATTRIBUTES attr;
	UNICODE_STRING name;
	std::wstring spath(path);
	if (spath[0] != L'\\')
		spath = L'\\' + spath;

	std::wstring object(objectName ? objectName : L"");

	RtlInitUnicodeString(&name, spath.c_str());
	InitializeObjectAttributes(&attr, &name, 0, nullptr, nullptr);
	if (!NT_SUCCESS(NtOpenDirectoryObject(&hDirectory, DIRECTORY_QUERY, &attr)))
		return objects;

	objects.reserve(128);
	BYTE buffer[1 << 12];
	auto info = reinterpret_cast<OBJECT_DIRECTORY_INFORMATION*>(buffer);
	bool first = true;
	ULONG size, index = 0;
	for (;;) {
		auto start = index;
		if (!NT_SUCCESS(NtQueryDirectoryObject(hDirectory, info, sizeof(buffer), FALSE, first, &index, &size)))
			break;
		first = false;
		for (ULONG i = 0; i < index - start; i++) {
			ObjectNameAndType data;
			auto& p = info[i];
			data.Name = std::wstring(p.Name.Buffer, p.Name.Length / sizeof(WCHAR));
			if(compare && !compare(data.Name.c_str()))
				continue;
			data.TypeName = std::wstring(p.TypeName.Buffer, p.TypeName.Length / sizeof(WCHAR));
			if (!objectName) {
				// no specific object requested: collect everything
				objects.push_back(std::move(data));
				continue;
			}
			// a specific object was requested: match "name" or "name.type"
			if (_wcsicmp(object.c_str(), data.Name.c_str()) == 0 || 
				_wcsicmp(object.c_str(), (data.Name + L"." + data.TypeName).c_str()) == 0) {
				objects.push_back(std::move(data));
				break;
			}
		}
	}
	::CloseHandle(hDirectory);
	return objects;
}

NtQueryDirectoryObject is called in a loop with increasing indices until it fails. The returned details for each entry are the object’s name and type name.
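The matching rule inside the loop is what makes both “type PendingRenameMutex.Mutant” and “type powerport” work in the earlier command window examples: a requested name matches either the bare object name or the name with the type appended, ignoring case. Isolated from the native API calls, the rule can be sketched like this (EqualsNoCase and MatchesObject are hypothetical names, and std::towlower stands in for _wcsicmp so the snippet is portable):

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>

// Case-insensitive equality for wide strings
// (a portable stand-in for _wcsicmp(a, b) == 0).
bool EqualsNoCase(const std::wstring& a, const std::wstring& b) {
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); i++)
        if (std::towlower(a[i]) != std::towlower(b[i]))
            return false;
    return true;
}

// True if `requested` refers to the object, either as "name" or "name.type".
bool MatchesObject(const std::wstring& requested,
                   const std::wstring& name,
                   const std::wstring& typeName) {
    return EqualsNoCase(requested, name) ||
           EqualsNoCase(requested, name + L"." + typeName);
}
```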

Here is how to get a symbolic link’s target:

std::wstring ObjectManager::GetSymbolicLinkTarget(PCWSTR path) {
	std::wstring spath(path);
	if (spath[0] != L'\\')
		spath = L"\\" + spath;

	HANDLE hLink;
	OBJECT_ATTRIBUTES attr;
	std::wstring target;
	UNICODE_STRING name;
	RtlInitUnicodeString(&name, spath.c_str());
	InitializeObjectAttributes(&attr, &name, 0, nullptr, nullptr);
	if (NT_SUCCESS(NtOpenSymbolicLinkObject(&hLink, GENERIC_READ, &attr))) {
		WCHAR buffer[1 << 10];
		UNICODE_STRING result;
		result.Buffer = buffer;
		result.MaximumLength = sizeof(buffer);
		if (NT_SUCCESS(NtQuerySymbolicLinkObject(hLink, &result, nullptr)))
			target.assign(result.Buffer, result.Length / sizeof(WCHAR));
		::CloseHandle(hLink);
	}
	return target;
}

See the full source code at https://github.com/zodiacon/ObjMgrProjFS.

Conclusion

The example provided is the bare minimum needed to write a ProjFS provider. This could be interesting for various types of data that are convenient to access with I/O APIs. Feel free to extend the example and resolve any bugs.

Rust Programming Masterclass Training

Unless you’ve been living under a rock for the past several years (and you are a software developer), the Rust programming language is hard to ignore – in fact, it’s been voted as the “most loved” language for several years (whatever that means). Rust provides the power and performance of C++ with full memory and concurrency safety. It’s a systems programming language, but has high-level features like functional programming style and modularity. That said, Rust has a relatively steep learning curve compared to other mainstream languages.

I’m happy to announce a new training class – Rust Programming Masterclass. This is a brand new, 4 day class, split into 8 half-days, that covers all the foundational pieces of Rust. Here is the list of modules:

  • Module 1: Introduction to Rust
  • Module 2: Language Fundamentals
  • Module 3: Ownership
  • Module 4: Compound Types
  • Module 5: Common Types and Collections
  • Module 6: Modules and Project Management
  • Module 7: Error Handling
  • Module 8: Generics and Traits
  • Module 9: Smart Pointers
  • Module 10: Functional Programming
  • Module 11: Threads and Concurrency
  • Module 12: Async and Await
  • Module 13: Unsafe Rust and Interoperability
  • Module 14: Macros
  • Module 15: Lifetimes

Dates are listed below. The times are 11am-3pm EST (8am-12pm PST) (4pm-8pm UT)
March: 25, 27, 29, April: 1, 3, 5, 8, 10.

Cost: 850 USD (if paid by an individual), 1500 USD if paid by a company. Previous students in my classes get 10% off.

Special bonus for this course: anyone registering gets a 50% discount to any two courses at https://training.trainsec.net.

Registration

If you’d like to register, please send me an email to zodiacon@live.com and provide your full name, company (if any), preferred contact email, and your time zone.

The sessions will be recorded, so you can watch any part you may be missing, or that may be somewhat overwhelming in “real time”.

As usual, if you have any questions, feel free to send me an email, or DM on X (twitter) or Linkedin.

x64 Architecture and Programming Class

I promised this class a while back, and now it is happening. This is a brand new, 3 day class, split into 6 half-days, that covers the x64 processor architecture, programming in general, and programming in the context of Windows. The syllabus can be found here. It may change a bit, but should mostly be stable.

Dates are listed below. The times are 12pm-4pm EST (9am-1pm PST) (5pm-9pm UT)
January: 15, 17, 22, 24, 29, 31.

Cost: 750 USD (if paid by an individual), 1400 USD if paid by a company.

Registration

If you’d like to register, please send me an email to zodiacon@live.com and provide your full name, company (if any), preferred contact email, and your time zone. Previous participants in my classes get 10% off.

The sessions will be recorded, so you can watch any part you may be missing, or that may be somewhat overwhelming in “real time”.

As usual, if you have any questions, feel free to send me an email, or DM on X (twitter) or Linkedin.

Kernel Programming MasterClass

It’s been a while since I have taught a public class. I am happy to launch a new class that combines Windows Kernel Programming and Advanced Windows Kernel Programming into a 6-day (48 hours) masterclass. The full syllabus can be found here.

There is a special bonus for those registering for this class: you get one free recorded course from Windows Internals and Programming (trainsec.net)!

For those who have attended the Windows Kernel Programming class, and wish to capture the more “advanced” stuff, I offer one of two options:

  • Join the second part (3 days) of the training, at 60% of the entire course cost.
  • Register for the entire course with a 20% discount, and get the free recorded course.

The course is planned to stretch from mid-December to late January, in 4-hour chunks to make it easier to combine with other activities and also have the time to do lab exercises (very important for truly understanding the material). Yes, I know Christmas is in the middle there, I’ll keep the last week of December free 🙂

The course will be conducted remotely using MS Teams or similar.

Dates and times (not final, but unlikely to change much, if at all):

  • Dec 2023: 12, 14, 19, 21: 12pm-4pm EST (9am-1pm PST)
  • Jan 2024: 2, 4, 9, 11, 16, 18, 23, 25: 12pm-4pm EST (9am-1pm PST)

Training cost:

  • Early bird (until Nov 22): 1150 USD
  • After Nov 22: 1450 USD

If you’d like to register, please write to zodiacon@live.com with your name, company name (if any), and time zone. If you have any questions, use the same email or DM me on X (Twitter) or Linkedin.

Windows Hook Events

Many developers and researchers are familiar with the SetWindowsHookEx API that provides ways to intercept certain operations related to the user interface, such as messages targeting windows. Most of these hooks can be set on a specific thread, or all threads attached to the current desktop. A short video showing how to use this API can be found here. One of the options is to inject a DLL to the target process(es) that is invoked inline to process the relevant events.

There is another mechanism, less known, that provides various events that relate to UI, that can similarly be processed by a callback. This can be attached to a specific thread or process, or to all processes that have threads attached to the current desktop. The API in question is SetWinEventHook:

HWINEVENTHOOK SetWinEventHook(
    _In_ DWORD eventMin,
    _In_ DWORD eventMax,
    _In_opt_ HMODULE hmodWinEventProc,
    _In_ WINEVENTPROC pfnWinEventProc,
    _In_ DWORD idProcess,
    _In_ DWORD idThread,
    _In_ DWORD dwFlags);

The function allows invoking a callback (pfnWinEventProc) when an event occurs. eventMin and eventMax provide a simple way to filter events. If all events are needed, EVENT_MIN and EVENT_MAX can be used to cover every possible event. The module is needed if the function is inside a DLL, so that hmodWinEventProc is the module handle loaded into the calling process. The DLL will automatically be loaded into target process(es) as needed, very similar to the way SetWindowsHookEx works.

idProcess and idThread allow targeting a specific thread, a specific process, or all processes in the current desktop (if both IDs are zero). Targeting all processes is possible even without a DLL. In that case, the event information is marshalled back to the caller’s process and invoked there. This requires passing the WINEVENT_OUTOFCONTEXT flag. The following example shows how to install such event monitoring for all processes/threads in the current desktop:

auto hHook = ::SetWinEventHook(EVENT_MIN, EVENT_MAX, nullptr, 
    OnEvent, 0, 0,
	WINEVENT_OUTOFCONTEXT | 
    WINEVENT_SKIPOWNPROCESS | WINEVENT_SKIPOWNTHREAD);

::GetMessage(nullptr, nullptr, 0, 0);

The last two flags indicate that events from the caller’s process should not be reported. Notice the weird-looking GetMessage call – it’s required for the event handler to be called. The weird part is that a MSG structure is not needed, contrary to the function’s SAL that requires a non-NULL pointer.

The event handler itself can do anything, however, the information provided is fundamentally different from SetWindowsHookEx callbacks. For example, there is no way to “change” anything – it’s just notifying about things that already happened. These events are related to accessibility and are not directly related to windows messaging. Here is the event handler prototype:

void CALLBACK OnEvent(HWINEVENTHOOK hWinEventHook, DWORD event, 
    HWND hwnd, LONG idObject, LONG idChild, DWORD eventTid, DWORD time);

event is the event being reported. Various such events are defined in WinUser.h and there are many values that can be used by third parties and OEMs. It’s worthwhile checking the header file because every Microsoft-defined event has details as to when such an event is raised, and the meaning of idObject, idChild and hwnd for that event. eventTid is the thread ID from which the event originated. hwnd is typically the window or control associated with the event (if any) – some events are general enough so that no hwnd is provided.

We can get more information on the object that is associated with the event by tapping into the accessibility API. Accessibility objects implement the IAccessible COM interface at least, but may implement other interfaces as well. To get an IAccessible pointer from an event handler, we can use AccessibleObjectFromEvent:

CComPtr<IAccessible> spAcc;
CComVariant child;
::AccessibleObjectFromEvent(hwnd, idObject, idChild, &spAcc, &child);

I’ve included <atlbase.h> to get the ATL client side support (smart pointers and COM type wrappers). Other APIs that can bring an IAccessible in other contexts include AccessibleObjectFromPoint and AccessibleObjectFromWindow.

Note that you must also include <oleacc.h> and link with oleacc.lib.

IAccessible has quite a few methods and properties, the simplest of which is Name that is mandatory for implementors to provide:

CComBSTR name;
spAcc->get_accName(child, &name);	// use the child ID returned by AccessibleObjectFromEvent

Refer to the documentation for other members of IAccessible. We can also get the details of the process associated with the event by going through the window handle or the thread ID and retrieving the executable name. Here is an example with a window handle:

DWORD pid = 0;
WCHAR exeName[MAX_PATH];
PCWSTR pExeName = L"";

if (hwnd && ::GetWindowThreadProcessId(hwnd, &pid)) {
    auto hProcess = ::OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION, FALSE, pid);
    if (hProcess) {
        DWORD size = _countof(exeName);
        if (::QueryFullProcessImageName(hProcess, 0, exeName, &size))
            pExeName = wcsrchr(exeName, L'\\') + 1;
        ::CloseHandle(hProcess);
    }
}

GetWindowThreadProcessId retrieves the process ID (and thread ID) associated with a window handle. We could also go with the given thread ID – call OpenThread and then GetProcessIdOfThread. The interested reader is welcome to try this approach to retrieve the process ID. Here is the full event handler for this example, which dumps everything using printf:

void CALLBACK OnEvent(HWINEVENTHOOK hWinEventHook, DWORD event, HWND hwnd,
    LONG idObject, LONG idChild, DWORD idEventThread, DWORD time) {
    CComPtr<IAccessible> spAcc;
    CComVariant child;
    ::AccessibleObjectFromEvent(hwnd, idObject, idChild, &spAcc, &child);
    CComBSTR name;
    if (spAcc)
        spAcc->get_accName(child, &name);   // child ID returned by AccessibleObjectFromEvent
    DWORD pid = 0;
    WCHAR exeName[MAX_PATH];
    PCWSTR pExeName = L"";

    if (hwnd && ::GetWindowThreadProcessId(hwnd, &pid)) {
        auto hProcess = ::OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION, FALSE, pid);
        if (hProcess) {
            DWORD size = _countof(exeName);
            if (::QueryFullProcessImageName(hProcess, 0, exeName, &size))
                pExeName = wcsrchr(exeName, L'\\') + 1;
            ::CloseHandle(hProcess);
        }
    }
    printf("Event: 0x%X (%s) HWND: 0x%p, ID: 0x%X Child: 0x%X TID: %u PID: %u (%ws) Time: %u Name: %ws\n",
        event, EventNameToString(event),
        hwnd, idObject, idChild, idEventThread, 
        pid, pExeName,
        time, name.m_str);
}

EventNameToString is a little helper converting some event IDs to names. If you run this code (SimpleWinEventHook project), you’ll see lots of output, because one of the reported events is EVENT_OBJECT_LOCATIONCHANGE that is raised (among other reasons) when the mouse cursor position changes:

Event: 0x800C (Name Change) HWND: 0x00000000000216F6, ID: 0xFFFFFFFC Child: 0x1DC TID: 39060 PID: 64932 (Taskmgr.exe) Time: 78492375 Name: (null)
Event: 0x8000 (Object Create) HWND: 0x00000000000216F6, ID: 0xFFFFFFFC Child: 0x1DD TID: 39060 PID: 64932 (Taskmgr.exe) Time: 78492375 Name: (null)
Event: 0x800C (Name Change) HWND: 0x00000000000216F6, ID: 0xFFFFFFFC Child: 0x1DD TID: 39060 PID: 64932 (Taskmgr.exe) Time: 78492375 Name: (null)
Event: 0x8000 (Object Create) HWND: 0x00000000000216F6, ID: 0xFFFFFFFC Child: 0x1DE TID: 39060 PID: 64932 (Taskmgr.exe) Time: 78492375 Name: (null)
Event: 0x800C (Name Change) HWND: 0x00000000000216F6, ID: 0xFFFFFFFC Child: 0x1DE TID: 39060 PID: 64932 (Taskmgr.exe) Time: 78492375 Name: (null)
...
Event: 0x800B (Location Changed) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 72172 PID: 0 () Time: 78492562 Name: Normal
Event: 0x800B (Location Changed) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 72172 PID: 0 () Time: 78492562 Name: Normal
...
Event: 0x800B (Location Changed) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 72172 PID: 0 () Time: 78492718 Name: Vertical size
Event: 0x800B (Location Changed) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 72172 PID: 0 () Time: 78492734 Name: Vertical size
Event: 0x800C (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 72172 PID: 0 () Time: 78492734 Name: Normal
Event: 0x800A (State Changed) HWND: 0x000000000001019E, ID: 0xFFFFFFFC Child: 0x16 TID: 15636 PID: 14060 (explorer.exe) Time: 78493000 Name: (null)
Event: 0x800A (State Changed) HWND: 0x00000000000101B0, ID: 0xFFFFFFFC Child: 0x6 TID: 15636 PID: 14060 (explorer.exe) Time: 78493000 Name: (null)
Event: 0x8004 () HWND: 0x0000000000010010, ID: 0xFFFFFFFC Child: 0x0 TID: 72172 PID: 1756 () Time: 78493000 Name: Desktop
Event: 0x8 (Capture Start) HWND: 0x0000000000271D5A, ID: 0x0 Child: 0x0 TID: 72172 PID: 67928 (WindowsTerminal.exe) Time: 78493000 Name: c:\Dev\Temp\WinEventHooks\x64\Debug\SimpleWinEventHook.exe
Event: 0x800B (Location Changed) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 72172 PID: 0 () Time: 78493093 Name: Normal
Event: 0x8001 (Object Destroy) HWND: 0x00000000000216F6, ID: 0xFFFFFFFC Child: 0x45 TID: 39060 PID: 64932 (Taskmgr.exe) Time: 78493093 Name: (null)
Event: 0x8001 (Object Destroy) HWND: 0x00000000000216F6, ID: 0xFFFFFFFC Child: 0xB0 TID: 39060 PID: 64932 (Taskmgr.exe) Time: 78493093 Name: (null)
...
Event: 0x800C (Name Change) HWND: 0x00000000000216F6, ID: 0xFFFFFFFC Child: 0x1A TID: 39060 PID: 64932 (Taskmgr.exe) Time: 78493093 Name: (null)
Event: 0x800C (Name Change) HWND: 0x00000000000216F6, ID: 0xFFFFFFFC Child: 0x1B TID: 39060 PID: 64932 (Taskmgr.exe) Time: 78493109 Name: (null)
Event: 0x800B (Location Changed) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 72172 PID: 0 () Time: 78493109 Name: Normal
Event: 0x9 (Capture End) HWND: 0x0000000000271D5A, ID: 0x0 Child: 0x0 TID: 72172 PID: 67928 (WindowsTerminal.exe) Time: 78493109 Name: c:\Dev\Temp\WinEventHooks\x64\Debug\SimpleWinEventHook.exe
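EventNameToString is not shown in this post. A minimal sketch that is consistent with the names seen in the output above might look like this; the numeric values are the ones WinUser.h defines for these events, repeated here as literals so the snippet does not depend on Windows headers, and the set of events covered is my own pick, not necessarily what the sample handles:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Maps a few well-known WinEvent IDs to display names.
// Returns an empty string for anything not listed, which is why an event
// such as 0x8004 shows up as "()" in the output.
const char* EventNameToString(std::uint32_t event) {
    switch (event) {
        case 0x0003: return "Foreground";        // EVENT_SYSTEM_FOREGROUND
        case 0x0008: return "Capture Start";     // EVENT_SYSTEM_CAPTURESTART
        case 0x0009: return "Capture End";       // EVENT_SYSTEM_CAPTUREEND
        case 0x8000: return "Object Create";     // EVENT_OBJECT_CREATE
        case 0x8001: return "Object Destroy";    // EVENT_OBJECT_DESTROY
        case 0x800A: return "State Changed";     // EVENT_OBJECT_STATECHANGE
        case 0x800B: return "Location Changed";  // EVENT_OBJECT_LOCATIONCHANGE
        case 0x800C: return "Name Change";       // EVENT_OBJECT_NAMECHANGE
    }
    return "";   // unknown or unlisted event
}
```

In real code the named constants from WinUser.h should be used in the case labels instead of the literals.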

DLL Injection

Instead of getting events on the SetWinEventHook caller’s thread, a DLL can be injected. Such a DLL must export the event handler so that the process setting up the handler can locate the function with GetProcAddress.

As an example, I created a simple DLL that implements the event handler similarly to the previous example (without the process name) like so:

extern "C" __declspec(dllexport)
void CALLBACK OnEvent(HWINEVENTHOOK hWinEventHook, DWORD event, HWND hwnd,
	LONG idObject, LONG idChild, DWORD idEventThread, DWORD time) {
	CComPtr<IAccessible> spAcc;
	CComVariant child;
	::AccessibleObjectFromEvent(hwnd, idObject, idChild, &spAcc, &child);
	CComBSTR name;
	if (spAcc)
		spAcc->get_accName(child, &name);	// child ID returned by AccessibleObjectFromEvent

	printf("Event: 0x%X (%s) HWND: 0x%p, ID: 0x%X Child: 0x%X TID: %u Time: %u Name: %ws\n",
		event, EventNameToString(event),
        hwnd, idObject, idChild, idEventThread,
		time, name.m_str);
}

Note the function is exported. The code uses printf, but there is no guarantee that a target process has a console to use. The DllMain function creates such a console and attaches the standard output handle to it (otherwise printf wouldn’t have an output handle, since the process wasn’t bootstrapped with a console):

HANDLE hConsole;

BOOL APIENTRY DllMain(HMODULE hModule, DWORD reason, PVOID lpReserved) {
	switch (reason) {
		case DLL_PROCESS_DETACH:
			if (hConsole)   // be nice
				::CloseHandle(hConsole);
			break;

		case DLL_PROCESS_ATTACH:
			if (::AllocConsole()) {
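				// assign the global handle (a new local declared here would shadow it,
				// and the cleanup in DLL_PROCESS_DETACH would never run)
				hConsole = ::CreateFile(L"CONOUT$", GENERIC_WRITE, 
                    0, nullptr, OPEN_EXISTING, 0, nullptr);
				if (hConsole == INVALID_HANDLE_VALUE) {
					hConsole = nullptr;
					return FALSE;
				}
				::SetStdHandle(STD_OUTPUT_HANDLE, hConsole);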
			}
			break;
	}
	return TRUE;
}

The injector process (WinHookInject project) first grabs a target process ID (if any):

int main(int argc, const char* argv[]) {
	DWORD pid = argc < 2 ? 0 : atoi(argv[1]);
	if (pid == 0) {
		printf("Warning: injecting to potentially all processes with threads connected to the current desktop.\n");
		printf("Continue? (y/n) ");
		char ans[3];
		gets_s(ans);
		if (tolower(ans[0]) != 'y')
			return 0;
	}

The warning is shown if no PID is provided, because creating consoles for certain processes could wreak havoc. If you do want to inject a DLL to all processes on the desktop, avoid creating consoles.

Once we have a target process (or not), we need to load the DLL (hardcoded for simplicity) and grab the exported event handler function:

auto hLib = ::LoadLibrary(L"Injected.Dll");
if (!hLib) {
	printf("DLL not found!\n");
	return 1;
}
auto OnEvent = (WINEVENTPROC)::GetProcAddress(hLib, "OnEvent");
if (!OnEvent) {
	printf("Event handler not found!\n");
	return 1;
}

The final step is to register the handler. If you’re targeting all processes, you’re better off limiting the events you’re interested in, especially the noisy ones. If you just want a DLL injected and you don’t care about any events, select a range that has no events and then call a relevant function to force the DLL to be loaded into the target process(es). I’ll let the interested reader figure these things out.

auto hHook = ::SetWinEventHook(EVENT_MIN, EVENT_MAX, 
	hLib, OnEvent, pid, 0, WINEVENT_INCONTEXT);
::GetMessage(nullptr, nullptr, 0, 0);

Note the arguments include the DLL module, the handler address, and the flag WINEVENT_INCONTEXT. Here is some output when using this DLL on a Notepad instance. A console is created the first time Notepad causes an event to be raised:

Event: 0x800B (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 34756 Time: 70717718 Name: Edit
Event: 0x800C (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 34756 Time: 70717718 Name: Horizontal size
Event: 0x800B (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 34756 Time: 70717718 Name: Horizontal size
Event: 0x800B (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 29516 Time: 70717734 Name: Horizontal size
Event: 0x800C (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 29516 Time: 70717734 Name: Edit
Event: 0x800B (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 29516 Time: 70717734 Name: Edit
Event: 0x800B (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 29516 Time: 70717734 Name: Edit
Event: 0x800B (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 29516 Time: 70717750 Name: Edit
Event: 0x800B (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 29516 Time: 70717765 Name: Edit
Event: 0x800B (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 29516 Time: 70717765 Name: Edit
Event: 0x800B (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 29516 Time: 70717781 Name: Edit
Event: 0x800B (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 29516 Time: 70717781 Name: Edit
Event: 0x800B (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 29516 Time: 70717796 Name: Edit
Event: 0x800B (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 29516 Time: 70717796 Name: Edit
Event: 0x800B (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 29516 Time: 70717812 Name: Edit
Event: 0x800B (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 29516 Time: 70717812 Name: Edit
Event: 0x800B (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 29516 Time: 70717828 Name: Edit
Event: 0x800B (Name Change) HWND: 0x0000000000000000, ID: 0xFFFFFFF7 Child: 0x0 TID: 29516 Time: 70717843 Name: Edit
Event: 0x8 (Capture Start) HWND: 0x0000000000091CAC, ID: 0x0 Child: 0x0 TID: 29516 Time: 70717843 Name: (null)
Event: 0x3 (Foreground) HWND: 0x00000000000A1D50, ID: 0x0 Child: 0x0 TID: 34756 Time: 70717843 Name: Untitled - Notepad
Event: 0x8004 () HWND: 0x0000000000010010, ID: 0xFFFFFFFC Child: 0x0 TID: 29516 Time: 70717859 Name: Desktop 1
Event: 0x800B (Name Change) HWND: 0x00000000000A1D50, ID: 0x0 Child: 0x0 TID: 34756 Time: 70717859 Name: Untitled - Notepad
...

The full code is at https://github.com/zodiacon/WinEventHooks.

Writing Your Own Programming Language

Ever since I realized BASIC wasn’t the only living programming language, I thought about writing my own. Who wouldn’t? If you’re a developer, surely this idea popped into your mind at some point. No matter how much you love a particular programming language, you always have some ideas for improvement or even removal of annoying features.

The post assumes you have some background in compilers, and understand concepts like tokenizing (scanning), parsing, and Abstract Syntax Trees (ASTs).

Obviously, writing a programming language is not for the faint of heart. Even before you set out to implement your language, you have to design it first. Or maybe you have some fundamental ideas that would make your language unique, and you may decide to flesh out the details while you’re implementing it.

A new programming language does not have to be “general-purpose” – that is, it could be a “domain specific language” (DSL), which means it’s best suited for certain domain(s) or tasks. This makes your life (usually) at least somewhat easier; in addition, you’ll be unlikely to compete with the gazillion general-purpose languages out there. Still, a general-purpose language might be your goal.

Designing a programming language is a big topic, well outside the scope of this post. I’ll focus on the implementation details, so to speak. There are other considerations for a programming language beyond the language itself – its accompanying standard library, tooling (e.g., some IDE or at least syntax highlighting), debugging, testing, and a few more. One decision is whether to make your language compiled or interpreted. This decision may not affect some aspects of the implementation, but it will definitely affect the language’s back-end. You can even support both interpretation and compilation for maximum flexibility.

I played around with the idea of creating a programming language for many years, never really getting very far beyond a basic parser and a minimal interpreter. Lately, I’ve read more about Pratt Parsing, which sparked my interest again. Pratt Parsing is one of many techniques for parsing expressions, something like “a+2*b”, and doing that correctly (parentheses, operator precedence and associativity). Pratt parsing is really elegant, much more so than other techniques, and it’s also more flexible, supporting (indirectly) ternary operations and other unusual constructs. Once you have an expression parser, the rest of the parser is fairly easy to implement (relatively speaking) using the recursive-descent approach which is well suited for hand-crafted parsers.

Robert Nystrom gives a nice introduction to Pratt Parsing and an elegant idea for implementing it. His implementation is in Java, but there is a link to a C# implementation and even one in Rust. My go-to language is C++ (still), so you know where this is going. I’ve implemented a Pratt parser based on Robert’s ideas, and it turned out very well.
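To give a feel for the technique, here is a minimal sketch of the core Pratt loop. It is not the Logo2 parser and it does not follow Robert’s parselet-based design; it is a compact variant I’m using for illustration only, where binding powers are hardcoded, the class name and precedence values are arbitrary, and the “AST” is just a fully parenthesized string so that precedence and associativity are easy to see:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

class PrattParser {
public:
    explicit PrattParser(std::string text) : m_Text(std::move(text)) {}

    // Parses the whole input and returns a fully parenthesized form.
    std::string Parse() {
        std::string result = ParseExpression(0);
        if (Peek() != '\0')
            throw std::runtime_error("unexpected trailing input");
        return result;
    }

private:
    // Binding power of an infix operator; 0 means "not an infix operator".
    static int Precedence(char op) {
        switch (op) {
            case '+': case '-': return 10;
            case '*': case '/': return 20;
            case '^': return 30;
        }
        return 0;
    }
    static bool IsRightAssociative(char op) { return op == '^'; }

    static bool IsAlnum(char ch) { return std::isalnum(static_cast<unsigned char>(ch)) != 0; }

    // Skips spaces and returns the next character without consuming it ('\0' at end).
    char Peek() {
        while (m_Pos < m_Text.size() && std::isspace(static_cast<unsigned char>(m_Text[m_Pos])))
            m_Pos++;
        return m_Pos < m_Text.size() ? m_Text[m_Pos] : '\0';
    }

    // Prefix position: number, identifier, unary minus, or parenthesized expression.
    std::string ParsePrefix() {
        char ch = Peek();
        if (IsAlnum(ch)) {
            std::size_t start = m_Pos;
            while (m_Pos < m_Text.size() && IsAlnum(m_Text[m_Pos]))
                m_Pos++;
            return m_Text.substr(start, m_Pos - start);
        }
        if (ch == '-') {
            m_Pos++;
            // in this sketch unary minus binds tighter than every binary operator
            return "(-" + ParseExpression(40) + ")";
        }
        if (ch == '(') {
            m_Pos++;
            std::string inner = ParseExpression(0);
            if (Peek() != ')')
                throw std::runtime_error("expected ')'");
            m_Pos++;
            return inner;
        }
        throw std::runtime_error("unexpected token");
    }

    // Core loop: keep consuming infix operators that bind tighter than minPrecedence.
    std::string ParseExpression(int minPrecedence) {
        std::string left = ParsePrefix();
        for (;;) {
            char op = Peek();
            int precedence = Precedence(op);
            if (precedence <= minPrecedence)
                break;
            m_Pos++;
            // a right-associative operator lets an equal-precedence operator bind on its right
            std::string right = ParseExpression(IsRightAssociative(op) ? precedence - 1 : precedence);
            left = "(" + left + " " + op + " " + right + ")";
        }
        return left;
    }

    std::string m_Text;
    std::size_t m_Pos = 0;
};
```

For example, PrattParser("a+2*b").Parse() returns "(a + (2 * b))", and "a^b^c" comes out as "(a ^ (b ^ c))" because ^ is treated as right-associative. A real parser would work on tokens and build AST nodes instead of strings, but the shape of the loop stays the same.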

I’ve also been interested in visualization (a term which has way too much stuffed into it), but I thought I’d start small. A popular teaching language in the 80s was LOGO. Although it was treated as a “toy language”, it was a full-blown language, mostly resembling LISP internally.

However, LOGO became famous because of the “Turtle Graphics” built-in support that was provided, which allowed drawing with an imaginary turtle (you could even ask LOGO to show it) that would follow your commands like moving forward, backwards, rotating, lifting the pen and putting it back down. Why not create a fun version of Turtle Graphics with ideas from LOGO?

Here is an example from LOGO to draw a symmetric hexagon:

REPEAT 6 [ FD 100 RT 60 ]

You can probably guess what is going on here. “FD” is “forward” and “RT” is “right”, although it could be mistaken for “rotate”. LOGO supported functions as well, so you could create complex shapes by reusing functions.

My language, called “Logo2” for lack of originality at this time, tries to capture that fun drawing, but put the syntax more in line with the C family of languages, which I like more. The above hexagon is written with Logo2 like so:

repeat 6 {
    fd(100); rt(60);
}

Indentation is not significant, so it all could be placed on the same line. You can also define functions and execute them:

fn circle(size, steps) {
    repeat steps {
        fd(size); rt(360 / steps);
    }
}

repeat 10 {
    circle(80, 20); rt(36);
}

I also added support for colors, with the pencolor(r,g,b) function, something I don’t recall LOGO having in the 80s.

Implementation

There are 3 main projects in the solution (a fourth project in the works to create a simple IDE for easier experimentation):

  • Logo2Core – contains the tokenizer, parser, and interpreter.
  • Logo2Runtime – contains the runtime support for turtle graphics, currently using GDI+.
  • Logo2 – is a simple REPL that can parse and execute single-line statements. If you provide a command line argument, it’s treated as a file name to be parsed and executed. Anything not inside a function is executed directly (for now).

The Tokenizer

The tokenizer’s job (Tokenizer class) is to read text and turn it into a bunch of tokens. A token is a single unit of the language, like a number, keyword, identifier, operator, etc. To start tokenization, the Tokenize method can be invoked with the string to tokenize.

The Next() method returns the next token, whereas the Peek() method returns the next token without advancing the stream forward. This means the tokenizer is not doing all the work immediately, but only advances to the next token when requested. The parser is the one “driving” the tokenizer.
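
The Next/Peek contract can be sketched as follows. This is a simplified stand-in, not the Logo2 Tokenizer class: the token categories, the MiniTokenizer and MiniToken names, and the single-character handling of symbols are assumptions made to keep the example short. Peek scans one token ahead and caches it; Next returns the cached token if there is one.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>

enum class MiniTokenType { Invalid, End, Number, Identifier, Symbol };

struct MiniToken {
    MiniTokenType Type = MiniTokenType::Invalid;
    std::string Lexeme;
};

// Lazy tokenizer: nothing is scanned until the parser asks for a token.
class MiniTokenizer {
public:
    void Tokenize(std::string text) {
        m_Text = std::move(text);
        m_Pos = 0;
        m_Peeked.reset();
    }

    // Returns the next token and advances past it.
    MiniToken Next() {
        if (m_Peeked) {
            MiniToken token = std::move(*m_Peeked);
            m_Peeked.reset();
            return token;
        }
        return Scan();
    }

    // Returns the next token without consuming it (one token of lookahead).
    MiniToken Peek() {
        if (!m_Peeked)
            m_Peeked = Scan();
        return *m_Peeked;
    }

private:
    MiniToken Scan() {
        assert(m_Pos <= m_Text.size());
        auto at = [this](std::size_t i) { return static_cast<unsigned char>(m_Text[i]); };
        while (m_Pos < m_Text.size() && std::isspace(at(m_Pos)))
            m_Pos++;
        if (m_Pos == m_Text.size())
            return { MiniTokenType::End, "" };

        std::size_t start = m_Pos;
        MiniTokenType type;
        if (std::isdigit(at(m_Pos))) {
            type = MiniTokenType::Number;
            while (m_Pos < m_Text.size() && std::isdigit(at(m_Pos)))
                m_Pos++;
        }
        else if (std::isalpha(at(m_Pos)) || at(m_Pos) == '_') {
            type = MiniTokenType::Identifier;
            while (m_Pos < m_Text.size() && (std::isalnum(at(m_Pos)) || at(m_Pos) == '_'))
                m_Pos++;
        }
        else {
            type = MiniTokenType::Symbol;   // any other single character
            m_Pos++;
        }
        return { type, m_Text.substr(start, m_Pos - start) };
    }

    std::string m_Text;
    std::size_t m_Pos = 0;
    std::optional<MiniToken> m_Peeked;
};
```

Calling Peek any number of times in a row returns the same token; only Next moves the stream forward.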

The implementation of the tokenizer is not perfect, but it works well enough. I didn’t want to use any existing tools like LEX (or FLEX), for a couple of reasons. For one, I don’t like generated code that I have little control over. Second, I like to understand what I am writing. Writing a tokenizer is not rocket science, but it’s not trivial, either. And since one of my goals is to experiment, I need the freedom not available with generated code.

The Parser

The parser is much more interesting than the tokenizer (by far). This is where the syntax of the language is fleshed out. Just like with tokenization, I preferred not to use generator tools like YACC (or BISON). In fact, many widely used language implementations have their own hand-written parser. The parser accepts a string to parse (Parse method) or a filename (ParseFile method) and begins the parsing. It calls on the tokenizer when the next token is needed.

The Init method of the parser initializes the tokenizer with the specific tokens it should be able to recognize (like specific keywords and operators), and also initializes its own “parselets” (defined in the above-mentioned article) to make Pratt Parsing work. I will not show the Pratt Parsing part here since there’s quite a bit of code there, but here is an example of parsing the “repeat” statement:

std::unique_ptr<RepeatStatement> Parser::ParseRepeatStatement() {
	Next();		// eat "repeat"
	auto times = ParseExpression();

	m_LoopCount++;
	auto block = ParseBlock();
	m_LoopCount--;
    return std::make_unique<RepeatStatement>(
        std::move(times), std::move(block));
}

ParseExpression parses an expression to be used as the argument to repeat. Then ParseBlock is called to parse a block of code surrounded by curly braces. Finally, an AST node representing a “repeat” statement is created, initialized, and returned to the caller.

The m_LoopCount variable is incremented when entering loop parsing and decremented afterwards. This is done so that parsing the keywords break and continue can check if there is any enclosing loop for these keywords to make sense.
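
One possible refinement, shown here only as a sketch: if parsing the block can exit by throwing, a manual increment/decrement pair would leave the counter unbalanced. A small RAII guard avoids that. The LoopDepthGuard and MiniLoopTracker names are hypothetical and not part of Logo2; MiniLoopTracker merely stands in for the parser's loop-tracking state.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical guard: bumps a loop-depth counter for the lifetime of a scope.
class LoopDepthGuard {
public:
    explicit LoopDepthGuard(int& depth) : m_Depth(depth) { ++m_Depth; }
    ~LoopDepthGuard() {
        assert(m_Depth > 0);
        --m_Depth;
    }
    LoopDepthGuard(LoopDepthGuard const&) = delete;
    LoopDepthGuard& operator=(LoopDepthGuard const&) = delete;

private:
    int& m_Depth;
};

// Stand-in for the parser state that tracks enclosing loops.
struct MiniLoopTracker {
    int LoopCount = 0;

    // break and continue only make sense inside at least one loop.
    bool IsBreakOrContinueAllowed() const { return LoopCount > 0; }

    // Simulates parsing a loop body; 'fail' simulates a parse error
    // reported by throwing an exception.
    bool ParseLoopBody(bool fail) {
        LoopDepthGuard guard(LoopCount);
        if (fail)
            throw std::runtime_error("simulated parse error");
        return IsBreakOrContinueAllowed();
    }
};
```

With the guard, the counter returns to its previous value on every exit path, so a parser object that is reused after an error (as a REPL would do) starts the next parse with a clean state.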

Here is ParseBlock:

std::unique_ptr<BlockExpression>
Parser::ParseBlock(std::vector<std::string> const& args) {
	if (!Match(TokenType::OpenBrace))
		AddError(ParserError(ParseErrorType::OpenBraceExpected, Peek()));

	m_Symbols.push(std::make_unique<SymbolTable>(m_Symbols.top().get()));

	for (auto& arg : args) {
		Symbol sym;
		sym.Name = arg;
		sym.Flags = SymbolFlags::None;
		sym.Type = SymbolType::Argument;
		AddSymbol(sym);
	}

	auto block = std::make_unique<BlockExpression>();
	while (Peek().Type != TokenType::CloseBrace) {
		auto stmt = ParseStatement();
		if (!stmt)
			break;
		block->Add(std::move(stmt));
	}
	Next();		// eat close brace
	m_Symbols.pop();
	return block;
}

ParseBlock starts by making sure there is an open curly brace. Then it creates a symbol table and pushes it to be the “current” as there is a new scope within the block. The parameter to ParseBlock is used when parsing a function body, where these “args” are the parameters to the function. If this is the case, they are added to the symbol table as local variables.

The main work is to call ParseStatement as many times as needed until a close brace is encountered. The block is a vector of statements being filled up. Finally, the symbol table is popped and the AST node is returned.

ParseStatement is a big switch that calls the appropriate specific parsing method based on the first token encountered. Here is an excerpt:

std::unique_ptr<Statement> Parser::ParseStatement() {
	auto peek = Peek();
	if (peek.Type == TokenType::Invalid) {
		return nullptr;
	}

	switch (peek.Type) {
		case TokenType::Keyword_Var: 
             return ParseVarConstStatement(false);
		case TokenType::Keyword_Const: 
             return ParseVarConstStatement(true);
		case TokenType::Keyword_Repeat: 
             return ParseRepeatStatement();
		case TokenType::Keyword_While: 
             return ParseWhileStatement();
		case TokenType::Keyword_Fn: 
             return ParseFunctionDeclaration();
		case TokenType::Keyword_Return: 
             return ParseReturnStatement();
        case TokenType::Keyword_Break: 
             return ParseBreakContinueStatement(false);
        case TokenType::Keyword_Continue:
             return ParseBreakContinueStatement(true);
	}
	auto expr = ParseExpression();
	if (expr) {
		Match(TokenType::SemiColon);
		return std::make_unique<ExpressionStatement>(std::move(expr));
	}
	AddError(ParserError(ParseErrorType::InvalidStatement, peek));
	return nullptr;
}

If a statement is not recognized, an expression parsing is attempted. This allows using Logo2 as a simple calculator, for example. ParseStatement is where the support for more statements is added based on an initial token.

Once an AST is built by the parser, the next step is to execute the AST by some interpreter. In a more complex language (maybe once it grows some more), some semantic analysis may be appropriate, which is about looking at the usage of the language beyond the syntax. For now, we’ll just interpret what we have, and if any error is encountered it’s going to be a runtime error. Some parsing errors can be caught without semantic analysis, but some cannot.

The Interpreter

The Interpreter class provides the runtime behavior, by “executing” the AST. It receives the root of the AST constructed by the parser and walks it by implementing the well-known Visitor design pattern, whose purpose here is to decouple the AST node types from the way they are handled by the interpreter. Alternatively, it would be possible to add a virtual “Execute” or “Eval” method to AST nodes, so the nodes can “evaluate” themselves, but that creates coupling, and goes against the single-responsibility principle (SRP) that states that a class should have one and only one job. Using the visitor pattern also makes it easier to add semantic analysis later without modifying the AST node types.

The gist of the visitor pattern is to have an “Accept” method in the AST nodes that calls back to whoever (the visitor) with the current node details. For example, here it is for a binary operator:

class BinaryExpression : public Expression {
public:
    BinaryExpression(std::unique_ptr<Expression> left, 
        Token op, std::unique_ptr<Expression> right);
	Value Accept(Visitor* visitor) const override;

	std::string ToString() const override;

	Expression* Left() const;
	Expression* Right() const;
	Token const& Operator() const;

private:
	std::unique_ptr<Expression> m_Left, m_Right;
	Token m_Operator;
};

Value BinaryExpression::Accept(Visitor* visitor) const {
	return visitor->VisitBinary(this);
}

This same idea is repeated for all concrete AST nodes. The Visitor type is abstract, implemented by the Interpreter class having methods like: VisitBinary, VisitRepeat, etc.

The purpose of each of these “Visit” methods is to “execute” (or evaluate) that node. Here is an excerpt for the binary expression visiting:

Value Interpreter::VisitBinary(BinaryExpression const* expr) {
    switch (expr->Operator().Type) {
    case TokenType::Add: 
       return expr->Left()->Accept(this) + expr->Right()->Accept(this);
    case TokenType::Sub:
       return expr->Left()->Accept(this) - expr->Right()->Accept(this);
    case TokenType::Mul:
       return expr->Left()->Accept(this) * expr->Right()->Accept(this);
    case TokenType::Div:
       return expr->Left()->Accept(this) / expr->Right()->Accept(this);
    }
    return Value();
}

Here it is for “repeat”:

Value Interpreter::VisitRepeat(RepeatStatement const* expr) {
    auto count = Eval(expr->Count());
    if (!count.IsInteger())
        throw RuntimeError(ErrorType::TypeMismatch, expr->Count());

    auto n = count.Integer();
    while (n-- > 0) {
        try {
            Eval(expr->Block());
        }
        catch (BreakOrContinue const& bc) {
            if (!bc.Continue)
                break;
        }
    }
    return nullptr;     // repeat has no return value
}

You should get the idea at this point. (Eval is just a simple wrapper that calls Accept with the provided node).
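
To see the decoupling argument in isolation, here is a self-contained miniature of the same pattern. It is a sketch with made-up types (MiniNode, NumberNode, AddNode) that return plain doubles instead of Logo2's Value. Two different visitors operate on the same tree without the node classes knowing anything about either of them.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct NumberNode;
struct AddNode;

// Abstract visitor: one method per concrete node type.
struct MiniVisitor {
    virtual ~MiniVisitor() = default;
    virtual double VisitNumber(NumberNode const* node) = 0;
    virtual double VisitAdd(AddNode const* node) = 0;
};

struct MiniNode {
    virtual ~MiniNode() = default;
    virtual double Accept(MiniVisitor* visitor) const = 0;
};

struct NumberNode : MiniNode {
    explicit NumberNode(double number) : Number(number) {}
    double Accept(MiniVisitor* visitor) const override { return visitor->VisitNumber(this); }
    double Number;
};

struct AddNode : MiniNode {
    AddNode(std::unique_ptr<MiniNode> left, std::unique_ptr<MiniNode> right)
        : Left(std::move(left)), Right(std::move(right)) {
        assert(Left && Right);
    }
    double Accept(MiniVisitor* visitor) const override { return visitor->VisitAdd(this); }
    std::unique_ptr<MiniNode> Left, Right;
};

// One visitor evaluates the tree...
struct Evaluator : MiniVisitor {
    double VisitNumber(NumberNode const* node) override { return node->Number; }
    double VisitAdd(AddNode const* node) override {
        return node->Left->Accept(this) + node->Right->Accept(this);
    }
};

// ...another counts nodes, without any change to the node classes.
struct NodeCounter : MiniVisitor {
    double VisitNumber(NumberNode const*) override { return 1; }
    double VisitAdd(AddNode const* node) override {
        return 1 + node->Left->Accept(this) + node->Right->Accept(this);
    }
};

// Builds the tree for (1 + 2) + 4.
std::unique_ptr<MiniNode> MakeSampleTree() {
    auto inner = std::make_unique<AddNode>(
        std::make_unique<NumberNode>(1), std::make_unique<NumberNode>(2));
    return std::make_unique<AddNode>(std::move(inner), std::make_unique<NumberNode>(4));
}
```

A semantic analysis pass would be just another class derived from the visitor, which is the flexibility the pattern buys.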

The Value type used with the above code (the one returned from Accept methods) is the way to represent “values” in Logo2. Logo2 is a dynamically typed language (at least for now), so variables can hold any one of a list of supported types, encapsulated in Value. You can think of that as a C-style union. Specifically, it wraps a std::variant<> C++17 type that currently supports the following: 64-bit integer, 64-bit floating point (double), bool, string (std::string), and null (representing no value). The list of possibilities will increase, allowing user-defined types as well.
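
As a rough illustration of how such a variant-backed value might look, here is a sketch under assumptions: the MiniValue name, the use of std::monostate for null, and the promotion rules in operator+ are mine, not necessarily what Logo2 does. The operator dispatches on the runtime types of both operands with std::visit and reports a mismatch by throwing.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>

// Sketch of a dynamically typed value; std::monostate plays the role of null.
class MiniValue {
public:
    MiniValue() = default;   // null
    // Explicit constructors avoid surprising conversions (e.g. a string
    // literal silently converting to bool).
    explicit MiniValue(std::int64_t v) : m_Data(v) {}
    explicit MiniValue(double v) : m_Data(v) {}
    explicit MiniValue(bool v) : m_Data(v) {}
    explicit MiniValue(std::string v) : m_Data(std::move(v)) {}

    bool IsNull() const { return std::holds_alternative<std::monostate>(m_Data); }
    bool IsInteger() const { return std::holds_alternative<std::int64_t>(m_Data); }
    bool IsReal() const { return std::holds_alternative<double>(m_Data); }

    std::int64_t Integer() const { assert(IsInteger()); return std::get<std::int64_t>(m_Data); }
    double Real() const { assert(IsReal()); return std::get<double>(m_Data); }
    std::string const& String() const { return std::get<std::string>(m_Data); }

    // integer + integer stays an integer, mixed numbers become double,
    // string + string concatenates, anything else is a type mismatch.
    friend MiniValue operator+(MiniValue const& a, MiniValue const& b) {
        return std::visit([](auto const& x, auto const& y) -> MiniValue {
            using X = std::decay_t<decltype(x)>;
            using Y = std::decay_t<decltype(y)>;
            constexpr bool xNum = std::is_same_v<X, std::int64_t> || std::is_same_v<X, double>;
            constexpr bool yNum = std::is_same_v<Y, std::int64_t> || std::is_same_v<Y, double>;
            if constexpr (std::is_same_v<X, std::int64_t> && std::is_same_v<Y, std::int64_t>)
                return MiniValue(x + y);
            else if constexpr (xNum && yNum)
                return MiniValue(static_cast<double>(x) + static_cast<double>(y));
            else if constexpr (std::is_same_v<X, std::string> && std::is_same_v<Y, std::string>)
                return MiniValue(x + y);
            else
                throw std::runtime_error("type mismatch in operator +");
        }, a.m_Data, b.m_Data);
    }

private:
    std::variant<std::monostate, std::int64_t, double, bool, std::string> m_Data;
};
```

Compared to a raw C-style union, the variant always knows which alternative is active, so a wrong access fails loudly instead of reinterpreting bytes.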

Turtle Graphics

The Logo2Runtime project contains the support for managing turtles, and displaying their “drawings”. The Turtle class is a graphics-free type to manage the state of the turtle – its position and heading, but also a list of “commands” indicating operations the turtle has been instructed to do, such as drawing a line, changing color, or changing width of drawing. This list is necessary whenever a window’s output needs to be refreshed.
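
The idea of a graphics-free turtle that records commands can be sketched like this. The MiniTurtle type below is illustrative only; its member names, the heading convention (degrees, 0 pointing along +X, Right adding to the angle) and the command layout are assumptions, not the actual Turtle class.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct MiniPoint { double X = 0, Y = 0; };

enum class MiniCommandType { DrawLine, SetWidth };

struct MiniCommand {
    MiniCommandType Type;
    MiniPoint From, To;   // used by DrawLine
    double Width = 0;     // used by SetWidth
};

// Tracks turtle state and records what should be drawn; no graphics API involved.
class MiniTurtle {
public:
    void Forward(double distance) {
        assert(std::isfinite(distance));
        const double pi = std::acos(-1.0);
        double radians = m_Heading * pi / 180.0;
        MiniPoint to{ m_Position.X + distance * std::cos(radians),
                      m_Position.Y + distance * std::sin(radians) };
        if (m_PenDown)
            m_Commands.push_back({ MiniCommandType::DrawLine, m_Position, to });
        m_Position = to;
    }

    // With Y growing downward (screen coordinates), adding degrees turns clockwise.
    void Right(double degrees) { m_Heading = std::fmod(m_Heading + degrees, 360.0); }

    void PenUp() { m_PenDown = false; }
    void PenDown() { m_PenDown = true; }
    void SetWidth(double width) {
        m_Commands.push_back({ MiniCommandType::SetWidth, {}, {}, width });
    }

    MiniPoint Position() const { return m_Position; }
    double Heading() const { return m_Heading; }
    std::vector<MiniCommand> const& Commands() const { return m_Commands; }

private:
    MiniPoint m_Position;
    double m_Heading = 0;
    bool m_PenDown = true;
    std::vector<MiniCommand> m_Commands;
};
```

Because the commands are kept, a window can redraw the whole picture at any time by replaying the list, without re-running the program that produced it.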

The Window class serves as a wrapper for an HWND, that also has the “power” to draw a set of turtle commands. Here is its DrawTurtle method:

void Window::DrawTurtle(Gdiplus::Graphics& g, Turtle* t) const {
    for (auto& cmd : t->GetCommands()) {
        DrawTurtleCommand(g, t, cmd);
    }
}

Each command does the right thing:

void Window::DrawTurtleCommand(Gdiplus::Graphics& g, Turtle* t, 
    TurtleCommand const& cmd) const {
    switch (cmd.Type) {
        case TurtleCommandType::DrawLine:
            g.DrawLine(m_Pen.get(), cmd.Line.From.X, 
               cmd.Line.From.Y, cmd.Line.To.X, cmd.Line.To.Y);
            break;

        case TurtleCommandType::SetWidth:
        {
            Color color;
            m_Pen->GetColor(&color);
            m_Pen.reset(new Pen(color, cmd.Width));
            break;
        }

        case TurtleCommandType::SetColor:
        {
            Color color;
            color.SetValue(cmd.Color);
            m_Pen.reset(new Pen(color, m_Pen->GetWidth()));
            break;
        }
    }
}

The graphical objects are GDI+ objects provided by the Windows API. Of course, it would be possible to switch to a different API. I chose GDI+ for its flexibility and 2D capabilities.

The Runtime class ties a turtle and a window together. It holds on to a (single) Turtle object and single Window object. In the future, this is going to be more dynamic, so any number of windows and turtles can be created, even more than one turtle in the same window.

The REPL

A simple REPL is implemented in the Logo2 project. It’s not trivial, as there is a user interface that must be kept alive, meaning messages have to be pumped. This means using functions like gets_s is not good enough, as they block the calling thread. Assuming the UI is on the same thread, this will cause the UI to become non-responsive. For now, the same thread is used, so that no special synchronization is required. The downside is that a custom input “loop” has to be written, and currently it’s very simple, and only supports the BACKSPACE key for typing error correction.

The first step is to get the input, key by key. If there is no key available, messages are pumped. A WM_QUIT message indicates it’s time to exit. Not very elegant, but here goes:

Tokenizer t;
Parser parser(t);
Interpreter inter;
Runtime runtime(inter);
runtime.Init();
runtime.CreateLogoWindow(L"Logo 2", 800, 800);

for (;;) {
	std::print(">> ");
	std::string input;
	int ch = 0;
	MSG msg{};
	while (ch != 13) {
		while (::PeekMessage(&msg, nullptr, 0, 0, PM_REMOVE) && 
                 msg.message != WM_QUIT) {
			::TranslateMessage(&msg);
			::DispatchMessage(&msg);
		}
		if (msg.message == WM_QUIT)
			break;

		if (_kbhit()) {
			ch = _getch();
			if (isprint(ch)) {
				input += (char)ch;
				printf("%c", ch);
			}
			else if (ch == 8) {		// backspace
				if (!input.empty()) {	// nothing to erase on an empty line
					printf("\b \b");
					input.pop_back();
				}
			}
			else {
				if (_kbhit())
					_getch();
			}
		}
	}

	if (msg.message == WM_QUIT)
		break;

Once we have a line of input, it’s time to parse and (if no errors occur), execute:

try {
	printf("\n");
	auto ast = parser.Parse(input);
	if (parser.HasErrors()) {
		for (auto& err : parser.Errors()) {
			printf("Error (%d,%d): %d\n", 
               err.ErrorToken.Line, err.ErrorToken.Col, (int)err.Error);
		}
		continue;
	}
	try {
		auto result = ast->Accept(&inter); // execute!
		if (result != nullptr)
			std::println("{}", result.ToString());
	}
	catch (RuntimeError const& err) {
		printf("Runtime error: %d\n", (int)err.Error);
	}
}
catch (ParserError const& err) {
	printf("Error (%d,%d): %d\n", err.ErrorToken.Line, 
         err.ErrorToken.Col, (int)err.Error);
	continue;
}

Some parser errors are accumulated in a vector, some throw an exception (errors where it would be difficult for the parser to recover confidently). At runtime, errors could occur as well, such as the wrong types being used with certain operations.

Conclusion

Writing a language can be lots of fun. You can invent your “dream” language. For me, the Logo2 experiment is ongoing. I’m planning to build a simple IDE, to extend the language to support user-defined types, lambdas (with closures), and much more. Your ideas are welcome as well!

The project is at zodiacon/Logo2 (github.com)

Thread Priorities in Windows

When a thread is created, it has some priority, which sets its importance compared to other threads competing for CPU time. The thread priority range is 0 to 31 (31 being the highest), where priority zero is used by the memory manager’s zero-page thread(s), whose purpose is to zero out physical pages (for reasons outside the scope of this post), so technically the allowed priority range is 1 to 31.

It stands to reason (to some extent), that a developer could change a thread’s priority to some valid value in the range of 1 to 31, but this is not the case. The Windows API sets up rules as to how thread priorities may change. First, there is a process priority class (sometimes called Base Priority), that specifies the default thread priority within that process. Processes don’t run – threads do, but still this is a process property and affects all threads in the process. You can see the value of this property very simply with Task Manager’s Base Priority column (not visible by default):

Base Priority column in Task Manager

There are six priority classes (the base priority of each is shown in parentheses):

  • Idle (4), called Low in Task Manager, probably not to give the wrong impression
  • Below Normal (6)
  • Normal (8)
  • Above Normal (10)
  • High (13)
  • Realtime (24)

A few required notes:

  • Normal is the default priority class unless overridden in some way. For example, double-clicking an executable in Explorer will launch a new process with priority class of Normal (8).
  • The term “Realtime” does not imply Windows is a real-time OS; it’s not. “Real-time” just means “higher than all the others”.
  • To set the Realtime priority class, the process in question must have the SeIncreaseBasePriorityPrivilege, normally granted to administrators. If “Realtime” is requested, but the process’s token does not possess that privilege, the result is “High”. The reason has to do with the fact that many kernel threads have priorities in the real-time range, and it could be problematic if too many threads spend a lot of time running in these priorities, potentially leading to kernel threads getting less time than they need.

Is this the end of the story? Not quite. For example, looking at Task Manager, processes like Csrss.exe (Windows subsystem process) or Smss.exe (Session manager) seem to have a priority class of Normal as well. Is this really the case? Yes and no (everyone likes that kind of answer, right?) We’ll get to that soon.

Setting a Thread’s priority

Changing the process priority class is possible with the SetPriorityClass API. For example, a process can change its own priority class like so:

::SetPriorityClass(::GetCurrentProcess(), HIGH_PRIORITY_CLASS);

You can do the same in .NET by utilizing the System.Diagnostics.Process class:

Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.High;

You can also change priority class using Task Manager or Process Explorer, by right-clicking a process and selecting “Set Priority”.

Once the priority class is changed, it affects all threads in that process. But how?

It turns out that a specific thread’s priority can be changed around the process priority class. The following diagram shows the full picture:

Every small rectangle in the above diagram indicates a valid thread priority. For example, the Normal priority class allows setting thread priorities to 1, 6, 7, 8, 9, 10, 15. To be more generic, here are the rules for all except the Realtime class. A thread priority is by default the same as the process priority class, but it can be -1, -2, +1, +2 from that base, or have two extreme values (internally called “Saturation”) with the values 1 and 15.

The Realtime range is unique, where the base priority is 24, but all priorities from 16 to 31 are available. The SetThreadPriority API that can be used to change an individual thread’s priority accepts an enumeration value (as its second argument) rather than an absolute value. Here are the macro definitions:

#define THREAD_PRIORITY_LOWEST         // -2  
#define THREAD_PRIORITY_BELOW_NORMAL   // -1
#define THREAD_PRIORITY_NORMAL         // 0
#define THREAD_PRIORITY_HIGHEST        // + 2
#define THREAD_PRIORITY_ABOVE_NORMAL   // + 1
#define THREAD_PRIORITY_TIME_CRITICAL  // 15 or 31
#define THREAD_PRIORITY_IDLE           // 1 or 16
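
The rules above can be summarized in a small portable function. This is only a model of the arithmetic, not a Windows API: the enum and function names are made up for illustration, and the function covers just the seven named levels (it does not model the extra values available in the Realtime class).

```cpp
#include <cassert>

// Base priorities of the process priority classes.
enum class PriorityClass {
    Idle = 4, BelowNormal = 6, Normal = 8, AboveNormal = 10, High = 13, Realtime = 24
};

// Relative levels, named after the SetThreadPriority macros.
enum class RelativePriority {
    Idle, Lowest, BelowNormal, Normal, AboveNormal, Highest, TimeCritical
};

// Computes the resulting thread priority (1..31) for a class/level pair.
int EffectiveThreadPriority(PriorityClass cls, RelativePriority level) {
    const bool realtime = cls == PriorityClass::Realtime;
    const int base = static_cast<int>(cls);
    int result = base;
    switch (level) {
        case RelativePriority::Idle:         result = realtime ? 16 : 1;  break;  // saturation
        case RelativePriority::Lowest:       result = base - 2;           break;
        case RelativePriority::BelowNormal:  result = base - 1;           break;
        case RelativePriority::Normal:       result = base;               break;
        case RelativePriority::AboveNormal:  result = base + 1;           break;
        case RelativePriority::Highest:      result = base + 2;           break;
        case RelativePriority::TimeCritical: result = realtime ? 31 : 15; break;  // saturation
    }
    assert(result >= 1 && result <= 31);
    return result;
}
```

For instance, a thread in a Normal class process that asks for the Highest level ends up at priority 10, while the Time Critical level gives 15 regardless of the (non-Realtime) class.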

Here is an example of changing the current thread’s priority to +2 compared to the process priority class:

::SetThreadPriority(::GetCurrentThread(), THREAD_PRIORITY_HIGHEST);

And a C# version:

Thread.CurrentThread.Priority = ThreadPriority.Highest;

You can see thread priorities in Process Explorer’s bottom view:

Thread priorities in Process Explorer

There are two columns for priorities – A base priority and a Dynamic priority. The base priority is the priority set by code (SetThreadPriority) or the default, while the dynamic priority is the current thread’s priority, which could be slightly higher than the base (temporarily), and is changed because of certain decisions made by the kernel scheduler and other components and drivers that can produce such an effect. These thread boosting scenarios are outside the scope of this post.

If you want to see all threads in the system with their priorities, you can use my System Explorer tool, and select System / Threads menu item:

System Explorer showing all threads in the system

The two priority columns are shown (Priority is the same as Dynamic Priority in Process Explorer). You can sort by any column, including the priority to see which threads have the highest priority.

Native APIs

If you look in Process Explorer, there is a column named Base Priority under the Process Performance tab:

Process Performance tab

With this column visible, it indicates a process priority with a number. It’s mostly the corresponding number to the priority class (e.g. 10 for Above Normal, 13 for High, etc.), but not always. For example, Smss.exe has a value of 11, which doesn’t correspond to any priority class. Csrss.exe processes have a value of 13.

Changing to these numbers can only be done with the Native API. Specifically, NtSetInformationProcess with the ProcessBasePriority enumeration value can make that change. Weirdly enough, if the value is higher than the current process priority, the same privilege mentioned earlier is required. The weird part is that calling SetPriorityClass to change Normal to High always works, but calling NtSetInformationProcess to change from 8 to 13 (the same as Normal to High) requires that privilege; oh, well.

What about a specific thread? The native API allows changing a priority of a thread to any given value directly without the need to depend on the process priority class. Choosing a priority in the realtime range (16 or higher) still requires that privilege. But at least you get the flexibility to choose any priority value. The call to use is NtSetInformationThread with ThreadPriority enumeration. For example:

KPRIORITY priority = 14;
NtSetInformationThread(NtCurrentThread(), ThreadPriority, 
    &priority, sizeof(priority));

Note: the definitions for the native API can be obtained from the phnt project.

What happens if you need a high priority (16 or higher) but don’t have admin privileges in the process? Enter the Multimedia Class Scheduler.

The MMCSS Service

The multimedia class service coupled with a driver (mmcss.sys) provide a thread priority service intended for “multimedia” applications that would like to get some guarantee when “playing” multimedia. For example, if you have Spotify running locally, you’ll find there is one thread with priority 22, although the process itself has a priority class Normal:

Spotify threads

You can use the MMCSS API to get that kind of support. There is a Registry key that defines several “tasks” applications can use. Third parties can add more tasks:

MMCSS tasks

The base key is: HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Multimedia\SystemProfile\Tasks

The selected “Audio” task has several properties that are read by the MMCSS service. The most important is Priority, which is between 1 (low) and 8 (high) representing the relative priority compared to other “tasks”. Some values aren’t currently used (GPU Priority, SFIO Priority), so don’t expect anything from these.

Here is an example that uses the MMCSS API to increase the current thread’s priority:

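#include <Windows.h>
#include <avrt.h>
#include <stdio.h>

#pragma comment(lib, "avrt")

int main() {
	DWORD index = 0;
	// the task name must match one of the registered MMCSS tasks
	HANDLE h = AvSetMmThreadCharacteristics(L"Audio", &index);
	if (!h) {
		printf("AvSetMmThreadCharacteristics failed (%lu)\n", ::GetLastError());
		return 1;
	}
	AvSetMmThreadPriority(h, AVRT_PRIORITY_HIGH);
	// ... time-sensitive work goes here ...
	AvRevertMmThreadCharacteristics(h);
}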

The priority itself is an enumeration, where each value corresponds to a range of priorities (all above 15).

The returned HANDLE by the way, is to the MMCSS device (\Device\MMCSS). The argument to AvSetMmThreadCharacteristics must correspond to one of the “Tasks” registered. Calling AvRevertMmThreadCharacteristics reverts the thread to “normal”. There are more APIs in that set, check the docs.

Happy Threading!

Window Stations and Desktops

A while back I blogged about the differences between the virtual desktop feature exposed to users on Windows 10/11, and the Desktops tool from Sysinternals. In this post, I’d like to shed some more light on Window Stations, desktops, and windows. I assume you have read the aforementioned blog post before continuing.

We know that Window Stations are contained in sessions. Can we enumerate these? The EnumWindowStations API is available in the Windows API, but it only returns the Window Stations in the current session. There is no “EnumSessionWindowStations”. Window Stations, however, are named objects, and so are visible in tools such as WinObj (running elevated):

Window stations in session 0

The Window Stations in session 0 are at \Windows\WindowStations
The Window Stations in session x are at \Sessions\x\Windows\WindowStations
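
The two path patterns above are easy to capture in a helper. The function below is a hypothetical convenience (its name and signature are not part of any Windows API); it only builds the string, which is the kind of full object name a native open call would expect.

```cpp
#include <cassert>
#include <string>

// Builds the object manager path of a Window Station in a given session.
std::wstring WindowStationObjectPath(unsigned long sessionId, std::wstring const& name) {
    // The name must be a simple object name, not a path.
    assert(!name.empty() && name.find(L'\\') == std::wstring::npos);
    std::wstring prefix = sessionId == 0
        ? std::wstring(L"\\Windows\\WindowStations\\")
        : L"\\Sessions\\" + std::to_wstring(sessionId) + L"\\Windows\\WindowStations\\";
    return prefix + name;
}
```

For example, WinSta0 in session 1 maps to \Sessions\1\Windows\WindowStations\WinSta0.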

The OpenWindowStation API only accepts a “local” name, under the caller’s session. The native NtUserOpenWindowStation API (from Win32u.dll) is more flexible, accepting a full object name:

HWINSTA NtUserOpenWindowStation(POBJECT_ATTRIBUTES attr, ACCESS_MASK access);

Here is an example that opens the “msswindowstation” Window Station:

#include <Windows.h>
#include <winternl.h>

#pragma comment(lib, "ntdll")

HWINSTA NTAPI _NtUserOpenWindowStation(_In_ POBJECT_ATTRIBUTES attr, _In_ ACCESS_MASK access);
int main() {
	// force Win32u.DLL to load
	::LoadLibrary(L"user32");
	auto NtUserOpenWindowStation = (decltype(_NtUserOpenWindowStation)*)
		::GetProcAddress(::GetModuleHandle(L"win32u"), "NtUserOpenWindowStation");

	UNICODE_STRING winStaName;
	RtlInitUnicodeString(&winStaName, L"\\Windows\\WindowStations\\msswindowstation");
	OBJECT_ATTRIBUTES winStaAttr;
	InitializeObjectAttributes(&winStaAttr, &winStaName, 0, nullptr, nullptr);
	auto hWinSta = NtUserOpenWindowStation(&winStaAttr, READ_CONTROL);
	if (hWinSta) {
		// do something with hWinSta
		::CloseWindowStation(hWinSta);
	}
	return 0;
}

You may or may not have enough power to open a handle with the required access – depending on the Window Station in question. Those in session 0 are hardly accessible from non-session 0 processes, even with the SYSTEM account. You can examine their security descriptor with the kernel debugger (as other tools will return access denied). Note that the low 4 bits of the SecurityDescriptor field in the object header are not part of the address, so they are zeroed before the value is passed to !sd:

lkd> !object \Windows\WindowStations\msswindowstation
Object: ffffe103f5321c00  Type: (ffffe103bb0f0ae0) WindowStation
    ObjectHeader: ffffe103f5321bd0 (new version)
    HandleCount: 4  PointerCount: 98285
    Directory Object: ffff808433e412b0  Name: msswindowstation
lkd> dt nt!_OBJECT_HEADER ffffe103f5321bd0

   +0x000 PointerCount     : 0n98285
   +0x008 HandleCount      : 0n4
   +0x008 NextToFree       : 0x00000000`00000004 Void
   +0x010 Lock             : _EX_PUSH_LOCK
   +0x018 TypeIndex        : 0xa2 ''
   +0x019 TraceFlags       : 0 ''
   +0x019 DbgRefTrace      : 0y0
   +0x019 DbgTracePermanent : 0y0
   +0x01a InfoMask         : 0xe ''
   +0x01b Flags            : 0 ''
   +0x01b NewObject        : 0y0
   +0x01b KernelObject     : 0y0
   +0x01b KernelOnlyAccess : 0y0
   +0x01b ExclusiveObject  : 0y0
   +0x01b PermanentObject  : 0y0
   +0x01b DefaultSecurityQuota : 0y0
   +0x01b SingleHandleEntry : 0y0
   +0x01b DeletedInline    : 0y0
   +0x01c Reserved         : 0
   +0x020 ObjectCreateInfo : 0xfffff801`21c53940 _OBJECT_CREATE_INFORMATION
   +0x020 QuotaBlockCharged : 0xfffff801`21c53940 Void
   +0x028 SecurityDescriptor : 0xffff8084`3da8aa6c Void
   +0x030 Body             : _QUAD
lkd> !sd 0xffff8084`3da8aa60
->Revision: 0x1
->Sbz1    : 0x0
->Control : 0x8014
            SE_DACL_PRESENT
            SE_SACL_PRESENT
            SE_SELF_RELATIVE
->Owner   : S-1-5-18
->Group   : S-1-5-18
->Dacl    : 
->Dacl    : ->AclRevision: 0x2
->Dacl    : ->Sbz1       : 0x0
->Dacl    : ->AclSize    : 0x1c
->Dacl    : ->AceCount   : 0x1
->Dacl    : ->Sbz2       : 0x0
->Dacl    : ->Ace[0]: ->AceType: ACCESS_ALLOWED_ACE_TYPE
->Dacl    : ->Ace[0]: ->AceFlags: 0x0
->Dacl    : ->Ace[0]: ->AceSize: 0x14
->Dacl    : ->Ace[0]: ->Mask : 0x0000011b
->Dacl    : ->Ace[0]: ->SID: S-1-1-0

You can become SYSTEM to help with access by using PsExec from Sysinternals to launch a command window (or whatever) as SYSTEM but still run in the interactive session:

psexec -s -i -d cmd.exe

If all else fails, you may need to use the “Take Ownership” privilege to make yourself the owner of the object and change its DACL to allow yourself full access. Apparently, even that won’t work, as getting something from a Window Station in another session seems to be blocked (see replies in Twitter thread). READ_CONTROL is available to get some basic info.

Here is a screenshot of Object Explorer running under SYSTEM that shows some details of the “msswindowstation” Window Station:

Guess which processes hold handles to this hidden Window Station?

Once you are able to get a Window Station handle, you may be able to go one step deeper by enumerating desktops, if you managed to get at least the WINSTA_ENUMDESKTOPS access right:

::EnumDesktops(hWinSta, [](auto deskname, auto param) -> BOOL {
	printf(" Desktop: %ws\n", deskname);
	auto h = (HWINSTA)param;
	return TRUE;
	}, (LPARAM)hWinSta);

Going one level deeper, you can enumerate the top-level windows in each desktop (if any). For that you will need to connect the process to the Window Station of interest and then call EnumDesktopWindows:

void DoEnumDesktopWindows(HWINSTA hWinSta, PCWSTR name) {
	if (::SetProcessWindowStation(hWinSta)) {
		auto hdesk = ::OpenDesktop(name, 0, FALSE, DESKTOP_READOBJECTS);
		if (!hdesk) {
			printf("--- failed to open desktop %ws (%d)\n", name, ::GetLastError());
			return;
		}
		static WCHAR pname[MAX_PATH];
		::EnumDesktopWindows(hdesk, [](auto hwnd, auto) -> BOOL {
			static WCHAR text[64];
			if (::IsWindowVisible(hwnd) && ::GetWindowText(hwnd, text, _countof(text)) > 0) {
				DWORD pid;
				auto tid = ::GetWindowThreadProcessId(hwnd, &pid);
				auto hProcess = ::OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION, FALSE, pid);
				BOOL exeNameFound = FALSE;
				PWSTR exeName = nullptr;
				if (hProcess) {
					DWORD size = MAX_PATH;
					exeNameFound = ::QueryFullProcessImageName(hProcess, 0, pname, &size);
					::CloseHandle(hProcess);
					if (exeNameFound) {
						exeName = ::wcsrchr(pname, L'\\');
						if (exeName == nullptr)
							exeName = pname;
						else
							exeName++;
					}
				}
				printf("  HWND: 0x%08X PID: 0x%X (%d) %ws TID: 0x%X (%d): %ws\n", 
					(DWORD)(DWORD_PTR)hwnd, pid, pid, 
					exeNameFound ? exeName : L"", tid, tid, text);
			}
			return TRUE;
			}, 0);
		::CloseDesktop(hdesk);
	}
}

Calling SetProcessWindowStation can only work with a Window Station that belongs to the current session.

Here is an example output for the interactive session (Window Stations enumerated with EnumWindowStations):

Window station: WinSta0
 Desktop: Default
  HWND: 0x00010E38 PID: 0x4D04 (19716) Zoom.exe TID: 0x5FF8 (24568): ZPToolBarParentWnd
  HWND: 0x000A1C7A PID: 0xB804 (47108) VsDebugConsole.exe TID: 0xDB50 (56144): D:\Dev\winsta\x64\Debug\winsta.exe
  HWND: 0x00031DE8 PID: 0xBF40 (48960) devenv.exe TID: 0x94E8 (38120): winsta - Microsoft Visual Studio Preview
  HWND: 0x00031526 PID: 0x1384 (4996) msedge.exe TID: 0xE7C (3708): zodiacon/ObjectExplorer: Explore Kernel Objects on Windows and
  HWND: 0x00171A9A PID: 0xA40C (41996)  TID: 0x9C08 (39944): WindowStation (\Windows\WindowStations\msswindowstation)
  HWND: 0x000319D0 PID: 0xA40C (41996)  TID: 0x9C08 (39944): Object Manager - Object Explorer 2.0.2.0 (Administrator)
  HWND: 0x001117DC PID: 0x253C (9532) ObjExp.exe TID: 0x9E10 (40464): Object Manager - Object Explorer 2.0.2.0 (Administrator)
  HWND: 0x00031CA8 PID: 0xBE5C (48732) devenv.exe TID: 0xC250 (49744): OpenWinSta - Microsoft Visual Studio Preview (Administrator)
  HWND: 0x000B1884 PID: 0xA8A0 (43168) DbgX.Shell.exe TID: 0xA668 (42600):  - KD '', Local Connection  - WinDbg 1.2306.12001.0 (Administra
...
  HWND: 0x000101C8 PID: 0x3598 (13720) explorer.exe TID: 0x359C (13724): Program Manager
Window station: Service-0x0-45193$
 Desktop: sbox_alternate_desktop_0x6A80
 Desktop: sbox_alternate_desktop_0xA94C
 Desktop: sbox_alternate_desktop_0x3D8C
 Desktop: sbox_alternate_desktop_0x7EF8
 Desktop: sbox_alternate_desktop_0x72FC
 Desktop: sbox_alternate_desktop_0x27B4
 Desktop: sbox_alternate_desktop_0x6E80
 Desktop: sbox_alternate_desktop_0x6C54
 Desktop: sbox_alternate_desktop_0x68C8
 Desktop: sbox_alternate_desktop_0x691C
 Desktop: sbox_alternate_desktop_0x4150
 Desktop: sbox_alternate_desktop_0x6254
 Desktop: sbox_alternate_desktop_0x5B9C
 Desktop: sbox_alternate_desktop_0x59B4
 Desktop: sbox_alternate_desktop_0x1384
 Desktop: sbox_alternate_desktop_0x5480

The desktops in the Window Station “Service-0x0-45193$” above don’t seem to have top-level visible windows.

You can also access the clipboard and atom table of a given Window Station, if you have a powerful enough handle. I’ll leave that as an exercise as well.

Finally, what about session enumeration? That’s the easy part – no need to call NtOpenSession with Session objects that can be found in the “\KernelObjects” directory in the Object Manager’s namespace – the WTS family of functions can be used. Specifically, WTSEnumerateSessionsEx can provide some important properties of a session:

void EnumSessions() {
	DWORD level = 1;
	PWTS_SESSION_INFO_1 info = nullptr;
	DWORD count = 0;
	if (!::WTSEnumerateSessionsEx(WTS_CURRENT_SERVER_HANDLE, &level, 0, &info, &count)) {
		printf("Failed to enumerate sessions (%u)\n", ::GetLastError());
		return;
	}
	for (DWORD i = 0; i < count; i++) {
		auto& data = info[i];
		printf("Session %u (%ws) Username: %ws\\%ws State: %s\n", data.SessionId, data.pSessionName, 
			data.pDomainName ? data.pDomainName : L"NT AUTHORITY", data.pUserName ? data.pUserName : L"SYSTEM", 
			StateToString((WindowStationState)data.State));
	}
	// level 1 entries embed string pointers, so the buffer is freed with WTSFreeMemoryEx
	::WTSFreeMemoryEx(WTSTypeSessionInfoLevel1, info, count);
}
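
The StateToString helper and the WindowStationState type used above are not shown here. A minimal sketch of what they might look like, assuming the enum simply mirrors the documented WTS_CONNECTSTATE_CLASS values (the enumerator names and the returned strings are my own choices for illustration):

```cpp
// Assumed to mirror WTS_CONNECTSTATE_CLASS from wtsapi32.h, value for value
enum class WindowStationState {
	Active,        // WTSActive (0)
	Connected,     // WTSConnected (1)
	ConnectQuery,  // WTSConnectQuery (2)
	Shadow,        // WTSShadow (3)
	Disconnected,  // WTSDisconnected (4)
	Idle,          // WTSIdle (5)
	Listen,        // WTSListen (6)
	Reset,         // WTSReset (7)
	Down,          // WTSDown (8)
	Init,          // WTSInit (9)
};

// Returns a narrow string, matching the %s used in the printf call above
const char* StateToString(WindowStationState state) {
	switch (state) {
		case WindowStationState::Active: return "Active";
		case WindowStationState::Connected: return "Connected";
		case WindowStationState::ConnectQuery: return "Connect Query";
		case WindowStationState::Shadow: return "Shadow";
		case WindowStationState::Disconnected: return "Disconnected";
		case WindowStationState::Idle: return "Idle";
		case WindowStationState::Listen: return "Listen";
		case WindowStationState::Reset: return "Reset";
		case WindowStationState::Down: return "Down";
		case WindowStationState::Init: return "Init";
	}
	// any value outside the documented range
	return "Unknown";
}
```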

What about creating a process to use a different Window Station and desktop? One member of the STARTUPINFO structure passed to CreateProcess (lpDesktop) allows setting a desktop name and an optional Window Station name separated by a backslash (e.g. “MyWinSta\MyDesktop”).
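
Here is a minimal sketch of how that could look. The MakeDesktopSpec and LaunchOnDesktop names are hypothetical helpers invented for this example; only STARTUPINFO, lpDesktop and CreateProcess come from the Windows API. The Windows-specific part is guarded so the string helper can be compiled on its own:

```cpp
#include <string>
#ifdef _WIN32
#include <Windows.h>
#endif

// Hypothetical helper: builds the lpDesktop string.
// An empty Window Station name yields just the desktop name.
std::wstring MakeDesktopSpec(const std::wstring& winsta, const std::wstring& desktop) {
	return winsta.empty() ? desktop : winsta + L"\\" + desktop;
}

#ifdef _WIN32
// Hypothetical helper: launches a process on the given Window Station/desktop.
// commandLine is taken by value because CreateProcessW may modify its buffer.
bool LaunchOnDesktop(std::wstring commandLine, const std::wstring& winsta, const std::wstring& desktop) {
	auto spec = MakeDesktopSpec(winsta, desktop);
	STARTUPINFOW si{};
	si.cb = sizeof(si);
	si.lpDesktop = &spec[0];	// lpDesktop is a non-const pointer
	PROCESS_INFORMATION pi{};
	if (!::CreateProcessW(nullptr, &commandLine[0], nullptr, nullptr, FALSE,
		0, nullptr, nullptr, &si, &pi))
		return false;
	// the handles are not needed once the process is running
	::CloseHandle(pi.hThread);
	::CloseHandle(pi.hProcess);
	return true;
}
#endif
```

For example, LaunchOnDesktop(L"notepad.exe", L"MyWinSta", L"MyDesktop") would pass “MyWinSta\MyDesktop” in lpDesktop, using the names from the example above.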

There is more to Window Stations and Desktops than meets the eye… this should give interested readers a head start in doing further research.