The term “Zombie Process” in Windows is not an official one, as far as I know. Regardless, I’ll define a zombie process as a process that has exited (for whatever reason), but with at least one reference remaining to its kernel process object (EPROCESS), so that the process object cannot be destroyed.
How can we recognize zombie processes? Is this even important? Let’s find out.
All kernel objects are reference counted. The reference count includes the handle count (the number of open handles to the object), and a “pointer count”, the number of kernel clients to the object that have incremented its reference count explicitly so the object is not destroyed prematurely if all handles to it are closed.
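To make the two counters concrete, here is a toy model in portable C++. It is only a sketch under simplifying assumptions: ToyObject and its member names are hypothetical, and the real object header tracks these counts differently. It shows just one thing: the object goes away only when both the handle count and the pointer count reach zero.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of a reference-counted kernel object.
// Hypothetical type for illustration; NOT the real object header layout.
struct ToyObject {
    std::uint32_t handleCount = 0;   // open handles to the object
    std::uint32_t pointerCount = 0;  // explicit references taken by kernel clients
    bool destroyed = false;

    void OpenHandle() { ++handleCount; }
    void Reference() { ++pointerCount; }

    void CloseHandle() {
        assert(handleCount > 0);
        --handleCount;
        DestroyIfUnreferenced();
    }

    void Dereference() {
        assert(pointerCount > 0);
        --pointerCount;
        DestroyIfUnreferenced();
    }

    bool IsAlive() const { return !destroyed; }

private:
    // The object is destroyed only when no handle and no pointer reference remains.
    void DestroyIfUnreferenced() {
        if (handleCount == 0 && pointerCount == 0)
            destroyed = true;
    }
};
```

In this model, a single forgotten handle is enough to keep the object around forever, which is exactly the situation a zombie process is in.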
Process objects are managed within the kernel by the (undocumented) EPROCESS structure, which contains or points to everything about the process – its handle table, image name, access token, job (if any), threads, address space, etc. When a process is done executing, some aspects of the process get destroyed immediately. For example, all handles in its handle table are closed, and its address space is destroyed. General properties of the process remain, however, some of which only have true meaning once a process dies, such as its exit code.
Process enumeration tools such as Task Manager or Process Explorer don’t show zombie processes, simply because the process enumeration APIs (EnumProcesses, Process32First/Process32Next, the native NtQuerySystemInformation, and WTSEnumerateProcesses) don’t return them – they only return processes that can still run code. The kernel debugger, on the other hand, shows all processes, zombie or not, when you type something like !process 0 0. Identifying zombie processes is easy – their handle table and handle count are both shown as zero. Here is one example:
kd> !process ffffc986a505a080 0
PROCESS ffffc986a505a080
    SessionId: 1  Cid: 1010    Peb: 37648ff000  ParentCid: 0588
    DirBase: 16484cd000  ObjectTable: 00000000  HandleCount:   0.
    Image: smartscreen.exe
Any kernel object referenced by the process object remains alive as well – such as a job (if the process is part of a job), and the process primary token (access token object). We can get more details about the process by passing the detail level “1” in the !process command:
lkd> !process ffffc986a505a080 1
PROCESS ffffc986a505a080
    SessionId: 1  Cid: 1010    Peb: 37648ff000  ParentCid: 0588
    DirBase: 16495cd000  ObjectTable: 00000000  HandleCount:   0.
    Image: smartscreen.exe
    VadRoot 0000000000000000 Vads 0 Clone 0 Private 16. Modified 7. Locked 0.
    DeviceMap ffffa2013f24aea0
    Token                             ffffa20147ded060
    ElapsedTime                       1 Day 15:11:50.174
    UserTime                          00:00:00.000
    KernelTime                        00:00:00.015
    QuotaPoolUsage[PagedPool]         0
    QuotaPoolUsage[NonPagedPool]      0
    Working Set Sizes (now,min,max)  (17, 50, 345) (68KB, 200KB, 1380KB)
    PeakWorkingSetSize                2325
    VirtualSize                       0 Mb
    PeakVirtualSize                   2101341 Mb
    PageFaultCount                    2500
    MemoryPriority                    BACKGROUND
    BasePriority                      8
    CommitCharge                      20
    Job                               ffffc98672eea060
Notice that the address space no longer exists (VadRoot is zero). The VAD (Virtual Address Descriptor) tree is a balanced binary search tree that describes the address space of a process – which parts are committed, which parts are reserved, etc. Other details of the process are still there because they are direct members of the EPROCESS structure, such as the kernel and user time the process has used, and its start and exit times (not shown in the debugger’s output above).
We can ask the debugger to show the reference count of any kernel object by using the generic !object command, followed by !trueref if there are handles open to the object:
lkd> !object ffffc986a505a080
Object: ffffc986a505a080  Type: (ffffc986478ce380) Process
    ObjectHeader: ffffc986a505a050 (new version)
    HandleCount: 1  PointerCount: 32768
lkd> !trueref ffffc986a505a080
ffffc986a505a080: HandleCount: 1 PointerCount: 32768 RealPointerCount: 1
Clearly, there is a single handle open to the process and that’s the only thing keeping it alive.
One other thing that remains is the unique process ID (shown as Cid in the above output). Process and thread IDs are generated by using a private handle table just for this purpose. This explains why process and thread IDs are always multiples of four, just like handles. In fact, the kernel treats PIDs and TIDs with the HANDLE type, rather than with something like ULONG. Since there is a limit to the number of handles in a process (16711680, the reason is not described here), that’s also the limit for the number of processes and threads that can exist on a system. This is a rather large number, so probably not an issue from a practical perspective, but zombie processes still keep their PIDs “taken”, so those PIDs cannot be reused. This means that in theory, some code can create millions of processes, terminate them all, but not close the handles it receives back, and eventually new processes could not be created anymore because PIDs (and TIDs) run out. I don’t know what would happen then 🙂
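The exhaustion scenario can be pictured with a tiny ID table in portable C++. This is a sketch only: ToyIdTable is a hypothetical class, the lowest-free-slot reuse policy is my assumption for the sake of a deterministic example, and the real kernel table works differently. What it does mirror from the text is that IDs are multiples of four, and that an ID stays taken until the object owning it is actually destroyed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy ID table (hypothetical; NOT the kernel's real PID/TID handle table).
// IDs are multiples of four; 0 is never handed out and signals "no free ID".
class ToyIdTable {
public:
    explicit ToyIdTable(std::size_t capacity) : m_used(capacity, false) {}

    // Returns the lowest free ID, or 0 if every slot is still taken.
    std::uint32_t Allocate() {
        for (std::size_t i = 0; i < m_used.size(); ++i) {
            if (!m_used[i]) {
                m_used[i] = true;
                return static_cast<std::uint32_t>((i + 1) * 4);
            }
        }
        return 0;
    }

    // Frees an ID. For a process this would happen only when the process
    // object is destroyed, i.e. after the last handle to it is closed.
    void Release(std::uint32_t id) {
        if (id < 4 || id % 4 != 0)
            return;  // not an ID this table could have produced
        std::size_t slot = id / 4 - 1;
        if (slot < m_used.size())
            m_used[slot] = false;
    }

private:
    std::vector<bool> m_used;
};
```

With a capacity of three, the fourth Allocate call fails until some earlier ID is released, no matter how long ago the "process" that owned it stopped running.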
Here is a simple loop to do something like that by creating and destroying Notepad processes but keeping handles open:
WCHAR name[] = L"notepad";
STARTUPINFO si{ sizeof(si) };
PROCESS_INFORMATION pi;
int i = 0;
for (; i < 1000000; i++) {	// use 1 million as an example
    auto created = ::CreateProcess(nullptr, name, nullptr, nullptr,
        FALSE, 0, nullptr, nullptr, &si, &pi);
    if (!created)
        break;
    ::TerminateProcess(pi.hProcess, 100);
    printf("Index: %6d PID: %u\n", i + 1, pi.dwProcessId);
    // close the thread handle, but deliberately leave pi.hProcess open
    // so the terminated process becomes a zombie
    ::CloseHandle(pi.hThread);
}
printf("Total: %d\n", i);
The code closes the handle to the first thread in the process, as keeping it open would create “Zombie Threads”, much like zombie processes – threads that can no longer run any code, but still exist because at least one handle is keeping them alive.
How can we get a list of zombie processes on a system given that the “normal” tools for process enumeration don’t show them? One way of doing this is to enumerate all the process handles in the system, and check if the process pointed to by each handle is truly alive by calling WaitForSingleObject on the handle (of course the handle must first be duplicated into our process so it’s valid to use) with a timeout of zero – we don’t really want to wait. If the result is WAIT_OBJECT_0, this means the process object is signaled, meaning it exited – it’s no longer capable of running any code. I have incorporated that into my Object Explorer (ObjExp.exe) tool. Here is the basic code to get details for zombie processes (the code for enumerating handles is not shown but is available in the source code):
m_Items.clear();
m_Items.reserve(128);
// maps a zombie's PID to its index in m_Items
std::unordered_map<DWORD, size_t> processes;
for (auto const& h : ObjectManager::EnumHandles2(L"Process")) {
    auto hDup = ObjectManager::DupHandle(
        (HANDLE)(ULONG_PTR)h->HandleValue, h->ProcessId,
        SYNCHRONIZE | PROCESS_QUERY_LIMITED_INFORMATION);
    if (hDup && WAIT_OBJECT_0 == ::WaitForSingleObject(hDup, 0)) {
        //
        // zombie process
        //
        auto pid = ::GetProcessId(hDup);
        if (pid) {
            auto it = processes.find(pid);
            ZombieProcess zp;
            // reuse the existing entry if this zombie was already seen
            auto& z = it == processes.end() ? zp : m_Items[it->second];
            z.Pid = pid;
            z.Handles.push_back({ h->HandleValue, h->ProcessId });
            WCHAR name[MAX_PATH];
            if (::GetProcessImageFileName(hDup,
                name, _countof(name))) {
                z.FullPath =
                    ProcessHelper::GetDosNameFromNtName(name);
                // guard against a name with no backslash
                auto bs = wcsrchr(name, L'\\');
                z.Name = bs ? bs + 1 : name;
            }
            ::GetProcessTimes(hDup,
                (PFILETIME)&z.CreateTime, (PFILETIME)&z.ExitTime,
                (PFILETIME)&z.KernelTime, (PFILETIME)&z.UserTime);
            ::GetExitCodeProcess(hDup, &z.ExitCode);
            if (it == processes.end()) {
                m_Items.push_back(std::move(z));
                processes.insert({ pid, m_Items.size() - 1 });
            }
        }
    }
    if (hDup)
        ::CloseHandle(hDup);
}
The data structure built for each process and stored in the m_Items vector is the following:
struct HandleEntry {
    ULONG Handle;
    DWORD Pid;
};
struct ZombieProcess {
    DWORD Pid;
    DWORD ExitCode{ 0 };
    std::wstring Name, FullPath;
    std::vector<HandleEntry> Handles;
    DWORD64 CreateTime, ExitTime, KernelTime, UserTime;
};
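The four time members hold raw FILETIME values, which count 100-nanosecond intervals: CreateTime and ExitTime are absolute timestamps, while KernelTime and UserTime are durations. Turning them into something readable is simple arithmetic. The helpers below are a minimal sketch in portable C++; the names LifetimeSeconds and CpuSeconds are hypothetical and not part of Object Explorer.

```cpp
#include <cstdint>

// FILETIME values are expressed in 100-nanosecond units,
// so there are ten million of them in one second.
constexpr std::uint64_t TicksPerSecond = 10000000ULL;

// How long the process lived, in seconds, given its creation and exit
// timestamps (hypothetical helper, not part of the original tool).
double LifetimeSeconds(std::uint64_t createTime, std::uint64_t exitTime) {
    if (exitTime < createTime)
        return 0.0;  // defensive: inconsistent input
    return static_cast<double>(exitTime - createTime) / TicksPerSecond;
}

// Total CPU time consumed (kernel + user), in seconds.
double CpuSeconds(std::uint64_t kernelTime, std::uint64_t userTime) {
    return static_cast<double>(kernelTime + userTime) / TicksPerSecond;
}
```

For example, the KernelTime of 00:00:00.015 in the debugger output above corresponds to 150000 such units.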
The ObjectManager::DupHandle function is not shown, but it basically calls DuplicateHandle for the process handle identified in some process. If that works, and the PID returned by GetProcessId is non-zero, we can go on to collect the details. Getting the process image name is done with GetProcessImageFileName – seems simple enough, but this function gets the NT name format of the executable (something like \Device\HarddiskVolume3\Windows\System32\Notepad.exe), which is good enough if only the “short” final image name component is desired. If the full image path is needed in Win32 format (e.g. “c:\Windows\System32\notepad.exe”), it must be converted (ProcessHelper::GetDosNameFromNtName). You might be thinking that it would be far simpler to call QueryFullProcessImageName and get the Win32 name directly – but this does not work, and the function fails. Internally, the NtQueryInformationProcess native API is called with ProcessImageFileNameWin32 in the latter case, which fails if the process is a zombie one.
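The conversion itself boils down to replacing a device prefix with a drive letter. The sketch below shows that idea in portable C++; it is not the actual ProcessHelper::GetDosNameFromNtName code. NtPathToDosPath and DeviceMap are hypothetical names, and the mapping table is assumed to be supplied by the caller (on Windows it could be built by calling QueryDosDevice for each drive letter). The comparison here is case sensitive, which is a simplification.

```cpp
#include <string>
#include <utility>
#include <vector>

// Each entry maps an NT device prefix (e.g. L"\\Device\\HarddiskVolume3")
// to a drive (e.g. L"C:"). Assumed to be prepared by the caller.
using DeviceMap = std::vector<std::pair<std::wstring, std::wstring>>;

// Converts an NT-style path to a Win32-style path using the supplied map.
// Returns the input unchanged if no device prefix matches.
std::wstring NtPathToDosPath(const std::wstring& ntPath, const DeviceMap& devices) {
    for (const auto& entry : devices) {
        const std::wstring& device = entry.first;
        const std::wstring& drive = entry.second;
        // Match only on a path component boundary, so that
        // HarddiskVolume3 does not match HarddiskVolume30.
        if (ntPath.size() > device.size() &&
            ntPath.compare(0, device.size(), device) == 0 &&
            ntPath[device.size()] == L'\\') {
            return drive + ntPath.substr(device.size());
        }
    }
    return ntPath;
}
```

Given a table that maps \Device\HarddiskVolume3 to C:, the NT path from the example above comes back as C:\Windows\System32\Notepad.exe.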
Running Object Explorer and selecting Zombie Processes from the System menu shows a list of all zombie processes (you should run it elevated for best results):

The above screenshot shows that many of the zombie processes are kept alive by GameManagerService.exe. This executable is from Razer and runs on my system. It definitely has a bug that keeps process handles open far longer than needed. I’m not sure it would ever close these handles. Terminating this process will resolve the issue, as the kernel closes all handles in a process’s handle table once the process terminates. This will allow all those processes that are held only by such a handle to be freed from memory.
I plan to add Zombie Threads to Object Explorer – I wonder how many threads are being kept “alive” without good reason.