How to make sense of those annoying blue screen crashes. And maybe even fix them.
My dad had updated his Windows 7 computer to Windows 10. But he said that he was getting frequent freezes and blue screens of death. Frequent as in two or three each day.
So he asked me to take a look.
My assumption with any blue screen of death is that it is caused by either a) bad hardware or b) bad drivers. As much as Windows gets a bad rap for the blue screen of death, it’s very rare to get one on good hardware and a clean install. Most likely, there’s some misbehaving hardware or a buggy driver.
(Sure, there are bugs in the Microsoft drivers too, but troubleshooting is an exercise in optimism. Fixing Microsoft’s mistakes isn’t really in our power, so just hope your blue screen isn’t a Microsoft bug. On the plus side, Microsoft is known to fix bugs every now and then, when you submit error reports and crash dumps.)
The technical name for a Blue Screen of Death is a Bug Check. A bug check is when Windows realises something has gone so horribly wrong on your computer that it can’t keep running. So, with as little fuss as it can manage, it kills itself and restarts.
This is actually a good thing. Because if it blindly kept running, it might do something very unexpected (and also very stupid). Unexpected like destroying the files on your disk. Stupid like calculating your employees’ pay cheques wrong. Or both, like damaging the million dollar industrial control equipment attached to the computer.
Rather than doing something really bad, Windows just stops. Which is terribly inconvenient, but better than finding your family photos, financial records and university thesis are gone.
The first step in troubleshooting is to get a memory dump.
Windows will save a copy of what was in your computer’s memory to disk as part of a bug check. This may be a minidump or a full or partial memory dump.
Windows will tend to save a minidump the first time a bug check happens, and escalate to larger memory dumps if they keep happening.
Check for a file called
C:\Windows\memory.dmp or files in
Work with whatever is newest.
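If you have a pile of saved dumps, a short script can pick the newest one for you. The following is a minimal Python sketch, not part of any Windows tooling: the function name newest_dump is made up for illustration, and the minidump folder in the usage note (C:\Windows\Minidump) is the usual default location, so treat it as an assumption for your machine.

```python
from pathlib import Path
from typing import Iterable, Optional


def newest_dump(candidates: Iterable[Path]) -> Optional[Path]:
    """Return the most recently modified .dmp file among the candidates.

    Each candidate may be a dump file itself or a folder containing dumps.
    Missing paths are skipped. Returns None if no dump file is found.
    """
    dumps = []
    for candidate in candidates:
        if candidate.is_dir():
            # A folder (such as a minidump folder) contributes every *.dmp inside.
            dumps.extend(p for p in candidate.glob("*.dmp") if p.is_file())
        elif candidate.is_file() and candidate.suffix.lower() == ".dmp":
            dumps.append(candidate)
    # "Newest" here means the latest modification time.
    return max(dumps, key=lambda p: p.stat().st_mtime, default=None)
```

On a typical machine you might call it as newest_dump([Path(r"C:\Windows\memory.dmp"), Path(r"C:\Windows\Minidump")]), adjusting both paths (assumed defaults) to match your crash dump settings.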
The settings for crash dumps are set in Control Panel -> System -> Advanced System Settings -> Startup And Recovery -> Write Debugging Information.
My dad was saving each memory dump file he got, so I had plenty to work with here.
A bug check is really good from a troubleshooting point of view. There are lots of details in it. But a hard lock up or freeze is much more difficult to diagnose.
As it turns out, there is a way to bug check a Windows computer with a keyboard shortcut. You have to opt in by setting some registry entries though.
Add a value:
- Data Type:
CTRL + SCRLK x 2
This assumes your computer is “spinning” rather than really crashed. It’s entirely possible a frozen computer won’t respond to this. But, it’s worth a shot!
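As a hedged illustration of the opt-in, the Python sketch below builds the text of a .reg file for this setting. The key and value names follow Microsoft's documented CrashOnCtrlScroll setting (kbdhid for USB keyboards, i8042prt for PS/2 keyboards) as best understood here, so treat them as assumptions and check them against the official documentation before importing anything. The function name is hypothetical. A reboot is normally needed before the shortcut (hold the right CTRL key and press SCROLL LOCK twice) takes effect.

```python
# Services whose Parameters key holds the setting, per keyboard type
# (assumed from Microsoft's documentation; verify before use).
KEYBOARD_SERVICES = {"usb": "kbdhid", "ps2": "i8042prt"}


def crash_on_ctrl_scroll_reg(keyboard: str = "usb", enabled: bool = True) -> str:
    """Build the text of a .reg file that toggles CrashOnCtrlScroll."""
    service = KEYBOARD_SERVICES[keyboard]
    key = (
        "HKEY_LOCAL_MACHINE\\SYSTEM\\CurrentControlSet\\Services\\"
        f"{service}\\Parameters"
    )
    value = 1 if enabled else 0
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        # REG_DWORD value: 1 enables the keyboard-triggered bug check, 0 disables it.
        f'"CrashOnCtrlScroll"=dword:{value:08x}',
        "",
    ]
    # .reg files conventionally use Windows line endings.
    return "\r\n".join(lines)
```

Writing the returned text to a file and reviewing it by eye before merging keeps the registry change deliberate, which seems wise for a setting whose whole purpose is to crash the machine.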
To make any sense of the crash dump you’ll need to install the Windows Debugger, or windbg (less affectionately known as wind bag).
Download it as part of the Windows SDK; just choose Debugging Tools for Windows.
It’s worth checking you have a recent version of the debugger, as newer versions of Windows have more bug check codes. And newer debuggers tend to have better analysis logic. Every version of Windows released has a corresponding debugger, and Windows 10 releases are coming 2 or 3 times per year, so update regularly!
Then, go to File -> Open a Crash Dump, and find your
Windbg will load it and spew out a bunch of messages as it loads the memory dump and loads symbols.
You’ll probably see various messages about symbols not being loaded right.
To fix this, type .symfix into the little 1: kd> command prompt at the bottom.
If you find yourself analysing memory dumps often (as in, more than once), it’s worth creating a folder c:\symbols and setting an environment variable like so:
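As a hedged illustration, the variable the Windows debuggers read is _NT_SYMBOL_PATH, and a common value points at a local cache folder backed by Microsoft's public symbol server. The Python sketch below only builds that value; the function name and the default server URL are assumptions for illustration, so adjust them to your own setup.

```python
def symbol_path(
    cache_dir: str = r"c:\symbols",
    server: str = "https://msdl.microsoft.com/download/symbols",
) -> str:
    """Build a symbol path that caches downloaded symbols in cache_dir.

    The srv*<cache>*<server> form tells the debugger to look in the local
    cache first and fall back to the symbol server, storing what it fetches.
    """
    return f"srv*{cache_dir}*{server}"


def symbol_env(cache_dir: str = r"c:\symbols") -> dict:
    """Return the environment variable name and value as a one-entry dict."""
    return {"_NT_SYMBOL_PATH": symbol_path(cache_dir)}
```

You can paste the resulting value into the Windows environment variable dialog, or pass symbol_env() to a process you launch from Python.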
Before I even look at what caused the bug check, I run
This tells windbg to go analyse the crash and tell me what it thinks caused it.
There’s even a handy blue link to click, so I don’t even need to type anything!
Often, this will immediately point a finger at a driver, which you should try to remove or update (frequently your video or hard disk driver).
(This is, of course, terribly bad practice.
You should always check what the bug check code is and what it means.
Because some codes mean !analyze will get things horribly wrong.
But it’s such an easy thing to do, I’ll do this without even thinking.)
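If you have many dumps to triage, the automatic analysis can also be run without the GUI. The Python sketch below builds a command line for kd.exe, the console kernel debugger that ships alongside windbg, using its -z (open dump file), -y (symbol path) and -c (run commands) options. The function name is hypothetical, and it assumes kd.exe is on your PATH and that !analyze -v followed by q (quit) is the command string you want.

```python
from typing import List, Optional


def analyze_command(
    dump_file: str,
    debugger: str = "kd.exe",
    symbol_path: Optional[str] = None,
) -> List[str]:
    """Build an argument list that opens a dump, runs !analyze -v, then quits."""
    cmd = [debugger, "-z", dump_file]
    if symbol_path:
        # Optional: override the symbol path for this run only.
        cmd += ["-y", symbol_path]
    # Commands are separated by semicolons; q exits the debugger when done.
    cmd += ["-c", "!analyze -v; q"]
    return cmd
```

On a Windows machine with the debugging tools installed, passing the list to subprocess.run(cmd, capture_output=True, text=True) should give you the analysis text to save or search. The sketch only builds the command, so nothing runs until you choose to.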
Usually, you get a nice stack trace with !analyze, but this one had lots of noise about symbol errors.
So I did a manual kb to get a Stack Backtrace.
My dad’s memory dump showed the ULCDRHlp driver near the top of the stack.
Which we found was an old ULead DVD driver (from a time when burning DVDs needed a special driver).
But the analysis said memory corruption was the cause of the crash, which means ULCDRHlp crashed the system, but probably didn’t corrupt the memory in the first place.
Most likely, some other driver corrupted memory and poor old ULCDRHlp was an innocent bystander.
But, given it’s an old driver and wasn’t needed, we disabled it anyway.
Autoruns provides a nice, one-click way to stop loading a Windows driver.
After you untick the checkbox, reboot your computer.
(And, if your computer doesn’t restart, use safe mode to re-enable the driver).
The windbg help file is actually very detailed. Every possible code is listed, along with a description of what it means, and some basic steps you can take to troubleshoot.
Seriously, it’s one of the best help files I’ve read (and I’ve read plenty).
Help -> Contents -> Bug Checks (blue screens) -> Bug Check Code Reference
If you lost your bug check code in the debug spew, or missed it on the blue screen, you can tell the debugger to show it to you by doing a
The bug check code for my dad was
Unfortunately, the help file indicated “this is a very common bug check”, and listed a bunch of instructions which didn’t match what I was seeing on the screen.
I decided I was in over my head and tried a different avenue.
Driver verifier is a program that comes with Windows to add additional sanity checks to drivers. It slows your computer down, and uses more memory, but means a memory corruption style of crash can be noticed when the offending driver causes the corruption, rather than when some innocent driver actually crashes the computer.
Driver developers often use Driver Verifier to check their drivers are doing everything correctly. It also helps us track down problem drivers.
- Search or Run verifier, and elevate to admin.
- Choose Create standard settings
- Choose Automatically select drivers built for older versions of Windows
- You’ll get a list of older drivers - 4 on my dad’s computer.
I checked details of these, noted them down and I named two as suspect:
Dad commented that windrvr6 was part of an old microcontroller programmer and he’d heard some bad things about its stability.
(Note: selecting all drivers installed on this computer caused a bug check on boot on my dad’s computer, and I needed safe mode to fix it. Be warned.)
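The same setup can be done from an elevated command prompt, which is handy when you need to switch verification off again from safe mode. The Python sketch below only builds verifier command lines using the /standard, /driver, /querysettings and /reset switches. The function names are hypothetical, and the .sys file names in the usage note are assumed from the driver names above, so check the real file names on your machine.

```python
from typing import List, Sequence


def verifier_enable(drivers: Sequence[str]) -> List[str]:
    """Command to verify only the named drivers with the standard settings."""
    if not drivers:
        # Refuse an empty list so nobody verifies everything by accident.
        raise ValueError("name at least one driver file, e.g. example.sys")
    return ["verifier", "/standard", "/driver", *drivers]


def verifier_query() -> List[str]:
    """Command to show which settings and drivers are currently selected."""
    return ["verifier", "/querysettings"]


def verifier_reset() -> List[str]:
    """Command to clear all Driver Verifier settings."""
    return ["verifier", "/reset"]
```

For example, verifier_enable(["windrvr6.sys", "ULCDRHlp.sys"]) gives the command to verify just those two drivers. The changes take effect after a reboot, and verifier_reset() gives the command to turn everything off once you are done.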
After selecting the 4 drivers and rebooting, I got a new bug check during the boot sequence:
The help file said sub-code 0x83 meant: The driver called MmMapIoSpace without having locked down the MDL pages.
(And I won’t pretend I know what that means, other than it doesn’t sound much like memory corruption).
This was driver verifier in action: it trapped a problematic driver really quickly. And that meant we got better info to troubleshoot with!
The crash analysis clearly showed driver windrvr6 on the stack as the likely culprit.
The error wasn’t obviously related to memory corruption, so I wasn’t confident that windrvr6 was actually the root cause, but it certainly looked like a contributing factor.
So we used autoruns to disable
I rebooted and disabled driver verifier. Given that the crash only happened every 6-18 hours, I left dad with instructions to keep note of further crashes. (At this point, I thought I’d need at least one more round of troubleshooting).
A week later, my dad reported no more blue screens! So problem solved!
With windbg and driver verifier, you can make a decent guess at what is causing your Windows computer to blue screen.
Once you identify a driver causing the problem, you can either a) uninstall it, b) disable it, or c) update it.
I don’t claim to be a kernel debugger (most of my development is in nice high level languages like C#). The links below have additional resources you can use if you have a blue screen (which I used to prepare this post).