BSOD | Blue screen error of death | windbg troubleshooting

June 5th, 2009

The Blue Screen of Death signifies a state in which the Windows operating system cannot proceed and must shut the system down, because of either a serious hardware conflict or a system error.

A blue screen error, also known as a Stop error or, casually, a BSOD, typically shows some strange error codes and descriptions specific to the problem. These errors are commonly caused by hardware IRQ conflicts, faulty device drivers, or faulty memory access. If the error has occurred for the first time on your machine, it may simply be the result of newly added hardware. Removing any newly installed device drivers, or sometimes just restarting the system, will get you past the BSOD; but if the issue keeps recurring, there is more going on than a reboot can fix.

I have had my share of run-ins with Windows BSODs, and after some searching around on the internet I figured out that there are better ways to resolve a BSOD permanently. One of them is the ‘windbg’ tool, which has proven quite handy for me personally, and I recommend it as one of the very best in the field.

If you are serious about Windows and have encountered a BSOD, it is important to understand what is behind that blue screen covered in strange-looking white characters.

Whenever a blue screen occurs on your machine, the system generates a file named MEMORY.DMP under C:\WINNT (this is the default path; you can change the location at My Computer -> Properties -> Advanced -> Settings, in the Startup and Recovery section). Files with the .DMP extension are binary and cannot be read in normal text editors. Microsoft provides a tool called ‘windbg’ for evaluating these memory dump files; it extracts the crucial information and hints at the reasons for the BSOD (Blue Screen Of Death).
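Because the dump is binary, a quick sanity check before opening it in windbg can be useful. The following is a minimal Python sketch, not part of windbg itself. It assumes that kernel dump files start with the 8-byte signature `PAGEDUMP` (32-bit) or `PAGEDU64` (64-bit); the function name `identify_dump` is made up for this illustration.

```python
# Assumption: kernel-generated dump files begin with one of these 8-byte signatures.
KERNEL_DUMP_SIGNATURES = {
    b"PAGEDUMP": "32-bit kernel dump",
    b"PAGEDU64": "64-bit kernel dump",
}


def identify_dump(path):
    """Return a short description of the dump type, or None if unrecognized."""
    # Only the first 8 bytes are needed; the rest of the file can be very large.
    with open(path, "rb") as handle:
        header = handle.read(8)
    return KERNEL_DUMP_SIGNATURES.get(header)
```

Calling `identify_dump(r"C:\WINNT\MEMORY.DMP")` on the default path mentioned above would return one of the two descriptions for a valid dump, or `None` for a file that is truncated or is not a kernel dump at all. Reading the file may require administrator rights.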

Enough talk; let us now see how to use the windbg tool.

How to use windbg tool: 

  • Install windbg on your machine. It is distributed by Microsoft as part of Debugging Tools for Windows.
  • The installer is small and finishes within a few seconds.
  • Run windbg from Start -> Programs -> Debugging Tools for Windows. This opens a blank window.
  • Now select File -> Open Crash Dump and point to the MEMORY.DMP file on your machine. It is generally located under the Windows directory (%SystemRoot%, for example C:\WINNT).
  • Note: windbg can work on memory dump files created on any Microsoft-certified hardware, not only on the machine it is running on. So when a machine cannot boot, its hard disk can be plugged into another Windows machine, and windbg there can still analyze the .dmp files generated on the failed machine.
  • Once windbg is pointed to the dump file, you will see a text pane containing a number of hexadecimal numbers and error messages.
  • Don’t worry: although this output is cryptic, we do not have to decipher it by hand. That is the business of windbg :-)
  • Find the section with the header ‘Bugcheck Analysis’, under which you will find lines similar to the following:

Example:

-----------------------
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************
PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced.  This cannot be protected by try-except,
it must be protected by a Probe.  Typically the address is just plain bad or it
is pointing at freed memory.
Arguments:
Arg1: eaed4304, memory referenced.
Arg2: 00000000, value 0 = read operation, 1 = write operation.
Arg3: bf89ba6c, If non-zero, the instruction address which referenced the bad memory
      address.
Arg4: 00000001, (reserved)
-----------------------
  • In most cases, the above information explains the root cause of the BSOD. In the above case, the error was caused by a read operation (Arg2 is 0) on invalid system memory at address eaed4304, issued by the instruction at bf89ba6c. A reference like this comes from kernel-mode code, so it usually points at a faulty driver or system component (or sometimes bad memory) rather than at an ordinary application.
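The argument lines in the example above follow a regular pattern, so they can be picked apart with a few lines of Python. This is only an illustrative sketch: the function names are made up here, and the interpretation of Arg1 to Arg3 is taken from the text windbg printed for bug check 0x50, so it should not be reused for other bug check codes.

```python
import re

# Matches lines such as "Arg1: eaed4304, memory referenced."
ARG_LINE = re.compile(r"^Arg(\d): ([0-9a-fA-F]+), (.*)$")


def parse_bugcheck_args(text):
    """Map argument number -> (integer value, description) for each ArgN line."""
    args = {}
    for line in text.splitlines():
        match = ARG_LINE.match(line.strip())
        if match:
            number, value, description = match.groups()
            args[int(number)] = (int(value, 16), description)
    return args


def describe_page_fault(args):
    """Summarize bug check 0x50 using the Arg1..Arg3 meanings printed by windbg."""
    operation = "write" if args[2][0] == 1 else "read"
    return "{} of address {:08x} by instruction at {:08x}".format(
        operation, args[1][0], args[3][0]
    )


sample = """\
Arg1: eaed4304, memory referenced.
Arg2: 00000000, value 0 = read operation, 1 = write operation.
Arg3: bf89ba6c, If non-zero, the instruction address which referenced the bad memory
      address.
Arg4: 00000001, (reserved)
"""

# prints: read of address eaed4304 by instruction at bf89ba6c
print(describe_page_fault(parse_bugcheck_args(sample)))
```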

If you are not able to understand the cause from the above section, there is a “!analyze -v” hyperlink on the same page. Click the hyperlink and you will see more detailed technical output describing the error. For the above error, “!analyze -v” gave me the following:

-----------------------
READ_ADDRESS: unable to get nt!MmSpecialPoolStart
unable to get nt!MmSpecialPoolEnd
unable to get MmPageSize (0x0) - probably bad symbols
eaed4304
FAULTING_IP:
+ffffffffbf89ba6c
GetContextState failed, 0x80070026
GetContextState failed, 0x80070026
GetContextState failed, 0x80070026
GetContextState failed, 0x80070026
bf89ba6c ??              ???
MM_INTERNAL_CODE:  1
DEFAULT_BUCKET_ID:  DRIVER_FAULT
BUGCHECK_STR:  0x50
STACK_TEXT:
GetContextState failed, 0x80070026
Unable to get current machine context, Win32 error 0n38
 
STACK_COMMAND:  kb
SYMBOL_NAME:  ANALYSIS_INCONCLUSIVE
FOLLOWUP_NAME:  MachineOwner
MODULE_NAME: Unknown_Module
IMAGE_NAME:  Unknown_Image
DEBUG_FLR_IMAGE_TIMESTAMP:  0
BUCKET_ID:  CORRUPT_MODULELIST
Followup: MachineOwner
-----------------------

The line to look at here is “DEFAULT_BUCKET_ID”. In this case it is “DRIVER_FAULT”, meaning that a driver conflict or error caused the BSOD. Note that the same output also reports “probably bad symbols” and “ANALYSIS_INCONCLUSIVE”, so windbg could not name the specific driver in this run. You can now proceed with further troubleshooting, such as finding out whether any recent hardware or driver changes were made to the system.
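If you save the “!analyze -v” output to a text file, the upper-case `KEY: value` lines can be collected into a dictionary so that fields like DEFAULT_BUCKET_ID are easy to look up. The sketch below is a simple illustration in Python; the function name and the regular expression are my own assumptions about the layout shown above, not an official format description.

```python
import re

# Matches upper-case "KEY: value" lines, e.g. "DEFAULT_BUCKET_ID:  DRIVER_FAULT".
FIELD_LINE = re.compile(r"^([A-Z][A-Z0-9_]+):\s*(.*)$")


def parse_analyze_fields(text):
    """Collect the first value seen for each upper-case KEY: value line."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_LINE.match(line.strip())
        if match:
            key, value = match.groups()
            fields.setdefault(key, value.strip())
    return fields


sample_output = """\
MM_INTERNAL_CODE:  1
DEFAULT_BUCKET_ID:  DRIVER_FAULT
BUGCHECK_STR:  0x50
STACK_TEXT:
GetContextState failed, 0x80070026
SYMBOL_NAME:  ANALYSIS_INCONCLUSIVE
MODULE_NAME: Unknown_Module
BUCKET_ID:  CORRUPT_MODULELIST
Followup: MachineOwner
"""

fields = parse_analyze_fields(sample_output)
# prints: DRIVER_FAULT 0x50
print(fields["DEFAULT_BUCKET_ID"], fields["BUGCHECK_STR"])
```

Only the first occurrence of each key is kept, and lines that are not in upper-case `KEY:` form (such as the repeated GetContextState failures) are ignored.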

A few tips:

  • There can be situations where the BSOD does not even let you start your operating system. Removing any newly installed hardware components works most of the time. If not, look at where exactly the BSOD is displayed. If it appears while Windows is starting, try booting into Safe Mode or Safe Mode with Command Prompt (by pressing the F8 key at the boot menu), copy the dump files to another machine, and troubleshoot the issue there.
  • If it is a memory-related error, try swapping the memory modules between their slots.
  • If it is caused by bad blocks on the hard disk, boot the machine from a bootable disk (for example, into the Recovery Console) and use the ‘chkdsk’ utility with the /P option to check the disk and fix the errors.
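When dump files have been copied to another machine as described in the first tip, the analysis can also be run without the GUI. The sketch below assumes that the console debugger `kd`, which ships alongside windbg in Debugging Tools for Windows, is on the PATH, and it uses the `-z` (open dump file) and `-c` (run command) options. The function names and the example path are hypothetical.

```python
import shutil
import subprocess


def build_analyze_command(dump_path, debugger="kd"):
    """Build a command line that opens a dump, runs !analyze -v, then quits."""
    return [debugger, "-z", str(dump_path), "-c", "!analyze -v; q"]


def run_analysis(dump_path, debugger="kd"):
    """Run the debugger and return its text output, or None if it is not on PATH."""
    if shutil.which(debugger) is None:
        return None
    result = subprocess.run(
        build_analyze_command(dump_path, debugger),
        capture_output=True,
        text=True,
    )
    return result.stdout
```

For example, `run_analysis(r"D:\copied\MEMORY.DMP")` would return the same kind of text that windbg shows in its window, which can then be saved or searched for the DEFAULT_BUCKET_ID line.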

Useful links:

What is a BSOD?
