Sunday, November 30, 2008

Troubleshooting new PC bluescreens

When I first put together my custom-built new PC running 64-bit Vista a few weeks ago, everything seemed to be running great: It was very fast and responsive, all of the hardware components appeared to be working, and I could play 3D games on the system with no errors.

However, after a few days of using the system, it became clear that there was a problem: On four occasions, after I left the machine running overnight, I woke up in the morning to find that it had bluescreened while unattended.  Each time, I found that all of my open programs/windows had closed, and an error dialog was open with a message saying that the machine had bluescreened (using that term, "bluescreened"!).  (However, on two other occasions, the machine ran ok overnight, without bluescreening.)

Also, on one occasion, the machine bluescreened while I was actively using it, while I was playing a game of Bionic Commando: Rearmed, which was especially aggravating.

I really dislike system instability.  I've always placed a premium on stability on systems I build; while troubleshooting and tracking down problems can sometimes be interesting, I'd much rather be spending my time on my computer to work on a project, or to play a game.  So, I set out to track down and fix the cause of the bluescreens.  (Note: This is the time that having a custom-built machine can be "interesting" -- if I couldn't figure out the cause of the issue, I wouldn't have the fallback option of dialing up a vendor's 1-800 number to get help dealing with the problem!)

Bad RAM?

My first thought was that one of the sticks of RAM in the system might be bad, or maybe that the two sets of RAM sticks that I had put into the machine -- a set of two 2 GB sticks from Corsair, and a set of two 2 GB sticks from Crucial (8 GB total) -- were incompatible with one another.  I wasn't terribly happy about this prospect, since it would involve additional troubleshooting to determine which stick(s) of RAM were responsible for the problem, and then having to ship the parts back to the store for a replacement or a refund -- something I've never needed to do before.

I decided to use a memory test utility to try to determine whether there really was a RAM issue.  I found a nice blog post by Shivaranjan Bhoopathy detailing Vista's built-in memory diagnostic tool (thanks Shivaranjan!).  I had previously been unaware of this utility; I'd had it in mind that I'd need to find a 3rd-party utility to do the job.

I ran the utility (which is designed to schedule itself, reboot the machine, and then run automatically on the subsequent boot, before Windows loads).  To my relief, the utility reported that all of my RAM was ok!  However, this meant that I needed to continue looking for the cause of the bluescreens.

Heat Issue?

My experience over the years has shown that weird, sometimes-reproducible, sometimes-not, hardware-related issues are often attributable to overheating. 

I found a nice, free utility for Windows, SpeedFan, which gives a readout of CPU temperature (among other features).  SpeedFan reported that my two CPU cores were running at a temperature of between 70 and 75 degrees C -- very hot, dangerously so, for the CPU!

I also rebooted the machine, entered the built-in BIOS utility program as the machine was booting, and checked the temperature there; the BIOS utility program confirmed that the CPU temperature was a very high 70+ degrees C.

So, at this point I thought I'd found the cause of the problem; the only question was how to fix it.  I turned off the machine, opened up the case, and checked the heatsink.  I found that the heatsink was slightly loose -- I was able to wiggle it back and forth slightly with my fingers; if I had installed the heatsink correctly, then I shouldn't have been able to move it at all. 

The problem turned out to be that two of the four "posts" on the corners of the heatsink which bolt the heatsink tightly down against the surface of the CPU were not tightened down all of the way.  As a result, the heatsink wasn't making tight contact against the CPU surface, and consequently wasn't doing a good job of drawing the heat away from the CPU. 

I properly tightened down the heatsink, and confirmed that it was now tightly bolted down against the CPU surface, and couldn't be "wiggled."  I turned the machine back on, and monitored the temperature with SpeedFan.  This time, the CPU temperature never rose above 40-45 degrees C, even after the machine had been on for a while.  Much better!
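For anyone who wants to compare before-and-after readings like these a bit more systematically, here is a minimal Python sketch.  The sample readings are illustrative values picked from the ranges I saw (not an actual SpeedFan log), and the 60 degree C alert threshold is my own hypothetical cutoff, not a manufacturer specification.

```python
# Hypothetical alert threshold in degrees C; an assumption for illustration,
# not a figure from Intel or SpeedFan.
ALERT_THRESHOLD_C = 60.0


def summarize(readings_c, threshold_c=ALERT_THRESHOLD_C):
    """Return (lowest, highest, number of readings above the threshold)."""
    if not readings_c:
        raise ValueError("no readings supplied")
    over = [r for r in readings_c if r > threshold_c]
    return min(readings_c), max(readings_c), len(over)


# Illustrative readings typed in by hand from the ranges described above.
before_fix = [70, 72, 75, 73]
after_fix = [40, 42, 45, 43]

for label, readings in (("before", before_fix), ("after", after_fix)):
    low, high, hot = summarize(readings)
    print(f"{label}: {low}-{high} C, {hot} reading(s) above {ALERT_THRESHOLD_C} C")
```

With a loose heatsink, every sample lands above the cutoff; after tightening it down, none do.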

Unfortunately, after I left the machine on overnight again, I came back to it in the morning to find that it had, once again, bluescreened while it was unattended.  This meant that I needed to continue looking for the cause of the issue.

BIOS and Network Driver Update

At this point, I was running out of ideas of things to check.  I had been doing some large file copies over the network while the machine was unattended overnight (copying photos and music files from my old PC to the new one); I thought that maybe a problem with the network driver or the machine BIOS might be responsible for the bluescreening problems.

I visited the Foxconn downloads site (my motherboard manufacturer's site), and downloaded a new Network driver and installed that; then (unfortunately violating the troubleshooting principle of "only change one thing at a time between tests"), I also downloaded and installed an updated BIOS, using the Foxconn LiveUpdate utility, also from the Foxconn site.

After the BIOS update, I was afraid momentarily that I had "bricked" my motherboard when, after the machine rebooted following the update, I was presented with a scary-looking error message following the machine's power-on self-tests:

CMOS Checksum Bad

However, after some hurried research via Google search (on another machine), this error message turned out only to represent a notification that the machine's BIOS had been updated.  I was able to just bypass the error and continue to boot into Windows, and the machine was fine.

This notification is a good thing, in the case that I might have had a virus that had performed a BIOS update (for who-knows-what purposes).  However, (1) the error message was unnecessarily scary and unhelpful, and (2) it would have been nice if the Foxconn update utility had warned me about the message in advance, so I didn't have to get so worried upon seeing it!

The same "CMOS Checksum Bad" error message appeared again upon a subsequent boot, but I was (apparently) able to clear it simply by going into the machine's boot-time BIOS utility, and then doing a save-and-exit from the utility (without changing anything).

Conclusion

In any event, after installing the BIOS and Network driver updates, I've had no further bluescreening problems!  The machine has been rock-solid stable ever since -- just the way I like it.

I can conclude that one or both of the BIOS and Network driver updates were responsible for fixing the problem -- although as I noted earlier, it would have been nice if I'd performed the updates one at a time, so I could better conclude what the specific solution to the problem was.
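To make the "only change one thing at a time" principle concrete, here is a small Python sketch of the test plan I could have followed.  The change names are just labels I made up for this example; the idea is to list every on/off combination of the candidate fixes, starting from the unchanged baseline, and run an overnight test for each one.

```python
from itertools import product


def isolation_plan(changes):
    """Yield every on/off combination of the candidate changes, baseline first."""
    for flags in product([False, True], repeat=len(changes)):
        yield dict(zip(changes, flags))


# Hypothetical labels for the two updates described in this post.
plan = list(isolation_plan(["network_driver_update", "bios_update"]))

for number, config in enumerate(plan, start=1):
    applied = [name for name, on in config.items() if on] or ["baseline (no changes)"]
    print(f"Test {number}: {', '.join(applied)}")
```

With two candidate changes that is four overnight tests, which is probably why I got impatient and did both at once.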

I'm also happy in retrospect that the bluescreens occurred, since they led me to discover the heat issue with the machine and the improperly-installed heatsink; if I hadn't noticed that, letting the machine run for a long period of time at 70+ degrees C might have had a significant negative impact on the life of the CPU.  I also got to discover a couple of cool utilities that I hadn't previously been aware of, namely, the built-in Vista memory diagnostic tool, and the SpeedFan temperature-monitoring utility.

Fix: Front Audio Panel doesn't work on Foxconn P45A-S Motherboard

After recently building my new PC, one of the few problems I had was that the front audio connection on the PC wasn't working.  The PC case (a XION II XON-101) included an audio jack on the front panel of the case, and the motherboard (a Foxconn P45A-series) supported that front audio connection, but when I would plug headphones into the front audio jack, nothing would happen; sound would continue playing through the speakers (connected via the PC's back panel audio jack), and no sound would come through the headphones connected to the front audio jack.

I opened up the case and double-checked that the audio cables from the front panel of the case were properly connected to the motherboard, but everything appeared to be fine.

After some Google searching, I ended up finding the clue to the solution in a post near the bottom of this forums.whirlpool.net.au forum thread: A manufacturer-specific audio driver needed to be installed, instead of the default Microsoft driver provided with Vista. 

Checking my motherboard manual, I found that the onboard audio was provided by Realtek. I downloaded the Realtek audio driver from the Foxconn support site, installed it, and the problem was solved!  Audio now plays properly through the headphones when headphones are plugged into the front audio jack.

Thursday, November 20, 2008

New PC 2008! Budget: Sub-$700

Earlier this month, I built a new PC to use as my primary home desktop machine.  It was the 3rd PC build I've done, with the earlier builds having been in 2004 and in 2000. 

One of my goals for this build was to keep the budget under $700 (not including a monitor).  This is the parts list that I ended up with:

Component     Part                                                            Price  Vendor
Motherboard   Foxconn P45A-S LGA 775 Intel ATX                                $110   newegg.com
Processor     Intel Core 2 Duo E7200 Wolfdale 2.53GHz                         $80    frys.com (on sale)
RAM           Corsair 4 GB (2x 2GB) DDR2-800 (PC2 6400) TWIN2X4096-6400C5DHX  $26    frys.com (on sale)
RAM           Crucial 4 GB (2x 2GB) DDR2-800 (PC2 6400)                       $30    frys.com (on sale)
Hard Drive    Western Digital 1 TB SATA2 16 MB Cache                          $120   newegg.com
Video Card    XFX GeForce 9800 GT 512 MB                                      $110   newegg.com
DVD-RW Drive  LG|GH20NS15 20X SATA                                            $23    newegg.com (on sale)
OS            Vista 64-bit Home Premium (OEM)                                 $100   newegg.com
Case + PSU    XION II Black Steel ATX Mid Tower                               $60    newegg.com

Total cost of all parts and software: $659
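As a quick sanity check on that total, here is a short Python sketch that adds up the prices from the parts list above and compares the result against the $700 budget.  The short component labels are just my own shorthand for this example.

```python
# Prices in US dollars, copied from the parts list above.
parts = [
    ("Motherboard", 110),
    ("Processor", 80),
    ("RAM (Corsair)", 26),
    ("RAM (Crucial)", 30),
    ("Hard Drive", 120),
    ("Video Card", 110),
    ("DVD-RW Drive", 23),
    ("OS", 100),
    ("Case + PSU", 60),
]

BUDGET = 700  # target ceiling for the build, not including a monitor

total = sum(price for _, price in parts)
print(f"Total: ${total} (${BUDGET - total} under the ${BUDGET} budget)")
```

That leaves $41 of headroom under the budget.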

Everything else that I needed either came integrated on the motherboard (sound, network) or else was "recycled" from my previous machine (monitor, mouse, keyboard, UPS, XBox 360 USB gamepad).

Here are the assembled goods, just prior to the build:

[Photo: NewPC2008_TheGoods]

I was able to snag the good deals (at least as of this month, 11/2008!) on the processor and RAM via my RSS subscription to slickdeals.net; I knew that I was going to be building the machine in November, so when those deals came across Slickdeals in October, I snapped them up.

I also was able to save some money on the cost of the system by putting all of the individual parts on my birthday wish list; I got some of the components for my birthday, saving me from having to buy them.  (Thanks very much Dad, Dad-in-law K., and Jeremy!)

I snapped a few more pics during the early stages of assembling the machine.  The empty case:

[Photo: NewPC2008_EmptyCase]

The motherboard (Foxconn P45A-S), fresh out of the box, with no parts inserted yet:

[Photo: NewPC2008_EmptyMobo]

The motherboard with the CPU (Intel Core 2 Duo E7200) and heatsink mounted, in the case:

[Photo: NewPC2008_CpuInMoboInCase]

So far, I'm really enjoying 64-bit Vista.  With the 8 GB of RAM and the other parts I put into the system, it runs very smoothly -- as fast as, if not faster than, XP ran on my old 2004 machine that was built on a similar budget.  In particular, Vista seems to start up (from a cold boot) noticeably faster than XP used to.  Based on my experience so far, I'd recommend Vista over XP for anyone purchasing a new desktop machine, at least for any machine with better than low-range specs.

I did have an initial issue with the machine bluescreening on a few occasions when I left it running overnight, which I've since resolved; I'll detail my experiences in troubleshooting that issue, and the eventual solution, in a future post.

With the set of hardware in this build, I have a Vista Windows Experience Index score of 5.4 overall:

  • 5.4 Processor
  • 5.9 RAM
  • 5.9 Graphics
  • 5.9 Gaming Graphics
  • 5.9 Hard Disk
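The overall score isn't an average: Vista reports the lowest of the subscores as the overall (base) score.  A tiny Python sketch using the subscores above shows how that works out for this build:

```python
# Subscores copied from the list above.
subscores = {
    "Processor": 5.4,
    "RAM": 5.9,
    "Graphics": 5.9,
    "Gaming Graphics": 5.9,
    "Hard Disk": 5.9,
}

# The overall score is the lowest subscore, so one weak component sets it.
base_score = min(subscores.values())
limiting_component = min(subscores, key=subscores.get)

print(f"Overall score: {base_score} (limited by {limiting_component})")
```

So the processor is the component holding the overall score at 5.4.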

This is pretty much in line with my expectations; the nice thing is that I picked out a motherboard which will allow me to upgrade to a faster, quad-core processor in a couple of years should I feel the need to do so.  (With the good deal that I got on the processor that I bought, upgrading from the 2.53 GHz processor I bought to even a 3.0 GHz one would have run me around an additional $80 -- double what I paid -- so I'm happy with the deal I got, even if the processor is slightly "underpowered" compared to the rest of the machine.)
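For a rough sense of that trade-off, here is the arithmetic as a Python sketch, using the 2.53 GHz clock speed and $80 price from the parts list and the approximate $80 upgrade cost mentioned above.  Clock speed alone is only a crude stand-in for real-world performance, so treat this as a back-of-the-envelope comparison.

```python
paid_price, paid_clock_ghz = 80, 2.53      # the E7200 as purchased
alt_price, alt_clock_ghz = 80 + 80, 3.0    # roughly $80 more for a 3.0 GHz part

clock_gain = (alt_clock_ghz - paid_clock_ghz) / paid_clock_ghz
price_gain = (alt_price - paid_price) / paid_price

print(f"Clock speed increase: {clock_gain:.1%}")
print(f"Price increase: {price_gain:.0%}")
```

In other words, paying twice as much would have bought a clock speed bump of a bit under 20%.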

In practice, the machine has run very smoothly with several older games I've tried out on it that chugged a bit on my older system, such as Oblivion and Titan Quest; I was also pleasantly surprised to find that the machine runs Call of Duty 4, a fairly new game which was included with the GeForce 9800 that I bought, very smoothly, even on "high" settings.  Hopefully this computer will turn out to be serviceable as a gaming machine for at least a few years (in addition to its roles as a hobbyist development machine and general household-use PC)!