Hardware failing - where to look first?


    For the last couple of weeks my main desktop (see sig) is failing... sob!

    This is mostly a vent as I process my problem. However, good tips and advice are always welcome!

    Anyway, the symptoms are: the system totally shuts off under moderate load (transcoding a video, for example) and then restarts itself. Under light loads (internet, vmware) the system is OK. Temps are within range, no messages in any log files, no warning - it just shuts down, pauses 3-4 secs, and boots back up. I'm not calling it a "reboot" because it takes slightly longer to restart than punching the reset button.

    I tested the RAM first, both by removing pairs and running memtest overnight. Doesn't seem to be a RAM issue. That leaves CPU, motherboard/chipset, power supply - and that's about it.

    To restate a point: temps are normal for my system so it's not overheating. I have turned off overclocking and returned to normal voltages and clock speeds but this has made no difference. A bit of history: the machine is about 4.5 years old. It ran seriously overclocked (3.8 GHz) for about a year and a half, then became occasionally unstable during the summer, so I slowed it to 3.08 GHz and left it there. The power supply has a variable-speed fan that I bump up in the summer to keep it going, but the CPU, GPU, NB, and SB are all water cooled.

    My first guess is it's the old CPU giving up the ghost, but I hate to throw $300 into an old technology quad-core unless it's the solution. If it's the mobo, I'd rather spend the $900 to move up to a P67 chipset and an i7. Unfortunately, I'm at end-of-life for most of my components - DDR2 RAM, and the 850w P/S doesn't have the latest required mobo connectors for the newer stuff, so it's the whole kit-and-caboodle (the video card will do for a bit longer).

    I suppose I could buy the cheapest dual core as a test to see if the crash goes away, and then wait for the mobo I want to become available....

    ...what to do

    Please Read Me

    #2
    Re: Hardware failing - where to look first?

    Look at all the ribbon cable clamps. Over time, the heating/cooling cycle causes the ribbon cables to stretch. This elongates the holes that the microteeth on the clamps punch into each of the lines. It can cause interruptions in power when any of the holes, under heat, elongates enough to break contact with its microtooth. Many years ago I fell prey to this.
    Windows no longer obstructs my view.
    Using Kubuntu Linux since March 23, 2007.
    "It is a capital mistake to theorize before one has data." - Sherlock Holmes


      #3
      Re: Hardware failing - where to look first?

      What Snowhog said... and... cracks in the mobo as it loses plasticizers and begins shrinking can cause thin copper traces to break. When cold they conduct, but as the board warms up they separate. A 10X loupe can reveal the cracks. Capacitor electrolytes can begin to dry out. When cool they work; when warm their values shift significantly.

      Also, I use an infrared thermometer gun (Mastercool, part#52224-SP) to locate hot parts or components. It has a laser light to pinpoint where the thermal sensor is looking. I slowly sweep the laser over the board and components. But most mobos and their parts cannot practically be repaired, and it is easier and better to replace the mobo (and CPU) and video.
      "A nation that is afraid to let its people judge the truth and falsehood in an open market is a nation that is afraid of its people.”
      – John F. Kennedy, February 26, 1962.


        #4
        Re: Hardware failing - where to look first?

        I did a plug-replug and vacuum a while back, but it was well before the problem began. This system runs 24/7, which I believe stabilizes heat expansion somewhat.

        I will commit to a pull-apart inspection and re-assembly before I buy anything, including a thermal laser gun... but man, that'd be cool to own! 8)

        Please Read Me


          #5
          Re: Hardware failing - where to look first?

          I have had this problem several times over the years and it usually is the power supply. I would rule that out first. I keep an old power supply around for just such an occasion.
          FKA: tanderson


            #6
            Re: Hardware failing - where to look first?

            Funny you should mention that Tanderson, I woke up this morning thinking the same thing. In my case, since my PS fan is somewhat slowed down (for noise) I decided I would try boosting it to full speed and testing. Results to follow.

            Also, for those interested and following this thread: I ran mprime CPU torture test 1 - a CPU/FPU intensive test (little or no memory) and it failed on the second iteration so this backs up my belief it's not RAM.

            Next, I installed a graphics intensive game (OilRush) and ran it. It would load, I could log into my profile and then I let it sit for a few minutes. About five minutes into the intro (screen graphics and audio) it would crash. This leads me to believe it's power related because so many parts were involved but not necessarily CPU intensity.

            Unfortunately, I do have some older PS's but they're not of the variety that will power this beast - 24 pin + 4 pin mobo and PCIe video card connectors required.

            If the cooling fan doesn't do it, I'll buy a newer model PS and give it a whirl. With 8gb RAM (4 sticks), four HD's, an overclocked GPU, water pump, ssd, cd-rom and all the other accessories it would be a surprise that I'm stressing the 700 watt limit!

            EDIT: Went to NewEgg and they have a wattage calculator. It says I'm about 5% under-powered! Granted, these are generic calculations, but it explains a lot. Hmmm, OCZ is having a sale on their 1000 watt unit...

            Please Read Me


              #7
              Re: Hardware failing - where to look first?

              Originally posted by oshunluvr
              ...
              I will commit to a pull-apart inspection and re-assembly before I buy anything, including a thermal laser gun... but man, that'd be cool to own! 8)
              Useful, too! I got it to locate areas of heat loss around the house, both inside and outside. You can be 15 or 20 feet away from some spot and still measure its temp. Great for seeing heat loss through the roof. Shoot the bark of a tree from inside the door of the house to measure the outdoor temp. It can measure from -50 to 500 C, with a resolution of 0.5 C, repeatable to ±1 C. I've also used it to measure my homemade ice cream while it was being made, to calibrate my wife's oven, to scan laptops I recondition for hot components, and to measure the temperature of electrical outlets serving high current loads to make sure they don't have contact resistance high enough to start a fire. And my grandsons LOVE to go around measuring things with it!
              "A nation that is afraid to let its people judge the truth and falsehood in an open market is a nation that is afraid of its people.”
              – John F. Kennedy, February 26, 1962.


                #8
                Re: Hardware failing - where to look first?

                Ok, pushed the fan up to the "Cleared for Take-off" settings and my system returned to stability. Ordered the new OCZ ZX-1000 about 2 minutes later.

                Now I can take the old (still functional, just under powered) PS and modify it for water cooling!

                And hey, GG - that laser thingy is only $50... maybe I need one of those!

                Please Read Me


                  #9
                  Re: Hardware failing - where to look first?

                  $45 !!!
                  http://www.amazon.com/Mastercool-Inf...9624938&sr=8-5
                  And it comes with a metal probe thermometer, good for testing internal temperature of meat when baking it in the oven. (Surface temperature doesn't mean much in this context)
                  "A nation that is afraid to let its people judge the truth and falsehood in an open market is a nation that is afraid of its people.”
                  – John F. Kennedy, February 26, 1962.


                    #10
                    Re: Hardware failing - where to look first?

                    Intermittent electrical problems are truly the pits. The worst (but far from only) case I ever dealt with was a PC I sold to a company back in the early 1990s. Anyone remember Advanced Logic Research (ALR)? That was the PC brand -- a good one, supposedly. Anyway, the customer called and said it would spontaneously reboot, at irregular long (hours) intervals. We brought it into the shop, and sure enough, she was correct. But why? Zooming to the end of a painful story -- there was a cold solder joint in the PSU. At some point in warming up and using the PC for productive work, it would open and pooof! Reboot. That was NOT easy to figure out.

                    OK, the SECOND worst case I had was with a server, running Novell networking (anyone remember that?). The server would spontaneously throw up a memory error, while the network was running. I sent my field tech with a set of new memory modules to replace them on site. Next day, guess what? So, I sent a tech and had it brought back into the shop. After running a few hours, it would throw up a memory error. We swapped in some more new memory modules. After running a few hours, up came the error. We declared it "the motherboard", got an RMA from ALR, and pulled the motherboard to return it. While on the bench, removing the memory modules, I noticed a tiny bit of fuzz, embedded in the memory connector. I removed the fuzz, reinstalled the memory modules, reinstalled the motherboard, and that was the end of the mystery -- it ran fine for years.

                    Anyway, I sympathize with anyone having hardware issues.


                      #11
                      Re: Hardware failing - where to look first?

                      Yeah, the hardest part is always figuring out which exact part is causing the problem because you get no real feed-back from your hardware like you do from software. I hate jumping onto the swap-a-part bandwagon unless you really have a full set of spares.

                      I think the best clues in my case were the pause a few seconds longer than a regular reboot, and the fact that the system seemed normal (no BIOS errors) upon rebooting. If it had been a CPU/motherboard error, I would have expected no automatic restart and possibly a BIOS error on restart (like I get when I overclock too far).

                      After posting - which forces me to slow down and think about what's happening, and then the sleep time which allows one's subconscious to work on it - the answer seemed more likely to be the power supply. Tanderson was a few hours ahead of me.

                      Please Read Me


                        #12
                        Re: Hardware failing - where to look first?

                        Originally posted by oshunluvr
                        . . . If the cooling fan doesn't do it, I'll buy a newer model PS and give it a whirl. With 8gb RAM (4 sticks), four HD's, an overclocked GPU, water pump, ssd, cd-rom and all the other accessories it would be a surprise that I'm stressing the 700 watt limit!

                        EDIT: Went to NewEgg and they have a wattage calculator. It says I'm about 5% under-powered! Granted, these are generic calculations, but it explains a lot. Hmmm, OCZ is having a sale on their 1000 watt unit...
                        Yes, 700 watts is probably impossible to use in a regular computer. I just now had the opportunity to turn on an older single core machine, so I remembered to plug in my cheater cord so I could measure the real power consumption - it maxed out at 0.77 amps on the peaks. At roughly 120 volts, that's about 92 watts. So yes, if you were actually using 700 watts then, in a small office, you would probably have to open the window during most of the year. For reference, think about a normal electric heater, which is typically 1000 watts. Also, you'll be plugging it into a 15 amp circuit (about 1800 watts at 120 volts), so you won't be able to plug in two of them without tripping the breaker. That is assuming that the PS and usage numbers are correct.

                        That said, it is indeed possible to have short spikes of very high draw, but that's what capacitors are for. However, salesmen who don't know Ohm's law will sometimes inadvertently (or perhaps deliberately) rewrite the rules of physics.
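
For anyone who wants to redo the arithmetic in the post above, here is a rough Python sketch. The 120 V mains figure is an assumption (the post only gives 0.77 A and 92 W, which works out to about that), the function names are made up for illustration, and power factor is ignored, so treat the numbers as ballpark only.

```python
# Assumed mains voltage: not stated in the post, inferred from 0.77 A -> ~92 W.
MAINS_VOLTS = 120.0


def watts(amps: float, volts: float = MAINS_VOLTS) -> float:
    """Approximate power draw from measured current (power factor ignored)."""
    return amps * volts


def circuit_capacity_watts(breaker_amps: float, volts: float = MAINS_VOLTS) -> float:
    """Nominal maximum load a breaker of the given rating can carry."""
    return breaker_amps * volts


def fits_on_circuit(loads_watts, breaker_amps: float, volts: float = MAINS_VOLTS) -> bool:
    """True if the combined loads stay at or under the breaker's nominal capacity."""
    return sum(loads_watts) <= circuit_capacity_watts(breaker_amps, volts)


if __name__ == "__main__":
    # The measured single core machine from the post.
    print(f"Old single core PC: {watts(0.77):.1f} W")
    # A 15 amp circuit, as mentioned in the post.
    print(f"15 A circuit: {circuit_capacity_watts(15):.0f} W")
    # Two typical 1000 W heaters on one 15 A circuit.
    print("Two 1000 W heaters fit:", fits_on_circuit([1000, 1000], 15))
```

Swap in your own measured amps and local mains voltage to see how far a given machine really is from its PSU rating.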


                          #13
                          Re: Hardware failing - where to look first?

                          Originally posted by Ole Juul
                          However, salesmen who don't know ohms law will sometimes inadvertently (or perhaps deliberately) rewrite the rules of physics.
                          Funny. Really, that's funny. I took basic electronics in high school.
                          Windows no longer obstructs my view.
                          Using Kubuntu Linux since March 23, 2007.
                          "It is a capital mistake to theorize before one has data." - Sherlock Holmes


                            #14
                            Re: Hardware failing - where to look first?

                            Of course what I'm really bumping up against here is the efficiency of the PS and the additional factor of heat further reducing the output and causing the real problem.

                            I doubt I'm actually drawing the 738 watts the NewEgg watt calculator says I need, but it is clear I am drawing enough wattage to heat the power supply to the point where it shuts itself off to avoid meltdown (think Chernobyl!).
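
A small Python sketch of the two numbers in play here: how far the calculator's 738 W estimate sits above a 700 W rating, and how much heat a supply has to shed at a given efficiency. The 400 W load and 80% efficiency in the example are made-up illustration values, not measurements from this system, and the function names are hypothetical.

```python
def shortfall_percent(required_watts: float, rated_watts: float) -> float:
    """How far the estimated requirement exceeds the PSU rating, in percent."""
    return (required_watts - rated_watts) / rated_watts * 100.0


def wall_draw_watts(dc_load_watts: float, efficiency: float) -> float:
    """Power pulled from the wall to deliver a given DC load."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the range (0, 1]")
    return dc_load_watts / efficiency


def waste_heat_watts(dc_load_watts: float, efficiency: float) -> float:
    """Power the PSU turns into heat instead of delivering to the components."""
    return wall_draw_watts(dc_load_watts, efficiency) - dc_load_watts


if __name__ == "__main__":
    # Calculator estimate from the thread versus the 700 W rating.
    print(f"Shortfall: {shortfall_percent(738, 700):.1f}%")
    # Hypothetical example: 400 W DC load at an assumed 80% efficiency.
    print(f"Wall draw: {wall_draw_watts(400, 0.80):.0f} W")
    print(f"Heat inside the PSU: {waste_heat_watts(400, 0.80):.0f} W")
```

The point of the second pair of functions is that a less efficient supply dumps more heat into its own case for the same load, which fits the observation that a faster fan restored stability.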

                            My choices are: crank the cooling fan to a level that makes my office sound as though it's being constantly vacuumed, water cool the power supply so it doesn't overheat, or buy a higher efficiency and higher output PS.

                            I chose the third, will attempt the second (just to say I did it 8) ), and the first is not an acceptable option (although that's where I am until the new PS arrives) :P

                            However, I will admit the following: my window IS open all year. It's hotter than heck in here on any sunny day. I deliberately ran two dedicated 20 amp circuits to my office when I moved in here because I knew I'd have most of the electronics in here!

                            I think next time I will run one 220v/15a or 20a for the computers. More efficient than 110, at least for the Power Supplies. 15 amp would likely be plenty. You electricians have an opinion on this?
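
One part of the 220 V question is plain arithmetic: for the same wattage, doubling the voltage halves the current on the circuit. A minimal Python sketch, using the 738 W calculator estimate from earlier in the thread as the load; the function name is made up, and this says nothing about how much more efficient any particular power supply is at the higher voltage.

```python
def current_amps(power_watts: float, volts: float) -> float:
    """Current drawn by a load of the given power at the given supply voltage."""
    return power_watts / volts


if __name__ == "__main__":
    load = 738.0  # calculator estimate quoted in the thread, in watts
    for volts in (110.0, 220.0):
        print(f"{load:.0f} W at {volts:.0f} V: {current_amps(load, volts):.2f} A")
```

Either way the load is well under 15 amps, so the lower current mainly buys headroom for everything else on the circuit.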


                            Please Read Me


                              #15
                              Re: Hardware failing - where to look first?

                              I've become convinced over the years, based on painful experience, to just overkill the PSU when spec'ing a new system. Life is too short to spend time replacing and RMAing PSUs.

                              Here's the one I bought for the i7-950 system I built in November:

                              http://www.newegg.com/Product/Produc...82E16817139011

                              No regrets -- it's humming along just fine.

                              I think you might be on to something with your 220V line to the office -- I never thought of it. Are the AC frequency and phase also correct for computer PSUs?
