Powers Off Mid Game/Under Load. Possible Dead GPU? (AW 17 R1, GTX780m)

Discussion in 'Alienware 17 and M17x' started by SteveMonk, Apr 4, 2017.

  1. SteveMonk

    SteveMonk Notebook Consultant

    Reputations:
    56
    Messages:
    109
    Likes Received:
    14
    Trophy Points:
    31
    ***FIXED - Replacement 780m GPU fitted, no more crashes***


    Hey guys, long time since I've posted here.

    I have an Alienware 17 R1 (4700MQ, GTX 780M, 16GB RAM, 750GB WD Black + 64GB Plextor M5M mSATA cache, 1080p FHD anti-glare, 240W power pack, latest Nvidia drivers and A14 BIOS) that has all of a sudden started to power off when running something graphics intensive. It started a couple of days ago whilst playing PlayerUnknown's Battlegrounds (great game btw!). I played a few rounds one after the other with no problem, then about 10 minutes into another round the machine turned off completely: no reboot, nothing. I tried powering back on and it died a couple of seconds later. I tried again and it fired up and booted into Windows fine. It didn't seem to have any problems, but I ran a couple of stress tests on the CPU, RAM and finally GPU using Dell SupportAssist. The CPU tests were fine but the machine powered off again about 30 seconds into the GPU stress test. It seems that whenever the GPU is under load now, it dies. Temps were my first thought but all are fine (65-75°C during GPU stress tests and 75-80°C in game). It will still launch games but dies after a couple of minutes, once it hits max load I'm guessing.

    Both CPU and GPU are stock speeds and have never been overclocked.

    Here's what I've tried so far:
    • Removed Nvidia drivers using DDU in safe mode and reinstalled latest driver
    • Removed driver again as above and installed an older driver version
    • Ran stress tests including Heaven Benchmark & Dell SupportAssist
    • Ran BIOS hardware diagnostic (all came back clear)
    • Repasted GPU & CPU
    • Removed RAM and tested (all 4 sticks)

    Things I've not tried:
    • New power pack
    • Windows rebuild (gonna try this later when I get home)
    • Reseating GPU

    It all points to GPU failure but I'm hoping one of you can give me some ideas/solutions I can try before I go cry a little in the corner on my own.

    Thanks in advance guys and apologies for the long post!

    TLDR - Laptop dies when GPU is under load. Machine runs fine otherwise. Bunch of stuff tested but still have the same problem. Pretty sure it's not temps.
     
    Last edited: Jun 1, 2017
  2. Zoltan@HIDevolution

    Zoltan@HIDevolution Company Representative

    Reputations:
    188
    Messages:
    199
    Likes Received:
    92
    Trophy Points:
    41
    I would recommend to test the laptop with a different power supply.

  3. MickyD1234

    MickyD1234 Notebook Prophet

    Reputations:
    3,155
    Messages:
    6,469
    Likes Received:
    1,160
    Trophy Points:
    331
    Hi, it does sound like temps but that usually causes it to beep a few times before shutdown. To make sure it's not temp, start with a cold machine. Run Heaven 4 ( https://unigine.com/en/products/benchmarks/heaven/ ) and leave it running watching the temp and see what it reaches before it shuts down.

    The PSU could well be triggering this, but in my experience when a PSU is overloaded it will not work again until you disconnect it from the mains to reset it.

    I wouldn't bother with a win re-install, it's some hardware issue I'm sure (but at least it makes you sure if you feel so inclined)...

    Good luck.
     
  4. SteveMonk

    SteveMonk Notebook Consultant

    Hi, thanks for the reply guys.

    I've ordered a genuine Dell PSU to eliminate that as a possibility and I'll try Heaven again once I'm home. Temps were not getting silly when running it last time but it's worth looking back into. In fact, thinking about it now, Heaven didn't crash the system at all. The SupportAssist stress test did (as well as some games: Division, PUBG) but Heaven ran without any crashes.
     
  5. MickyD1234

    MickyD1234 Notebook Prophet

    Sounds like the GPU could be on its way out. The 780m was the 680m with more cores opened up and a higher clock. I have seen a couple of failures here on NBR, and they are said to run hot (mid-high 70s at stock).

    Do you have to unplug the PSU after a shutdown?

    With Heaven running fine it could even be the CPU. Check that with HWiNFO and turn on logging (it can be tricky not to get too much info :eek:). That way you may see something in trouble before a shutdown.

    I take it the only item logged in Event Viewer is Unexpected Shutdown?

    The GPU test might actually be testing the on-board GPU, which can get quite hot when gaming even if the NV is active.
    If you want to go the whole hog, use MSI Afterburner, set up the on-screen display and select each item you want to see when gaming. If it works in a game that crashes, you might get a clue from what is happening as it crashes.

    Afterthought: remove the NV completely and run one of the games that trigger the shutdown. Hopefully it will run on the Intel, letting you know the on-board is fine.
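
A rough sketch of how a sensor log could be checked after one of these shutdowns, assuming the logging tool writes a plain CSV file. The column names and sample values below are invented for illustration and will not match a real log; the idea is only to pull out the last few samples before the log stops, tolerating a final line that was cut off when the power dropped.

```python
import csv
import io


def tail_of_log(csv_text, columns, last_n=5):
    """Return the last `last_n` samples of a CSV sensor log.

    Only the requested `columns` are kept. Values that are missing or not
    numeric (for example in a line truncated by the power cut) become None.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        picked = {}
        for name in columns:
            try:
                picked[name] = float(row[name])
            except (KeyError, TypeError, ValueError):
                picked[name] = None
        rows.append(picked)
    return rows[-last_n:]


def peak(rows, column):
    """Highest recorded value for `column`, ignoring missing samples."""
    values = [r[column] for r in rows if r[column] is not None]
    return max(values) if values else None


# Hypothetical log excerpt: header names and numbers are made up.
SAMPLE = (
    "Time,GPU Temp,GPU Clock,Mem Clock\n"
    "20:01:00,62,771,2500\n"
    "20:01:02,68,797,2500\n"
    "20:01:04,71,797\n"  # last line cut short, as after a hard power-off
)

if __name__ == "__main__":
    last = tail_of_log(SAMPLE, ["GPU Temp", "Mem Clock"], last_n=2)
    for sample in last:
        print(sample)
    print("peak GPU temp in tail:", peak(last, "GPU Temp"))
```

With a real log you would read the file contents into `csv_text` and pass the column headers exactly as they appear in that file.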
     
  6. SteveMonk

    SteveMonk Notebook Consultant

    I've not had to unplug the PSU to get it to fire back up again, and yeah, the event log only shows it as an unexpected shutdown unfortunately.

    The GPU used to run hot, mid 80s to 90, but was always stable. A repaste lowered that significantly though.

    I ran a test on the onboard (Intel) GPU last night and it worked fine with no crashes, so I think the CPU is OK, but I'll run a bunch of tests when I get home and see if I can eliminate things one by one. I'll try your suggestions and see what I can find out.
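
For anyone wanting to pull those unexpected-shutdown entries out of the System log without clicking through Event Viewer, here is a minimal sketch that shells out to the built-in `wevtutil` tool. It assumes a Windows machine and filters on event IDs 41 (Kernel-Power) and 6008 (EventLog), which are the usual entries after a hard power-off; the function names are just for illustration.

```python
import subprocess


def build_query(event_ids, count=5):
    """Build a wevtutil command listing the newest System events with these IDs."""
    id_filter = " or ".join(f"EventID={int(i)}" for i in event_ids)
    return [
        "wevtutil", "qe", "System",
        f"/q:*[System[({id_filter})]]",  # XPath filter on the event ID
        f"/c:{int(count)}",              # maximum number of events to return
        "/rd:true",                      # read newest events first
        "/f:text",                       # plain-text output
    ]


def recent_unexpected_shutdowns(count=5):
    """Run the query and return wevtutil's text output (Windows only)."""
    cmd = build_query([41, 6008], count)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `recent_unexpected_shutdowns()` returns the raw text, so the timestamps can be lined up against when the machine cut out. As noted above, these entries usually only confirm that power was lost, not why.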
     
  7. MickyD1234

    MickyD1234 Notebook Prophet

    Looks like you are on the ball with all this stuff :). It could well be that those temps shortened the life of the GPU. You should check whether a power state on the GPU triggers it, like a throttle that reduces the voltage. The OSD while in-game will hopefully show that before it crashes.

    Just FYI, I have not seen a single person get a failure in the on-board diags, even if the GPU is clearly bad. Assuming the machine passes the test with the NV removed, you can be fairly sure the NV has failed somewhere.
     
  8. SteveMonk

    SteveMonk Notebook Consultant

    Tried a few things last night. It seems the onboard is fine, no problems there as far as I can tell. I did notice there were later drivers available, which I installed (clean install), and the machine died midway through the driver install. I powered back on and the drivers installed fine second time round. Weird, as there was no draw on the GPU at the time. I also found it was actually crashing during Heaven as well, so I decided to try running it again off battery only. The machine auto-limits the GPU memory to 800MHz on battery and it ran perfectly fine, but once I plug the PSU back in and it kicks it up to 2500MHz, it cuts out and dies.

    I've found a way this morning to control the power state so I'm gonna try that later once I'm home from work.

    Tried to monitor as much info as I could but it's tough to see a change as the machine cuts out all of a sudden. It's becoming a giant pain in the backside!

    As a side note, I got annoyed last night and bought an 880m to replace it, but I'm still gonna keep working on trying to figure out/rectify the issue. At least throwing in the 880m will confirm whether or not it's the GPU that's the problem.
     
  9. MickyD1234

    MickyD1234 Notebook Prophet

    Good stuff. Just FYI but the GPU clock is also dropped on battery. I assume you're using NVInspector to set power states?

    Sure seems we are working with a dodgy NV GPU :eek:

    Good luck...
     
  10. EepoSaurus

    EepoSaurus Notebook Evangelist

    Reputations:
    240
    Messages:
    428
    Likes Received:
    301
    Trophy Points:
    76
    Just a few questions. What are the clocks for your card in game? What you are describing doesn't sound like a bad GPU, it sounds like a bad overclock. Is there a chance that a program you are running has overclocked or severely undervolted your card? The GPU temps you are experiencing are well under the safe limit for your card. In fact Nvidia makes your laptop shut down around 94°C, and even at those temps there is no permanent damage.

    I have had several cards go bad in R1s that I have owned and I have never seen one go partially bad. They almost always die completely, with a black screen and no signal from the display. It can happen, but a telltale sign is screen artifacting and hanging in game or during normal use. Since it is only causing trouble under load, it's far more likely to be an overclock setting, a power draw issue or a temperature issue. Considering the card has been repasted and you monitor it, I would say it is unlikely to be heat related, but an OSD would be best to diagnose. Also make sure to run with Optimus disabled to confirm your card is the problem.
     