M17xR4 issue with GPU 675M / 680M - Help and Advice Welcome

Discussion in 'Alienware 17 and M17x' started by spacetauren, Sep 26, 2015.

Thread Status:
Not open for further replies.
  1. spacetauren

    spacetauren Notebook Enthusiast

    Reputations:
    2
    Messages:
    37
    Likes Received:
    9
    Trophy Points:
    16
    Sticky Summary (edited October 12th 2015):

    This thread ended with some good conclusions, even though other fixes had to be attempted (see this other thread). Thanks to everyone who answered questions along the way. I would be very grateful if many of you could go to the end of this thread (click the Conclusion link below if you have no time to read ;)) and answer the two questions I left open.

    The core sections of this thread are:

    Introduction: When your baby got sick (context of GPU issues) - see below
    Chapter 1 : Preparing the patient (= various teardown steps)
    Chapter 2 : Walking Dead (= reanimating without mobo fixing)
    Chapter 3 : Brand new backbone (= mobo change and its results)
    Conclusion

    End of sticky section
    ------------------------------------------------------------------------------------------------


    Hello all,

    After weeks of fighting I am looking for advice or even fixes. I have already browsed extensively through the threads on related topics on this forum and a bunch of others, so it's time for questions, and also time to share all this in case it helps somebody in the future.

    My computer:
    2.5-year-old Alienware M17xR4 - BIOS A12 - Intel Core i7 - GTX 675M (120Hz - 3D LCD) - 24 GB RAM - 240 W PSU
    My OS / GPU drivers:
    Windows 7 64-bit - NVIDIA 344.11

    I started to experience BSODs and freezes at the beginning of August. After checking, I realized the GPU & CPU temperatures were quite hot (85 - 90°C). I cleaned and repasted both the CPU and the GPU. All went well for the next 4 weeks: temperatures from 50 to 75°C max and no more BSODs.

    However, early this month the issues came back, and a lot more frequently. After checking all the usual culprits (RAM, drivers, Windows, battery, PSU) I focused on the 675M GPU.
    I tried driver changes and all the other temperature and cooling tricks, but they failed. So I decided to swap the 675M GPU for a 680M.

    I received the card 3 days ago and did the upgrade at lightspeed without any problem. I started the computer and it just went OK (130 fps on the OCCT Power Test for a mere 20 minutes). So I went to bed with a smile.

    24 hours later I started the computer to show a friend the new beast, and then the OCCT test just broke after 4 minutes with a "pink" freeze of the screen!

    After that I tried to reboot but no way: it always hung during the Windows 7 loading with a black screen and that's all. Changing drivers and so on didn't fix anything. Of course I was able to run W7 in Safe Mode, but nothing else.

    Yesterday evening, after checking the card and repasting it again, I tried to boot but forgot to plug in the power, and this time Windows booted. I then ran OCCT (I know the GPU is weak on battery, but I am quite desperate now) and reached a crippled 25 fps, but with no freeze after one hour of testing. Then I plugged in the power and immediately got a freeze when trying to open FF. I tested around this a little and found that if I launch Windows on mains power it fails, and if I plug in after starting on battery it also freezes shortly afterwards. For example, plugging in the external monitor immediately freezes the computer on a screen of various colours, but that death comes without any BSOD being registered and I have to shut down manually. Because I have two 240 W PSUs I tried both and the issue is the same, so it's not the PSU. I also moved the plug around on the computer side but did not find any short or loose contact that way.

    I then put my 675M (kept as a souvenir) back in and reached much the same conclusion: on battery it works, but on mains power the computer crashes, perhaps after a slightly longer time with the 675M. With that GPU, however, a BSOD occurs and is registered by W7. I reproduced this behaviour many times today and even found that the issues come sooner (with the 675M) when the computer is hotter (but far below critical limits).
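    To make the pattern easier to see, the four GPU / power combinations above can be tabulated in a small Python sketch. The dictionary only restates the results described in this post; the function name `failure_rate_by` and the labels "ok", "battery" and "mains" are made up for illustration, and four runs are of course far too few to prove anything.

```python
# Results reported in this post, keyed by (GPU, power source).
# "ok" means no crash during the test; any other string is a failure mode.
observations = {
    ("680M", "battery"): "ok",
    ("680M", "mains"): "freeze, no BSOD logged",
    ("675M", "battery"): "ok",
    ("675M", "mains"): "BSOD logged",
}


def failure_rate_by(observations, axis):
    """Share of failed runs per value of one axis (0 = GPU, 1 = power source)."""
    totals, failures = {}, {}
    for key, result in observations.items():
        value = key[axis]
        totals[value] = totals.get(value, 0) + 1
        if result != "ok":
            failures[value] = failures.get(value, 0) + 1
    return {value: failures.get(value, 0) / totals[value] for value in totals}


by_gpu = failure_rate_by(observations, 0)
by_power = failure_rate_by(observations, 1)
print("failure rate by GPU:  ", by_gpu)
print("failure rate by power:", by_power)
```

    Both cards fail equally often, while every failure happens on mains power and none on battery. That is why the power source (and whatever it feeds on the board) looks more suspicious than either card on its own.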

    I also tested everything with ePSA just in case, but all is detected OK (sic!)

    A long introduction, but useful for understanding my questions:

    1 - Do you have an explanation for, or experience with, this issue? And maybe one last trick I can try?
    2 - Should I change the mobo (more $ on top of the 680M) or buy a new computer?

    Thank you for the feedback folks
    Spacetauren
     
    Last edited: Oct 12, 2015
  2. spacetauren

    Bump! I can't believe none of you has even one idea about a root cause.

    Anyway, I bought a genuine refurbished mobo and I will go for the full monty. The first step is to receive it from the eBay seller...
    After that I expect I will need some guidance on the next moves to reach a happy ending.

    First request: Can somebody point me to some useful docs, videos or whatever else could help me? I've seen a YouTube teardown but I'm not sure it is bulletproof.
    Second request: I have the 120Hz 3D model and I would like to know what is different about plugging the LCD into the mobo. Can somebody explain?

    I think some forum deities like J95 have done almost everything that can be done in tuning their M17xR4, and I am still hoping they can help. Is there a way to contact one of them?

    As a contribution I will post in this thread a full report on the next steps of my attempt to raise the computer from the dead. Followers of and contributors to that big show will be welcome.

    Back soon!
     
  3. Raidriar

    Raidriar ლ(ಠ益ಠლ)

    Reputations:
    1,580
    Messages:
    5,623
    Likes Received:
    4,060
    Trophy Points:
    431
    Ok well, it could be a couple things:
    1. Your motherboard has a bad MXM slot that is screwing with your GPUs
    2. Your 675M and your 680M are both bad.

    I would have said test with the Intel HD 4000 to see if you get similar results, but you have the 120Hz panel so that is not possible to do. As for your requests:
    1.
    2. I'm not sure what you mean by this. There are 3 panels for the M17x R4. The 1600x900 60Hz and 1920x1080 60Hz panel use the LVDS connector, and the 1920x1080 120Hz panel uses the eDP connector. They look completely different and there is no way you can confuse the two when you replace the system board.
     
    spacetauren likes this.
  4. spacetauren

    Hi and thanks Raidriar.
    1. The MXM slot may be the faulty part, but the problem only appears when the GPUs run on the PSU and not on battery. What does this imply? Could it be linked to some VRM ICs that can no longer handle the power coming from the PSU but are OK on battery?
    2. This may be true (both GPUs bad :( ), which is a little sad for the brand-new 680M that I used for only 20 minutes with no issue and that failed the day after.

    The only way to test further is to change the mobo, I'm afraid, or to find another laptop where I can mount these GPUs and test (lucky me!).
    Let's say I will first take a very careful look at the ICs on my GPUs and mobo this weekend, and I will report back if I find something strange.

    About the answers to my requests: I already had this YouTube link but thank you anyway. I am trying to find more detailed docs but perhaps they are not available on the web.
    I greatly appreciate your info on the 120Hz connector. If I cannot confuse them then the teardown will be easier.
     
  5. Raidriar

    I know that on battery, the GPU will scale back to only 2D clocks and low voltage, impossible to use 3D clocks and 3D voltage, so it may not be straining the card so hard or stressing the board. Not likely to be VRM issue, but either something wrong with the internal GPU die, bad solder joint, or a dead IC or capacitor either on the MXM card or on the motherboard. It would really suck if the 680M is dead, they don't seem to die too often. 675M is a known garbage card that was a rebadge of the GTX 580M, which was notoriously unreliable.

    As you said, the only way to know is with a motherboard swap. I believe this is the eDP socket on the board:
    $_57.JPG
     
  6. spacetauren

    Happy to read that we share the same feeling, Raidriar. I am really crossing my fingers that the 680M didn't die.
    What I find quite hard to understand is why the 680M was able to do a good job during a stress test the first time. The idea of a dying capacitor could indeed fit the behaviour I experienced, but with ICs it can unfortunately be trickier.

    Thanks for the picture of the eDP socket. I will check, but it seems clear.

    I will post news about my next moves. I will first copy / record all my configuration and then go for a minimal system installation on a spare SSD I have (I will store the HDD & SSD currently mounted so that I can swap them back at the end if I fix the mess). I also have a spare 970M I can use as a very last option (but I guess I would then need to change OS to 8.1 or 10).
     
  7. Raidriar

    Good luck!
     
    spacetauren likes this.
  8. danyune

    danyune Notebook Evangelist

    Reputations:
    89
    Messages:
    673
    Likes Received:
    156
    Trophy Points:
    56
    I once had it boot to a black screen, I know this is going to sound weird, but hit your Fn+F6 key, then hit enter, it'll select the monitor as your regular monitor.

    I don't know why it happens, but it happens only with legacy boot and W7.

    You can test your GPU in any laptop that accepts Dell GPUs really. That is a way to guarantee it's the motherboard, but the odds of getting TWO bad GPUs is pretty low, so I will assume it's the motherboard.
     
  9. spacetauren

    Thanks danyune. Nothing is too weird for me ;) these days. I tried the Fn+F6 key after reading your post but it doesn't fix my issue. I will keep it in mind for future adventures with laptops, though.

    OK, I will now start reporting on my raise-dead trial.

    CHAPTER 1 : Preparing the patient

    Before replacing the mobo (still not arrived) I will first remove my SSD and HDD so that I can kill 3 birds with one stone:
    1 - Be sure my two and a half years of data and my system are safe during the next teardown steps
    2 - Have a fresh OS & driver install, which will (I hope) avoid the mess of too many programs and drivers when the new mobo is mounted
    3 - No need to do a rather difficult and hazardous backup / mirror of my system on my unstable computer

    The plan is good, but then I figured out that my SSD is in fact an mSATA one. Looking on the web made me realize that this small crap is kept in an M17xR4 like gold in Fort Knox :eek:. I will need to do almost a full teardown to remove it. OK friends, let's consider it training for the mobo teardown I will have to do anyway.

    There we go! So the very first thing you don't feel so good about seeing on a laptop is its bottom :confused:, as a lot of you know well, I am sure.
    [IMG]

    I will not write an extra-detailed script for this teardown because there are already a lot of sources on the forum for that. But let's have a look at the most accessible components.
    [​IMG]

    After that I have to remove the keyboard and the LCD panel assembly, which means a good while spent disconnecting a bunch of tiny connectors of every kind (quite a creepy time, I swear). The pic below gives a little feeling of how stressful it can be.
    [​IMG]

    OK, after all of this is done you can remove the under-keyboard layer:
    [​IMG]

    Then you can have a look at the very inside of the M17xR4 (pic below) and realize that this backbone is not as sexy as the computer you had in front of you some minutes ago (and also a lot lighter :D).

    [​IMG]

    And finally here is the holy crap :mad: mSATA SSD (pic below). It is hard to find but quite easy to unplug :).
    [​IMG]

    ....
    And now it is time for a full reassembly, yeahhhh :D. I will not spend too much time describing it, but let's say it went quite smoothly even if I had some butterfingers moments with the keyboard connectors. Mounting the Samsung EVO 500 GB SSD in the first SATA III slot (HD0) was a piece of cake.

    In the next post I will describe the first revival (hopefully) of the computer with its new OS :hi:.
     
    Last edited: Oct 4, 2015
  10. spacetauren

    ... and I will need it, thanks raidriar.

    Chapter 2 : Walking Dead (= reanimating without mobo fixing)

    So after removing the former mSATA SSD and the HDD and putting a new SSD on the SATA port, it's time to check whether I killed the computer during the disassembly / reassembly session.

    I had prepared a USB boot key with Windows 7. I turned on the computer and pressed F2 to enter the BIOS. It went fine at that stage and I checked that the system seemed OK. I plugged in my USB key and with a shaking finger I exited the BIOS setup. The M17xR4 immediately started a long series of beeps :eek: that I interrupted by pressing the alien head. I panicked a little: did I miss something in the reassembly?

    I had also prepared my genuine Win 64 DVD; always being ready with a plan B is a lesson IT taught me a long time ago. I decided to remove the USB key and try booting from the DVD. I turned it on for the second time and … yeeesss !! :cool: ... it started to read the disc and install W7.

    I then spent some hours updating W7 and installing the basic genuine set of drivers I need, but without installing the one for the NVIDIA GPU. During that time (almost 10 hours) I got a strange black screen three times, very much like sleep mode but with no way to recover from it (here I tried the Fn+F6 key as per danyune's post, but with no success). However, the computer was still working in the background, because when I powered off and restarted with the alien button Windows restarted normally, with a resume stage in two of the three cases. I have no clear explanation, but it could be a mobo issue again. It looks like the signal to turn off the screen suddenly reaches the system, which then has no clue how to come back to life. No BSOD those times.

    At the very end I finally reinstalled the original GPU driver that came with the computer 2.5 years ago (rev. 307.17). I used the machine for a few minutes (even successfully changing the refresh rate to 120 Hz) before I got the very familiar BSOD with black screen and reboot :no: that I have experienced so often since this story started.

    So I am more and more sure that my mobo is not in good shape, but I am not yet sure that the two GPUs have not been damaged themselves :wacko:. Now I just need to wait for the mobo and pray it will be a fine and robust one.

    Any new input, advice and encouragement from people who have made the same trip (replacing the mobo themselves and so on) is more than welcome.
    Next chapter soon (I hope) :hi:
     