hard lockups... maybe once a day... ?



    I'm getting sporadic hard lockups -- no mouse, no responsiveness at all, can't ssh in, etc. Happens maybe once a day or so. Sometimes screens go black, sometimes they just freeze. No high fan usage or anything, just a freeze. No option but to cut power.

    Any ideas? I'm using nvidia drivers (GeForce GTX 1060, 560.35.03) and three monitors. Graphics drivers were my first guess -- what would be the best first step to debug this? What's the easiest way to roll back driver versions?

    The journal shows nothing around the time of the freezes.

    #2
    Originally posted by chconnor
    No option but to cut power.
    Have you tried the Magic Key Sequence? Hold Alt+PrtSc (SysRq) and press R, E, I, S, U, B in turn.

    If that doesn't safely shut down and reboot your PC, then do a hard restart, as you are doing now. But always try the Magic Key Sequence before resorting to that.
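Whether each key in the sequence does anything depends on the `kernel.sysrq` setting, so it can be worth checking that before the next freeze. Below is a minimal sketch that decodes the bitmask in `/proc/sys/kernel/sysrq` using the bit values from the kernel's sysrq documentation. The function names `decode_sysrq` and `read_sysrq` are made up for this example, and the value 176 in the comment is an assumption about a common Ubuntu-family default, not something stated in this thread.

```python
from pathlib import Path

# Bit values documented for the kernel.sysrq sysctl (0 = off, 1 = everything on).
SYSRQ_BITS = {
    2: "console log level control",
    4: "keyboard control (SAK, unraw): R",
    8: "debugging dumps of processes",
    16: "sync: S",
    32: "remount read-only: U",
    64: "signalling of processes (term, kill, oom-kill): E, I",
    128: "reboot/poweroff: B",
    256: "nicing of all RT tasks",
}


def decode_sysrq(value: int) -> list[str]:
    """Return the SysRq function groups enabled by a kernel.sysrq value."""
    if value == 0:
        return []
    if value == 1:
        return list(SYSRQ_BITS.values())
    return [name for bit, name in SYSRQ_BITS.items() if value & bit]


def read_sysrq(path: str = "/proc/sys/kernel/sysrq") -> int:
    """Read the current kernel.sysrq value (Linux only)."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    # Assumption: 176 (16 + 32 + 128) is a common default on Ubuntu-family
    # installs, which would leave only S, U and B active.
    if Path("/proc/sys/kernel/sysrq").exists():
        for group in decode_sysrq(read_sysrq()):
            print("enabled:", group)
```

If R, E or I turn out to be disabled, only the sync, remount and reboot steps will do anything. Even with everything enabled, a true hard lockup can still ignore the keyboard entirely.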
    Windows no longer obstructs my view.
    Using Kubuntu Linux since March 23, 2007.
    "It is a capital mistake to theorize before one has data." - Sherlock Holmes



      #3
      Oh nice, for some reason I didn't think REISUB was still around. I'll try it next time.

      I have downgraded to v535 (using software sources "additional drivers" tab) and will see how that goes...



        #4
        Argh, still happening on v535. REISUB didn't work (it does work when the system isn't locked up).

        However this time in the journal there is a clue (I presume):

        Code:
        Nov 19 10:41:26 mycomp kernel: BUG: unable to handle page fault for address: ffffe9f8ff000008
        Nov 19 10:41:26 mycomp kernel: #PF: supervisor read access in kernel mode
        Nov 19 10:41:26 mycomp kernel: #PF: error_code(0x0000) - not-present page
        ...some kind of kernel bug, maybe? Any tips on how to proceed?
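For anyone reading along, the three `#PF` lines all describe the same event: the `error_code` value is a bitfield, and the kernel prints the decoded meaning next to it. Here is a small sketch of that decoding for the low bits of the x86 page-fault error code. The function name `decode_pf_error_code` is hypothetical, and only a subset of the architectural bits is covered.

```python
def decode_pf_error_code(code: int) -> dict[str, object]:
    """Decode the low bits of an x86 page-fault error code.

    Bit 0: 0 = page not present, 1 = protection violation
    Bit 1: 0 = read access,      1 = write access
    Bit 2: 0 = supervisor mode,  1 = user mode
    Bit 4: 1 = fault happened on an instruction fetch
    """
    return {
        "cause": "protection violation" if code & 0x1 else "not-present page",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "supervisor",
        "instruction_fetch": bool(code & 0x10),
    }


if __name__ == "__main__":
    # The value from the journal excerpt in this post.
    print(decode_pf_error_code(0x0000))
```

So `0x0000` here matches the other two lines: kernel-mode code tried to read an address that had no page mapped. That tells you what kind of fault it was, but not which driver caused it.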

        I was thinking to start unplugging USB devices to see if a driver was causing it?

        /var/log/kern.log doesn't have any smoking guns... there are a few stack traces from the nvidia 535 driver re: "UBSAN: array-index-out-of-bounds in build/nvidia-srv/535.216.01/build/nvidia-uvm/uvm_pmm_gpu.c:857:39"... but they don't seem to be crashing the machine. Not sure what to make of that. Guess I'll go back to the latest nvidia drivers and keep an eye on kern.log.
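One way to keep an eye on kern.log without rereading the whole thing after each freeze is to filter it for lines that usually go with kernel trouble. The sketch below is only that: the marker list is an assumption (a handful of strings the kernel and the NVIDIA module are known to print) and is certainly not exhaustive, and `find_suspect_lines` is a made-up helper name.

```python
import re

# Assumed, non-exhaustive markers of kernel-level trouble.
SUSPECT = re.compile(
    r"BUG:|Oops:|UBSAN:|#PF:|general protection fault|"
    r"soft lockup|hard LOCKUP|NVRM: Xid"
)


def find_suspect_lines(lines):
    """Return the log lines (trailing newline removed) that contain a marker."""
    return [line.rstrip("\n") for line in lines if SUSPECT.search(line)]


if __name__ == "__main__":
    # Sample input: the journal excerpt quoted earlier in this thread.
    sample = [
        "Nov 19 10:41:26 mycomp kernel: BUG: unable to handle page fault for address: ffffe9f8ff000008",
        "Nov 19 10:41:26 mycomp kernel: #PF: supervisor read access in kernel mode",
        "Nov 19 10:41:26 mycomp kernel: #PF: error_code(0x0000) - not-present page",
    ]
    for hit in find_suspect_lines(sample):
        print(hit)
```

The same function works on an open file such as `/var/log/kern.log`, or on the output of `journalctl -k -b -1` (kernel messages from the previous boot) after a forced restart.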

