• 1 Post
  • 17 Comments
Joined 3 years ago
Cake day: June 22nd, 2023



  • Checked the logs in Windows, you’re right! It shows “A corrected hardware error has occurred”, reported by the PCI Express Root Port. I reseated the drive with no change.

    I should mention that this laptop has always had issues with what I assumed to be thermal throttling. It would play games fine for 10-15 minutes before becoming a slideshow. I eventually stopped trying.

    I have set that option and I am currently downloading a GPU benchmark. Is that an appropriate test? What should I be looking for in dmesg?


  • Installed Linux Mint on my old personal laptop (Dell XPS 9560) and unfortunately ran into some issues that made me switch back to Windows. I really want to make it work.

    It seems to have revealed either a hardware bug or failing hardware in the NVMe drive.

    First problem was log spam that filled up the partition:

    spoiler
    2025-12-29T12:15:46.439880-05:00 redacted kernel: pcieport 0000:00:1d.0: AER: Correctable error message received from 0000:04:00.0
    2025-12-29T12:15:46.439934-05:00 redacted kernel: nvme 0000:04:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
    2025-12-29T12:15:46.439936-05:00 redacted kernel: nvme 0000:04:00.0:   device [126f:2262] error status/mask=00000001/0000e000
    2025-12-29T12:15:46.439938-05:00 redacted kernel: nvme 0000:04:00.0:    [ 0] RxErr                  (First)
    2025-12-29T12:15:46.439939-05:00 redacted kernel: pcieport 0000:00:1d.0: AER: Multiple Correctable error message received from 0000:04:00.0
    2025-12-29T12:15:46.439940-05:00 redacted kernel: pcieport 0000:00:1d.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Transmitter ID)
    2025-12-29T12:15:46.439941-05:00 redacted kernel: pcieport 0000:00:1d.0:   device [8086:a118] error status/mask=00001000/00000000
    2025-12-29T12:15:46.439943-05:00 redacted kernel: pcieport 0000:00:1d.0:    [12] Timeout               
    2025-12-29T12:15:46.439944-05:00 redacted kernel: nvme 0000:04:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
    2025-12-29T12:15:46.439945-05:00 redacted kernel: nvme 0000:04:00.0:   device [126f:2262] error status/mask=00000001/0000e000
    2025-12-29T12:15:46.439946-05:00 redacted kernel: nvme 0000:04:00.0:    [ 0] RxErr                  (First)
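
For anyone reading logs like the ones above, here is a rough Python sketch (not an official tool) that tallies the "PCIe Bus Error" lines per device and decodes the correctable `error status` value into the short names the kernel prints, such as `RxErr` and `Timeout`. The function names `tally_aer` and `decode_status` are made up for illustration, and the bit table assumes the standard AER Correctable Error Status layout.

```python
import re
from collections import Counter

# Matches e.g. "nvme 0000:04:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, ..."
BUS_ERROR = re.compile(
    r"(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"PCIe Bus Error: severity=(?P<severity>[^,]+), type=(?P<layer>[^,]+),"
)

# Assumed bit positions of the AER Correctable Error Status register,
# labelled with the short names seen in kernel logs.
CORRECTABLE_BITS = {
    0: "RxErr",
    6: "BadTLP",
    7: "BadDLLP",
    8: "Rollover",
    12: "Timeout",
    13: "NonFatalErr",
    14: "CorrIntErr",
    15: "HeaderOF",
}


def tally_aer(lines):
    """Count bus-error lines per (device address, severity, layer)."""
    counts = Counter()
    for line in lines:
        m = BUS_ERROR.search(line)
        if m:
            counts[(m["bdf"], m["severity"], m["layer"])] += 1
    return counts


def decode_status(value):
    """Return the names of the known bits set in a correctable status/mask value."""
    return [name for bit, name in sorted(CORRECTABLE_BITS.items()) if value & (1 << bit)]


# A few lines copied from the log in this post.
SAMPLE = [
    "2025-12-29T12:15:46.439880-05:00 redacted kernel: pcieport 0000:00:1d.0: AER: Correctable error message received from 0000:04:00.0",
    "2025-12-29T12:15:46.439934-05:00 redacted kernel: nvme 0000:04:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)",
    "2025-12-29T12:15:46.439940-05:00 redacted kernel: pcieport 0000:00:1d.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Transmitter ID)",
    "2025-12-29T12:15:46.439944-05:00 redacted kernel: nvme 0000:04:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)",
]

if __name__ == "__main__":
    for key, count in tally_aer(SAMPLE).most_common():
        print(key, count)
    print(decode_status(0x00000001))  # status reported by the NVMe drive
    print(decode_status(0x00001000))  # status reported by the root port
```

A tally dominated by one device address suggests where the link errors originate; here both the drive (`0000:04:00.0`) and the root port (`0000:00:1d.0`) report errors on the same link.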
    

    Some forum posts I found (example) suggested that this was a hardware bug and that I could add pcie_aspm=off to the kernel command line in GRUB to work around it. This stopped the log spam and everything seemed to be working fine.

    Later, while I was doing some programming, everything froze for a while. When it came back, the partition had been remounted read-only. It wouldn’t boot on restart and dropped to a BusyBox shell instead. I was able to set it back to writable, but it happened again soon after.

    I decided to switch back to Windows, where there don’t seem to be any issues.

    I really want to make it work. If it’s failing hardware then I have no choice but to replace the drive, but if it’s just a bug then I want to find a fix without buying new hardware. That would kind of defeat the point for me and I don’t want to spend the money.

    I would appreciate any help. I booted into Mint again to grab the logs and I really want to keep using it.