KDE - Processor is always at full load


    #16
    OK, the SMART data is not as good as I thought, but it does not look like the worst situation ever, either.

    The entry labeled "Current Pending Sector Count" (ID 197) is marked red, and its value is 3. So I have 3 bad sectors and, if I'm not mistaken, this kind of error can be fixed. Anyway, I'm waiting for advice from someone more expert than me. If there is a way to solve the problem, please let me know.

    Below is a SMART summary. It was collected from within Windows with a program named Defraggler, but the data is, of course, the same as shown in Disk Utility.

    Code:
    1	Read Error Rate	0	198	198	51	0x00000000149F
    3	Spin-Up Time	1425 ms	191	187	21	0x000000000591
    4	Start/Stop Count	2.856	98	98	0	0x000000000B28
    5	Reallocated Sectors Count	0	200	200	140	0x000000000000
    7	Seek Error Rate	0	100	253	51	0x000000000000
    9	Power-On Hours (POH)	200d 7h	94	94	0	0x0000000012C7
    10	Spin Retry Count	0	100	100	51	0x000000000000
    11	Recalibration Retries	0	100	100	51	0x000000000000
    12	Power Cycle Count	2.682	98	98	0	0x000000000A7A
    192	Power-off Retract Count	314	200	200	0	0x00000000013A
    193	Load Cycle Count	62.342	180	180	0	0x00000000F386
    194	Temperature	46 °C	101	89	0	0x00000000002E
    196	Reallocation Event Count	0	200	200	0	0x000000000000
    197	Current Pending Sector Count	3	200	200	0	0x000000000003
    198	Uncorrectable Sector Count	0	100	253	0	0x000000000000
    199	UltraDMA CRC Error Count	0	200	200	0	0x000000000000
    200	Multi-Zone Error Rate	0	100	253	51	0x000000000000
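A note on reading the table above: the last column appears to be the raw attribute value in hexadecimal, and the third column its decoded form. The thread does not give column headings, so this reading is an assumption. The sketch below decodes a few of the raw values so they can be compared with the third column. The helper names `raw_to_dec` and `hours_to_days` are made up for this illustration.

```shell
#!/bin/sh
# Convert a SMART raw value written as hex (e.g. 0x0000000012C7) to decimal.
raw_to_dec() {
    printf '%d\n' "$1"
}

# Express a count of hours as whole days plus remaining hours.
hours_to_days() {
    echo "$(( $1 / 24 ))d $(( $1 % 24 ))h"
}

# Raw values copied from the table above (attributes 9, 193 and 197).
poh=$(raw_to_dec 0x0000000012C7)
echo "Power-on hours:  $poh ($(hours_to_days "$poh"))"
echo "Load cycles:     $(raw_to_dec 0x00000000F386)"
echo "Pending sectors: $(raw_to_dec 0x000000000003)"
```

Under that assumption, attribute 9 decodes to 4807 hours, which is the "200d 7h" shown in the table, and attribute 197 decodes to the 3 pending sectors mentioned above.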

    Now, I must also say that my original problem of KDE running sluggishly does not seem closely related to this particular hardware issue (but I could be wrong; I'm just speculating here):

    My HD is partitioned as follows:
    sda1 - A partition used by the Windows Vista Recovery, never used by me
    sda2 - Windows XP installation
    sda3 - NTFS archiving partition
    sda4 - extended partition, containing:
    sda5 - Linux Root
    sda6 - Linux Home
    sda7 - swap

    What looks very strange to me is that, judging by the sector numbers and by a report from GParted, the bad sectors are located exclusively on sda2 (Windows), which is not even mounted when Kubuntu starts. Kubuntu uses only sda5 and sda6, which I have now checked multiple times and which are surely clean and without any problem. And Windows, which runs on the damaged partition, does not have a single problem in this respect.
    This makes me think that, even if I am forced to clone my HD onto a brand new one, these issues won't go away. Again, I'm waiting for your opinions and advice.
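One way to double-check which partition a bad sector belongs to is to compare its LBA with the start and end sectors that `fdisk -l` reports. The sketch below is only an illustration: the partition boundaries are HYPOTHETICAL placeholders, not the real layout of this disk, and the function name `partition_for_lba` is made up. The LBA 26840547 is the "LBA_of_first_error" from the smartctl self-test log posted later in the thread.

```shell
#!/bin/sh
# Print the name of the partition whose sector range contains the given LBA.
# Reads lines of the form "name start_sector end_sector" from stdin.
partition_for_lba() {
    awk -v lba="$1" '
        lba + 0 >= $2 + 0 && lba + 0 <= $3 + 0 { print $1; found = 1; exit }
        END { if (!found) print "none" }
    '
}

# HYPOTHETICAL boundaries for illustration only; replace them with the
# Start/End columns from "sudo fdisk -l /dev/sda" on the real disk.
layout='sda1 2048 20973567
sda2 20973568 125831167
sda3 125831168 230688767'

printf '%s\n' "$layout" | partition_for_lba 26840547
```

With the real start and end sectors in place of the placeholders, the printed name shows whether the failing LBA really sits inside sda2.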

    Many thanks to all of you; your suggestions are all very valuable.



      #17
      "checked multiple times" - are you referring to fsck? That only checks the integrity of the data structures - it doesn't test anything to do with recoverable disk errors or slow reads due to retries.

      I agree that if all the bad sectors fall within sda2 it seems like it shouldn't affect Linux - but then again bad sectors that are already remapped shouldn't slow things down at all - the disk will skip them. It's the marginal sectors that may require multiple reads that could be affecting performance.

      Are there column headings for your SMART data output? If the third column is "VALUE" then it looks like you have 0 remapped sectors and 0 read error rate so no problem at all! But I get confused by SMART values.
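On the column question: the smartctl listing posted later in the thread shows VALUE 198, WORST 198 and THRESH 051 for attribute 1, which lines up with the fourth, fifth and sixth columns of the Defraggler table rather than the third. As far as the smartctl documentation describes it, an attribute counts as failed when its normalized VALUE is less than or equal to THRESH. A minimal sketch of that rule follows; the function name `smart_status` is made up for this illustration.

```shell
#!/bin/sh
# Classify a SMART attribute from its normalized VALUE ($1) and THRESH ($2).
# The attribute is treated as failed when VALUE <= THRESH.
smart_status() {
    if [ "$1" -le "$2" ]; then
        echo "FAILING"
    else
        echo "ok"
    fi
}

# VALUE and THRESH pairs copied from the posted table.
echo "1   Read Error Rate:        $(smart_status 198 51)"
echo "5   Reallocated Sectors:    $(smart_status 200 140)"
echo "197 Current Pending Sector: $(smart_status 200 0)"
```

Note that attribute 197 passes this check even though its raw count is 3, so a passing normalized value alone may not mean the disk is healthy; the raw pending-sector count still deserves attention.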

      Let's assume it's not the disk (but always keep (verified) backups in any case ... on more than one external device). A bad disk itself shouldn't be causing processor load all the time.

      I don't think we've done the basics for diagnosing high CPU load ... let's see which processes are causing the load.

      Post the output of
      Code:
      top -n 2 -d 10
      (it prints one screenful, waits 10 seconds, then prints a second)
      and
      Code:
      sudo iotop -b -n 2 -d 10 -o
      (install the package iotop if you don't already have it).
      I'd rather be locked out than locked in.



        #18
        A couple of suggestions.

        First, let's see a more thorough output of your S.M.A.R.T. stats. Please install the smartmontools package:
        Code:
        sudo apt-get install --no-install-recommends smartmontools
        Then, run the reporting tool, capture the output, and show us:
        Code:
        sudo smartctl -x /dev/sda
        Also, please show us the output of:
        Code:
        sudo fdisk -l /dev/sda



          #19
          When I said I had checked these filesystems multiple times, I meant with the tools provided by Disk Utility, GParted and other tools on the Parted Magic LiveCD.

          Here is the output you requested.

          From the "top" command:

          Code:
          top - 16:19:36 up 12 min,  2 users,  load average: 5.53, 5.59, 3.68
          Tasks: 135 total,   1 running, 134 sleeping,   0 stopped,   0 zombie
          Cpu(s):  1.7%us,  0.9%sy,  0.0%ni,  8.0%id, 89.3%wa,  0.0%hi,  0.0%si,  0.0%st
          Mem:   3096576k total,   736744k used,  2359832k free,    53956k buffers
          Swap:  1028124k total,        0k used,  1028124k free,   310664k cached

            PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
           1080 root      20   0  171m  85m  35m S    2  2.8   0:11.22 Xorg
           1681 stefano   20   0  184m  38m  27m S    2  1.3   0:03.74 kwin
           1697 stefano   20   0  2704 1176  816 S    2  0.0   0:00.20 ksysguardd
           1898 stefano   20   0  112m  23m  17m S    2  0.8   0:00.61 konsole
              1 root      20   0  3516 2008 1352 S    0  0.1   0:00.55 init
              2 root      20   0     0    0    0 S    0  0.0   0:00.00 kthreadd
              3 root      20   0     0    0    0 S    0  0.0   0:00.10 ksoftirqd/0
              6 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/0
              7 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/0
              8 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/1
             10 root      20   0     0    0    0 S    0  0.0   0:00.37 ksoftirqd/1
             11 root      20   0     0    0    0 S    0  0.0   0:00.28 kworker/0:1
             12 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/1
             13 root       0 -20     0    0    0 S    0  0.0   0:00.00 cpuset
          .....

          top - 16:19:46 up 12 min,  2 users,  load average: 5.31, 5.54, 3.69
          Tasks: 136 total,   2 running, 134 sleeping,   0 stopped,   0 zombie
          Cpu(s):  2.1%us,  1.3%sy,  0.0%ni,  0.0%id, 96.6%wa,  0.0%hi,  0.0%si,  0.0%st
          Mem:   3096576k total,   737620k used,  2358956k free,    53964k buffers
          Swap:  1028124k total,        0k used,  1028124k free,   310688k cached

            PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
           1080 root      20   0  171m  85m  35m S    5  2.8   0:11.72 Xorg
           1681 stefano   20   0  184m  38m  27m S    2  1.3   0:03.95 kwin
           1692 stefano   20   0  353m  96m  41m S    1  3.2   0:06.96 plasma-desktop
             42 root      20   0     0    0    0 S    0  0.0   0:00.04 scsi_eh_0
            288 root      20   0     0    0    0 D    0  0.0   0:00.01 jbd2/sda5-8
           1898 stefano   20   0  112m  23m  17m S    0  0.8   0:00.62 konsole
           2311 stefano   20   0  2836 1172  892 R    0  0.0   0:00.01 top
              1 root      20   0  3516 2008 1352 S    0  0.1   0:00.55 init
              2 root      20   0     0    0    0 S    0  0.0   0:00.00 kthreadd
          .....
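A note on reading the two `top` snapshots above: the `%wa` field on the `Cpu(s)` line is the share of time the CPU spent idle while waiting for I/O, so a high value there suggests the machine is stalled on the disk rather than busy computing. The sketch below pulls that field out of saved `top` output. It assumes the older `NN.N%wa` layout shown in this thread (newer versions of top print the field differently), and the function name `iowait_pct` is made up for this illustration.

```shell
#!/bin/sh
# Read top output on stdin and print the I/O-wait percentage from every
# "Cpu(s):" summary line. Assumes the "NN.N%wa" layout used in this thread.
iowait_pct() {
    sed -n '/Cpu(s):/s/.*[^0-9.]\([0-9][0-9.]*\)%wa.*/\1/p'
}

# The two summary lines posted above.
sample='Cpu(s):1.7%us,  0.9%sy,  0.0%ni,  8.0%id, 89.3%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu(s):  2.1%us,  1.3%sy,  0.0%ni,  0.0%id, 96.6%wa,  0.0%hi,  0.0%si,  0.0%st'

printf '%s\n' "$sample" | iowait_pct
```

For the two snapshots above this yields 89.3 and 96.6, while user and system time together stay under 4%, which would be consistent with processes waiting on the disk rather than a process consuming the CPU.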
          From "iotop":
          Code:
          Total DISK READ:       0.00 B/s | Total DISK WRITE:       0.00 B/s
            TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
          Total DISK READ:       0.00 B/s | Total DISK WRITE:       6.38 K/s
            TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
            863 be/3 root        0.00 B/s 1225.54 B/s  0.00 %  1.45 % [jbd2/sda6-8]
           2575 be/4 root        0.00 B/s  408.51 B/s  0.00 %  0.00 % python /usr/sbin/iotop -b -n 2 -d 10 -o
            933 be/4 syslog      0.00 B/s  817.03 B/s  0.00 %  0.00 % rsyslogd -c5
          From "smartctl":
          Code:
          smartctl 5.41 2011-06-09 r3365 [i686-linux-3.2.6-pmagic] (local build)
          Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
          
          === START OF INFORMATION SECTION ===
          Model Family:     Western Digital Scorpio Blue Serial ATA
          Device Model:     WDC WD2500BEVS-22UST0
          Serial Number:    WD-WXC108958075
          LU WWN Device Id: 5 0014ee 055d4347d
          Firmware Version: 01.01A01
          User Capacity:    250,059,350,016 bytes [250 GB]
          Sector Size:      512 bytes logical/physical
          Device is:        In smartctl database [for details use: -P show]
          ATA Version is:   8
          ATA Standard is:  Exact ATA specification draft version not indicated
          Local Time is:    Tue May 22 11:40:31 2012 UTC
          SMART support is: Available - device has SMART capability.
          SMART support is: Enabled
          
          === START OF READ SMART DATA SECTION ===
          SMART overall-health self-assessment test result: PASSED
          
          General SMART Values:
          Offline data collection status:  (0x00)    Offline data collection activity
                              was never started.
                              Auto Offline Data Collection: Disabled.
          Self-test execution status:      ( 121)    The previous self-test completed having
                              the read element of the test failed.
          Total time to complete Offline 
          data collection:         ( 9180) seconds.
          Offline data collection
          capabilities:              (0x7b) SMART execute Offline immediate.
                              Auto Offline data collection on/off support.
                              Suspend Offline collection upon new
                              command.
                              Offline surface scan supported.
                              Self-test supported.
                              Conveyance Self-test supported.
                              Selective Self-test supported.
          SMART capabilities:            (0x0003)    Saves SMART data before entering
                              power-saving mode.
                              Supports SMART auto save timer.
          Error logging capability:        (0x01)    Error logging supported.
                              General Purpose Logging supported.
          Short self-test routine 
          recommended polling time:      (   2) minutes.
          Extended self-test routine
          recommended polling time:      ( 110) minutes.
          Conveyance self-test routine
          recommended polling time:      (   5) minutes.
          SCT capabilities:            (0x303f)    SCT Status supported.
                              SCT Error Recovery Control supported.
                              SCT Feature Control supported.
                              SCT Data Table supported.
          
          SMART Attributes Data Structure revision number: 16
          Vendor Specific SMART Attributes with Thresholds:
          ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
            1 Raw_Read_Error_Rate     POSR--   198   198   051    -    5279
            3 Spin_Up_Time            PO----   191   187   021    -    1425
            4 Start_Stop_Count        -O--CK   098   098   000    -    2857
            5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
            7 Seek_Error_Rate         -OSR--   100   253   051    -    0
            9 Power_On_Hours          -O--CK   094   094   000    -    4811
           10 Spin_Retry_Count        PO--C-   100   100   051    -    0
           11 Calibration_Retry_Count -O--C-   100   100   051    -    0
           12 Power_Cycle_Count       -O--CK   098   098   000    -    2683
          192 Power-Off_Retract_Count -O--CK   200   200   000    -    314
          193 Load_Cycle_Count        -O--CK   180   180   000    -    62358
          194 Temperature_Celsius     -O---K   121   089   000    -    26
          196 Reallocated_Event_Count -O--CK   200   200   000    -    0
          197 Current_Pending_Sector  -O--C-   200   200   000    -    3
          198 Offline_Uncorrectable   ----C-   100   253   000    -    0
          199 UDMA_CRC_Error_Count    -OSRCK   200   200   000    -    0
          200 Multi_Zone_Error_Rate   P--R--   100   253   051    -    0
                                      ||||||_ K auto-keep
                                      |||||__ C event count
                                      ||||___ R error rate
                                      |||____ S speed/performance
                                      ||_____ O updated online
                                      |______ P prefailure warning
          
          General Purpose Log Directory Version 1
          SMART           Log Directory Version 1 [multi-sector log support]
          GP/S  Log at address 0x00 has    1 sectors [Log Directory]
          SMART Log at address 0x01 has    1 sectors [Summary SMART error log]
          SMART Log at address 0x02 has    5 sectors [Comprehensive SMART error log]
          GP    Log at address 0x03 has    6 sectors [Ext. Comprehensive SMART error log]
          SMART Log at address 0x06 has    1 sectors [SMART self-test log]
          GP    Log at address 0x07 has    1 sectors [Extended self-test log]
          SMART Log at address 0x09 has    1 sectors [Selective self-test log]
          GP    Log at address 0x10 has    1 sectors [NCQ Command Error]
          GP    Log at address 0x11 has    1 sectors [SATA Phy Event Counters]
          GP/S  Log at address 0x80 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x81 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x82 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x83 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x84 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x85 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x86 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x87 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x88 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x89 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x8a has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x8b has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x8c has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x8d has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x8e has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x8f has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x90 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x91 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x92 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x93 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x94 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x95 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x96 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x97 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x98 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x99 has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x9a has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x9b has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x9c has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x9d has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x9e has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0x9f has   16 sectors [Host vendor specific log]
          GP/S  Log at address 0xa0 has   16 sectors [Device vendor specific log]
          GP/S  Log at address 0xa1 has   16 sectors [Device vendor specific log]
          GP/S  Log at address 0xa2 has   16 sectors [Device vendor specific log]
          GP/S  Log at address 0xa3 has   16 sectors [Device vendor specific log]
          GP/S  Log at address 0xa4 has   16 sectors [Device vendor specific log]
          GP/S  Log at address 0xa5 has   16 sectors [Device vendor specific log]
          GP/S  Log at address 0xa6 has   16 sectors [Device vendor specific log]
          GP/S  Log at address 0xa7 has   16 sectors [Device vendor specific log]
          GP/S  Log at address 0xa8 has    1 sectors [Device vendor specific log]
          GP/S  Log at address 0xa9 has    1 sectors [Device vendor specific log]
          GP/S  Log at address 0xaa has    1 sectors [Device vendor specific log]
          GP/S  Log at address 0xab has    1 sectors [Device vendor specific log]
          GP/S  Log at address 0xac has    1 sectors [Device vendor specific log]
          GP/S  Log at address 0xad has    1 sectors [Device vendor specific log]
          GP/S  Log at address 0xae has    1 sectors [Device vendor specific log]
          GP/S  Log at address 0xaf has    1 sectors [Device vendor specific log]
          GP/S  Log at address 0xb0 has    1 sectors [Device vendor specific log]
          GP/S  Log at address 0xb1 has    1 sectors [Device vendor specific log]
          GP/S  Log at address 0xb2 has    1 sectors [Device vendor specific log]
          GP/S  Log at address 0xb3 has    1 sectors [Device vendor specific log]
          GP/S  Log at address 0xb4 has    1 sectors [Device vendor specific log]
          GP/S  Log at address 0xb5 has    1 sectors [Device vendor specific log]
          GP/S  Log at address 0xb6 has    1 sectors [Device vendor specific log]
          GP/S  Log at address 0xb7 has    1 sectors [Device vendor specific log]
          GP/S  Log at address 0xc0 has    1 sectors [Device vendor specific log]
          GP    Log at address 0xc1 has   12 sectors [Device vendor specific log]
          GP/S  Log at address 0xe0 has    1 sectors [SCT Command/Status]
          GP/S  Log at address 0xe1 has    1 sectors [SCT Data Transfer]
          
          SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
          Device Error Count: 4717 (device log contains only the most recent 24 errors)
              CR     = Command Register
              FEATR  = Features Register
              COUNT  = Count (was: Sector Count) Register
              LBA_48 = Upper bytes of LBA High/Mid/Low Registers ]  ATA-8
              LH     = LBA High (was: Cylinder High) Register    ]   LBA
              LM     = LBA Mid (was: Cylinder Low) Register      ] Register
              LL     = LBA Low (was: Sector Number) Register     ]
              DV     = Device (was: Device/Head) Register
              DC     = Device Control Register
              ER     = Error register
              ST     = Status register
          Powered_Up_Time is measured from power on, and printed as
          DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
          SS=sec, and sss=millisec. It "wraps" after 49.710 days.
          
          Error 4717 [12] occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
            When the command that caused the error occurred, the device was in an unknown state.
          
            After command completion occurred, registers were:
            ER -- ST COUNT  LBA_48  LH LM LL DV DC
            -- -- -- == -- == == == -- -- -- -- --
            40 -- 00 01 00 00 51 00 00 40 01 00 00  
          
            Commands leading to the command that caused the error were:
            CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
            -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
            00 19 60 2b 4f 00 00 00 00 00 00 00 01     00:00:00.000  NOP [Reserved subcommand]
            00 b0 01 00 01 2b 52 1c 4d 60 00 00 00     00:00:58.296  NOP [Don't abort queued commands]
            0f 37 60 2b 4d 00 a8 00 7d 01 00 00 0d     00:47:17.834  [RESERVED]
            00 98 01 00 00 2b 37 00 4d 60 00 00 00     00:00:17.568  NOP [Don't abort queued commands]
          
          Error 4716 [11] occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
            When the command that caused the error occurred, the device was in an unknown state.
          
            After command completion occurred, registers were:
            ER -- ST COUNT  LBA_48  LH LM LL DV DC
            -- -- -- == -- == == == -- -- -- -- --
            40 -- 00 01 00 00 51 00 00 40 01 00 00  
          
            Commands leading to the command that caused the error were:
            CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
            -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
            00 f0 60 2b 3c 00 00 00 00 00 00 00 01     00:00:00.000  NOP [Reserved subcommand]
            00 58 01 00 01 2b 29 1c 3b 60 00 00 00     00:00:58.208  NOP [Don't abort queued commands]
            0f 0e 60 2b 3b 00 50 00 7d 01 00 00 0d     00:47:13.185  [RESERVED]
            00 40 01 00 00 2b 0e 00 3b 60 00 00 00     00:00:17.480  NOP [Don't abort queued commands]
          
          Error 4715 [10] occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
            When the command that caused the error occurred, the device was in an unknown state.
          
            After command completion occurred, registers were:
            ER -- ST COUNT  LBA_48  LH LM LL DV DC
            -- -- -- == -- == == == -- -- -- -- --
            40 -- 00 01 00 00 51 00 00 40 01 00 00  
          
            Commands leading to the command that caused the error were:
            CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
            -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
            00 57 60 2b 2b 00 00 00 00 00 00 00 01     00:00:00.000  NOP [Reserved subcommand]
            00 10 01 00 01 2b 90 1c 29 60 00 00 00     00:00:58.136  NOP [Don't abort queued commands]
            0f 75 60 2b 29 00 08 00 7d 01 00 00 0d     00:47:08.680  [RESERVED]
            00 f8 01 00 00 2b 75 00 29 60 00 00 00     00:00:17.408  NOP [Don't abort queued commands]
          
          Error 4714 [9] occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
            When the command that caused the error occurred, the device was in an unknown state.
          
            After command completion occurred, registers were:
            ER -- ST COUNT  LBA_48  LH LM LL DV DC
            -- -- -- == -- == == == -- -- -- -- --
            40 -- 00 01 00 00 51 00 00 40 01 00 00  
          
            Commands leading to the command that caused the error were:
            CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
            -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
            00 39 60 2b 19 00 00 00 00 00 00 00 01     00:00:00.000  NOP [Reserved subcommand]
            00 c8 01 00 01 2b 72 1c 17 60 00 00 00     00:00:58.320  NOP [Don't abort queued commands]
            0f 57 60 2b 17 00 c0 00 7d 01 00 00 0d     00:47:04.042  [RESERVED]
            00 b0 01 00 00 2b 57 00 17 60 00 00 00     00:00:17.592  NOP [Don't abort queued commands]
          
          Error 4713 [8] occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
            When the command that caused the error occurred, the device was in an unknown state.
          
            After command completion occurred, registers were:
            ER -- ST COUNT  LBA_48  LH LM LL DV DC
            -- -- -- == -- == == == -- -- -- -- --
            40 -- 00 01 00 00 51 00 00 40 01 00 00  
          
            Commands leading to the command that caused the error were:
            CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
            -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
            00 ce 60 2b 07 00 00 00 00 00 00 00 01     00:00:00.000  NOP [Reserved subcommand]
            00 80 01 00 e2 2b ce 01 07 60 00 00 00     00:00:58.248  NOP [Don't abort queued commands]
            01 ce 60 2b 07 00 78 00 e1 01 00 00 01     00:47:00.046  [RESERVED]
            00 68 01 00 df 2b cd 01 07 60 00 00 00     00:00:57.456  NOP [Don't abort queued commands]
          
          Error 4712 [7] occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
            When the command that caused the error occurred, the device was in an unknown state.
          
            After command completion occurred, registers were:
            ER -- ST COUNT  LBA_48  LH LM LL DV DC
            -- -- -- == -- == == == -- -- -- -- --
            40 -- 00 01 00 00 51 00 00 40 01 00 00  
          
            Commands leading to the command that caused the error were:
            CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
            -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
            00 8b 60 2a f7 00 00 00 00 00 00 00 01     00:00:00.000  NOP [Reserved subcommand]
            00 00 01 00 01 2a c5 1c f5 60 00 00 00     00:00:00.008  NOP [Don't abort queued commands]
            0f aa 60 2a f5 00 f8 00 7d 01 00 00 0d     00:46:55.421  [RESERVED]
            00 e8 01 00 00 2a a9 00 f5 60 00 00 00     00:00:17.648  NOP [Don't abort queued commands]
          
          Error 4711 [6] occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
            When the command that caused the error occurred, the device was in an unknown state.
          
            After command completion occurred, registers were:
            ER -- ST COUNT  LBA_48  LH LM LL DV DC
            -- -- -- == -- == == == -- -- -- -- --
            40 -- 00 01 00 00 51 00 00 40 01 00 00  
          
            Commands leading to the command that caused the error were:
            CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
            -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
            00 e8 60 2a e5 00 00 00 00 00 00 00 01     00:00:00.000  NOP [Reserved subcommand]
            00 b8 01 00 01 2a 21 1c e4 60 00 00 00     00:00:00.192  NOP [Don't abort queued commands]
            0f 06 60 2a e4 00 b0 00 7d 01 00 00 0d     00:46:50.905  [RESERVED]
            00 a0 01 00 00 2a 05 00 e4 60 00 00 00     00:00:17.576  NOP [Don't abort queued commands]
          
          Error 4710 [5] occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
            When the command that caused the error occurred, the device was in an unknown state.
          
            After command completion occurred, registers were:
            ER -- ST COUNT  LBA_48  LH LM LL DV DC
            -- -- -- == -- == == == -- -- -- -- --
            40 -- 00 01 00 00 51 00 00 40 01 00 00  
          
            Commands leading to the command that caused the error were:
            CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
            -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
            00 4f 60 2a d4 00 00 00 00 00 00 00 01     00:00:00.000  NOP [Reserved subcommand]
            00 70 01 00 01 2a 88 1c d2 60 00 00 00     00:00:00.120  NOP [Don't abort queued commands]
            0f 6d 60 2a d2 00 68 00 7d 01 00 00 0d     00:46:46.400  [RESERVED]
            00 58 01 00 00 2a 6c 00 d2 60 00 00 00     00:00:17.504  NOP [Don't abort queued commands]
          
          SMART Extended Self-test Log Version: 1 (1 sectors)
          Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
          # 1  Short offline       Completed: read failure       90%      4811         26840547
          # 2  Extended offline    Completed: read failure       90%      4811         26840547
          # 3  Conveyance offline  Completed: read failure       90%      4811         26840547
          # 4  Short offline       Completed: read failure       90%      4811         26840547
          # 5  Short offline       Completed: read failure       90%      4810         26840547
          # 6  Short offline       Completed: read failure       90%      4810         26840547
          # 7  Short offline       Completed: read failure       90%      4810         26840547
          # 8  Short offline       Completed: read failure       90%      4807         26840547
          # 9  Conveyance offline  Completed: read failure       90%      4807         26840547
          #10  Short offline       Completed: read failure       90%      4807         26840547
          #11  Extended offline    Completed: read failure       90%      4807         26840547
          #12  Short offline       Completed: read failure       80%      4806         26840547
          #13  Short offline       Completed: read failure       90%      4804         26840547
          #14  Short offline       Completed without error       00%      2326         -
          #15  Short offline       Completed without error       00%      1318         -
          
          SMART Selective self-test log data structure revision number 1
           SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
              1        0        0  Not_testing
              2        0        0  Not_testing
              3        0        0  Not_testing
              4        0        0  Not_testing
              5        0        0  Not_testing
          Selective self-test flags (0x0):
            After scanning selected spans, do NOT read-scan remainder of disk.
          If Selective self-test is pending on power-up, resume after 0 minute delay.
          
          SCT Status Version:                  2
          SCT Version (vendor specific):       258 (0x0102)
          SCT Support Level:                   1
          Device State:                        Active (0)
          Current Temperature:                    26 Celsius
          Power Cycle Min/Max Temperature:     15/26 Celsius
          Lifetime    Min/Max Temperature:     16/58 Celsius
          Under/Over Temperature Limit Count:   0/0
          SCT Temperature History Version:     2
          Temperature Sampling Period:         1 minute
          Temperature Logging Interval:        1 minute
          Min/Max recommended Temperature:      5/60 Celsius
          Min/Max Temperature Limit:            1/85 Celsius
          Temperature History Size (Index):    128 (7)
          
          Index    Estimated Time   Temperature Celsius
             8    2012-05-22 09:33    39  ********************
           ...    ..( 14 skipped).    ..  ********************
            23    2012-05-22 09:48    39  ********************
            24    2012-05-22 09:49    40  *********************
            25    2012-05-22 09:50    41  **********************
           ...    ..(  3 skipped).    ..  **********************
            29    2012-05-22 09:54    41  **********************
            30    2012-05-22 09:55    40  *********************
           ...    ..(  8 skipped).    ..  *********************
            39    2012-05-22 10:04    40  *********************
            40    2012-05-22 10:05    39  ********************
           ...    ..( 42 skipped).    ..  ********************
            83    2012-05-22 10:48    39  ********************
            84    2012-05-22 10:49    40  *********************
            85    2012-05-22 10:50    41  **********************
            86    2012-05-22 10:51    41  **********************
            87    2012-05-22 10:52    42  ***********************
            88    2012-05-22 10:53    42  ***********************
            89    2012-05-22 10:54    43  ************************
            90    2012-05-22 10:55    43  ************************
            91    2012-05-22 10:56    44  *************************
           ...    ..(  2 skipped).    ..  *************************
            94    2012-05-22 10:59    44  *************************
            95    2012-05-22 11:00    43  ************************
            96    2012-05-22 11:01    43  ************************
            97    2012-05-22 11:02    44  *************************
            98    2012-05-22 11:03    43  ************************
            99    2012-05-22 11:04    43  ************************
           100    2012-05-22 11:05    43  ************************
           101    2012-05-22 11:06    42  ***********************
           ...    ..(  2 skipped).    ..  ***********************
           104    2012-05-22 11:09    42  ***********************
           105    2012-05-22 11:10    41  **********************
           ...    ..( 10 skipped).    ..  **********************
           116    2012-05-22 11:21    41  **********************
           117    2012-05-22 11:22    40  *********************
           ...    ..(  2 skipped).    ..  *********************
           120    2012-05-22 11:25    40  *********************
           121    2012-05-22 11:26     ?  -
           122    2012-05-22 11:27    15  -
           123    2012-05-22 11:28    16  -
           124    2012-05-22 11:29    17  -
           125    2012-05-22 11:30    18  -
           126    2012-05-22 11:31    19  -
           127    2012-05-22 11:32    20  *
             0    2012-05-22 11:33    21  **
             1    2012-05-22 11:34    22  ***
             2    2012-05-22 11:35    23  ****
             3    2012-05-22 11:36    24  *****
             4    2012-05-22 11:37    25  ******
             5    2012-05-22 11:38    25  ******
             6    2012-05-22 11:39    26  *******
             7    2012-05-22 11:40    39  ********************
          
          SCT Error Recovery Control:
                     Read: Disabled
                    Write: Disabled
          
          SATA Phy Event Counters (GP Log 0x11)
          ID      Size     Value  Description
          0x0001  2            0  Command failed due to ICRC error
          0x0002  2            0  R_ERR response for data FIS
          0x0003  2            0  R_ERR response for device-to-host data FIS
          0x0004  2            0  R_ERR response for host-to-device data FIS
          0x0005  2            0  R_ERR response for non-data FIS
          0x0006  2            0  R_ERR response for device-to-host non-data FIS
          0x0007  2            0  R_ERR response for host-to-device non-data FIS
          0x000a  2            3  Device-to-host register FISes sent due to a COMRESET
          0x8000  4          761  Vendor specific
          from "fdisk"
          Code:
          Disk /dev/sda: 250.1 GB, 250059350016 bytes
          255 heads, 63 sectors/track, 30401 cylinders, total 488397168 sectors
          Units = sectors of 1 * 512 = 512 bytes
          Sector size (logical/physical): 512 bytes / 512 bytes
          I/O size (minimum/optimal): 512 bytes / 512 bytes
          Disk identifier: 0x34fe34fd
          
             Device Boot      Start         End      Blocks   Id  System
          /dev/sda1            2048    20482047    10240000   27  Hidden NTFS WinRE
          /dev/sda2   *    20482875   122881184    51199155    7  HPFS/NTFS/exFAT
          /dev/sda3       122881185   225279494    51199155    7  HPFS/NTFS/exFAT
          /dev/sda4       225279556   488392064   131556254+   5  Extended
          /dev/sda5       225279558   266245244    20482843+  83  Linux
          /dev/sda6       266245308   486335744   110045218+  83  Linux
          /dev/sda7       486335808   488392064     1028128+  82  Linux swap
          Last edited by GreyGeek; May 22, 2012, 11:28 AM. Reason: clean up data



            #20
            From your top and iotop output, it does not look like the processor is always at full load - user and system CPU load are low, and there's not much I/O.

            It looks like your percent-wait (%wa) is very high, though, but I can't make out your top output clearly. Your version seems to have lots of formatting control codes that mine doesn't - how did you capture it?

            Can you run it again, either to the screen and copy/paste, or redirected to a file, so that the output is more like this?

            Code:
            top - 13:44:06 up 5 days, 17:51,  7 users,  load average: 0.02, 0.09, 0.09
            Tasks: 213 total,   2 running, 211 sleeping,   0 stopped,   0 zombie
            Cpu(s): 16.3%us,  7.4%sy,  0.7%ni, 73.5%id,  1.6%wa,  0.0%hi,  0.4%si,  0.0%st
            Mem:   4056476k total,  3627244k used,   429232k free,    91320k buffers
            Swap:  4200960k total,   543368k used,  3657592k free,   753320k cached
            
              PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                
            17120 joe       20   0 2389m 143m  20m S    8  3.6  99:51.03 amarok                                                                 
            24054 joe       20   0  840m 136m  18m S    6  3.4  32:40.21 kwin                                                                   
             1332 root      20   0  582m 378m  74m S    4  9.6 328:58.78 Xorg                                                                   
             2019 joe       20   0  726m  13m 6444 S    4  0.3  80:51.28 knotify4                                                               
             4811 joe       20   0  695m  94m  28m S    4  2.4  10:56.99 plugin-containe                                                        
             1887 joe       20   0  322m 3316 1788 S    2  0.1   0:02.90 kdeinit4                                                               
             2022 joe       20   0 1064m 116m  31m S    2  2.9 100:35.07 plasma-desktop                                                         
             4630 joe       20   0 1418m 406m  53m S    2 10.3  10:42.12 firefox                                                                
             5850 root      20   0     0    0    0 S    2  0.0   0:00.94 kworker/1:3                                                            
             6000 joe       20   0 21464 1404  972 R    2  0.0   0:00.01 top                                                                    
                1 root      20   0 24188 1924 1100 S    0  0.0   0:01.33 init
            Only the top 10 lines or so matter.


            I'll leave it to someone else to advise on the smartctl output, but I think it means you have some issues.
            I'd rather be locked out than locked in.



              #21
              I captured all the outputs in the same way: given-command > filename. I tried several times but I always get these formatting codes, just on the "top" report.

              Keep in mind that I have great difficulty working on the affected system: running commands from a console is slow but more or less doable, whereas anything with a GUI takes forever just to open its windows. If it is OK, I can post a photo of the terminal, where the data is clearly readable.

              I don't know why the reports show so little CPU usage: the whole system is practically unusable. If I open System Monitor, the system load tab shows both of my processors at 100%, and on the processes tab several processes are listed as "disk wait" in the CPU usage column. Plus, the HDD LED is always on.

              In a couple of minutes I'll post the picture with the "top" output.



                #22
                Originally posted by sekhemty
                I captured all the outputs in the same way: given-command > filename. I tried several times but I always get these formatting codes, just on the "top" report.
                Try
                Code:
                top -n 2 -d 10 -b | head -n 20 > /tmp/dump.log



                  #23
                  Sorry, but that didn't work well either.

                  I'll try with some pictures (the GUI is in Italian; on the second screen, in the CPU column, "attesa disco" means "disk wait").

                  Also, I don't know if it helps, but every time I log in to the affected system, the clock is set a couple of hours in the future.







                    #24
                    The stats look like they are 1.6%us 0.6%sy 0.0%ni 22.5%id 75.3%wa

                    75% disk wait (75% of the time the CPU is "busy" waiting for disk I/O to complete) is definitely what's making your system slow. You should only see that for short periods - or when doing heavy I/O - in a properly functioning system. And your iotop output does not show any significant I/O rates.

                    There is definitely something wrong with the disk - although I suppose it might be a kernel / driver parameter issue rather than failing hardware.
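
For anyone who wants to watch that wait figure without fighting top's formatting, here is a minimal sketch that derives it from two samples of the aggregate "cpu" line in /proc/stat. The function name iowait_pct is made up for illustration, and the two sample lines are invented numbers chosen so that the deltas reproduce the percentages quoted above; they are not readings from the poster's machine.

```shell
# iowait_pct SAMPLE1 SAMPLE2
# Each argument is one "cpu ..." line from /proc/stat. The counters after the
# label are: user nice system idle iowait irq softirq steal (in clock ticks).
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total[NR] = 0
            for (i = 2; i <= 9 && i <= NF; i++) total[NR] += $i
            iow[NR] = $6            # fifth counter is iowait
        }
        END {
            dt = total[2] - total[1]
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (iow[2] - iow[1]) / dt
        }'
}

# Invented samples: the deltas are 16 user, 6 system, 225 idle, 753 iowait.
before='cpu 1000 0 500 8000 500 0 0 0'
after='cpu 1016 0 506 8225 1253 0 0 0'
iowait_pct "$before" "$after"

# On a live system (hypothetical usage):
#   a=$(grep '^cpu ' /proc/stat); sleep 5; b=$(grep '^cpu ' /proc/stat)
#   iowait_pct "$a" "$b"
```

With the invented samples this prints 75.3. Working from counter deltas rather than a single reading matters because the counters in /proc/stat accumulate since boot.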
                    Last edited by SecretCode; May 22, 2012, 10:08 AM.



                      #25
                      Originally posted by SecretCode
                      75% disk wait (75% of the time the CPU is "busy" waiting for disk I/O to complete) is definitely what's making your system slow. You should only see that for short periods - or when doing heavy I/O - in a properly functioning system. And your iotop output does not show any significant I/O rates.
                      Very well. So, since everything runs smoothly if I rename the .kde folder, what in its contents is most likely to cause this odd I/O behaviour?



                        #26
                        Have you run drive diagnostic tests through smartctl or Disk Utility?

                        I think the following command is quick and can be used while the drive is online:
                        Code:
                        sudo smartctl -t short /dev/sda
                        but you will only see the results some time later via
                        Code:
                        sudo smartctl -l selftest /dev/sda
                        This one starts an immediate offline test; it only updates the SMART attribute values and, if errors turn up, the SMART error log:
                        Code:
                        sudo smartctl -t offline /dev/sda
                        I've got these from man smartctl and I've tested them on my own disk. As far as I can see, the worst that can happen with smartctl tests is that "-t long" tests can take hours, and if you include the "-C" ("--captive") parameter the drive will be inaccessible for the duration, so it should not be used with mounted partitions.
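
Once a self-test has finished, the interesting part of the log is the LBA_of_first_error column. A small sketch for pulling the failing LBAs out of that log follows; failed_lbas is a name invented here, and the sample lines are copied from the self-test log posted earlier in this thread.

```shell
# failed_lbas: read `smartctl -l selftest` output on stdin and print each
# distinct LBA_of_first_error reported by a test that ended in a read failure.
failed_lbas() {
    awk '/Completed: read failure/ { print $NF }' | sort -u
}

# Three failing entries from the log in this thread, plus one clean entry
# that should be ignored.
sample_log='#11  Extended offline    Completed: read failure       90%      4807         26840547
#12  Short offline       Completed: read failure       80%      4806         26840547
#13  Short offline       Completed: read failure       90%      4804         26840547
#14  Short offline       Completed without error       00%      2326         -'

printf '%s\n' "$sample_log" | failed_lbas

# Against a real drive (needs root and smartmontools):
#   sudo smartctl -l selftest /dev/sda | failed_lbas
```

For the sample it prints the single value 26840547, since all three failed tests stopped at the same LBA.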



                          #27
                          Originally posted by sekhemty
                          Very well. So, since everything runs smoothly if I rename the .kde folder, what in its contents is most likely to cause this odd I/O behaviour?
                          I forgot about this detail. OK, ignore my previous post, there's not much point in testing the hard drive!

                          Let me have a rummage through ~/.kde ...



                            #28
                            Originally posted by sekhemty
                            Is there any other tool I can use to try to find out if the HD has physical errors?
                            Originally posted by SecretCode
                            "checked multiple times" - are you referring to fsck? That only checks the integrity of the data structures - it doesn't test anything to do with recoverable disk errors or slow reads due to retries.
                            A more thorough disk check utility is e2fsck. This tool can force a check even if the file system appears clean, and can also call the badblocks utility to scan for bad blocks. Boot a live CD/USB (or some other recovery tool) and try this:
                            Code:
                            sudo e2fsck -cfv /dev/sda
                            Originally posted by sekhemty
                            The thing that looks very strange to me is that, judging by the sector numbers and by a report from GParted, the bad sectors are located exclusively on sda2 (Windows), which is not even mounted when Kubuntu starts; it uses just sda5 and sda6.
                            I'm not sure that's a correct analysis. Your original screenshot shows the end_request error at sector 266593325. That sector falls within the boundaries of /dev/sda6, according to your fdisk -l output:
                            Code:
                               Device Boot      Start         End      Blocks   Id  System
                            /dev/sda6       266245308   486335744   110045218+  83  Linux
                            Older versions of fdisk relied on cylinder-head-sector geometry for reporting logical block numbers. This is no longer the case: it now reports sector numbers. But smartctl's extended self test is still using LBAs, which is why you see the error in 26840547.

                            IMHO, you have bad blocks in partition /dev/sda6.
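
A quick way to double-check this kind of lookup is to compare a sector number against the Start/End columns directly. The sketch below assumes both numbers quoted in this thread are 512-byte sector offsets from the start of the disk; the table is copied from the fdisk output above (the extended container sda4 is left out so that logical partitions are reported instead), and partition_for_sector is a name invented for the example.

```shell
# Start and end sectors copied from the fdisk output in this thread.
partitions='/dev/sda1 2048 20482047
/dev/sda2 20482875 122881184
/dev/sda3 122881185 225279494
/dev/sda5 225279558 266245244
/dev/sda6 266245308 486335744
/dev/sda7 486335808 488392064'

# partition_for_sector N: print the partition whose range contains sector N,
# or "unallocated" if it falls in a gap between partitions.
partition_for_sector() {
    printf '%s\n' "$partitions" | awk -v s="$1" '
        s + 0 >= $2 + 0 && s + 0 <= $3 + 0 { print $1; found = 1 }
        END { if (!found) print "unallocated" }'
}

partition_for_sector 266593325   # sector from the end_request error
partition_for_sector 26840547    # LBA_of_first_error from the self-test log
```

Under that assumption the first lookup prints /dev/sda6 and the second prints /dev/sda2, so the kernel error and the self-test log may be pointing at two different areas of the disk. That is worth verifying rather than taking as settled.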



                              #29
                              The output of e2fsck is as follows; it seems pretty useless as it is:

                              Code:
                              root@PartedMagic:~# e2fsck -cfv /dev/sda
                              e2fsck 1.42 (29-Nov-2011)
                              e2fsck: Superblock invalid, trying backup blocks...
                              e2fsck: Bad magic number in super-block while trying to open /dev/sda
                              
                              The superblock could not be read or does not describe a correct ext2
                              filesystem.  If the device is valid and it really contains an ext2
                              filesystem (and not swap or ufs or something else), then the superblock
                              is corrupt, and you might try running e2fsck with an alternate superblock:
                                  e2fsck -b 8193 <device>
                              
                              root@PartedMagic:~# e2fsck -cfv -b 8193 /dev/sda
                              e2fsck 1.42 (29-Nov-2011)
                              e2fsck: Bad magic number in super-block while trying to open /dev/sda
                              
                              The superblock could not be read or does not describe a correct ext2
                              filesystem.  If the device is valid and it really contains an ext2
                              filesystem (and not swap or ufs or something else), then the superblock
                              is corrupt, and you might try running e2fsck with an alternate superblock:
                                  e2fsck -b 8193 <device>
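
The "Bad magic number" message is what e2fsck reports when it cannot find an ext superblock, and that is the expected result for a whole-disk device such as /dev/sda, which begins with the partition table rather than a filesystem. The check normally has to target a partition instead. As a rough sketch, the helper below (ext_check_commands, a made-up name) only prints candidate commands for the partitions that fdisk lists with Id 83; it assumes those hold ext2/3/4 filesystems, which should be confirmed (for example with blkid) before running anything, and the partitions must not be mounted.

```shell
# ext_check_commands: read `fdisk -l` partition lines on stdin and print one
# e2fsck command per partition whose type is Id 83 ("Linux").
# Nothing is executed; the commands are only printed for review.
ext_check_commands() {
    awk '$NF == "Linux" && $(NF - 1) == "83" { print "e2fsck -cfv " $1 }'
}

# Lines copied from the fdisk output posted in this thread.
fdisk_lines='/dev/sda4       225279556   488392064   131556254+   5  Extended
/dev/sda5       225279558   266245244    20482843+  83  Linux
/dev/sda6       266245308   486335744   110045218+  83  Linux
/dev/sda7       486335808   488392064     1028128+  82  Linux swap'

printf '%s\n' "$fdisk_lines" | ext_check_commands
```

For the table in this thread it prints one command each for /dev/sda5 and /dev/sda6 and skips the extended container and the swap partition.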
                              Anyway, even though I'm now sure that the HDD has some problems, and I'm considering getting a new one and cloning it, I'm also pretty confident that the slowness is not hardware related, at least not in a direct way. Renaming the .kde folder and letting the system create a new one at login proved that something in it causes the sluggishness, and restoring the original one brings the problem back.
                              Since renaming, as far as I know, doesn't physically move the files on the HDD, I even tried backing up the whole folder to a portable USB key, then deleted it completely from the main hard drive and logged in. As expected, the system ran smoothly.
                              Then I logged off, deleted the newly created .kde folder and restored the one from the USB key. Now I'm sure that the files don't necessarily reside on the same sectors of the hard drive, and the problem persists, so it must be some bad configuration file.

                              The main problem now is to identify it. As I said in a previous post, the directory contains hundreds of files, and manually renaming or removing them one by one would take a large amount of time.
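
One way to avoid going file by file is to bisect: set half of the entries aside, log in, and see whether the slowness is still there, then repeat on whichever half contains the culprit. The sketch below is only an illustration of that idea. stash_half and the directory names in the usage comment are hypothetical, the holding directory is assumed to be outside the source directory, and file names containing newlines are not handled. It should be run on a backed-up copy while logged out of KDE.

```shell
# stash_half SRC HOLD: move the first half (in sorted order) of SRC's
# top-level entries into HOLD, leaving the second half in place.
stash_half() {
    src=$1 hold=$2
    mkdir -p "$hold"
    total=$(find "$src" -mindepth 1 -maxdepth 1 | wc -l)
    half=$((total / 2))
    find "$src" -mindepth 1 -maxdepth 1 | sort | head -n "$half" |
    while IFS= read -r entry; do
        mv -- "$entry" "$hold"/
    done
}

# Hypothetical usage, from a text console:
#   stash_half "$HOME/.kde/share/config" "$HOME/kde-stash"
# Log in to KDE and test. If the problem is gone, the culprit is in the
# stash; move the entries back and repeat on a smaller set.
```

With a few hundred files, halving each time should narrow things down in fewer than ten rounds.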

                              I'm attaching a zipped text file listing all the contents of my .kde folder. Because of its length I was unable to post it as normal post text. kdelist.zip



                                #30
                                Why is EVERYTHING in your .kde folder owned by "partitionmag" and with a group "partmagi"?

                                All that stuff should be owned by you (your account name) and your group (account name).
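
Owner and group names in a listing are resolved by whichever system produced it, so it is worth repeating the check from the installed system itself. A minimal sketch follows; not_owned_by_me is a name invented for the example.

```shell
# not_owned_by_me DIR: list everything under DIR whose owner is not the
# user running the command (compared by numeric user ID).
not_owned_by_me() {
    find "$1" ! -user "$(id -u)" -print
}

# Hypothetical usage on the directory discussed in this thread:
#   not_owned_by_me "$HOME/.kde" | head
# If the list is not empty, ownership could be handed back with something like:
#   sudo chown -R "$(id -un)":"$(id -gn)" "$HOME/.kde"
```

An empty result means every file already belongs to the logged-in account, in which case the names in the attached listing would simply reflect the system the listing was made on.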
                                "A nation that is afraid to let its people judge the truth and falsehood in an open market is a nation that is afraid of its people.”
                                – John F. Kennedy, February 26, 1962.
