Page 2 of 3
Results 11 to 20 of 26

Thread: free-dc: Drive crash

  1. #11
    Advisor - Stateside Division
    Bok's Avatar
    Join Date
    October 14th, 2010
    Location
    Wake Forest, NC
    Posts
    1,211

    Re: free-dc: Drive crash

    Well, it just happened again, but at least I'm home this time.

    I'm still puzzled as to what it is though. Rebooted and the drives again are just fine.

    This is the log excerpt.

    Code:
    Jul 20 03:47:02 dbase kernel: imklog 4.6.2, log source = /proc/kmsg started.
    Jul 20 03:47:02 dbase rsyslogd: [origin software="rsyslogd" swVersion="4.6.2" x-pid="1380" x-info="http://www.rsyslog.com"] (re)start
    Jul 21 15:21:54 dbase kernel: ata8.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x6 frozen
    Jul 21 15:21:54 dbase kernel: ata8.00: failed command: WRITE FPDMA QUEUED
    Jul 21 15:21:54 dbase kernel: ata8.00: cmd 61/08:00:87:03:53/00:00:0c:00:00/40 tag 0 ncq 4096 out
    Jul 21 15:21:54 dbase kernel:         res 40/00:04:3f:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
    Jul 21 15:21:54 dbase kernel: ata8.00: status: { DRDY }
    Jul 21 15:21:54 dbase kernel: ata8.00: failed command: WRITE FPDMA QUEUED
    Jul 21 15:21:54 dbase kernel: ata8.00: cmd 61/08:08:07:d5:52/00:00:0c:00:00/40 tag 1 ncq 4096 out
    Jul 21 15:21:54 dbase kernel:         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
    Jul 21 15:21:54 dbase kernel: ata8.00: status: { DRDY }
    Jul 21 15:21:54 dbase kernel: ata8.00: failed command: WRITE FPDMA QUEUED
    Jul 21 15:21:54 dbase kernel: ata8.00: cmd 61/08:10:4f:e5:52/00:00:0c:00:00/40 tag 2 ncq 4096 out
    Jul 21 15:21:54 dbase kernel:         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
    Jul 21 15:21:54 dbase kernel: ata8.00: status: { DRDY }
    Jul 21 15:21:54 dbase kernel: ata8.00: failed command: WRITE FPDMA QUEUED
    Jul 21 15:21:54 dbase kernel: ata8.00: cmd 61/08:18:9f:dc:52/00:00:0c:00:00/40 tag 3 ncq 4096 out
    Jul 21 15:21:54 dbase kernel:         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
    Jul 21 15:21:54 dbase kernel: ata8.00: status: { DRDY }
    Jul 21 15:21:54 dbase kernel: ata8.00: failed command: WRITE FPDMA QUEUED
    Jul 21 15:21:54 dbase kernel: ata8.00: cmd 61/08:20:f7:ec:52/00:00:0c:00:00/40 tag 4 ncq 4096 out
    Jul 21 15:21:54 dbase kernel:         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
    Jul 21 15:21:54 dbase kernel: ata8.00: status: { DRDY }
    Jul 21 15:21:54 dbase kernel: ata8.00: failed command: WRITE FPDMA QUEUED
    Jul 21 15:21:54 dbase kernel: ata8.00: cmd 61/08:28:a7:e4:52/00:00:0c:00:00/40 tag 5 ncq 4096 out
    Jul 21 15:21:54 dbase kernel:         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
    Jul 21 15:21:54 dbase kernel: ata8.00: status: { DRDY }
    Jul 21 15:21:54 dbase kernel: ata8.00: failed command: WRITE FPDMA QUEUED
    Jul 21 15:21:54 dbase kernel: ata8.00: cmd 61/08:30:1f:d9:52/00:00:0c:00:00/40 tag 6 ncq 4096 out
    Jul 21 15:21:54 dbase kernel:         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
    Jul 21 15:21:54 dbase kernel: ata8.00: status: { DRDY }
    Jul 21 15:21:54 dbase kernel: ata8.00: failed command: WRITE FPDMA QUEUED
    Jul 21 15:21:54 dbase kernel: ata8.00: cmd 61/10:38:bf:e4:52/00:00:0c:00:00/40 tag 7 ncq 8192 out
    Jul 21 15:21:54 dbase kernel:         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
    Jul 21 15:21:54 dbase kernel: ata8.00: status: { DRDY }
    Jul 21 15:21:54 dbase kernel: ata8.00: failed command: WRITE FPDMA QUEUED
    Jul 21 15:21:54 dbase kernel: ata8.00: cmd 61/08:40:4f:d7:52/00:00:0c:00:00/40 tag 8 ncq 4096 out
    Jul 21 15:21:54 dbase kernel:         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
    Jul 21 15:21:54 dbase kernel: ata8.00: status: { DRDY }
    Jul 21 15:21:54 dbase kernel: ata8.00: failed command: WRITE FPDMA QUEUED
    Jul 21 15:21:54 dbase kernel: ata8.00: cmd 61/08:48:67:da:52/00:00:0c:00:00/40 tag 9 ncq 4096 out
    Jul 21 15:21:54 dbase kernel:         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
    <snip>
    Jul 21 15:22:41 dbase kernel: ata8.00: device reported invalid CHS sector 0
    Jul 21 15:22:41 dbase kernel: ata8: hard resetting link
    Jul 21 15:22:41 dbase kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
    Jul 21 15:22:41 dbase kernel: ata8: EH complete
    Jul 21 15:22:41 dbase kernel: sd 7:0:0:0: [sdd] Unhandled error code
    Jul 21 15:22:41 dbase kernel: sd 7:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
    Jul 21 15:22:41 dbase kernel: sd 7:0:0:0: [sdd] CDB: Write(10): 2a 00 0c 52 fd 8f 00 00 08 00
    Jul 21 15:22:41 dbase kernel: end_request: I/O error, dev sdd, sector 206765455
    Jul 21 15:22:41 dbase kernel: Buffer I/O error on device sdd1, logical block 25845674
    Jul 21 15:22:41 dbase kernel: lost page write due to I/O error on sdd1
    Jul 21 15:22:41 dbase kernel: sd 7:0:0:0: [sdd] Unhandled error code
    Jul 21 15:22:41 dbase kernel: sd 7:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
    Jul 21 15:22:41 dbase kernel: sd 7:0:0:0: [sdd] CDB: Write(10): 2a 00 0c 52 db 6f 00 00 08 00
    Jul 21 15:22:41 dbase kernel: end_request: I/O error, dev sdd, sector 206756719
    Jul 21 15:22:41 dbase kernel: Buffer I/O error on device sdd1, logical block 25844582
    Jul 21 15:22:41 dbase kernel: lost page write due to I/O error on sdd1
    Jul 21 15:22:41 dbase kernel: sd 7:0:0:0: [sdd] Unhandled error code
    Jul 21 15:22:41 dbase kernel: sd 7:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
    Jul 21 15:22:41 dbase kernel: sd 7:0:0:0: [sdd] CDB: Write(10): 2a 00 0c 52 b2 bf 00 00 08 00
    Jul 21 15:22:41 dbase kernel: end_request: I/O error, dev sdd, sector 206746303
    Jul 21 15:22:41 dbase kernel: Buffer I/O error on device sdd1, logical block 25843280
    Jul 21 15:22:41 dbase kernel: lost page write due to I/O error on sdd1
    Jul 21 15:22:41 dbase kernel: sd 7:0:0:0: [sdd] Unhandled error code
    Jul 21 15:22:41 dbase kernel: sd 7:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
    Jul 21 15:22:41 dbase kernel: sd 7:0:0:0: [sdd] CDB: Write(10): 2a 00 0c 52 da 57 00 00 08 00
    Jul 21 15:22:41 dbase kernel: end_request: I/O error, dev sdd, sector 206756439
    Jul 21 15:22:41 dbase kernel: Buffer I/O error on device sdd1, logical block 25844547
    Lots more of that, then similar errors for sdc1.

    I'm going to at least put in the other 2 drives I have spare right now.

  2. #12
    Friend of SETI.USA
    Join Date
    November 15th, 2010
    Posts
    2,452

    Re: free-dc: Drive crash

    Again, it looks like. Bok's on it though ...

  3. #13
    Advisor - Stateside Division
    Bok's Avatar
    Join Date
    October 14th, 2010
    Location
    Wake Forest, NC
    Posts
    1,211

    Re: free-dc: Drive crash

    Running through some tests, the drives appear totally fine; smartctl shows no errors on any of them.

    Got to be the board/controllers, right?

    *EDIT* Could it possibly be a PSU issue?

  4. #14
    Administrator
    Bryan's Avatar
    Join Date
    October 27th, 2010
    Location
    CO summer, TX winter
    Posts
    6,457

    Re: free-dc: Drive crash

    Quote: Originally Posted by Bok
    Running through some tests, the drives appear totally fine; smartctl shows no errors on any of them.

    Got to be the board/controllers, right?

    *EDIT* Could it possibly be a PSU issue?
    Aren't PSUs always guilty until proven innocent?


  5. #15
    Platinum Member
    Mumps's Avatar
    Join Date
    October 28th, 2010
    Location
    Milwaukee, WI
    Posts
    4,002

    Re: free-dc: Drive crash

    Personally I'd be most suspicious of the controller. Possibly drivers for same. Do you keep the Linux patches current? May be a problem that's been seen by others or fixed. (Or a regression if you recently updated.)

    I did have one BOINC/Crunching only system that kept doing something similar. It would declare the drive Read-Only after a flurry of I/O errors. I replaced the hard drive multiple times with ones I had on hand, yet the problem kept recurring. Only when I finally went from HD to SSD did the problem go away. (Knock on wood.) Didn't even help to try moving to a different SATA port. But still using the same SATA controller for the SSD without issue now for many months.

  6. #16
    Advisor - Stateside Division
    Bok's Avatar
    Join Date
    October 14th, 2010
    Location
    Wake Forest, NC
    Posts
    1,211

    Re: free-dc: Drive crash

    I only really update if there is anything that needs serious fixing. These have been running just fine since the last round of failures almost a year ago though. I just yanked the machine out and it certainly needs a good clean, so I will do that tonight when I add the two spare drives.

  7. #17
    Silver Member
    myshortpencil's Avatar
    Join Date
    May 13th, 2012
    Location
    NY
    Posts
    961

    Re: free-dc: Drive crash

    Don't know if it will help, but when Free-DC starts going down, the stats that stop working first for me are the country stats from Portugal. The table won't load, yet individual and team stats do.

  8. #18
    Advisor - Stateside Division
    Bok's Avatar
    Join Date
    October 14th, 2010
    Location
    Wake Forest, NC
    Posts
    1,211

    Re: free-dc: Drive crash

    Quote: Originally Posted by myshortpencil
    Don't know if it will help, but when Free-DC starts going down, the stats that stop working first for me are the country stats from Portugal. The table won't load, yet individual and team stats do.
    The pages are cached on the web server for a short time, so that's likely a manifestation of that only.

  9. #19
    Administrator
    Bryan's Avatar
    Join Date
    October 27th, 2010
    Location
    CO summer, TX winter
    Posts
    6,457

    Re: free-dc: Drive crash

    There is one minor problem in the reporting. If you look at Combined Team stats, it is missing position/team #10. I noticed that a week ago and forgot about it, but it is still a problem. I think that position is held by Taiwan.


  10. #20
    Advisor - Stateside Division
    Bok's Avatar
    Join Date
    October 14th, 2010
    Location
    Wake Forest, NC
    Posts
    1,211

    Re: free-dc: Drive crash

    Quote: Originally Posted by Bryan
    There is one minor problem in the reporting. If you look at Combined Team stats, it is missing position/team #10. I noticed that a week ago and forgot about it, but it is still a problem. I think that position is held by Taiwan.
    Ah, good spot. This is a bug. It's in a secondary table that I use for the number of projects a team has done, simply teamname and count, and it's recreated often from the main data.

    BUT, and it's something I don't really like about MySQL, the default is for fields to be case-insensitive, and this is one of them. There is a BOINC@Taiwan team in 10th place and there is also a BOINC@TAIWAN. The latter was taking precedence and being created in this table; the former was not, as it appeared to be the same. The SQL for listing the top teams joins to this table and didn't find an entry for BOINC@Taiwan, so it did not display it.

    Fixed, so it should show up fairly soon (as long as the server holds up!)
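
    The collision described above can be sketched in a few lines. This is only an illustration, not Bok's actual schema or fix: it uses Python's built-in SQLite rather than MySQL, with SQLite's `NOCASE` collation standing in for a case-insensitive MySQL default and `BINARY` standing in for a case-sensitive one. The table name, column names, and the count value are hypothetical.

```python
import sqlite3


def rebuild_team_table(collation):
    """Recreate a teamname/count table and return the names that survive.

    `collation` is the SQLite collation applied to the key column:
    "NOCASE" compares ASCII text without regard to case, "BINARY" is exact.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE team_projects ("
        f"teamname TEXT COLLATE {collation} PRIMARY KEY, "
        "projects INTEGER)"
    )
    # Two distinct teams whose names differ only by case (placeholder counts).
    rows = [("BOINC@TAIWAN", 1), ("BOINC@Taiwan", 1)]
    # INSERT OR IGNORE silently skips a row whose key collides with an
    # existing one, so under NOCASE the second team never gets an entry.
    conn.executemany("INSERT OR IGNORE INTO team_projects VALUES (?, ?)", rows)
    names = [
        row[0]
        for row in conn.execute(
            "SELECT teamname FROM team_projects ORDER BY rowid"
        )
    ]
    conn.close()
    return names


if __name__ == "__main__":
    print(rebuild_team_table("NOCASE"))  # only the first spelling is stored
    print(rebuild_team_table("BINARY"))  # both spellings are stored
```

    With the case-insensitive key, only BOINC@TAIWAN ends up in the table, so a lookup that expects an exact BOINC@Taiwan entry comes back empty. With a case-sensitive key, both teams get their own row.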
