Grant Sewell wrote:
> Hi all,
>
> I have started to have 'problems' with my file/web/mail server. I am
> getting the following message several times over in dmesg output:
>
> hdd: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hdd: dma_intr: error=0x84 { DriveStatusError BadCRC }
> ide: failed opcode was: unknown
>
> Occasionally I will also get:
>
> ide1: reset: master: error (0x7f?)
>
> fdisk shows hdd to be the following (which is correct):
>
> Disk /dev/hdd: 163.9 GB, 163928604672 bytes
> 255 heads, 63 sectors/track, 19929 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
>
> Device Boot Start End Blocks Id System
> /dev/hdd1 1 19929 160079661 83 Linux
>
> and "smartctl -a /dev/hdd":
>
> Model Family: Maxtor DiamondMax Plus 9 family
> Device Model: Maxtor 6Y160P0
> Serial Number: Y46CSYAE
> Firmware Version: YAR41BW0
> User Capacity: 163,928,604,672 bytes
> Device is: In smartctl database [for details use: -P show]
> ATA Version is: 7
> ATA Standard is: ATA/ATAPI-7 T13 1532D revision 0
> Local Time is: Sat Jan 26 16:42:22 2008 GMT
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> The above errors don't seem to affect general use of the machine,
> however quite concern-making is that recently I have also been getting
> these whenever I try to access *some* parts of the file-system on hdd
> (mounted as /home):
>
> end_request: I/O error, dev hdd, sector 202454167
> end_request: I/O error, dev hdd, sector 202454663
> end_request: I/O error, dev hdd, sector 202454671
> end_request: I/O error, dev hdd, sector 202454167
> end_request: I/O error, dev hdd, sector 63
> Buffer I/O error on device hdd1, logical block 0
> lost page write due to I/O error on hdd1
> end_request: I/O error, dev hdd, sector 127
> Buffer I/O error on device hdd1, logical block 8
> lost page write due to I/O error on hdd1
> end_request: I/O error, dev hdd, sector 86507599
> Buffer I/O error on device hdd1, logical block 10813442
> lost page write due to I/O error on hdd1
> end_request: I/O error, dev hdd, sector 86507695
> Buffer I/O error on device hdd1, logical block 10813454
> lost page write due to I/O error on hdd1
> end_request: I/O error, dev hdd, sector 86507703
> Buffer I/O error on device hdd1, logical block 10813455
> lost page write due to I/O error on hdd1
> end_request: I/O error, dev hdd, sector 202454167
> end_request: I/O error, dev hdd, sector 86507695
> EXT3-fs error (device hdd1): ext3_get_inode_loc: unable to read inode
> block - inode=5407135, block=10813454 Aborting journal on device hdd1.
> end_request: I/O error, dev hdd, sector 4303
> Buffer I/O error on device hdd1, logical block 530
> lost page write due to I/O error on hdd1
> end_request: I/O error, dev hdd, sector 63
> Buffer I/O error on device hdd1, logical block 0
> lost page write due to I/O error on hdd1
> EXT3-fs error (device hdd1) in ext3_reserve_inode_write: IO failure
> end_request: I/O error, dev hdd, sector 63
> Buffer I/O error on device hdd1, logical block 0
> lost page write due to I/O error on hdd1
> EXT3-fs error (device hdd1) in ext3_dirty_inode: IO failure
> end_request: I/O error, dev hdd, sector 63
> Buffer I/O error on device hdd1, logical block 0
> lost page write due to I/O error on hdd1
> ext3_abort called.
> EXT3-fs error (device hdd1): ext3_journal_start_sb: Detected aborted
> journal Remounting filesystem read-only
>
> Upon dropping to runlevel 1, then performing "umount /home" I
> immediately get:
>
> end_request: I/O error, dev hdd, sector 4303
> Buffer I/O error on device hdd1, logical block 530
> lost page write due to I/O error on hdd1
>
> (or something like that)
>
> Then a fsck /dev/hdd1 returns with:
> end_request: I/O error, dev hdd, sector 69
> (repeated lots, different sectors)
>
> fsck.ext3: Attempt to read block from filesystem resulted in short read
> whilst trying to open /dev/hdd1 Could this be a zero-length partition?
>
> Indeed, now an "fdisk -l /dev/hdd" shows:
> end_request: I/O error, dev hdd, sector 0
> printk: 30 messages suppressed.
> Buffer I/O error on device hdd, logical block 0
> (blah blah)
>
> Reboot and all is fine again, until I try to do this again... then I
> get errors again.
>
> I'm not really sure where to begin. I've disabled DMA by adding a
> kernel boot parameter of ide=nodma, but that doesn't seem to affect
> this problem at all. Booting from another medium and fscking both hda1
> and hdd1 come back fine. When the disks are removed and attached to
> another machine via a USB-ATA adapter, all is OK, so I'm inclined to
> think it might be the PATA controller on this motherboard (don't ask me
> what it is, I have no idea), however this machine has been working fine
> for ages... and more concerning I used to get these sorts of errors on
> my "old" server before I retired it and performed a
> harddrive-transplant to this "new" computer, and all was fine for a
> while.
>
> Thanks for reading. Any ideas?
>
> Cheers.
> Grant.
>
Well, for a start I'd get the data off the drive. Then you could try
downloading the Maxtor diagnostics software from www.seagate.com
(Seagate own Maxtor) and running a complete test on the drive (you
should at least be able to run a complete read test). If, as you say,
you think there is a controller problem, try the drive in two machines.
It's also worth double-checking the jumper settings (is everything set
consistently, either all Cable Select or all Master/Slave?) and whether
the cable could be faulty.
Hopefully this gives you something to go on.
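
If the vendor diagnostics are not to hand, a complete read test can be
roughed out along the lines below. This is only a sketch, not a
replacement for the Maxtor tool or for badblocks: the function name
read_test and the 1 MiB chunk size are my own choices, it assumes you
run it as root against the raw device (e.g. /dev/hdd), and it only
reads, so it will not repair anything. Reads that were already cached by
the kernel may not hit the disk at all, so it is best run straight after
a boot.

```python
import os
import sys

CHUNK = 1024 * 1024  # bytes per read; an arbitrary choice for this sketch


def read_test(path, chunk=CHUNK):
    """Read `path` from start to end.

    Returns (bytes_read, failed_offsets), where failed_offsets lists the
    byte offset of every chunk whose read raised an I/O error.
    """
    failed = []
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size of a regular file or a block device.
        size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            want = min(chunk, size - offset)
            try:
                os.lseek(fd, offset, os.SEEK_SET)
                data = os.read(fd, want)
            except OSError:
                # Note the failing chunk and carry on with the next one.
                failed.append(offset)
                offset += want
                continue
            if not data:
                break  # unexpected end of device
            total += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return total, failed


if __name__ == "__main__" and len(sys.argv) > 1:
    total, failed = read_test(sys.argv[1])
    print("read %d bytes, %d unreadable chunk(s)" % (total, len(failed)))
    for off in failed:
        # 512-byte sectors, as in the fdisk output quoted above.
        print("  I/O error near sector %d" % (off // 512))
```

Running it against the same drive on this machine and again via the
USB-ATA adapter on the other one should help show whether the failures
follow the disk or stay with the controller and cable.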
Rob
--
The Mailing List for the Devon & Cornwall LUG
http://mailman.dclug.org.uk/listinfo/list
FAQ: http://www.dcglug.org.uk/linux_adm/list-faq.html