Grant Sewell wrote:
> Hi all,
>
> I have started to have 'problems' with my file/web/mail server.  I am
> getting the following message several times over in dmesg output:
>
> hdd: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hdd: dma_intr: error=0x84 { DriveStatusError BadCRC }
> ide: failed opcode was: unknown
>
> Occasionally I will also get:
>
> ide1: reset: master: error (0x7f?)
>
> fdisk shows hdd to be the following (which is correct):
>
> Disk /dev/hdd: 163.9 GB, 163928604672 bytes
> 255 heads, 63 sectors/track, 19929 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/hdd1               1       19929   160079661   83  Linux
>
> and "smartctl -a /dev/hdd":
>
> Model Family:     Maxtor DiamondMax Plus 9 family
> Device Model:     Maxtor 6Y160P0
> Serial Number:    Y46CSYAE
> Firmware Version: YAR41BW0
> User Capacity:    163,928,604,672 bytes
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   7
> ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 0
> Local Time is:    Sat Jan 26 16:42:22 2008 GMT
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> The above errors don't seem to affect general use of the machine,
> however quite concern-making is that recently I have also been getting
> these whenever I try to access *some* parts of the file-system on hdd
> (mounted as /home):
>
> end_request: I/O error, dev hdd, sector 202454167
> end_request: I/O error, dev hdd, sector 202454663
> end_request: I/O error, dev hdd, sector 202454671
> end_request: I/O error, dev hdd, sector 202454167
> end_request: I/O error, dev hdd, sector 63
> Buffer I/O error on device hdd1, logical block 0
> lost page write due to I/O error on hdd1
> end_request: I/O error, dev hdd, sector 127
> Buffer I/O error on device hdd1, logical block 8
> lost page write due to I/O error on hdd1
> end_request: I/O error, dev hdd, sector 86507599
> Buffer I/O error on device hdd1, logical block 10813442
> lost page write due to I/O error on hdd1
> end_request: I/O error, dev hdd, sector 86507695
> Buffer I/O error on device hdd1, logical block 10813454
> lost page write due to I/O error on hdd1
> end_request: I/O error, dev hdd, sector 86507703
> Buffer I/O error on device hdd1, logical block 10813455
> lost page write due to I/O error on hdd1
> end_request: I/O error, dev hdd, sector 202454167
> end_request: I/O error, dev hdd, sector 86507695
> EXT3-fs error (device hdd1): ext3_get_inode_loc: unable to read inode
> block - inode=5407135, block=10813454 Aborting journal on device hdd1.
> end_request: I/O error, dev hdd, sector 4303
> Buffer I/O error on device hdd1, logical block 530
> lost page write due to I/O error on hdd1
> end_request: I/O error, dev hdd, sector 63
> Buffer I/O error on device hdd1, logical block 0
> lost page write due to I/O error on hdd1
> EXT3-fs error (device hdd1) in ext3_reserve_inode_write: IO failure
> end_request: I/O error, dev hdd, sector 63
> Buffer I/O error on device hdd1, logical block 0
> lost page write due to I/O error on hdd1
> EXT3-fs error (device hdd1) in ext3_dirty_inode: IO failure
> end_request: I/O error, dev hdd, sector 63
> Buffer I/O error on device hdd1, logical block 0
> lost page write due to I/O error on hdd1
> ext3_abort called.
> EXT3-fs error (device hdd1): ext3_journal_start_sb: Detected aborted
> journal Remounting filesystem read-only
>
> Upon dropping to runlevel 1, then performing "umount /home" I
> immediately get:
>
> end_request: I/O error, dev hdd, sector 4303
> Buffer I/O error on device hdd1, logical block 530
> lost page write to I/O error on hdd1
>
> (or something like that)
>
> Then a fsck /dev/hdd1 returns with:
> end_request: I/O error, dev hdd, sector 69
> (repeated lots, different sectors)
>
> fsck.ext3: Attempt to read block from filesystem resulted in short read
> whilst trying to open /dev/hdd1 Could this be a zero-length partition?
>
> Indeed, now an "fdisk -l /dev/hdd" shows:
> end_request: I/O error, dev hdd, sector 0
> printk: 30 messages suppressed.
> Buffer I/O error on device hdd, logical block 0
> (blah blah)
>
> Reboot and all is fine again, until I try to do this again... then I
> get errors again.
>
> I'm not really sure where to begin.  I've disabled DMA by adding a
> kernel boot parameter of ide=nodma, but that doesn't seem to affect
> this problem at all.  Booting from another medium and fscking both hda1
> and hdd1 come back fine.
> When the disks are removed and attached to
> another machine via a USB-ATA adapter, all is OK, so I'm inclined to
> think it might be the PATA controller on this motherboard (don't ask me
> what it is, I have no idea), however this machine has been working fine
> for ages... and more concerning I used to get these sorts of errors on
> my "old" server before I retired it and performed a
> harddrive-transplant to this "new" computer, and all was fine for a
> while.
>
> Thanks for reading.  Any ideas?
>
> Cheers.
> Grant.
>

Well, for a start I'd get the data off the drive.

Then you could try downloading the Maxtor diagnostics software from
www.seagate.com (Seagate own Maxtor) and running a complete test on the
drive; you should at least be able to run a complete read test.  If, as
you say, you suspect a controller problem, run the test on two machines.

It is also worth double-checking the jumper settings (are all the drives
set consistently, either to Cable Select or to Master/Slave?) and
considering whether the cable could be faulty.

Hopefully this gives you something to go on.

Rob

--
The Mailing List for the Devon & Cornwall LUG
http://mailman.dclug.org.uk/listinfo/list
FAQ: http://www.dcglug.org.uk/linux_adm/list-faq.html
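
A rough sketch of the complete read test suggested above, assuming Python 3 is available on the affected machine and the script is run as root. The helper names `find_unreadable_sectors` and `sector_to_block` are hypothetical, and the partition offset (63 sectors) and block size (8 sectors, i.e. 4096 bytes) are assumptions inferred from the quoted dmesg output, where sector 63 is paired with logical block 0 and sector 127 with logical block 8. It is not a substitute for the vendor diagnostics, only a way to see which areas fail to read.

```python
import os

SECTOR_SIZE = 512        # bytes per sector, as in the fdisk output
PART_START_SECTOR = 63   # assumption: dmesg pairs sector 63 with hdd1 block 0
SECTORS_PER_BLOCK = 8    # assumption: dmesg pairs sector 127 with hdd1 block 8


def sector_to_block(sector):
    """Map an absolute disk sector to a logical block on the partition."""
    if sector < PART_START_SECTOR:
        raise ValueError("sector lies before the start of the partition")
    return (sector - PART_START_SECTOR) // SECTORS_PER_BLOCK


def find_unreadable_sectors(path, chunk_sectors=128):
    """Read `path` from start to end.

    Returns (bytes_read, bad), where `bad` lists the first sector of
    every chunk whose read raised an I/O error.
    """
    chunk = chunk_sectors * SECTOR_SIZE
    bad = []
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)  # also works for block devices on Linux
        offset = 0
        while offset < size:
            os.lseek(fd, offset, os.SEEK_SET)
            try:
                data = os.read(fd, min(chunk, size - offset))
            except OSError:
                # Record the failing chunk and skip past it.
                bad.append(offset // SECTOR_SIZE)
                offset += chunk
                continue
            if not data:
                break  # unexpected end of device or file
            total += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return total, bad
```

For example, `find_unreadable_sectors("/dev/hdd")` would return the number of bytes read and the starting sector of each chunk that failed, and `sector_to_block(86507599)` gives 10813442, matching the pair reported in the log. Reads go through the kernel's page cache, so recently read areas may be served from memory rather than from the disk.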