On Wed, 25 Nov 2009, Sam Grabham wrote:
Hi, I have had a few boxes start to degrade by giving me "Segmentation fault" errors while trying to use "vi" or "ls". These servers have been in constant use for around 3 years. I would like to repair the install to minimize downtime, as other command tools seem to work OK. I have reformatted and rebuilt in the past, but these servers wouldn't reboot due to this error. I can't see why this is happening, as they all have ECC RAM, are of high-end build and are on good dual UPS feeds. Does anyone else get these sorts of problems?
Unlike Simon, I have had servers segfault due to hardware problems - mostly memory. Even ones with ECC RAM. ECC is good, but not perfect... (Although you usually get error messages on the console if the ECC kicks in.) Recently a "high end" server failed after 3 years, 8 months - it was the memory controller that gave up, according to the BIOS beep codes...
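Since ECC corrections and segfaults normally leave a trace in the kernel log, a quick filter over that log can show whether the hardware has been complaining. The following is only a rough sketch: the search patterns are assumptions (the exact wording depends on the kernel version and driver), and the function name suspicious_lines is made up for illustration.

```python
import re
import sys

# Assumed patterns; real messages differ between kernel versions and drivers.
PATTERNS = [
    re.compile(r"segfault at", re.IGNORECASE),
    re.compile(r"\bEDAC\b"),
    re.compile(r"machine check", re.IGNORECASE),
    re.compile(r"\bmce\b", re.IGNORECASE),
]


def suspicious_lines(lines):
    """Return the log lines that match any of the assumed patterns."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(p.search(line) for p in PATTERNS)
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: save the kernel log to a file first, then pass its path.
    with open(sys.argv[1], errors="replace") as log:
        for hit in suspicious_lines(log):
            print(hit)
```

An empty result does not prove the hardware is healthy; it only means none of these particular strings showed up in the log you fed it.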
If you can, get memtest86+ and give it an overnight run. That doesn't always find everything, but if it does find faults, there are patches to the kernel (they may even be in already, depending on your kernel sources) which can map out bad areas of RAM.
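To illustrate the idea of mapping out bad RAM, here is a small sketch that turns a failing address range into a boot argument. It uses the stock kernel's memmap=SIZE$START parameter, which marks a region as reserved; that is a different mechanism from the patches mentioned above, and is swapped in here only because its syntax is simple. The 4 KiB page size, the example addresses and the function name memmap_arg are all assumptions.

```python
PAGE = 4096  # assumed page size (typical for x86)


def memmap_arg(first_bad, last_bad, page=PAGE):
    """Build a memmap=SIZE$START string covering [first_bad, last_bad].

    The region is widened to whole pages so every failing byte is reserved.
    """
    if last_bad < first_bad:
        raise ValueError("last_bad must not be below first_bad")
    start = first_bad - (first_bad % page)   # round down to a page boundary
    end = (last_bad // page + 1) * page      # round up past the last bad byte
    size_kib = (end - start) // 1024
    return "memmap=%dK$0x%x" % (size_kib, start)


# Hypothetical failing range, as might be read off a memory tester's report.
print(memmap_arg(0x12345678, 0x12345ABC))  # memmap=4K$0x12345000
```

Depending on the boot loader, the "$" may need escaping in its config file, so check the resulting kernel command line after rebooting.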
But there are other things that can affect it. The trouble is, there's not that good a set of diagnostics. However, if you have been applying patches, s/w upgrades, etc., then maybe start down that route before poking at hardware.
Gordon