Wednesday, 15 May 2013

HP mini 110 1006TU upgrade fun - or - problem solving a dying HDD swap partition under linux

The symptoms: random pauses, the screen greying out, and sudden drops in network throughput on a stock-standard HP Mini 110 1006TU with 1GB of RAM, running Ubuntu 12.04 LTS.

Maybe this troubleshooting summary will help someone out there with their linux box. So here it is.

I assumed it was just Ubuntu bloat, but it did seem to come on most predictably when switching quickly between the many running applications, or if the annoying update manager decided to do its thing automatically in the background.

During a particularly long, grinding pause with fairly continuous HDD activity, I brought up a console and noticed a lot of errors in dmesg, like this:
 
May 13 22:38:47 esh-laptop kernel: [31006.614642] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
May 13 22:38:47 esh-laptop kernel: [31006.614652] ata1.00: BMDMA stat 0x5
May 13 22:38:47 esh-laptop kernel: [31006.614659] ata1.00: failed command: READ DMA EXT
May 13 22:38:47 esh-laptop kernel: [31006.614675] ata1.00: cmd 25/00:08:85:ec:51/00:00:12:00:00/e0 tag 0 dma 4096 in
May 13 22:38:47 esh-laptop kernel: [31006.614678]          res 51/40:00:85:ec:51/40:00:12:00:00/e0 Emask 0x9 (media error)
May 13 22:38:47 esh-laptop kernel: [31006.614685] ata1.00: status: { DRDY ERR }
May 13 22:38:47 esh-laptop kernel: [31006.614691] ata1.00: error: { UNC }
May 13 22:38:47 esh-laptop kernel: [31006.636491] ata1.00: configured for UDMA/100
May 13 22:38:47 esh-laptop kernel: [31006.636523] ata1: EH complete


Clearly, the kernel was having trouble with the HDD.
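For anyone curious about where on the disk such an error sits, the failing sector number can be recovered from the "cmd" (or "res") line. This is only a rough sketch: it assumes the usual libata layout for these lines, in which the six LBA bytes are fields 4 to 6 and 9 to 11 of the slash-and-colon separated list, and taskfile_lba is a made-up helper name rather than a standard tool.

```shell
# Decode the 48-bit LBA from a libata taskfile string such as
# "25/00:08:85:ec:51/00:00:12:00:00/e0" (copied from a "cmd" or "res" line).
# Assumed field order: command/feature:nsect:lbal:lbam:lbah/
#                      hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
taskfile_lba() {
    old_ifs=$IFS
    IFS='/:'
    set -- $1            # split the string into 12 hex bytes ($1..$12)
    IFS=$old_ifs
    # Most significant byte first: hob_lbah hob_lbam hob_lbal lbah lbam lbal
    printf '%d\n' "0x${11}${10}${9}${6}${5}${4}"
}

# The failed READ DMA EXT from the log above:
taskfile_lba '25/00:08:85:ec:51/00:00:12:00:00/e0'   # prints 307358853
```

Assuming 512-byte sectors, that is roughly 157GB into the drive. The sector number can then be compared with the partition boundaries that parted reports when asked for "unit s".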

After deciding to do a scheduled "once every 3.5 years if it really seems necessary" backup to an external HDD, and managing to do so without difficulty, I had a play with the internal HDD.

I turned swap off with:

$sudo swapoff -a
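To confirm that swap really is off (or to see which device was in use beforehand), the kernel's list of active swap areas can be read from /proc/swaps. A minimal sketch; active_swap_areas is a hypothetical helper name:

```shell
# List active swap areas from text in /proc/swaps format, read on stdin.
# The first line of /proc/swaps is a header, so it is skipped.
# Prints one line per swap area: "<device or file> <size in KiB>".
active_swap_areas() {
    awk 'NR > 1 { print $1, $3 }'
}

# Typical use (prints nothing once all swap has been turned off):
#   active_swap_areas < /proc/swaps
```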

I then ran gparted.

$sudo gparted

The swap partition was 2.8GB, on /dev/sda5.

gparted had no trouble re-formatting the partition and threw up no errors... which didn't really tell me much about the health of the swap partition.

After quitting out of gparted, the next thing to try was a format of the swap partition with actual checking of the sectors:

$sudo mkswap -c /dev/sda5

About 3 hours and innumerable syslog errors later, mkswap failed, claiming too many errors were found.

Hmmm. Not good.
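Another way to gauge the damage is a read-only scan with badblocks (part of e2fsprogs), which with -o writes one bad block number per line to a file. The sketch below only summarises such a list; summarise_badblocks is a hypothetical helper, and the file name in the usage comment is just an example.

```shell
# Summarise a badblocks list (one block number per line) read from stdin.
summarise_badblocks() {
    awk 'NF { n++; if (n == 1) first = $1; last = $1 }
         END {
             if (n) {
                 printf "%d bad blocks between %s and %s\n", n, first, last
             } else {
                 print "no bad blocks listed"
             }
         }'
}

# Typical use, with swap turned off first (the scan itself is read-only):
#   sudo badblocks -sv -o bad.txt /dev/sda5
#   summarise_badblocks < bad.txt
```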

I decided that the next thing to do was play with the location of the swap partition. To do so required booting off a different drive - but a netbook has no CD/DVD drive.

The easiest way to do this is to download an Ubuntu install .iso for the platform you are using, and find a USB drive to turn into a bootable live Ubuntu drive.

In this case, for the x86 netbook, I downloaded:

ubuntu-12.04.2-desktop-i386.iso

and then ran:

$usb-creator-gtk

This allowed me to choose the image .iso and install it onto a spare 2GB+ USB drive.

I was then able to restart the netbook and hit F9 during startup to tell the BIOS to boot from the USB stick instead of the internal HDD.

I then ran the disk utility from the Dash home, and found, as expected, that there were some bad sectors:
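The same SMART counters can also be read from a terminal with smartctl, if the smartmontools package is installed (an assumption; it is not part of every default install). A small sketch that picks out the sector-related attributes from the attribute table. The attribute names are the common ones but vary between drive vendors, and smart_sector_counts is a hypothetical helper name.

```shell
# Filter "smartctl -A" output (read on stdin) down to the sector-health
# attributes. In the attribute table, column 2 is the attribute name and
# column 10 is the raw value.
smart_sector_counts() {
    awk '$2 == "Reallocated_Sector_Ct" ||
         $2 == "Current_Pending_Sector" ||
         $2 == "Offline_Uncorrectable" { print $2, $10 }'
}

# Typical use:
#   sudo smartctl -A /dev/sda | smart_sector_counts
```

Non-zero raw values for these attributes are generally what a disk utility means when it reports bad sectors.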



I ran gparted again, while booted off the USB boot drive, and was able to shrink the main root and usr partition /dev/sda1 by about 3GB.

Having done so, I then grew the /dev/sda5 swap partition by 3GB into the freed up space.

I then shrank the /dev/sda5 partition by 3GB from the other end, leaving the space formerly occupied by the (apparently somewhat trashed) swap partition unallocated.


The next test was to format the swap partition:

$sudo mkswap -c /dev/sda5

This completed the formatting without complaint within only a few minutes. So, the problem was indeed the bit of disk with the swap partition, which was now unallocated.

The next thing was to see the new UUID for the swap partition:

$sudo blkid
/dev/sda1: UUID="06a5ce9f-60e8-4df4-bf8e-44996bee3601" TYPE="ext4"
/dev/sda5: UUID="87559234-ffb2-43bc-bfd0-5a53026ae558" TYPE="swap"

and see how it compared to /etc/fstab:

$more /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid -o value -s UUID' to print the universally unique identifier
# for a device; this may be used with UUID= as a more robust way to name
# devices that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
proc            /proc           proc    defaults        0       0
# / was on /dev/sda1 during installation
UUID=06a5ce9f-60e8-4df4-bf8e-44996bee3601 /               ext4    errors=remount-ro 0       1
# swap was on /dev/sda5 during installation
UUID=72b5bb58-3d8a-4dc9-888e-2b86db41c81c none            swap    sw              0       0

So, the UUID had changed, and the swap entry in /etc/fstab needed updating to the new value.
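The comparison can also be scripted rather than done by eye. A minimal sketch, assuming a single UUID-based swap entry in fstab; both function names are hypothetical:

```shell
# Print the UUID of the first UUID-based swap entry in an fstab-format file.
fstab_swap_uuid() {
    awk '$1 ~ /^UUID=/ && $3 == "swap" { sub(/^UUID=/, "", $1); print $1; exit }' \
        "${1:-/etc/fstab}"
}

# Succeed (exit status 0) when fstab's swap UUID differs from the one given.
# usage: swap_uuid_is_stale ACTUAL_UUID [FSTAB_FILE]
swap_uuid_is_stale() {
    [ "$(fstab_swap_uuid "${2:-/etc/fstab}")" != "$1" ]
}

# Typical use:
#   actual=$(sudo blkid -s UUID -o value /dev/sda5)
#   swap_uuid_is_stale "$actual" && echo "fstab needs updating"
```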

$sudo gedit /etc/fstab

A quick cut, paste and save, and fstab had the new UUID in it.
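For a box without a desktop, the same edit could be done with sed instead of a graphical editor. This is a cautious sketch rather than what I actually ran: replace_fstab_uuid is a hypothetical helper that only prints the modified file, so the result can be checked before anything in /etc is touched.

```shell
# Print an fstab-format file with one UUID swapped for another.
# The input file itself is left untouched.
# usage: replace_fstab_uuid OLD_UUID NEW_UUID [FSTAB_FILE]
replace_fstab_uuid() {
    sed "s/^UUID=$1\([[:space:]]\)/UUID=$2\1/" "${3:-/etc/fstab}"
}

# Typical use, with a diff to review the change before copying it into place:
#   replace_fstab_uuid 72b5bb58-3d8a-4dc9-888e-2b86db41c81c \
#       87559234-ffb2-43bc-bfd0-5a53026ae558 > fstab.new
#   diff /etc/fstab fstab.new
#   sudo cp fstab.new /etc/fstab
```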

Swap was then turned back on with:

$sudo swapon /dev/sda5

On rebooting, the new swap area with its new UUID came up automatically, as expected.

Here are before and after shots showing swap off and then back on, by running the following command:

$top



No more syslog errors either.

Part of the reason the swap partition had been trashed is that Ubuntu has become a bit bloated, and 1GB of RAM isn't a lot these days for running a few applications at once, so the swap area sees heavy use. Ubuntu struggles even to install on a box with 512MB of RAM, sometimes hanging.

To avoid excessive swap use, it seemed time to max out the RAM to 2GB.

The 1GB DDR2 667MHz 200-pin SDRAM SODIMM was removed (with due attention to static precautions) after undoing the two screws on the panel under the keyboard to get at it. It was the following part:

SPS-MEM, 1G, PC5300 537664-001

A 2GB DDR2 800MHz 200-pin SDRAM SODIMM was found at msy.com.au for around AUD$30. It was not PC5300 but PC6400, i.e. rated for 800MHz rather than 667MHz. Not a problem, as a faster-rated module will generally just run at the slower speed. It's working a treat, as evidenced by the screenshot above, showing "Mem:   2052568k total".
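The new totals can also be checked without top by reading /proc/meminfo, which reports its values in kB. A small sketch; meminfo_mib is a hypothetical helper name:

```shell
# Print one /proc/meminfo field, converted from kB to whole MiB.
# Reads meminfo-format text on stdin.
# usage: meminfo_mib FIELD < /proc/meminfo
meminfo_mib() {
    awk -v key="$1:" '$1 == key { print int($2 / 1024) }'
}

# Typical use:
#   meminfo_mib MemTotal  < /proc/meminfo
#   meminfo_mib SwapTotal < /proc/meminfo
```

Fed the 2052568k figure that top showed, this works out to 2004 MiB, a little under the nominal 2GB because part of the RAM is reserved before the kernel counts it.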

It reminds me of a rule of thumb for un*x boxes... keep adding RAM until the HDD light goes out.

If you've made it this far, you're probably wondering "why spend money on an old machine?". Well, it's compact, does the job, and linux allows me to make do with ancient hardware that would otherwise get thrown out. I'm happy to spend a bit to keep a machine running for another year or two or three.

The next issue is the hard drive. Many would argue that a hard drive with a growing number of bad sectors is probably in a bit of a death spiral, so the next job is to replace the HDD. These have become crazy cheap too: AUD$50 for a 5400RPM 500GB Seagate, also at msy.com.au. $50 is cheap insurance if it avoids a catastrophic data loss.
