4. Troubleshooting
Although FreeBSD, OpenBSD and GNU/Linux are very fine pieces of software,
dual- or multi-booting a PC with all (or some combination) of them often runs
into problems. These range from ones that are easy to detect and fix to really
weird ones. Troubleshooting, as every UNIX veteran knows (and appreciates), is
not an exact art. In this chapter, I have listed some problems that I have
faced (and continue to face) while dual- and multi-booting systems with varying
operating systems and hardware.
Readers must note that depending on the particular hardware and
distributions/releases used, they may experience slightly different versions of
the problems listed below. If you encounter a new problem altogether or can
contribute to this Guide by providing a better solution to one of the
problems listed below, make sure you drop me a line at: subhasish_ghosh@linuxwaves.com.
In my own experience, multi-booting systems have produced some weird failures
and often erratic behavior, mostly from Linux's fdisk and installation
procedure. I often made notes of them for later diagnosis and troubleshooting.
My conclusion is that, among the Linux distributions (Red Hat, Slackware, SuSE
and Mandrake), OpenBSD and FreeBSD releases I have tried, FreeBSD's fdisk,
disklabel and overall installation procedure (/stand/sysinstall) are the best of
them all, with no errors or abrupt crashes. Initially, I always thought that
there must be some exact way of installing all these operating systems, or an
exact order that needs to be followed, but over the years I have learnt the
truth: provided the hardware support is fine and the installation CD-ROMs are
okay, the erratic behavior of installation programs is due to unerased data from
previous installs. So before committing yourself to creating a multi-booting
system, make sure you start the installation on a freshly formatted hard disk.
Read on for the details of the procedure! Some of the common questions from
readers include:
Q1. I created an extended partition with logical drives (sub-partitions)
within it. But when I installed FreeBSD, it completely ignored the logical
partitions inside that partition. Is that normal?
Ans1: Yes, it is absolutely normal. FreeBSD can only detect and represent
primary hard disk partitions, which it calls "slices". An extended partition is
itself shown in FreeBSD's fdisk, but the logical partitions inside that
container are not.
Q2. I installed FreeBSD and OpenBSD operating systems on my PC
successfully. When I tried installing Red Hat Linux, Linux's fdisk displayed a "too many partitions" error
message. What does it mean? What can I do now? or,
Q3. I deleted earlier installs of FreeBSD, OpenBSD and/or NetBSD operating
systems on my PC. When I tried installing Red Hat Linux, Linux's fdisk displayed a "too many partitions" error
message. What am I supposed to do now?
Ans2 and Ans3: As readers can see, this error message from Red Hat Linux's
fdisk can occur under a variety of conditions. I encountered both of them
myself while multi-booting PCs. Once this message is displayed, Linux's fdisk
cannot make any changes to the disk. In my experience, the message
"too many partitions (16, maximum is 8)" usually appears when you have
unerased data from previous installs of FreeBSD, OpenBSD and/or NetBSD. This
usually happens when those partitions were deleted, but the Master Partition
Table (MPT) contained in the Master Boot Record (MBR) was not overwritten.
Since this problem does not seem to occur on clean hard disks, I conclude that
it only appears if you have old BSD disklabel data on your hard disk.
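To make the idea of "old BSD disklabel data" more concrete, here is a minimal
Python sketch that scans a raw disk image for label-like structures. It is not
taken from any installer. The magic value 0x82564557 and the field offsets
follow the traditional BSD "struct disklabel" layout as I understand it, and
the sketch assumes little-endian (i386) byte order and a label that starts on
a sector boundary. The function name find_bsd_disklabels is purely
illustrative. One plausible reading of the "(16, maximum is 8)" figures is a
leftover label declaring 16 partitions, but that interpretation is an
assumption.

```python
import struct

SECTOR_SIZE = 512
DISKMAGIC = 0x82564557        # BSD disklabel magic number (assumed layout)
MAGIC2_OFFSET = 132           # second copy of the magic inside the label
NPARTITIONS_OFFSET = 138      # 16-bit count of partitions in the label


def find_bsd_disklabels(image, max_sectors=2048):
    """Return (sector, npartitions) for each label-like structure found."""
    hits = []
    sectors = min(max_sectors, len(image) // SECTOR_SIZE)
    for sector in range(sectors):
        base = sector * SECTOR_SIZE
        (magic,) = struct.unpack_from("<I", image, base)
        (magic2,) = struct.unpack_from("<I", image, base + MAGIC2_OFFSET)
        # Both copies of the magic must match before we trust the count.
        if magic == DISKMAGIC and magic2 == DISKMAGIC:
            (nparts,) = struct.unpack_from(
                "<H", image, base + NPARTITIONS_OFFSET)
            hits.append((sector, nparts))
    return hits


# Synthetic 4-sector "disk": a stale 16-partition label left in sector 1.
disk = bytearray(4 * SECTOR_SIZE)
struct.pack_into("<I", disk, SECTOR_SIZE, DISKMAGIC)
struct.pack_into("<I", disk, SECTOR_SIZE + MAGIC2_OFFSET, DISKMAGIC)
struct.pack_into("<H", disk, SECTOR_SIZE + NPARTITIONS_OFFSET, 16)

leftovers = find_bsd_disklabels(bytes(disk))
```

Run against an image of a disk whose BSD partitions were merely deleted, a
non-empty result would suggest that label data is still lying around.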
Once this error gets flagged, there is not much you can do about it, because
whatever you do or type in, nothing gets written to disk. If you are performing
a clean install and you get this error, run "fdisk /mbr" from the MS-DOS prompt
and try installing Red Hat Linux once again. If you already have
FreeBSD/OpenBSD/NetBSD installed on your system, then (sorry dude!) erase all
earlier installs, wipe the disk clean using the "dd" utility or whatever you
have at your disposal, and start installing everything all over again.
Q4. I was performing a new install of Red Hat Linux on my PC which earlier
had multiple installs of FreeBSD and/or OpenBSD and/or NetBSD. I got a "too many partitions" error
from the Red Hat Linux fdisk. Can't I perform a full install of Red Hat Linux on
my hard disk to remove the BSD disklabel data on my hard disk?
Ans4: No, you cannot. Once this error gets flagged, even fdisk's 'o' option,
which normally clears all partition information, does not work. Even if you
succeed in doing a complete Linux-only "full disk" install using the Disk Druid
disk-partitioning tool, this does not erase the old, already existing BSD
disklabel information. Sooner or later, you will land in trouble.
Q5. I deleted all existing partitions containing data on my PC. But I want
to be sure and wipe my disk clean with some "professional" disk-wiping
utility. Which ones can I use? How would I use them?
Ans5: Third-party partition managers and security software that are
specifically designed to erase disks can be used for this purpose. Personally, I
have no experience with third-party tools for wiping hard disks because I
prefer using the UNIX dd utility.
UNIX's dd can also be used to wipe the disk absolutely clean. It writes zeroes
to the disk surface and is accessible from the FreeBSD, OpenBSD and Linux
install CDs.
On the OpenBSD install CD, dd is easily reached by selecting the shell option
rather than install, or by using [Ctrl+C] to exit the OpenBSD install at any
time. At the "#" prompt, the OpenBSD command to clear the first IDE hard disk
is "dd if=/dev/zero of=/dev/wd0c".
On the Red Hat Linux install CD, rescue mode gives you a single-user prompt.
You do not have to mount the system, as you are going to erase it, not "rescue"
it. Once you get the "#" prompt, "dd if=/dev/zero of=/dev/hda" will clear the
first disk.
On FreeBSD, the "Fixit" option from the main install menu provides access to a
single-user prompt; the second of the four install CDs is needed. The command
"dd if=/dev/zero of=/dev/ad0" appears to clear the first disk.
Once this is done, you can rest assured that your hard disk is wiped absolutely
clean!
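For readers who want to see what "writing zeroes" amounts to, the following
Python sketch does roughly what the dd commands above do, but on an ordinary
image file instead of a real device. It is an illustration only: the function
name zero_fill and the block size are my own choices, and the demonstration
deliberately creates and wipes a throwaway temporary file.

```python
import os
import tempfile

BLOCK_SIZE = 64 * 1024  # bytes written per call; an arbitrary choice


def zero_fill(path, total_bytes):
    """Overwrite the first total_bytes of an existing file with zeroes."""
    zeros = bytes(BLOCK_SIZE)
    written = 0
    with open(path, "r+b") as target:
        while written < total_bytes:
            chunk = min(BLOCK_SIZE, total_bytes - written)
            target.write(zeros[:chunk])
            written += chunk
        # Push the data out of the buffers before we report success.
        target.flush()
        os.fsync(target.fileno())
    return written


# Demonstration on a throwaway 1 MiB image file, never a real device.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"\xa5" * (1024 * 1024))
    image_path = handle.name

wiped = zero_fill(image_path, os.path.getsize(image_path))
with open(image_path, "rb") as check:
    all_zero = not any(check.read())
os.remove(image_path)
```

Note that, just like dd, this overwrites everything from byte 0 onwards, which
on a real disk includes the Master Boot Record.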
Q6. I wiped my hard disk clean using UNIX dd utility. I then performed a
fresh install of Red Hat Linux on my PC followed by a FreeBSD install. I
installed the FreeBSD boot manager into the MBR. Now my system does not boot!
What happened? Where did I go wrong? or,
Q7. I wiped my hard disk clean using UNIX dd utility. I then
performed a fresh install of Red Hat Linux on my PC followed by an OpenBSD
install. I installed the OpenBSD boot manager into the MBR. Now my system does
not boot! What happened? Where did I go wrong?
Ans6 and Ans7: Nothing went wrong during the installs themselves, but you made
a small yet fatal mistake at the very beginning. Always remember that when you
write zeroes to the hard disk surface using the UNIX dd utility, this includes
the Master Boot Record (MBR) area as well. The MBR lies at address (0,0,1),
that is, Cylinder 0, Head 0, Sector 1; in other words, it is the first 512
bytes of the hard disk, which contain the Master Partition Table (MPT) and the
Initial Program Load (IPL) code. When this area gets overwritten with zeroes,
different operating systems react and behave in different ways. They react
strangely because they expect the standard IPL code, but find none.
For example, Red Hat Linux's fdisk reports the zeroed partition table as
invalid and can recreate an empty one; the only IPL code it will install is
LILO or GRUB, not the standard IPL code. OpenBSD simply shows an empty
partition table and has a lot of trouble booting from such hard disks. The only
operating system that is reported to boot flawlessly from a completely zeroed
disk is FreeBSD. Even so, I personally did encounter a situation in which (for
reasons I could not determine) FreeBSD locked up during booting and simply
would not boot.
To prevent all this from happening, just boot from a Windows 9x or DOS v6
boot disk with FDISK.EXE on it. Once the system has booted to an MS-DOS prompt,
enter the following: A:\>fdisk /mbr. This reinitializes the MBR to its normal
state, rewriting the MPT and creating the standard IPL code. Then continue
installing the operating systems as usual.
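The MBR layout described in this answer can be sketched in a few lines of
Python. The offsets used (446 bytes of IPL code, four 16-byte table entries,
and the 0x55 0xAA signature in the last two bytes) are the standard PC layout;
the function name describe_mbr and the short table of partition type names are
only illustrative. The sketch works on a 512-byte buffer, so it can be tried on
the first sector of a disk image without touching a real disk.

```python
import struct

MBR_SIZE = 512
TABLE_OFFSET = 446            # the IPL (boot) code occupies bytes 0..445
ENTRY_SIZE = 16
SIGNATURE_OFFSET = 510        # last two bytes hold 0x55 0xAA

# A few well-known partition type bytes.
TYPE_NAMES = {
    0x00: "empty",
    0x05: "extended",
    0x82: "Linux swap",
    0x83: "Linux",
    0xA5: "FreeBSD",
    0xA6: "OpenBSD",
    0xA9: "NetBSD",
}


def describe_mbr(sector):
    """Summarise a 512-byte MBR: signature, IPL presence, four entries."""
    if len(sector) != MBR_SIZE:
        raise ValueError("an MBR is exactly 512 bytes")
    entries = []
    for index in range(4):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        raw = sector[start:start + ENTRY_SIZE]
        # boot flag, CHS start, type, CHS end, LBA start, sector count
        boot, _, ptype, _, lba_start, count = struct.unpack("<B3sB3sII", raw)
        entries.append({
            "active": boot == 0x80,
            "type": TYPE_NAMES.get(ptype, hex(ptype)),
            "lba_start": lba_start,
            "sectors": count,
        })
    return {
        "has_signature": sector[SIGNATURE_OFFSET:] == b"\x55\xaa",
        "has_ipl_code": any(sector[:TABLE_OFFSET]),
        "entries": entries,
    }


# What a dd-zeroed first sector looks like.
zeroed = describe_mbr(bytes(MBR_SIZE))

# A hand-built sector with one active FreeBSD slice, for comparison.
sample = bytearray(MBR_SIZE)
sample[0] = 0xFA              # pretend some boot code is present
struct.pack_into("<B3sB3sII", sample, TABLE_OFFSET,
                 0x80, b"\x00\x00\x00", 0xA5, b"\x00\x00\x00", 63, 1000000)
sample[SIGNATURE_OFFSET:] = b"\x55\xaa"
populated = describe_mbr(bytes(sample))
```

For the zeroed sector, the summary shows no signature, no IPL code and four
empty entries, which is the state the operating systems above stumble over.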
Q8. I installed Red Hat Linux on my PC, but while installing LILO to the Master Boot Record
(MBR), I got an error that said the installation program could not write this info to the MBR. What
has happened?
Ans8: Either you have locked the MBR of your hard disk, or virus-scanning
software is enabled that prevents writing to the MBR. Enter your computer's
BIOS setup and verify that the MBR is not write-protected. Depending on the
system, you may also already have another boot loader in the MBR that conflicts
with LILO. Try running "fdisk /mbr" from the MS-DOS command prompt and then
installing LILO again.
Q9. I was installing OpenBSD 3.2 on my PC the other day. After creating
partitions in OpenBSD fdisk, when I saved the changes and exited from fdisk
using the option "quit", it flagged a warning, "wd0: no disk label", in bright
white on blue. What does that mean? Did I do something wrong?
Ans9: Absolutely not. You did nothing wrong. When you exit the OpenBSD fdisk
(saving the changes using the option "quit") and the OpenBSD slice is not at
the same offset as that of a previously installed OpenBSD system, fdisk
displays the "wd0: no disk label" message. Though it looks like a warning or an
apparent error, it is actually an important message from the OpenBSD
installation procedure, telling you that although the OpenBSD-specific
partition has been created, the OpenBSD disklabel has yet to be set up. The
disklabel defines the layout of the OpenBSD file systems on the OpenBSD
partition on the hard disk.
In dual- and multi-booting systems, this message is almost a prerequisite for
a correct OpenBSD installation. If you do not get this message after saving
changes in OpenBSD's fdisk, it means that OpenBSD's disklabel is reading and
using information from a previous install. This is a dangerous situation: if
any partition information has changed since that old disklabel was created,
disklabel's behavior becomes erratic and you may encounter strange problems. To
prevent this, make sure you wipe the disk clean before performing a clean
install of OpenBSD.
Q10. I have FreeBSD 4.8-RELEASE and OpenBSD 3.2-RELEASE operating systems
installed on my PC. I tried installing Red Hat Linux on the free hard disk space
that I have. But each time I tried using fdisk or Disk Druid disk-partitioning
tool, it flagged strange error messages. Why is this happening? What can I do to
install Red Hat Linux on my system?
Ans10: An honest answer to this question is: "Nobody knows exactly why!".
Over the years of performing dual- and multi-booting installations, I have
noticed that Red Hat Linux's fdisk and Disk Druid flag the most error messages
whenever OpenBSD is in the vicinity. OpenBSD's fdisk handles the partition
table differently from FreeBSD's and Linux's, and does not conform to the
standards. This may be one of the many possible explanations.
For example, let me describe a real situation. This happened a couple of
weeks back (at the time of writing this Guide). I installed OpenBSD 3.2
followed by FreeBSD 4.8 on a test PC. Then I tried installing Red Hat Linux 8.0
(Psyche) onto the remaining hard disk space on the computer.
Readers must note: I have no faulty hardware on my system, I meet all hardware
compatibility requirements, I wiped my disk clean with the UNIX dd utility
before starting the installs, the memory (physical RAM) on my system is fine
and the Red Hat Linux 8.0 installation CD-ROMs are okay. Yet each time I tried
to install Red Hat onto my system (each time using different boot-time
command-line options), Linux's fdisk started with several dialog boxes. The
first said "Invalid partition on /tmp/hda". When I ignored the error, a new
dialog displayed "Unable to align partition properly. This probably means that
another partitioning tool generated an incorrect partition table, because it
didn't have the correct BIOS geometry. It is safe to ignore, but ignoring may
cause (fixable) problems with some boot loaders". Fdisk then displayed the
"too many partitions" error. Errors, errors and more errors were all that I
got! So I switched to Red Hat's Disk Druid to try my luck!
Red Hat's Disk Druid displayed an "Invalid partition on /tmp/hda" message and,
when that was ignored, showed the OpenBSD partition as unused and /dev/hda as a
BSD/386 partition. I forced the "auto partitioning" process, but it crashed
miserably after some time with an "unhandled exception". I did not save the
crash dump to a floppy because I had no plans to send a bug report.
This happened with Red Hat Linux 8.0 (Psyche). I met the same problems with
Red Hat Linux 7.3 and 7.1. But when I tried installing Red Hat Linux 7.0, it
installed painlessly without any complaints. Could someone out there (or more
specifically at Red Hat Inc.) please explain to me what exactly happened, or
what is so special about Red Hat Linux 7.0 that the others do not have? I am
still trying to figure out this problem!
Q11. I read the PR (problem report) above. I have FreeBSD and OpenBSD
operating systems installed on my PC as well. But I would like to install a
Linux distribution other than Red Hat on my system. Which one do you suggest?
Ans11: If you have read and understood the PR above, then you will see that it
makes no difference which Linux distribution you try installing on your system.
I met similar problems when I tried installing Mandrake 9.0 as well as SuSE 7.0
on my test computer. Mandrake Linux managed to create and format the
partitions, but it too crashed miserably while installing packages. SuSE 7.0
crashed at the very beginning! In the near future, I would like to test how
Red Hat Linux 9 fares on such systems!
Q12. While installing FreeBSD 4.X-RELEASE on my computer, the FreeBSD fdisk
(or disklabel) used an 'X' partition name instead of the /dev/ad[0-3]s[1-4]n
naming scheme. Hey! What exactly happened? What does that 'X' represent?
Ans12: FreeBSD allows 7 partitions per slice (c: is reserved to represent the
whole slice). FreeBSD's fdisk (or disklabel) uses an 'X' partition name instead
of a valid partition name such as "/dev/ad0s1a" if you have created too many
partitions. If you have a partition labeled 'X', you must delete it immediately
without proceeding any further. If you do not, FreeBSD's disklabel will let you
proceed with the entire installation, but when your system comes up after the
install, FreeBSD will not complete the boot sequence. Instead, it displays
error messages and drops into single-user mode for maintenance.
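The naming scheme in this answer can be illustrated with a small Python
sketch. This is not sysinstall's actual logic: the helper name_partitions is
hypothetical, and handing out letters strictly in alphabetical order is a
simplifying assumption. It only shows why there are exactly 7 usable names per
slice once 'c' is set aside, and where the 'X' placeholder described above
would start to appear.

```python
import string

# Letters 'a' through 'h', minus the reserved 'c', leave 7 usable names.
USABLE_LETTERS = [ch for ch in string.ascii_lowercase[:8] if ch != "c"]


def name_partitions(count, disk="ad0", slice_number=1):
    """Return device names for `count` partitions; overflow becomes 'X'."""
    names = []
    for index in range(count):
        if index < len(USABLE_LETTERS):
            letter = USABLE_LETTERS[index]
            names.append(f"/dev/{disk}s{slice_number}{letter}")
        else:
            names.append("X")  # placeholder name described in the text
    return names


nine = name_partitions(9)
```

With nine requested partitions, the first seven get proper names such as
/dev/ad0s1a and the remaining two come back as 'X'.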
Q13. I installed OpenBSD on my system which also has Windows running on
it. After rebooting, when I pressed F2 on the screen, which reads "BSD", OpenBSD
would not boot. I received a "Bad Magic" error message. What does it mean? What
can I do?
Ans13: It means that you have successfully installed OpenBSD on the hard disk
of your PC, but made a fatal mistake while doing so. The boot files required
for booting OpenBSD lie beyond the 1024th cylinder of your hard disk. Check
whether the first operating system on your PC (whether Windows, FreeBSD, NetBSD
or Linux) takes up so much space that it crosses the 1024-cylinder limit. If
so, you cannot install OpenBSD after it on the same hard disk. The installation
may succeed, but OpenBSD will not boot, as it requires its boot files to lie
within the first 1024 cylinders of the hard disk.
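To check where the 1024-cylinder boundary falls on a given disk, the
arithmetic can be sketched as follows. The default geometry of 255 heads and
63 sectors per track is an assumption (a common BIOS-translated geometry, not
something stated in this Guide); substitute the values your BIOS or fdisk
reports. The function names are illustrative only.

```python
SECTOR_SIZE = 512
CYLINDER_LIMIT = 1024


def cylinder_of(lba, heads, sectors_per_track):
    """Cylinder index (0-based) of a logical block address."""
    return lba // (heads * sectors_per_track)


def beyond_limit(lba, heads=255, sectors_per_track=63):
    """True if the sector lies past the first 1024 cylinders."""
    return cylinder_of(lba, heads, sectors_per_track) >= CYLINDER_LIMIT


def limit_in_bytes(heads=255, sectors_per_track=63):
    """Size of the region covered by the first 1024 cylinders."""
    return CYLINDER_LIMIT * heads * sectors_per_track * SECTOR_SIZE


boundary = limit_in_bytes()          # 8422686720 bytes, about 7.8 GiB
last_ok_sector = CYLINDER_LIMIT * 255 * 63 - 1
```

Under the assumed geometry, a partition holding the OpenBSD boot files would
have to start (and keep those files) below roughly 7.8 GiB from the beginning
of the disk.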
Technically speaking, "bad magic" means the following. The magic number is a
short integer which identifies a file as a load module and thereby enables the
kernel to distinguish run-time characteristics about it. For example,
particular magic numbers on a PDP 11/70 informed the kernel (of UNIX SVR2) that
processes could use up to 128K bytes of memory instead of the usual 64K bytes,
and the magic number still plays an important role in current paging systems.
The values of the magic numbers were the values of PDP 11 jump instructions;
original versions of the system executed the instructions, and the program
counter (pc) register jumped to various locations depending on the size of the
header and on the type of executable file being executed. OpenBSD, being an
actual BSD derivative, uses the same style for booting. When the jump
instructions are missing, or in other words lie beyond the first 1024 cylinders
of the hard disk, the boot code reads bogus values in their place and therefore
sees a bogus magic number. Hence, you receive a "bad magic" error and OpenBSD
does not boot.
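As a rough illustration of how a magic-number check works, the Python sketch
below classifies the first two bytes of a header against the classic a.out
magic values (0407, 0410 and 0413 octal). It is a simplification under stated
assumptions: it treats the magic as a 16-bit little-endian value at the very
start of the header, which is not necessarily how the OpenBSD boot loader lays
out or checks its headers, and the function name classify_magic is
hypothetical.

```python
import struct

# Classic a.out magic numbers, written in octal as in the BSD headers.
AOUT_MAGICS = {
    0o407: "OMAGIC",   # text and data loaded together
    0o410: "NMAGIC",   # read-only (shareable) text
    0o413: "ZMAGIC",   # demand-paged executable
}


def classify_magic(header):
    """Name the magic number in the first two bytes of a header."""
    if len(header) < 2:
        return "bad magic"
    (magic,) = struct.unpack_from("<H", header)
    return AOUT_MAGICS.get(magic, "bad magic")


good = classify_magic(struct.pack("<H", 0o413))
garbage = classify_magic(b"\x00\x00")
```

A loader that reads the wrong sectors gets arbitrary bytes where the header
should be, so the lookup fails in the same way as the "garbage" case here.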
Try installing OpenBSD on another hard disk. Or, better, try working with
FreeBSD or NetBSD, which do not have the 1024-cylinder restriction. Or try
shrinking the already existing partition (caution: you may lose invaluable
data!). Best of all, get another PC, dump OpenBSD on it and work!