The (unofficial) RedHat 7 Customised Installer mini-HOWTO

This document describes the basic methodology for creating customised network and cdrom installation images for redhat 7.2 and 7.3.
Home for this web page is http://www.linuxworks.com.au/redhat-installer-howto.html
Tony Nugent <tony@linuxworks.com.au>
27/28th January, 20th February 2002 - Initial release
23rd, 24th March 2002 - Many updates, additions, etc.
11/12th May 2002 - Version 0.4 (beta) - rewrite and updated for redhat 7.2 and 7.3
28th May 2002 - Version 0.5 (not made public) - corrections, clarifications and additions. - major rewriting of many sections. - more streamlined presentation.
31st May - 2nd June 2002 - Version 0.6 - many cleanups, added more materials, etc. - this version will hopefully become public real soon :)
6th June 2002 - Version 0.7 - more cleanups, corrections, updates, re-arrangements, additions.

October 2002 - This document does NOT mention redhat 8.0. Updates are pending...

Meanwhile, Luigi Bitonti has created an updated document, Burning a RedHat CD HOWTO.

This document is under occasional (and irregular) development. Sometimes links to local resources are broken or missing... yes, I know about them and there are (usually) reasons why they aren't working (yet).

All comments, additions, errata and other such feedback will be gratefully accepted.
Contributions are especially welcome (please!)

Many thanks to others (both mentioned and not mentioned below here) for their valuable feedback and (sometimes unsolicited) contributions (taken from mailing lists etc).
Especially (and in no particular order): Chuck Moss <mossc@mossc.com>, Michael McGillick <mmcgillick@attbi.com>, Seth Vidal <skvidal@phy.duke.edu>, Jeremy Katz <katzj@redhat.com>, Peter Bowen <pzb@datastacks.com>, Forrest Taylor <forrestx.taylor@intel.com>, Brian Ipsen <Brian.Ipsen@andebakken.dk>, Martin Stricker <shugal@gmx.de>, Alf Wachsmann <alfw@slac.stanford.edu>, John Sheahan <jrsheahan@optushome.com.au>, James Olin Oden <joden@lee.k12.nc.us>, Scott Sharkey <ssharkey@linux-no-limits.com>, Dongwon Kim <prdd@finf.net>.
(FIXME: complete this list of people who need to be acknowledged)
Special thanks to Scott Sharkey for originally pointing me in the right direction with how to do all this.

I wrote this. It might all be lies, you take the risk.
It works for me, if you are lucky it might work for you too.
If anything you find here is broken or it breaks other things, you get to keep all the pieces.
If you fix or improve anything, please let me know and I'll definitely consider adding it to the clutter.
Permission is given to redistribute this document, I just ask you to please let me know about it.
If you do reproduce or use any part of this document, kindly give me (and others) due credit.

Ten Easy (??) steps:

  1. Prepare the build system
  2. Prepare the build source tree
  3. Generate a new hdlist
  4. Rebuild the anaconda installer runtime images
  5. Generate a package order file (if necessary)
  6. Split the installation image into several CDROM images
  7. Re-generate new hdlist files (again)
  8. Create the iso9660 filesystem images
  9. Burn (and test) the cdroms
  10. Enjoy! (putting it all together)

Here is a tarball of everything you will find here.


Introduction

It is possible to create highly customised redhat-based linux installation disks, and people are doing this (especially in educational campus or enterprise/corporate environments) where there is a need to deploy linux distributions and have them installed in customised ways.

If you are attempting to create your own customised redhat installation disks, then this unofficial mini-howto will (hopefully) prove to be a useful resource.

Fortunately RedHat has provided the anaconda tools that are needed to completely (or partially) recreate new installation disks.
Unfortunately, there is a distinct lack of documentation that describes how to use anaconda to create the actual installation images.

This document is an attempt to fill this information gap, to reveal to all just how the magic is supposed to work.
Hopefully the need for this howto will be short-lived, deprecated when it is eventually replaced by the anaconda documentation itself.

Even with the arrival of rh7.3, the documentation that comes with anaconda refers only to the installation options that can be passed to the installer at installation time, and how to use kickstart.
While this is valuable - and essential - reading, there is still no description there for how to create modified installation images using the anaconda tools.

"Use the source" - which is fair-enough advice in this case. Comments in the source code are few and far between, which is unfortunate for the casual observer. But if you know a little about shell and perl script programming, then python is a relatively easy language to read and understand what is going on. (Hint: python relies heavily on formatting, such as indenting, to delineate program structures such as functions, loops and conditionals).
(FIXME: links to resources for python documentation. Python Home Page. There is a python-docs rpm, but it is in .tex format and you need to produce readable docs from that).
Most of the anaconda tools are either python or shell scripts, so get out your favourite editor and start hacking at what you'll find in the /usr/lib/anaconda/ and /usr/lib/anaconda-runtime/ directories. (If you go ahead and do what is to follow, you'll need to do this anyway! :-)

My basic approach and aim here is to create customised rh7.x installation cdroms that have already had all the update packages added to them.
This avoids having to go through the tedious and often perilous hassle of applying all the updates after doing an installation using the original release cdrom images - it is not necessary since the updates are already there!

As of May 2002, the total size of all the i386 update packages for rh72 had bloated to almost 1100Mb (including the .src.rpms).
And less than two weeks out from its official release, there are already 191Mb of updates to redhat 7.3 (kernel, evolution updates).
Repeatedly applying all that lot to every new installation is a real headache... kernels (generally) need to be installed and not upgraded or freshened (although this has recently changed), some packages are new and need installing before some of the updates, you need to choose between freshening with the glibc-i386 or glibc-i686 package and so on. Yuk! Much better to have as many of the updates in there right at the initial installation.

The resulting modified installer disks can also be used to "freshen" (rpm -Fva *.rpm) an existing running installation. (Although this too can be a perilous task, and if you need to do this on a number of similar computers it can become tedious if it isn't scripted in a sane way).

Other people have created utilities for doing these updates in a fairly seamless way. Here are some references that may prove useful for going about doing updates on running installations in "sane" ways (eg, taking care of dependencies, new packages, special requirements and so on).

up2date
Redhat offers its own solution, aka up2date and the rhn* (Red Hat Network) utilities. This suits some people's needs and requirements very nicely. But not mine. I very much want (among other things) to have much more control over the updates, when they are applied, where they are downloaded from, what happens to the downloaded files (eg, I care about their timestamps), and when and how much bandwidth the utility uses.
My own approach to managing updates is to download them all (once) into a local location (preserving their timestamps), and have all the other boxes update from there. I also want to "reuse" the downloaded rpms (to rebuild updated installers! :) Last time I looked at it, up2date didn't allow you to do anything like that. (Perhaps things have changed...)
If rhn and up2date work for you, then that's great - use it.

apt4rpm
apt4rpm is described at http://freshmeat.net/projects/apt4rpm/?topic_id=147%2C257 as...
the superb Debian package installer APT has since some time become available for RPM-based distributions. However, to install RPM packages with apt, an apt repository is needed. apt4rpm creates the apt repository from an RPM repository. The rpm repository can be located locally or remotely. Once the apt repository has been created with apt4rpm, the apt tools can be used to install RPMs.
It has a homepage at http://sourceforge.net/projects/apt4rpm/, and rpms can be obtained from freshrpms.net, where they have an excellent web page describing apt and how to use it.

autoupdate
autoupdate (by Gerald Teschl <autoupdate@mat.univie.ac.at>) "is a Perl script which performs a task similar to RedHat's up2date or autorpm. It can be used to automatically download and upgrade rpms from different (s)ftp or http(s) sites. Moreover, it can also be used to keep a server with a customized (RedHat) distribution plus all clients up to date. I have tried to write it in such a way that it is not RedHat specific and hence it should work with any other rpm based distribution as well."
It can be downloaded from its home page at http://www.mat.univie.ac.at/~gerald/ftp/autoupdate/index.html.

other contributions
rh-update.pl is a perl script taken from the kickstart HOWTO at linuxdoc.org described as useful for "munging updated RPMs into the RedHat distribution area". It has been described as a "very good basic idea... but unfortunately it seems to lack a few features (like understanding different archs and comparison between version is done by strings)".
I have never used it, you might find it useful for your own needs.

rpmupdates.sh is a small #!/bin/bash shell script used for updating an RPM repository. I have never used it, it doesn't look very "smart" but might be a good starting place. Apologies to its now anonymous author.

FIXME: This is where more links to other tools and contributions will appear.

mailing lists

If you are seriously interested in building customised redhat-based installation cdroms, then I strongly urge you to subscribe to two (moderately low-volume) redhat mailing lists... anaconda-devel-list and kickstart-list.
To subscribe, send email to {kickstart,anaconda-devel}-list-request@redhat.com with "subscribe" in the Subject: line of the message.
The web pages and archives for the mailing lists can be found here, but unfortunately, the mailing list archives are not searchable there.
Kickstart-list mailing list <Kickstart-list@redhat.com>
https://listman.redhat.com/mailman/listinfo/kickstart-list
Anaconda-devel-list mailing list <anaconda-devel-list@redhat.com>
https://listman.redhat.com/mailman/listinfo/anaconda-devel-list
Go to the above URLs (where you can subscribe), go to the archives for either kickstart-list or anaconda-devel-list, download the mbox format mail files, then either grep through them or import them into your mail client and search with that.
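If you go the grep route, a plain grep tells you which line matched but not which message it belongs to. Here is a minimal sketch (not something from the list admins, just one way to do it) that prints the Subject: of every message containing a pattern. The function name search_mbox is made up for this example, and it assumes ordinary mbox files where every message starts with a "From " line.

```shell
# Print "file: Subject: ..." for every message in the given mbox files
# whose headers or body match PATTERN (case-insensitive regular expression).
# Usage: search_mbox PATTERN file.mbox [more.mbox ...]
search_mbox() {
    pattern=$1
    shift
    awk -v pat="$pattern" '
        # Print the subject of the message just finished, if it matched.
        function report() {
            if (hit && subj != "") print fname ": " subj
            hit = 0
            subj = ""
        }
        /^From /                   { report(); fname = FILENAME }  # new message starts
        /^Subject: / && subj == "" { subj = $0 }                   # first Subject: header only
        tolower($0) ~ tolower(pat) { hit = 1 }
        END                        { report() }
    ' "$@"
}
```

For example, search_mbox splitdistro *.mbox lists the subject of every message that mentions splitdistro, which is usually enough to decide which threads are worth reading in full.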

Kickstart and Anaconda

Many (most? all?) people who build customised installations also want to be able to automate the process so that there is no need for any manual intervention at all for it to do its job. This is exactly what kickstart allows you to do.

Kickstart allows you to totally automate an installation while giving a very high degree of control over what the installer does (how partitions are created, what filesystem types to use, what packages are selected, how the box is configured, etc) by providing installer directives and hooks to run customised scripts and so on... basically, boot into the installer, have a coffee while you let it go and do its stuff, and next time you look it will be booted up all installed, configured and running ready to go exactly how you would like it. Repeat as many times as you like. Very cool.

Kickstart is an integral part of the anaconda installer, they go hand-in-hand. If you use kickstart, you are using anaconda. If you are building deployment distributions with the anaconda tools, you are neglecting some of its most powerful features if you don't consider using kickstart to script its behaviour.

The official Red Hat kickstart documentation can be found in the redhat 7.3 anaconda documentation directory.
This is the definitive guide for the current 7.3 release.

John Sheahan <jrsheahan@optushome.com.au> has an interesting document "How to make a new kickstart Redhat bootdisk" at
http://www.reptechnic.com.au/kickstart.html. (Local copy here). This document is especially interesting as it shows how to manually "pull apart" the installer images, modify them for your own purposes, then re-package and put them all back again.
His page refers to another "RedHat Linux KickStart HOWTO" that can be found at the RedHat Linux KickStart Information Page.
Beware however, that this HOWTO is dated 11th Jan 1999, so the document is getting rather aged (redhat 6.x vintage).
Interesting that this page refers to a howto-kickstart mailing list, details at http://mail.gnu.org/mailman/listinfo/howto-kickstart (I have never been subscribed, the list archives can be found here: http://mail.gnu.org/pipermail/howto-kickstart/).

FIXME: add more resource links for kickstart

Previous versions of RedHat - 7.0 and 7.1

The basic principles described here are likely to apply to earlier redhat 7.x distributions.
It would be interesting to know if the anaconda packages from redhat 7.3 or 7.2 can be adapted to work to rebuild installers for earlier releases. If you are able to adapt things, then that's great (please let me know your wonder!)
Wouldn't it be brilliant if current and future versions of anaconda could successfully ("out of the box") recognise and build any previous redhat distributions?
Wishful thinking I'm sure. But it would be a nice goal to aim for :)

These two redhat distributions have a reputation of being rather buggy and problematic in places. Personally, I would recommend that if you are building 7.x installers, then use either 7.2 or 7.3.

I have had no direct experience using anaconda to create rh70 and rh71 installers. I had no idea that anaconda could do any of this so comprehensively until rh72 arrived, so I am not "supporting" these platforms. When I did try to build customised installers for them (without using anaconda - except for genhdlist), the resulting disks either completely failed to work or required much annoying disk-swapping during the installation.

Anaconda has been developing over 7.x distros, so there are differences and per-release customisations. Essentially, it means that you need to use the anaconda package that came with the distribution you are building, and best (perhaps not a requirement) that it is done on a box that has that particular distro installed.

There are known bugs, problems and deficiencies with these earlier versions of anaconda. My recommendation would be to check the mail archives and bugzilla for details. (FIXME: provide some links).

One example of such a difference is that prior to 7.2, the "splitdistro" utility did not exist in anaconda. (Does anyone know how this was done in releases prior to rh7.2?)

Previous versions of RedHat - 5.x and 6.x

It really needs to be recognised that these distributions of redhat are getting aged (especially 5.x). If you are building new or modified installers, then you should be using the latest and greatest and not wasting time working with old distributions.

RedHat no longer supports 5.x, so you are on your own with that. They still support the 6.2 distribution (probably for not all that much longer), but now only for things like essential critical security updates. Since its release, there have been many updates issued for 6.2 (approx 700Mb in the current collection).

Any redhat 6.x box built from the original release installation disks is a total security hazard on a network unless it has been updated and properly configured.

Also, due to some fundamental changes to tools that have been updated -- like rpm itself -- it is now very difficult to get the 6.2 installer working with these without some patching to the installer.

I have had considerable experience using redhat 6.2, and I must praise it (patched!) for being one of the most stable and reliable server "grunt box" platforms that I have ever seen.
If you are looking for a proven stable hassle-free server box and don't need support for all the latest fancy hardware or software features, then redhat 6.2 will certainly qualify for the job.
With notable exceptions, it is possible to rebuild many rpm packages released for redhat 7.x on redhat 6.2 and they will work just fine.

Details for how to create customised install disks for previous versions of redhat (6.2 and before - "2000/03/02 16:28:37 Revision 1.3") can be found here:
http://ha.redhat.com/docs/CD-HOWTO/RedHat-CD-HOWTO.html
While many of the things described in these pages are still valid (and I strongly urge you to refer to it), it does not work for redhat 7.x because the new multi-disk distribution format has changed many things. Anaconda itself has also improved dramatically, and become central to the disk image creation process.

Peter Bowen (<pzb@datastacks.com>) wrote a small but useful howto for creating redhat 6.2 installers. It references a perl script that "overlays a set of RPMS on another directory of RPMs deleting old versions as necessary."
(For posterity, the howto is duplicated here, the perl script is here).

FIXME: A couple of years ago I also wrote a fairly comprehensive howto on this same subject based on my experiences with redhat 5.x and 6.x... if I find a copy of this in my archives then I will include it here as a historical reference.


Step 1 - Prepare the build system

Start off by doing this on a "standard" redhat 7.2 or 7.3 installation. Apply all the current updates to it, including the latest kernel rpm. Do it, you have been warned.
It is highly recommended that you build the new installer disks on a system running the same release as the one you are trying to build. (eg, build a rh72 distribution on a rh72 box using anaconda-7.2).

Install the anaconda and anaconda-runtime packages. Most of the good stuff gets put into /usr/lib/anaconda/ and /usr/lib/anaconda-runtime/. If you are using rh73, then you would do well to read the new additions to the anaconda documentation on your hard drive.

This next step is not strictly necessary...
If an updated installer disk exists (eg, for redhat 7.2), then it could be useful to use the installer updates disk to update your installed version of anaconda.
Loop-mount the disk image and replace the files in your /usr/lib/anaconda/* tree with the corresponding new versions contained in the update. (Some of them go into the anaconda subdirectories, easy to find them).
Note that this will update the running version of anaconda on your hard drive, but it has no effect on the installer itself... what gets put into the installer is taken directly from the anaconda packages in the RedHat/RPMS/ directory (along with other things like the kernel-BOOT package). (It would be (technically) possible to repackage the anaconda rpms patched with the updates, and put that in the RedHat/RPMS/ directory so it will be used).
Keep that update disk image, it is needed for the next step.

If you plan to make a lot of changes to anaconda, and you don't want to "destroy" or "meddle with" the original installed version, then one idea would be to copy all the original files to a new location and use this copy for running your hacked and otherwise modified version of anaconda:

  
	# cd /redhat
	# cp -ar /usr/lib/anaconda .
	# cp -ar /usr/lib/anaconda-runtime .
	# grep -rlI /usr/lib/anaconda anaconda anaconda-runtime | \
	      xargs perl -pi.bak -e 's,/usr/lib/anaconda-runtime,/redhat/anaconda-runtime,g; s,/usr/lib/anaconda,/redhat/anaconda,g'
	# cd anaconda
	# echo make changes to any anaconda files
	# cd ../anaconda-runtime
	# echo make changes to any anaconda-runtime files
  
Warning: in theory this is likely to work, but it has not been tested by me!
The perl commands will change all references to /usr/lib/anaconda* in the anaconda scripts to refer to those in your new location. If any binary files or programs get destroyed in the process, don't blame me:)
FIXME: I really need to test to see if this works, or how to make it work if it doesn't (short of recompiling it with patches).

BTW, if you don't know how to loop-mount a floppy or iso disk image:
  
	# mkdir img/
	# mount -o loop file.img img/
If mount complains about an unknown filesystem type, then you might need to add a "-t vfat", "-t iso9660" or "-t ext2" switch to specify what you are using. (Recent versions of mount are clever enough to probe for and auto-detect the filesystem type).
You can now directly access the contents of the disk image file.img in the directory img/.


Step 2 - Prepare the build source tree

This is the trickiest part of the whole process. This is where you do your own customisations.

Despite being a primary motive for creating redhat installation images, the nitty-gritty involved with designing customised redhat-based installation images is beyond the immediate scope of this discussion. It is assumed that the reader has some fundamental awareness of what is involved, such as the roles of the various files in the RedHat/base/ directory like comps. Full details can be found elsewhere, such as the URL at ha.redhat.com referred to above.
What is described here are the steps required to produce working installation disks from your customised install image.

FIXME: provide a URL that points to the most recent description of the comps file (et.al.)

To get started with a new fresh installation image, copy the entire contents of all of the original redhat 7.x installation cdroms into a directory called "i386/" on a partition that has plenty of space available for what you will need to do (several gigabytes).
For example purposes, I'll use /redhat/i386/ (where /redhat/ could be a symbolic link to its real location).

For more information about doing this, see the README file in the base directory of the distribution cdroms.
All the binary RedHat/RPMS/*.rpm files from all the installation disks should now be together in the same directory, like it should be for setting up an NFS export image for network installs, exactly as described in the README.
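As a rough sketch of the copying step (the README remains the authority here), the helper below merges the contents of one mounted CD into the build tree. The name copy_disc and the /mnt/cdrom mount point are only assumptions for illustration. Run it as root, once per installation CD, so that files copied read-only from an earlier disk can be overwritten by later ones.

```shell
# Copy the full contents of one mounted installation CD into the build tree.
# Usage: copy_disc /mnt/cdrom /redhat/i386     (repeat for each CD)
copy_disc() {
    src=$1
    dest=$2
    [ -d "$src" ] || { echo "not a directory: $src" >&2; return 1; }
    mkdir -p "$dest" || return 1
    # "$src/." also picks up hidden files in the top directory of the CD;
    # -a preserves timestamps, modes and ownership.
    cp -a "$src/." "$dest/"
}
```

Because every disk is copied into the same destination, the RedHat/RPMS/ directories of all the CDs end up merged into one, which is the layout described above.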

What you should have at this point is the installation root image directory trees organised in this manner:
|-- redhat
    |-- SRPMS
    |   `-- SRPMS
    |-- i386
    |   |-- RedHat
    |   |   |-- RPMS
    |   |   `-- base
    |   |-- dosutils
    |   |   |-- autoboot
    |   |   |-- fips15c
    |   |   |   |-- restorrb
    |   |   |   `-- source
    |   |   |-- fips20
    |   |   |   |-- restorrb
    |   |   |   `-- source
    |   |   |-- fipsdocs
    |   |   `-- rawritewin
    |   |-- images
    |   |   |-- de
    |   |   |-- es
    |   |   |-- fr
    |   |   |-- it
    |   |   |-- ja
    |   |   `-- pxeboot

I would like to see the SRPMS directory become optional so that if the SRPMS/ directory does not exist, then they are simply not processed. This is very useful for creating binary-only installer images. (The version of splitdistro below here works like this).

The RedHat/, RedHat/base/, RedHat/RPMS/, images/ and dosutils/ directories (as found on the CDROM) are standard (and expected) in the i386/ build source tree.

If you add any additional directories to your main /redhat/i386/ directory, then they will be added to the first installation disk image exactly as is. (When the install image trees are created, account will be taken of the size of the files in these directories).
Note that if you want to add files to the base directory, then it is likely that you will have to patch splitdistro to include them too. (These files usually get put onto each image, not only the first one). Or put them there yourself once the images are created.

This is a very useful feature! With rh7.2 I had been including a "software/" directory that included additional packages such as openoffice. For redhat 7.3, there exists a 56Mb documentation cdrom. I have created an i386/docs/ directory and copied the contents of this iso into it... it will automatically end up on the first of the installation cdrom images (in i386-disc1/).
    |   |
    |   `-- docs
    |       |-- RH-DOCS
    |       |   |-- maximum-rpm-1.0
    |       |   |-- pdf-en
    |       |   |-- rhl-cg-en-7.3
    |       |   |-- rhl-gsg-en-7.3
    |       |   |-- rhl-ig-x86-en-7.3
    |       |   `-- rhl-rg-en-7.3
    |       |-- RedHat
    |       |   `-- RPMS
    |       |-- SRPMS
    |       `-- figs

Note: small point, but if you do this then you may want to delete the (empty) i386/docs/.disc5-i386 file that exists in this iso image.
The html links in the documentation work for me, but you may want to verify them yourself.
Beware that adding additional content to the first disk image will "push" more of the rpm packages off the first disk, putting more of them onto the third (last) disk image (where there is currently enough space to accommodate them). This may have consequences if you are attempting to build installers that avoid using the third disk during the installation (see below).

I then have a directory structure on the same physical filesystem and level as the i386/ tree, organised so that it resembles the structure of the redhat updates on the ftp mirror sites:
    |
    |-- updates
    |   |-- i386
    |   |-- i486
    |   |-- i586
    |   |-- i686
    |   |-- SRPMS
    |   |-- noarch
    |   |-- images
    |   `-- athlon
This allows me to hard-link the updates from there into the i386/RedHat/RPMS directory of the installation tree, which saves a lot of hard drive real estate by not having physically duplicated files on that (or any other) disk.
For example: ln /redhat/updates/i386/package.i386.rpm /redhat/i386/RedHat/RPMS/
cp -al does the same thing, it is how splitdistro does it. (Hint: midnight commander, /usr/bin/mc is a very useful utility for managing large numbers of files in these sorts of ways).
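Doing that ln by hand for every update gets old quickly, so here is a minimal sketch of a loop that hard-links all the binary updates in one go. The function name link_updates is made up, and the list of architecture directories is simply taken from the tree shown above. It deliberately skips SRPMS/ and images/. Note that it does NOT remove the older versions of the packages it supersedes, and it links every architecture variant it finds, so you still have to weed out what you don't want.

```shell
# Hard-link every binary update rpm into the installation tree.
# Both directories must be on the same filesystem for hard links to work.
# Usage: link_updates /redhat/updates /redhat/i386/RedHat/RPMS
link_updates() {
    updates=$1
    rpms=$2
    [ -d "$rpms" ] || { echo "not a directory: $rpms" >&2; return 1; }
    for arch in i386 i486 i586 i686 athlon noarch; do
        for f in "$updates/$arch"/*.rpm; do
            [ -f "$f" ] || continue        # glob matched nothing, or directory missing
            # -f replaces a file of exactly the same name if it is already there
            ln -f "$f" "$rpms/" || return 1
        done
    done
}
```

Afterwards, ls -l in RedHat/RPMS/ shows a link count of 2 (or more) for the files that came in this way, which is a handy way to see at a glance which packages are updates.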

Also in the /redhat/ "working" directory, I have some other dirs (or symlinks of these names to the real directories)...
    |
    |-- memtest86
    |-- tomsrtbt
    |   `-- addons
    |-- mindi
    |   `-- (whatever)
I use these as source directories for adding bootdisk images to the normally non-bootable installers. It is an advantage here to have these files on the same filesystem as the others, files can be hard-linked (rather than copied) which, again, can save resources.
More on this below.

To show you where this is all going, the end result of the process described here will be the creation of a series of directories beside the main i386/ directory, populated with the RedHat/RPMS/* files distributed across them so that they can be used for creating installation CDROM images:
    |
    |-- i386-disc1
    |   |-- RedHat
    |   |   |-- RPMS
    |   |   `-- base
    |   |-- dosutils
    |   |   |-- ... etc ...
    |-- i386-disc2
    |   |-- RedHat
    |   |   `-- RPMS
    |-- i386-disc3
    |   |-- RedHat
    |   |   `-- RPMS
    |   `-- SRPMS
    |-- i386-disc4
    |   `-- SRPMS
    `-- i386-disc5
        `-- SRPMS
(Note that for redhat 7.2, there are only four i386-disc?/ directory trees created, and there are some minor differences in the directory structure).
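Once those directory trees exist, it is worth checking that each one will actually fit on a CD before going any further. This is only a sketch: the function name check_disc_sizes is invented, and the limit you pass in (650 in the usage line below) is an assumption about your blank media, not a number taken from anaconda. The iso9660 image will also be slightly bigger than what du reports, so leave yourself some headroom.

```shell
# Report the size of each disc tree in megabytes and flag any over the limit.
# du is run once per directory so that files hard-linked between the trees
# are still counted in every tree they appear in.
# Usage: check_disc_sizes 650 /redhat/i386-disc?
check_disc_sizes() {
    limit_mb=$1
    shift
    status=0
    for d in "$@"; do
        [ -d "$d" ] || { echo "$d: not a directory" >&2; status=1; continue; }
        kb=$(du -sk "$d" | awk '{print $1}')
        mb=$(( (kb + 1023) / 1024 ))          # round up to whole megabytes
        if [ "$mb" -gt "$limit_mb" ]; then
            echo "$d: ${mb}MB TOO BIG (limit ${limit_mb}MB)"
            status=1
        else
            echo "$d: ${mb}MB ok"
        fi
    done
    return $status
}
```

The return status is non-zero if any tree is over the limit (or missing), so it can be used as a sanity check in a build script before the images are created.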

Now get rid of any messy "leftovers" that came with the cdrom...
The ugly TRANS.TBL files can go, so can boot.cat (on bootable cdroms) and any rr_moved/ directories that may have been copied over:
	# find /redhat/i386 \( -name TRANS.TBL -o -name rr_moved -o -name boot.cat \) -prune -exec rm -rf {} \;

Now go ahead and do your customisations... carefully and attentively replace, add or remove packages as you like in the i386/RedHat/RPMS/ directory.

For example, people have built customised "slim" installation disks designed to fit everything that is needed onto only one cdrom image, often adding kickstart functionality for using with it.
These installations have obviously been trimmed to the "essentials" and usually contain a highly specific package selection and with customised software packages.
Such installations can be made to automatically use a kickstart ks.cfg file, even off the cdrom itself.

The RULE Project
The aims of The RULE Project - Run Up2date Linux Everywhere - are to:
  • Modify the current Red Hat Linux installer so that it runs in less than 32 MB of RAM, or create a new one if needed.
  • Select, test, and (if needed) package the system and desktop applications which give the greatest real functionality with the smallest consumption of CPU and RAM resources
  • Create another installation option of the Red Hat Linux distribution (not another distribution, see the FAQ), containing all and only the packages above, optimized to run either a server, or a basic desktop on obsolete hardware with very little RAM and HD space.
  • Promote and support (especially in developing countries) the use of this install option with schools, public and private organizations.
This project has the potential to offer many useful ideas and resources for those building highly customised installers, especially if the aim is to make it work on low-end, low-resourced, low-memory computers, and with an installation that is trimmed down to the bare essentials.
For example, it is possible to use a third CD to boot the installer, then let anaconda prompt for the original RedHat CDROMs (or anything else).
(This approach is also used by SGI to make their XFS installer http://oss.sgi.com/projects/xfs/).
(Thanks to Martin Stricker <shugal@gmx.de> for the pointer to this project).

Whenever there are any changes in (for example) missing, changed or added dependency requirements, appropriate modifications need to be made to the RedHat/base/comps file to make it all work as expected.
Package dependencies must be satisfied in two ways:
  1. within the set of all the rpm packages in the RedHat/RPMS/ directory, and
  2. within the RedHat/base/comps file (with respect to the specific dependencies of the selected packages).
If you don't ensure that all your changes are consistent, then the installer will either fail to build as an installation, or (less likely but much more heart-breaking) it will crash and burn during runtime installation.
If package dependencies are not specified correctly in the comps file, you may find that at the point when the installer is checking the package selection for dependency requirements, the install may actually work - but with a hitch. Since its default package selection is taken from comps, any unmentioned packages that satisfy dependency requirements will tend to force the installer to prompt you if it is ok to install the package in question.
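For the first kind of check (dependencies within the set of packages in RedHat/RPMS/), one approach that may help is to let rpm do a dry-run install of the whole lot against an empty scratch database. This is only a sketch, not an anaconda tool, and the function name check_rpm_deps is made up. It uses rpm's --initdb, --dbpath and --test options; nothing is actually installed. Expect it to also complain about packages that are shipped in several architecture variants (kernel, glibc), since a single install cannot take them all. It says nothing about the comps file, which still has to be checked by hand.

```shell
# Dry-run install of every rpm in a directory against an empty scratch
# rpm database, so that rpm reports unresolved dependencies within the set.
# Usage: check_rpm_deps /redhat/i386/RedHat/RPMS
check_rpm_deps() {
    rpmdir=$1
    [ -d "$rpmdir" ] || { echo "no such directory: $rpmdir" >&2; return 2; }
    command -v rpm >/dev/null 2>&1 || { echo "rpm not found" >&2; return 127; }
    scratch=$(mktemp -d) || return 1
    rpm --initdb --dbpath "$scratch"
    # --test only checks; the subshell keeps the cd from leaking out
    ( cd "$rpmdir" && rpm -i --test --dbpath "$scratch" ./*.rpm )
    status=$?
    rm -rf "$scratch"
    return $status
}
```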

One thing I have found useful is that if you do change the comps file, first copy the original distribution version to (say) comps.orig (or whatever) and let that end up preserved in the images you create.
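A tiny sketch of that habit, so that the pristine copy is made once and never clobbered by a later run. The function name preserve_comps is made up, and the directory argument is whatever your RedHat/base/ directory happens to be.

```shell
# Keep a pristine copy of comps beside the edited one.
# The copy is only made if comps.orig does not exist yet.
# Usage: preserve_comps /redhat/i386/RedHat/base
preserve_comps() {
    base=$1
    [ -f "$base/comps" ] || { echo "no comps file in $base" >&2; return 1; }
    [ -f "$base/comps.orig" ] || cp -p "$base/comps" "$base/comps.orig"
}
```

With the original preserved, diff -u comps.orig comps (run in RedHat/base/) shows exactly what you have changed.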

Checking the integrity of the rpm packages

Dealing with the SRPMS

Dealing with the anaconda installer updates

Bugfix updates to the anaconda installer are periodically released. They come as floppy disk images, used from the installer if you tell the bootloader to use it by giving it the appropriate parameters. The installer then asks for the floppy disk, you put it into the floppy drive and away it goes.
A useful functionality, but very inconvenient for all the manual intervention that is required.

Fortunately, there are several ways to deal with this so that the updates are already there and used automatically...

Known issues for redhat 7.2
Mention of the Omni and Omni-foomatic packages needs to be added to the comps file. These are new packages for the distribution, required after the update for ghostscript which has a dependency on Omni.
A similar situation exists for the March 2002 perl updates... some of the perl modules included with the original perl package have been split out into separate (new) packages. These may also need mentioning in the comps file (due to package dependencies).
The big Jan 2002 KDE update packages do not fully replace rpm-by-rpm the original distribution packages. Worse - the packages that are installed after the update have different names. Look for mention of these in the comps file and make alterations accordingly. (eg, the kde-i18n-Chinese package no longer exists).
Here is a copy of the updated redhat 7.2 comps file that I have been using successfully for a while now (as of May 2002). It contains most of the changes mentioned above. There is a diff file available if you want to see the changes.

Known issues for redhat 7.3

You may have noticed that the mkkickstart utility has quietly vanished.
The good news is that it will probably be replaced by a new anaconda tool.
FIXME: follow up the status of this.

The only updates to redhat 7.3 so far involve replacement packages, no new ones. The comps file does not need to be modified from its default (unless, of course, you want to modify it for your own needs and preferences).

That being said, there appears to be a bug in the comps file related to using pkgorder (Bugzilla #64995).
Things have been designed for rh7.3 so that doing a custom install and choosing all components (except Everything) should only require the use of the first two installation discs, not all of them.
The problem has proved difficult to find, and appears to be related to a dependency loop in the comps file.
The packages that are affected are: w3c-libwww-devel, wine-devel, and libesmtp-devel (all in the "Software Development" group).
The quick hack is to remove the X Window System/GNOME sub dependency.
The third installation disc will also be needed if you have an older video card that requires the legacy drivers from XFree v3.3.x.

To avoid having to use the third disk, force the "earlier" inclusion of the packages you require by adding an entry to the bottom of the comps file similar to this:
  
0 --hide Old X Windows System {
   XFree86-3DLabs
   XFree86-8514
   XFree86-AGX
   XFree86-FBDev
   XFree86-Mach32
   XFree86-Mach64
   XFree86-Mach8
   XFree86-Mono
   XFree86-P9000
   XFree86-S3
   XFree86-S3V
   XFree86-SVGA
   XFree86-VGA16
   XFree86-W32
   XFree86-compat-modules
}
0 means that it is not installed by default,
--hide means that you cannot choose it during the Custom installation, but will allow it to be included in the first two discs by pkgorder/splitdistro.

Further tuning can be done, for example to add an entry for any extra rpms that you specifically want to install:
  
0 --hide Extras {
   numlock
   xcdroast
   taper
   etc...
}
The example here mentions numlock (a package that keeps the numlock key on in X and VT). You may want it included in the installation, perhaps because you find that it is useful for laptop installs (laptop users often seem to get confused with numlock... they try to type a username and it comes out as numbers).
For kickstart installs, just include it in the %packages section of the non-laptop ks.cfg files.

(Thanks to Forrest Taylor <forrestx.taylor@ intel.com> for the good description of this trick).
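
The two entries above follow the same pattern, so appending one can be scripted. This is only a sketch: append_hidden_group is a hypothetical helper name, and it writes exactly the layout shown in the examples above (a "0 --hide Name {" line, one package per line, a closing brace) without checking that the packages actually exist.

```shell
# append_hidden_group COMPS_FILE "Group Name" pkg1 pkg2 ...
# Hypothetical helper: append a group that is not installed by default (0)
# and cannot be chosen in the Custom install (--hide).
append_hidden_group() {
    local comps="$1" group="$2" pkg
    shift 2
    {
        printf '0 --hide %s {\n' "$group"
        for pkg in "$@"; do
            printf '   %s\n' "$pkg"
        done
        printf '}\n'
    } >> "$comps"
}
```

For example: append_hidden_group /redhat/i386/RedHat/base/comps Extras numlock xcdroast taper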

Anaconda and related contributions
Anaconda offers some tools that help to ensure that things are ok.

The python script /usr/lib/anaconda-runtime/check-repository.py can be used to check the RedHat/base/comps file to ensure that it is consistent with the contents of the RedHat/RPMS/ directory.
However, comment out the "import todo" entry (around line 38) before you use it. (Thanks to Forrest Taylor <forrestx.taylor@ intel.com> for this hint).

depchecktree.py is a python script written by Seth Vidal (<skvidal@phy.duke.edu>), posted to the anaconda-devel-list on Sat 25th May 2002.
FIXME: provide a link to the original message in the mailing list archives.
It takes a parameter of the path to the directory where you have a collection of rpms (ie: i386/RedHat/RPMS), and checks that all their dependencies are satisfied by others in the collection. (Thanks Seth).

Seth added that "this does not do anything with the comps file, at all. Jeremy mentioned doing something like that. If I thought I had any shot of intelligently understanding the comps parsing I would work on it, but I'm not that smart yet."
So it would (technically) be possible to extend this script to include an examination of the comps file (which it currently ignores), checking it for errors or inconsistencies, or even create a new one based on the packages it finds and their inter-dependencies.

Seth has some more gems available at http://www.dulug.duke.edu/treetools/...
FIXME: provide a link to this message in the archives (3rd Jun 2002).
In kickstart-list Seth said about these scripts...
comps-check.pl is "not terribly pretty but it seems to work... for MOST situations. You pass it an architecture a comps file and a dir of rpms - it hands you back whats in the comps file thats missing from the dir of rpms for that arch.
I'm planning on rewriting it in python (or rather liberally stealing code from anaconda :)
the other two scripts are programs that I and jack neely put together to help maintain distribution trees - the
add-rpm.py adds updated rpms to an install tree.
depchecktree.py just takes the dirs you give it, look for all the rpms and tells you if they fully satisfy their dependencies."
(Here are links to local copies of add-rpm.py and comps-check.pl).
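
For a quick look without any of those scripts, the kind of check that comps-check.pl is described as doing can be roughly approximated in shell. This is NOT Seth's code, just a sketch under loud assumptions: list_missing_from_comps is a hypothetical name, it ignores the architecture, and it understands only the simple layout shown in the examples in this document (a line ending in "{", one package name per line, a closing "}"). Any other kind of line in a real comps file is skipped rather than interpreted.

```shell
# list_missing_from_comps COMPS_FILE RPMS_DIR
# Hypothetical helper: print package names mentioned in comps that have
# no matching rpm file in RPMS_DIR.
list_missing_from_comps() {
    local comps="$1" rpmdir="$2" names f base pkg
    # An rpm file is named NAME-VERSION-RELEASE.ARCH.rpm, and neither
    # VERSION nor RELEASE may contain a hyphen, so stripping the last two
    # hyphen-separated fields leaves NAME.
    names=$(for f in "$rpmdir"/*.rpm; do
                [ -e "$f" ] || continue
                base=${f##*/}
                printf '%s\n' "${base%-*-*}"
            done | sort -u)
    # Keep only single-word lines found inside { ... } blocks.
    awk '
        /[{][ \t]*$/  { depth++; next }
        /^[ \t]*[}]/  { depth--; next }
        depth > 0 && NF == 1 { print $1 }
    ' "$comps" | sort -u | while read -r pkg; do
        # -F: fixed string (names such as libstdc++ contain regex
        # characters), -x: whole line must match.
        printf '%s\n' "$names" | grep -qxF -- "$pkg" || printf '%s\n' "$pkg"
    done
}
```

Something like "list_missing_from_comps /redhat/i386/RedHat/base/comps /redhat/i386/RedHat/RPMS" then prints one missing package name per line, and nothing at all if every plain package line has a matching rpm.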

dumphdrlist.py is another python script, this one written by Jeremy Katz (<katzj@redhat.com>) and posted to kickstart-list on 15th May 2002.
(FIXME: link to original message in mail archives).
This script looks at the RedHat/base/hdlist file and lists which installation disk a particular rpm is on. From Jeremy: "Usage is simple enough -- 'dumphdrlist.py /path/to/hdlist' and it will then print the NEVRA of all of the packages as well as the disc it's on in the format
E:N-V-R.A disc"

FIXME: here (?) add details of more scripts and utilities that:
- can be used to easily and sanely update (replace, add or remove packages) an installation image from a local (or remote) updates repository.
- will make package dependency checks on the RedHat/RPMS/ directory and then create a new comps file based on the original one, and on what has been added or removed from the RPMS/ installation directory.
Such scripts would be very useful for designing and testing highly customised installations.


Step 3 - Generate a new hdlist file

Note that I have a shell script that I use to automate much of what is to follow. More on this below.

At this point, do the following to make it much more convenient for using the anaconda tools from the command line:
  
	# export PYTHONPATH=/usr/lib/anaconda
	# export PATH="$PATH:/usr/lib/anaconda-runtime"
If you are using a customised version of anaconda copied to another place (as suggested above), change these paths to reflect their new location... for example:
  
	# export PYTHONPATH=/redhat/anaconda
	# export PATH="$PATH:/redhat/anaconda-runtime"

If you get errors running any of the anaconda tools (genhdlist, pkgorder, buildinstall, splitdistro), then the first things to investigate are inconsistency problems in the comps file and missing or corrupt packages and/or dependencies in the RPMS/ directory.
Do this to refresh the i386/RedHat/base/{hdlist,hdlist2} files so that they reflect the changed contents of the i386/RedHat/RPMS/ directory:
  
	# genhdlist /redhat/i386
Note: give no options to genhdlist (they are not needed at this time). (It will be run again later with a full set of options).

At this point, it should now be (theoretically) possible to use /redhat/i386/ as an nfs export for network installs using images/netboot.img as the boot image on clients (either via PXE/bootp, or a boot floppy).

This is a good way to test highly customised installations, as you don't need to waste time and money potentially burning a stack of useless cdrom coffee coasters in the process of trying to get things working right. :-)

The hdlist and hdlist2 files created by genhdlist in RedHat/base don't need to contain accurate information about package ordering if all the rpm packages are available in one place for the installer.

If the contents of i386/RedHat/RPMS/ change in any way, then genhdlist must be run again.

But unless you want or need to rebuild the installer itself, this is all that is needed to prepare it for use as a network installation image.
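
Since it is easy to forget to re-run genhdlist after touching the RPMS/ directory, a small staleness check can help. This is a sketch only: hdlist_is_stale is a hypothetical name, and it relies purely on file modification times, so it will be fooled by anything that preserves old timestamps while replacing a package.

```shell
# hdlist_is_stale TREE
# Hypothetical helper: exit status 0 if TREE/RedHat/base/hdlist is missing
# or older than anything in TREE/RedHat/RPMS, non-zero otherwise.
hdlist_is_stale() {
    local tree="$1"
    local hdlist="$tree/RedHat/base/hdlist"
    # No hdlist at all: it definitely needs generating.
    [ -f "$hdlist" ] || return 0
    # find also tests the RPMS directory itself, whose modification time
    # changes when a package is added or removed.
    [ -n "$(find "$tree/RedHat/RPMS" -newer "$hdlist" -print)" ]
}
```

It could be used as "hdlist_is_stale /redhat/i386 && genhdlist /redhat/i386".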


Step 4 - Rebuild the anaconda installer runtime images

Note that this step is optional: you only need to do it if you need or want to rebuild the anaconda installer itself in the installation boot and runtime images.
Also note that if you do skip it, then it is essential that a package order file is created (as described below).

The following command is a small but powerful tool that calls on several other scripts to do a lot of work:
  
	# buildinstall --pkgorder /redhat/pkgorder.txt --comp dist-7.2 --version 7.2 /redhat/i386
or for redhat 7.3...
  
	# buildinstall --pkgorder /redhat/pkgorder.txt --comp dist-7.3 --version 7.3 /redhat/i386

Notes:

/usr/lib/anaconda-runtime/buildinstall does a lot of things... in essence, its job is to completely rebuild the anaconda installer boot and runtime images using the packages it finds in the RedHat/RPMS directory.

The run-time anaconda installer, once it is all packaged, exists in several forms as boot disk and installer images.
So what buildinstall does is rebuild from scratch all of the boot images found in the i386/images/ and i386/dosutils/autoboot/ directories, as well as the anaconda installer runtime images (hdstg1.img, netstg1.img and stage2.img) in the i386/RedHat/base/ directory.
It does this by directly extracting with rpm2cpio a selection of the RedHat/RPMS/*.rpm packages into a temporary build root tree in the i386/ directory.
(So beware that you'll need a few Mb free to accommodate this temporary workspace).
For example, the contents of the kernel-BOOT package in the installation directory is extracted (among many others) into this temporary build tree and (selected parts are) used as the installer's boot kernel and initrd-included modules.

Much of this work is done by /usr/lib/anaconda-runtime/mk-images (which happens to be an easily readable/hackable #!/bin/bash shell script).

It is possible just before this point to reconfigure the buildinstall to include things like your own customised install boot drivers if you need to do that.
This would be useful for, eg, a kickstart install designed for a big bunch of cloned machines that all happen to have some weird hardware that needs some non-standard special drivers for accessing the installation media.
(This was doable on redhat 5.x and 6.x installs, I have not done this myself on redhat 7.x. If someone would like to email me to confirm the same for redhat 7.3, I will update this document).

Hint: read /usr/lib/anaconda-runtime/buildinstall - it is a #!/bin/bash shell script.

The output generated by running buildinstall is quite verbose, and it is usual to see some error messages go by as it does its work.
These include a series of non-fatal errors, such as complaints about missing files, bad links and missing kernel modules...
  
Running mkfontdir...
/usr/X11R6/bin/mkfontdir: failed to create directory in
/redhat/i386/RedHat/instimage/usr/share/fonts/ISO8859-2/*
rm: cannot remove `/redhat/i386/RedHat/instimage/usr/X11R6/bin/mkfontdir': No such file or directory
Scrubbing trees... /redhat/i386/image-templateusr/sbin/ldconfig: /usr/lib/libbz2.so.1 is not a symbolic link
 
Scrubbing trees... /redhat/i386/RedHat/instimageusr/sbin/ldconfig: /usr/lib/libbz2.so.1 is not a symbolic link
When generating all the boot images, there can be many (repeated) instances of:
  
module xxxx not found in kernel rpm
These are non-fatal warnings, not errors. If it runs to completion (it may take some time) without the script itself falling in a heap, then all is (usually) well.
If a temporary buildinstall.xxxxx directory exists in your i386/ directory, then zap it with "rm -rf" and try to track down the reason why it failed.
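
With output that noisy, it can help to save it to a log file and summarise it afterwards. The sketch below assumes the output was captured with something like "buildinstall ... 2>&1 | tee /redhat/buildinstall.log" (the log file name is an assumption, not something buildinstall creates by itself); summarise_buildinstall is a hypothetical helper name.

```shell
# summarise_buildinstall LOGFILE TREE
# Hypothetical helper: count the harmless missing-module warnings in a
# saved buildinstall log and look for a leftover temporary build directory.
summarise_buildinstall() {
    local log="$1" tree="$2" warnings leftovers
    # grep -c prints 0 and exits non-zero when nothing matches.
    warnings=$(grep -c 'not found in kernel rpm' "$log" || true)
    echo "missing-module warnings: $warnings"
    # A surviving buildinstall.* directory suggests the run did not finish.
    leftovers=$(find "$tree" -maxdepth 1 -type d -name 'buildinstall.*')
    if [ -n "$leftovers" ]; then
        echo "leftover build directories:"
        echo "$leftovers"
        return 1
    fi
    echo "no leftover build directories"
}
```

For example: summarise_buildinstall /redhat/buildinstall.log /redhat/i386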

If you have just rebuilt the installer and intend to use it only for network installations and not for creating cdrom images, then your install image is now ready for use at this point.


Step 5 - Generate a package order file (if necessary)

The /usr/lib/anaconda-runtime/pkgorder utility is used like this:
  
	# pkgorder /redhat/i386 i386 > /redhat/pkgorder.txt
The output redirection (">") creates the file "silently".
I suggest creating it with the tee utility so that you can check for error messages in the output...
  
	# pkgorder /redhat/i386 i386 | tee /redhat/pkgorder.txt
(tee is a simple standard unix utility that copies its input both to a file and to its standard output, which here is the current tty).
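
One thing to keep in mind with tee is that anything a command prints on its standard output ends up in the file, messages included. A variation is to keep standard output and standard error in separate files. The sketch below is generic shell; capture_output is a hypothetical name, and whether pkgorder sends its diagnostics to standard error has not been verified here.

```shell
# capture_output OUTFILE ERRFILE command [args...]
# Hypothetical helper: run a command with stdout and stderr saved to
# separate files, then report anything that arrived on stderr.
capture_output() {
    local out="$1" err="$2" status
    shift 2
    "$@" > "$out" 2> "$err"
    status=$?
    if [ -s "$err" ]; then
        echo "messages on stderr (saved in $err):" >&2
        cat "$err" >&2
    fi
    [ -s "$out" ] || echo "warning: $out is empty" >&2
    # Pass the command's own exit status back to the caller.
    return $status
}
```

For example: capture_output /redhat/pkgorder.txt /redhat/pkgorder.err pkgorder /redhat/i386 i386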

Notes:


Step 6 - Split the installation image into several CDROM images

One of anaconda's tools does this otherwise quite tricky task for you... the python script /usr/lib/anaconda-runtime/splitdistro uses the installation source tree in i386/ to automagically create a populated series of directories alongside the i386/ and SRPMS/ directories, named
	i386-disc1/
	i386-disc2/
	i386-disc3/
	i386-disc4/
	i386-disc5/

The contents of these directories have the distribution packages split up, sized and sorted in a manner exactly suitable for using as cdrom images. At the same time, it does it quickly and in a way that consumes only a small amount of actual hard drive space.
This is exactly what we want. Magic.

There is a lot to say about this very useful python script.
  
	# splitdistro --fileorder /redhat/pkgorder.txt /redhat i386
Note that the syntax has been slightly extended for redhat 7.3...
  
	# RELEASE="RedHat 7.3 (Valhalla) with updates to $(date '+%Y-%m-%d %H:%M')"
	#
	# splitdistro --fileorder /redhat/pkgorder.txt --release "$RELEASE" /redhat i386

In both redhat 7.2 and 7.3, there are known problems with the as-released splitdistro script in the anaconda-runtime package. See below for working, enhanced versions.
Note the following:

Alternative versions of the splitdistro (with patches)

The as-released versions of splitdistro in both redhat 7.2 and 7.3 have known problems, and both need patching before they will work correctly.

Aside:
It is a pity that the tendency for splitdistro so far is to hack it to work specifically for each distribution.
For example, the 7.3 version of splitdistro will fail to work for correctly creating redhat 7.2 images because the directory structures, number and contents of the installers are hard-coded into the scripts. Command-line parameters have also changed.
Perhaps this is the nature of the beast, but IMHO it would be good to make changes to anaconda so that it is able to automatically cope both with older distributions (eg, 7.0, 7.1 and 7.2), and for any future distributions that may require the management of any number of binary and source/doc installation disks.

Here are versions of splitdistro that I am currently using for my own purposes...

The diffs are included as they can be useful for quickly analysing the modifications that have been made (see the man pages for diff(1) and patch(1)).
Keep an eye on these links, they may change from time to time as further changes are made. There should be a description of the patch at the top of the file.

BOTH of these enhanced versions now have exactly the same syntax...
  
	# RELEASE="RedHat 7.3 (Valhalla) with updates to $(date '+%Y-%m-%d %H:%M')"
	#
	# splitdistro --fileorder /redhat/pkgorder.txt --isosize 680 --fudge 0.8 --release "$RELEASE" /redhat i386

The changes to these rh7.2 and rh7.3 versions of splitdistro do the following:

  • As mentioned above, you can set the size limit of the final cdrom images in splitdistro.
    This is handy... after the updates and other additions I tend to add, for redhat 7.2 the final result is close to two full 700Mb install images.
    For redhat 7.3, I am putting the contents of the 55Mb redhat 7.3 documentation iso image onto the first installation image.
    This is no problem with blank 700Mb disks but too much for 640Mb (the default). 640Mb just doesn't do it any longer :-)
    (The day when distributions are released on DVD as standard is coming quickly... the European release of redhat 7.2 was commercially available on DVD).
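
After splitdistro has run, it is worth checking that each disc tree really does fit within the size you asked for. This is a rough sketch only: check_disc_sizes is a hypothetical name, and du measures the directory tree rather than the final iso image, so treat the numbers as approximate and leave some headroom.

```shell
# check_disc_sizes LIMIT_MB DIR...
# Hypothetical helper: report the approximate size of each disc tree and
# return non-zero if any of them is larger than LIMIT_MB megabytes.
check_disc_sizes() {
    local limit="$1" dir size status=0
    shift
    for dir in "$@"; do
        # du -sk reports kilobytes; round up to whole megabytes. du is run
        # once per directory so that files hard-linked between trees are
        # counted in every tree that contains them.
        size=$(du -sk "$dir" | awk '{ print int(($1 + 1023) / 1024) }')
        if [ "$size" -gt "$limit" ]; then
            echo "$dir: ${size}Mb exceeds ${limit}Mb"
            status=1
        else
            echo "$dir: ${size}Mb fits within ${limit}Mb"
        fi
    done
    return $status
}
```

For example, to match the --isosize value used above: check_disc_sizes 680 /redhat/i386-disc[123]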


    Step 7 - Re-generate new hdlist files

    We are almost there, but there is still some final tweaking needed in the new installer image trees.

    The hdlist* files in i386-disc1/RedHat/base/ are hard-linked copies of the original ones in the i386/RedHat/base/ directory, and are very likely (certain!) not to work properly for cdrom installs.
    So they need to be re-created.

    genhdlist is used again to create new hdlist files, but this time referencing the "package order catalogue" and the newly-created i386-disc? directory trees...
      
    	# rm -f /redhat/i386-disc1/RedHat/base/hdlist*
    	# genhdlist --withnumbers --fileorder /redhat/pkgorder.txt /redhat/i386-disc[123]
    

    Once you have come this far... congratulations, this completes the preparation of the cdrom source image directory trees.


    Step 8 - Create the iso9660 filesystem images

    What needs to be done now is to create the cdrom iso9660 images using the i386-disc?/ directory trees, and burn them to blank cdrom disks.
    For the more experienced, things should be "plain sailing" from this point, but to complete the discussion here and to give some help and tips for those not so experienced, this process is explored in some detail.

    Also presented here are some suggestions for making the other cdroms (besides the first one) bootable in useful ways. Bonus :)

    Use whatever tools you like to create the iso images, using your own options.
    mkisofs seems to be what is generally available, it comes standard with redhat, RedHat themselves use it, and it works very well.
    I create the iso image in a manner similar to this (example for redhat 7.3):

    	# cd /redhat/
    	# myname='Tony Nugent <tony@linuxworks.com.au>'
    	# bootimg="images/boot.img"
    	# bootimg="dosutils/autoboot/cdboot.img"
    	# bootcat="RedHat/base/boot.cat"
    	# distname="valhalla"
    	# distver="7.3"
    	# mkisopts="-r -N -L -d -D -J"
    	# today="$(date '+%d %b %Y')"
    	# mkisofs $mkisopts		\
    	  -V "RedHat $distver ($distname) UPDATED Disk 1"	\
    	  -A "RedHat $distver ($distname) update created on $today"	\
    	  -P "$myname"			\
    	  -p "$myname"			\
    	  -b "$bootimg"			\
    	  -c "$bootcat"			\
    	  -x lost+found			\
    	  -o "$distname"-1.iso		\
    	  i386-disc1
    
    	# for i in 2 3 ; do
    	#   mkisofs $mkisopts		\
    	    -V "RedHat $distver ($distname) UPDATED Disk $i"	\
    	    -A "RedHat $distver ($distname) update created on $today"	\
    	    -P "$myname"		\
    	    -p "$myname"		\
    	    -x lost+found		\
    	    -o "$distname"-${i}.iso	\
    	    i386-disc${i}
    	  done
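
One thing to watch when composing your own -V strings: the volume identifier field of an iso9660 filesystem holds 32 characters, and the mkisofs man page describes -V in those terms. A label such as "RedHat 7.3 (valhalla) UPDATED Disk 1" is 36 characters long. How a given version of mkisofs reacts to an over-long label (refusing it or truncating it) has not been verified here, so a simple length check before building the images is a cheap precaution. check_volid is a hypothetical helper name.

```shell
# check_volid LABEL
# Hypothetical helper: fail if LABEL is longer than the 32 characters
# available for an iso9660 volume identifier.
check_volid() {
    local label="$1"
    if [ "${#label}" -gt 32 ]; then
        echo "volume label is ${#label} characters (limit 32): $label" >&2
        return 1
    fi
}
```

For example: check_volid "RedHat $distver ($distname) Disk 1" || echo "shorten the label first"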
    
    Notes: