Emerge/Exclude categories
From Gentoo Linux Wiki
Introduction
It is possible to avoid synchronizing packages that will never be needed in your installation of Gentoo.
This article explains how to make emerge use a feature of rsync to exclude categories, and even specific packages, from having their ebuilds and metadata updated in the portage tree. This saves time and bandwidth for you and your Gentoo mirror.
This article is not about managing which installed packages are updated (by emerge -u). For that, see package masking.
Managing the Synchronization operation
rsync is used to transfer files between computers over a network. Once a group of files has been transferred, rsync can be used again to keep the copies in sync, which means deleting, adding and changing files on the target site(s) according to what was done to them on the original (or master) site. See also the rsync man page.
rsync provides two command-line options for excluding certain files from updates:
- --exclude=PATTERN - exclude files matching PATTERN
- --exclude-from=FILE - use patterns in FILE to exclude files
The --exclude-from option names a file containing a list of exclusion patterns, one per line. Its use is recommended.
emerge --sync can be configured to use an exclusion file when it runs rsync, which is how we get packages and entire categories excluded from the sync.
Exclude patterns
The format of an exclude pattern is a shell glob-style syntax, with a few variations:
- Patterns are matched in a recursive manner, so the pattern "foo" will match any name foo at any depth in the transferred directory tree (/usr/portage in our case).
- Patterns beginning with a slash, /, are anchored to the top of the transferred directory tree. Note that this is not the absolute path of the file in the original file system.
- Patterns ending in a slash, /, will match only directories and thus prune them and all their contents out of the transfer.
- The * (asterisk), ? (question mark) and [ (left square bracket) shell wildcards have their usual meaning.
- A double-star wildcard, **, also matches directory-separating slashes, so it can match files further down the path. The single star, *, stops matching at a directory separator.
- Patterns beginning with a + (plus) followed by a space become include patterns (the opposite of exclusions), which makes it possible to define exemptions to exclusions.
More details and many examples can be found in the rsync man page.
Writing an exclusion file
Patterns are mostly straightforward, except when writing inclusions inside an excluded parent directory. You cannot exclude the parent directory itself if you want to include a selection of the packages in it.
We need to review some features of portage's directory structure so we can write effective exclusion patterns. The tree is located (by default) in /usr/portage. There is a directory for each category, a metadata directory, and some additional directories (distfiles, licenses, etc.).
For example, on a desktop workstation at the office, we can exclude all packages for laptops and gaming. Thus, we're not interested in any package in the app-laptop category, or anything in games-*. However, since we're not completely cold-hearted, we'll allow updates to nethack. Below is an example exclusion file which we've put in the /etc/portage directory (you may need to create this directory).
The difficulty with the games-* directories is that a single exclusion pattern such as games-*/ excludes every games-* directory outright. rsync then never descends into them, so adding nethack to the include list has no effect: its parent directory (games-roguelike, which games-*/ also covers) has already been excluded. So if you want to exclude all games, use games-*/. If you want to exclude only some games, you need to list the directories individually (though similarly named directories can be combined with a wildcard; see below).
| File: /etc/portage/rsync_excludes |
app-laptop/
+ games-roguelike/nethack**
games-roguelike/**
games-a*/
games-board/
games-e*/
games-fps/
games-kids/
games-m*/
games-puzzle/
games-rpg/
games-s*/
games-util/
|
There are three keys to making this work:
- nethack is re-included by prefixing its pattern with a + sign followed by a space.
- The include rule for nethack has to appear before the exclude rule for its parent directory (games-roguelike/**), because rsync acts on the first pattern that matches. Otherwise nethack will not be included.
- To include all the files and directories inside nethack, append ** to the package pattern. The parent is excluded with games-roguelike/** rather than games-roguelike/, so the directory itself is still traversed and only its remaining contents are excluded.
Additionally, category patterns end with a / to mark them as directories. Package patterns do not, so that they also match the metadata associated with the package, which is stored as plain files rather than directories (have a look in your metadata/cache/games-roguelike to see why). This is also why our patterns do not begin with a /: an anchored pattern would match only the category directories at the top of the tree and miss the copies under metadata. Alternatively we could write separate rules for each category's main directory and its metadata, but that would be more work.
The exclusions file does not necessarily have to be in /etc/portage and you're allowed to call the file anything you like, as long as the portage user/group can read it. However it's best to stick to something sensible.
Testing exclusion file
Repeated syncs against a mirror are not a practical way to see how your exclusion list is working, so you can test it locally instead.
- Create a directory in your home directory called, for example, portagetest.
- Create the /etc/portage/rsync_excludes file, as detailed above, and save it.
- Run the following command, adjusting the portagetest path and the path to your rsync_excludes file, and replacing username with your user name:
rsync -av --exclude-from=/etc/portage/rsync_excludes /usr/portage/ /home/username/portagetest
This will rsync /usr/portage to /home/username/portagetest, where you can check whether the patterns in /etc/portage/rsync_excludes are working. If you change the patterns, delete the contents of /home/username/portagetest, run the rsync command again and recheck. Since this never contacts a mirror, you can repeat it as often as you like, unlike emerge --sync.
Configuring emerge
The final step is now to tell emerge to use the exclusion file. This is accomplished by adding the following to /etc/make.conf:
| File: Add to /etc/make.conf |
# Portage tree exclusions
PORTAGE_RSYNC_EXTRA_OPTS="--exclude-from=/etc/portage/rsync_excludes"
|
emerge --sync will now pass your exclusions file to rsync when you update portage.
Cleaning up
You'll now want to remove the directories you've added to rsync_excludes from your local tree, since rsync will no longer touch them. A quick way is to delete all the category directories and let the next emerge --sync restore the ones you did not exclude. From within /usr/portage, run:
rm -rf a* dev-* g* k* m* n* perl-* r* sci-* sec-* sys-* w* x*
Note that you shouldn't just rm -rf *, as that also deletes the profiles directory, which may cause problems. It would also delete the distfiles directory (where portage stores downloaded files), the local directory (where layman overlays are stored) and the packages directory (where portage stores binary packages you've created with --buildpkg).
Another method is to use a Perl one-liner that reads the rsync excludes file you created. Note that it can only handle plain category patterns, not anything else (include rules, package-level directories or profiles, for example):
perl -e 'while(<>) { chomp; s"/$""; print `find /usr/portage -wholename \*/$_ | xargs rm -rf ` unless /^#/; }' /etc/portage/rsync_excludes
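Before deleting anything, it can help to see which directories the exclusion file actually selects. The following sketch only prints them; the function name list_excluded_dirs is hypothetical, and it deliberately handles only plain category/ patterns at the top of the tree, skipping comments, include rules and package-level patterns.

```shell
#!/bin/sh
# list_excluded_dirs EXCLUDES_FILE TREE
# Prints the top-level directories of TREE matched by "category/" patterns.
list_excluded_dirs() {
    while IFS= read -r pattern; do
        case $pattern in
            ''|'#'*|'+ '*) continue ;;   # blank lines, comments, include rules
        esac
        case ${pattern%/} in
            */*) continue ;;             # package-level or anchored patterns
        esac
        case $pattern in
            */) ;;                       # keep directory patterns only
            *) continue ;;
        esac
        # $pattern is left unquoted on purpose so wildcards such as games-a*/ expand.
        for dir in "$2"/$pattern; do
            if [ -d "$dir" ]; then
                printf '%s\n' "${dir%/}"
            fi
        done
    done < "$1"
}
```

Running list_excluded_dirs /etc/portage/rsync_excludes /usr/portage and reading the output first is a cautious step; once the list looks right, each printed directory can be removed by hand or with rm -rf.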
Performing a full update
Temporarily commenting out the PORTAGE_RSYNC_EXTRA_OPTS line in /etc/make.conf will cause emerge --sync to resume its default behavior and update the ebuilds for the entire portage tree.
Adding all categories without installed items
This script lists all the categories that have no installed packages. Review the output, then put it into /etc/portage/rsync_excludes:
| File: update_rsync_excludes |
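#!/usr/bin/perl
# Print one rsync exclude pattern per category that has no installed packages.
use strict;
use warnings;

foreach my $dir (</usr/portage/*>) {
    next unless -d $dir;
    (my $name = $dir) =~ s#^/.+/##;   # strip the leading path, keep the directory name
    next unless $name =~ /-/;         # skip non-category directories (eclass, profiles, metadata, ...)
    next if -d "/var/db/pkg/$name";   # something from this category is installed
    print "$name/\n";                 # one pattern per line; trailing / marks a directory
}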
|
Make it executable and run it as root: chmod u+x update_rsync_excludes && ./update_rsync_excludes
If you think some categories might be empty because you uninstalled some packages, run: rmdir --ignore-fail-on-non-empty /var/db/pkg/* (this will not remove any non-empty directories).
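A shell variant of the same idea is sketched below. It assumes the tree ships a profiles/categories file listing one category name per line, and takes its category names from there rather than from the directory listing, so directories such as eclass or profiles are never printed. The function name unused_categories and its two arguments are made up for illustration.

```shell
#!/bin/sh
# unused_categories TREE PKGDB
# Prints "category/" for every category listed in TREE/profiles/categories
# that has no directory under PKGDB (i.e. nothing installed from it).
unused_categories() {
    while IFS= read -r category; do
        [ -n "$category" ] || continue          # ignore blank lines
        if [ ! -d "$2/$category" ]; then
            printf '%s/\n' "$category"          # trailing / marks a directory pattern
        fi
    done < "$1/profiles/categories"
}
```

A typical call would be unused_categories /usr/portage /var/db/pkg, with the output reviewed before it is saved as /etc/portage/rsync_excludes.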
