This page describes a way to mirror the Linux kernel BKCVS repository into a GNU Arch repository. BKCVS is the CVS repository that mirrors the BitKeeper repository currently used by some of the most important Linux kernel developers.
The Linux BKCVS repository can be retrieved using rsync and then checked out with cvs:
rsync -az --delete rsync.kernel.org::pub/scm/linux/kernel/bkcvs/linux-2.5/ /path/to/cvsrepo/linux-2.6
cvs -d /path/to/cvsrepo co linux-2.6
There is a ChangeSet,v file in the repository which contains the log messages of every commit. The CVS commit logs of the kernel files contain a "Logical change X" text, X being the ChangeSet,v CVS revision number.
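As an illustration of the "Logical change X" convention, the sketch below pulls the ChangeSet,v revision out of a single log line. The function name logical_change_rev is hypothetical and not part of the script; the pattern is a portable rewrite of the one the script uses, including the optional leading "}" that the script also accepts.

```shell
# Hypothetical helper: print the ChangeSet,v revision named in a
# "(Logical change X)" log line, or nothing if the line does not match.
logical_change_rev() {
    printf '%s\n' "$1" \
        | sed -n 's/^}\{0,1\}(Logical change \([0-9.][0-9.]*\))$/\1/p'
}

logical_change_rev "(Logical change 1.24782)"
```

Every kernel file revision carrying the same number belongs to the same changeset, which is how the script groups per-file CVS revisions into one patch.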
The script below rsyncs the Linux CVS repository, generates a log with the changeset information, then generates a patch and a GNU Arch log for every new changeset and applies and commits them one by one. The script does not pollute the Linux tree with CVS directories. It uses some files and directories named ",bkcvs*" in the tree root.
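The "one by one" part comes down to walking the revision numbers between the last applied changeset and the current head of ChangeSet,v. A minimal sketch of that loop, assuming both revisions have the form 1.N (pending_revs is a hypothetical name; the script does the same arithmetic inline):

```shell
# Hypothetical helper: list the changeset revisions still to be applied,
# given the last applied revision and the head revision (both 1.N).
pending_revs() {
    last=${1#1.}    # strip the leading "1."
    tip=${2#1.}
    n=$((last + 1))
    while [ "$n" -le "$tip" ]; do
        echo "1.$n"
        n=$((n + 1))
    done
}

pending_revs 1.24782 1.24785
```

When the two revisions are equal nothing is printed, which corresponds to the script's "No changesets to be applied" case.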
The following steps should be performed:
- Download a Linux tarball from kernel.org (2.6.10 is used as the example here; a newer version is recommended) and import it into the GNU Arch repository
- Store the BKCVS revision number of this kernel version in the ,bkcvs-last-rev file (look it up in the ChangeSet,v file in the Linux CVS repository; it is 1.24782 for 2.6.10)
- Run the script listed below in this directory (it might take a few minutes to rsync the BKCVS repository)
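The second step can be sketched as follows. The helper seed_last_rev is hypothetical, and a scratch directory from mktemp stands in for the imported tree; the format check reflects the fact that the script strips the leading "1." and treats the rest as an integer.

```shell
# Hypothetical helper: record the starting BKCVS revision in the tree root.
seed_last_rev() {
    tree=$1
    rev=$2
    # Accept only revisions of the form 1.N
    case $rev in
        1.|1.*[!0-9]*) echo "unexpected revision: $rev" >&2; return 1 ;;
        1.*) ;;
        *) echo "unexpected revision: $rev" >&2; return 1 ;;
    esac
    printf '%s\n' "$rev" > "$tree/,bkcvs-last-rev"
}

# Demonstration in a scratch directory standing in for the imported tree
tree=$(mktemp -d)
seed_last_rev "$tree" 1.24782
cat "$tree/,bkcvs-last-rev"
```

In a real setup the first argument would be the root of the tree imported into GNU Arch.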
After each commit the script removes the previous revision from the revision library, since keeping all of them uses too much disk space (there are around 50 changesets a day). The script also uses the ,bkcvs-home directory in the current tree as $HOME, because it rewrites the .arch-params/=id file to name the author of each BKCVS commit (GNU Arch has no option to set the author directly).
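The $HOME override can be tried without GNU Arch installed. The sketch below is only an illustration: jdoe is a placeholder author name, and a scratch directory from mktemp stands in for ,bkcvs-home. The id format follows what the script writes, the author name followed by an @invalid-address.com address.

```shell
# Build a throwaway home directory whose =id file names the commit author.
bkcvs_home=$(mktemp -d)
mkdir -p "$bkcvs_home/.arch-params"

author=jdoe    # placeholder; the script takes this from the ChangeSet,v log
printf '%s <%s@invalid-address.com>\n' "$author" "$author" \
    > "$bkcvs_home/.arch-params/=id"

# A command started with HOME overridden sees the temporary =id file,
# while the real $HOME of the calling shell is left alone.
HOME=$bkcvs_home sh -c 'cat "$HOME/.arch-params/=id"'
```

The script applies the same prefix to its commit command, so only that one command runs with the substituted identity.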
The script generates some X-BKCVS* headers in the arch log (they can easily be removed or modified). The X-BKCVS-Rev header is useful if the entire directory is lost: just check out the Linux GNU Arch repository and copy this number into the ,bkcvs-last-rev file (the only file the script needs).
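Recovering the revision from a log can be sketched as below. The function name bkcvs_rev_from_log and all values in the sample log are illustrative; how the log text of the latest revision is fetched from the archive is left out here.

```shell
# Hypothetical helper: print the value of the first X-BKCVS-Rev header
# found on standard input.
bkcvs_rev_from_log() {
    sed -n 's/^X-BKCVS-Rev: \(1\.[0-9][0-9]*\)$/\1/p' | head -n 1
}

# Sample log headers in the shape the script writes (values are made up)
sample_log='Summary: example summary
Keywords:
X-BKCVS-Date: 2005/01/01 00:00:00
X-BKCVS-Author: jdoe
X-BKCVS-Rev: 1.24790
X-BKrev: 0123456789abcdef'

printf '%s\n' "$sample_log" | bkcvs_rev_from_log
```

Redirecting that output into ,bkcvs-last-rev in the freshly checked-out tree restores the one piece of state the script depends on.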
The command line options:
bkcvs-arch-sync.sh \
  --working-dir|-d <dir>   cd to <dir> before running (defaults to the current directory)
  --dont-rsync|-n          don't rsync with the Linux repository (useful for debugging)
  --help|-h                print this message
bkcvs-arch-sync.sh
#!/bin/bash
#
# Generates BKCVS changesets and synchronises them with the
# GNU Arch repository
#
# Copyright (C) 2005 ARM Limited
# Written by Catalin Marinas <catalin.marinas@arm.com>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
# Just exit if any error occurs
set -e
#set -x
# Print a help message
function print_help()
{
    echo "Usage:"
    echo "$1 \\"
    echo "  --working-dir|-d <dir>   # cd to <dir> before running"
    echo "  --dont-rsync|-n          # don't rsync with the Linux repository"
    echo "  --help|-h                # print this message"
}
# Parse the command line options
GNUARCH_LINUX_DIR=`pwd`
DONT_RSYNC=n
while [ $# != 0 ]; do
    case $1 in
        --working-dir|-d)
            shift
            GNUARCH_LINUX_DIR=$1
            shift
            ;;
        --dont-rsync|-n)
            DONT_RSYNC=y
            shift
            ;;
        --help|-h)
            print_help $0
            exit 0
            ;;
        ?*)
            print_help $0
            exit 1
            ;;
        *)
            break
            ;;
    esac
done
# cd to the GNU Arch Linux tree
cd $GNUARCH_LINUX_DIR
# Different variables
FULL_LOG=,bkcvs.log
BKCVS_LAST_REV_FILE=,bkcvs-last-rev
BKCVS_HOME=,bkcvs-home
TLA_MYID_FILE=$BKCVS_HOME/.arch-params/\=id
CVSROOT=$GNUARCH_LINUX_DIR/,bkcvs-rsync
CVSREPO=linux-2.6
PATCH_PREFIX=,patch
BKCVS_RSYNC=rsync.kernel.org::pub/scm/linux/kernel/bkcvs/linux-2.5
TLA_TREE_VERSION=`tla tree-version $GNUARCH_LINUX_DIR`
# grep regexps
RCS_FILE="^RCS file: .*,v$"
LOG_CHANGE="^}\?(Logical change [0-9\.]\+)$"
CVS_REV="^revision [0-9\.]\+$"
# Build the patch
function build_cset()
{
BKCVS_REV=$1
BKCVS_REV_RE=`echo $BKCVS_REV | sed -e 's/\./\\\./g'`
PATCH_FILE=$2
rm -f $PATCH_FILE
touch $PATCH_FILE
# For too long logs, just grep-out the unimportant lines
# (awk is slower than grep)
cat $FULL_LOG \
    | grep -B1 -e "^file .*$" \
           -e "^bkcvsrev $BKCVS_REV_RE$" \
    | grep -v -e "^--$" \
    | grep -B2 -e "^bkcvsrev $BKCVS_REV_RE$" \
    | grep -v -e "^--$" \
    | gawk ' \
        /^file .*$/ { file = $2 }
        /^revision [0-9\.]+$/ { revision = $2 }
        /^bkcvsrev '$BKCVS_REV_RE'$/ { print file, revision }' \
    | while read file cvsrev; do
        # just remove the leading '1.'
        new_rev=${cvsrev#*.}
        old_rev=$((new_rev - 1))
        # diff the file between the previous and the current CVS revision
        cvs -q -d $CVSROOT rdiff -u -r1.$old_rev -r1.$new_rev \
            $CVSREPO/$file >> $PATCH_FILE
    done
}
# Generate a tla-compatible log file
function build_log()
{
BKCVS_REV=$1
rlog -N -r$BKCVS_REV $CVSROOT/ChangeSet,v | gawk ' \
BEGIN {
search = "-"
FS = "[ \t;]+"
}
search == "-" && /^----------------------------$/ {
search = "r"
next
}
search == "r" && /^revision [0-9\.]+$/ {
search = "d"
next
}
search == "d" && /^date:.*; author:.*$/ {
date = $2 " " $3
author = $5
summary = ""
search = "s"
next
}
search == "s" && /^.+$/ {
if (summary == "")
summary = $0
else
summary = summary " " $0
next
}
search == "s" && /^$/ {
changelog = ""
search = "l"
next
}
search == "l" && /^=============================================================================$/ {
print "Summary: " summary
print "Keywords: "
print "X-BKCVS-Date: " date
print "X-BKCVS-Author: " author
print "X-BKCVS-Rev: '$BKCVS_REV'"
print "X-BKrev: " bkrev
print changelog
exit
}
search == "l" && /^BKrev: .*$/ {
bkrev = $2
}
search == "l" {
changelog = changelog "\n" $0
next
}'
}
# Generate the temporary home directory. We use it for generating the
# author of the patch
rm -rf $BKCVS_HOME
mkdir -p $BKCVS_HOME
cp -R $HOME/.arch-params $BKCVS_HOME/.arch-params
if [ $DONT_RSYNC != y ]; then
# rsync with the kernel BKCVS repository
mkdir -p $CVSROOT/$CVSREPO
cvs -q -d $CVSROOT init
rsync -az --delete --exclude /BitKeeper/ --exclude /ChangeSet,v \
$BKCVS_RSYNC/ $CVSROOT/$CVSREPO
rsync -az --delete $BKCVS_RSYNC/ChangeSet,v $CVSROOT
# generate the full BKCVS log (only keep the filename, revision number
# and the cset number)
# Make sure ChangeSet,v is not in the repository since it doesn't
# follow the rules
echo "Generating the full BKCVS log"
cvs -q -d $CVSROOT rlog $CVSREPO | gawk ' \
BEGIN {
search = "f";
}
/^RCS file: .*,v$/ {
print
search = "r"
next
}
search == "r" && /^revision [0-9\.]+$/ {
print
search = "b"
next
}
search == "b" && /^\}?\(Logical change [0-9\.]+\)$/ {
print
search = "r"
next
}' \
| sed -e "s%^RCS file: $CVSROOT/$CVSREPO/\(.*\),v$%file \1%" \
-e "s/^}\?(Logical change \([0-9\.]\+\))$/bkcvsrev \1/" \
> $FULL_LOG
fi
# Generate revisions one by one ("1." is removed from start and end)
start=`cat $BKCVS_LAST_REV_FILE | sed -e "s/^1\.//"`
end=`rlog -N -r1.$start $CVSROOT/ChangeSet,v \
| grep -e "^head: 1\.[0-9]\+$" \
| sed -e "s/^head: 1\.\([0-9]\+\)$/\1/"`
# Check that the last recorded changeset is really applied to the tree by
# reverse-applying it in dry-run mode (is rsync atomic?)
BKCVS_REV=1.$start
PATCH_FILE=$PATCH_PREFIX-$BKCVS_REV
echo "Checking the last applied changeset"
echo "-- Building the $BKCVS_REV changeset"
build_cset $BKCVS_REV $PATCH_FILE
patch --dry-run -R -s -p1 < $PATCH_FILE
rm $PATCH_FILE
((start++))
if [ $((start > end)) == 1 ]; then
echo "No changesets to be applied"
exit 0
fi
# add the missing changesets
echo "Adding BKCVS changesets between 1.$start and 1.$end"
while [ $((start <= end)) == 1 ]; do
BKCVS_REV=1.$((start++))
PATCH_FILE=$PATCH_PREFIX-$BKCVS_REV
echo "-- Building the $BKCVS_REV changeset"
build_cset $BKCVS_REV $PATCH_FILE
TLA_LOG_FILE=`tla make-log`
build_log $BKCVS_REV > $TLA_LOG_FILE
echo -n " "; head -n1 $TLA_LOG_FILE | sed -e "s/^Summary: //"
# modify the local =id file (for tla my-id)
cat $TLA_LOG_FILE | grep -e "^X-BKCVS-Author: .*$" \
| sed -e "s/^X-BKCVS-Author: \([^ \t]*\)$/\1 <\1@invalid-address.com>/" \
> $TLA_MYID_FILE
# patch the source with the new changeset (-E is needed since cvs diff
# does not produce a proper timestamp)
patch -f -s -E -p1 < $PATCH_FILE
# Add ids for the new files
for i in `tla tree-lint -t`; do
find ./$i -not -path "*/.arch-ids*" -a -not -path "*/{arch}*" \
-a -not -path "*/,*" \
-exec tla add-id {} \;
done
# Remove the ids for the missing files
for i in `tla tree-lint -m`; do
rm $i
done
# Prune the empty directories (containing maybe only .arch-ids)
# We actually need to test them in the reverse order to cope with
# empty subdirs
unset STACK
for i in `find . -type d -not -path "*/.arch-ids*" \
-a -not -path "*/{arch}*" \
-a -not -path "*/,*"`; do
STACK="$i $STACK"
done
for i in $STACK; do
ls $i/* &> /dev/null || rm -rf $i
done
# the actual commit (HOME changed to preserve the original author)
HOME=$BKCVS_HOME tla commit
# cleanup
rm $PATCH_FILE
# remove the previous library revision (saves a lot of space)
tla library-remove $TLA_TREE_VERSION--`tla library-revisions | tail -n2 | head -n1`
echo "$BKCVS_REV" > $BKCVS_LAST_REV_FILE
done
echo
echo "-- Update completed successfully --"
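The build_cset function above diffs each file between two consecutive CVS revisions. The revision arithmetic can be tried in isolation with the sketch below; rdiff_range is a hypothetical name, and the sketch assumes every file revision has the form 1.N, as the script does.

```shell
# Hypothetical helper: given a CVS revision 1.N, print the previous and the
# current revision, i.e. the two -r arguments the script gives "cvs rdiff".
rdiff_range() {
    new=${1#*.}          # drop everything up to the first dot
    old=$((new - 1))
    echo "1.$old 1.$new"
}

rdiff_range 1.42
```

Note that a file first added at revision 1.1 yields 1.0 as the old revision. How cvs rdiff treats that case is not covered on this page and is worth checking before relying on the script for newly added files.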
