How-to:
Problems:
Getting help:
If you need to convert certain CVS modules (in one large repository) to Subversion now and other modules later, you may want to convert your repository one module at a time. This situation is typically encountered in large organizations where each project has a separate lifecycle and schedule, and a one-step conversion process is not practical.
First you have to decide whether you want to put your converted projects into a single Subversion repository or multiple ones. This decision mostly depends on the degree of coupling between the projects and is beyond the scope of this FAQ. See the Subversion book for a discussion of repository organization.
If you decide to convert your projects into separate Subversion repositories, then please follow the instructions in How can I convert part of a CVS repository? once for each repository.
If you decide to put more than one CVS project into a single Subversion repository, then please follow the instructions in How can I convert separate projects in my CVS repository into a single Subversion repository?.
This is easy: simply run cvs2svn normally, passing it the path of the project subdirectory within the CVS repository. Since cvs2svn ignores any files outside of the path it is given, other projects within the CVS repository will be excluded from the conversion.
Example: You have a CVS repository at path /path/cvsrepo with projects in subdirectories /path/cvsrepo/foo and /path/cvsrepo/bar, and you want to create a new Subversion repository at /path/foo-svn that includes only the foo project:
$ cvs2svn -s /path/foo-svn /path/cvsrepo/foo
cvs2svn supports multiproject conversions, but you have to use the options file method to start the conversion. In your options file, you simply call ctx.add_project() once for each sub-project in your repository. For example, if your CVS repository has the layout:
/project_a
/project_b
and you want your Subversion repository to be laid out like this:
project_a/
  trunk/
    ...
  branches/
    ...
  tags/
    ...
project_b/
  trunk/
    ...
  branches/
    ...
  tags/
    ...
then you need to have a section like this in your options file:
ctx.add_project(
    Project(
        'my/cvsrepo/project_a',
        'project_a/trunk',
        'project_a/branches',
        'project_a/tags',
        symbol_transforms=[
            #...whatever...
            ],
        )
    )
ctx.add_project(
    Project(
        'my/cvsrepo/project_b',
        'project_b/trunk',
        'project_b/branches',
        'project_b/tags',
        symbol_transforms=[
            #...whatever...
            ],
        )
    )
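When many sub-projects share the same layout, the repeated add_project() calls can be generated in a loop, since the options file is Python code. The following is only a sketch: the helper name project_paths and the values of CVS_ROOT and PROJECT_NAMES are hypothetical, and the real ctx.add_project(Project(...)) call is shown as a comment because ctx and Project are only available inside a cvs2svn options file.

```python
import posixpath

# Hypothetical values for illustration; substitute your own.
CVS_ROOT = 'my/cvsrepo'
PROJECT_NAMES = ['project_a', 'project_b']


def project_paths(cvs_root, name):
    """Return (cvs_path, trunk, branches, tags) for one sub-project."""
    return (
        posixpath.join(cvs_root, name),
        posixpath.join(name, 'trunk'),
        posixpath.join(name, 'branches'),
        posixpath.join(name, 'tags'),
    )


for name in PROJECT_NAMES:
    cvs_path, trunk, branches, tags = project_paths(CVS_ROOT, name)
    # Inside a real options file you would register the project here, e.g.:
    #   ctx.add_project(Project(cvs_path, trunk, branches, tags,
    #                           symbol_transforms=[...]))
    print(cvs_path, trunk, branches, tags)
```

Keeping the path construction in one helper means that a change of layout (for example, a common prefix for all projects) only has to be made in one place.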
If foo is the only project that you want to convert, then either run cvs2svn like this:
$ cvs2svn --trunk=foo/trunk --branches=foo/branches --tags=foo/tags CVSREPO/foo
or use an options file that defines a project like this:
ctx.add_project(
    Project(
        'my/cvsrepo/foo',
        'foo/trunk',
        'foo/branches',
        'foo/tags',
        symbol_transforms=[
            #...whatever...
            ],
        )
    )
If foo is not the only project that you want to convert, then you need to do a multiproject conversion; see How can I convert separate projects in my CVS repository into a single Subversion repository? for more information.
This is an example of how the cvs2svn conversion can be customized using Python.
Suppose you want to write symbol transform rules that are more complicated than "replace REGEXP with PATTERN". This can easily be done by adding just a little bit of Python code to your options file.
When a symbol is encountered, cvs2svn iterates through the list of SymbolTransform objects defined for the project. For each one, it calls symbol_transform.transform(cvs_file, symbol_name). That method can return any legal symbol name, which will be used in the conversion instead of the original name.
To use this feature, you will have to use an options file to start the conversion. You then write a new SymbolTransform class that inherits from RegexpSymbolTransform but checks the path before deciding whether to transform the symbol. Add the following to your options file:
from cvs2svn_lib.symbol_transform import RegexpSymbolTransform

class MySymbolTransform(RegexpSymbolTransform):
    def __init__(self, path, pattern, replacement):
        """Transform only symbols that occur within the specified PATH."""

        self.path = path
        RegexpSymbolTransform.__init__(self, pattern, replacement)

    def transform(self, cvs_file, symbol_name):
        # Is the file within the path we are interested in?
        if cvs_file.cvs_path.startswith(self.path + '/'):
            # Yes -> Allow RegexpSymbolTransform to transform the symbol:
            return RegexpSymbolTransform.transform(
                self, cvs_file, symbol_name)
        else:
            # No -> Return the symbol unchanged:
            return symbol_name

# Note that we use a Python loop to fill the list of symbol_transforms:
symbol_transforms = []
for subdir in ['project1', 'project2', 'project3']:
    symbol_transforms.append(
        MySymbolTransform(
            subdir,
            r'^release-(\d+)_(\d+)$',
            r'%s-release-\1.\2' % subdir))

# Now register the project, using our own symbol transforms:
ctx.add_project(
    Project(
        'your_cvs_path', 'trunk', 'branches', 'tags',
        symbol_transforms=symbol_transforms))
This example causes any symbol under "project1" that looks like "release-3_12" to be transformed into a symbol named "project1-release-3.12", whereas if the same symbol appears under "project2" it will be transformed into "project2-release-3.12".
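The path check and the regular-expression rewrite can be tried outside cvs2svn before committing to a full conversion. The sketch below is a stand-in that does not import cvs2svn_lib: the function transform_symbol and the file name src/main.c are hypothetical, and it assumes that the regexp transform behaves like Python's re.sub with the same pattern and replacement.

```python
import re


def transform_symbol(cvs_path, symbol_name, subdir, pattern, replacement):
    """Rename symbol_name only if cvs_path lies under subdir."""
    if cvs_path.startswith(subdir + '/'):
        # Inside the subdirectory: apply the regexp substitution.
        return re.sub(pattern, replacement, symbol_name)
    # Outside the subdirectory: leave the symbol unchanged.
    return symbol_name


PATTERN = r'^release-(\d+)_(\d+)$'

for subdir in ['project1', 'project2']:
    replacement = r'%s-release-\1.\2' % subdir
    new_name = transform_symbol(
        subdir + '/src/main.c', 'release-3_12', subdir, PATTERN, replacement)
    print(new_name)
```

Running it prints project1-release-3.12 followed by project2-release-3.12; a symbol on a file outside the given subdirectory is returned unchanged.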
CVSNT is a version control system that started out by adding support for running CVS under Windows NT. Since then it has made numerous extensions to the RCS file format, to the point where CVS compatibility does not imply CVSNT compatibility with any degree of certainty.
cvs2svn might happen to successfully convert a CVSNT repository, especially if the repository has never had any CVSNT-only features used on it, but this use is not supported and should not be expected to work.
If you want to experiment with converting a CVSNT repository, then please consider the following suggestions:
Patches to support the conversion of CVSNT repositories would, of course, be welcome.
Background: Normally, if you have a file called path/file.txt in your project, CVS stores its history in a file called repo/path/file.txt,v. But if file.txt is deleted on the main line of development, CVS moves its history file to a special Attic subdirectory: repo/path/Attic/file.txt,v. (If the file is recreated, then it is moved back out of the Attic subdirectory.) Your repository should never contain both of these files at the same time.
This cvs2svn error message thus indicates a mild form of corruption in your CVS repository. The file has two conflicting histories, and even CVS does not know the correct history of path/file.txt. The corruption was probably created by using tools other than CVS to back up or manipulate the files in your repository. With a little work you can learn more about the two histories by viewing each of the file.txt,v files in a text editor.
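To find every affected file before starting a conversion, you can scan a copy of the repository for history files that exist both inside and outside an Attic directory. This is a minimal sketch, not part of cvs2svn; the function name find_attic_conflicts is hypothetical, and it only reads the directory tree.

```python
import os


def find_attic_conflicts(repo_root):
    """Return sorted (live_path, attic_path) pairs that exist simultaneously."""
    conflicts = []
    for dirpath, dirnames, filenames in os.walk(repo_root):
        # Only Attic directories can hold the second copy of a history file.
        if os.path.basename(dirpath) != 'Attic':
            continue
        parent = os.path.dirname(dirpath)
        for filename in filenames:
            if not filename.endswith(',v'):
                continue
            live_path = os.path.join(parent, filename)
            if os.path.isfile(live_path):
                conflicts.append((live_path, os.path.join(dirpath, filename)))
    return sorted(conflicts)
```

Calling find_attic_conflicts('/path/to/copy/of/repo') returns an empty list for a healthy repository; each pair it does return names the two files whose histories need to be compared.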
There are four straightforward approaches to fixing the repository corruption, but each has potential disadvantages. Remember to make a backup before starting. Never run cvs2svn on a live CVS repository--always work on a copy of your repository.
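Remove the Attic version of the file:

# You did make a backup, right?
$ rm repo/path/Attic/file.txt,v

Remove the non-Attic version of the file:

# You did make a backup, right?
$ rm repo/path/file.txt,v

Rename the Attic version of the file:

# You did make a backup, right?
$ mv repo/path/Attic/file.txt,v repo/path/Attic/file-from-Attic.txt,v

Rename the non-Attic version of the file:

# You did make a backup, right?
$ mv repo/path/file.txt,v repo/path/file-not-from-Attic.txt,v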
If you run cvs2svn on a case-insensitive operating system, it is possible to get this error even if the name of the file in the Attic differs in case from the name of the file outside of the Attic. This could happen, for example, if the CVS repository was served from a case-sensitive operating system at some time. A workaround for this problem is to copy the CVS repository to a case-sensitive operating system and convert it there.
By default, cvs2svn uses the "co" program from RCS to read the contents of files in your archive. (See the requirements section of the documentation.) The solution to this problem is either to install RCS, or to ensure that CVS is installed and use cvs2svn's --use-cvs option.
There are several sources of help for cvs2svn:
cvs2svn is an open source project that is largely developed and supported by volunteers in their free time. Therefore please try to help out by reporting bugs in a way that will enable us to help you efficiently.
The first question is whether the problem you are experiencing is caused by a cvs2svn bug at all. A large fraction of reported "bugs" are caused by problems with the user's CVS repository, especially trying to convert a CVSNT repository with cvs2svn. Please also double-check the manual to be sure that you are using the command-line options correctly.
A good way to locate potential repository corruption is to use the shrink_test_case.py script (which is located in the contrib directory of the cvs2svn source tree). This script tries to find the minimum subset of files in your repository that still shows the same problem. Warning: Only apply this script to a backup copy of your repository, as it destroys the repository that it operates on! Often this script can narrow the problem down to a single file which, as often as not, is corrupt in some way. Even if the problem is not in your repository, the shrunk-down test case will be useful for reporting the bug. Please see "How can I produce a useful test case?" and the comments at the top of shrink_test_case.py for information about how to use this script.
Assuming that you still think you have found a bug, the next step is to investigate whether the bug is already known. Please look through the issue tracker for bugs that sound familiar. If the bug is already known, then there is no need to report it (though possibly you could contribute a useful test case or a workaround).
If your bug seems new, then the best thing to do is report it via email to the dev@cvs2svn.tigris.org mailing list. Be sure to include the following information in your message:
If you need to report a bug, it is extremely helpful if you can include a test repository with your bug report. In most cases, if we cannot reproduce the problem, there is nothing we can do to help you. This section describes ways to overcome the most common problems that people have in producing a useful test case. When you have a reasonable-sized test case (say under 1 MB--the smaller the better), you can just tar it up and attach it to the email in which you report the bug.
You don't want to send us your proprietary information, and we don't want to receive it either. Short of open-sourcing your software, here is a way to strip out most of the proprietary information and simultaneously reduce the size of the archive tremendously.
The destroy_repository.py script tries to delete as much information as possible out of your repository while still preserving its basic structure (and therefore hopefully any cvs2svn bugs). Specifically, it tries to delete all file descriptions and text content, all nontrivial log messages, and all author names. (It does not affect the directory and file names or the number and dates of revisions to those files.)
# You did make a backup, right?
$ /path/to/contrib/destroy_repository.py /path/to/copy/of/repo
This step is a tiny bit more work, so if your repository is already small enough to send you can skip this step. But this step helps narrow down the problem (maybe even point you to a corrupt file in your repository!) so it is still recommended.
The shrink_test_case.py script tries to delete as many files and directories from your repository as possible while preserving the cvs2svn bug. To use this command, you need to write a little test script that tries to convert your repository and checks whether the bug is still present. The script should exit successfully (e.g., "exit 0") if the bug is still present, and fail (e.g., "exit 1") if the bug has disappeared. The form of the test script depends on the bug that you saw, but it can be as simple as something like this:
#! /bin/sh
cvs2svn --dry-run /path/to/copy/of/repo 2>&1 | grep -q 'KeyError'
If the bug is more subtle, then the test script obviously needs to be more involved.
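For such cases it may be easier to write the test script in Python, where the check on the output can be arbitrarily detailed. The following sketch mirrors the shell example; it assumes that shrink_test_case.py accepts any executable as its test script, and the names REPO_COPY, BUG_MARKER, bug_still_present, and run_conversion are hypothetical.

```python
#!/usr/bin/env python3
import subprocess
import sys

# Hypothetical values for illustration; substitute your own.
REPO_COPY = '/path/to/copy/of/repo'
BUG_MARKER = 'KeyError'


def bug_still_present(output, marker=BUG_MARKER):
    """Decide from the conversion output whether the bug was reproduced."""
    return marker in output


def run_conversion(repo=REPO_COPY):
    """Run a dry-run conversion and return its combined stdout and stderr."""
    result = subprocess.run(
        ['cvs2svn', '--dry-run', repo],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


def main():
    # Exit status 0 means "bug still present"; 1 means "bug has disappeared".
    return 0 if bug_still_present(run_conversion()) else 1
```

To use the file as a test script, add sys.exit(main()) as its last line and make it executable. A more subtle bug can then be detected by replacing the body of bug_still_present with whatever check identifies it.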
Once the test script is ready, you can shrink your repository via the following steps:
# You did make a backup, right?
$ /path/to/contrib/shrink_test_case.py /path/to/copy/of/repo testscript.sh

where testscript.sh is the name of the test script described above. This script will execute testscript.sh many times, each time using a subset of the original repository.