Comments (13)

rcls avatar rcls commented on September 8, 2024

Hi York,
My guess is that you are using the CVS "modules" file to map the subdirectories into software/AB+.
(That would be in /cvsroot/software/CVSROOT/modules).

Is that correct? Is the CVS repository publicly accessible?

I never wrote code to support the "modules" file in crap-clone, unfortunately.

If all the subdirectories are in one place, then you might get a good enough clone using the real server side directory:

crap-clone :pserver:[email protected]:2401/cvsroot/software project

Or you could remove the "CVSROOT/modules" file and clone the entire CVS repo:

crap-clone :pserver:[email protected]:2401/cvsroot/software .

(If you end up with the CVS repo cloned, but the directory layout wrong, then "git filter-branch" can be used to modify the git repo).

Let me know if this is any use.

There are a couple of other cvs-to-git converters around, you could see if they do better.

Cheers, Ralph.

from crap.

YorkZ avatar YorkZ commented on September 8, 2024

Hi Ralph Loader,

Thank you very much for your help.

> My guess is that you are using the CVS "modules" file to map the subdirectories into software/AB+.

In the file CVSROOT/modules, there is a line defining the module AB+:

AB+         -a  1-1 1-2 1-3 1-4 1-5 1-6 1-7 1-8 1-9 1-10 1-11 1-12 1-13 1-14 1-15 1-16 1-17 1-18 1-19 1-20

> Is the CVS repository publicly accessible?

Unfortunately, the repository is highly confidential, and nobody is allowed to disclose any of it publicly. I don't think I have access to the entire repository either.

> If all the subdirectories are in one place, then you might get a good enough clone using the real server side directory:
>
> crap-clone :pserver:[email protected]:2401/cvsroot/software project

I've tried to clone "project", but I got the following error:

File 99-35/.donotprune branch CDEFG duplicates branch CDE (1.1.1)
Killing zombie version PATH/TO/FILE1.cpp 1.15
Killing zombie version PATH/TO/FILE2.XML 1.7
Killing zombie version PATH/TO/FILE2.cpp 1.12
cvs: cvs [rlog aborted]: could not chdir to Network: Permission denied
Expected RCS file line, not error

> Or you could remove the "CVSROOT/modules" file and clone the entire CVS repo:
>
> crap-clone :pserver:[email protected]:2401/cvsroot/software .

I don't think I have the access needed to remove the "CVSROOT/modules" file on the server. I did try to clone the entire repo, though, and got this error:

RCS file name '/cvsroot/software/CVSROOT/avail,v' does not start with prefix
'/cvsroot/software/./'

> There are a couple of other cvs-to-git converters around, you could see if they do better.

So far, crap-clone is the only tool that has worked (partly) for me. I didn't bother with cvs2git because, as far as I know, it requires local access to the CVS repository, which I don't have. I've also tried git cvsimport:

$ git cvsimport AB+

But I got the error:

Initialized empty Git repository in ~/tmp/AB+/.git/
Unknown: E cvs checkout: `1-1/FILE.rpm.specs' is no longer in the repository at /usr/local/libexec/git-core/git-cvsimport line 511, <GEN0> line 26818.

However, I was able to clone project/1-1 using git-cvsimport:

$ git cvsimport project/1-1

But it was extremely slow.

I also tried cvsclone which gave me the "1.32 Segmentation fault" error.

I would appreciate any help or suggestions you could give me.

Thanks a lot

York


YorkZ avatar YorkZ commented on September 8, 2024

FYI

I just tried using another tool, cvsclone, to clone the entire CVS repo under "project" to my local drive. It seemed to work well until the point where it tried to enter the directory "project/Network". Here's the error it reported:

cvs rlog: Logging project/Network
cvs [rlog aborted]: could not chdir to Network: Permission denied exit: 1


rcls avatar rcls commented on September 8, 2024

It looks like you don't have permissions to that part of the repository on the server side. You have a couple of options to work around this, neither especially pleasant:

Clone each directory in project/ separately, and then use git to merge them all together. It should be possible using git-filter-branch and git-stitch-repo. You would have to work out the details yourself though.

Alternatively, you could modify crap-clone (or cvsclone) to change the 'cvs rlog' command it sends to the server; in my crap-clone.c this looks like:

cvs_printff (&stream,
             "Global_option -q\n"
             "Argument --\n"
             "Argument %s\n"
             "rlog\n", stream.module);

Instead of sending one 'Argument' line for the top level directory (stream.module), send one for each sub-directory that you are interested in. I haven't tried this, so you will need to experiment a bit to get it right.
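
One way to picture the change described above is the following minimal sketch in C, which assembles the request text with one 'Argument' line per sub-directory. The helper name build_rlog_request and the "module/subdir" path form are assumptions made for illustration; they are not taken from crap-clone, and the resulting request has not been tried against a real CVS server.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of crap-clone): build an 'rlog' request
 * with one "Argument" line per sub-directory instead of a single line
 * for the whole module.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if 'buf' is too small. */
int build_rlog_request(char *buf, size_t size, const char *module,
                       const char *const *dirs, size_t ndirs)
{
    assert(buf != NULL && module != NULL);

    /* Same leading lines as the original single-argument request. */
    int n = snprintf(buf, size, "Global_option -q\nArgument --\n");
    if (n < 0 || (size_t) n >= size)
        return -1;
    size_t used = (size_t) n;

    for (size_t i = 0; i < ndirs; ++i) {
        /* Assumption: the server accepts "module/subdir" paths here. */
        n = snprintf(buf + used, size - used, "Argument %s/%s\n",
                     module, dirs[i]);
        if (n < 0 || (size_t) n >= size - used)
            return -1;
        used += (size_t) n;
    }

    /* The command itself comes last, after all of its arguments. */
    n = snprintf(buf + used, size - used, "rlog\n");
    if (n < 0 || (size_t) n >= size - used)
        return -1;
    return (int) (used + (size_t) n);
}
```

The returned text could then be passed to whatever routine writes to the server connection, in place of the single-argument format string shown above.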


rcls avatar rcls commented on September 8, 2024

Actually, I wrote a quick hack to let you do this. Try the branch directory-limit from my repo. This adds a command-line option to list the directories you want to clone. So you should be able to do:

crap-clone -d 1-1 -d 1-2 -d 1-3 :pserver:[email protected]:2401/cvsroot/software project

and this will include the directories 1-1, 1-2, 1-3 but ignore Network.


YorkZ avatar YorkZ commented on September 8, 2024

This is super amazing Ralph. Thank you so much!

I'm currently cloning all the projects. It takes a while. I'll let you know the results tomorrow morning after I arrive at my office.

Thanks again!

York


YorkZ avatar YorkZ commented on September 8, 2024

Hi Ralph,

I seem to have successfully cloned all the projects within the module AB+ into a single git repository. However, I've noticed that each git commit contains only one file. In other words, a single CVS commit has been split into several git commits, one per file. I guess this is because CVS doesn't have the concept of a changeset, right? If so, I guess it would not be straightforward to reassemble the changesets so that each becomes one git commit, right?

Thanks,

York


rcls avatar rcls commented on September 8, 2024

Hi York,
I use heuristics based on the metadata to try to put things into changesets: if the commit message, author etc. are all identical, and the timestamps are not too far apart, then I combine the revisions into a changeset.
It sounds like this has gone wrong for you for some reason.
Without any access to your CVS repo, it is hard for me to track down. The things I would look at:

  • Does your CVS repo have any commit hooks making the commit message different per-file?
  • It is possible the use of CVS commitid is getting things wrong. I suspect that some CVS clients may commit files one-by-one, getting different commitids. I have created an (untested) branch no-commitid to ignore the commitid - see if that makes any difference.

If you want to debug this yourself, the relevant code is in changeset.c:

The create_changesets() function sorts the revisions, and then aggregates into changesets.

It uses the function strings_match(), which compares author / commitid / branch-name / log-message (and an internal flag). (It is intentional that the strings are compared by pointer: I keep only one copy of each unique string content.)
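
The heuristic described above can be sketched roughly as follows. This is a simplified illustration, not the code in changeset.c: the struct, the toy intern() table, the function names and the time window are all hypothetical, and only author and log message are compared here, whereas the real strings_match() also looks at commitid, branch name and an internal flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical, simplified revision record; crap's real structures differ. */
struct revision {
    const char *author;  /* interned: equal text implies equal pointer */
    const char *log;     /* interned log message */
    time_t when;         /* commit timestamp */
    int changeset;       /* output: index of the changeset assigned */
};

/* Toy intern table: one canonical pointer per distinct string content. */
#define MAX_INTERNED 64
static const char *interned[MAX_INTERNED];
static size_t n_interned;

const char *intern(const char *s)
{
    for (size_t i = 0; i < n_interned; ++i)
        if (strcmp(interned[i], s) == 0)
            return interned[i];
    assert(n_interned < MAX_INTERNED);
    char *copy = malloc(strlen(s) + 1);
    assert(copy != NULL);
    strcpy(copy, s);
    interned[n_interned++] = copy;
    return copy;
}

/* Order pointers by their integer value so qsort gets a total order. */
static int cmp_ptr(const void *a, const void *b)
{
    uintptr_t x = (uintptr_t) a, y = (uintptr_t) b;
    return (x > y) - (x < y);
}

/* Sort key: author, then log message, then timestamp. */
static int cmp_revision(const void *pa, const void *pb)
{
    const struct revision *a = pa, *b = pb;
    int c = cmp_ptr(a->author, b->author);
    if (c == 0)
        c = cmp_ptr(a->log, b->log);
    if (c == 0)
        c = (a->when > b->when) - (a->when < b->when);
    return c;
}

/* Sort the revisions, then start a new changeset whenever the metadata
 * changes or the gap to the previous revision exceeds 'window' seconds.
 * Returns the number of changesets created. */
int group_changesets(struct revision *revs, size_t n, double window)
{
    qsort(revs, n, sizeof *revs, cmp_revision);
    int count = 0;
    for (size_t i = 0; i < n; ++i) {
        int same = i > 0
            && revs[i].author == revs[i - 1].author  /* pointer compare */
            && revs[i].log == revs[i - 1].log
            && difftime(revs[i].when, revs[i - 1].when) <= window;
        if (!same)
            ++count;
        revs[i].changeset = count - 1;
    }
    return count;
}
```

Because every string goes through intern() first, two revisions with the same author text share one pointer, so the equality checks in group_changesets() need no strcmp().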


YorkZ avatar YorkZ commented on September 8, 2024

Hi Ralph,

I think you've done a really good job grouping things into changesets; it looks like you did it correctly! I checked a few cases carefully and noticed that, even though those CVS commits have the same commit message, their timestamps are quite different: the files really were committed separately, and they have different CVS commit IDs. In the cases where several files really were committed in one go, they have exactly the same timestamp and CVS commit ID, and you have put them into a single git commit! I was under the wrong impression because I thought the "Checkin Notice Emails" I receive whenever somebody commits something were automatically generated. It turns out they are not; those check-in notices are composed manually by the developers. I apologize for reporting a non-bug.

I'll ask for your help if I run into a new problem.

Thank you very much!

York


YorkZ avatar YorkZ commented on September 8, 2024

Hi Ralph,

In case one truly does commit multiple times with the same commit message, I think it's a good idea to combine the consecutive commits into one single git commit. Therefore, I tried your "no-commitid" branch which seems to work. Great job!

On the other hand, I wanted to make sure that:

  1. Only consecutive commits will be combined, right?
  2. Commits are combined only if they have no files in common, right? For example, if commit 1 includes the files "foo" and "bar", and commit 2 includes "foo" and "baz", these two commits will not be combined, right? I think combining them would lose part of the history of the file foo.

Thanks again,

York


rcls avatar rcls commented on September 8, 2024

Hi York,
That was a lucky guess! I shall merge the commit-id change to master.
Your understanding of how file-versions are combined into commits is correct.
The algorithm is as follows:

  1. Tentatively group file-versions into commits using branch / date / log-message / author. (this is in changeset.c).
  2. Attempt to put all the commits on each branch into a sequence compatible with the version numbers on each file. (It's a topological-sort of a digraph, see emission.c and heap.c).
  3. If that fails, then I break up commits until the previous step succeeds. (When the topo-sort fails, you can identify a cycle in the digraph, and then break the cycle by splitting a commit in two, see the function cycle_split() in emission.c).

The only case I've seen in real life where step 3 is necessary is where my first attempt at building the commits has two versions of the same file in one commit. Presumably what happened was that someone committed, fixed a problem immediately, and then committed the fix with the same log message.
In theory more complicated inconsistencies in the commit ordering can happen (and the code should cope), but I have not seen this.
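
Steps 2 and 3 can be illustrated with a small sketch of cycle detection via topological sort. This is not the code in emission.c or heap.c: it uses Kahn's algorithm over a plain adjacency matrix, and the struct and function names are hypothetical. It only reports that ordering failed; choosing which commit to split is left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_COMMITS 32

/* Hypothetical dependency graph: before[a][b] != 0 means commit a must be
 * emitted before commit b (for example, a holds foo 1.1 and b holds
 * foo 1.2). A commit holding two versions of the same file would get an
 * edge to itself. */
struct commit_graph {
    size_t n;
    unsigned char before[MAX_COMMITS][MAX_COMMITS];
};

/* Kahn's algorithm. Writes an emission order to 'order' and returns how
 * many commits could be ordered. A result smaller than g->n means the
 * remaining commits lie on or behind a cycle, so some commit would have
 * to be split before the sort can succeed. */
size_t topo_order(const struct commit_graph *g, size_t *order)
{
    assert(g != NULL && order != NULL && g->n <= MAX_COMMITS);

    /* Count, for each commit, how many commits must precede it. */
    size_t indegree[MAX_COMMITS] = { 0 };
    for (size_t a = 0; a < g->n; ++a)
        for (size_t b = 0; b < g->n; ++b)
            if (g->before[a][b])
                ++indegree[b];

    /* 'order' doubles as the work queue: [head, emitted) is pending. */
    size_t emitted = 0;
    for (size_t v = 0; v < g->n; ++v)
        if (indegree[v] == 0)
            order[emitted++] = v;

    for (size_t head = 0; head < emitted; ++head) {
        size_t a = order[head];
        /* Emitting 'a' releases every commit that was waiting only on it. */
        for (size_t b = 0; b < g->n; ++b)
            if (g->before[a][b] && --indegree[b] == 0)
                order[emitted++] = b;
    }
    return emitted;
}
```

In this sketch, a tentative commit that contains two versions of the same file shows up as a self-edge, so it is never emitted and the shortfall in the return value signals that a split is needed.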


YorkZ avatar YorkZ commented on September 8, 2024

Hi Ralph,

Thank you very much for your explanation. I just took a quick look into the file changeset.c, and your code looks neat and nice. Amazing job!

> I shall merge the commit-id change to master.

Definitely, in my opinion! Maybe add a command line switch for this? Also, don't forget the extremely useful "-d DIRECTORY" option!

I guess that after you merge the directory-limit branch, we can close this issue, because I'm sure it has been addressed by the new "-d DIRECTORY" option.

Thanks again,

York


YorkZ avatar YorkZ commented on September 8, 2024

Importing a CVS module into a single Git repository can now be achieved by passing all the directories defined by the CVS module on the command line, using the new "-d DIRECTORY" option.
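
For a module with many directories, the option list can be generated from the alias line in CVSROOT/modules rather than typed by hand. The following is a minimal sketch in C under several assumptions: the helper name alias_to_dir_options is hypothetical, only simple "-a" alias entries made of plain directory names are handled, and the names are assumed to be usable as "-d" arguments without further path adjustment.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn a CVSROOT/modules alias line such as
 * "AB+ -a 1-1 1-2 1-3" into "-d 1-1 -d 1-2 -d 1-3".
 * Returns the number of directories found, or -1 if the line is not an
 * alias ("-a") entry, is too long, or 'out' is too small. */
int alias_to_dir_options(const char *line, char *out, size_t size)
{
    assert(line != NULL && out != NULL && size > 0);

    /* strtok() modifies its input, so work on a private copy. */
    char copy[1024];
    if (strlen(line) >= sizeof copy)
        return -1;
    strcpy(copy, line);

    const char *sep = " \t\r\n";
    char *name = strtok(copy, sep);   /* module name, e.g. "AB+" */
    char *flag = strtok(NULL, sep);   /* must be "-a" for an alias */
    if (name == NULL || flag == NULL || strcmp(flag, "-a") != 0)
        return -1;

    int count = 0;
    size_t used = 0;
    out[0] = '\0';
    for (char *dir = strtok(NULL, sep); dir != NULL;
         dir = strtok(NULL, sep)) {
        /* Separate options with a single space, none before the first. */
        int n = snprintf(out + used, size - used, "%s-d %s",
                         count ? " " : "", dir);
        if (n < 0 || (size_t) n >= size - used)
            return -1;
        used += (size_t) n;
        ++count;
    }
    return count;
}
```

The resulting string can be placed between crap-clone and the repository arguments, in the same position as the "-d" options in the earlier example command.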

