rtcto / rtc2git

A tool made for migrating code from an existing IBM's RTC SCM repository into a Git repository

Home Page: https://rtc.to

License: MIT License

Python 100.00%

rtc scm python git cli rational-team-concert migration migration-tool

rtc2git's Issues

No repositories found based on path

The script is executing! Yay! Unfortunately, I'm seeing some errors with the lcsm commands, and the changes aren't being put in my git repo:

I've tried running with and without useProvidedHistory, and I get the same errors either way. I've also tried running with the stream's name instead of UUID. Any ideas on how to fix?

Fails to handle Workspace with spaces in name

Due to the parameter "self.workspace" being put straight into the command line, if the workspace has any spaces in the name, any command using the workspace name will fail.
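
One way to avoid this is to quote the workspace name before it goes into the command string. A minimal sketch, assuming the command is built as a single string for a POSIX shell; the function name is hypothetical and the command shape is taken from the "lscm list workspaces" example elsewhere on this page:

```python
import shlex


def list_workspaces_command(workspace, repo_url, scm="lscm"):
    """Build a 'list workspaces' command line with shell-safe quoting."""
    # shlex.quote targets POSIX shells; Windows cmd.exe has different rules.
    return "{} list workspaces -n {} -r {}".format(
        scm, shlex.quote(workspace), shlex.quote(repo_url)
    )
```

With a name such as "Migration Workspace", the name stays a single argument when the shell splits the command.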

Start migration from a specific baseline

As a dedicated migration person, I want to start my migration on a specific (old) baseline in order to get the complete history.

To achieve that, the baselines of the workspace's components need to be replaced.

Invalid Syntax error

When I run migration.py, I'm getting a SyntaxError.

I'm running Python 2.7.6 on Windows. Any ideas on how to fix this?

Add configuration to choose between lscm and scm

I often experience that lscm hangs whereas scm always works. It would be nice to add a configuration value for this.

Make migrated branch to master

Currently the master branch contains only the initial commit. This can be confusing for users who clone the migrated repository for the first time.

The following commands would turn a specific branch into the new master:
git branch -m master initialCommit (rename master branch locally to something else)
git branch -m myStreamBranch master (rename branch where migration took place to master)
git push -f origin master (push the current master and override the existing one in the repo aka .git folder)

When Git commits are missed, the files are not included with the next commit

Here is what I know:
Stream A and Stream B have common history.
I migrated Stream A. The end result of File 1 in Git matched the end result in RTC, so all was well.
I migrated the portion of Stream B from the branching point forward. Then I rebased the history so that Stream A had a complete set of history. The end result of File 1 in Git did not match the end result in RTC.

It turns out that some of the Git commits were missed in the migration of Stream A prior to the branch point. I expected that any code changes that were not migrated as part of a Git commit (for example, if the script stopped because of a merge conflict and then was manually restarted) would be included in the very next Git commit. This would result in the code changes being associated with the wrong comment, but I was ok with that if it only happened occasionally. Upon investigation, that is not what actually happens. The code changes that are missed do not go into the very next Git commit--they go into the next Git commit that touches that file. In my case, the missed Git commit happened before the branching point. The catchup of the missed code change happened after the branching point. Unfortunately, in my case, this means that Stream B never got this code change--resulting in an incorrect file.

I'm going to think this through some more...would committing file changes whenever the script stops fix the problem?

Switch branch aborted

After completing one stream, the script attempted to switch branches, but the following was displayed:

However, the script continued on with the lscm commands. Does this mean that the changes the script is accepting and committing are actually being committed into the previous branch?

Also, any ideas on what caused this problem and how to avoid it?

Ability to accept multiple (more than two) change sets together to avoid conflicts

When migrating I experience that the script gives up on conflicting change sets if there is a need to accept more than two together.
If there is one change set followed by a "merge" change set everything works fine. But in the code base I am working on sometimes accepting e.g. five change sets together is required to avoid conflicts. I propose retryacceptincludingnextchangeset be made into a loop instead of just trying the next change set. This way it would continue discarding/accepting until it succeeds (or the author differs/comment does not contain the word merge).
Also to support unattended migrations adding an option to automatically attempt multiple accepts would be nice.
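
The proposed loop could look roughly like this. It is only a sketch: try_accept is a hypothetical callable that attempts to accept a batch of change sets together and discards them again on failure, and max_extra is an assumed safety limit, not an existing option.

```python
def accept_with_following(changes, start, try_accept, max_extra=10):
    """Grow the batch one change set at a time until accepting succeeds.

    Returns the number of change sets accepted together, or 0 when giving up.
    """
    for extra in range(max_extra + 1):
        batch = list(changes[start:start + 1 + extra])
        if try_accept(batch):
            return len(batch)
        if start + 1 + extra >= len(changes):
            break  # no further change sets left to add to the batch
    return 0
```

The author/"merge" comment check mentioned above could be added as an extra stop condition inside the loop.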

Default value for gitreponame and workspacename

Currently you always need to specify a git reponame (like migration.git) and a workspacename for the rtc workspace (Migration_Workspace).

To improve usability, there should be a default value for both.
The question should be "Do you want to recreate the workspace (name)? [Y/n]"
The default is Yes; answering N reuses the existing workspace.

Workspaces can be listed with the following command:
lscm list workspaces -n "NAME" -r "URL"
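
The "[Y/n]" handling with Yes as the default could be sketched like this (the function name is hypothetical; the answer would come from input() in the real script):

```python
def wants_recreate(answer):
    """Interpret a [Y/n] answer; anything except an explicit 'n'/'no' means yes."""
    return answer.strip().lower() not in ("n", "no")
```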

Make encoding configurable

The encoding should be configurable. If nothing is configured, the default encoding should be used (encoding = None).

This issue is resolved when the encoding is configurable in the config and a wiki-entry has been made about how to configure the encoding properly with the magic.properties

See discussion #26 (comment)

Start migration from an existing prepared workspace vs newly created workspace

So far I have always used an existing, prepared workspace for the migration (in order to have less code to migrate, so that I can test the code faster and solve bugs like #7).
With this issue it should no longer matter whether you prepared a workspace yourself or let the migration create one. The migration should handle both situations well.

Probably #6 needs to be solved first.

Merge created branches into master branch

Currently branches get created and pushed.
However, they never get merged into the master branch.

This should be done at the end of the migration.

initialCommit should only take place when something has been changed

There should be two steps for the initial commit. One commit should be just the adding of gitignore.
The second commit should only happen if loading the workspace creates files (e.g. detected with a git diff).

Commit comments fail when comment contains ""

Sample-Comment: Im doing some "strange" changes
The git command will fail due to missing escaping of the double quotes in the comment.
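
One way to sidestep the escaping entirely is to pass the comment as its own argument instead of embedding it in a shell string. A minimal sketch, assuming the commit is run through subprocess without shell=True; the function name is hypothetical:

```python
import subprocess


def build_commit_command(comment):
    """Return the git commit command as an argument list (no shell quoting needed)."""
    return ["git", "commit", "-m", comment]


def commit(comment, repo_dir):
    # With an argument list and no shell, quotes in the comment are passed through as-is.
    subprocess.run(build_commit_command(comment), cwd=repo_dir, check=True)
```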

Ignore binary files

To prevent repository bloat, migrators want to ignore binary files (if so configured).
The configuration should at least list some file types to ignore, such as .zip or .jar.
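
A possible shape for such a check, as a sketch only: the extension set stands in for a configuration value that does not exist yet, and the function name is hypothetical.

```python
import os

# Assumed default; in the tool this would come from the configuration.
IGNORED_EXTENSIONS = {".zip", ".jar"}


def is_ignored(path, ignored=IGNORED_EXTENSIONS):
    """Return True if the file extension (case-insensitive) is on the ignore list."""
    return os.path.splitext(path)[1].lower() in ignored
```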

Some background ...

Reasons to avoid big repositories:
http://blogs.atlassian.com/2014/09/ci-git-repos/

Tips for handling big files (if they cannot be avoided):
http://blogs.atlassian.com/2014/05/handle-big-repositories-git/

The OldestStream option in the configuration file

Hi, thank you for your guide.

I am confused about why there is an OldestStream option.

I want to migrate ALL history from a stream to git. Think about this case: The project only has one stream and all changesets are going to this stream.

So, in my understanding, the steps should be:

create a new git repo
list all changesets
loop the changesets in an order from old to new
    checkout the changeset at a time
    copy and commit to git repo
end loop
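
Written out in Python, the steps above would be roughly the following. This is a sketch of the idea in this question, not of how rtc2git works; accept and commit are hypothetical callables wrapping the RTC and git commands.

```python
def migrate_changesets(changesets, accept, commit):
    """Accept each change set in order and commit it to git right away."""
    migrated = 0
    for changeset in changesets:  # assumed to be ordered from oldest to newest
        accept(changeset)   # bring the change set into the workspace
        commit(changeset)   # copy/commit the resulting state to the git repo
        migrated += 1
    return migrated
```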

Thanks.

Mirror Snapshots in RTC to Tags in GIT

It would be really valuable if snapshots in RTC were represented by tags in the corresponding Git history.
You'd need to read the list of snapshots of the stream since the workspace doesn't show them, but it should be possible to match the baselines of the components to the snapshots from the stream.

test_getSampleConfig_ExpectInitializedConfigWithDefaultValues fails on linux

For some reason the sample config can't be loaded properly on Linux.

It could be a problem similar to the one reported here (the same stack, just with a KeyError for General).

Could one of you have a look at it if you have some spare time? @ohumbel, @romixch or @reinhapa

.jazzignore -> .gitignore

At the moment you have to convert your .jazzignore files by hand.
The tool prompts you to do so at the end of the migration.

The first step would be to do this automatically.
And the final version should also track those files during the migration.

My branch jazzignore is intended to hold the implementation
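
A first automatic step could look roughly like the sketch below. It assumes the usual .jazzignore layout (core.ignore and core.ignore.recursive properties whose patterns are wrapped in braces, with backslash line continuations). The function name and the mapping of non-recursive patterns to root-anchored .gitignore patterns are assumptions, not the implementation on the branch.

```python
import re


def jazzignore_to_gitignore(text):
    """Translate the contents of a .jazzignore file into .gitignore lines."""
    # Join property values that continue on the next line with a backslash.
    joined = re.sub(r"\\\s*\n", " ", text)
    lines = []
    for line in joined.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        patterns = re.findall(r"\{([^}]*)\}", value)
        if key.strip() == "core.ignore.recursive":
            lines.extend(patterns)  # applies in this directory and below
        elif key.strip() == "core.ignore":
            lines.extend("/" + p for p in patterns)  # this directory only
    return lines
```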

Link commits with Issue-System (Jira/Github/BitBucket)

If we migrate an RTC project with work items and source code using rtc2jira and rtc2git, we would like to maintain the existing connections between commits and work items.

In RTC it was possible to assign work items to change sets, and we should keep these connections.

If a commit message in Git follows a certain pattern (NUMBER: WorkitemDescription), we can add a prefix so that the commit targets the new system.

Because we keep the same numbers in both systems, a prefix is sufficient and there is no need for a conversion table (old number to new number).
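
The prefixing itself is a small regular-expression substitution. A sketch, where the function name and the "PROJ-" prefix are made-up examples:

```python
import re


def add_issue_prefix(message, prefix):
    """Prefix a leading 'NUMBER:' in a commit message, e.g. '1234:' -> 'PROJ-1234:'."""
    return re.sub(r"^(\d+):", lambda m: "{}{}:".format(prefix, m.group(1)), message)
```

Messages that do not start with a work item number are returned unchanged.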

Decide if migration should be resumed or is started

At the moment, when something goes wrong or you just want to start a migration on an existing Git repo, you need to change code in order to resume the migration (see the wiki entry).

I think the script should detect whether it should resume or not.

Resuming script starts the script at the beginning

My script had been running for quite a while and had successfully created 3 git branches to match 3 of my 5 streams. As I was watching the script's output, I saw it try to get the list of changeentries, and it choked with an out of bounds type exception on splittledlines[somenumber]. (I didn't get a screenshot with the details). I restarted the script, and I noticed that it checked out the first branch in the stream list (not the branch it had been working on) and listed that it was getting 286 changes. I opened the workspace in RTC and confirmed that it was accepting changes in the very first stream listed in config.ini. It's as if the script did not resume where it had left off and instead started from the beginning.

The script then got stuck on a merge conflict, so I stopped it, resolved the conflict in RTC, and restarted the script. This time I saw something about getting changes 1/274. It's like the script resumed where it had left off in this case.

In what cases does the script start from the beginning and in what cases does the script resume where it left off?

"<No comment>" comment is translated to "<No comment to"

It seems that >, -> and --> are all replaced with "to" when translating the commit message from RTC to Git. Is this really necessary? I don't see why > should be disallowed in a Git commit message.

Renamed file not renamed during migration

I'm going through and verifying the post-migration content matches what is in RTC. I have a file that I'm assuming was renamed at some point during its history as the file begins with a lowercase in the migrated Git repo but does not in the RTC repo. The content of the files is identical. I'm wondering if the Git commands used in the script did not take into account the file rename? Does that ring any bells with you?

Running migration a 2nd time on an already migrated repository starts from scratch

The situation was as follows:

Complete migration of a stream worked flawlessly
Couple days later, we wanted to keep up with the stream changes and ran the migration again
For some reason, the migration didn't continue and started from zero (the beginning) instead

Analysis
We figured out that the workspace had been reset to its oldest state.

Cause
The reason was that the only baseline on this stream didn't contain anything (Initial Baseline).
This means that setting the components to the baseline during the migration resets the whole workspace, and with it the already migrated state.

The 2nd run of the migration triggered a reset of the components to the baseline.

Solution
Luckily we can detect this situation. If the part up to the baseline creation has already been migrated, there are no change sets left to accept.

Therefore, if there are 0 change sets to accept up to the branch point, we don't do anything there and continue by comparing the workspace with the stream.

@romixch : You can link your commit/fix to this issue
@romixch @ohumbel : I created this issue for documentation purpose

Provide configuration with fallback-values

Every time we add a new option to the config file, we need to adjust our config files, even if we don't need the new option.

To improve that, we should define fallback values for most of the options (where it makes sense).

Current implementation:
scmcommand = generalsection['ScmCommand']
Implementation with fallback values:
scmcommand = generalsection.get('ScmCommand', "lscm")
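
For reference, a self-contained version of the fallback lookup using the standard configparser module. The helper name and the sample config text are made up for illustration; only the General section and the ScmCommand option come from this issue.

```python
import configparser


def read_scm_command(config_text):
    """Return the configured ScmCommand, falling back to 'lscm' if it is missing."""
    config = configparser.ConfigParser()
    config.read_string(config_text)
    generalsection = config["General"]
    # The second positional argument of SectionProxy.get is the fallback value.
    return generalsection.get("ScmCommand", "lscm")
```

An existing config file without the new option then keeps working unchanged.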

Describe how to configure scm

To be able to run the lscm/scm tools provided by RTC, you need to have the environment variable JAVA_HOME set and add the -vm parameter in scm.ini.

We should describe that in the wiki and link it from the readme.

@romixch Your part? 👯 😄

Collecting change sets to accept together should be from the same component

When collecting change sets to accept together in case of merge conflicts, only change sets from the same component should be accepted together.

Replace UseProvidedHistory

Thanks to a Stack Overflow user, I found out that there is a way to access all change sets of a component.

It's possible to use the command "lscm list changesets".

With this, the config flag UseProvidedHistory (https://github.com/WtfJoke/rtc2git/wiki/Getting-your-History-Files) is probably no longer necessary and should be replaced completely by the command above.

Fixing this issue will save users the considerable manual work of providing the history files.

Ability to use -i when loading a workspace

I have a workspace with multiple components having the same root directory names. So to avoid conflicts I need to specify -i when loading the workspace.

I propose this be made configurable (or maybe just always use -i when loading).

Accept changes by date instead of component

As a dedicated migration person, I want the changes to be accepted/committed in the same date/time order in which they were committed in RTC, so that they stay consistent with Git's internal dates.

Currently all changes of one component get accepted (ordered by date within the component). After that, the script moves to the next component and repeats the process.

With this issue, the behaviour should change so that changes are accepted sorted by date only, independent of any component.
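
Since the changes are already ordered by date within each component, the global order can be produced by merging the per-component lists. A sketch with assumed data shapes (a dict mapping component name to a list of (date, revision) tuples); the function name is hypothetical:

```python
import heapq


def merge_by_date(entries_per_component):
    """Merge per-component change lists (each sorted by date) into one date-ordered list."""
    tagged = [
        [(date, component, revision) for date, revision in entries]
        for component, entries in entries_per_component.items()
    ]
    # heapq.merge relies on every input list already being sorted by the key.
    return list(heapq.merge(*tagged, key=lambda entry: entry[0]))
```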

Log everything to a file

It would be nice if everything was logged to a file - not only the accept messages.
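
With the standard logging module this could be set up in one place. A minimal sketch; the logger name "migration", the log format and the function name are assumptions:

```python
import logging


def setup_file_logging(path):
    """Create a logger that writes every message (DEBUG and above) to a file."""
    logger = logging.getLogger("migration")
    logger.setLevel(logging.DEBUG)
    handler = logging.FileHandler(path, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

Command output and status messages would then go through logger.debug/logger.info instead of being printed only to the console.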

Migration fails on Windows when using long paths

Migration on Windows will fail if you have too long paths. The problem is in handle_captitalization_filename_changes where an os.chdir is executed. This will fail if the path is too long:

FileNotFoundError: [WinError 206] The filename or extension is too long

Instead of doing an os.chdir maybe call git ls-files with the folder as argument. Git can be configured to work with long paths like this: git config --system core.longpaths true (this works for me at least).
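
The suggestion could be sketched as follows: run git from the repository root and pass the folder as a path argument, so no os.chdir into the deep directory is needed. The function names are hypothetical and this has not been tested against the long-path case.

```python
import subprocess


def ls_files_command(folder):
    """Build a 'git ls-files' call limited to one folder, given relative to the repo root."""
    return ["git", "ls-files", "--", folder]


def list_tracked_files(repo_root, folder):
    # Runs in the repository root instead of changing into the (possibly very long) folder path.
    result = subprocess.run(
        ls_files_command(folder),
        cwd=repo_root,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.splitlines()
```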

Handling of changeset-comments with linebreaks

A user reported that problems occur when comments contain line breaks (see #20 (comment)).

As a person migrating an RTC repo, I want line breaks in comments to have no negative effect on the migration, so that I can run my migration without problems and the comments are transferred 1:1 to Git.

Configuration of conflict resolver

It would be nice to have an option to control what change sets the conflict resolver picks to accept.

Right now, only change sets belonging to the same author or change sets with "merge" in the comment text are accepted together. I have several examples where change sets from different authors need to be accepted together to resolve a conflict.

IMO, the resolver should continue accepting change sets together until there are no more change sets, and only then give up. It should succeed at some point.

CLI Support

I want to implement command line support in order to make it easier to start multiple instances with different configurations.

~~The command line should also support a resume function, so that it isn't necessary anymore to edit the script.~~

Some sample commands could be:
-c PATHTOCONFIG
~~-r resume~~
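
A sketch of the -c option with argparse. The long option name --config and the default of config.ini are assumptions:

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; argv=None means 'use sys.argv'."""
    parser = argparse.ArgumentParser(description="Migrate an RTC stream to Git")
    parser.add_argument(
        "-c", "--config",
        metavar="PATHTOCONFIG",
        default="config.ini",
        help="path to the configuration file to use",
    )
    return parser.parse_args(argv)
```

Several instances can then be started side by side, each with its own configuration file.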

Conflicts and outgoing changes

For unknown reasons, the RTC workspace contains conflicts and outgoing changes after the changes of the next stream get accepted (even though the script doesn't check anything in to RTC).

This issue is about finding the cause and/or trying to avoid this behaviour.
I have already tried several approaches to fix it, but so far I haven't found a solution.

However, this might be an issue in RTC itself. It seems it can't handle this situation.

One approach that should be tested is comparing the workspace directly with the baseline of the components of the head stream. This would result in a longer migration (each stream would get pulled up to the highest stream instead of branching off while pulling up).

Handling of capitalization breaks if a file in a directory beginning with 'A' is renamed

My situation is as follows:

huo@BISONWS1256:~/stuff/temp/rtc2gitMigration/Architecture$ git status
On branch BP_Architektur_Stream
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

    modified:   Architekturdokumentation/build.gradle
    renamed:    Architekturdokumentation/src/main/asciidoc/arc42-template.adoc -> Architekturdokumentation/src/main/asciidoc/BisonProcessArchitekturModernisiert.adoc

huo@BISONWS1256:~/stuff/temp/rtc2gitMigration/Architecture$ git status --porcelain
M  Architekturdokumentation/build.gradle
R  Architekturdokumentation/src/main/asciidoc/arc42-template.adoc -> Architekturdokumentation/src/main/asciidoc/BisonProcessArchitekturModernisiert.adoc

huo@BISONWS1256:~/stuff/temp/rtc2gitMigration/Architecture$ git status -z
M  Architekturdokumentation/build.gradle^@R  Architekturdokumentation/src/main/asciidoc/BisonProcessArchitekturModernisiert.adoc^@Architekturdokumentation/src/main/asciidoc/arc42-template.adoc^@huo@BISONWS1256:~/stuff/temp/rtc2gitMigration/Architecture$

Here the ^@ denotes the zero delimiter.
Note that after the 2nd one there is a capital A which is part of the filename.

This leads to the following traceback:

Traceback (most recent call last):
  File "migration.py", line 86, in <module>
    migrate()
  File "migration.py", line 68, in migrate
    rtc.acceptchangesintoworkspace(rtc.getchangeentriestoaccept(changeentries, history))
  File "/home/huo/gitrepos/rtcTo/rtc2git/rtcFunctions.py", line 213, in acceptchangesintoworkspace
    Commiter.addandcommit(changeEntry)
  File "/home/huo/gitrepos/rtcTo/rtc2git/gitFunctions.py", line 52, in addandcommit
    Commiter.handle_captitalization_filename_changes()
  File "/home/huo/gitrepos/rtcTo/rtc2git/gitFunctions.py", line 74, in handle_captitalization_filename_changes
    os.chdir(directoryofnewfile)
FileNotFoundError: [Errno 2] No such file or directory: '/home/huo/stuff/temp/rtc2gitMigration/Architecture/hitekturdokumentation/src/main/asciidoc'
huo@BISONWS1256:~/gitrepos/rtcTo/rtc2git$
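
For what it's worth, here is a sketch of parsing the -z output so that the second path of a rename is consumed as part of the same entry rather than treated as a new status line. It assumes the layout seen in the output above (two status characters, a space, the path, and for renames/copies a second NUL-terminated field holding the old path); the function name is hypothetical and this is not the current code in gitFunctions.py.

```python
def parse_status_z(output):
    """Parse 'git status -z' output into (status, path, original_path) tuples."""
    fields = output.split("\0")
    entries = []
    i = 0
    while i < len(fields):
        field = fields[i]
        i += 1
        if not field:
            continue  # the trailing NUL leaves an empty last field
        status, path = field[:2], field[3:]
        original = None
        if "R" in status or "C" in status:
            # Renames and copies carry the old path in the next field.
            original = fields[i]
            i += 1
        entries.append((status, path, original))
    return entries
```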

retryacceptincludingnextchangeset broken

retryacceptincludingnextchangeset was recently broken. Now it only accepts nextchangeentry. It should accept change and nextchangeentry together.

License for rtc2git

Hi there,
right now we are also trying to migrate our RTC Content to GIT.

Any plans for a license? We would like to use and extend your software for our one-time migration (contributing our changes back to you).

Cheers

Michael

Check Replacing of InitialBaseLines/OldestStream

Thanks to a Stack Overflow user, I found out that there is a way to find the earliest baseline information of a component.

The following command can be used: lscm list baselines --components

That way it's probably possible to remove the "InitialBaselines" and "Oldest Stream" options from the config.

This issue can be closed when one or both of those options have been replaced, or when a comment explains why this can't be accomplished.

Migrate based on change sets

Instead of migrating using baseline comparison I was wondering if it might be easier to just use change sets. So the migration process would be something like:

Create a repository workspace
Discard all change sets
Accept one change set at a time and commit to git for each one

Would this not work?

One-way Bridge

One possible step when migrating an SCM system is to run both systems in parallel up to a certain point.

In that case I want an easy way to keep the Git repository up to date. At the moment you can achieve this by resuming the script, but it's a bit of an overhead.

So I'd like to have a special function that only compares a workspace to the current stream and accepts the changes one by one.

Loop through migrated branches and compare with stream

When the migration is finished, it doesn't contain the most recent changes made on the stream after the baseline tagging (e.g. hotfixes, version fixes, fixes on certain releases).

At the end of the migration, I want each branch to be compared against its corresponding stream in order to get the latest changes that happened on this stream.

CLIClientException at runtime

I started running the script on my real (not sample) project, and I got a CLIClientException at runtime. I will poke into this more tomorrow, but thought I'd post today in case there was a known solution.

Streams with spaces in their name lead to all sorts of problems

While the lscm/scm commands are properly quoted, the following problems still exist:

history files are not found
git branches get an invalid name

History file not found: ~/rtc2git/History/History_BT_Spider_'Cross main stream'.txt

Executed Command: "git show-ref --verify --quiet refs/heads/'Cross main stream'"
fatal: 'Cross main stream' is not a valid branch name.

Executed Command: "git push origin 'Cross main stream'_branchpoint"
fatal: 'Cross main stream_branchpoint' is not a valid branch name.
fatal: remote part of refspec is not a valid name in Cross main stream_branchpoint
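
One option would be to derive a git-safe name from the stream name before using it for branches (and history file names). A sketch only: the choice of underscore as replacement and the function name are assumptions, and the character set covers common invalid ref characters rather than every rule git applies.

```python
import re


def to_branch_name(stream_name):
    """Replace whitespace and characters that git rejects in ref names with underscores."""
    return re.sub(r"[\s~^:?*\[\\]+", "_", stream_name.strip())
```

For the stream above this yields Cross_main_stream, and the branch point name becomes Cross_main_stream_branchpoint.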

lcsm is not recognized as an internal or external command

I'm running migration.py, and I'm getting the following error:
'lcsm' is not recognized as an internal or external command, operable program or batch file.

Where can I get lcsm?

"Press any other key to skip this changeset and continue" does not continue

Occasionally, the script will report, "Press Enter to try to accept it with next changeset together, press any other key to skip this changeset and continue." When I press any other key, nothing happens. I have reproduced this several times. I have to stop the script and restart it.

I'm using Windows 7 and running the script in Command Prompt.

Delete Folder Logs when migration is started

To keep everything clean, the "Logs" folder should always be deleted when the script is started using initialize.
