git-repo-sync

Synchronization of Remote Git-repositories

git-repo-sync synchronizes branches between two remote Git-repositories.
It is as if you had two entry points to a single repository: your two remote Git-repositories behave as one.

git-repo-sync is implemented as a bash script.

The main idea of this tool is to install it, run it automatically on a schedule, and forget about it.

Use cases

  • Joining the Git-repositories of a client and a software/support supplier.
    • Access to your remote Git repository is restricted to your local network.
    • After some work is completed, remote access to your Git repository could be terminated.
  • Independence from your base remote Git repository if it is slow or goes out of service from time to time.
  • Your software teams have independent remote Git repositories.

How it works

Copy git-repo-sync somewhere

git clone https://github.com/it3xl/git-repo-sync.git

Let git-repo-sync know the location of your remote Git repositories.
Modify the url_a and url_b variables in default_sync_project.sh.
You can use URLs and file paths.

url_a=https://example.com/git/my_repo.git

url_b='/c/my-folder/my-local-git-repo-folder'

Run the git-sync.sh file, which is located in the root of git-repo-sync, periodically.
git-sync.sh will tell you if there are any problems. The most common one is that you need to replace awk with gAWK on Ubuntu.
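
On Linux, a cron entry along the following lines could handle the periodic run. This is only a sketch: the 2-minute interval, the installation path /opt/git-repo-sync and the log file location are assumptions chosen for illustration, not something git-repo-sync prescribes.

```
# Hypothetical crontab entry: run the sync every 2 minutes.
# /opt/git-repo-sync and /tmp/git-repo-sync.log are made-up locations.
*/2 * * * * cd /opt/git-repo-sync && bash git-sync.sh >> /tmp/git-repo-sync.log 2>&1
```

Any other scheduler that can start a bash script works the same way.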

Trade-off: redo a Git-commit in case of a conflict

What if you are working on the same branch as a teammate who works through the repository on the other side?
What if you both commit to the same branch at the same time?
git-repo-sync will decide who wins and who loses in this conflict.
Let's say you run git-repo-sync once every 2 minutes.
Then update your local Git-repository after 2 minutes and check your last commit.
The losing commit will be deleted from both of your remote repositories and will remain only in your local repository.
There is nothing wrong with that. Just redo your commit on top of your teammate's winning commit.
Use Git merge, rebase or cherry-pick and push your changes again.
This situation is quite rare in the Agile world and is more typical of Waterfall development, but you should know about it.
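
The recovery can be tried out locally. The sketch below simulates the situation with throwaway repositories in a temporary folder: the teammate's commit is already on the remote, your commit exists only in your local clone, and you redo it with a rebase. The branch name feature, the file names and the demo identity are made up for the example, and git-repo-sync itself is not involved.

```shell
set -eu
# Throwaway identity so that commits work on any machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q --bare "$work/remote.git"

# My clone creates the shared branch (the name "feature" is hypothetical).
git init -q "$work/me"
git -C "$work/me" checkout -q -b feature
git -C "$work/me" remote add origin "$work/remote.git"
echo base > "$work/me/base.txt"
git -C "$work/me" add base.txt
git -C "$work/me" commit -q -m "base"
git -C "$work/me" push -q origin feature

# The teammate's commit is on the remote: it plays the winning commit.
git clone -q -b feature "$work/remote.git" "$work/mate"
echo theirs > "$work/mate/theirs.txt"
git -C "$work/mate" add theirs.txt
git -C "$work/mate" commit -q -m "teammate"
git -C "$work/mate" push -q origin feature

# My commit exists only locally, like a losing commit after a sync.
echo mine > "$work/me/mine.txt"
git -C "$work/me" add mine.txt
git -C "$work/me" commit -q -m "mine"

# Redo my commit on top of the winning commit and push again.
git -C "$work/me" fetch -q origin
git -C "$work/me" rebase -q origin/feature
git -C "$work/me" push -q origin feature

# Show the resulting history of the remote branch, newest first.
git -C "$work/remote.git" log --format=%s feature
```

A merge or a cherry-pick onto origin/feature would work as well; the rebase is just the shortest variant when your change does not touch the same files.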

On Linux

Run git-sync.sh and it will tell you what git-repo-sync needs.
In most cases you have to install gAWK. This applies to Ubuntu.
Docker Alpine Linux images require bash and gAWK to be installed.
You have to update bash if you use a very old Linux distro.
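
If you prefer to check the prerequisites yourself before the first run, a small helper like the one below lists the tools that are not found in PATH. The function name missing_tools is made up for this sketch; git-sync.sh performs its own checks, which may differ.

```shell
# Print the name of every required tool that is not found in PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing=$(missing_tools git bash gawk)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose to print the names on one line.
  echo "Please install:" $missing
else
  echo "git, bash and gawk are available"
fi
```

On Ubuntu the missing package is usually installed with apt-get install gawk, and on Alpine with apk add bash gawk.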

I'm the Windows guy

Ha! You're lucky. Unlike Linux guys, you don't have to install anything, and you have five options to run git-repo-sync.

Open PowerShell or CMD in the git-repo-sync folder and run one of these three commands (in PowerShell, put the call operator & before the quoted path).

"C:\Program Files\Git\bin\bash.exe" git-sync.sh
"C:\Program Files\Git\usr\bin\bash.exe" git-sync.sh
"C:\Program Files\Git\git-bash.exe" git-sync.sh

Or you can reinstall Git and integrate bash into Windows during installation. Then run

bash git-sync.sh

Or you can try updating the PATH environment variable by appending the following (I have not tested this)

;C:\Program Files\Git\cmd;C:\Program Files\Git\mingw64\bin;C:\Program Files\Git\usr\bin

Do not synchronize all branches

Although there are valid cases where syncing all branches is useful, it is not always a good idea.
Some well-known Git-servers block some branches in different ways. Some of them create "trash" branches that you do not want to see synchronized.

So, you can synchronize only the branches that have special prefixes.
You can configure these prefixes in the default_sync_project.sh configuration file.
Importantly, each prefix is tied to a corresponding synchronization strategy.

The Victim Sync Strategy

By default, all branches are synced under the Victim Synchronization Strategy.
You can do whatever you want with such branches from both remote sides (repositories).
In case of a commit conflict, the newest commit wins.
You can relocate branches to any position, delete them, and move them back in history, provided that you run git-repo-sync regularly.
Use the following variable to limit the synced branches.

victim_branches_prefix=@

The most common value of victim_branches_prefix is "@".
In this case the following branches will be synchronized: @dev, @dev-staging, @test, @test-staging, @my-feature.
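
To get a feel for which branch names a prefix selects, the matching can be imitated with a plain shell pattern. This is an illustration only, not the tool's own code: the function name has_victim_prefix and the sample branch list are made up, and treating an empty prefix as "match everything" is an assumption of the sketch.

```shell
victim_branches_prefix=@

# Succeeds when the branch name starts with the configured prefix.
# In this sketch an empty prefix matches every branch.
has_victim_prefix() {
  case "$1" in
    "$victim_branches_prefix"*) return 0 ;;
    *) return 1 ;;
  esac
}

for branch in @dev @test-staging master client-prod; do
  if has_victim_prefix "$branch"; then
    echo "$branch: matches the Victim prefix"
  else
    echo "$branch: does not match the Victim prefix"
  fi
done
```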

The Conventional Sync Strategy

With this strategy you limit what your teammates on the other side may do to the branches owned by your side's remote repository.

Branches with the following prefix will be owned by the repo from the url_a variable. Let's call it side A.

side_a_conventional_branches_prefix=client-

Branches with the following prefix will be owned by the repo from the url_b variable. Let's call it side B.

side_b_conventional_branches_prefix=vendor-

Other examples of prefix pairs: a-, b-; microsoft/, google/; foo-, bar-;

On the owning side's repo, you can do whatever you want with such branches.

On the other side's repo:
You can do fast-forward updates and merges.
You can move such branches back in Git-history if you run git-repo-sync periodically.

All commit conflicts are resolved in favor of the owning side.
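
If the term is unfamiliar: an update is a fast-forward when the old branch tip is an ancestor of the new one. The sketch below shows one way to test this with plain Git in a throwaway repository. It only illustrates the term; how git-repo-sync detects such updates internally is not described here, and the function name is_fast_forward is made up.

```shell
set -eu
# Throwaway identity so that commits work on any machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" commit -q --allow-empty -m "commit from the owning side"
old_tip=$(git -C "$repo" rev-parse HEAD)
git -C "$repo" commit -q --allow-empty -m "commit from the other side"
new_tip=$(git -C "$repo" rev-parse HEAD)

# Moving a branch from $1 to $2 is a fast-forward
# when $1 is an ancestor of $2.
is_fast_forward() {
  git -C "$repo" merge-base --is-ancestor "$1" "$2"
}

if is_fast_forward "$old_tip" "$new_tip"; then
  echo "old tip -> new tip: fast-forward"
fi
if ! is_fast_forward "$new_tip" "$old_tip"; then
  echo "new tip -> old tip: not a fast-forward (a move back in history)"
fi
```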

Other Unimplemented Sync Strategies

There are other interesting approaches to syncing and conflict solving.
For example, conflicting commits could be kept in your remote repositories so that other teammates can resolve your conflicts after you or for you.
This would also be useful if you have a stubborn Git-server that blocks updates of commits in different ways.
But the Victim and Conventional approaches cover the most important cases fairly well.

Disaster Protection

People have to make mistakes to become better. This is normal. But let's protect our clients from such mistakes.
Define the sync_enabling_branch variable.

sync_enabling_branch=it3xl_git_repo_sync_enabled

Its value may represent any branch name.
Examples: @test, client-prod, vendor-master, it3xl_git_repo_sync_enabled.

git-repo-sync will check that such a branch exists in both remote repositories and that it has the same or related commits, i.e. its commits are located in the same Git-tree.
This protects you from accidentally joining unrelated Git-repositories and from deleting branches that happen to have the same names.
Git can store many independent projects (trees) in the same repository, which many users do not expect.

I advise using the it3xl_git_repo_sync_enabled branch name to make it explicit to others that their remote Git-repo is synchronized with another remote repo.
They can search the Internet for the word it3xl_git_repo_sync_enabled and understand the applied sync solution.

Be aware that the branch named in the sync_enabling_branch variable will always be synchronized by git-repo-sync.
It is probably not a good idea to specify the master branch here, because such a branch is synchronized under the Victim strategy. However, you can specify a branch with one of your conventional prefixes, for example client-master, so that it is synced under the Conventional strategy.
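
The idea behind this check can be reproduced with plain Git. The following sketch is not the tool's actual implementation: it fetches the enabling branch from two repositories into a scratch repository and asks Git whether the two tips share a common ancestor. Local folders stand in for the remote repositories, and the names make_repo, related and refs/check/* are made up for the example.

```shell
set -eu
# Throwaway identity so that commits work on any machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

sync_enabling_branch=it3xl_git_repo_sync_enabled
work=$(mktemp -d)

# Create a repository whose enabling branch has its own root commit.
make_repo() {
  git init -q "$1"
  git -C "$1" checkout -q -b "$sync_enabling_branch"
  git -C "$1" commit -q --allow-empty -m "root of $1"
}

make_repo "$work/a"
git clone -q "$work/a" "$work/b"   # b shares history with a
git -C "$work/b" commit -q --allow-empty -m "extra commit on side B"
make_repo "$work/c"                # c has the same branch name, unrelated history

# Succeeds when the enabling branch exists in both repositories
# and the two tips have a common ancestor.
related() {
  scratch=$(mktemp -d)
  git init -q "$scratch"
  git -C "$scratch" fetch -q "$1" "$sync_enabling_branch:refs/check/a" &&
  git -C "$scratch" fetch -q "$2" "$sync_enabling_branch:refs/check/b" &&
  git -C "$scratch" merge-base refs/check/a refs/check/b >/dev/null
}

if related "$work/a" "$work/b"; then echo "a and b: related, safe to sync"; fi
if ! related "$work/a" "$work/c"; then echo "a and c: unrelated, do not sync"; fi
```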

Notes

  • Usage of SSH wasn't tested.
  • git-repo-sync is resilient to HTTP failures and interruptions.
  • It has protections against accidental deletion of your entire remote repository.
  • Arbitrary Git-history rewriting is supported.
  • Within a single installation, git-repo-sync can synchronize as many pairs of Git-repositories as you want. Every sync pair is a sync project for git-repo-sync.
  • git-repo-sync doesn't synchronize Git-tags. (Some popular Git-servers block manipulations with Git-tags.)
  • git-repo-sync is developed with the TDD approach. Therefore, its CI/CD runs a large number of automated tests.

Automation support

  • By default, git-repo-sync works with remote Git repositories asynchronously.
  • It works faster under *nix OSes because Git-bash on Windows is slower. But compared to network latency, the difference is negligible.
  • You can separate the change detection and synchronization phases of git-repo-sync for readability of build logs.
  • Multiple configuration options are supported: environment variables, configuration files, or a combination of them.
  • Integration with the bash Git Credential Helper git-cred to obtain credentials from a parent shell environment.
  • You don't need to do anything in case of connectivity failures. Keep running git-repo-sync periodically and everything will be restored automatically.
  • After every synchronization, analyze the notification files to send notifications about branch deletions or commit conflict solving.
    See git-repo-sync/sync-projects/<your-sync-project-name>/file-signals/
    • notify_solving - for conflict solving
    • notify_del - for deletions
  • See instructions on how to configure more synchronization pairs of remote Git repositories.
  • Number of pairs is unlimited. Every pair is a separate sync project.
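
A build step that reacts to the notification files could look roughly like the sketch below. It assumes that the mere presence of notify_solving or notify_del in the file-signals folder signals the event; the format of the file contents is not described here, so the sketch does not read them. The function name report_signals and the printed messages are made up, and a temporary folder stands in for the real file-signals folder.

```shell
# Print one line per signal file found in the given file-signals folder.
report_signals() {
  signals_dir=$1
  if [ -e "$signals_dir/notify_solving" ]; then
    echo "A commit conflict was solved"
  fi
  if [ -e "$signals_dir/notify_del" ]; then
    echo "A branch was deleted"
  fi
}

# Demo: a temporary folder stands in for
# git-repo-sync/sync-projects/<your-sync-project-name>/file-signals/
demo_dir=$(mktemp -d)
touch "$demo_dir/notify_del"
report_signals "$demo_dir"
```

In a real setup you would replace the echo lines with whatever sends your notifications, for example a mail or chat command.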

Required Specification

  • Use any Linux, Windows or Mac machine.
  • Install Git.
  • For users of *nix OSes:
    • Update bash on old Linux distros.
    • Check that gAWK (GNU AWK) is installed on your machine. Consider this case if you are going to update mAWK to gAWK on Ubuntu.
  • Set up any automation to run git-repo-sync periodically: cron jobs, schedulers, Jenkins, GitLab-CI, etc. Or run it yourself periodically.

Contacts

It would be great if you could help me improve the above documentation based on your setup experience.
In any case, feel free to ask questions. My contacts are here: it3xl.ru

