Comments (2)
@dmitry I agree with this comment. I implemented a home-grown solution at a previous company, not very elegant but following these basic ideas:
- database would be a copy from production
- all user emails replaced with <user.id>@example.com
- all password hashes replaced with nil or some garbage
- except for several internal accounts (admin, qa, and other internal users)
- transaction amounts, phone, address, etc., as you have said, would be replaced with dummy values
- except for those linked to above internal accounts
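A minimal sketch of the rules above, in plain Ruby on hashes rather than real models. The field names (`:id`, `:email`, `:password_hash`, `:phone`, `:address`, `:amount`, `:user_id`), the `INTERNAL_EMAILS` allow-list, and the dummy values are all assumptions for illustration, not what the original system used.

```ruby
require "set"

# Hypothetical allow-list of internal accounts that keep their real data.
INTERNAL_EMAILS = Set["admin@ourcompany.test", "qa@ourcompany.test"].freeze

# Returns an anonymized copy of a user row (a Hash); internal accounts are untouched.
def anonymize_user(user)
  return user.dup if INTERNAL_EMAILS.include?(user[:email])

  user.merge(
    email: "#{user[:id]}@example.com", # <user.id>@example.com
    password_hash: nil,                # nobody can log in with a real password
    phone: "555-0100",                 # dummy values (assumed)
    address: "1 Dummy Street"
  )
end

# Returns an anonymized copy of a transaction row, unless it belongs to an internal user.
def anonymize_transaction(txn, internal_user_ids)
  return txn.dup if internal_user_ids.include?(txn[:user_id])

  txn.merge(amount: 1)
end
```

In a real Rails app the same rules would probably run as bulk SQL updates against the copied database, since looping over every row in Ruby gets slow on a large table.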
The approach would be cleaner with this gem, but these were some of the objectives:
- I needed to dump the database on the (very secure) production environment
- did this with a nightly cron task
- this also acted as a backup
- from this dump, loaded a separate database instance (also in production)
- separate database is where the anonymizing tasks above were performed
- then dump the anonymized database to a file, and move to a separate user account location
- user account was accessible to allowed users via SSH keys within VPC
- a symlink like latest_db_dump.sql.gz was created
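The last two steps (publish the anonymized dump, then point a stable symlink at it) could look roughly like the sketch below. The `publish_dump` helper, the `anonymized_<date>.sql.gz` naming, and the directory layout are assumptions; only the `latest_db_dump.sql.gz` link name comes from the setup described above.

```ruby
require "fileutils"

# Copies an anonymized dump into publish_dir under a dated name and repoints
# the latest_db_dump.sql.gz symlink at it. Returns the symlink path.
def publish_dump(source_path, publish_dir, date: Time.now.strftime("%Y%m%d"))
  FileUtils.mkdir_p(publish_dir)

  target = File.join(publish_dir, "anonymized_#{date}.sql.gz")
  FileUtils.cp(source_path, target)

  link = File.join(publish_dir, "latest_db_dump.sql.gz")
  tmp_link = "#{link}.tmp"
  FileUtils.rm_f(tmp_link)

  # Create the new link under a temporary name, then rename it over the old
  # one, so readers never see a missing or half-updated link (POSIX rename).
  File.symlink(File.basename(target), tmp_link)
  File.rename(tmp_link, link)
  link
end
```

The nightly cron task would call something like this after the anonymizing step, so developers always fetch the same filename regardless of the date.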
This left me with:
- a nightly production backup dump in a secure location not accessible to anyone other than admins
- a nightly refresh of the anonymized db that was SSH-accessible to developers, qa, etc at a known endpoint/filename
- finally, a task (like the ones in this gem) that could load the dump to replace non-production instances
Complicated, to be sure, but we were dealing with people's money, and all sorts of demanding security requirements.
Another consideration (just for completeness... :-)
When the production database gets big, it would be great (albeit hard) to take a sample of the database. Of course the problem is that you can't just take any old records -- you need a set of related records. I think if one was diligent about declaring Rails model relationships, it could be possible to follow a key entity or two (maybe User) and pull some sample of relationships (e.g. User has many transactions, and has a profile, and roles, and so on ... all depending on your schema).
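To make the sampling idea concrete, here is a rough sketch on in-memory tables (arrays of hashes) rather than real Rails models. The `sample_with_relations` helper, the table names, and the foreign-key map are all hypothetical; in a Rails app the map could presumably be derived from the declared associations. It only follows relationships one level out from the root entity.

```ruby
# Picks the root rows with the given ids, plus every row in the related
# tables whose foreign key points at one of those root rows.
#
# tables:       { table_name => [row_hash, ...] }
# foreign_keys: { related_table_name => foreign_key_column }
def sample_with_relations(tables, root:, root_ids:, foreign_keys:)
  sample = {
    root => tables.fetch(root).select { |row| root_ids.include?(row[:id]) }
  }
  foreign_keys.each do |table, fk|
    sample[table] = tables.fetch(table).select { |row| root_ids.include?(row[fk]) }
  end
  sample
end

# Example data: three users, each with transactions and a profile.
tables = {
  users:        [{ id: 1 }, { id: 2 }, { id: 3 }],
  transactions: [{ id: 10, user_id: 1 }, { id: 11, user_id: 2 }, { id: 12, user_id: 1 }],
  profiles:     [{ id: 20, user_id: 1 }, { id: 21, user_id: 3 }]
}

sample = sample_with_relations(
  tables,
  root: :users,
  root_ids: [1],
  foreign_keys: { transactions: :user_id, profiles: :user_id }
)
```

Deeper graphs (a transaction's line items, say) would need the same step repeated per level, which is where this gets hard on a real schema.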
from capistrano-db-tasks.
@tomharrisonjr thank you for the detailed explanation and your insight. I will try to split all those points into tasks when I get enough time :)