django-gdpr-assist's People

Contributors

dependabot[bot], jamesoutterside, mserrano07, radiac

django-gdpr-assist's Issues

Usage with built-in user model

I have an old codebase which still uses the built-in Django User model as a base.

If I add the following code to a models.py somewhere (actually, where should this code go?):

from django.contrib.auth.models import User

import gdpr_assist


class UserPrivacyMeta:
    fields = ['first_name', 'last_name', 'email']


gdpr_assist.register(User, UserPrivacyMeta)

then makemigrations creates a new migration inside the Django package in my virtualenv. That directory isn't part of the git repo, so the migration isn't committed and has no effect on the live system (unless I also run makemigrations there).

This all seems a bit wrong somehow to me. Am I doing something wrong? Is there another way to do this?

Performance optimizations: bulk anonymisation

During testing of bulk anonymisation, there seem to be a few areas where performance can be optimized (although there may be correctness / auditing tradeoffs for some of these).

I'll try to provide some supporting statistics on each of these soon - but as a rough preface, I've been aiming to bring a ~12-hour estimated bulk anonymisation down to less than 3 hours (and ideally reduce it further than that).

Modifications applied so far towards this goal have included:

  • Providing for_bulk=True as an argument to the anonymise method (nb: reduces audit logging)
  • Setting the force=True argument to the anonymise method and flipping the order of the self.is_anonymised() and not force conditionals -- so that no DB exists() query is made when force mode is enabled (nb: does this risk introducing incorrect/circular anonymisation?)
  • Optimizing the anonymiser __getattr__ implementation by using dictionary lookups rather than list iterations to retrieve anonymisers (nb: no evidence of improvements here, yet)
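
The second and third modifications can be illustrated without Django. This is a plain-Python sketch, not the library's code: `Record`, `exists_queries`, `ANONYMISER_LIST` and `ANONYMISER_MAP` are hypothetical names, and the "original" condition order is assumed from the description above.

```python
class Record:
    """Hypothetical stand-in for a model instance; not gdpr_assist code."""

    def __init__(self):
        self.exists_queries = 0  # counts simulated DB exists() calls

    def is_anonymised(self):
        # Stands in for a check that would hit the database.
        self.exists_queries += 1
        return False

    def should_skip_original(self, force=False):
        # Assumed original order: the simulated query always runs.
        return self.is_anonymised() and not force

    def should_skip_flipped(self, force=False):
        # Flipped order: `and` short-circuits, so no query when force=True.
        return not force and self.is_anonymised()


original, flipped = Record(), Record()
original.should_skip_original(force=True)
flipped.should_skip_flipped(force=True)
print(original.exists_queries, flipped.exists_queries)  # 1 0

# Third modification: look anonymisers up by key in a dict instead of
# scanning a list of (type, function) pairs.
ANONYMISER_LIST = [(int, lambda value: 0), (str, lambda value: "")]
ANONYMISER_MAP = dict(ANONYMISER_LIST)
print(repr(ANONYMISER_MAP[str]("secret")))  # ''
```

With only a handful of anonymisers the dict lookup is unlikely to be measurable, which would fit the "no evidence of improvements" note.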

ValueError: Could not find manager CastPrivacyUserManager in gdpr_assist.models

With Django 2.1.2 and Python 3.5, in a fresh installation, I have overridden the User model:

$ ./manage.py startapp community
$ cat community/models.py
from django.contrib.auth.models import AbstractUser
from django.utils.translation import gettext_lazy as _


class Person(AbstractUser):
    def __str__(self):
        return self.username

I add the code:

    class PrivacyMeta:
        fields = ['email', 'first_name', 'last_name']

Navigating to the admin returns the error:

Exception Type: ProgrammingError at /admin/logout/
Exception Value: column community_person.anonymised does not exist
LINE 1: ...n"."is_active", "community_person"."date_joined", "community...

Trying to make migrations, I receive:

Traceback (most recent call last):
  File "./manage.py", line 15, in <module>
    execute_from_command_line(sys.argv)
  File "/home/user/.virtualenvs/my-env/lib/python3.5/site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
    utility.execute()
  File "/home/user/.virtualenvs/my-env/lib/python3.5/site-packages/django/core/management/__init__.py", line 375, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/user/.virtualenvs/my-env/lib/python3.5/site-packages/django/core/management/base.py", line 316, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/home/user/.virtualenvs/my-env/lib/python3.5/site-packages/django/core/management/base.py", line 353, in execute
    output = self.handle(*args, **options)
  File "/home/user/.virtualenvs/my-env/lib/python3.5/site-packages/django/core/management/base.py", line 83, in wrapped
    res = handle_func(*args, **kwargs)
  File "/home/user/.virtualenvs/my-env/lib/python3.5/site-packages/django/core/management/commands/makemigrations.py", line 170, in handle
    migration_name=self.migration_name,
  File "/home/user/.virtualenvs/my-env/lib/python3.5/site-packages/django/db/migrations/autodetector.py", line 44, in changes
    changes = self._detect_changes(convert_apps, graph)
  File "/home/user/.virtualenvs/my-env/lib/python3.5/site-packages/django/db/migrations/autodetector.py", line 129, in _detect_changes
    self.new_apps = self.to_state.apps
  File "/home/user/.virtualenvs/my-env/lib/python3.5/site-packages/django/utils/functional.py", line 37, in __get__
    res = instance.__dict__[self.name] = self.func(instance)
  File "/home/user/.virtualenvs/my-env/lib/python3.5/site-packages/django/db/migrations/state.py", line 210, in apps
    return StateApps(self.real_apps, self.models)
  File "/home/user/.virtualenvs/my-env/lib/python3.5/site-packages/django/db/migrations/state.py", line 271, in __init__
    self.render_multiple(list(models.values()) + self.real_models)
  File "/home/user/.virtualenvs/my-env/lib/python3.5/site-packages/django/db/migrations/state.py", line 306, in render_multiple
    model.render(self)
  File "/home/user/.virtualenvs/my-env/lib/python3.5/site-packages/django/db/migrations/state.py", line 572, in render
    body.update(self.construct_managers())
  File "/home/user/.virtualenvs/my-env/lib/python3.5/site-packages/django/db/migrations/state.py", line 531, in construct_managers
    as_manager, manager_path, qs_path, args, kwargs = manager.deconstruct()
  File "/home/user/.virtualenvs/my-env/lib/python3.5/site-packages/django/db/models/manager.py", line 65, in deconstruct
    % (name, module_name)
ValueError: Could not find manager CastPrivacyUserManager in gdpr_assist.models.
Please note that you need to inherit from managers you dynamically generated with 'from_queryset()'.

PrivacyMeta docs issue

FYI, the docs specify the method to manually register a model as:

gdpr_assist.register_model(User, UserPrivacyMeta)

However, it seems the correct way is with:

gdpr_assist.registry.register(User, UserPrivacyMeta)

Is that the correct way, or am I missing something?

Anonymising breaks for models with UUID primary key

Hello!
Unfortunately, django-gdpr-assist does not play nicely with models that have a UUID primary key.
The library attempts to treat the pk as an integer, so the database commit fails because the "integer" (the UUID cast to an int) is too large for the column.
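
A quick way to see the size problem, independent of the library: a UUID converted to an integer is a 128-bit value, which cannot fit in a signed 64-bit integer column. The sketch below uses only the standard library. The example UUID is arbitrary, and the assumption that the pk ends up in a 64-bit (or smaller) integer column is taken from the description above, not from the library's source.

```python
import uuid

# Largest value a signed 64-bit integer column can hold.
BIGINT_MAX = 2**63 - 1

pk = uuid.UUID("12345678-1234-5678-1234-567812345678")  # arbitrary example
as_int = int(pk)  # same value as pk.int

print(as_int > BIGINT_MAX)  # True
print(len(str(pk)))         # 36, so a text column of that width could hold it
```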

Creates migration for third party models whose manager sets `use_in_migrations=True`, such as built-in User model

I gather this is a regression. Tested with latest master against Django 3.1 (and also 3.2).

from django.contrib.auth.models import User

import gdpr_assist

class UserPrivacyMeta:
    fields = ['username', 'email']


gdpr_assist.register(User, UserPrivacyMeta)

then:

$ manage.py makemigrations --dry-run
Waiting for database connection...
Migrations for 'auth':
  /opt/hunter2/venv/lib/python3.8/site-packages/django/contrib/auth/migrations/0013_alter_user_managers.py
    - Change managers on user

This of course relates to #6, but here it is the built-in User model, not a custom one, so no migrations are appropriate in this situation. I don't understand how this wasn't picked up after fixing #5, though.

Can I suggest, as a first step, adding a test in which a third-party model like User is registered and which checks that no migrations are created in that scenario?

extra migrations for user model - keeps on adding migrations with manager

The problem

Django keeps on adding an extra migration that sets the user manager for a User class which inherits from AbstractUser.

The setup

  • Django==2.2.1
  • django-gdpr-assist==1.0.1

In the examples below I work on MySQL (5.7.23), use Python 3.7.3 and mysqlclient==1.3.13.

How to replicate the problem:

  1. update settings:
INSTALLED_APPS = [
    # ...
    "gdpr_assist",
]

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        # ...
    },
    "gdpr_log": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "gdpr_log",
        "HOST": "127.0.0.1",
        "USER": "change_it",
        "PASSWORD": "change_it",
    },
}

DATABASE_ROUTERS = ["gdpr_assist.routers.EventLogRouter"]
  2. make changes to database:
python manage.py migrate
python manage.py migrate --database=gdpr_log

and:

python manage.py makemigrations

gives me 'No changes detected' - so far so good.

  3. add privacy settings for the User class, which extends AbstractUser - I'm trying to make private only the fields defined on AbstractUser, not those on my own User.
class User(AbstractUser):
    some_new_field_not_important_actually = models.BooleanField(default=False)

class UserPrivacyMeta:
    fields = [
        "first_name",
        "last_name",
    ]

from gdpr_assist import register
register(User, UserPrivacyMeta)
  4. make migrations:
python manage.py makemigrations

gives me:

Migrations for 'users':
project/users/migrations/0058_auto_20190716_1308.py

  • Change managers on user
  • Add field anonymised to user

The interesting thing for me to note was that the manager seems to have changed.

  5. migrate:
python manage.py migrate
python manage.py migrate --database=gdpr_log

and again - migrations have been applied in both cases, looks good:

Applying users.0058_auto_20190716_1308... OK

  6. and now when I try to make migrations again:
python scripts/manage.py makemigrations

I get:

sym_poc/users/migrations/0059_auto_20190716_1332.py

  • Change managers on user

WHY? The change that Django is trying to make relates only to changing manager, and not to changing the anonymised field (which has already been added to the database).

Feature discussion: deletion of records from model tables

We have a use case where a Django application manages some API and session tokens, and we'd like to remove them from the anonymised (manage.py anonymise_db) version of the database. Replacing them with mock data doesn't seem to make sense, especially for the session tokens.

There are ways to do this outside of the library and Django: we could, for example, perform post-anonymization SQL commands to truncate the relevant tables, and/or exclude the API token tables from the application database backup/restore processes. The main benefits to an in-library solution would be convenience and consistency (one command and one layer of configuration + PrivacyMeta to manage bulk data migration from pristine to anonymised).

Has django-gdpr-assist considered adding support to clear the contents of model tables during the manage.py anonymise_db step, and/or does this seem to make sense as a context and feature request?

object.anonymised missing and object.is_anonymised() not updating properly

After trying to create a simple unit test to validate anonymisation I ran into the following issues:

According to the documentation, the object should have an anonymised property/field, but that doesn't seem to be the case.
https://django-gdpr-assist.readthedocs.io/en/latest/anonymising.html
After some debugging I found the obj.is_anonymised() method, which I guess is a replacement? In either case, calling
obj.anonymise() followed directly by obj.is_anonymised() yields "False".
If I requery the model, it yields True as expected.

I am running Python 3.6.
My unit test looks roughly like this:

def test_anonymise(self):
    address = Address(name=...)
    address.save()
    self.assertEqual(address.is_anonymised(), False)

    address.anonymise()
    self.assertEqual(address.is_anonymised(), True)

Thanks for the great library. Really appreciate it guys :)

Improve support for models to be exported but not anonymised

There are sometimes models which contain PII so need to be exported, but anonymising them doesn't make sense - mailing lists for example.

We could link the post-anonymise event to delete those objects, but that sounds like it could be unexpected and dangerous.

Let's look at adding a can_anonymise flag (default True) to the PrivacyMeta class; if it's False we could either:

  • anonymise everything else and show a warning that those objects haven't been anonymised
  • show a confirmation page asking if they want to leave or delete those objects (with a standard admin warning about what else will be cascade-deleted).

We should also make can_anonymise=False models not get the anonymised field option - it looks odd in a ModelAdmin which shows all fields.

Support for Django 3.0?

Hi. Great project you guys have here. Are there any plans for adding support for Django 3.0 to this project?

Upgrading from 1.3 to 1.4

Hi. After updating to version 1.4.0 I started receiving this error when starting Django:

RuntimeError: Registered gdpr_assist model Users manager specified 'use_in_migrations=True', with no name provided.

Is there a guide on what changes need to be done in order to upgrade to version 1.4.0?

Suggestion: use queryset iterators during bulk model anonymisation

During testing, we've seen some high memory usage in the library when it is applied to models that have a large number of object records stored in the database.

This has been traced to the model.objects.all().anonymise() call in anonymise_db which uses a queryset but appears to cache a significant amount of query metadata before anonymisation of the first object takes place.

Since we have once-only usage semantics for the model objects in the anonymise_db use case, we could use a Django queryset iterator to reduce memory consumption.
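
As a rough illustration of the difference, here is a plain-Python analogy rather than Django code. In Django the streaming side would presumably use `QuerySet.iterator()`; whether the library's queryset `anonymise()` can be swapped for a per-object loop is an assumption. All names below (`rows`, `anonymise`, `anonymise_cached`, `anonymise_streaming`) are hypothetical.

```python
def rows(count):
    """Simulate a database cursor that yields one row at a time."""
    for pk in range(count):
        yield {"pk": pk, "email": f"user{pk}@example.com"}


def anonymise(row):
    """Blank out the personal data on a single row."""
    row["email"] = ""


def anonymise_cached(source):
    # Mirrors a cached queryset: every row is loaded before work starts.
    loaded = list(source)
    for row in loaded:
        anonymise(row)
    return len(loaded)  # peak number of rows held at once


def anonymise_streaming(source):
    # Mirrors an iterator: only the current row is held at any time.
    peak_held = 0
    for row in source:
        anonymise(row)
        peak_held = 1
    return peak_held


print(anonymise_cached(rows(1000)))     # 1000
print(anonymise_streaming(rows(1000)))  # 1
```

The trade-off is that a streamed result cannot be iterated a second time, which matches the once-only usage described above.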

Inconsistent method resolution for model inheritance

It seems like this product does not work with inheritance.
When I run the code below (just running makemigrations command) I get the following error:
TypeError: Cannot create a consistent method resolution
order (MRO) for bases PrivacyModel, ModelA

class ModelA(models.Model):
    a = models.CharField(max_length=50, default="Hello")

    class PrivacyMeta:
        fields = ["a"]

class ModelB(ModelA):
    b = models.BooleanField(null=True, default=True)

    class PrivacyMeta:
        fields = ["b"]
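
The same TypeError can be reproduced in plain Python, which may help narrow down the cause. This sketch assumes that registration adds a `PrivacyModel` base to each model and lists it before the model's existing parent; that is inferred from the error message, not confirmed from the library's source. The class names here are stand-ins, not Django models.

```python
class PrivacyModel:
    """Hypothetical mixin standing in for the library's privacy base class."""


class ModelA(PrivacyModel):
    """A parent model that has already been given the mixin."""


error = None
try:
    # Listing PrivacyModel before ModelA asks Python to place the mixin ahead
    # of a class that itself inherits from it; no linearisation allows that.
    type("ModelB", (PrivacyModel, ModelA), {})
except TypeError as exc:
    error = str(exc)
    print(error)

# Listing the subclass first gives a consistent order.
ModelB = type("ModelB", (ModelA, PrivacyModel), {})
mro_names = [cls.__name__ for cls in ModelB.__mro__]
print(mro_names)  # ['ModelB', 'ModelA', 'PrivacyModel', 'object']
```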
