
django-import-export-celery's People

Contributors

ahmedalzabidi, alaanour94, aparakian, charleshan, denkneb, dependabot[bot], frabyn, jacklinke, jaspreet-singh-1032, jmsoler7, marksweb, petrdlouhy, petrkudy, platzhersh, q0w, rmaceissoft, samupl, timthelion, urtzai, zachbellay

django-import-export-celery's Issues

Unable to export large amount of data

Hi, I am not able to export a large amount of data (10 million rows);
the system times out and uses a massive amount of memory.
Regarding the JSON list of pks to export: instead of ids, could we save the query?

How to show progress of import in dry_run=True?

if row_number % 100 == 0 or row_number == 1:
    change_job_status(
        import_job,
        "import",
        "3/5 Importing row %s/%s" % (row_number, len(dataset)),
        dry_run,
    )

def before_import_row(self, row, **kwargs):

This does not work for a dry run. I think it is due to the atomic transaction: the database is not updated during a dry run. It only works when dry_run = False and using_transactions = False.

"Models aren't loaded yet"

Following the instructions on how to set up exports, I am running into a "Models aren't loaded yet." error.

I added the Resource in the models.py file, but I have more than one app with models.py files.

The error is caused by the ModelResource class defined at the very end of the models.py file.

This is the classmethod I added to the model:

class Ride(models.Model):
    @classmethod
    def export_resource_classes(cls):
        return {
            'all rides': ('Ride resources', RideResource),
        }

This is the RideResource at the end of the file:

class RideResource(ModelResource):
    class Meta:
        model = Ride
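
A possible workaround sketch, not a documented fix: import the resource lazily inside the classmethod so it is only resolved once the app registry is fully populated. This assumes RideResource is moved into a separate resources.py:

class Ride(models.Model):
    @classmethod
    def export_resource_classes(cls):
        # Deferred import: runs only when the method is called, after apps are loaded.
        from .resources import RideResource  # assumption: resource moved to resources.py
        return {
            'all rides': ('Ride resources', RideResource),
        }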

Add more efficient way to export all records in a very large table

Say I have a table with 2.5 million rows - the way this library works is that it sends a list of primary keys of all objects to export in the job-creation POST request. For very large sets of objects, uploading megabytes of primary keys can be quite slow or even run up against POST request body-size limits or other misc edge cases.

So, I think it could be useful to efficiently handle the special case of exporting everything.

Perhaps this looks like one of the Django admin gray action buttons, with a confirmation page which creates the job.

Changes likely required:

  • possible modification to the export job model to represent "all"
  • adding a confirmation view page for exporting all
  • adding a django admin mixin to add the grey action button
  • docs update

Happy to send a PR if you think this is a good addition.

Job finished but no objects created

Job status info: 5/5 Import job finished

I see thousands of rows in the import summary with change_type "new", but nothing was added to the database.

Maintainers wanted

Hello,
it seems this project is getting lasting attention. It deserves more maintainers. If you'd like to step up to the plate, just comment below. :)
Thanks

TypeError: argument of type 'NoneType' is not iterable

Hi,

I am trying to run the project file inside the examples folder. Before that I want to run the celery workers, so I tried to run celery and I got the error below.

I think it is coming from the tasks.py file. Can you please help?

TypeError: argument of type 'NoneType' is not iterable

I am using Python 3.7.x and Django 2.1.8.

Regards,
Yogesh

Creating async import_jobs via code

Is there a command or function that can be used to create an import job?

Use case: data from a single CSV is being used to populate multiple related tables. It would be a cleaner interface if the user could update all the related tables with a single upload.

In order to do so, we've created resources for the related tables in the after_import of the parent resource. The issue is that the imports for the related tables are locked to the parent import and cannot be spun off into separate threads. Is there a way to create a celery import job from code?
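
A minimal sketch of creating an import job from code, with heavy assumptions: the ImportJob field names below are inferred from the SQL query visible in a later traceback on this page, and the value stored in `format` is a guess. The package's post_save signal (shown in another issue here) should then queue the celery task.

from django.core.files.base import ContentFile
from import_export_celery.models import ImportJob  # assumption: ImportJob is importable from here


def queue_related_table_import(csv_bytes, model_key):
    """Create an ImportJob from code; the package's post_save signal queues the task."""
    job = ImportJob(
        model=model_key,    # key from IMPORT_EXPORT_CELERY_MODELS, e.g. "Price"
        format="text/csv",  # assumption: the format field stores the file's content type
    )
    # Saving the FileField persists the job, which triggers the signal handler.
    job.file.save("related-table.csv", ContentFile(csv_bytes), save=True)
    return job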

Add changelist template

Currently the way import/export works is with an action, from the action dropdown.

The problem is that sometimes you want custom logic that is not based on selecting elements from a model list. In my case I need an export action that exports hundreds of thousands of rows to CSV, and the user needs to make a selection in a dropdown rather than pass a list of pks. Instead I am overriding the get_queryset method of the resource in question, as sketched below.
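
A sketch of the get_queryset override described above; the model and filter are hypothetical, and the dropdown choice would come from wherever your custom view collects it:

from import_export.resources import ModelResource

from myapp.models import Order  # hypothetical model


class PaidOrderExportResource(ModelResource):
    class Meta:
        model = Order

    def get_queryset(self):
        # Drive the export with server-side filtering instead of a huge list of pks.
        return super().get_queryset().filter(status="paid")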

Troubles exporting huge table

Hello, I've run into trouble while exporting a table with 180,000+ rows.
Here is the traceback:

[2022-06-23 15:06:51,905: ERROR/ForkPoolWorker-17] Task import_export_celery.tasks.run_export_job[9ed87b13-927c-4acc-aad3-fa9d3a1243ea] raised unexpected: MemoryError()
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/celery/app/trace.py", line 451, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/celery/app/trace.py", line 734, in __protected_call__
    return self.run(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/import_export_celery/tasks.py", line 225, in run_export_job
    serialized = format.export_data(data)
  File "/usr/local/lib/python3.9/site-packages/import_export/formats/base_formats.py", line 88, in export_data
    return dataset.export(self.get_title(), **kwargs)
  File "/usr/local/lib/python3.9/site-packages/tablib/core.py", line 427, in export
    return fmt.export_set(self, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/tablib/formats/_csv.py", line 33, in export_set
    return stream.getvalue()
MemoryError   
[2022-06-23 15:06:54,726: ERROR/MainProcess] Pool callback raised exception: MemoryError('Process got: ')
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/billiard/pool.py", line 1796, in safe_apply_callback
    fun(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/celery/worker/request.py", line 730, in on_success
    return self.on_failure(retval, return_ok=True)
  File "/usr/local/lib/python3.9/site-packages/celery/worker/request.py", line 545, in on_failure
    raise MemoryError(f'Process got: {exc}')
MemoryError: Process got: 

It crashes at the very end of exporting.

I have no idea what to do. The output of the 'top' command at the end of the process, while it crashes, suggests my server has enough memory.

Concurrency is not specified; Celery defaults to 16 (the number of cores the server has).

For example, a model with 40,000 rows exported successfully and much faster.

Any ideas, please?

How to turn off the dry run option when importing model data

I was looking for an option so that, on form save for importing model data, the dry run could be silenced or turned off. So far I haven't found any way other than re-running it using the Perform Import admin action.

So, I would suggest controlling this setting from the Django settings configuration. By default the dry run would stay turned on, but if the setting is found in the Django settings, that value would be used. For instance, settings.IMPORT_DRY_RUN could be used as below in the post_save signal for the import job:

@receiver(post_save, sender=ImportJob)
def importjob_post_save(sender, instance, **kwargs):
    # Default to a dry run, but let the Django settings override it.
    dry_run = getattr(settings, "IMPORT_DRY_RUN", True)
    if not instance.processing_initiated:
        instance.processing_initiated = timezone.now()
        instance.save()
        transaction.on_commit(lambda: run_import_job.delay(instance.pk, dry_run=dry_run))

@timthelion

Renaming the app and hiding import

Hello!

I think the AppConfig setup is missing, because there is no defined app name.

So how can the app be renamed for the backend users?
And how can the import admin page be removed from the menu?
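
A sketch using standard Django mechanisms rather than anything documented by this package: subclass AppConfig to change the label shown to backend users (and reference the subclass from INSTALLED_APPS), and unregister the ImportJob admin to hide the import page from the menu (assuming the package registers one).

# myproject/apps.py (hypothetical location)
from django.apps import AppConfig


class ImportExportCeleryConfig(AppConfig):
    name = "import_export_celery"
    verbose_name = "Background imports and exports"  # label shown in the admin


# myapp/admin.py
from django.contrib import admin
from import_export_celery.models import ImportJob

admin.site.unregister(ImportJob)  # hides the import admin page from the menu

In INSTALLED_APPS, "import_export_celery" would then be replaced by "myproject.apps.ImportExportCeleryConfig".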

Move the `queryset` field to a FileField

It would be better as a FileField that takes a CSV file, to allow one to manually create a list of items to be exported. Even better would be if there could be columns: normally pk would be set, but it could be, for example, email, with a list of email addresses. Or, even more interestingly, it could be a single column, paid, whose only value would be True, and it would export all paid users...

Celery 5 support

Hi there,

I've noticed that this package doesn't support Celery 5, since the celery.task module used in the tasks module here has been deprecated.

I am happy to take a look to see if there is an option to fix this in a PR, but I thought I'd raise it here to see if you have any thoughts.

Thanks,
Matt

[Dry run] Error reading file

After the import job succeeded, I just get an error:

Error reading file: [Errno 2] No such file or directory: 'project/django-import-export-celery-import-jobs/file.csv'

The file is correctly saved on the ImportJob instance, i.e. in 'project/media/django-import-export-celery-import-jobs/file.csv', but the celery import doesn't find it (it does not resolve the path from MEDIA_ROOT).

Unable to see add button under export screen

Hi, I am not able to see the add button in the django-export screen. Do I need to grant any extra permission?
After modifying the "has_add_permission" function (to return True), I was able to see the add button, but I still could not see the module information.

Tags missing

PyPI has v1.1.3, however there is no tag here on GitHub for that release.

Import job doesn't start

Hi all, I have been trying to integrate this library into my Django app with Celery 4.4. Everything seems to be going smoothly with these configurations:

# CELERY STUFF
REDIS_HOST = 'localhost'
REDIS_PORT = '6379'
BROKER_URL = 'redis://' + REDIS_HOST + ':' + REDIS_PORT + '/0'
BROKER_TRANSPORT_OPTIONS = {'visibility_timeout': 3600}
CELERY_RESULT_BACKEND = 'redis://' + REDIS_HOST + ':' + REDIS_PORT + '/0'
CELERY_TASK_SERIALIZER = "json"
CELERY_RESULT_SERIALIZER = "json"
REDIS_URL = os.environ.get('REDIS_URL', 'redis://redis')
IMPORT_EXPORT_CELERY_INIT_MODULE = "beaver.celery"
IMPORT_EXPORT_CELERY_MODELS = {
    "Price": {
        'app_label': 'prices',
        'model_name': 'Price',
    }
}

redis-cli responds to PING perfectly, but I'm stuck with nothing happening.

Could you suggest where I can look for the problem? Thanks.

Override Import Job form

There is an import job form. In the django_import_export library we can override the import form, but here we can only change the ImportJobForm layout and cannot get the values of the fields.

TypeError: NoneType takes no arguments

It happens to me randomly when starting an export job. The job then runs when I click it (Run with Celery) once again under Export Jobs in the Django admin. Any ideas?

2021-02-23T06:17:09.585791+00:00 app[worker.1]: [2021-02-22 22:17:09,585: ERROR/ForkPoolWorker-7] Task import_export_celery.tasks.run_export_job[6bbc3607-6179-4ab6-b3e1-a23bb82eb4c4] raised unexpected: TypeError('NoneType takes no arguments')
2021-02-23T06:17:09.585792+00:00 app[worker.1]: Traceback (most recent call last):
2021-02-23T06:17:09.585793+00:00 app[worker.1]:   File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 412, in trace_task
2021-02-23T06:17:09.585793+00:00 app[worker.1]:     R = retval = fun(*args, **kwargs)
2021-02-23T06:17:09.585794+00:00 app[worker.1]:   File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 704, in __protected_call__
2021-02-23T06:17:09.585794+00:00 app[worker.1]:     return self.run(*args, **kwargs)
2021-02-23T06:17:09.585794+00:00 app[worker.1]:   File "/usr/local/lib/python3.8/site-packages/sentry_sdk/integrations/celery.py", line 171, in _inner
2021-02-23T06:17:09.585794+00:00 app[worker.1]:     reraise(*exc_info)
2021-02-23T06:17:09.585795+00:00 app[worker.1]:   File "/usr/local/lib/python3.8/site-packages/sentry_sdk/_compat.py", line 57, in reraise
2021-02-23T06:17:09.585795+00:00 app[worker.1]:     raise value
2021-02-23T06:17:09.585796+00:00 app[worker.1]:   File "/usr/local/lib/python3.8/site-packages/sentry_sdk/integrations/celery.py", line 166, in _inner
2021-02-23T06:17:09.585796+00:00 app[worker.1]:     return f(*args, **kwargs)
2021-02-23T06:17:09.585796+00:00 app[worker.1]:   File "/usr/local/lib/python3.8/site-packages/import_export_celery/tasks.py", line 199, in run_export_job
2021-02-23T06:17:09.585797+00:00 app[worker.1]:     class Resource(resource_class):
2021-02-23T06:17:09.585797+00:00 app[worker.1]: TypeError: NoneType takes no arguments

Schedule data imports

Could we have a function to schedule data imports into Django? For example, import data and update a report every 30 minutes?
Thanks,
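
A sketch using Celery beat (standard Celery API); the task path is hypothetical and would be a task of your own that creates or refreshes the data:

# myproject/celery.py (sketch)
from celery import Celery
from celery.schedules import crontab

app = Celery("myproject")

app.conf.beat_schedule = {
    "import-report-data-every-30-minutes": {
        "task": "myapp.tasks.import_report_data",  # hypothetical task
        "schedule": crontab(minute="*/30"),         # run every 30 minutes
    },
}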

Catch TimeLimitExceeded exception

Job status info: [Dry run] 4/5 Generating import summary
The job status info was stuck on this step, and when I checked the logs I found a TimeLimitExceeded exception.

2020-06-23T16:53:24.028548585Z app[worker.2]: [2020-06-23 09:53:24,027: ERROR/MainProcess] Task handler raised error: TimeLimitExceeded(300)
2020-06-23T16:53:24.028599784Z app[worker.2]: Traceback (most recent call last):
2020-06-23T16:53:24.028605055Z app[worker.2]:   File "/app/.heroku/python/lib/python3.7/site-packages/billiard/pool.py", line 684, in on_hard_timeout
2020-06-23T16:53:24.028609276Z app[worker.2]:     raise TimeLimitExceeded(job._timeout)
2020-06-23T16:53:24.028625581Z app[worker.2]: billiard.exceptions.TimeLimitExceeded: TimeLimitExceeded(300,)
2020-06-23T16:53:24.036035220Z app[worker.2]: [2020-06-23 09:53:24,035: ERROR/MainProcess] Hard time limit (300s) exceeded for import_export_celery.tasks.run_import_job[45422346-1572-4e69-ad7e-77a488cbf2ee]

A quick change to the config should fix the issue, but I think we should catch the exception and add it to the job status info.
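
For reference, a sketch of the config change mentioned above, using Celery's standard time limit settings so long imports are not killed at the default 300 seconds:

# myproject/celery.py (sketch)
from celery import Celery

app = Celery("myproject")
app.conf.task_time_limit = 3600       # hard limit in seconds
app.conf.task_soft_time_limit = 3300  # soft limit, raised before the hard kill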

Issue with dry_run and skip_diff

I'm seeing a dry run using skip_diff or skip_html_diff result in the error:

Import error 'NoneType' object is not iterable

This comes about here, in _run_import_job:

    if dry_run:
        summary = "<html>"
        summary += "<head>"
        summary += '<meta charset="utf-8">'
        summary += "</head>"
        summary += "<body>"
        summary += '<table  border="1">'  # TODO refactor the existing template so we can use it for this

        if not result.invalid_rows:
            cols = lambda row: "</td><td>".join([field for field in row.diff])

If skip_diff or skip_html_diff is enabled, the row has no diff attribute.

update project setup to make collaboration easier

Hi There,

So I forked this repo to make some updates and encountered some difficulties with the setup. I would like to suggest the following updates:

  1. migrate to using poetry instead of setup.py, and then add the dependencies and dev dependencies at the root level.
  2. migrate to using github actions, works with poetry: https://github.com/snok/install-poetry
  3. add pre-commit
  4. add contribution guidelines
  5. add unit tests - currently there are essentially none

I would be glad to help with some of these if you are interested.

Does this package allow for automatic task queue execution?

This package seems interesting; however, I have concerns about automatic task queue handling. Let's say I upload a dataset and it creates a celery task/job for it. I want the job to be executed automatically, without me having to select it and execute it from the admin panel as shown in the screenshots.

Getting django.core.exceptions.ImproperlyConfigured error in Celery

Below is the complete error log from the Celery server.

[2019-08-04 17:46:26,578: INFO/MainProcess] Received task: import_export_celery.tasks.run_import_job[cb18c31e-fee4-4072-be25-d46d07f0798b]
[2019-08-04 17:46:26,588: DEBUG/MainProcess] TaskPool: Apply <function _fast_trace_task at 0x7f329f349400> (args:('import_export_celery.tasks.run_import_job', 'cb18c31e-fee4-4072-be25-d46d07f0798b', {'lang': 'py', 'task': 'import_export_celery.tasks.run_import_job', 'id': 'cb18c31e-fee4-4072-be25-d46d07f0798b', 'shadow': None, 'eta': None, 'expires': None, 'group': None, 'retries': 0, 'timelimit': [None, None], 'root_id': 'cb18c31e-fee4-4072-be25-d46d07f0798b', 'parent_id': None, 'argsrepr': '(2,)', 'kwargsrepr': "{'dry_run': True}", 'origin': 'gen1874@ubuntu-bionic', 'reply_to': '63856816-dad3-32bc-a9b9-c392f1929017', 'correlation_id': 'cb18c31e-fee4-4072-be25-d46d07f0798b', 'delivery_info': {'exchange': '', 'routing_key': 'celery', 'priority': 0, 'redelivered': None}}, b'[[2], {"dry_run": true}, {"callbacks": null, "errbacks": null, "chain": null, "chord": null}]', 'application/json', 'utf-8') kwargs:{})
[2019-08-04 17:46:26,598: INFO/ForkPoolWorker-4] import_export_celery.tasks.run_import_job[cb18c31e-fee4-4072-be25-d46d07f0798b]: Importing 2 dry-run True
[2019-08-04 17:46:26,631: DEBUG/MainProcess] Task accepted: import_export_celery.tasks.run_import_job[cb18c31e-fee4-4072-be25-d46d07f0798b] pid:2050
[2019-08-04 17:46:26,726: DEBUG/ForkPoolWorker-4] params: (2,)
[2019-08-04 17:46:26,727: DEBUG/ForkPoolWorker-4]
sql_command: SELECT "import_export_celery_importjob"."id", "import_export_celery_importjob"."file", "import_export_celery_importjob"."processing_initiated", "import_export_celery_importjob"."imported", "import_export_celery_importjob"."format", "import_export_celery_importjob"."change_summary", "import_export_celery_importjob"."errors", "import_export_celery_importjob"."model", "import_export_celery_importjob"."author_id", "import_export_celery_importjob"."updated_by_id" FROM "import_export_celery_importjob" WHERE "import_export_celery_importjob"."id" = %(0)s
[2019-08-04 17:46:26,866: DEBUG/ForkPoolWorker-4] Find query: {'filter': {'id': {'$eq': 2}}, 'projection': ['id', 'file', 'processing_initiated', 'imported', 'format', 'change_summary', 'errors', 'model', 'author_id', 'updated_by_id']}
[2019-08-04 17:46:26,882: DEBUG/ForkPoolWorker-4] import_export_celery.tasks.run_import_job[cb18c31e-fee4-4072-be25-d46d07f0798b]: None
[2019-08-04 17:46:27,134: ERROR/ForkPoolWorker-4] Task import_export_celery.tasks.run_import_job[cb18c31e-fee4-4072-be25-d46d07f0798b] raised unexpected: ImproperlyConfigured()
Traceback (most recent call last):
  File "/home/vagrant/env/lib/python3.6/site-packages/celery/app/trace.py", line 385, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/home/vagrant/env/lib/python3.6/site-packages/celery/app/trace.py", line 648, in __protected_call__
    return self.run(*args, **kwargs)
  File "../import_export_celery/tasks.py", line 61, in run_import_job
    result = resource.import_data(dataset, dry_run=dry_run)
  File "/home/vagrant/env/lib/python3.6/site-packages/import_export/resources.py", line 573, in import_data
    raise ImproperlyConfigured
django.core.exceptions.ImproperlyConfigured

It's coming from the tasks.py file, from the code mentioned below.

result = resource.import_data(dataset, dry_run=dry_run)

Note: I am using Vagrant, not Docker, and I am using MongoDB.

  • Logged in to the admin
  • Uploaded the CSV file from the examples folder
  • Clicked on Save and continue editing
  • In the Celery terminal I got the above error

Thank you and Regards,
Yogesh

django.core.exceptions.FieldDoesNotExist: ExportJob has no field named 'job_status_info'

During file export, I am getting the below error:
ExportJob has no field named 'job_status_info'

/lib/python3.6/site-packages/Django-2.2.3-py3.6.egg/django/db/models/options.py", line 567, in get_field
raise FieldDoesNotExist("%s has no field named '%s'" % (self.object_name, field_name))
django.core.exceptions.FieldDoesNotExist: ExportJob has no field named 'job_status_info'

During handling of the above exception, another exception occurred:

No module named 'project'

Hi Team,

This library looks great, and I would love to use it for my project! Unfortunately, when I add 'import_export_celery' to my INSTALLED_APPS, I get a ModuleNotFoundError because the package expects the folder containing celery.py to be called 'project' (from its __init__.py).

Renaming this folder would require a huge amount of effort. Is there any way we could use a setting to control the top-level directory name?

Also, in case I'm interpreting it wrong, the stack trace is below. I am running Django 3.0 with Gunicorn and Python 3.8. The project is containerized with two separate Celery worker containers.

Stack Trace:

backend              | Traceback (most recent call last):
backend              |   File "/opt/venv/lib/python3.8/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
backend              |     worker.init_process()
backend              |   File "/opt/venv/lib/python3.8/site-packages/gunicorn/workers/base.py", line 129, in init_process
backend              |     self.load_wsgi()
backend              |   File "/opt/venv/lib/python3.8/site-packages/gunicorn/workers/base.py", line 138, in load_wsgi
backend              |     self.wsgi = self.app.wsgi()
backend              |   File "/opt/venv/lib/python3.8/site-packages/gunicorn/app/base.py", line 67, in wsgi
backend              |     self.callable = self.load()
backend              |   File "/opt/venv/lib/python3.8/site-packages/gunicorn/app/wsgiapp.py", line 52, in load
backend              |     return self.load_wsgiapp()
backend              |   File "/opt/venv/lib/python3.8/site-packages/gunicorn/app/wsgiapp.py", line 41, in load_wsgiapp
backend              |     return util.import_app(self.app_uri)
backend              |   File "/opt/venv/lib/python3.8/site-packages/gunicorn/util.py", line 350, in import_app
backend              |     __import__(module)
backend              |   File "/home/default/delta_mvp/wsgi.py", line 11, in <module>
backend              |     application = get_wsgi_application()
backend              |   File "/opt/venv/lib/python3.8/site-packages/django/core/wsgi.py", line 12, in get_wsgi_application
backend              |     django.setup(set_prefix=False)
backend              |   File "/opt/venv/lib/python3.8/site-packages/django/__init__.py", line 24, in setup
backend              |     apps.populate(settings.INSTALLED_APPS)
backend              |   File "/opt/venv/lib/python3.8/site-packages/django/apps/registry.py", line 91, in populate
backend              |     app_config = AppConfig.create(entry)
backend              |   File "/opt/venv/lib/python3.8/site-packages/django/apps/config.py", line 90, in create
backend              |     module = import_module(entry)
backend              |   File "/usr/local/lib/python3.8/importlib/__init__.py", line 127, in import_module
backend              |     return _bootstrap._gcd_import(name[level:], package, level)
backend              |   File "/opt/venv/lib/python3.8/site-packages/import_export_celery/__init__.py", line 1, in <module>
backend              |     from project.celery import app as celery_app
backend              | ModuleNotFoundError: No module named 'project'
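
For what it's worth, the IMPORT_EXPORT_CELERY_INIT_MODULE setting shown in the "Import job doesn't start" issue above appears to control which module the package imports its Celery app from. A sketch, with a hypothetical module path matching this project's layout:

# settings.py (sketch)
# Point import_export_celery at your own celery.py instead of "project.celery".
IMPORT_EXPORT_CELERY_INIT_MODULE = "delta_mvp.celery"  # hypothetical: the folder containing celery.py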

Celery memory leak with large files

I am facing a blocking issue using django-import-export and django-import-export-celery.

Context

I need to import large CSV files (about ~250k lines) into my database.
I work on my local environment and have a few other environments available (dev, staging, prod).

Issue

When I perform the import on my local environment, the import is quite long but it eventually works fine.
But each time I try to perform an import on the dev environment, I get this error from celery:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/billiard/pool.py", line 1267, in mark_as_worker_lost
    human_status(exitcode)),
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL).

It seems to be a memory usage issue, but I can't figure out why it occurs, as I have tried changing many settings (the celery max-tasks-per-child option, the celery max-memory-per-child option, the DEBUG Django setting).

Also, I tried increasing my instance memory up to 13 GB (1 GB before), but the error still occurs.

Questions

Do you have any insight that I can use to solve my issue?
Is a 250k-line file too much?
Are my celery settings bad?

How do you load a resource in IMPORT_EXPORT_CELERY_MODELS?

IMPORT_EXPORT_CELERY_MODELS = {
    "Winner": {'app_label': 'winners', 'model_name': 'Winner'}
}

"The available parameters are app_label, model_name, and resource. 'resource' should be a function which returns a django-import-export Resource."

This is what the documentation says, but we can't add a resource before loading the app. Also, adding a string doesn't work.
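
A sketch following the quoted documentation, where 'resource' is a function returning a Resource class; deferring the import inside the callable avoids loading the resource before the app registry is ready (the module path is hypothetical):

def winner_resource():
    # Deferred import so nothing touches the models before apps are loaded.
    from winners.resources import WinnerResource  # hypothetical module path
    return WinnerResource


IMPORT_EXPORT_CELERY_MODELS = {
    "Winner": {
        "app_label": "winners",
        "model_name": "Winner",
        "resource": winner_resource,
    },
}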

You forgot to run makemigrations

This migration file is missing:

from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ("import_export_celery", "0007_auto_20210210_1831"),
    ]

    operations = [
        migrations.AlterField(
            model_name="exportjob",
            name="id",
            field=models.BigAutoField(auto_created=True, primary_key=True, serialize=False, verbose_name="ID"),
        ),
        migrations.AlterField(
            model_name="importjob",
            name="id",
            field=models.BigAutoField(auto_created=True, primary_key=True, serialize=False, verbose_name="ID"),
        ),
    ]

Missing migration

I integrated this into one of the projects I'm working on and it seems fine. I noticed the migration issue, but even if I run makemigrations the migration file doesn't apply properly.


The generated migration ends up under /usr/local/lib/python3.8/site-packages/import_export_celery/migrations/0008_auto_20211122_1340.py, but when I try to navigate to that directory nothing is generated...

I also know that a similar issue was created, but I can't seem to find proper documentation about migration modules. Any ideas?

How to define task queue for the django-import-export-celery task

In my Django application I'm using a task queue. Whenever I try to execute a model export, it doesn't trigger the django-import-export-celery task. Maybe I need to define a task queue for the shared_task from this library. Is there any way to override the queue settings so that the task is executed on the right Celery task queue? @auto-mat
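
A sketch using Celery's standard task_routes setting; the task names are taken from the tracebacks elsewhere on this page, and the queue name is arbitrary:

# myproject/celery.py (sketch)
from celery import Celery

app = Celery("myproject")
app.conf.task_routes = {
    "import_export_celery.tasks.run_import_job": {"queue": "import_export"},
    "import_export_celery.tasks.run_export_job": {"queue": "import_export"},
}
# Then start a worker that consumes that queue:
#   celery -A myproject worker -Q import_export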

Error handling

If the import ends with an error, the dry import is marked as finished and a summary is still created, but it is empty; only the header is there. It's strange.

I know the error is in the details of the import... but still, who opens that?!
TODO:

  1. do not render the summary
  2. print the error to the summary
  3. print that the dry import ended with an error (I think that worked in an older version?)

Race condition calling `run_import_job` in `importjob_post_save` upon creation

I have this error popping up 90% of the time using version 1.1.4:

[2021-07-27 12:27:02,269: ERROR/ForkPoolWorker-8] Task import_export_celery.tasks.run_import_job raised unexpected: DoesNotExist('ImportJob matching query does not exist.',)
Traceback (most recent call last):
  File "/celery/app/trace.py", line 412, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/celery/app/trace.py", line 704, in __protected_call__
    return self.run(*args, **kwargs)
  File "/import_export_celery/tasks.py", line 187, in run_import_job
    import_job = models.ImportJob.objects.get(pk=pk)
  File "/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/django/db/models/query.py", line 380, in get
    self.model._meta.object_name
import_export_celery.models.importjob.DoesNotExist: ImportJob matching query does not exist.

This happens when the task is called at creation time, in the importjob_post_save signal handler.

I don't have the issue anymore if I replace this line with :

run_import_job.apply_async((instance.pk,), kwargs={'dry_run': True}, countdown=1)

Importing fails or raises error for model that contains primary key which is not id

I was trying to import a model resource whose primary key is not the id. When I try to import it from the celery import admin section, it fails and raises errors, but the same file is able to create/update rows using the django-import-export import action.

This is my model resource

class RoleDetailResource(resources.ModelResource):

    class Meta:
        model = RoleDetail
        chunk_size = 5000
        import_id_fields = ("owner", )
        exclude = ("id", )

And I'm getting the following errors while importing:

Error with dry run: (screenshot)

Error without dry run: (screenshot)

But the import was successful with the django-import-export action, and the row can be seen in the list: (screenshot)

@timthelion

Missing middleware

Hi, the docs need to be updated with an explanation that the AuthorMiddleware is required. I'm getting this error:

Error "author.middlewares.AuthorDefaultBackendMiddleware" is not found in MIDDLEWARE_CLASSES nor MIDDLEWARE. It is required to use AuthorDefaultBackend

Also, since this is apparently part of this package as well, maybe it should be clarified that the author middleware is a requirement here.
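
A sketch of the settings change implied by the error message, adding the middleware it names:

MIDDLEWARE = [
    # ... existing middleware ...
    "author.middlewares.AuthorDefaultBackendMiddleware",  # required by the author package
]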

Migrations pending

Installed apps:
django-import-export-celery==1.1.3

Issue:
E AssertionError: Your models have changes that are not yet reflected in a migration. You should add them now. Relevant app(s): dict_keys(['import_export_celery'])

It looks like there are pending migrations in the latest release, and local migrations then fail in tests. I can't run makemigrations as a workaround because I don't have permission on the folder where the library code is located.

How to provide customised querysets for the export job

Currently, I'm trying to export a model which might contain a lot of annotated data from different models related to the user model.
But when I export, I get only the model's data and the annotated data is missing. The annotated data does appear when using the export functionality of the django-import-export export action.
I was wondering if there is any way to customise the queryset for the targeted export model with the celery export.
@timthelion
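
A sketch of one way to attach annotations to the export, assuming the celery export uses the resource's queryset; the model, reverse relation, and field names are hypothetical:

from django.contrib.auth.models import User
from django.db.models import Count
from import_export.fields import Field
from import_export.resources import ModelResource


class AnnotatedUserResource(ModelResource):
    ride_count = Field(attribute="ride_count", column_name="ride_count")

    class Meta:
        model = User
        fields = ("id", "username", "ride_count")

    def get_queryset(self):
        # "ride" is a hypothetical reverse relation on User.
        return super().get_queryset().annotate(ride_count=Count("ride"))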
