
django-storages's Introduction

Django-Storages


Installation

Installing from PyPI is as easy as doing:

pip install django-storages

If you'd prefer to install from source (maybe there is a bugfix in master that hasn't been released yet) then the magic incantation you are looking for is:

pip install -e 'git+https://github.com/jschneier/django-storages.git#egg=django-storages'

For detailed instructions on how to configure the backend of your choice please consult the documentation.

About

django-storages is a project to provide a variety of storage backends in a single library.

This library is usually compatible with the currently supported versions of Django. Check the Trove classifiers in setup.py to be sure.

django-storages is backed in part by Tidelift. Check them out for all of your enterprise open source software commercial support needs.

Security

To report a security vulnerability, please use the Tidelift security contact. Tidelift will coordinate the fix and disclosure. Please do not post a public issue on the tracker.

Found a Bug?

Issues are tracked via GitHub issues at the project issue page.

Documentation

Documentation for django-storages is located at https://django-storages.readthedocs.io/.

Contributing

  1. Check for open issues at the project issue page or open a new issue to start a discussion about a feature or bug.
  2. Fork the django-storages repository on GitHub to start making changes.
  3. Add a test case to show that the bug is fixed or the feature is implemented correctly.
  4. Bug me until I can merge your pull request.

Please don't update the library version in CHANGELOG.rst or storages/__init__.py; the maintainer will do that on release.

History

This repo began as a fork of the original library under the package name of django-storages-redux and became the official successor (releasing under django-storages on PyPI) in February of 2016.


django-storages's Issues

ContentFile (String -> S3 Upload) fails with AttributeError: 'str' object has no attribute 'seek'

def upload_content_to(instance, filename):
    return 'content/{}'.format(filename)

class Content(CreationModificationMixin):
    title = models.CharField(max_length=255, blank=False, null=False)
    body  = models.TextField(null=False, blank=False)
    file  = models.FileField(upload_to=upload_content_to, editable=False, null=False, blank=False)

Trying to upload STRING to S3 via Django's ContentFile:

from django.core.files.base import ContentFile
from content.models import Content
c = Content(title='test.json', body='test __body__')
c.file.save(c.title, ContentFile(c.body), save=False)
c.save()

getting the following error:
AttributeError: 'str' object has no attribute 'seek'

Traceback

>>> c.file.save(c.title, ContentFile(c.body))
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "F:\_Projects\project\.venv\project\lib\site-packages\django\db\models\fields\files.py", line 112, in save
    self.instance.save()
  File "F:\_Projects\project\.venv\project\lib\site-packages\django\db\models\base.py", line 710, in save
    force_update=force_update, update_fields=update_fields)
  File "F:\_Projects\project\.venv\project\lib\site-packages\django\db\models\base.py", line 734, in save_base
    update_fields=update_fields)
  File "F:\_Projects\project\.venv\project\lib\site-packages\django\dispatch\dispatcher.py", line 201, in send
    response = receiver(signal=self, sender=sender, **named)
  File "F:\_Projects\project\containers\backend\apps\content\models.py", line 42, in persist_content_to_s3_json_after_save
    instance.file.save(instance.title, instance.body)
  File "F:\_Projects\project\.venv\project\lib\site-packages\django\db\models\fields\files.py", line 94, in save
    self.name = self.storage.save(name, content, max_length=self.field.max_length)
  File "F:\_Projects\project\.venv\project\lib\site-packages\django\core\files\storage.py", line 64, in save
    name = self._save(name, content)
  File "F:\_Projects\project\.venv\project\lib\site-packages\storages\backends\s3boto.py", line 410, in _save
    self._save_content(key, content, headers=headers)
  File "F:\_Projects\project\.venv\project\lib\site-packages\storages\backends\s3boto.py", line 421, in _save_content
    rewind=True, **kwargs)
  File "F:\_Projects\project\.venv\project\lib\site-packages\boto\s3\key.py", line 1207, in set_contents_from_file
    fp.seek(0, os.SEEK_SET)
  File "F:\_Projects\project\.venv\project\lib\site-packages\django\core\files\utils.py", line 20, in <lambda>
    seek = property(lambda self: self.file.seek)
AttributeError: 'str' object has no attribute 'seek'

my settings

# Django Storages
DEFAULT_FILE_STORAGE = 'storages.backends.s3boto.S3BotoStorage'

AWS_S3_SECURE_URLS = False       # use http instead of https
AWS_QUERYSTRING_AUTH = False     # don't add complex authentication-related query parameters for requests
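
The traceback above shows the post_save signal (persist_content_to_s3_json_after_save) calling instance.file.save(instance.title, instance.body) with the raw string, while boto's set_contents_from_file needs a file-like object that supports seek(). A minimal sketch of the fix, assuming the signal handler is the right place for it, is to wrap the body in ContentFile there as well:

from django.core.files.base import ContentFile

def persist_content_to_s3_json_after_save(sender, instance, **kwargs):
    # Wrap the raw string so the storage backend receives a file-like object
    # with a seek() method; passing instance.body directly raises the
    # AttributeError shown above.
    instance.file.save(instance.title, ContentFile(instance.body), save=False)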

Requirements

Great project!

I've been following django-storages for a while, facing the same Python 3 incompatibility problem. Good to see someone is taking care of the project!

I'm quite new to Django, so the answer may be simple: what should I use as a requirement? I mean, do I need to point to this GitHub repo?

Maybe this is a common question that could be answered in the README file ;)

Thanks a lot!!
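
For reference, depending on this repository directly (instead of a PyPI release) is usually done with a VCS requirement, mirroring the editable install shown in the README above; a requirements.txt line might look like:

git+https://github.com/jschneier/django-storages.git#egg=django-storages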

Common __init__ signature for all backends

All backends should be compatible at least partly with the signature of django.core.files.storage.FileSystemStorage:

def __init__(self, location=None, base_url=None, file_permissions_mode=None, directory_permissions_mode=None):

Not sure about the file and directory permission modes, but location and base_url can surely be used in all backends. location should be a path, while base_url could provide the rest of the URL for the connection (protocol + hostname + port). For some backends, even authentication credentials could be extracted from base_url.
Currently we have the SFTP backend which has no init args, then we have the FTP backend which needs the full URL in the location arg, and the rest of the backends are not very consistent either, I guess.
The purpose of this would be to make all backends interchangeable without having to fiddle with settings too much.
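
As a rough illustration of the idea (the names below are hypothetical, not an existing API), a backend could mirror FileSystemStorage's constructor and fall back to its own settings:

from django.core.files.storage import Storage

from storages.utils import setting


class ExampleBackend(Storage):
    # Sketch only: constructor arguments win over backend-specific settings,
    # mirroring django.core.files.storage.FileSystemStorage.
    def __init__(self, location=None, base_url=None):
        self.location = location if location is not None else setting('EXAMPLE_LOCATION', '')
        self.base_url = base_url if base_url is not None else setting('EXAMPLE_BASE_URL', None)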

Not able to access azure backed files after they have been uploaded

With the latest master branch I am able to upload files into azure containers, however when I try to access those files from within Django admin I am getting the following notice:

<?xml version="1.0" encoding="utf-8"?>
<Error><Code>ResourceNotFound</Code>
<Message>The specified resource does not exist.
RequestId:865dc956-0001-001e-312d-0bdb7f000000
Time:2015-10-20T11:50:25.5502680Z
</Message>
</Error>

The target URL to the file is correctly formatted with storage account name and container.
What MEDIA_URL, if any, should I be using?
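
Assuming the default Azure Blob endpoint, MEDIA_URL typically points at the storage account and container; the values below are placeholders for illustration:

# Illustrative only; substitute your own account and container names.
MEDIA_URL = 'https://<account_name>.blob.core.windows.net/<container_name>/'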

Generated S3 urls are not standard compliant

If you pass a page with a link to {% static %} the page will fail W3C Validator:

Error: & did not start a character reference. (& probably should have been escaped as &amp;.)
At line 13, column 176
ires=1443458349&AWSAccessKeyId
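
Since the offending & characters come from the signed query string, one workaround (the same setting used in the first issue above) is to disable query string authentication for publicly readable assets, so {% static %} emits plain URLs with no query parameters; whether that is acceptable depends on your bucket policy:

AWS_QUERYSTRING_AUTH = False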

AWS credentials only work in base settings module

As per the recommendations in Two Scoops of Django, I've got two settings modules: base.py, and dev.py. Dev.py imports everything from base.py, and then adds whatever settings I need for the development environment.

For some reason my AWS credentials work fine in base.py, but Boto doesn't seem to be able to find them when they're in dev.py. No idea why, because Django itself has no issue seeing the variables in that file. (E.g. my DJANGO_SETTINGS_MODULE is set correctly.)

Secret key issue

I'm running the latest master from GitHub, since the PyPI package threw a similar error to what I'm getting now.
I'm attempting to create a custom S3BotoStorage class. Very simple... just a new location to put files in a specific folder.

When I import the class or the S3BotoStorage anywhere into my project, I get an ugly error related to the secret key being empty.

Any suggestions for possible fixes?
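
For context, the class-level attributes of S3BotoStorage (visible in the backend source further down this page) read the AWS keys from settings at import time, so an error about an empty secret key usually means AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY (or the equivalent environment variables) aren't visible when the storage module is imported. A minimal sketch of such a subclass, with a hypothetical folder name:

from storages.backends.s3boto import S3BotoStorage


class FolderS3BotoStorage(S3BotoStorage):
    # Hypothetical subclass: only the prefix changes; keys and bucket still
    # come from the AWS_* settings or environment variables.
    location = 'uploads'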

Feature request - customizing location at runtime

I'd be open to paying someone for this. In django-storages (the original project on pypi) I can extend S3BotoStorage like this

from django.db import connection  # django-tenant-schemas sets schema_name on the connection
from storages.backends.s3boto import S3BotoStorage

class MediaRootS3BotoStorage(S3BotoStorage):
    def __init__(self, *args, **kwargs):
        path = 'media'
        if hasattr(connection, 'schema_name'):
            path = path + '/' + connection.schema_name
        kwargs['location'] = path
        super(MediaRootS3BotoStorage, self).__init__(*args, **kwargs)

This lets me save media in different folders when using django-tenant-schemas. It might look like /media/tenant1 and /media/tenant2. With redux, however, the location seems to always be whichever tenant I use first. I'd like to use django-storages-redux over the original for the many bug fixes, particularly the timezone and S3 stuff.

Add support for IAM role credentials to Boto S3 backend

If access key and secret key credentials are not provided via the Django settings, or the environment, it would be helpful if the Boto S3 backend automatically tried to retrieve authentication credentials from instance metadata.
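
For reference, boto itself can fall back to instance-profile credentials when no explicit keys are supplied; a minimal sketch of the behaviour being requested, independent of the current backend code:

from boto.s3.connection import S3Connection

# With no explicit keys, boto consults its credential chain: environment
# variables, the boto config file, and finally EC2 instance metadata (IAM role).
conn = S3Connection()
bucket = conn.get_bucket('example-bucket')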

Why not rename this repository to django-storages-redux?

Hello, I think it would make sense to change the repository name to the same name as the package on PyPI; in other words, rename it to django-storages-redux and publish the django-storages-redux documentation on ReadTheDocs.

If you are interested, I can send a pull request changing every reference of django-storages to django-storages-redux, except the name of the app, so it stays compatible without changes. (I think they would change only in the documentation.)

OK, I also think it would make sense to change the app name to storage_redux, to avoid mistakes (for example, confusing the names and installing the original version instead of this fork).

If you have no interest, no problem; everything remains as it is.

Release Django 1.8 support

Hi! I see that the currently unreleased v1.3 includes fixes for Django 1.8.

When will it be released on PyPI?

Installing: `UnicodeDecodeError: 'utf-8' codec`

When I install with pip3 install https://github.com/jschneier/django-storages, the following error message shows up:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
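
That URL points at the project's HTML page rather than a package archive or VCS URL; installing straight from the repository is usually done with a git URL, as in the README above, e.g.:

pip3 install git+https://github.com/jschneier/django-storages.git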

Save existing file into S3FileField - Unsupported operation Seek

I have a model with a S3FileField, which works fine.

However, now I have implemented the jQuery UI file upload, to upload the file directly to Amazon S3 (to save some bandwidth and to unburden my Django application). When it succeeds, it does an AJAX POST request to my view that should store the model instance with the file into the database.

I have already found the filepath_to_uri function to create the url name, which seems to work fine.

How can I now save the model to the database, so that it stays consistent with the previous upload method (using the default S3FileField form)?
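
One commonly used approach (a sketch, assuming the uploaded object already sits under the storage's location; the model and field names below are placeholders) is to assign the existing key name to the field instead of re-uploading:

# The file is already on S3; just record its key on the field and save the
# model, so no second upload happens.
product = Product.objects.get(pk=1)
product.attachment.name = 'uploads/2015/photo.jpg'  # key relative to the storage location
product.save()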

S3BotoStorage serializes S3 bucket name kwarg in Django 1.7

The bucket kwarg may be a settings variable that changes from deployment to deployment, but the value gets serialized into migrations, corrupting my deployment-agnostic migrations with deployment-specific configuration.

The bucket argument isn't used during schema migrations, however it could be used during a data migration that affects the given S3BotoStorage FileField. Because of that we can't simply remove the bucket argument from the deconstructed / serialized arguments.

Instead I propose modifying S3BotoStorage so that it uses a new bucket_alias kwarg instead of bucket, where the value of bucket_alias is a key in a settings dictionary of arbitrary keywords to actual S3 bucket names. This indirection will allow us to serialize bucket definitions in migrations but still independently define the exact buckets used for a given deployment.

This type of issue and solution are the reason databases have aliases in Django.
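
A rough sketch of the proposed indirection (bucket_alias and AWS_BUCKET_ALIASES are hypothetical names, not an existing API):

# settings.py (hypothetical)
AWS_BUCKET_ALIASES = {
    'media': 'my-deployment-specific-media-bucket',
}

# Storage subclass sketch: the alias, rather than the real bucket name, is
# what would end up serialized into migrations.
from django.conf import settings
from storages.backends.s3boto import S3BotoStorage


class AliasedS3BotoStorage(S3BotoStorage):
    def __init__(self, bucket_alias=None, **kwargs):
        self.bucket_alias = bucket_alias
        if bucket_alias is not None:
            kwargs['bucket'] = settings.AWS_BUCKET_ALIASES[bucket_alias]
        super(AliasedS3BotoStorage, self).__init__(**kwargs)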

Boto3 and Infrequent Access Storage Class

Hey there.
Just want to ask a quick question. Do you plan to implement a new S3 backend based on boto3? This would be super nice, because boto3 also comes with the option to upload a key with the new Infrequent Access storage class, which has not been (and probably will not be) backported to boto2 (reference).
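
For reference, boto3 already exposes the Infrequent Access class on uploads; a minimal boto3 sketch, independent of django-storages:

import boto3

s3 = boto3.client('s3')
with open('report.pdf', 'rb') as fp:
    # StorageClass selects the Infrequent Access tier for this object.
    s3.upload_fileobj(fp, 'example-bucket', 'report.pdf',
                      ExtraArgs={'StorageClass': 'STANDARD_IA'})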

Apps python warnings settings interfere with those in my project

Setting warnings.simplefilter('always') in the module's init overrides the settings in my project and results in annoying log spam.

Is there a known workaround for this? Barring that, can you give me some background on why you've chosen this configuration?
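
If the filter really is installed at import time, one possible workaround (an assumption, not a documented knob) is to re-assert the project's own warning policy after third-party apps have been imported, e.g. in an AppConfig.ready() or at the bottom of settings:

import warnings

# Reinstate the project's preferred policy after storages has been imported.
warnings.resetwarnings()
warnings.simplefilter('default')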

Using str in AWS_HEADERS causes spaces to be escaped!

See: boto/boto#2536

If I set AWS_HEADERS as
{'Cache-Control': 'max-age=%d, s-maxage=%d, must-revalidate' % (AWS_EXPIRY, AWS_EXPIRY)} (notice spaces), it will render to 'max-age=604800,%20s-maxage=604800,%20must-revalidate'.
Encoding the string as bytestring solved my problem on Django 1.7.7 and python3, but I didn't do any thorough testing.

AWS_HEADERS = {
    'Cache-Control': str.encode(
        'max-age=%d, s-maxage=%d, must-revalidate' % (
        AWS_EXPIRY, AWS_EXPIRY))
}

1.2.1 error

File "/Users/NY/Envs/myproject/lib/python3.4/site-packages/storages/backends/s3boto.py", line 24, in
from storages.utils import setting
ImportError: cannot import name 'setting'

py3-storages work fine

AWS S3 Frankfurt region not working

COPY OF BITBUCKET ISSUE #214 - https://bitbucket.org/david/django-storages/issue/214/aws-s3-frankfurt-region-not-working

"Andreas Schilling created an issue 2015-01-04

Using the Frankfurt region (Germany) with django-storages produces an HTTP 400 error. S3 in the new region supports only Signature Version 4. In all other regions, Amazon S3 supports both Signature Version 4 and Signature Version 2.

I assume django-storages only supports Signature Version 2. Is there any chance to support Version 4?"

Thanks @jschneier for the fork! Is there a chance for django-storages-redux to support the eu-central-1 region?
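
With boto 2 this is usually worked around by pointing the backend at the regional endpoint (AWS_S3_HOST appears in the backend source included later on this page) and enabling Signature Version 4 in boto's own config file; treat the exact values as assumptions for your setup:

# settings.py
AWS_S3_HOST = 's3.eu-central-1.amazonaws.com'

# ~/.boto (boto's config file), not Django settings:
# [s3]
# use-sigv4 = True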

No attribute 'WindowsAzureMissingResourceError'

Hi,

I've not used django-storages with any provider. I keep getting the error 'module' object has no attribute 'WindowsAzureMissingResourceError'.

Here is the traceback http://dpaste.com/12F5D3S

Here is a snippet of my 'settings.py':

DEFAULT_FILE_STORAGE = 'storages.backends.azure_storage.AzureStorage'

AZURE_ACCOUNT_NAME = "account_name"

AZURE_ACCOUNT_KEY = "xxxxx"

AZURE_CONTAINER = "container_name"

MEDIA_ROOT = '/media/'

MEDIA_URL = 'stashdimages.core.windows.net/'

Models.py

testfile = models.FileField("Test File", blank=True, null= True)

NotImplementedError when trying to use Google Cloud Storage

I'm trying to setup django-storages for GCS. As a test bed, I'm using django's "poll tutorial" with default admin interface.

In my config file I have:

INSTALLED_APPS = (
    'storages',
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'polls',
)
...
STATIC_URL = 'https://storage.googleapis.com/my-static/'
DEFAULT_FILE_STORAGE = 'storages.backends.apache_libcloud.LibCloudStorage'
STATICFILES_STORAGE = 'storages.backends.apache_libcloud.LibCloudStorage'

LIBCLOUD_PROVIDERS = {
    'google_cloud_storage': {
        'type': 'libcloud.storage.types.Provider.GOOGLE_STORAGE',
        'user': os.environ.get('GOOGLE_ACCESS_KEY'),
        'key': os.environ.get('GOOGLE_SECRET_KEY'),
        'bucket': 'my-static',
        'secure': True,
    },
}

DEFAULT_LIBCLOUD_PROVIDER = 'google_cloud_storage'

When I call
python3 manage.py collectstatic
everything seems to work and files are uploaded into GCS bucket.
But when I try to view the web page, I get:

NotImplementedError at /admin/login/
get_object_cdn_url not implemented for this driver
Request Method: GET
Request URL:    http://localhost:8080/admin/login/?next=/admin/
Django Version: 1.8.3
Exception Type: NotImplementedError
Exception Value:    
get_object_cdn_url not implemented for this driver
Exception Location: /usr/local/lib/python3.4/dist-packages/libcloud/storage/base.py in get_object_cdn_url, line 296
Python Executable:  /usr/local/bin/uwsgi
Python Version: 3.4.0
Python Path:    
['.',
 '',
 '/usr/src/app/src/django-storages',
 '/usr/lib/python3.4',
 '/usr/lib/python3.4/plat-x86_64-linux-gnu',
 '/usr/lib/python3.4/lib-dynload',
 '/usr/local/lib/python3.4/dist-packages',
 '/usr/lib/python3/dist-packages']
Server time:    Tue, 28 Jul 2015 16:33:57 +0000
Error during template rendering

In template /usr/local/lib/python3.4/dist-packages/django/contrib/admin/templates/admin/base.html, error at line 6
get_object_cdn_url not implemented for this driver
1   {% load i18n admin_static %}<!DOCTYPE html>
2   {% get_current_language as LANGUAGE_CODE %}{% get_current_language_bidi as LANGUAGE_BIDI %}
3   <html lang="{{ LANGUAGE_CODE|default:"en-us" }}" {% if LANGUAGE_BIDI %}dir="rtl"{% endif %}>
4   <head>
5   <title>{% block title %}{% endblock %}</title>
6     <link rel="stylesheet" type="text/css" href="{% block stylesheet %}{% static "admin/css/base.css" %}{% endblock %}" />
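
The failure comes from the libcloud driver not implementing get_object_cdn_url. One possible workaround (a sketch, not part of django-storages) is to subclass the backend used for static files and build public URLs from STATIC_URL instead of asking the driver:

from django.conf import settings
from storages.backends.apache_libcloud import LibCloudStorage


class StaticLibCloudStorage(LibCloudStorage):
    # Hypothetical subclass: compose the public URL from STATIC_URL, which
    # already points at the GCS bucket, instead of calling get_object_cdn_url.
    def url(self, name):
        return settings.STATIC_URL + name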

Upload multipart to s3

Trying to save to s3 currently does not handle large files elegantly. I've made the following changes to backends/s3boto.py to support the saving of large files.

import os
import posixpath
import mimetypes
from datetime import datetime
from gzip import GzipFile
from tempfile import SpooledTemporaryFile
import warnings
import math

from django.core.files.base import File
from django.core.files.storage import Storage
from django.core.exceptions import ImproperlyConfigured, SuspiciousOperation
from django.utils.encoding import force_text, smart_str, filepath_to_uri, force_bytes
from filechunkio import FileChunkIO


try:
    from boto import __version__ as boto_version
    from boto.s3.connection import S3Connection, SubdomainCallingFormat
    from boto.exception import S3ResponseError
    from boto.s3.key import Key as S3Key
    from boto.utils import parse_ts, ISO8601
except ImportError:
    raise ImproperlyConfigured("Could not load Boto's S3 bindings.\n"
                               "See https://github.com/boto/boto")

from storages.utils import setting
from storages.compat import urlparse, BytesIO, deconstructible

boto_version_info = tuple([int(i) for i in boto_version.split('-')[0].split('.')])

if boto_version_info[:2] < (2, 32):
    raise ImproperlyConfigured("The installed Boto library must be 2.32 or "
                               "higher.\nSee https://github.com/boto/boto")


def parse_ts_extended(ts):
    warnings.warn(
        "parse_ts_extended has been deprecated and will be removed in version "
        "1.3 because boto.utils.parse_ts has subsumed the old functionality.",
        PendingDeprecationWarning
    )
    return parse_ts(ts)


def safe_join(base, *paths):
    """
    A version of django.utils._os.safe_join for S3 paths.

    Joins one or more path components to the base path component
    intelligently. Returns a normalized version of the final path.

    The final path must be located inside of the base path component
    (otherwise a ValueError is raised).

    Paths outside the base path indicate a possible security
    sensitive operation.
    """
    base_path = force_text(base)
    base_path = base_path.rstrip('/')
    paths = [force_text(p) for p in paths]

    final_path = base_path
    for path in paths:
        final_path = urlparse.urljoin(final_path.rstrip('/') + "/", path)

    # Ensure final_path starts with base_path and that the next character after
    # the final path is '/' (or nothing, in which case final_path must be
    # equal to base_path).
    base_path_len = len(base_path)
    if (not final_path.startswith(base_path) or
                final_path[base_path_len:base_path_len + 1] not in ('', '/')):
        raise ValueError('the joined path is located outside of the base path'
                         ' component')

    return final_path.lstrip('/')


@deconstructible
class S3BotoStorageFile(File):
    """
    The default file object used by the S3BotoStorage backend.

    This file implements file streaming using boto's multipart
    uploading functionality. The file can be opened in read or
    write mode.

    This class extends Django's File class. However, the contained
    data is only the data contained in the current buffer. So you
    should not access the contained file object directly. You should
    access the data via this class.

    Warning: This file *must* be closed using the close() method in
    order to properly write the file to S3. Be sure to close the file
    in your application.
    """
    # TODO: Read/Write (rw) mode may be a bit undefined at the moment. Needs testing.
    # TODO: When Django drops support for Python 2.5, rewrite to use the
    # BufferedIO streams in the Python 2.6 io module.
    buffer_size = setting('AWS_S3_FILE_BUFFER_SIZE', 5242880)

    def __init__(self, name, mode, storage, buffer_size=None):
        self._storage = storage
        self.name = name[len(self._storage.location):].lstrip('/')
        self._mode = mode
        self.key = storage.bucket.get_key(self._storage._encode_name(name))
        if not self.key and 'w' in mode:
            self.key = storage.bucket.new_key(storage._encode_name(name))
        self._is_dirty = False
        self._file = None
        self._multipart = None
        #5 MB is the minimum part size (if there is more than one part).
        # Amazon allows up to 10,000 parts.  The default supports uploads
        # up to roughly 50 GB.  Increase the part size to accommodate
        # for files larger than this.
        if buffer_size is not None:
            self.buffer_size = buffer_size
        self._write_counter = 0

    @property
    def size(self):
        return self.key.size

    def _get_file(self):
        if self._file is None:
            self._file = SpooledTemporaryFile(
                max_size=self._storage.max_memory_size,
                suffix=".S3BotoStorageFile",
                dir=setting("FILE_UPLOAD_TEMP_DIR", None)
            )
            if 'r' in self._mode:
                self._is_dirty = False
                self.key.get_contents_to_file(self._file)
                self._file.seek(0)
            if self._storage.gzip and self.key.content_encoding == 'gzip':
                self._file = GzipFile(mode=self._mode, fileobj=self._file)
        return self._file

    def _set_file(self, value):
        self._file = value

    file = property(_get_file, _set_file)

    def read(self, *args, **kwargs):
        if 'r' not in self._mode:
            raise AttributeError("File was not opened in read mode.")
        return super(S3BotoStorageFile, self).read(*args, **kwargs)

    def write(self, content, *args, **kwargs):
        if 'w' not in self._mode:
            raise AttributeError("File was not opened in write mode.")
        self._is_dirty = True
        if self._multipart is None:
            provider = self.key.bucket.connection.provider
            upload_headers = {
                provider.acl_header: self._storage.default_acl
            }
            upload_headers.update(
                {'Content-Type': mimetypes.guess_type(self.key.name)[0] or self._storage.key_class.DefaultContentType})
            upload_headers.update(self._storage.headers)
            self._multipart = self._storage.bucket.initiate_multipart_upload(
                self.key.name,
                headers=upload_headers,
                reduced_redundancy=self._storage.reduced_redundancy
            )
        if self.buffer_size <= self._buffer_file_size:
            self._flush_write_buffer()
        return super(S3BotoStorageFile, self).write(force_bytes(content), *args, **kwargs)

    @property
    def _buffer_file_size(self):
        pos = self.file.tell()
        self.file.seek(0, os.SEEK_END)
        length = self.file.tell()
        self.file.seek(pos)
        return length

    def _flush_write_buffer(self):
        """
        Flushes the write buffer.
        """
        if self._buffer_file_size:
            self._write_counter += 1
            self.file.seek(0)
            headers = self._storage.headers.copy()
            self._multipart.upload_part_from_file(
                self.file, self._write_counter, headers=headers)
            self.file.close()
            self._file = None

    def close(self):
        if self._is_dirty:
            self._flush_write_buffer()
            self._multipart.complete_upload()
        else:
            if self._multipart is not None:
                self._multipart.cancel_upload()
        self.key.close()


class S3BotoStorage(Storage):
    """
    Amazon Simple Storage Service using Boto

    This storage backend supports opening files in read or write
    mode and supports streaming(buffering) data in chunks to S3
    when writing.
    """
    connection_class = S3Connection
    connection_response_error = S3ResponseError
    file_class = S3BotoStorageFile
    key_class = S3Key

    # used for looking up the access and secret key from env vars
    access_key_names = ['AWS_S3_ACCESS_KEY_ID', 'AWS_ACCESS_KEY_ID']
    secret_key_names = ['AWS_S3_SECRET_ACCESS_KEY', 'AWS_SECRET_ACCESS_KEY']

    access_key = setting('AWS_S3_ACCESS_KEY_ID', setting('AWS_ACCESS_KEY_ID'))
    secret_key = setting('AWS_S3_SECRET_ACCESS_KEY', setting('AWS_SECRET_ACCESS_KEY'))
    file_overwrite = setting('AWS_S3_FILE_OVERWRITE', True)
    headers = setting('AWS_HEADERS', {})
    bucket_name = setting('AWS_STORAGE_BUCKET_NAME')
    auto_create_bucket = setting('AWS_AUTO_CREATE_BUCKET', False)
    default_acl = setting('AWS_DEFAULT_ACL', 'public-read')
    bucket_acl = setting('AWS_BUCKET_ACL', default_acl)
    querystring_auth = setting('AWS_QUERYSTRING_AUTH', True)
    querystring_expire = setting('AWS_QUERYSTRING_EXPIRE', 3600)
    reduced_redundancy = setting('AWS_REDUCED_REDUNDANCY', False)
    location = setting('AWS_LOCATION', '')
    encryption = setting('AWS_S3_ENCRYPTION', False)
    custom_domain = setting('AWS_S3_CUSTOM_DOMAIN')
    calling_format = setting('AWS_S3_CALLING_FORMAT', SubdomainCallingFormat())
    secure_urls = setting('AWS_S3_SECURE_URLS', True)
    file_name_charset = setting('AWS_S3_FILE_NAME_CHARSET', 'utf-8')
    gzip = setting('AWS_IS_GZIPPED', False)
    preload_metadata = setting('AWS_PRELOAD_METADATA', False)
    gzip_content_types = setting('GZIP_CONTENT_TYPES', (
        'text/css',
        'text/javascript',
        'application/javascript',
        'application/x-javascript',
    ))
    url_protocol = setting('AWS_S3_URL_PROTOCOL', 'http:')
    host = setting('AWS_S3_HOST', S3Connection.DefaultHost)
    use_ssl = setting('AWS_S3_USE_SSL', True)
    port = setting('AWS_S3_PORT', None)

    # The max amount of memory a returned file can take up before being
    # rolled over into a temporary file on disk. Default is 0: Do not roll over.
    max_memory_size = setting('AWS_S3_MAX_MEMORY_SIZE', 0)

    def __init__(self, acl=None, bucket=None, **settings):
        # check if some of the settings we've provided as class attributes
        # need to be overwritten with values passed in here
        for name, value in settings.items():
            if hasattr(self, name):
                setattr(self, name, value)

        # For backward-compatibility of old differing parameter names
        if acl is not None:
            self.default_acl = acl
        if bucket is not None:
            self.bucket_name = bucket

        self.location = (self.location or '').lstrip('/')
        # Backward-compatibility: given the anteriority of the SECURE_URL setting
        # we fall back to https if specified in order to avoid the construction
        # of unsecure urls.
        if self.secure_urls:
            self.url_protocol = 'https:'

        self._entries = {}
        self._bucket = None
        self._connection = None

        if not self.access_key and not self.secret_key:
            self.access_key, self.secret_key = self._get_access_keys()

    @property
    def connection(self):
        if self._connection is None:
            self._connection = self.connection_class(
                self.access_key,
                self.secret_key,
                is_secure=self.use_ssl,
                calling_format=self.calling_format,
                host=self.host,
                port=self.port,
            )
        return self._connection

    @property
    def bucket(self):
        """
        Get the current bucket. If there is no current bucket object
        create it.
        """
        if self._bucket is None:
            self._bucket = self._get_or_create_bucket(self.bucket_name)
        return self._bucket

    @property
    def entries(self):
        """
        Get the locally cached files for the bucket.
        """
        if self.preload_metadata and not self._entries:
            self._entries = dict((self._decode_name(entry.key), entry)
                                 for entry in self.bucket.list(prefix=self.location))
        return self._entries

    def _get_access_keys(self):
        """
        Gets the access keys to use when accessing S3. If none
        are provided to the class in the constructor or in the
        settings then get them from the environment variables.
        """

        def lookup_env(names):
            for name in names:
                value = os.environ.get(name)
                if value:
                    return value

        access_key = self.access_key or lookup_env(self.access_key_names)
        secret_key = self.secret_key or lookup_env(self.secret_key_names)
        return access_key, secret_key

    def _get_or_create_bucket(self, name):
        """
        Retrieves a bucket if it exists, otherwise creates it.
        """
        try:
            return self.connection.get_bucket(name,
                                              validate=self.auto_create_bucket)
        except self.connection_response_error:
            if self.auto_create_bucket:
                bucket = self.connection.create_bucket(name)
                bucket.set_acl(self.bucket_acl)
                return bucket
            raise ImproperlyConfigured("Bucket %s does not exist. Buckets "
                                       "can be automatically created by "
                                       "setting AWS_AUTO_CREATE_BUCKET to "
                                       "``True``." % name)

    def _clean_name(self, name):
        """
        Cleans the name so that Windows style paths work
        """
        # Normalize Windows style paths
        clean_name = posixpath.normpath(name).replace('\\', '/')

        # os.path.normpath() can strip trailing slashes so we implement
        # a workaround here.
        if name.endswith('/') and not clean_name.endswith('/'):
            # Add a trailing slash as it was stripped.
            return clean_name + '/'
        else:
            return clean_name

    def _normalize_name(self, name):
        """
        Normalizes the name so that paths like /path/to/ignored/../something.txt
        work. We check to make sure that the path pointed to is not outside
        the directory specified by the LOCATION setting.
        """
        try:
            return safe_join(self.location, name)
        except ValueError:
            raise SuspiciousOperation("Attempted access to '%s' denied." %
                                      name)

    def _encode_name(self, name):
        return smart_str(name, encoding=self.file_name_charset)

    def _decode_name(self, name):
        return force_text(name, encoding=self.file_name_charset)

    def _compress_content(self, content):
        """Gzip a given string content."""
        zbuf = BytesIO()
        zfile = GzipFile(mode='wb', compresslevel=6, fileobj=zbuf)
        try:
            zfile.write(force_bytes(content.read()))
        finally:
            zfile.close()
        zbuf.seek(0)
        content.file = zbuf
        content.seek(0)
        return content

    def _open(self, name, mode='rb'):
        name = self._normalize_name(self._clean_name(name))
        f = self.file_class(name, mode, self)
        if not f.key:
            raise IOError('File does not exist: %s' % name)
        return f

    def _save(self, name, content):
        cleaned_name = self._clean_name(name)
        name = self._normalize_name(cleaned_name)
        headers = self.headers.copy()
        content_type = getattr(content, 'content_type',
                               mimetypes.guess_type(name)[0] or self.key_class.DefaultContentType)

        # setting the content_type in the key object is not enough.
        headers.update({'Content-Type': content_type})

        if self.gzip and content_type in self.gzip_content_types:
            content = self._compress_content(content)
            headers.update({'Content-Encoding': 'gzip'})

        # content.name = cleaned_name
        encoded_name = self._encode_name(name)

        key = self.bucket.get_key(encoded_name)
        if not key:
            key = self.bucket.new_key(encoded_name)
        if self.preload_metadata:
            self._entries[encoded_name] = key
            key.last_modified = datetime.utcnow().strftime(ISO8601)

        key.set_metadata('Content-Type', content_type)
        self._save_content(key, content, headers=headers)
        return cleaned_name

    def _save_content(self, key, content, headers):
        # only pass backwards incompatible arguments if they vary from the default
        kwargs = {}
        if self.encryption:
            kwargs['encrypt_key'] = self.encryption
        mp = self.bucket.initiate_multipart_upload(
            key.name,
            headers=headers,
            reduced_redundancy=self.reduced_redundancy,
            policy=self.default_acl,
            **kwargs
        )
        chunk_size = 52428800
        source_size = os.stat(content.name).st_size
        chunk_count = int(math.ceil(source_size / chunk_size)) + 1
        print ('_save_content:key_name.content_name.source_size.chunk_count = ({}).({}).({}).({})'.format(key.name, content.name, source_size, chunk_count))
        for i in range(chunk_count):
            offset = chunk_size * i
            bytes = min(chunk_size, source_size - offset)
            with FileChunkIO(content.name, 'r', offset=offset, bytes=bytes) as fp:
                print('_save_content:key_name.content_name.source_size.chunk_count = ({}).({}).({}).({}).uploading.({})'.format(key.name, content.name, source_size, chunk_count, i+1))
                mp.upload_part_from_file(fp, part_num=i + 1)
        mp.complete_upload()
        # key.set_contents_from_file(content, headers=headers,
        # policy=self.default_acl,
        #                            reduced_redundancy=self.reduced_redundancy,
        #                            rewind=True, **kwargs)

    def delete(self, name):
        name = self._normalize_name(self._clean_name(name))
        self.bucket.delete_key(self._encode_name(name))

    def exists(self, name):
        name = self._normalize_name(self._clean_name(name))
        if self.entries:
            return name in self.entries
        k = self.bucket.new_key(self._encode_name(name))
        return k.exists()

    def listdir(self, name):
        name = self._normalize_name(self._clean_name(name))
        # for the bucket.list and logic below name needs to end in /
        # But for the root path "" we leave it as an empty string
        if name and not name.endswith('/'):
            name += '/'

        dirlist = self.bucket.list(self._encode_name(name))
        files = []
        dirs = set()
        base_parts = name.split("/")[:-1]
        for item in dirlist:
            parts = item.name.split("/")
            parts = parts[len(base_parts):]
            if len(parts) == 1:
                # File
                files.append(parts[0])
            elif len(parts) > 1:
                # Directory
                dirs.add(parts[0])
        return list(dirs), files

    def size(self, name):
        name = self._normalize_name(self._clean_name(name))
        if self.entries:
            entry = self.entries.get(name)
            if entry:
                return entry.size
            return 0
        return self.bucket.get_key(self._encode_name(name)).size

    def modified_time(self, name):
        name = self._normalize_name(self._clean_name(name))
        entry = self.entries.get(name)
        # only call self.bucket.get_key() if the key is not found
        # in the preloaded metadata.
        if entry is None:
            entry = self.bucket.get_key(self._encode_name(name))
        # Parse the last_modified string to a local datetime object.
        return parse_ts(entry.last_modified)

    def url(self, name, headers=None, response_headers=None):
        # Preserve the trailing slash after normalizing the path.
        name = self._normalize_name(self._clean_name(name))
        if self.custom_domain:
            return "%s//%s/%s" % (self.url_protocol,
                                  self.custom_domain, filepath_to_uri(name))
        return self.connection.generate_url(self.querystring_expire,
                                            method='GET', bucket=self.bucket.name, key=self._encode_name(name),
                                            headers=headers,
                                            query_auth=self.querystring_auth, force_http=not self.secure_urls,
                                            response_headers=response_headers)

    def get_available_name(self, name):
        """ Overwrite existing file with the same name. """
        if self.file_overwrite:
            name = self._clean_name(name)
            return name
        return super(S3BotoStorage, self).get_available_name(name)

AttributeError: 'Module_six_moves_urllib_parse' object has no attribute 'urljoin'

Django==1.6
boto==2.38.0
django-storages-redux==1.2.3

$ ./manage.py collectstatic

You have requested to collect static files at the destination
location as specified in your settings.

This will overwrite existing files!
Are you sure you want to do this?

Type 'yes' to continue, or 'no' to cancel: yes
Traceback (most recent call last):
  File "./manage.py", line 15, in <module>
    execute_from_command_line(sys.argv)
  File "/Users/username/.virtualenvs/projectname/lib/python2.7/site-packages/django/core/management/__init__.py", line 399, in execute_from_command_line
    utility.execute()
  File "/Users/username/.virtualenvs/projectname/lib/python2.7/site-packages/django/core/management/__init__.py", line 392, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/Users/username/.virtualenvs/projectname/lib/python2.7/site-packages/django/core/management/base.py", line 242, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/Users/username/.virtualenvs/projectname/lib/python2.7/site-packages/django/core/management/base.py", line 285, in execute
    output = self.handle(*args, **options)
  File "/Users/username/.virtualenvs/projectname/lib/python2.7/site-packages/django/core/management/base.py", line 415, in handle
    return self.handle_noargs(**options)
  File "/Users/username/.virtualenvs/projectname/lib/python2.7/site-packages/django/contrib/staticfiles/management/commands/collectstatic.py", line 167, in handle_noargs
    collected = self.collect()
  File "/Users/username/.virtualenvs/projectname/lib/python2.7/site-packages/django/contrib/staticfiles/management/commands/collectstatic.py", line 111, in collect
    handler(path, prefixed_path, storage)
  File "/Users/username/.virtualenvs/projectname/lib/python2.7/site-packages/django/contrib/staticfiles/management/commands/collectstatic.py", line 288, in copy_file
    if not self.delete_file(path, prefixed_path, source_storage):
  File "/Users/username/.virtualenvs/projectname/lib/python2.7/site-packages/django/contrib/staticfiles/management/commands/collectstatic.py", line 214, in delete_file
    if self.storage.exists(prefixed_path):
  File "/Users/username/.virtualenvs/projectname/lib/python2.7/site-packages/storages/backends/s3boto.py", line 435, in exists
    name = self._normalize_name(self._clean_name(name))
  File "/Users/username/.virtualenvs/projectname/lib/python2.7/site-packages/storages/backends/s3boto.py", line 362, in _normalize_name
    return safe_join(self.location, name)
  File "/Users/username/.virtualenvs/projectname/lib/python2.7/site-packages/storages/backends/s3boto.py", line 62, in safe_join
    final_path = urlparse.urljoin(final_path.rstrip('/') + "/", path)
AttributeError: 'Module_six_moves_urllib_parse' object has no attribute 'urljoin'
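
This usually indicates an old (or shadowed) six release that predates six.moves.urllib.parse.urljoin; assuming that is the case here, upgrading six typically resolves it:

pip install --upgrade six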

New configuration: BUCKET name for Static files

I think it would be a good idea to split media files and static files into two distinct buckets.

Instead of only one AWS_STORAGE_BUCKET_NAME:

We could have two:

AWS_STORAGE_BUCKET_NAME_DATA_FILES
AWS_STORAGE_BUCKET_NAME_STATIC_FILES
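
This can already be approximated without new settings by defining two storage subclasses that pass different bucket names (the bucket kwarg is accepted by S3BotoStorage.__init__, as the backend source earlier on this page shows) and pointing STATICFILES_STORAGE and DEFAULT_FILE_STORAGE at them; the bucket names below are placeholders:

from storages.backends.s3boto import S3BotoStorage


class StaticS3BotoStorage(S3BotoStorage):
    def __init__(self, *args, **kwargs):
        kwargs['bucket'] = 'example-static-bucket'
        super(StaticS3BotoStorage, self).__init__(*args, **kwargs)


class MediaS3BotoStorage(S3BotoStorage):
    def __init__(self, *args, **kwargs):
        kwargs['bucket'] = 'example-media-bucket'
        super(MediaS3BotoStorage, self).__init__(*args, **kwargs)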

Libapache support

This library is not working with libapache.

STATICFILES_STORAGE = 'storages.backends.apache_libcloud.LibCloudStorage'
DEFAULT_LIBCLOUD_PROVIDER = 'google'
LIBCLOUD_PROVIDERS = {
    'google': {
        'type': 'libcloud.storage.types.GOOGLE_STORAGE',
        'user': '<Your Google APIv1 username>',
        'key': '<Your Google APIv1 Key>',
        'bucket': 'bucket-dev',
    }
}

With the above settings, running collectstatic gives:

File "manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
File "/usr/local/lib/python3.5/site-packages/django/core/management/__init__.py", line 354, in execute_from_command_line
    utility.execute()
File "/usr/local/lib/python3.5/site-packages/django/core/management/__init__.py", line 346, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
File "/usr/local/lib/python3.5/site-packages/django/core/management/__init__.py", line 190, in fetch_command
    klass = load_command_class(app_name, subcommand)
File "/usr/local/lib/python3.5/site-packages/django/core/management/__init__.py", line 41, in load_command_class
    return module.Command()
File "/usr/local/lib/python3.5/site-packages/django/contrib/staticfiles/management/commands/collectstatic.py", line 32, in __init__
    self.storage.path('')
File "/usr/local/lib/python3.5/site-packages/django/utils/functional.py", line 225, in inner
    self._setup()
File "/usr/local/lib/python3.5/site-packages/django/contrib/staticfiles/storage.py", line 394, in _setup
    self._wrapped = get_storage_class(settings.STATICFILES_STORAGE)()
File "/usr/local/lib/python3.5/site-packages/storages/backends/apache_libcloud.py", line 52, in __init__
    (self.provider.get('type'), e))
django.core.exceptions.ImproperlyConfigured: Unable to create libcloud driver type libcloud.storage.types.GOOGLE_STORAGE: Invalid module path
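
Note that the working Google Cloud Storage configuration in the earlier libcloud issue on this page routes the type through the Provider class; assuming that is the missing piece here, the provider entry would look like:

LIBCLOUD_PROVIDERS = {
    'google': {
        # The Provider class is part of the dotted path.
        'type': 'libcloud.storage.types.Provider.GOOGLE_STORAGE',
        'user': '<Your Google APIv1 username>',
        'key': '<Your Google APIv1 Key>',
        'bucket': 'bucket-dev',
    }
}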

ResourceWarning

I get a bunch of ResourceWarnings when installing django_storages_redux-1.2.1 using pip:

Downloading/unpacking django-storages-redux
  Downloading django-storages-redux-1.2.1.tar.gz (55kB): 55kB downloaded
  Running setup.py (path:/home/vagrant/.virtualenvs/myproject/build/django-storages-redux/setup.py) egg_info for package django-storages-redux
    /home/vagrant/.virtualenvs/myproject/build/django-storages-redux/setup.py:13: ResourceWarning: unclosed file <_io.TextIOWrapper name='README.rst' mode='r' encoding='ISO-8859-1'>
      long_description=open('README.rst').read() + '\n\n' + open('CHANGELOG.rst').read(),
    /home/vagrant/.virtualenvs/myproject/build/django-storages-redux/setup.py:13: ResourceWarning: unclosed file <_io.TextIOWrapper name='CHANGELOG.rst' mode='r' encoding='ISO-8859-1'>
      long_description=open('README.rst').read() + '\n\n' + open('CHANGELOG.rst').read(),

    /home/vagrant/.virtualenvs/myproject/lib/python3.4/site-packages/setuptools/command/sdist.py:230: DeprecationWarning: 'U' mode is deprecated
      manifest = open(self.manifest, 'rbU')
Installing collected packages: django-storages-redux
  Running setup.py install for django-storages-redux
    /home/vagrant/.virtualenvs/myproject/build/django-storages-redux/setup.py:13: ResourceWarning: unclosed file <_io.TextIOWrapper name='README.rst' mode='r' encoding='ISO-8859-1'>
      long_description=open('README.rst').read() + '\n\n' + open('CHANGELOG.rst').read(),
    /home/vagrant/.virtualenvs/myproject/build/django-storages-redux/setup.py:13: ResourceWarning: unclosed file <_io.TextIOWrapper name='CHANGELOG.rst' mode='r' encoding='ISO-8859-1'>
      long_description=open('README.rst').read() + '\n\n' + open('CHANGELOG.rst').read(),

    /home/vagrant/.virtualenvs/myproject/lib/python3.4/site-packages/setuptools/command/sdist.py:230: DeprecationWarning: 'U' mode is deprecated
      manifest = open(self.manifest, 'rbU')
Successfully installed django-storages-redux

This is on Python 3.4.2 using Django 1.7.2, both on Ubuntu and Windows.

New pypi release?

There are fixes in master that address critical bugs for Google Cloud Storage. Any chance of a new PyPI release?

Wagtail tests failing when using django-storages (S3BotoStorage)

Related
wagtail/wagtail#1542

result

python runtests.py wagtail.wagtailimages --failfast

======================================================================
FAIL: test_closes_image (wagtail.wagtailimages.tests.test_models.TestGetWillowImage)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/wagtail-1.0/wagtail/wagtailimages/tests/test_models.py", line 239, in test_closes_image
    self.assertTrue(self.image.file.closed)
AssertionError: False is not true

----------------------------------------------------------------------
Ran 4 tests in 2.563s

FAILED (failures=1)
Destroying test database for alias 'default'...

Got a reply from a wagtail core developer.
I wonder what can be done to fix this issue.

This looks like a bug in django-storages to me.

When you call file.close() the file.closed property should return True.

django-storages doesn't override Django's default definition of this property: https://github.com/django/django/blob/master/django/core/files/base.py#L67-L69

Django's property determines whether the file is closed by simply accessing the .file attribute. But in django-storages, accessing .file reopens the file: https://github.com/jschneier/django-storages/blob/master/storages/backends/s3boto.py#L120-L138

So .closed always ends up returning False.

Second file upload override first one, if have same name

Here is the scenario:

  • Model Product has a image field;
  • User uploads an image to its product
  • Another user uploads another image to another product
  • First image gets overridden.

Is it a problem that django-storages should fix or should I rename the file to make it unique by user?
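
For reference, the S3 backend overwrites by default: file_overwrite defaults to the AWS_S3_FILE_OVERWRITE setting (True), and get_available_name then returns the same key (see the backend source earlier on this page). Disabling it makes Django pick a unique name instead of overwriting:

AWS_S3_FILE_OVERWRITE = False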

ResourceWarning: unclosed <ssl.SSLSocket

I keep getting this error:

(...) /lib/python3.4/site-packages/boto/connection.py:1017: ResourceWarning: unclosed <ssl.SSLSocket fd=8, family=AddressFamily.AF_INET, type=SocketType.SOCK_STREAM, proto=6, laddr=('192.168.43.100', 61734)
ex = e

Has anyone seen it?

requests not being signed after switching to a cdn backed by an s3 bucket

I had this working just fine when I used my-bucket-name.s3.amazonaws.com/ as the default storage for my site. Last night I set up CloudFlare to proxy the requests through. In doing this you do the following:

  • create a bucket name that matches the CNAME you want to serve through (cdn.mysite.com) so that the url would be cdn.mysite.com.s3.amazonaws......
  • in the CloudFlare DNS panel, you setup a CNAME for cdn.mysite.com to redirect to cdn.mysite.com.s3.amazonaws.com
  • presto, magic works.

However, when it was using my-bucket-name.s3, Django was automatically signing all of the requests for the content, which I want it to do... but when I switched to the CDN, it stopped, and I had to open the bucket up to public viewing, which I do not want.
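
If the switch to the CDN involved setting AWS_S3_CUSTOM_DOMAIN (an assumption about this setup), that would explain the behaviour: in the backend source earlier on this page, url() only generates signed URLs when no custom domain is configured, and returns a plain unsigned URL otherwise:

# With a custom domain configured, url() returns plain, unsigned URLs.
AWS_S3_CUSTOM_DOMAIN = 'cdn.mysite.com'
# Query string signing only applies when no custom domain is set.
AWS_QUERYSTRING_AUTH = True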

mime type issues

The S3BotoStorage lib doesn't set the correct MIME type and defaults to 'application/octet-stream'.

Hack: override the _save method and replace:

content_type = getattr(content, 'content_type',
                       mimetypes.guess_type(name)[0] or self.key_class.DefaultContentType)

with:

content_type = getattr(content._file, 'content_type',
                       mimetypes.guess_type(name)[0] or self.key_class.DefaultContentType)

Make bad request (400) error more transparent to diagnose

This issue is very closely related to this django ticket, the difference being that it's the s3boto's safe_join method, not django.utils._os.safe_join, that raises the ValueError.

The scenario for this "bad request (400)" can be triggered by the user not setting MEDIA_ROOT = "" and using a custom S3BotoStorage class that defines, say, location="media". It's a real pain to diagnose with no debug info or traceback, and even with Django logging on, all we get is

[2015-06-30 16:35:33,963] ERROR [django.security.SuspiciousOperation:176] Attempted access to '/path/to/my/proj/temp_media/uploads' denied.

which isn't massively helpful.

I encountered it in Mezzanine with s3, when I didn't set my MEDIA_ROOT=''. Mezzanine uses filebrowser in the admin for media uploads, and any pages with filebrowser upload buttons, triggered this 400.

Could it be made so that there is more of a trace or log?

Media files not deleted from Amazon S3

Hello, in a Django 1.7 project we are using the Amazon S3 storage (with boto).

The documentation says:

Deleting an object deletes the file it uses, if there are no other objects still using that file

But, as I delete my model instance the corresponding file is not deleted from S3.

The model is really simple and we don't have other objects still using that file.

Is it a bug or a problem with the documentation?
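
For background, Django itself never deletes files from storage when a model instance is deleted; if that behaviour is wanted, it has to be added explicitly, for example with a post_delete signal (a sketch; the model and field names are placeholders):

from django.db.models.signals import post_delete
from django.dispatch import receiver

from myapp.models import Document  # placeholder model with a FileField named "file"


@receiver(post_delete, sender=Document)
def delete_file_from_storage(sender, instance, **kwargs):
    # Remove the underlying S3 object when the model instance is deleted.
    if instance.file:
        instance.file.delete(save=False)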
