Comments (45)
tests for the natural usage of ListField with ForeignKeys are on a separate branch);
Interested in this. Does it accomplish the same thing I tried with this pull request django-nonrel/mongodb-engine#83 and my custom ForeignKey field https://github.com/emperorcezar/Django-utils/blob/master/fields.py ?
from djangotoolbox.
You'll still have to use model_instance.pk rather than simply model_instance (but other than that it should now work with Mongo, not just GAE). I did some research on the issue, but so far this only ended with some tests and a broken, one-sided solution:
https://github.com/django-nonrel/djangotoolbox/compare/feature/collections-with-foreignkeys-tests
The hard part seems to be making ForeignKeys in a ListField fetched from the database appear like model instances -- Django uses descriptors / managers for that, so it would be nice to do something along this line for collection fields.
from djangotoolbox.
In your tests you seem to be pushing, querying, and returning model instances. In what circumstances do you need to use model_instance.pk?
from djangotoolbox.
Thanks so much for the amazing work!
In my opinion, however, a big part of the style changes are just wrong.
- We really don't need to break all lines after 79 characters. 90 chars is okay sometimes. For instance, django-nonrel/mongodb-engine@12ae01c#L1L57 is not improving readability at all.
- Parenthesis-imports are okay and much more readable than using
\
. django-nonrel/mongodb-engine@12ae01c#L2L8 - Not all comments need to end in a dot, nor do exception messages need to end in a dot.
- XXX is not the same thing as TODO
- django-nonrel/mongodb-engine@12ae01c#L21L8 django-nonrel/mongodb-engine@12ae01c#L50L22 okay these are really stupid. We won't do any styling on autogenerated/auto"provided" code.
- django-nonrel/mongodb-engine@12ae01c#L54L9 this doesn't help readability either.
- ...
I really really appreciate your hard work but you need to stop thinking about PEP8 (and friends) as hard rules. They're just guidelines and it's okay to think on your own sometimes.
from djangotoolbox.
Returning model instances is not solved -- the test will just fail there; the solution on the foreignkey branch is just a (bad) idea. Take
a look at this test: https://github.com/django-nonrel/djangotoolbox/blob/feature/type-conversions-refactor-2/djangotoolbox/tests.py#L230, this is what works after refactor.
from djangotoolbox.
Update: due to the renaming of DatabaseCreation.nonrel_db_type
the nonrel branch:
https://github.com/django-nonrel/django-nonrel/compare/feature/drop-related-db-type
is required for the -2
branches to work (not having it results in an infinite recursion).
from djangotoolbox.
Woa GitHub's formatting of email replies is really horrible. Can you just re-post the whole thing using the webinterface?
from djangotoolbox.
Returning model instances is not solved -- the test will just fail there; the solution on the foreignkey branch is just a (bad) idea. Take a look at this test: https://github.com/django-nonrel/djangotoolbox/blob/feature/type-conversions-refactor-2/djangotoolbox/tests.py#L230, this is what works after refactor.
The only way I found to solve that was overriding the to_python on the ForeignKey field and calling rel.to.objects.get
which worked well in my case, but my bet is you're not overriding the ForeignKey field.
https://github.com/emperorcezar/Django-utils/blob/master/fields.py#L18
from djangotoolbox.
In my opinion, however, a big part of the style changes are just wrong.
Probably you're simply right, nevertheless I'll try to explain myself.
- We really don't need to break all lines after 79 characters. 90 chars is okay sometimes. For instance, django-nonrel/mongodb-engine@12ae01c#L1L57 is not improving readability at all.
I'd say 120 is always OK. There may be a few that did decrease readability, but in most cases there's no harm done. And the time spent thinking if 80 / 90 / 100 is fine here or there can be better used.
- Parenthesis-imports are okay and much more readable than using
\
. django-nonrel/mongodb-engine@12ae01c#L2L8
OK, but lets use the same thing in toolbox and GAE too. Fixed.
- Not all comments need to end in a dot, nor do exception messages need to end in a dot.
Yup, this is my personal bias. Can't be helped, sorry :-)
- XXX is not the same thing as TODO
Undone. Maybe lets also explain XXX in docs somewhere -- I can think of several different meanings of it (like FIXME, like HACK, "had no time to properly check", "a critical problem" or any of these); some more TODOs should probably be turned to XXXs then.
- django-nonrel/mongodb-engine@12ae01c#L21L8 django-nonrel/mongodb-engine@12ae01c#L50L22 okay these are really stupid. We won't do any styling on autogenerated/auto"provided" code.
Quite right, some mindless automation must have been in work here. Fixed.
- django-nonrel/mongodb-engine@12ae01c#L54L9 this doesn't help readability either.
It (sometimes) does help maintainability though (you don't have to reindent after adding another case) and (often) does not harm readability significantly. Some guides recommend this explicitly -- the Google one above, but also some for different languages (surely you'll find some that say differently); I'd say the recommendation is sound. Of course, there are exceptions, but I don't think this specific case deserves to be one.
I really really appreciate your hard work but you need to stop thinking about PEP8 (and friends) as hard rules. They're just guidelines and it's okay to think on your own sometimes.
Maybe you'd be surprised to see the styling I tend to use in private (with a lot of camelCase, whole paragraphs on a single line, and even most inline comments lowercase ;-) However, I believe it's good to stick to something for a project that many are working on -- this doesn't have to be PEP or anything other, just something.
I wouldn't like to push these style changes against anyone, so if you'd rather have just the refactor that's OK (some renaming and merging will have to be done, but otherwise it shouldn't be very hard to just drop these first commits).
from djangotoolbox.
@emperorcezar Yes, I had the ambition of making it work with the standard ForeignKey :-) I've also given a short attempt at the descriptor approach (something like defining __get__
and __set__
for AbstractIterableField
) -- didn't work, but should be possible.
You should be able to drop the db_type
method in your class and do just:
def get_db_prep_value(self, value, connection, prepared=False):
if isinstance(type(value), type(self.rel.to)):
return value.pk
else:
return value
instead of both get_db_prep_save
and get_db_prep_lookup
with the refactor.
from djangotoolbox.
Okay just did the review. This looks really really great. I added a few comments here and there. I didn't review the App Engine part because I'm not familiar with (nor interested in) it.
Thanks thanks thanks!
from djangotoolbox.
ah btw I did some MongoDB Engine compiler refactoring a while ago, if you like you can incorporate what's left after your refactoring. https://github.com/django-nonrel/mongodb-engine/tree/feature/compiler-cleanup
from djangotoolbox.
looks really great now. RFC from my point of view for djangotoolbox and MongoDB Engine.
from djangotoolbox.
@jonashaag Thanks for the review!
Second update:
- Fixed in-memory filtering (
matches_filters
only used by GAE). It was comparing non-primary key model values with database values, which doesn't work in a few cases. - Fixed decimal sorting for Mongo (by pulling GAE solution to toolbox).
- Did some more compiler refactoring:
- incorporating Jonas's branch;
- making toolbox pass converted values to filter processing (thus mostly letting back-ends only worry about writing a proper
value_for/from_db
and otherwise not having to deal with conversions at all; toolbox now also knows not to convertisnull
arguments and can handle lists of arguments forin
,range
oryear
lookups properly, so writing those conversion methods should be simple); - passing related model primary key field instead of an original related field to conversions (as generally you should convert a foreign key value in the same way as a "parent" primary key value), passing fields instead of columns to
order_by
and lookup name tovalue_for_db
.
- Coded an alternative solution for the AutoField "just ints" problem -- getting rid of the
value_to_db_auto
(we're blanking rest of thevalue_to_db_*
methods as most conversions needed are not tied to a specific field kind) and also giving AutoField's validation its strength back. There are different solutions possible (global setting / descriptor instead of the metaclass / just some public methods...), please take a closer look at this one -- I've tried a couple of different things, but nothing seems perfect. - Retested for GAE 1.6.3 and MongoDB 2.0.3.
The Django branch now uses the same name as the rest:
https://github.com/django-nonrel/django-nonrel/compare/feature/type-conversions-refactor-2
from djangotoolbox.
Hey Wojtek, are you going to merge your branches soon?
from djangotoolbox.
Anytime we decide that's the time :-) I could find some more things to improve or fix, but it would be better done on some new branches, these are already big enough.
from djangotoolbox.
If you think they're ready for merging, go ahead.
from djangotoolbox.
@wrwrwr would you be kind enough to wait a couple of days before merging this? I'd like to take a look at the patch and issue a review. I agree that it might be hard if not impossible to stick with the django's value_to_db_*
idiom but to me it's a big step backward to make DateField.get_db_prep_(value|save)
return a value that's not ready for database insertion. I mean... all the other backend work this way, how is someone porting code from django to django-nonrel is supposed to know that? What happens with people writing new field subclasses, do they have to write their custom backend to wrap value_to_db
to return the correct type for their field? How is someone calling SetField.value_to_string
is supposed to know that it will return a list?
from djangotoolbox.
@charettes Certainly, but please compare to what nonrel currently has to offer.
DateField.get_db_prep_*
-- you already have to use the nonrel conversions for many (all) fields (nonrel fields, AutoFields
, any "long string" fields with GAE etc.), so I find it more consistent this way (toolbox blanks most of the vanilla database conversions; additonal conversions are currently defined on compilers; at least you'll learn quicker that you should also pass values through them ;-) Potentially you could call value_for_db
from Field.get_db_prep_*
-- see a consideration of two future options: b4315fc#L0R109.
Field subclasses -- they don't need to do anything differently than they'd do with vanilla, the framework is due to back-ends needing to do additional value preparation.
value_to_string
-- it's the same on develop... Nonrel fields serialization is still broken anyway, I believe the method should really look more or less like this (for List/Set/DictField
, EmbeddedModelField
needs a bit more):
return smart_unicode(self._map(self.item_field.value_to_string, self._get_val_from_obj(obj)))
However, I remember someone else was working on the serialization, and this is not the place to fix it.
from djangotoolbox.
Third update:
- Yet another AufoField type solution -- a bit nonstandard usage of
Settings
instead of the static variable, in case someone needs to know / set the type in a context in which fields can't be imported. Also turned the type mismatch exception into a warning -- there are some circumstances that back-ends with different AutoField types may still be used (e.g. without using AutoField at all). - Introduced a
MONGO_INT_BASED_AUTOFIELDS
setting (defaults to False) -- ObjectIds are the natural, scalable choice, but this seems the only way to run Django test suite (at the moment ~25% tests fail mostly due to ManyToManyFields / abstract models). The code is nearly a copy of the MongoDB docs one (an additional counters collection + findAndModify). - Moved remaining parts of the value conversion framework to
DatabaseOperations
, not only compilers may need to use them. - Did some small refactoring to
_get_ordering
that lets back-ends usequery.standard_ordering
directly when there is no field ordering defined.
from djangotoolbox.
I don't like the MONGO_INT_BASED_AUTOFIELDS
thing. It doesn't help at all to make the Django test suite pass with integer IDs because all real-live use cases use ObjectIds.
I understand your desire to run the Django test suite against MongoDB Engine but please have the hacks required for that in a separate branch. (That of course may be on this GitHub repo!)
from djangotoolbox.
I have to disagree. If we want our patches accepted into Django core, then we're going to have to deal with the tests. While this may not be the way, the issue needs to be tackled.
The "cleaner" way is to open tickets against the Django tests and patch those.
from djangotoolbox.
I totally agree Adam! But I don't think the Django folks care about our backend being testing well, only the changes we make to Django.
from djangotoolbox.
Good point. I don't know how they test things. Do they have a test server that runs against each of the backends, or do they just run it against sqlite and call it good?
from djangotoolbox.
Alright, I'll move the MONGO_INT_BASED_AUTOFIELD
to a separate branch. The Django suite is quite slow anyway, hard to use in practice, but even with the AutoField substitute it can detect a lot of other problems (with the additional inconvenience of having to filter out all those "This query is not supported(...)" / "Multi-table inheritance(...)" / "Cursors (...)" failures). I had a slight hope you won't consider this a hack, but an alternative solution :-)
from djangotoolbox.
https://code.djangoproject.com/changeset/17641
https://code.djangoproject.com/changeset/17698
probably fixes the same thing you did with explicit lazy object handling?
from djangotoolbox.
Almost looks so, thanks for the pointers, should I rather remove the checks already or mark them for consideration in 1.4?
Nevertheless, these changes make me wonder about proxies in updates / lookups (toolbox currently only does the lazy evaluation for lookups) and wrappers around objects that shouldn't be cast to a string... Take a look at this comment too: https://code.djangoproject.com/ticket/10498#comment:12.
from djangotoolbox.
I'd keep the test, skipping until 1.4, and remove the fixing code.
from djangotoolbox.
Hmmm, the test will likely fail with 1.4 too (it has cases for lookups with lazy translations and it doesn't seem Django will get any further refactoring for 1.4 -- unless I've missed some big ticket; currently it doesn't need to handle this case because lookup values are always cast to strings for SQL back-ends).
The issue is not significant enough to really care about it and I'm not sure what will actually happen with 1.4, but I guess I'll leave the code on another branch; we might want to bring it back (at least to some future GAE database-adapter or just for lookups).
from djangotoolbox.
@emperorcezar here's the CI they use to test against each backend. Looks like they've broken something recently... I should have time to review the pull request today, sorry for the delay.
from djangotoolbox.
Hmm. There's not oracle on there so they don't seem to run it against all official backends.
from djangotoolbox.
@emperorcezar It used to be there, they removed it because they were having trouble setting it up on the ci box.
from djangotoolbox.
@jonashaag I've checked that the Django changes don't fix lookups (nor some potential cases with using conversions directly, not through the sql.Query) and confirmed that's also a problem for MongoDB (fails with InvalidDocument
: cannot convert value of type <class 'django.utils.functional.__proxy__'> to bson
), so I'm only marking this for reconsideration for now (testing if it's a lookup or not just adds an unnecessary condition).
from djangotoolbox.
right
from djangotoolbox.
Merged into develop. Thanks to everyone involved!
from djangotoolbox.
There's a flaw in your AutoField handling implementation -- it doesn't not allow for using e.g. SQL and MongoDB at the same time (because the AutoField type is hardcoded).
The AutoField type should depend on the backend used for each model.
from djangotoolbox.
Yes, that's true, this was by design.
Models/fields are not tied to back-ends, they only get a database connection when preparing a value for the database, however some AutoField
(type-dependent) value processing may happen without any database interaction (value validation, serialization / deserialization).
If we want to support multi-database settings with differing AutoField
types there won't be effective field-level value validation, what can potentially result in problems in non-database related areas (which may expect AutoField
values to be properly validated even though they have not passed through the database layer). It's more or less what we had previously, so maybe that's acceptable?
I've repushed the branches with two more solutions (two commits for the nonrel repo: moving to a wrong place or dropping the validation). I can also bring back the original check:
if not (value is None or isinstance(value, (basestring, int, long))):
raise exceptions.ValidationError(self.error_messages['invalid'])
but that doesn't seem much better than doing no validation at all (and doesn't work for things that could be cast to unicode
or int
, but are not instances of their subclasses).
from djangotoolbox.
It's critical to support multi-database settings. That's been touted as a benefit of using non-rel for a very long time. To break it would break a lot of installations.
from djangotoolbox.
I guess I'll merge one of the "database setting" solutions tomorrow then. I'm a bit worried about the ineffective validation, but seems there is no way around it. A careful analysis on how AutoField
values are used would be nice -- with nonrel changes (master or develop-to-be) something like this: mark_safe("The id is <em>%s</em>." % field.pk)
may become a script-injection vector where it previously wasn't and things like log spoofing may be a bit easier to carry out (but the lack of the (field-layer) validation shouldn't make any vulnerability on its own).
Do you want the old check back (so I'm not the one to blame ;-) or should I rather drop the validation completely?
from djangotoolbox.
These questions can get pretty philosophic btw ;-)
I guess I prefer the solution that does validation based on the backend's AutoField type (i.e. PKs in forms for a MongoDB model must be ObjectIds). So the backend-setting approach probably makes most sense. However why don't we just use connection.ops.convert-autofield-value-to-what-the-backend-understands()
? So that to-autopk-conversions are nicely abstracted away from the model code.
from djangotoolbox.
Back-end setting makes a lot of sense, but it's not possible to validate using it (field can validate a value that later may be used by various back-ends or no back-end; technically it's something separate from the database layer, you don't have the connection or anything like that available). I did an attempt of pushing the validation for AutoField
to the database layer, but that turns rather ugly -- Django doesn't expect it there, and it's not effective until the field is stored in the database.
The database casts / conversions are another thing, you're right: a method rather than a type-setting is better here, and that's actually what we had before :-)
So the current situation is (on tcr-2 branches):
value_to_db_auto
is back (with a minor cleanup I did some time in the past, but generally that's the original solution; the casts could be moved tovalue_for_db
if it makes it into Django -- but that should probably happen together with a deeper abstraction refactor that's described on the SoC page);AutoField's
field-layer validation is dropped -- there is no way of doing it properly and I believe not keeping the instance check makes the situation clearer (you should expectAutoFields
to hold values of various types, including arbitrary strings, and pass them through output filtering suitable for context).
from djangotoolbox.
I don't get why you think this isn't something that can be validated properly -- after all, this is code of the database abstraction layer, so we can safely assume we have a database connection (more specifically a DatabaseWrapper
instance). Maybe you're confusing form validation and model validation here?
from djangotoolbox.
This is all about field/form validation (done in to_python
where there's no connection and for this purpose it can't be easily made available), this is the one that have been changed/dropped -- vanilla Django assumes that AutoField
values will be ints
and enforces this "on input". There's no impact on the database layer (it does it's own "validation", quoting or casts; everything looks good on this level), but the int-type may have, potentially, been assumed elsewhere.
from djangotoolbox.
Agreed. The problem is that you're not thrown a validation error if you input invalid values into a FK form field but a DatabaseError which results in a HTTP 500.
But that's kind of a no-realworld-relevance usecase and the fix promises to be complex, so I'm fine with no validation at all.
Feel free to merge the branches if you think they're ready.
Well, seems like we've got this beast done :-) Thank you so much for the hard work!
from djangotoolbox.
Remerged, thanks again for all the reviews and discussions, wouldn't get anywhere without that.
from djangotoolbox.
Related Issues (20)
- django.utils.timezone is not available in djangoappengine yet HOT 1
- NoReverseMatch at /admin/auth/user/52ade053f9bed911e71dfa5e/ HOT 1
- Allow Distinct Queries HOT 2
- Nonetype object has no attribute 'model' HOT 1
- Please add documentation HOT 1
- Add ordering by fields in embedded models or dicts
- Support for max_length for ListField and SetField
- Incompatibility with Django 1.7
- python 2 print statement on test files cause exception HOT 1
- Incompatability With Django 1.8 HOT 7
- DatabaseError: Unsupported type for property : <class 'django.utils.safestring.SafeText'> HOT 6
- Bug fix
- No support DateTimeField ? is it be design ?
- Please delete this issue. Unable to remove it.
- This database doesn't support filtering on non-primary key ForeignKey fields.
- RemovedInDjango110Warning: SubfieldBase has been deprecated HOT 3
- No module named from django.utils.importlib import import_module
- importlib issue after installing django 1.11 HOT 5
- DynamicSiteIDMiddleware fails when application when the request.host is an IPv6 address
- Cert errors related to stale docs build
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from djangotoolbox.