
Comments (8)

coleifer commented on May 22, 2024

As to the first one, I believe that is due to unpredictable ordering with dictionaries. Joins are stored using a dictionary, so whether it joins on Blog first or EntryTag first is up in the air -- this one is a bad test case.
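The ordering problem can be sketched with some hypothetical join bookkeeping (this is not peewee's actual internals): if joins are kept in a plain dict, the generated SQL depends on dict iteration order, which was arbitrary in the Python versions of this era; an insertion-ordered mapping makes it deterministic.

```python
from collections import OrderedDict

# Hypothetical join bookkeeping, not peewee's real internals. With a
# plain dict, the order of the generated JOIN clauses is up to dict
# iteration order; an OrderedDict pins it to insertion order.
joins = OrderedDict()
joins['blog'] = 'entry.blog_id = blog.id'
joins['entrytag'] = 'entrytag.entry_id = entry.pk'

sql = 'SELECT * FROM entry ' + ' '.join(
    'JOIN %s ON %s' % (table, cond) for table, cond in joins.items())
print(sql)
```

With a deterministic join order, a test that compares generated SQL strings stops being flaky.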

As to the count failure -- that's pretty odd. Looks like you're ending up with 2 extra entries per blog -- can you do any more digging?

from peewee.

madisvain commented on May 22, 2024

Well, it's not that simple. I spent some time on this, and it seems really weird. There are 10 blog rows in the database when I check, but the query somehow returns 12. As I said, really odd.

I will look into this as soon as I get some time. I want to help make this project better.
I think peewee rocks. It's simple, fast ... fantastic. Hope to use it in some of my future projects.


coleifer commented on May 22, 2024

While the one test failure is obnoxious (sql / dict ordering), I haven't been able to reproduce the other at all. Closing for now, but feel free to reopen if you come up with any ideas.


tmoertel commented on May 22, 2024

I'm encountering the same test_count failure: 200 entries expected, 240 found. But the problem isn't that there are two extra blog rows in the database. Rather, there are only 10 blog rows, as expected, but the first two get iterated over twice (see the annotated test output at the end of this report).

To see what's happening, consider the corresponding test code:

        # original from tests.py, line ~606
        for blog in SelectQuery(Blog):
            for i in xrange(20):
                self.create_entry(title='entry%d' % i, blog=blog)

There's an outer loop over the results of a select query and, within it, an inner loop that creates 20 Entry rows for the currently selected Blog row. The outer loop is lazy, however, which means the underlying SELECT statement is held open for the duration of the loop. This interacts badly with the inner loop's autocommitted INSERTs. When self.create_entry finally winds its way down to a peewee.Database.execute call on the given INSERT statement, that method finishes the operation with a commit on the database connection. This is where things go wrong.

The problem is that when certain versions of pysqlite commit a connection, they also reset all statements and cursors open on that connection. That means the SELECT statement behind the outer loop gets reset and replays the first two rows. (Why not just the first row? Because pysqlite fetches one row ahead; by the time the first line of the inner loop executes, the second blog row has already been fetched and so it, too, will get replayed.)
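The replay can be modeled with a toy cursor (a simplification, not real pysqlite internals; in particular, modeling the reset as happening only on the first commit is just the assumption that reproduces the observed log):

```python
# Toy model of a lazy cursor that fetches one row ahead, with reset()
# standing in for what an implicit commit did in the affected pysqlite
# versions. A simplification for illustration, not real pysqlite code.
class PrefetchCursor:
    def __init__(self, rows):
        self.rows = rows
        self.pos = 0        # index of the next row to prefetch
        self.buf = None     # the row fetched one step ahead
        self._step()

    def _step(self):
        self.buf = self.rows[self.pos] if self.pos < len(self.rows) else None
        if self.buf is not None:
            self.pos += 1

    def reset(self):
        # A commit resets the open SELECT statement: stepping it again
        # starts over from the first row. The already-prefetched row in
        # self.buf still gets handed out first.
        self.pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        row = self.buf
        if row is None:
            raise StopIteration
        self._step()
        return row


cur = PrefetchCursor(list(range(1, 11)))  # ten blog ids
seen = []
for blog_id in cur:
    seen.append(blog_id)
    # The inner loop's autocommitted INSERTs reset the statement;
    # modeled here as a single reset after the first row.
    if len(seen) == 1:
        cur.reset()

print(seen)  # [1, 2, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] -- 12 iterations
```

Twelve iterations at 20 entries each gives the 240 the test observed.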

Here are two possible workarounds:

        # fix 1: read blogs eagerly
        for blog in list(SelectQuery(Blog)):
            for i in xrange(20):
                self.create_entry(title='entry%d' % i, blog=blog)

        # fix 2: keep laziness but corral commit w/ explicit transaction
        def create_blog_entries():
            for blog in SelectQuery(Blog):
                for i in xrange(20):
                    self.create_entry(title='entry%d' % i, blog=blog)
        test_db.commit_on_success(create_blog_entries)()
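For comparison, fix 1 can be demonstrated standalone with the stdlib sqlite3 module (a sketch of the same test shape, not the peewee test itself): because list() drains the SELECT up front, no statement is open when the per-INSERT commits happen.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE blog (id INTEGER PRIMARY KEY, title TEXT)')
conn.execute('CREATE TABLE entry (pk INTEGER PRIMARY KEY, title TEXT, '
             'blog_id INTEGER)')
conn.executemany('INSERT INTO blog (title) VALUES (?)',
                 [('a%d' % i,) for i in range(10)])
conn.commit()

# Eager: list() consumes the SELECT cursor completely before any
# insert runs, so the commits below never fire mid-iteration.
for blog_id, title in list(conn.execute('SELECT id, title FROM blog')):
    for i in range(20):
        conn.execute('INSERT INTO entry (title, blog_id) VALUES (?, ?)',
                     ('entry%d' % i, blog_id))
        conn.commit()  # mimics the autocommit-per-statement behavior

count = conn.execute('SELECT COUNT(pk) FROM entry').fetchone()[0]
print(count)  # 200
```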

I hope that explanation makes sense.

Cheers,
Tom

P.S. As promised, here's the debug log that shows what statements get executed when the problem occurs. I've added annotations to make it easier to correlate the statements with the original test code.


[thor@helium peewee (master)]$ nosetests -v tests:ModelTestCase
test_cloning (tests.ModelTestCase) ... ok
test_count (tests.ModelTestCase) ... FAIL
test_count_with_joins_issue27 (tests.ModelTestCase) ... ok
test_delete (tests.ModelTestCase) ... ok
test_inner_joins (tests.ModelTestCase) ... ok
test_insert (tests.ModelTestCase) ... ok
test_multi_joins (tests.ModelTestCase) ... ok
test_outer_joins (tests.ModelTestCase) ... ok
test_pagination (tests.ModelTestCase) ... ok
test_update (tests.ModelTestCase) ... ok

======================================================================
FAIL: test_count (tests.ModelTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/peewee/tests.py", line 614, in test_count
    self.assertEqual(count, 200)
AssertionError: 240 != 200
-------------------- >> begin captured logging << --------------------

peewee.logger: DEBUG: ('DROP TABLE IF EXISTS "membership";', None)
peewee.logger: DEBUG: ('DROP TABLE IF EXISTS "member";', None)
peewee.logger: DEBUG: ('DROP TABLE IF EXISTS "team";', None)
peewee.logger: DEBUG: ('DROP TABLE IF EXISTS "relationship";', None)
peewee.logger: DEBUG: ('DROP TABLE IF EXISTS "users";', None)
peewee.logger: DEBUG: ('DROP TABLE IF EXISTS "entrytag";', None)
peewee.logger: DEBUG: ('DROP TABLE IF EXISTS "entry";', None)
peewee.logger: DEBUG: ('DROP TABLE IF EXISTS "blog";', None)

peewee.logger: DEBUG: ('CREATE TABLE "blog" ("id" INTEGER NOT NULL PRIMARY KEY, "title" VARCHAR(255) NOT NULL);', None)
peewee.logger: DEBUG: ('CREATE TABLE "entry" ("pk" INTEGER NOT NULL PRIMARY KEY, "title" VARCHAR(50) NOT NULL, "content" TEXT NOT NULL, "pub_date" DATETIME, "blog_id" INTEGER NOT NULL REFERENCES blog (id) ON DELETE CASCADE);', None)
peewee.logger: DEBUG: ('CREATE  INDEX "entry_blog_id" ON "entry"("blog_id");', None)
peewee.logger: DEBUG: ('CREATE TABLE "entrytag" ("id" INTEGER NOT NULL PRIMARY KEY, "tag" VARCHAR(50) NOT NULL, "entry_id" INTEGER NOT NULL REFERENCES entry (pk));', None)
peewee.logger: DEBUG: ('CREATE  INDEX "entrytag_entry_id" ON "entrytag"("entry_id");', None)
peewee.logger: DEBUG: ('CREATE TABLE "users" ("id" INTEGER NOT NULL PRIMARY KEY, "username" VARCHAR(50) NOT NULL, "blog_id" INTEGER REFERENCES blog (id), "active" SMALLINT NOT NULL);', None)
peewee.logger: DEBUG: ('CREATE  INDEX "users_blog_id" ON "users"("blog_id");', None)
peewee.logger: DEBUG: ('CREATE  INDEX "users_active" ON "users"("active");', None)
peewee.logger: DEBUG: ('CREATE TABLE "relationship" ("id" INTEGER NOT NULL PRIMARY KEY, "from_user_id" INTEGER NOT NULL REFERENCES users (id), "to_user_id" INTEGER NOT NULL REFERENCES users (id));', None)
peewee.logger: DEBUG: ('CREATE  INDEX "relationship_to_user_id" ON "relationship"("to_user_id");', None)
peewee.logger: DEBUG: ('CREATE  INDEX "relationship_from_user_id" ON "relationship"("from_user_id");', None)
peewee.logger: DEBUG: ('CREATE TABLE "team" ("id" INTEGER NOT NULL PRIMARY KEY, "name" VARCHAR(255) NOT NULL);', None)
peewee.logger: DEBUG: ('CREATE TABLE "member" ("id" INTEGER NOT NULL PRIMARY KEY, "username" VARCHAR(255) NOT NULL);', None)
peewee.logger: DEBUG: ('CREATE TABLE "membership" ("id" INTEGER NOT NULL PRIMARY KEY, "team_id" INTEGER NOT NULL REFERENCES team (id), "member_id" INTEGER NOT NULL REFERENCES member (id));', None)
peewee.logger: DEBUG: ('CREATE  INDEX "membership_member_id" ON "membership"("member_id");', None)
peewee.logger: DEBUG: ('CREATE  INDEX "membership_team_id" ON "membership"("team_id");', None)



# From tests.py:
#        for i in xrange(10):
#            self.create_blog(title='a%d' % i)

peewee.logger: DEBUG: ('INSERT INTO "blog" ("title") VALUES (?)', [u'a0'])
... 8 lines omitted ...
peewee.logger: DEBUG: ('INSERT INTO "blog" ("title") VALUES (?)', [u'a9'])


# From tests.py:
#         count = SelectQuery(Blog).count()
#         self.assertEqual(count, 10)
# 
#         count = Blog.select().count()
#         self.assertEqual(count, 10)

peewee.logger: DEBUG: ('SELECT COUNT(id) FROM "blog"', [])
peewee.logger: DEBUG: ('SELECT COUNT(id) FROM "blog"', [])



# From tests.py:
#         for blog in SelectQuery(Blog):
#             for i in xrange(20):
#                 self.create_entry(title='entry%d' % i, blog=blog)

peewee.logger: DEBUG: ('SELECT "id", "title" FROM "blog"', [])

# NOTE that the first two rows get replayed
peewee.logger: DEBUG: ('INSERT INTO "entry" ("blog_id","content","pub_date","title") VALUES (?,?,?,?)', [1, '', None, u'entry0'])
... 19 lines omitted ...
peewee.logger: DEBUG: ('INSERT INTO "entry" ("blog_id","content","pub_date","title") VALUES (?,?,?,?)', [2, '', None, u'entry0'])
... 19 lines omitted ...
peewee.logger: DEBUG: ('INSERT INTO "entry" ("blog_id","content","pub_date","title") VALUES (?,?,?,?)', [1, '', None, u'entry0'])
... 19 lines omitted ...
peewee.logger: DEBUG: ('INSERT INTO "entry" ("blog_id","content","pub_date","title") VALUES (?,?,?,?)', [2, '', None, u'entry0'])
... 19 lines omitted ...
peewee.logger: DEBUG: ('INSERT INTO "entry" ("blog_id","content","pub_date","title") VALUES (?,?,?,?)', [3, '', None, u'entry0'])
... 19 lines omitted ...
peewee.logger: DEBUG: ('INSERT INTO "entry" ("blog_id","content","pub_date","title") VALUES (?,?,?,?)', [4, '', None, u'entry0'])
... 19 lines omitted ...
peewee.logger: DEBUG: ('INSERT INTO "entry" ("blog_id","content","pub_date","title") VALUES (?,?,?,?)', [5, '', None, u'entry0'])
... 19 lines omitted ...
peewee.logger: DEBUG: ('INSERT INTO "entry" ("blog_id","content","pub_date","title") VALUES (?,?,?,?)', [6, '', None, u'entry0'])
... 19 lines omitted ...
peewee.logger: DEBUG: ('INSERT INTO "entry" ("blog_id","content","pub_date","title") VALUES (?,?,?,?)', [7, '', None, u'entry0'])
... 19 lines omitted ...
peewee.logger: DEBUG: ('INSERT INTO "entry" ("blog_id","content","pub_date","title") VALUES (?,?,?,?)', [8, '', None, u'entry0'])
... 19 lines omitted ...
peewee.logger: DEBUG: ('INSERT INTO "entry" ("blog_id","content","pub_date","title") VALUES (?,?,?,?)', [9, '', None, u'entry0'])
... 19 lines omitted ...
peewee.logger: DEBUG: ('INSERT INTO "entry" ("blog_id","content","pub_date","title") VALUES (?,?,?,?)', [10, '', None, u'entry0'])
... 19 lines omitted ...
peewee.logger: DEBUG: ('SELECT COUNT(pk) FROM "entry"', [])
--------------------- >> end captured logging << ---------------------

----------------------------------------------------------------------
Ran 10 tests in 56.481s

FAILED (failures=1)


coleifer commented on May 22, 2024

Brilliant! I figured it went somewhere along those lines but could not figure out why just blog 0 & 1 were duplicated -- thank you for the report, it was educational!


coleifer commented on May 22, 2024

per your suggestion, eb91835


madisvain commented on May 22, 2024

I thought it was something special to do with sqlite but could never figure this out. Thank you very much for explaining this.


commadelimited commented on May 22, 2024

@tmoertel's comment did the trick for me. Thank you.

