djrobstep / sqlakeyset


offset-free paging for sqlalchemy

License: The Unlicense

Makefile 0.56% Python 99.44%

sqlakeyset's Issues

Does not work with Python prior to 3.6 because f-strings are used

Hello.
I'm getting an error using this library with Python 3.5 due to the use of f-strings.
It would be better either to avoid f-strings or to clearly document that Python 3.6+ is required.

../../../virtualenv/python3.5.7/lib/python3.5/site-packages/sqlakeyset/paging.py:6: in <module>
    from .columns import parse_clause, find_order_key
E     File "/home/travis/virtualenv/python3.5.7/lib/python3.5/site-packages/sqlakeyset/columns.py", line 43
E       f"Ordering by nullable column {x} can cause rows to be "
E                                                              ^
E   SyntaxError: invalid syntax
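
For reference, the failing line can be written without an f-string. A minimal sketch, assuming only the message fragment visible in the traceback; the function name `nullable_warning` is hypothetical, and the rest of the message is left out because the traceback does not show it:

```python
def nullable_warning(x):
    # str.format() is available on Python 3.5 and earlier; f-strings need 3.6+.
    return "Ordering by nullable column {} can cause rows to be ".format(x)


message = nullable_warning("users.email")
print(message)
```

The same substitution would apply to any other f-string in the package if Python 3.5 support were wanted.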

Package status

Hey, thanks for the work on this. It's pretty cool.

This issue is to ask about the future. Sadly this package is the only keyset pagination lib for sqlalchemy i can find on the internets. And it also seems like you might be done with it (not judging, i know all about being too busy).

So i just wanted to verify your thoughts before i figure out my next strat.

Thanks.

Bookmark for every item in a page

Currently we can only retrieve the next or previous "bookmark".

As an extension of this, it would be nice to be able to retrieve the "bookmark" of every item in a page or have a utility function (I am unsure if one already exists?) to obtain the "bookmark" for a specific item in a page.

The use-case here is for satisfying Relay-style cursor pagination where every node has a unique cursor (see also https://github.com/photocrowd/django-cursor-pagination).
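
One way to picture the request: a per-row cursor is just that row's ordering-key values, serialized. The following is a rough sketch and not sqlakeyset's API; `row_cursor`, `decode_cursor`, and the JSON/base64 encoding are all assumptions made for illustration:

```python
import base64
import json


def row_cursor(row, order_keys):
    """Encode the ordering-key values of one row as an opaque cursor."""
    values = [row[key] for key in order_keys]
    raw = json.dumps(values).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(cursor):
    """Recover the ordering-key values from a cursor."""
    return json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))


# Rows are plain dicts here; real rows would come from the query result.
rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
edges = [{"cursor": row_cursor(r, ["name", "id"]), "node": r} for r in rows]
```

Each edge then carries its own cursor, which is the shape Relay-style connections expect.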

PyPI publication pipeline is broken

Hey @djrobstep, looks like PyPI has turned off username/password auth so the publish step of the CD is broken: https://app.circleci.com/pipelines/github/djrobstep/sqlakeyset/352/workflows/dd344fcb-f4b7-4d87-99d6-3d954a88d230/jobs/4321

Looks like we need to switch over to OIDC or an API token. I'd give it a shot myself but don't have admin access to this repo or maintainer status on PyPI.

Cheers

asyncio paging returns tuples of length 1

I'm using the latest version, 2.0.1691149549.

When running the following code:

query = select(Something)
page = await select_page(session, query, page=page_token, per_page=limit)

I get a list of length-1 tuples instead of a list of ORM objects. Print shows:

[(<app.models.Something object at 0x106ff7580>,)]

This causes some issues with Pydantic's orm_mode validation.

I'm expecting to get:

[<app.models.Something object at 0x106ff7580>]
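
As a stopgap on the caller's side, the rows can be unwrapped before they reach Pydantic. A minimal sketch, assuming each row is a length-1 tuple as printed above; `unwrap_single_entity` is a hypothetical helper, not part of sqlakeyset:

```python
def unwrap_single_entity(rows):
    """Turn [(obj,), (obj,)] into [obj, obj]; leave wider rows untouched."""
    return [row[0] if len(row) == 1 else row for row in rows]


# Strings stand in for ORM objects.
unwrapped = unwrap_single_entity([("first",), ("second",)])
```

Rows with more than one element are returned unchanged, so the helper is safe to apply to multi-column selects as well.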

Sorting by literal_column produces invalid SQL and fails

When we add a calculated field with a custom label and try to sort by that column, we get an error saying unknown column in field list. Looking into the query dump, the ORDER BY column name has been replaced with _sqlakeyset_oc_2 DESC.

An example query

SELECT 
    products.name,
    products.sku,
   (
     SELECT
       sum(lines.qty)
     FROM
        lines
     JOIN 
        lines.sku ON products.sku
     WHERE
         lines.sku = products.sku
    ) AS allocated_qty
FROM
   products
GROUP BY 
   products.id
ORDER BY
   allocated_qty DESC,
   products.id

In SQLAlchemy we add the ORDER BY clause like this:

query.order_by(literal_column("allocated_qty").desc())

This gets translated into

SELECT 
    products.name,
    products.sku,
   (
     SELECT
       sum(lines.qty)
     FROM
        lines
     JOIN 
        lines.sku ON products.sku
     WHERE
         lines.sku = products.sku
    ) AS allocated_qty
FROM
   products
GROUP BY 
   products.id
ORDER BY
   _sqlakeyset_oc_2 DESC,
   products.id

Tuple/ROW comparison not supported in Oracle and SQL Server

The SQLAlchemy documentation explains it as follows: The composite IN construct is not supported by all backends, and is currently known to work on PostgreSQL, MySQL, and SQLite. Unsupported backends will raise a subclass of DBAPIError when such an expression is invoked.
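
On backends without row-value comparison, the usual alternative is to expand `(a, b) > (x, y)` into `a > x OR (a = x AND b > y)`. A minimal sketch of that expansion as SQL text, assuming all columns are ordered ascending and are non-nullable; `expand_row_gt` is a hypothetical helper and not how sqlakeyset builds its conditions:

```python
def expand_row_gt(columns, params):
    """Expand (c1, c2, ...) > (:p1, :p2, ...) into an OR of AND-chains."""
    clauses = []
    for i, (col, param) in enumerate(zip(columns, params)):
        # All earlier columns must be equal; the current one strictly greater.
        equalities = [
            "{} = :{}".format(c, p) for c, p in zip(columns[:i], params[:i])
        ]
        parts = equalities + ["{} > :{}".format(col, param)]
        clauses.append("(" + " AND ".join(parts) + ")")
    return " OR ".join(clauses)


condition = expand_row_gt(["a", "b"], ["x", "y"])
print(condition)
```

Descending columns would need `<` in place of `>` for that column, and nullable columns need extra care that this sketch leaves out.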

Using the SQLAlchemy 2 native UUID as the column type breaks cursor pagination

Using the sa.UUID as the column type breaks the pagination:

sqlalchemy.exc.StatementError: (builtins.AttributeError) 'str' object has no attribute 'hex'

The issue happens in the pair_for_comparison function when it applies the preprocessing:

value = compval.type.bind_processor(dialect)(value)

resulting in the UUID being converted into a str prematurely.
It must pass the uuid.UUID value directly when using the sa.UUID as the column type.

Commenting out the line value = compval.type.bind_processor(dialect)(value) resolves the issue.
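
The error message is consistent with a `str` reaching code that expects a `uuid.UUID`. A small illustration in plain Python; `ensure_uuid` is a hypothetical guard shown only to make the type mismatch concrete, not a proposed patch:

```python
import uuid


def ensure_uuid(value):
    """Return a uuid.UUID, converting from str only when needed."""
    return value if isinstance(value, uuid.UUID) else uuid.UUID(value)


u = uuid.UUID("d12ca4d0-2d5c-4ffe-b23e-1473efbd8997")
as_text = str(u)

# A UUID exposes .hex; a plain str does not, hence the AttributeError.
uuid_has_hex = hasattr(u, "hex")
str_has_hex = hasattr(as_text, "hex")
```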

Custom binding logic issue

Hi! I have an issue when the library tries to get the associated dialect during get_page.

We have a custom session that deals with a read and a write engine, so we can lock to the write engine when needed. This switch is done by overriding the get_bind method based on some context.

The problem is that sqlakeyset is doing

if place:
    dialect = getattr(s, "bind", s).dialect

So that is returning None for us. I was able to fix that by doing

maker = sessionmaker(class_=ReadWriteSession, bind=metadata.bind)
sess = scoped_session(maker)

Adding the bind param was the temporary solution

The problem is that now it will always use the same bind. I think the sqlakeyset library should be calling the get_bind method instead so custom logic from custom sessions is executed.

This is kinda a blocker for us.

What do you think?
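
The suggestion can be sketched as follows. This is an illustration under assumptions, not the library's code: `resolve_dialect`, `FakeEngine`, and this `ReadWriteSession` are stand-ins, and a real `get_bind` may take extra arguments:

```python
def resolve_dialect(s):
    """Prefer get_bind() so custom session routing runs; fall back to .bind."""
    get_bind = getattr(s, "get_bind", None)
    bind = get_bind() if callable(get_bind) else getattr(s, "bind", s)
    return bind.dialect


class FakeEngine:
    """Stand-in for an engine or connection; only carries a dialect."""

    def __init__(self, dialect):
        self.dialect = dialect


class ReadWriteSession:
    """Stand-in for a session that routes between engines; .bind is unset."""

    bind = None

    def __init__(self, use_writer):
        self.use_writer = use_writer

    def get_bind(self):
        name = "writer-dialect" if self.use_writer else "reader-dialect"
        return FakeEngine(name)
```

With this shape, a session whose `bind` is `None` still yields a dialect, and a bare engine or connection keeps working through the fallback.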

`row._mapping` is broken after pagination

Already found the culprit at

sqlakeyset/sqlakeyset/sqla20.py

Lines 105 to 112 in aa097fc

    # 2.0.11+
    structure = (
        {  # Strip out added OCs from the keymap:
            k: row[v]
            for k, v in row._key_to_index.items()
            if not (isinstance(k, str) and k.startswith(ORDER_COL_PREFIX))
        },
    )

Instead of key -> col value mapping, Row wants key -> col index mapping.

I'm running SQLAlchemy 2.0.23 and I have no idea if something has changed after 2.0.11.
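
The difference can be shown with a plain dict. A minimal sketch, assuming the prefix is `_sqlakeyset_oc_` (inferred from the `_sqlakeyset_oc_2` column name seen in query dumps); `strip_order_cols` is a hypothetical helper that keeps key -> index pairs instead of substituting values:

```python
# Assumed value, inferred from the generated column name `_sqlakeyset_oc_2`.
ORDER_COL_PREFIX = "_sqlakeyset_oc_"


def strip_order_cols(key_to_index):
    """Drop the added ordering columns but keep each key's column index."""
    return {
        k: v
        for k, v in key_to_index.items()
        if not (isinstance(k, str) and k.startswith(ORDER_COL_PREFIX))
    }


keymap = {"id": 0, "name": 1, "_sqlakeyset_oc_2": 2}
cleaned = strip_order_cols(keymap)
```

The only change relative to the quoted snippet is `k: v` in place of `k: row[v]`.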

page.paging.bookmark_next only contains `>` and page.paging.next is always (None, False)

Hello everyone,

I'm integrating sqlakeyset in a project I'm working on and something is not working as expected.

Scenario

In my test I prepare 100 records and I ask the pagination to paginate by 10 records (get_page(query, per_page=10)), ordering by purchase_date (it's a datetime) and id (it's a unique, str type id).

What works

In my page I correctly have only the first 10 records.

What doesn't work

page.paging.bookmark_next always contains > only, while I would expect it to contain >d:2020-08-31....etc.... I've tried to sort my data with a few possible combinations: purchase_date only, id only and other str fields that are generated random (and uniquely) at every INSERT. Nothing works.

Next page looks empty too:

ipdb> len(page)
10

ipdb> page.paging.next
(None, False)

which is weird, because there are 90 records left to be returned.

Questions

am I doing something wrong?
is there any known bug?
what else could I try to understand what is not working?

bookmark_ serializer does not support uuid.UUID objects

Error with aliased columns (ie with from_self) if model contains field named "info"

I have an SQLAlchemy orm model with a column named info.

Normally, get_page works just fine for queries generated using that model. However, if I construct using the from_self construct, e.g. session.query(MyModel).from_self().order_by(MyModel.some_field), calling get_page on it raises an exception:

  File "/xxx/python3.8/site-packages/sqlakeyset/columns.py", line 328, in derive_order_key
    mapper = expr.parent
AttributeError: type object 'MyModel' has no attribute 'parent'

The reason is that this piece of code incorrectly assumes that expr is an attribute based on just the fact it has an attribute named info: https://github.com/djrobstep/sqlakeyset/blob/6bcc01e/sqlakeyset/columns.py#L327

I think by bare minimum that code should check hasattr for both info and parent (and name), and that would not likely break any existing code.
That said, there's probably a better way to do that check. After all, it's not unreasonable for a model to have fields info, parent and name.

The reason that this only occurs for the query with from_self is that the check on this line: https://github.com/djrobstep/sqlakeyset/blob/6bcc01e/sqlakeyset/columns.py#L321
fails with an error (which is swallowed by the try-except clause) like this:

sqlalchemy.orm.exc.UnmappedColumnError: No column %(140242126363088 anon)s.my_model_some_field is configured on mapper mapped class MyModel->my_model...

For the query without from_self, the mapper finds the column correctly and thus the function returns before it has a chance to crash on line 328.
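
The stricter check proposed above could look roughly like this. It is a sketch of the suggestion only; `looks_like_mapped_attribute` and both stand-in classes are hypothetical, and as noted a model with fields named info, parent and name would still pass it:

```python
def looks_like_mapped_attribute(expr):
    """Require info, parent and name together instead of info alone."""
    return all(hasattr(expr, attr) for attr in ("info", "parent", "name"))


class ModelWithInfoColumn:
    """Stand-in for a model class that merely has a column called info."""

    info = "a column that happens to be called info"


class AttributeLike:
    """Stand-in for an object exposing all three attributes."""

    info = {}
    parent = object()
    name = "some_field"
```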

Polymorphic tables cause ValueError: can't find value for column parent.name in the results returned

I have a structure where I query my Child table which has a Parent table.

class Child(Parent):
    child_id = Column(Integer, ForeignKey(Parent.parent_id), primary_key=True)

    __mapper_args__ = {
        'polymorphic_identity': CLS_INDEX['Child'],  # comes from elsewhere in the code
        'inherit_condition': (child_id == Parent.parent_id)
    }
    ...

class Parent(Base):
    name = Column(String)
    parent_id = Column(Integer, primary_key=True)
    ...

When I try to query Child.name, I receive the ValueError in the title.

>>> q = session.query(Child).order_by(Child.name)
>>> get_page(q, per_page=50)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mekhami/.envs/lab7/local/lib/python2.7/site-packages/sqlakeyset/paging.py", line 250, in get_page
    backwards)
  File "/home/mekhami/.envs/lab7/local/lib/python2.7/site-packages/sqlakeyset/paging.py", line 143, in orm_get_page
    current_marker=place)
  File "/home/mekhami/.envs/lab7/local/lib/python2.7/site-packages/sqlakeyset/paging.py", line 39, in orm_page_from_rows
    paging = Paging(rows, page_size, ocols, backwards, current_marker, get_marker)
  File "/home/mekhami/.envs/lab7/local/lib/python2.7/site-packages/sqlakeyset/results.py", line 115, in __init__
    self.marker_1 = get_marker(rows[0], ocols)
  File "/home/mekhami/.envs/lab7/local/lib/python2.7/site-packages/sqlakeyset/paging.py", line 109, in orm_placemarker_from_row
    return tuple(get_value(x) for x in ocols)
  File "/home/mekhami/.envs/lab7/local/lib/python2.7/site-packages/sqlakeyset/paging.py", line 109, in <genexpr>
    return tuple(get_value(x) for x in ocols)
  File "/home/mekhami/.envs/lab7/local/lib/python2.7/site-packages/sqlakeyset/paging.py", line 108, in get_value
    raise ValueError(CANT_FIND.format(ocol.full_name))
ValueError: can't find value for column parent.name in the results returned

No attribute 'is_single_entity' with Flask-SQLAlchemy

Hello, I am trying to use this library with an existing project and it seems that my q does not have the is_single_entity attribute that is being used here:

https://github.com/djrobstep/sqlakeyset/blob/master/sqlakeyset/paging.py#L22

My models inherit from db.Model which comes from Flask-SQLAlchemy. When I run type(q) it returns <class 'flask_sqlalchemy.BaseQuery'> which I located here:

https://github.com/pallets/flask-sqlalchemy/blob/master/src/flask_sqlalchemy/__init__.py#L416

it looks like it inherits from orm.Query which is here:

https://github.com/sqlalchemy/sqlalchemy/blob/master/lib/sqlalchemy/orm/query.py#L71

Which appears to have the desired is_single_entity attribute.. does anyone have any ideas as to why that property wouldn't be propagating through the inheritance hierarchy to my Model?

When paging backwards, page.paging.items() seems to misalign bookmarks and values

I'm using sqlakeyset to implement relay-style pagination in a strawberry graphql server, and I noticed that when paging backwards using items(), the bookmarks and values are iterating in opposite directions. Sorry in advance for the graphql noise, it's just the easiest way for me to demonstrate the bug since I already have it implemented and can copy-paste.

For example, first I iterate forwards like:

{
  accounts(first: 5, order: [{col: FULL_NAME, orderType: ASC}]) {
    pageInfo{
      hasNextPage
      hasPreviousPage
    }
    edges {
      cursor
      node {
        fullName
      }
    }
  }
}

Results:

{
  "data": {
    "accounts": {
      "pageInfo": {
        "hasNextPage": true,
        "hasPreviousPage": false,
      },
      "edges": [
        {
          "cursor": "s:Abel French~i:259",
          "node": {
            "fullName": "Abel French"
          }
        },
        {
          "cursor": "s:Abraham Banks~i:85",
          "node": {
            "fullName": "Abraham Banks"
          }
        },
        {
          "cursor": "s:Albert Ballard~i:28",
          "node": {
            "fullName": "Albert Ballard"
          }
        },
        {
          "cursor": "s:Alexis Carter~i:191",
          "node": {
            "fullName": "Alexis Carter"
          }
        },
        {
          "cursor": "s:Algae Mountain~i:120",
          "node": {
            "fullName": "Algae Mountain"
          }
        }
      ]
    }
  }
}

The cursor that I'm using is just sqlakeyset.results.s.serialize_values(keyset) where keyset is grabbed from for (keyset, _), value in page.paging.items(). If I then take one of those cursors and iterate backwards via sqlakeyset.get_page(query, per_page=5, before=sqlakeyset.results.s.unserialize_values(cursor)), e.g. on "s:Algae Mountain~i:120" above

I get:

{
  "data": {
    "accounts": {
      "pageInfo": {
        "hasNextPage": true,
        "hasPreviousPage": false
      },
      "edges": [
        {
          "cursor": "s:Alexis Carter~i:191",
          "node": {
            "fullName": "Abel French"
          }
        },
        {
          "cursor": "s:Albert Ballard~i:28",
          "node": {
            "fullName": "Abraham Banks"
          }
        },
        {
          "cursor": "s:Abraham Banks~i:85",
          "node": {
            "fullName": "Albert Ballard"
          }
        },
        {
          "cursor": "s:Abel French~i:259",
          "node": {
            "fullName": "Alexis Carter"
          }
        }
      ]
    }
  }
}

Notice that the cursors (which are just keysets taken from page.paging.items()) are reversed relative to the values.

Maybe this is because I'm not encoding the backwards property of the bookmark in my cursor, but I wouldn't think that would matter.

Please let me know if there's any additional information I can provide to help with debugging.

Versions:
sqlakeyset: 1.0.1650280980
sqlalchemy: 1.4.37
database: PostgreSQL 14

Breaks when using sqlalchemy session

Breaks when using .session(), see:

import sqlalchemy as sa
from sqlakeyset import select_page
from sqlalchemy import Integer
from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column


class Base(DeclarativeBase):
    pass


from sqlalchemy import create_engine

engine = create_engine("postgresql+psycopg://postgres:[email protected]:5432/guru", echo=True)


class User2(Base):
    __tablename__ = "user2"
    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)


with Session(engine) as session:
    query_breaks = sa.select(User2).order_by(User2.id)
    page_breaks = select_page(session, query_breaks, per_page=3)
    print([type(page_breaks), page_breaks.paging.next])

    page_breaks = select_page(session, query_breaks, per_page=3, after=page_breaks.paging.next)

Output:

Traceback (most recent call last):
  File "/home/guru/Desktop/my_project/minimal.py", line 26, in <module>
    page_breaks = select_page(session, query_breaks, per_page=3, after=page_breaks.paging.next)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guru/Desktop/my_project/.venv/lib/python3.11/site-packages/sqlakeyset/paging.py", line 411, in select_page
    return core_get_page(session, selectable, per_page, place, backwards)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guru/Desktop/my_project/.venv/lib/python3.11/site-packages/sqlakeyset/paging.py", line 286, in core_get_page
    sel = prepare_paging(
          ^^^^^^^^^^^^^^^
  File "/home/guru/Desktop/my_project/.venv/lib/python3.11/site-packages/sqlakeyset/paging.py", line 214, in prepare_paging
    condition = where_condition_for_page(order_cols, place, dialect)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guru/Desktop/my_project/.venv/lib/python3.11/site-packages/sqlakeyset/paging.py", line 97, in where_condition_for_page
    raise InvalidPage(
sqlakeyset.serial.serial.InvalidPage: Page marker has different column count to query's order clause

Forwards/backwards does not work when ordered by DESC datetime

I am trying to paginate a set of user notifications, and with this library the bookmark_next and bookmark_previous values are not consistent. We have a business requirement to order our SQL results by datetime descending (Alert.time). This seems to throw off the keyset: when you follow the bookmark that should return the previous result set, it does the opposite.

We use the Pyramid framework. Note: remove_objects_from_multiple_records(response) just JSON-serializes any data that needs it.

Is there a requirement that the incoming serialized bookmark (sent by the client, containing the output of bookmark_ generated during a previous call to our API) be parsed back into its tuple format before being passed to get_page()?

Please see our implementation below, and thank you for your time:

#
@view_config(route_name='read_all_users_alerts',
             permission='user' if DEV_PERMISSIONS_ENABLED is not True else DEV_PERMISSIONS,
             renderer='json')
def read_all_users_alerts(request):
    page = request.params.get('page') if request.params.get('page') is not None else None
    per_page = int(request.params.get('per_page')) if request.params.get('per_page') is not None else None
    users_uuid = request.matchdict.get('users_uuid') if 'users_uuid' in request.matchdict else request.authenticated_userid if request.has_permission(permission="admin") else request.authenticated_userid
    filter_type = int(request.params.get('filter_type')) if request.params.get('filter_type') is not None else 4
    filter_parameter = request.params.get('filter_parameter')
    try:
        query = build_query_statement_alerts_users_alerts_tables(request, users_uuid, filter_type, filter_parameter)
        if page is not None and per_page is not None:
            count = query.count()
            result = get_page(query, per_page=per_page, page=page)
        elif page is None and per_page is None:
            count = query.count()
            result = get_page(query, per_page=20)
        else:
            return Response(json={"Error": "Bad Request"}, content_type='application/json', status=400)
        current_page = result.paging.bookmark_current
        if result.paging.has_previous:
            previous_page = result.paging.bookmark_previous
        else:
            previous_page = ">"
        if result.paging.has_next:
            next_page = result.paging.bookmark_next
        else:
            next_page = "<"
        response = []
        for item in result:
            item[0].deleted = item[1].deleted
            item[0].users_uuid = item[1].users_uuid
            item[0].alert_status = item[1].alert_status
            item[0].total_alerts = count
            item[0].current_page = current_page
            item[0].next_page = next_page
            item[0].previous_page = previous_page
            response.append(item[0])
        return remove_objects_from_multiple_records(response)
    except DBAPIError:
        return Response(json={"Error": db_err_msg}, content_type='application/json', status=500)

# @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
#
def build_query_statement_alerts_users_alerts_tables(request, users_uuid, filter_type, filter_parameter):
    switcher = {
        0: lambda: read_multiple_alerts_camera_name(request, users_uuid, filter_parameter),
        1: lambda: read_multiple_alerts_camera_group_name(request, users_uuid, filter_parameter),
        2: lambda: read_multiple_alerts_trigger_type(request, users_uuid, filter_parameter),
        3: lambda: read_multiple_alerts_time_range(request, users_uuid, filter_parameter),
        4: lambda: read_multiple_alerts(request, users_uuid)
    }
    return switcher.get(filter_type, lambda: "type not valid")()


# @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
#
def read_multiple_alerts_camera_name(request, users_uuid, cameras_name):
    return request.dbsession.query(Alert, UserAlert) \
        .join(UserAlert, UserAlert.alerts_uuid == Alert.uuid) \
        .filter(
            UserAlert.users_uuid == users_uuid,
            UserAlert.deleted == False,
            Alert.cameras_name == cameras_name
        ).order_by(Alert.time.desc(), UserAlert.id)


# @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
#
def read_multiple_alerts_camera_group_name(request, users_uuid, camera_groups_name):
    return request.dbsession.query(Alert, UserAlert) \
        .join(UserAlert, UserAlert.alerts_uuid == Alert.uuid) \
        .filter(
            UserAlert.users_uuid == users_uuid,
            UserAlert.deleted == False,
            Alert.camera_groups_name == camera_groups_name
        ).order_by(Alert.time.desc(), UserAlert.id)


# @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
#
def read_multiple_alerts_trigger_type(request, users_uuid, trigger_type):
    return request.dbsession.query(Alert, UserAlert) \
        .join(UserAlert, UserAlert.alerts_uuid == Alert.uuid) \
        .filter(
            UserAlert.users_uuid == users_uuid,
            UserAlert.deleted == False,
            Alert.trigger_type == trigger_type
        ).order_by(Alert.time.desc(), UserAlert.id)


# @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
#
def read_multiple_alerts_time_range(request, users_uuid, time_range):
    time_range = list(map(str, time_range.split(',')))
    time_range[0] = parse(time_range[0], fuzzy=True).replace(tzinfo=None)
    time_range[1] = parse(time_range[1], fuzzy=True).replace(tzinfo=None)
    return request.dbsession.query(Alert, UserAlert) \
        .join(UserAlert, UserAlert.alerts_uuid == Alert.uuid) \
        .filter(
            UserAlert.users_uuid == users_uuid,
            UserAlert.deleted == False,
            and_(Alert.time >= time_range[0], Alert.time <= time_range[1])
        ).order_by(Alert.time.desc(), UserAlert.id)


# @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
#
def read_multiple_alerts(request, users_uuid):
    return request.dbsession.query(Alert, UserAlert) \
        .join(UserAlert, UserAlert.alerts_uuid == Alert.uuid) \
        .filter(
            UserAlert.users_uuid == users_uuid,
            UserAlert.deleted == False,
        ).order_by(Alert.time.desc(), UserAlert.id)

Question: Converting Page to Pydantic Model

How do I convert a DB model returned in a sqlakeyset page to a Pydantic model? When I convert using from_orm, there is no error, but I get just 1 empty record. My current solution is to loop over the page and build the Pydantic models manually. I would prefer to use from_orm out of the box, without writing any custom mapping. Is this possible?

Example code

# Pydantic Model
class AppSchemaModel(BaseModel):
  customer_id: str
  field1: Optional[str] = None
  field2: Optional[str] = None
  field3: Optional[str] = None

  class Config:
      orm_mode = True

# SQLAlchemy model
class AppModel(Base):
  customer_id = Column(String(36))
  field1 = Column(String(50))
  field2 = Column(String(50))
  field3 = Column(String(50))
  created_date = Column(TIMESTAMP_TZ)


query = (
    db_session.query(AppModel)
    .filter(AppModel.customer_id == customer_id)
    .order_by(AppModel.created_date.desc())
)
page = get_page(query, per_page=5)

# The page returns 5 records successfully, but i want to convert the
# db model into pydantic.
data = AppSchemaModel.from_orm(page)
# data now contains 1 record instead of 5, and all 4 fields are empty.
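
Since from_orm takes a single ORM object, one option is to apply it to each row of the page rather than to the page itself. A minimal sketch with stand-ins so it runs without a database; `page_to_models`, `Row`, and `to_dict` are hypothetical:

```python
def page_to_models(page, convert):
    """Apply a single-object converter (e.g. a model's from_orm) to each row."""
    return [convert(row) for row in page]


class Row:
    """Stand-in for an ORM object returned in a page."""

    def __init__(self, customer_id):
        self.customer_id = customer_id


def to_dict(row):
    """Stand-in converter; a Pydantic model's from_orm would go here."""
    return {"customer_id": row.customer_id}


models = page_to_models([Row("a"), Row("b")], to_dict)
```

With the models from the example, the call would presumably read `page_to_models(page, AppSchemaModel.from_orm)`, which is the manual loop in one line.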

Support multiple engines in one Session

With multiple Engines, Session.get_bind() needs a context to fetch the correct Engine. It seems that just passing clause=q.statement for it fixes it for us.

I guess this was broken even before 93bc7da but our use case just didn't reach the code after if place:.

EDIT: Ah, as q can also be a statement already, maybe s.get_bind(clause=getattr(q, 'statement', q)) is ok?

asyncpg uuid support

I'm using asyncpg and getting this error when trying to sort by created_at and uuid.

sqlakeyset.serial.serial.UnregisteredType: Don't know how to serialize type of d12ca4d0-2d5c-4ffe-b23e-1473efbd8997 (<class 'asyncpg.pgproto.pgproto.UUID'>). Use custom_bookmark_type to register it.

asyncpg.pgproto.pgproto.UUID is a subclass of uuid.UUID.

I added this code:

custom_bookmark_type(
    AsyncPgUUID,
    "uuid2",
)

Which works, but it'd be nice if it would work out of the box.
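
An alternative to registering the driver type is to normalize such values to plain `uuid.UUID` before serializing. A small sketch, relying only on the statement above that the driver type subclasses `uuid.UUID`; `normalize_uuid` and `DriverUUID` are hypothetical stand-ins:

```python
import uuid


def normalize_uuid(value):
    """Convert uuid.UUID subclasses to plain uuid.UUID; pass others through."""
    if isinstance(value, uuid.UUID) and type(value) is not uuid.UUID:
        return uuid.UUID(str(value))
    return value


class DriverUUID(uuid.UUID):
    """Stand-in for a driver-specific UUID subclass."""


driver_value = DriverUUID("d12ca4d0-2d5c-4ffe-b23e-1473efbd8997")
plain_value = normalize_uuid(driver_value)
```

A serializer that dispatches on the exact type would then see `uuid.UUID` and need no extra registration.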

Is it possible to use this library when querying multiple tables

I have a complex query that gets rows from a table, along with counts related to those rows from various other tables, e.g.

s.query(
    CM1,
    sq_cc.c.child_count,
    func.count(Comment.id.distinct()).label("comment_count"),
    func.count(CM2.id.distinct()).label("member_count"),
    func.count(CM2.id.distinct()).filter(CM2.liked).label("likes"),
    Card.comments.any(Comment.user_id == CM1.user_id),
)

Is it possible to use this library with such a query? My initial attempts were met with a TypeError: Boolean value of this clause is not defined error, and when I tried the change in #7 I got a sqlalchemy.orm.exc.UnmappedColumnError: No column cards.created_when is configured on mapper mapped class CardMembership->card_memberships... error.

Any thoughts? I'm thinking I should just implement the paging manually for this query.

ImportError: cannot import name 'LegacyRow' from 'sqlalchemy.engine.row'

Hello,

I get the following error, which I guess is caused by SQLAlchemy >= 2.0:

ImportError: cannot import name 'LegacyRow' from 'sqlalchemy.engine.row'

Is there any workaround ?

Versions:

SQLAlchemy==2.0.3
sqlakeyset==1.0.1659142803

ModuleNotFoundError: No module named 'packaging'

The packaging package is imported here, but not declared in the package requirements.

I'm seeing a ModuleNotFoundError being thrown after the latest release in a project that uses pip for package management and sqlalchemy.

Accessing page at certain index

Hi,
I am experimenting with sqlakeyset library and have one question: if I want to access some specific page; say page number 10 (out of total 11 pages) - do I have to 'unpack' each page while looping like this:

for i in range(1, page_index):
    if not page.paging.next:
        break

    page = get_page(query, per_page=per_page, page=page.paging.next)

I tried without unpacking, but that does not work. I am not sure whether this will perform well on large datasets.

Thank you kindly.

Best regards
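
With keyset paging there is no page number to jump to, so reaching page N means following N-1 markers. A self-contained sketch of that walk: `fake_get_page` is an in-memory stand-in (it uses a list index as its marker, which a real keyset marker is not), and the loop tests `has_next` because a `paging.next` tuple such as `(None, False)` is always truthy:

```python
from collections import namedtuple

Paging = namedtuple("Paging", ["next", "has_next"])
Page = namedtuple("Page", ["rows", "paging"])


def fake_get_page(data, per_page, page=None):
    """In-memory stand-in for get_page; `page` is the marker to resume from."""
    start = 0 if page is None else page
    rows = data[start:start + per_page]
    end = start + len(rows)
    return Page(rows, Paging(next=end, has_next=end < len(data)))


def nth_page(data, per_page, page_index):
    """Walk forward from the first page until page_index (1-based) is reached."""
    page = fake_get_page(data, per_page)
    for _ in range(1, page_index):
        if not page.paging.has_next:
            break
        page = fake_get_page(data, per_page, page=page.paging.next)
    return page


third = nth_page(list(range(25)), per_page=10, page_index=3)
```

Each step is a cheap indexed query in the real library, but the number of round trips still grows with the page index.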

Pagination not working with .desc()

It seems that it is not possible to page query results while using .desc() or desc(...). Supporting this would be extremely helpful.

SQLAlchemy 2 wrong type for column desc

Using Postgres and getting this error when I call select_page against core (not ORM):

  File "/Users/csantero/projects/example/.venv/lib/python3.10/site-packages/sqlakeyset/asyncio.py", line 84, in select_page
    return await core_get_page(s, selectable, per_page, place, backwards)
  File "/Users/csantero/projects/example/.venv/lib/python3.10/site-packages/sqlakeyset/asyncio.py", line 30, in core_get_page
    sel = prepare_paging(
  File "/Users/csantero/projects/example/.venv/lib/python3.10/site-packages/sqlakeyset/paging.py", line 196, in prepare_paging
    mapped_ocols = [find_order_key(ocol, column_descriptions) for ocol in order_cols]
  File "/Users/csantero/projects/example/.venv/lib/python3.10/site-packages/sqlakeyset/paging.py", line 196, in <listcomp>
    mapped_ocols = [find_order_key(ocol, column_descriptions) for ocol in order_cols]
  File "/Users/csantero/projects/example/.venv/lib/python3.10/site-packages/sqlakeyset/columns.py", line 429, in find_order_key
    ok = derive_order_key(ocol, desc, index)
  File "/Users/csantero/projects/example/.venv/lib/python3.10/site-packages/sqlakeyset/columns.py", line 370, in derive_order_key
    entity = desc["entity"]
KeyError: 'entity'

Python: 3.10.10
SQLAlchemy: 2.0.18
sqlakeyset: 2.0.1691149549
psycopg: 3.1.9

I've tried both AsyncEngine and Engine with psycopg 3. I also tried psycopg2 Engine and got the same result.

When I debug I can tell that desc is a dict, and not a ColumnElement:

"{'name': 'my_column', 'type': UUID(), 'expr': Column('my_column', UUID(), table=<my_table>)}"

If I stick this code at the start of derive_order_key I can extract the ColumnElement and then everything works as expected:

    if isinstance(desc, dict):
        desc = desc.get("expr")
        if desc is None:
            return None

If this looks like the right approach then I'm happy to work up a PR, though I'm not quite sure how to produce a failing test case.

Add type hints

Hi guys,
thanks a lot for creating this library. Since SQLAlchemy has type hints (in typeshed and as sqlalchemy2-stubs), it would be great if types could also be added to this library.

Do you think this is possible?

Support SQLAlchemy 2.0-style ORM queries

perform_paging is broken when running SQLAlchemy 1.4.7.

This fix worked for me:

# Before:
def perform_paging(q, per_page, place, backwards, orm=True, s=None):
    if orm:
        selectable = orm_to_selectable(q)
        s = q.session
        column_descriptions = q.column_descriptions
        keys = orm_query_keys(q)
    else:
        selectable = q
        column_descriptions = q._raw_columns

# After (the fix that worked for me):
def perform_paging(q, per_page, place, backwards, orm=True, s=None):
    column_descriptions = q.column_descriptions
    if orm:
        selectable = orm_to_selectable(q)
        s = q.session
        keys = orm_query_keys(q)
    else:
        selectable = q
Explicit error when unserialization failed ?

It would be nice to have an explicit error when unserialization fails.
When bookmarks are exposed through an HTTP API, a client may send an invalid bookmark string, which
raises a variety of non-specific Python errors. These are difficult to catch correctly, because broken code and an invalid bookmark string from the client cannot be told apart.

Here is my workaround, which handles this case directly in my own code (my software uses the name page_token):

try:
    a = unserialize_bookmark(page_token)
except Exception as e:
    raise InvalidPageToken('page token given is not valid')
return get_page(query, per_page=count, page=page_token or False)
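
The same idea as a reusable wrapper. A sketch under assumptions: `parse_page_token` and `strict_unserialize` are hypothetical, with `strict_unserialize` a toy parser standing in for the library's unserialize function:

```python
class InvalidPageToken(ValueError):
    """Raised when a client-supplied page token cannot be parsed."""


def parse_page_token(page_token, unserialize):
    """Run `unserialize`, mapping any failure to InvalidPageToken."""
    if not page_token:
        return None
    try:
        return unserialize(page_token)
    except Exception as exc:
        raise InvalidPageToken("page token given is not valid") from exc


def strict_unserialize(token):
    """Toy stand-in parser: accepts only tokens starting with '>' or '<'."""
    if not token.startswith((">", "<")):
        raise ValueError("bad direction marker")
    return token


parsed = parse_page_token(">i:5", strict_unserialize)
```

An HTTP handler can then catch `InvalidPageToken` alone and answer with a 400, leaving other exceptions to surface as real bugs.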

paging doesn't work with custom types in ordering

We're using SQLAlchemyUtils' ArrowType for representing datetime fields in our models (as opposed to plain Python datetimes). This works fine with the rest of SQLAlchemy, but seems to cause problems with sqlakeyset when a column with this type is used in an ordering clause.

In particular, here's a simple example that fails:

In [62]: q = db.session.query(m.Conversation).order_by(m.Conversation.modified_at.desc())

In [63]: results = get_page(q, per_page=20)

In [64]: results.paging.next
Out[64]: ((<Arrow [2017-07-20T16:12:35+00:00]>,), False)

In [65]: results2 = get_page(q, per_page=20, page=results.paging.next)
---------------------------------------------------------------------------
ProgrammingError                          Traceback (most recent call last)
# Exception stack trace...
ProgrammingError: (psycopg2.ProgrammingError) can't adapt type 'Arrow'
# query details...

I'm not entirely sure how custom types like ArrowType work, but it looks like the type conversion that typically happens is skipped here, and as a result the lower-level psycopg2 library doesn't know what to do with this type.

Any idea how to fix this?

Breaks on sqlalchemy & flask-sqlalchemy on a simple model

See example code (python 3.11, sqlalchemy 2.0.20, flask-sqlalchemy 3.0.5, sqlakeyset 2.0.1691149549):

import time

import gevent
import sqlalchemy as sa
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from sqlakeyset import get_page, select_page

app = Flask(__name__)
# configure the PostgreSQL database URI
app.config["SQLALCHEMY_DATABASE_URI"] = "postgresql+psycopg://postgres:[email protected]:5432/postgres"
# initialize the app with the extension
db = SQLAlchemy(app)

from sqlalchemy import Integer
from sqlalchemy.orm import Mapped, mapped_column


class User2(db.Model):
    __tablename__ = "user2"
    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)


with app.app_context():
    db.create_all()
    for _i in range(50):
        user = User2()
        db.session.add(user)
    db.session.commit()

    # works
    query_works = User2.query.order_by(User2.id.desc())
    page_works = get_page(query_works, per_page=3)
    print([type(page_works), page_works.paging.next])

    # breaks
    query_breaks = sa.select(User2).order_by(User2.id)
    page_breaks = select_page(db.session.connection(), query_breaks, per_page=3)
    print([type(page_breaks), page_breaks.paging.next])

And exception:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/guru/Desktop/myproject/.venv/lib/python3.11/site-packages/flask/__main__.py", line 3, in <module>
    main()
  File "/home/guru/Desktop/myproject/.venv/lib/python3.11/site-packages/flask/cli.py", line 1064, in main
    cli.main()
  File "/home/guru/Desktop/myproject/.venv/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/guru/Desktop/myproject/.venv/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guru/Desktop/myproject/.venv/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guru/Desktop/myproject/.venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guru/Desktop/myproject/.venv/lib/python3.11/site-packages/click/decorators.py", line 92, in new_func
    return ctx.invoke(f, obj, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guru/Desktop/myproject/.venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guru/Desktop/myproject/.venv/lib/python3.11/site-packages/flask/cli.py", line 912, in run_command
    raise e from None
  File "/home/guru/Desktop/myproject/.venv/lib/python3.11/site-packages/flask/cli.py", line 898, in run_command
    app = info.load_app()
          ^^^^^^^^^^^^^^^
  File "/home/guru/Desktop/myproject/.venv/lib/python3.11/site-packages/flask/cli.py", line 309, in load_app
    app = locate_app(import_name, name)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guru/Desktop/myproject/.venv/lib/python3.11/site-packages/flask/cli.py", line 219, in locate_app
    __import__(module_name)
  File "/home/guru/Desktop/myproject/myflask.py", line 39, in <module>
    page_breaks = select_page(db.session.connection(), query_breaks, per_page=3)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guru/Desktop/myproject/.venv/lib/python3.11/site-packages/sqlakeyset/paging.py", line 411, in select_page
    return core_get_page(s, selectable, per_page, place, backwards)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guru/Desktop/myproject/.venv/lib/python3.11/site-packages/sqlakeyset/paging.py", line 297, in core_get_page
    page = core_page_from_rows(
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/guru/Desktop/myproject/.venv/lib/python3.11/site-packages/sqlakeyset/paging.py", line 328, in core_page_from_rows
    key_rows = [tuple(col.get_from_row(row) for col in mapped_ocols) for row in rows]
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guru/Desktop/myproject/.venv/lib/python3.11/site-packages/sqlakeyset/paging.py", line 328, in <listcomp>
    key_rows = [tuple(col.get_from_row(row) for col in mapped_ocols) for row in rows]
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guru/Desktop/myproject/.venv/lib/python3.11/site-packages/sqlakeyset/paging.py", line 328, in <genexpr>
    key_rows = [tuple(col.get_from_row(row) for col in mapped_ocols) for row in rows]
                      ^^^^^^^^^^^^^^^^^^^^^
  File "/home/guru/Desktop/myproject/.venv/lib/python3.11/site-packages/sqlakeyset/columns.py", line 323, in get_from_row
    return getattr(row[self.index], self.attr)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'int' object has no attribute 'id'

Process finished with exit code 1

First queries are unbounded

Hi,

Thank you for creating this library.

I am a bit puzzled by the queries that this library is producing. Given a query(Model).order_by(*pks), this library is generating 3 queries when get_page(query, 20) is used:

a SELECT * FROM ... ORDER BY pk
a SELECT * FROM ... ORDER BY pk
a SELECT * FROM ... ORDER BY pk ASC LIMIT 21

(observed in the statements that the sqlalchemy.engine logger emits at INFO level)

Do we need to run the first two (identical) unbounded statements against the table to retrieve the bookmarks? These statements are quite expensive on a large table.
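
For anyone who wants to reproduce the observation, the emitted statements can be captured with a logging handler instead of being read off the console. This is a rough sketch using only the standard library; `StatementCollector` and `attach_collector` are hypothetical helper names. With SQLAlchemy, the logger name would be `sqlalchemy.engine`, and as far as I know the level needs to be set before the engine is created. The demo below uses a stand-in logger name so it runs without a database.

```python
import logging


class StatementCollector(logging.Handler):
    """Collects every message that reaches the attached logger."""

    def __init__(self):
        super().__init__(level=logging.INFO)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def attach_collector(logger_name="sqlalchemy.engine"):
    """Enable INFO logging on `logger_name` and return the collector."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.INFO)
    collector = StatementCollector()
    logger.addHandler(collector)
    return collector


# Stand-in demo: "demo.engine" plays the role of the SQLAlchemy logger.
collector = attach_collector("demo.engine")
logging.getLogger("demo.engine").info("SELECT * FROM t ORDER BY pk")
logging.getLogger("demo.engine").info("SELECT * FROM t ORDER BY pk ASC LIMIT 21")
unbounded = [m for m in collector.messages if "LIMIT" not in m]
```

Note that the engine logger also writes parameter lines at INFO level, so the collected messages should be filtered before counting statements.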

FeatureRequest: get_pages/select_pages

I use sqlakeyset to do keyset-based pagination in my GraphQL server. It works great for top-level resolvers, but if I have nested pages, I end up encountering the N + 1 GraphQL problem. The solution to the N + 1 problem is basically a "DataLoader" that just implements a batch API. Functionally, as it relates to sqlakeyset, that means implementing a "get_pages".

I went ahead and implemented this for my company, and was hoping that sqlakeyset would be open to hosting this functionality. I'd be happy to send a PR.

More specifically, I implemented a get_homogeneous_pages that assumes the queries all select the same columns (but can have different filters or order_bys), which allows us to do a UNION ALL and make a single round trip to the database. That would be an easy addition to sqlakeyset because there's no need to understand the caller's or the session's threading model. A get_heterogeneous_pages may be useful to someone else, but requires making assumptions about how to execute asynchronously that I think sqlakeyset probably shouldn't touch.

Thoughts? Would you be open to adding this if I sent a PR?
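
To make the single-round-trip idea concrete, here is a rough illustration outside sqlakeyset. It is not the get_homogeneous_pages implementation mentioned above (that code is not shown in this issue). It uses the standard-library `sqlite3` module, and the `comment` table, its columns, and the `fetch_first_pages` helper are all hypothetical.

```python
import sqlite3


def fetch_first_pages(conn, parent_ids, per_page):
    """Fetch the first page of comment ids for several parents in one query."""
    if not parent_ids:
        return []
    parts, params = [], []
    for page_idx, parent_id in enumerate(parent_ids):
        # Every branch selects the same columns and is tagged with its page
        # index. The LIMIT sits in a subquery so it applies per branch.
        parts.append(
            "SELECT * FROM (SELECT ? AS page_idx, id FROM comment "
            "WHERE parent_id = ? ORDER BY id LIMIT ?)"
        )
        params.extend([page_idx, parent_id, per_page])
    # UNION ALL alone guarantees no order, so sort the combined result.
    sql = " UNION ALL ".join(parts) + " ORDER BY 1, 2"
    pages = [[] for _ in parent_ids]
    for page_idx, comment_id in conn.execute(sql, params):
        pages[page_idx].append(comment_id)
    return pages


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comment (id INTEGER PRIMARY KEY, parent_id INTEGER)")
conn.executemany(
    "INSERT INTO comment (id, parent_id) VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2)],
)
pages = fetch_first_pages(conn, [1, 2, 3], per_page=2)
```

The tag column is what lets the caller split one result set back into per-parent pages, which is the shape a DataLoader-style batch function needs.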

Support pagination key serialization

So although this library does work, perhaps it would be better if the pagination key was returned in such a way that supports serialization of DateTime, etc? Right now, the pagination key is just a list, which means I need to implement my own custom serializer, especially if the keyset uses DateTime. It doesn't end up saving me much code once I've implemented that.

A bonus would be if it doesn't leak the abstraction. Could just be base64 encoding, to discourage clients from messing with the pagination token.
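
As a sketch of what such a token could look like, the snippet below encodes a key tuple that may contain datetimes into a URL-safe base64 string and back. It uses only the standard library; `encode_key` and `decode_key` are hypothetical names, not sqlakeyset functions, and the JSON layout is an assumption made for illustration.

```python
import base64
import json
from datetime import datetime


def encode_key(values):
    """Encode a pagination key tuple as an opaque URL-safe string."""
    payload = [
        {"t": "dt", "v": v.isoformat()} if isinstance(v, datetime)
        else {"t": "raw", "v": v}
        for v in values
    ]
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_key(token):
    """Reverse encode_key, restoring datetimes from their ISO form."""
    payload = json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
    return tuple(
        datetime.fromisoformat(item["v"]) if item["t"] == "dt" else item["v"]
        for item in payload
    )


key = (datetime(2017, 7, 20, 16, 12, 35), 42)
token = encode_key(key)
roundtrip = decode_key(token)
```

Base64 only makes the token opaque; it does not stop a determined client from editing it, so the server still has to treat decoded values as untrusted input.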

check table.name instead of tablename in orm_placemarker_from_row

line 74 in paging.py might be more robust with the check as:

        if entity.__table__.name == ocol.table_name:

Since the entity will (eventually) have a __table__ attribute but might not have a __tablename__ one.
see: http://docs.sqlalchemy.org/en/latest/orm/extensions/declarative/table_config.html#using-a-hybrid-approach-with-table

Doesn't work with SQLAlchemy 1.4.0b1

sqlalchemy.engine.result.RowProxy is gone (though it wasn't actually even used in paging.py) and so is sqlalchemy.util.lightweight_named_tuple. Not sure if the lightweight namedtuple has moved somewhere else or if it's gone for good.

Exception due to double-resolution of values with Enums in pagination

I have an enum value in a table:

class AddressFamily(enum.Enum):
    IPv4 = 4
    IPv6 = 6

    Column('address_family', Enum(AddressFamily)),

When I paginate a core query, sqlakeyset removes some columns in paging.py:

        N = len(row._row) - len(extra_columns)
        row = row[:N]

__getitem__ on BaseRowProxy however applies processing to the values, which resolves the enum from its string name to the actual enum value:

                            l.append(processor(value))

When I then try to read the value from the row I received out of pagination:

row[prefixes.c.address_family]

Here, the value gets resolved again, using the str() of the already resolved enum as a lookup key, and I get an exception: 'AddressFamily.IPv4' is not among the defined enum values. Enum name: addressfamily. Possible values: IPv4, IPv6
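
The failure mode can be reproduced in isolation. The snippet below is a simplified stand-in, not SQLAlchemy's actual result processor: `resolve_once` and `resolve_idempotent` are hypothetical functions that mimic a name-to-member lookup applied once and then a second time.

```python
import enum


class AddressFamily(enum.Enum):
    IPv4 = 4
    IPv6 = 6


def resolve_once(value):
    """Stand-in for a processor that maps a stored name to an enum member."""
    return AddressFamily[str(value)]


def resolve_idempotent(value):
    """Same lookup, but an already-resolved member passes through unchanged."""
    if isinstance(value, AddressFamily):
        return value
    return AddressFamily[str(value)]


member = resolve_once("IPv4")  # first pass: the name resolves to a member
try:
    resolve_once(member)  # second pass looks up str(member) instead of "IPv4"
    second_pass_failed = False
except KeyError:
    second_pass_failed = True

still_member = resolve_idempotent(member)
```

Either avoiding the second processing pass or making the processor tolerate already-resolved values would prevent the lookup error; which of the two fits sqlakeyset better is not settled here.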

ORM Query is called 3 times.

Calling query._iter() executes the query.

In sqla14.py:8 I think the code can be replaced with

return [c['name'] for c in query.column_descriptions]

At least seems to work so far.

Another execution comes from sqla14.py:20.
Not sure how to tackle that and I guess the comment in core_get_page is related.

Newlines in sort columns break deserialization

If a query is ordered by a text-type column and the values stored in this column include newlines, parsing the resulting bookmarks results in markers cut off after the newline:

>>> from sqlakeyset.results import s
>>> s.unserialize_values(s.serialize_values(('hello\nthere', 12)))
['hello']
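
A generic way to see how this kind of truncation can arise is shown below. This is only an illustration with the standard-library `csv` module and makes no claim about how sqlakeyset's serializer is actually written; `serialize`, `unserialize_naive`, and `unserialize` are hypothetical names.

```python
import csv
import io


def serialize(values):
    """Write one CSV record; fields containing newlines get quoted."""
    buf = io.StringIO(newline="")
    csv.writer(buf).writerow(values)
    return buf.getvalue()[:-2]  # drop the trailing "\r\n" record terminator


def unserialize_naive(text):
    """Illustrates the bug: only the first physical line reaches the parser."""
    first_line = text.splitlines()[0]
    return next(csv.reader([first_line]))


def unserialize(text):
    """Let the CSV reader see the whole text so quoted newlines survive."""
    return next(csv.reader(io.StringIO(text, newline="")))


token = serialize(("hello\nthere", 12))
truncated = unserialize_naive(token)
restored = unserialize(token)
```

The naive variant loses the text after the newline, while the second variant restores the full field. All values come back as strings, so type information would have to be carried separately.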

SQLite compatibility

sqlakeyset uses the row() function which is not supported by SQLite. Using row values (name1, name2) in the comparison should also be possible and is supported by most database systems:
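
As a hedged illustration of the row-value form (not sqlakeyset's generated SQL), the sketch below pages through a hypothetical `book` table with the standard-library `sqlite3` module. It assumes a SQLite build that supports row values (3.15 or later, to my knowledge); `fetch_page` is an illustrative helper name.

```python
import sqlite3


def fetch_page(conn, per_page, after=None):
    """Return one page ordered by (author, id), starting after `after`."""
    sql = "SELECT author, id FROM book"
    params = []
    if after is not None:
        # Row-value comparison: rows strictly after the (author, id) marker.
        sql += " WHERE (author, id) > (?, ?)"
        params.extend(after)
    sql += " ORDER BY author, id LIMIT ?"
    params.append(per_page)
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (author TEXT, id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO book (author, id) VALUES (?, ?)",
    [("a", 1), ("a", 2), ("b", 3), ("b", 4), ("c", 5)],
)
first = fetch_page(conn, 2)
second = fetch_page(conn, 2, after=first[-1])
third = fetch_page(conn, 2, after=second[-1])
```

The last row of each page serves as the marker for the next one, which is the core of keyset paging.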

Get the total number of pages and / or records returned by query

First of all, thanks for the great work!

I was wondering whether there is a way to efficiently query the total number of pages and / or records (so not just the records on a given page) that are returned by the query?

I have not found anything in the documentation, nor in the source, so I'm just making sure I'm not missing anything.

Thanks!
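
For context, keyset paging itself only looks at one page at a time, so a total generally has to come from a separate count query. The sketch below shows that generic approach with the standard-library `sqlite3` module. It is not a sqlakeyset feature, and the `item` table and `count_rows_and_pages` helper are hypothetical.

```python
import math
import sqlite3


def count_rows_and_pages(conn, per_page):
    """Return (total rows, total pages) for the hypothetical item table."""
    (total,) = conn.execute("SELECT COUNT(*) FROM item").fetchone()
    return total, math.ceil(total / per_page)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO item (id) VALUES (?)", [(i,) for i in range(1, 51)])
total, pages = count_rows_and_pages(conn, per_page=20)
```

On large tables the count can cost as much as an unbounded scan, so it may be worth caching it or computing it only on request.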

Support sqlalchemy 1.4.0

Pipenv breaks when trying to install with sqlalchemy >=1.4.0b1

	# 2.0.11+
	structure = (
	{ # Strip out added OCs from the keymap:
	k: row[v]
	for k, v in row._key_to_index.items()
	if not (isinstance(k, str) and k.startswith(ORDER_COL_PREFIX))
	},
	)