Pagination with FastAPI

Learn how to easily and efficiently implement pagination for FastAPI

We identified our pagination requirements and developed a flexible, typed, OpenAPI-friendly solution that avoids unnecessarily complex type definitions in openapi.json.

For each of our projects, we first consider which backend technology is the most suitable. In addition to a conventional CMS for websites, we typically rely on Laravel or FastAPI for APIs. FastAPI impresses us with its quick and easy option to set up a typed API. Thanks to the automatic OpenAPI generation, we also have a high level of reliability when developing frontends.

One topic that comes up in practically every API is pagination: Instead of a list endpoint returning all records at once, the results are divided into pages with limited sizes. This keeps responses small, fast, and easy for clients to process.
We've thought about what we need from pagination and come up with a solution that's flexible, typed, and OpenAPI-friendly, without any unnecessarily complex type definitions in the openapi.json file.

You can find a repository with the files and an example here.

The Tech Stack #

Although the solution is relatively generic, some parts are specifically tailored to the libraries we have chosen. Here is a brief overview of our stack:

  • Python 3.13 as the programming language
  • FastAPI as the framework
  • SQLModel as the ORM
  • PostgreSQL as the database

If you use a different ORM library (one not based on SQLAlchemy), the helper functions will need to be adapted. The concepts themselves, however, remain the same.

Requirements #

At the start of every software project, the question arises: What are our actual requirements? Since pagination is a classic topic in software development, these are quickly determined:

  • Pagination should be strictly typed (Pydantic)
  • Filtering and sorting must be done in the backend
  • Filtering and sorting should also work for nested objects
    (e.g., person.contact_info.first_name)
  • Efficient database queries
  • The response should contain one page + the total number of results
  • The defined classes should be easily reusable (generics)
  • The schema generated for OpenAPI should remain easy to read
    (no Literal["id", "name", "email"])
  • Endpoints should be usable in Swagger UI
  • We adhere to REST standards → GET endpoints have no body parameters

Generic Classes #

Based on the requirements defined above, we developed several generic classes built on Pydantic's BaseModel.

Pagination #

types/pagination.py
class Pagination[T](BaseModel):
    page: list[T]
    total: int

The Pagination class defines the generic return value of a list endpoint.

  • page is an array of the respective placeholder class (T).
  • total returns the total number of hits.
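
For illustration, a paginated response for a hypothetical endpoint might look like this (the record fields are made up):

response.json
{
  "page": [
    { "id": 1, "name": "Ada" },
    { "id": 2, "name": "Grace" }
  ],
  "total": 42
}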

ListInput, Filter and Sort #

types/pagination.py
class ListInput[T](BaseModel):
    limit: int | None = Field(100, gt=0, le=100)
    offset: int | None = Field(0, ge=0)
    sort: list[Sort[T]] | None = None
    filter: list[Filter[T]] | None = None

The ListInput class defines which arguments can be passed to a list endpoint.
Since it is generic, the permitted arguments can be easily customized for each endpoint.
The following parameters are available:

  • limit determines the maximum number of values per page
  • offset is the number of values to be skipped
  • sort specifies the ordering of the results, while filter restricts which items are included.
types/pagination.py
class Sort[T](BaseModel):
    sort_field: T | None = None
    sort_order: Literal["asc", "desc"] | None = None

The Sort class allows you to

  • define a field (sort_field) to sort by,
  • and specify the sort order (sort_order) with the values asc or desc.
types/pagination.py
filter_functions = Literal["eq", "lt", "lte", "gt", "gte", "not", "like", "isnull"]

class Filter[T](BaseModel):
    filter_field: T
    filter_function: filter_functions
    filter_value: str | int | float | bool | datetime

    @field_validator("filter_value", mode="before")
    def parse_datetime(cls, v):
        if isinstance(v, str):
            try:
                # Try parsing the string as a datetime
                return datetime.fromisoformat(v)
            except ValueError:
                pass  # If it can't be parsed, return it as a string
        return v

The Filter class allows you to

  • choose a field (filter_field),
  • select a comparison function (filter_function),
  • and provide a value (filter_value).

Since date values are passed as ISO strings, the field_validator checks whether the value is a date and converts it if necessary. Without this conversion, true date comparisons at the database level would not be possible. We deliberately restricted the available filter functions to ensure the frontend can offer a clear and consistent selection.
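
As an illustration, a filter list sent by a client could look like this (the field names are hypothetical); the second entry is exactly the case the datetime validator takes care of:

[
  { "filter_field": "contact.name", "filter_function": "like", "filter_value": "ada" },
  { "filter_field": "created_at", "filter_function": "gte", "filter_value": "2024-01-01T00:00:00" }
]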

Problems in Swagger UI #

Even though FastAPI generates an OpenAPI schema and the inputs in Swagger UI look correct at first glance, an issue arises as soon as you try out endpoints that use the generic classes:

response.json
{
  "detail": [
    {
      "type": "model_attributes_type",
      "loc": [
        "query",
        "sort",
        0
      ],
      "msg": "Input should be a valid dictionary or object to extract fields from",
      "input": "{\"sort_field\":\"id\",\"sort_order\":\"asc\"}"
    }
  ]
}

The issue is that Swagger sends object fields as JSON strings. However, Pydantic cannot automatically convert these strings into objects. There are different ways to solve this problem.
We decided to go with a compromise using a Custom Validator:

  • Swagger remains fully functional.
  • In the frontend, however, the arguments must be passed using JSON.stringify().

Custom Validator #

validators.py
import json

from pydantic import BeforeValidator

def convert_items_to_json(x: list[str]) -> list[dict]:
    items = []
    try:
        for item in x:
            items.append(json.loads(item))
    except Exception:
        return items
    return items

StringDictValidator = BeforeValidator(lambda x: convert_items_to_json(x) if x is not None else [])

With this validator, string inputs can be interpreted as JSON.
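
A quick sketch of what the validator does with raw query values (hypothetical input):

raw = ['{"sort_field": "id", "sort_order": "asc"}']
convert_items_to_json(raw)
# -> [{'sort_field': 'id', 'sort_order': 'asc'}]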

Fixed ListInput Class #

types/pagination.py
class ListInput[T](BaseModel):
    limit: int | None = Field(100, gt=0, le=100)
    offset: int = Field(0, ge=0)
    sort: Annotated[list[Sort[T]] | None, StringDictValidator] = Field(default=None, validate_default=True)
    filter: Annotated[list[Filter[T]] | None, StringDictValidator] = Field(default=None, validate_default=True)

We add the validator via an annotation; during validation, Pydantic then attempts to interpret the elements as JSON strings. This way, the Swagger UI remains usable, with only a minor limitation in the frontend.

Concrete Example #

Let’s look at a simple example to see how the Pagination and ListInput classes can be used in an endpoint:

example_router.py
router = APIRouter(prefix="/example", tags=["example"])

class ContactInfo(BaseModel):
    name: str
    email: str
    phone: str

class ExampleOut(BaseModel):
    id: int
    contact: ContactInfo

example_fields = Literal["id", "contact.name", "contact.email", "contact.phone"]

@router.get(
    "/",
    response_model=Pagination[ExampleOut],
    # don't forget to add dependencies (database, user, scopes,...)
)
def list_examples(
    list_input: Annotated[ListInput[example_fields], Query()],
):
    return ExampleService().list(list_input)  # We ignore service implementation for now

The code snippet includes the following key elements:

  • list_input: Annotated[ListInput[example_fields], Query()]
    Here, we define a ListInput where the generic type T is set to a literal variable (example_fields).
    The annotation with Query() also makes it clear that the values are expected as query parameters, not as body parameters (see requirements above).
  • response_model=Pagination[ExampleOut]
    We use the Pagination class as the response model and provide ExampleOut as the generic type T.

With this simple setup, everything is prepared. We only need to implement the service class.
At this stage, the Swagger UI and the generated openapi.json already provide a fully documented interface for a paginated list endpoint.
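
To make the compromise from above tangible: a client serializes each sort and filter item as a JSON string, the equivalent of JSON.stringify() in a JavaScript frontend. A minimal sketch of such a request using httpx (chosen purely for illustration, the URL is hypothetical):

import json

import httpx

params = {
    "limit": 20,
    "offset": 0,
    # each sort/filter item is passed as a JSON string and repeated as a query parameter
    "sort": [json.dumps({"sort_field": "id", "sort_order": "asc"})],
    "filter": [json.dumps({"filter_field": "contact.name", "filter_function": "like", "filter_value": "ada"})],
}
response = httpx.get("http://localhost:8000/example/", params=params)
print(response.json()["total"])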

If it weren't for... #

There is an issue when using Literal to define the allowed fields: the names of the generated schema models become cumbersome and hard to read:

Filter[Literal['id', 'contact.name', 'contact.email', 'contact.phone']]

This is technically correct but hardly readable. As soon as multiple field names exist, it quickly becomes confusing, especially for endpoints with many options.

Better Solution: Enum #

Instead of Literal, we use an Enum class:

example_router.py
class ExampleFields(str, Enum):
    id = "id"
    name = "contact.name"
    email = "contact.email"
    phone = "contact.phone"

@router.get(
    "/",
    response_model=Pagination[ExampleOut],
)
def list_examples(
    list_input: Annotated[ListInput[ExampleFields], Query()],
):
    return ExampleService().list(list_input)

Advantages of this approach:

  • We retain the ability to clearly define the allowed fields.
  • In the code, fields can be referenced directly as enum members (e.g., ExampleFields.name)
  • In the openapi.json, the type is cleanly represented as Filter[ExampleFields], which is clearly named and easily identifiable.
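
A detail worth noting: because ExampleFields also inherits from str, each member behaves like its dotted-path value, so the string-based helper functions shown later keep working unchanged. A quick sketch:

field = ExampleFields.name

assert field == "contact.name"                   # str comparison still works
assert "." in field                              # relationship detection in get_page still works
assert field.split(".") == ["contact", "name"]   # path resolution in resolve_column still works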

Integration into a Service #

Once the interface is in place, the service implementation must follow to ensure that the correct results are returned from the database.
An example service class could look like this:

example_service.py
class Example(SQLModel, table=True):
    id: int | None = Field(default=None, primary_key=True)
    contact: ContactInfo = Relationship(back_populates="example")

class ExampleService:
    # add joins to prevent lazy loading for big pages
    query_options: ClassVar[list[Options]] = [
       joinedload(Example.contact)
    ]

    # pass the db session directly to the Service or collect it from a ContextVar
    def __init__(self, db: Session | None = None):
        if db is None:
            self.db = db_session.get()
        else:
            self.db = db

    # a simple example on how to convert from SQLModel to our response_model
    @classmethod
    def to_out(cls, example: Example) -> ExampleOut:
        return ExampleOut.model_validate(example, from_attributes=True)

    def list(self, list_input: ListInput[ExampleFields]) -> Pagination[ExampleOut]:
        page, total = get_page(Example, list_input, self.query_options)
        return Pagination(total=total, page=[self.to_out(example) for example in page])

So far, everything should be understandable, but the core logic actually resides in the get_page function.
This is where it's determined how filtering, sorting, limit, and offset are applied to the database queries.

Helper Functions #

Our goal is to write utility functions that can be reused independently of the specific SQLModel class and the filters being used. Let’s start with the central function, get_page.

get_page #

pagination.py
def get_page[T](model: type[T], list_input: ListInput, options: list[Options] | None = None) -> tuple[Sequence[T], int]:
    session = db_session.get()
    wheres, sorts = get_where_and_sort(model, list_input)

    filter_and_sort_fields = list(
        set(
            [f.filter_field for f in list_input.filter or [] if f.filter_field and "." in f.filter_field]
            + [s.sort_field for s in list_input.sort or [] if s.sort_field and "." in s.sort_field]
        )
    )

    statement = select(model)
    statement = apply_joins(statement, model, filter_and_sort_fields)

    if wheres:
        statement = statement.where(*wheres)
    if sorts:
        statement = statement.order_by(*sorts)

    total = get_total(statement)

    # options are eager-loading joins that prevent lazy loading of sub-tables; they are not needed for the total query
    if options:
        statement = statement.options(*options)

    statement = statement.offset(list_input.offset).limit(list_input.limit)
    rows = session.exec(statement).all()
    return rows, total

Function Signature #

def get_page[T](model: type[T], list_input: ListInput, options: list[Options] | None = None) -> tuple[Sequence[T], int]:

The function is generic (T) and receives:

  • model: the SQLModel class (type information for query/mapping),
  • list_input: client parameters (limit, offset, filter, sort),
  • optional options: SQLAlchemy Options (e.g., joinedload) to prevent lazy loading.

Return value: a tuple containing the result list (Sequence[T]) and the total count (int).

Database session and parsing filters and sorts #

session = db_session.get()
wheres, sorts = get_where_and_sort(model, list_input)

We retrieve the current DB session (in our case via a ContextVar) and extract the WHERE and ORDER BY clauses from the list_input using get_where_and_sort. This helper function encapsulates the logic for parsing filters and sorting.

Finding required Joins #

filter_and_sort_fields = list(
    set(
        [f.filter_field for f in list_input.filter or [] if f.filter_field and "." in f.filter_field]
        + [s.sort_field for s in list_input.sort or [] if s.sort_field and "." in s.sort_field]
    )
)

We collect all field paths (dot notation) that involve relationships, e.g., contact.address.country. From these fields, we later determine which joins are necessary so that filtering and sorting work on nested fields.
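
A minimal sketch of the idea with hypothetical field paths:

# hypothetical inputs: one nested filter field, one nested sort field, one plain field
filter_fields = ["id", "contact.address.country"]   # taken from list_input.filter
sort_fields = ["contact.name"]                       # taken from list_input.sort

nested = list({f for f in filter_fields + sort_fields if "." in f})
# -> ["contact.address.country", "contact.name"] (order not guaranteed, since a set is used)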

Base Statement and Joins #

statement = select(model)
statement = apply_joins(statement, model, filter_and_sort_fields)

Once the input data has been prepared, the initial SELECT statement is created: statement = select(model).
Then, apply_joins adds the necessary JOINs for all relationships found in filter_and_sort_fields (more on this below).

Applying WHERE and ORDER BY #

if wheres:
    statement = statement.where(*wheres)
if sorts:
    statement = statement.order_by(*sorts)

Filter and sort clauses are applied to the statement. wheres and sorts are SQLAlchemy expressions generated by get_where_and_sort.

Calculating the Total Count: get_total #

total = get_total(statement)

def get_total(statement: Select) -> int:
    session = db_session.get()
    count_statement = select(func.count()).select_from(statement.order_by(None).subquery())
    return session.exec(count_statement).one()

get_total creates a count query from the current statement (without order_by) and returns the total number of results. It's important that order_by is removed during counting because it can impact performance and is irrelevant for the total count.

Eager-loading #

if options:
    statement = statement.options(*options)

If pre-defined options (e.g., joinedload) were provided, we add them now. This prevents the N+1 problem when accessing relationships in the loaded rows (see the digression at the end of this post for more details).

Limit, Offset and Execution #

statement = statement.offset(list_input.offset).limit(list_input.limit)
rows = session.exec(statement).all()

Finally, we apply the offset and limit, execute the query, and fetch the result rows.

Return Values #

return rows, total

We return the results along with the total count, exactly the values that will later populate our Pagination object.

get_where_and_sort #

The functions called here form the core of our pagination implementation.

pagination.py
from sqlalchemy.sql._typing import _ColumnExpressionArgument, _ColumnExpressionOrStrLabelArgument

Where = _ColumnExpressionArgument[bool] | bool
OrderBy = _ColumnExpressionOrStrLabelArgument[Any]

def get_where_and_sort[T](model: type[T], list_input: ListInput) -> tuple[list[Where], list[OrderBy]]:
    wheres: list[Where] = add_filter(model, list_input)
    sort: list[OrderBy] = add_order_by(model, list_input)
    return wheres, sort

The function receives an SQLModel class as the first parameter and our ListInput as the second.
It returns two lists of SQLAlchemy column expressions.
To make the code easier to read, the internal SQLAlchemy types have been aliased to simpler names (Where, OrderBy).

add_filter #

pagination.py
filter_functions_map = {
    "eq": lambda col, val: col == val,
    "not": lambda col, val: col != val,
    "lt": lambda col, val: col < val,
    "lte": lambda col, val: col <= val,
    "gt": lambda col, val: col > val,
    "gte": lambda col, val: col >= val,
    "like": lambda col, val: col.ilike(f"%{val}%"),
    "isnull": lambda col, val: col.is_(None) if val else col.is_not(None),
}

def add_filter[T](model: type[T], list_input: ListInput) -> list[Where]:
    wheres = []
    if list_input.filter is not None:
        for filter_in in list_input.filter:
            if filter_in.filter_field and filter_in.filter_function and filter_in.filter_value is not None:
                column = resolve_column(model, filter_in.filter_field)
                if column is not None and filter_in.filter_function in filter_functions_map:
                    clause = filter_functions_map[filter_in.filter_function](column, filter_in.filter_value)
                    wheres.append(clause)
    return wheres

This function iterates through all filters in the list_input variable and generates the corresponding WHERE clauses using the filter_functions_map.
Note that the like filter uses ilike and therefore ignores case for string columns. Enum columns, however, require special handling if you want case-insensitive matching, as SQLAlchemy does not provide functions like .ilike for such columns.

add_order_by #

pagination.py
def resolve_column(model: type, path: str) -> ColumnElement | None:
    parts = path.split(".")
    attr = getattr(model, parts[0], None)

    for part in parts[1:]:
        if attr is None:
            return None
        attr = getattr(attr.property.mapper.class_, part, None)

    return attr

def add_order_by[T](model: type[T], list_input: ListInput) -> list[OrderBy]:
    order_bys = []
    if list_input.sort:
        for sort in list_input.sort:
            if sort.sort_field:
                column = resolve_column(model, sort.sort_field)
                if column is not None:
                    order_clause = asc(column) if sort.sort_order == "asc" else desc(column)
                    order_bys.append(order_clause)
    return order_bys

The add_order_by function iterates through all sort instructions in list_input and tries to locate the correct column based on dot-notation.
It then applies the appropriate sort order using asc() or desc().

apply_joins #

The longest function in our pagination implementation comes last:

pagination.py
def apply_joins(statement: Select, model: type[SQLModel], filter_and_sort_fields: list[str]) -> Select:
    joined = set()
    current_path = []

    for field in filter_and_sort_fields:
        parts = field.split(".")
        if len(parts) < 2:
            continue  # Not a relationship field

        current_model = model
        current_path.clear()
        for part in parts[:-1]:  # all but last part are relationships
            current_path.append(part)
            path_str = ".".join(current_path)

            relationship_attr = getattr(current_model, part, None)
            if relationship_attr is None:
                break

            rel_prop = getattr(relationship_attr, "property", None)
            if not isinstance(rel_prop, RelationshipProperty):
                break

            # join each relationship path only once, but keep walking so that
            # deeper parts are resolved against the correct related model
            if path_str not in joined:
                statement = statement.join(relationship_attr)
                joined.add(path_str)
            current_model = rel_prop.mapper.class_

    return statement

This function resolves the same dot-notation paths as resolve_column, but here both filter and sort fields are taken into account. Based on the dot notation, it determines which relationships need to be traversed, and the corresponding tables are added to the existing statement using .join().

Conclusion #

This completes our implementation of a flexible and efficient pagination solution for FastAPI. With just four classes and seven functions, we’ve created a system that integrates seamlessly with any type of list endpoint. The solution is generic, extensible, and leverages the full power of SQLAlchemy and SQLModel without becoming unnecessarily complex.
An additional advantage is that we retain full control over performance, readability, and extensibility. Especially in projects where pagination is a core feature across many endpoints, this investment quickly pays off.

Alternatives #

Of course, when implementing such a common piece of functionality, the question arises why we don't simply use an existing library. In our experience, however, these are often less flexible and less efficient:

  • Many libraries need to support multiple ORMs, which makes it difficult to achieve the same level of precision as a custom implementation.
  • Some are inefficient: for example, FastAPI Pagination loads the entire result set first and then slices it into pages, instead of performing pagination directly at the database level.
  • Other projects, such as fastapi-paginate, are no longer actively maintained, which creates long-term maintenance risks.

With our own lean implementation, we achieve:

  • A clean API for filtering, sorting, and pagination,
  • Optimized database queries that avoid unnecessary load,
  • A future-proof solution fully under our control.

This results in pagination that is not only functional but also understandable, extensible, and performant.

Digression: How options improve Performance #

Throughout this blog post, we’ve repeatedly touched on performance, particularly why the options parameter can have a crucial impact. To wrap up, let’s take a closer look.
SQLAlchemy (and by extension SQLModel) supports lazy loading for relationships by default. This means that as soon as an object is in memory and a field that hasn’t been loaded yet is accessed, SQLAlchemy automatically fetches the necessary data.
This is convenient as long as the field isn’t always needed. But for endpoints that always require this data, especially if done in a loop for multiple objects, lazy loading can result in a significant number of additional database queries.

Example #

Consider a Person table. Each Person is linked to a Contact, each Contact to an Address, and each Address to a Country.
If we want to load all persons along with their contact, address, and country information, the difference becomes clear:

  • Without options (lazy loading)
    • One query for the total number of persons
    • One query for the page of persons
    • For each person: one query for the contact, one for the address, and one for the country
    • Even for a page with a single person, that’s already 5 queries
  • With joinedload in options
    • One query for the total number of persons
    • One query for the page including contact, address, and country
    • Only 2 queries in total

The difference may seem minor with a single person, but for a page with 100 results, we're talking 302 vs. 2 queries. This isn't just theoretical: the impact on load times is immediately noticeable, even in a relatively simple scenario.
By using options strategically, we can avoid unnecessary queries and massively improve the performance of list endpoints.
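
With hypothetical Person, Contact, Address, and Country models as in the example above, the corresponding options could be defined roughly like this:

from sqlalchemy.orm import joinedload

# chaining joinedload() tells SQLAlchemy to fetch the whole relationship path
# together with the page query instead of lazily, one object at a time
query_options = [
    joinedload(Person.contact).joinedload(Contact.address).joinedload(Address.country)
]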

If you need support for your project or help optimizing slow queries, feel free to contact us at hello@bitperfect.at.