FastAPI Best Practices

Opinionated list of best practices and conventions we used at our startup.

For the last 1.5 years in production, we have been making good and bad decisions that impacted our developer experience dramatically. Some of them are worth sharing.

Contents

  1. Project Structure. Consistent & predictable.
  2. Excessively use Pydantic for data validation.
  3. Use dependencies for data validation vs DB.
  4. Chain dependencies.
  5. Decouple & Reuse dependencies. Dependency calls are cached.
  6. Follow the REST.
  7. Don't make your routes async, if you have only blocking I/O operations.
  8. Custom base model from day 0.
  9. Docs.
  10. Use Pydantic's BaseSettings for configs.
  11. SQLAlchemy: Set DB keys naming convention.
  12. Migrations. Alembic.
  13. Set DB naming convention.
  14. Set tests client async from day 0.
  15. BackgroundTasks > asyncio.create_task.
  16. Typing is important.
  17. Save files in chunks.
  18. Be careful with dynamic pydantic fields.
  19. SQL-first, Pydantic-second.
  20. Validate hosts, if users can send publicly available URLs.
  21. Raise a ValueError in custom pydantic validators, if schema directly faces the client.
  22. FastAPI converts Pydantic objects to dict, then to Pydantic object, then to JSON
  23. If you must use sync SDK, then run it in a thread pool.
  24. Use linters (black, ruff).
  25. Bonus Section.

Project sample built with these best-practices in mind.

1. Project Structure. Consistent & predictable

There are many ways to structure the project, but the best structure is a structure that is consistent, straightforward, and has no surprises.

  • If looking at the project structure doesn't give you an idea of what the project is about, then the structure might be unclear.
  • If you have to open packages to understand what modules are located in them, then your structure is unclear.
  • If the frequency and location of the files feels random, then your project structure is bad.
  • If looking at the module's location and its name doesn't give you an idea of what's inside it, then your structure is very bad.

Although the project structure presented by @tiangolo, where files are separated by their type (e.g. api, crud, models, schemas), is good for microservices or projects with fewer scopes, we couldn't fit it into our monolith with a lot of domains and modules. The structure that I found more scalable and evolvable is inspired by Netflix's Dispatch, with some small modifications.

fastapi-project
├── alembic/
├── src
│   ├── auth
│   │   ├── router.py
│   │   ├── schemas.py  # pydantic models
│   │   ├── models.py  # db models
│   │   ├── dependencies.py
│   │   ├── config.py  # local configs
│   │   ├── constants.py
│   │   ├── exceptions.py
│   │   ├── service.py
│   │   └── utils.py
│   ├── aws
│   │   ├── client.py  # client model for external service communication
│   │   ├── schemas.py
│   │   ├── config.py
│   │   ├── constants.py
│   │   ├── exceptions.py
│   │   └── utils.py
│   ├── posts
│   │   ├── router.py
│   │   ├── schemas.py
│   │   ├── models.py
│   │   ├── dependencies.py
│   │   ├── constants.py
│   │   ├── exceptions.py
│   │   ├── service.py
│   │   └── utils.py
│   ├── config.py  # global configs
│   ├── models.py  # global models
│   ├── exceptions.py  # global exceptions
│   ├── pagination.py  # global module e.g. pagination
│   ├── database.py  # db connection related stuff
│   └── main.py
├── tests/
│   ├── auth
│   ├── aws
│   └── posts
├── templates/
│   └── index.html
├── requirements
│   ├── base.txt
│   ├── dev.txt
│   └── prod.txt
├── .env
├── .gitignore
├── logging.ini
└── alembic.ini
  1. Store all domain directories inside src folder
    1. src/ - highest level of an app, contains common models, configs, and constants, etc.
    2. src/main.py - root of the project, which inits the FastAPI app
  2. Each package has its own router, schemas, models, etc.
    1. router.py - the core of each module, with all the endpoints
    2. schemas.py - for pydantic models
    3. models.py - for db models
    4. service.py - module specific business logic
    5. dependencies.py - router dependencies
    6. constants.py - module specific constants and error codes
    7. config.py - e.g. env vars
    8. utils.py - non-business logic functions, e.g. response normalization, data enrichment, etc.
    9. exceptions.py - module specific exceptions, e.g. PostNotFound, InvalidUserData
  3. When a package requires services, dependencies, or constants from other packages, import them with an explicit module name
from src.auth import constants as auth_constants
from src.notifications import service as notification_service
from src.posts.constants import ErrorCode as PostsErrorCode  # in case we have Standard ErrorCode in constants module of each package

2. Excessively use Pydantic for data validation

Pydantic has a rich set of features to validate and transform data.

In addition to regular features like required & non-required fields with default values, Pydantic has built-in comprehensive data processing tools like regex, enums for limited allowed options, length validation, email validation, etc.

from enum import Enum
from pydantic import AnyUrl, BaseModel, EmailStr, Field, constr

class MusicBand(str, Enum):
    AEROSMITH = "AEROSMITH"
    QUEEN = "QUEEN"
    ACDC = "AC/DC"


class UserBase(BaseModel):
    first_name: str = Field(min_length=1, max_length=128)
    # pydantic v1 syntax; in pydantic v2 the keyword is `pattern` instead of `regex`
    username: constr(regex="^[A-Za-z0-9-_]+$", to_lower=True, strip_whitespace=True)
    email: EmailStr
    age: int | None = Field(ge=18, default=None)  # must be greater than or equal to 18
    favorite_band: MusicBand | None = None  # only "AEROSMITH", "QUEEN", "AC/DC" values are allowed
    website: AnyUrl | None = None
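
As a rough illustration of what these declarations buy you, the sketch below feeds payloads into a trimmed-down model and collects the names of the rejected fields. UserSketch and error_fields are hypothetical names that are not part of the original code, and the email and URL fields are left out to avoid the extra email-validator dependency.

```python
from enum import Enum
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class MusicBand(str, Enum):
    AEROSMITH = "AEROSMITH"
    QUEEN = "QUEEN"
    ACDC = "AC/DC"


class UserSketch(BaseModel):  # hypothetical, trimmed-down version of UserBase
    first_name: str = Field(min_length=1, max_length=128)
    age: Optional[int] = Field(default=None, ge=18)
    favorite_band: Optional[MusicBand] = None


def error_fields(payload: dict) -> list[str]:
    """Return the sorted names of the fields that failed validation."""
    try:
        UserSketch(**payload)
    except ValidationError as exc:
        return sorted(str(error["loc"][0]) for error in exc.errors())
    return []


# Empty name, under-age user, and a band outside the enum: all three are rejected.
print(error_fields({"first_name": "", "age": 17, "favorite_band": "ABBA"}))
# A valid payload produces no errors.
print(error_fields({"first_name": "Freddie", "favorite_band": "QUEEN"}))
```

No hand-written if/else checks are needed; the constraints live next to the field definitions.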

3. Use dependencies for data validation vs DB

Pydantic can only validate the values from client input. Use dependencies to validate data against database constraints like email already exists, user not found, etc.

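# dependencies.py
from typing import Mapping

from pydantic import UUID4

# module paths below assume the project structure from section 1
from src.posts import service
from src.posts.exceptions import PostNotFound


async def valid_post_id(post_id: UUID4) -> Mapping: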
    post = await service.get_by_id(post_id)
    if not post:
        raise PostNotFound()

    return post


# router.py
@router.get("/posts/{post_id}", response_model=PostResponse)
async def get_post_by_id(post: Mapping = Depends(valid_post_id)):
    return post


@router.put("/posts/{post_id}", response_model=PostResponse)
async def update_post(
    update_data: PostUpdate,  
    post: Mapping = Depends(valid_post_id), 
):
    updated_post: Mapping = await service.update(id=post["id"], data=update_data)
    return updated_post


@router.get("/posts/{post_id}/reviews", response_model=list[ReviewsResponse])
async def get_post_reviews(post: Mapping = Depends(valid_post_id)):
    post_reviews: list[Mapping] = await reviews_service.get_by_post_id(post["id"])
    return post_reviews

If we didn't move data validation into a dependency, we would have to validate post_id in every endpoint and write the same tests for each of them.

4. Chain dependencies

Dependencies can use other dependencies and avoid code repetition for similar logic.

# dependencies.py
from fastapi.security import OAuth2PasswordBearer
from jose import JWTError, jwt

async def valid_post_id(post_id: UUID4) -> Mapping:
    post = await service.get_by_id(post_id)
    if not post:
        raise PostNotFound()

    return post


async def parse_jwt_data(
    token: str = Depends(OAuth2PasswordBearer(tokenUrl="/auth/token"))
) -> dict:
    try:
        payload = jwt.decode(token, "JWT_SECRET", algorithms=["HS256"])
    except JWTError:
        raise InvalidCredentials()

    return {"user_id": payload["id"]}


async def valid_owned_post(
    post: Mapping = Depends(valid_post_id), 
    token_data: dict = Depends(parse_jwt_data),
) -> Mapping:
    if post["creator_id"] != token_data["user_id"]:
        raise UserNotOwner()

    return post

# router.py
@router.get("/users/{user_id}/posts/{post_id}", response_model=PostResponse)
async def get_user_post(post: Mapping = Depends(valid_owned_post)):
    return post

5. Decouple & Reuse dependencies. Dependency calls are cached.

Dependencies can be reused multiple times without being recalculated: by default, FastAPI caches a dependency's result within a request's scope. For example, if a dependency calls the service function get_post_by_id, the DB is visited only on the first call within that request, not each time the dependency is used.

Knowing this, we can easily decouple dependencies into multiple smaller functions that operate on a smaller domain and are easier to reuse in other routes. For example, in the code below parse_jwt_data is needed in three places:

  1. valid_owned_post
  2. valid_active_creator
  3. get_user_post (indirectly, through the two dependencies above),

but parse_jwt_data is called only once per request, in the very first call.

# dependencies.py
from fastapi import BackgroundTasks
from fastapi.security import OAuth2PasswordBearer
from jose import JWTError, jwt

async def valid_post_id(post_id: UUID4) -> Mapping:
    post = await service.get_by_id(post_id)
    if not post:
        raise PostNotFound()

    return post


async def parse_jwt_data(
    token: str = Depends(OAuth2PasswordBearer(tokenUrl="/auth/token"))
) -> dict:
    try:
        payload = jwt.decode(token, "JWT_SECRET", algorithms=["HS256"])
    except JWTError:
        raise InvalidCredentials()

    return {"user_id": payload["id"]}


async def valid_owned_post(
    post: Mapping = Depends(valid_post_id), 
    token_data: dict = Depends(parse_jwt_data),
) -> Mapping:
    if post["creator_id"] != token_data["user_id"]:
        raise UserNotOwner()

    return post


async def valid_active_creator(
    token_data: dict = Depends(parse_jwt_data),
):
    user = await users_service.get_by_id(token_data["user_id"])
    if not user["is_active"]:
        raise UserIsBanned()

    if not user["is_creator"]:
        raise UserNotCreator()

    return user


# router.py
@router.get("/users/{user_id}/posts/{post_id}", response_model=PostResponse)
async def get_user_post(
    worker: BackgroundTasks,
    post: Mapping = Depends(valid_owned_post),
    user: Mapping = Depends(valid_active_creator),
):
    """Get a post that belongs to the active user."""
    worker.add_task(notifications_service.send_email, user["id"])
    return post

6. Follow the REST

Developing a RESTful API makes it easier to reuse dependencies in routes like these:

  1. GET /courses/:course_id
  2. GET /courses/:course_id/chapters/:chapter_id/lessons
  3. GET /chapters/:chapter_id

The only caveat is to use the same variable names in the path:

  • If you have two endpoints, GET /profiles/:profile_id and GET /creators/:creator_id, that both validate whether the given profile_id exists, but GET /creators/:creator_id also checks whether the profile is a creator, then it's better to rename the creator_id path variable to profile_id and chain those two dependencies.
# src.profiles.dependencies
async def valid_profile_id(profile_id: UUID4) -> Mapping:
    profile = await service.get_by_id(profile_id)
    if not profile:
        raise ProfileNotFound()

    return profile

# src.creators.dependencies
async def valid_creator_id(profile: Mapping = Depends(valid_profile_id)) -> Mapping:
    if not profile["is_creator"]:
        raise ProfileNotCreator()

    return profile

# src.profiles.router.py
@router.get("/profiles/{profile_id}", response_model=ProfileResponse)
async def get_user_profile_by_id(profile: Mapping = Depends(valid_profile_id)):
    """Get profile by id."""
    return profile

# src.creators.router.py
@router.get("/creators/{profile_id}", response_model=ProfileResponse)
async def get_creator_profile_by_id(
    creator_profile: Mapping = Depends(valid_creator_id)
):
    """Get creator's profile by id."""
    return creator_profile

Use /me endpoints for the user's own resources (e.g. GET /profiles/me, GET /users/me/posts)

  1. No need to validate that user id exists - it's already checked via auth method
  2. No need to check whether the user id belongs to the requester

7. Don't make your routes async, if you have only blocking I/O operations

Under the hood, FastAPI can effectively handle both async and sync I/O operations.

  • FastAPI runs sync routes in the threadpool and blocking I/O operations won't stop the event loop from executing the tasks.
  • Otherwise, if the route is defined async then it's called regularly via await and FastAPI trusts you to do only non-blocking I/O operations.

The caveat is that if you break that trust and execute blocking operations within async routes, the event loop will not be able to run the next tasks until that blocking operation is done.

import asyncio
import time

@router.get("/terrible-ping")
async def terrible_catastrophic_ping():
    time.sleep(10) # I/O blocking operation for 10 seconds
    pong = service.get_pong()  # I/O blocking operation to get pong from DB
    
    return {"pong": pong}

@router.get("/good-ping")
def good_ping():
    time.sleep(10) # I/O blocking operation for 10 seconds, but in another thread
    pong = service.get_pong()  # I/O blocking operation to get pong from DB, but in another thread
    
    return {"pong": pong}

@router.get("/perfect-ping")
async def perfect_ping():
    await asyncio.sleep(10) # non-blocking I/O operation
    pong = await service.async_get_pong()  # non-blocking I/O db call

    return {"pong": pong}

What happens when we call:

  1. GET /terrible-ping
    1. FastAPI server receives a request and starts handling it
    2. Server's event loop and all the tasks in the queue will be waiting until time.sleep() is finished
      1. Server thinks time.sleep() is not an I/O task, so it waits until it is finished
      2. Server won't accept any new requests while waiting
    3. Then, event loop and all the tasks in the queue will be waiting until service.get_pong is finished
      1. Server thinks service.get_pong() is not an I/O task, so it waits until it is finished
      2. Server won't accept any new requests while waiting
    4. Server returns the response.
      1. After a response, server starts accepting new requests
  2. GET /good-ping
    1. FastAPI server receives a request and starts handling it
    2. FastAPI sends the whole route good_ping to the threadpool, where a worker thread will run the function
    3. While good_ping is being executed, event loop selects next tasks from the queue and works on them (e.g. accept new request, call db)
      • Independently of main thread (i.e. our FastAPI app), worker thread will be waiting for time.sleep to finish and then for service.get_pong to finish
      • Sync operation blocks only the side thread, not the main one.
    4. When good_ping finishes its work, server returns a response to the client
  3. GET /perfect-ping
    1. FastAPI server receives a request and starts handling it
    2. FastAPI awaits asyncio.sleep(10)
    3. Event loop selects next tasks from the queue and works on them (e.g. accept new request, call db)
    4. When asyncio.sleep(10) is done, server goes to the next lines and awaits service.async_get_pong
    5. Event loop selects next tasks from the queue and works on them (e.g. accept new request, call db)
    6. When service.async_get_pong is done, server returns a response to the client
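
The three scenarios can be reproduced without a web server. This is a standard-library sketch: a background "heartbeat" task stands in for other requests, asyncio.to_thread stands in for the threadpool FastAPI uses for sync routes, and all function names are hypothetical.

```python
import asyncio
import contextlib
import time
from typing import Awaitable, Callable


async def heartbeat(ticks: list[float]) -> None:
    """Stand-in for other requests: records a timestamp every 50 ms while the loop is free."""
    while True:
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.05)


async def terrible() -> None:
    time.sleep(0.3)  # blocking call inside a coroutine: the event loop is stuck


async def good() -> None:
    await asyncio.to_thread(time.sleep, 0.3)  # blocking call moved to a worker thread


async def perfect() -> None:
    await asyncio.sleep(0.3)  # non-blocking wait


async def ticks_during(handler: Callable[[], Awaitable[None]]) -> int:
    """Count how many heartbeats ran while `handler` was in progress."""
    ticks: list[float] = []
    task = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat record its first tick
    await handler()
    task.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await task
    return len(ticks)


results = {
    handler.__name__: asyncio.run(ticks_during(handler))
    for handler in (terrible, good, perfect)
}
print(results)  # `terrible` records a single tick; the other two record several
```

While the blocking handler runs, the heartbeat never gets a turn, which is the same starvation other requests experience in /terrible-ping.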

The second caveat is that operations that are non-blocking awaitables or are sent to the thread pool must be I/O intensive tasks (e.g. open file, db call, external API call).

  • Awaiting CPU-intensive tasks (e.g. heavy calculations, data processing, video transcoding) is worthless, since the CPU has to work to finish them. I/O operations, by contrast, are external: the server does nothing while waiting for those operations to finish, so it can move on to the next tasks.
  • Running CPU-intensive tasks in other threads also isn't effective because of the GIL. In short, the GIL allows only one thread to execute Python code at a time, which makes threads useless for CPU-bound tasks.
  • If you want to optimize CPU-intensive tasks, you should send them to workers in another process.

Related StackOverflow questions of confused users

  1. https://stackoverflow.com/questions/62976648/architecture-flask-vs-fastapi/70309597#70309597
  2. https://stackoverflow.com/questions/65342833/fastapi-uploadfile-is-slow-compared-to-flask
  3. https://stackoverflow.com/questions/71516140/fastapi-runs-api-calls-in-serial-instead-of-parallel-fashion

8. Custom base model from day 0.

Having a controllable global base model allows us to customize all the models within the app. For example, we could have a standard datetime format or add a super method for all subclasses of the base model.

from datetime import datetime
from typing import Any
from zoneinfo import ZoneInfo

from fastapi.encoders import jsonable_encoder
from pydantic import BaseModel, ConfigDict, model_validator


def convert_datetime_to_gmt(dt: datetime) -> str:
    if not dt.tzinfo:
        dt = dt.replace(tzinfo=ZoneInfo("UTC"))

    return dt.strftime("%Y-%m-%dT%H:%M:%S%z")


class CustomModel(BaseModel):
    model_config = ConfigDict(
        json_encoders={datetime: convert_datetime_to_gmt},
        populate_by_name=True,
    )

    @model_validator(mode="before")
    @classmethod
    def set_null_microseconds(cls, data: dict[str, Any]) -> dict[str, Any]:
        datetime_fields = {
            k: v.replace(microsecond=0)
            for k, v in data.items()
            if isinstance(v, datetime)  # check the value, not the key
        }

        return {**data, **datetime_fields}

    def serializable_dict(self, **kwargs):
        """Return a dict which contains only serializable fields."""
        default_dict = self.model_dump()

        return jsonable_encoder(default_dict)

In the example above we have decided to make a global base model which:

  • sets microseconds to 0 in all datetime fields
  • serializes all datetime fields to standard format with explicit timezone
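
To make the two transformations concrete, here is a plain-Python sketch of what the base model is meant to do to a datetime value. The helper names are hypothetical, and timezone.utc is used as a dependency-free stand-in for ZoneInfo("UTC").

```python
from datetime import datetime, timezone
from typing import Any


def to_gmt_string(dt: datetime) -> str:
    """Format a datetime with an explicit offset, treating naive values as UTC."""
    if not dt.tzinfo:
        dt = dt.replace(tzinfo=timezone.utc)  # stand-in for ZoneInfo("UTC")
    return dt.strftime("%Y-%m-%dT%H:%M:%S%z")


def drop_microseconds(data: dict[str, Any]) -> dict[str, Any]:
    """Zero the microseconds of every datetime value, leaving other values untouched."""
    return {
        key: value.replace(microsecond=0) if isinstance(value, datetime) else value
        for key, value in data.items()
    }


raw = {"title": "hello", "created_at": datetime(2022, 8, 24, 12, 30, 45, 123456)}
cleaned = drop_microseconds(raw)
print(cleaned["created_at"].microsecond)     # 0
print(to_gmt_string(cleaned["created_at"]))  # 2022-08-24T12:30:45+0000
```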

9. Docs

  1. Unless your API is public, hide docs by default. Show it explicitly on the selected envs only.
from fastapi import FastAPI
from starlette.config import Config

config = Config(".env")  # parse .env file for env variables

ENVIRONMENT = config("ENVIRONMENT")  # get current env name
SHOW_DOCS_ENVIRONMENT = ("local", "staging")  # explicit list of allowed envs

app_configs = {"title": "My Cool API"}
if ENVIRONMENT not in SHOW_DOCS_ENVIRONMENT:
    app_configs["openapi_url"] = None  # set url for docs as null

app = FastAPI(**app_configs)
  2. Help FastAPI generate easy-to-understand docs
    1. Set response_model, status_code, description, etc.
    2. If models and statuses vary, use the responses route attribute to add docs for different responses
from fastapi import APIRouter, status

router = APIRouter()

@router.post(
    "/endpoints",
    response_model=DefaultResponseModel,  # default response pydantic model 
    status_code=status.HTTP_201_CREATED,  # default status code
    description="Description of the well documented endpoint",
    tags=["Endpoint Category"],
    summary="Summary of the Endpoint",
    responses={
        status.HTTP_200_OK: {
            "model": OkResponse, # custom pydantic model for 200 response
            "description": "Ok Response",
        },
        status.HTTP_201_CREATED: {
            "model": CreatedResponse,  # custom pydantic model for 201 response
            "description": "Creates something from user request ",
        },
        status.HTTP_202_ACCEPTED: {
            "model": AcceptedResponse,  # custom pydantic model for 202 response
            "description": "Accepts request and handles it later",
        },
    },
)
async def documented_route():
    pass

This will generate docs like this: [screenshot: FastAPI Generated Custom Response Docs]

10. Use Pydantic's BaseSettings for configs

Pydantic gives a powerful tool to parse environment variables and process them with its validators.

from pydantic import AnyUrl, PostgresDsn
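from pydantic_settings import BaseSettings, SettingsConfigDict  # pydantic v2

class AppSettings(BaseSettings):
    # pydantic v2 style; replaces the deprecated inner `class Config`
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        env_prefix="app_",
    )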

    DATABASE_URL: PostgresDsn
    IS_GOOD_ENV: bool = True
    ALLOWED_CORS_ORIGINS: set[AnyUrl]

11. SQLAlchemy: Set DB keys naming convention

Explicitly naming indexes and constraints according to your database's convention is preferable to relying on SQLAlchemy's defaults.

from sqlalchemy import MetaData

POSTGRES_INDEXES_NAMING_CONVENTION = {
    "ix": "%(column_0_label)s_idx",
    "uq": "%(table_name)s_%(column_0_name)s_key",
    "ck": "%(table_name)s_%(constraint_name)s_check",
    "fk": "%(table_name)s_%(column_0_name)s_fkey",
    "pk": "%(table_name)s_pkey",
}
metadata = MetaData(naming_convention=POSTGRES_INDEXES_NAMING_CONVENTION)
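
To see what names these templates produce, the sketch below fills them with plain %-formatting. SQLAlchemy does the substitution itself. The table, column, and constraint names here are hypothetical, and column_0_label is assumed to be "<table>_<column>".

```python
NAMING_CONVENTION = {
    "ix": "%(column_0_label)s_idx",
    "uq": "%(table_name)s_%(column_0_name)s_key",
    "ck": "%(table_name)s_%(constraint_name)s_check",
    "fk": "%(table_name)s_%(column_0_name)s_fkey",
    "pk": "%(table_name)s_pkey",
}

# Hypothetical table and column, used only to show the resulting names.
tokens = {
    "table_name": "post_like",
    "column_0_name": "post_id",
    "column_0_label": "post_like_post_id",
    "constraint_name": "positive_count",
}

names = {kind: template % tokens for kind, template in NAMING_CONVENTION.items()}
for kind, name in names.items():
    print(kind, name)
```

With the convention set on MetaData, constraint names stay stable across environments, which keeps autogenerated migrations predictable.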

12. Migrations. Alembic.

  1. Migrations must be static and revertable. If your migrations depend on dynamically generated data, then make sure the only thing that is dynamic is the data itself, not its structure.
  2. Generate migrations with descriptive names & slugs. Slug is required and should explain the changes.
  3. Set human-readable file template for new migrations. We use *date*_*slug*.py pattern, e.g. 2022-08-24_post_content_idx.py
# alembic.ini
file_template = %%(year)d-%%(month).2d-%%(day).2d_%%(slug)s
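
The doubled percent signs are an escape for the ini parser. The sketch below reads the same line with Python's configparser and fills the template by hand, assuming Alembic substitutes the tokens with ordinary %-formatting. The date and slug values are made up to match the example filename.

```python
import configparser

INI_TEXT = """
[alembic]
file_template = %%(year)d-%%(month).2d-%%(day).2d_%%(slug)s
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# configparser turns each "%%" into a single "%" when the value is read.
template = parser.get("alembic", "file_template")

# Hypothetical revision date and slug.
filename = template % {"year": 2022, "month": 8, "day": 24, "slug": "post_content_idx"} + ".py"
print(template)
print(filename)  # 2022-08-24_post_content_idx.py
```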

13. Set DB naming convention

Being consistent with names is important. Some rules we followed:

  1. lower_case_snake
  2. singular form (e.g. post, post_like, user_playlist)
  3. group similar tables with module prefix, e.g. payment_account, payment_bill, post, post_like
  4. stay consistent across tables, but concrete namings are ok, e.g.
    1. use profile_id in all tables, but if some of them need only profiles that are creators, use creator_id
    2. use post_id for all abstract tables like post_like, post_view, but use concrete naming in relevant modules like course_id in chapters.course_id
  5. _at suffix for datetime
  6. _date suffix for date

14. Set tests client async from day 0

Writing integration tests with a DB will most likely lead to messed-up event loop errors in the future. Set up the async test client immediately, e.g. async_asgi_testclient or httpx.

import pytest
from async_asgi_testclient import TestClient

from src.main import app  # inited FastAPI app


@pytest.fixture
async def client():
    host, port = "127.0.0.1", "5555"
    scope = {"client": (host, port)}

    async with TestClient(
        app, scope=scope, headers={"X-User-Fingerprint": "Test"}
    ) as client:
        yield client


@pytest.mark.asyncio
async def test_create_post(client: TestClient):
    resp = await client.post("/posts")

    assert resp.status_code == 201

Skip this only if you have sync db connections (excuse me?) or aren't planning to write integration tests.

15. BackgroundTasks > asyncio.create_task

BackgroundTasks can effectively run both blocking and non-blocking I/O operations the same way FastAPI handles blocking routes (sync tasks are run in a threadpool, while async tasks are awaited later)

  • Don't lie to the worker and don't mark blocking I/O operations as async
  • Don't use it for heavy CPU intensive tasks.
from fastapi import APIRouter, BackgroundTasks
from pydantic import UUID4

from src.notifications import service as notifications_service


router = APIRouter()


@router.post("/users/{user_id}/email")
async def send_user_email(worker: BackgroundTasks, user_id: UUID4):
    """Send email to user"""
    worker.add_task(notifications_service.send_email, user_id)  # send email after responding client
    return {"status": "ok"}

16. Typing is important

FastAPI, Pydantic, and modern IDEs encourage the use of type hints.

Without Type Hints

With Type Hints
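
A small sketch of the difference, using a hypothetical price calculation: the first function gives the reader and the IDE nothing to work with, while the second documents its inputs and output in the signature.

```python
from dataclasses import dataclass


def total_price_untyped(items, discount):
    # Nothing tells the reader (or the IDE) what `items` holds or what comes back.
    return sum(i["price"] * i["qty"] for i in items) * (1 - discount)


@dataclass
class OrderItem:  # hypothetical type, used only for this illustration
    price: float
    qty: int


def total_price(items: list[OrderItem], discount: float = 0.0) -> float:
    """Same calculation, but the signature documents inputs and output."""
    return sum(item.price * item.qty for item in items) * (1 - discount)


order = [OrderItem(price=10.0, qty=2), OrderItem(price=5.0, qty=1)]
print(total_price(order, discount=0.5))
```

With the typed version, autocompletion works on item.price and a type checker such as mypy can flag a call that passes a list of dicts by mistake.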

17. Save files in chunks.

Don't hope your clients will send small files.

import aiofiles
from fastapi import UploadFile

DEFAULT_CHUNK_SIZE = 1024 * 1024 * 50  # 50 megabytes

async def save_video(video_file: UploadFile):
    async with aiofiles.open("/file/path/name.mp4", "wb") as f:
        while chunk := await video_file.read(DEFAULT_CHUNK_SIZE):
            await f.write(chunk)
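
The same loop can be tried without a server. This is a synchronous, standard-library sketch in which in-memory buffers stand in for the upload and the target file. copy_in_chunks is a hypothetical helper, not part of FastAPI or aiofiles.

```python
import io
from typing import BinaryIO


def copy_in_chunks(source: BinaryIO, target: BinaryIO, chunk_size: int = 1024 * 1024) -> int:
    """Copy `source` to `target` chunk by chunk and return the number of chunks written."""
    chunks = 0
    while chunk := source.read(chunk_size):  # read() returns b"" at end of stream
        target.write(chunk)
        chunks += 1
    return chunks


upload = io.BytesIO(b"x" * 2_500)  # stand-in for an uploaded file
saved = io.BytesIO()               # stand-in for the file on disk
print(copy_in_chunks(upload, saved, chunk_size=1_000))  # 3 chunks: 1000 + 1000 + 500
```

At no point does the process hold more than one chunk of the file in memory, which is the whole point of the pattern.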

18. Be careful with dynamic pydantic fields (Pydantic v1)

If you have a pydantic field that can accept a union of types, be sure the validator explicitly knows the difference between those types.

from pydantic import BaseModel


class Article(BaseModel):
   text: str | None
   extra: str | None


class Video(BaseModel):
   video_id: int
   text: str | None
   extra: str | None

   
class Post(BaseModel):
   content: Article | Video

   
post = Post(content={"video_id": 1, "text": "text"})
print(type(post.content))
# OUTPUT: Article
# Article is very inclusive and all fields are optional, allowing any dict to become valid

Solutions:

  1. Validate input has only allowed valid fields and raise error if unknowns are provided
from pydantic import BaseModel, Extra

class Article(BaseModel):
   text: str | None
   extra: str | None
   
   class Config:
        extra = Extra.forbid
       

class Video(BaseModel):
   video_id: int
   text: str | None
   extra: str | None
   
   class Config:
        extra = Extra.forbid

   
class Post(BaseModel):
   content: Article | Video
  2. Use Pydantic's Smart Union (>v1.9, <2.0) if fields are simple

It's a good solution if the fields are simple like int or bool, but it doesn't work for complex fields like classes.

Without Smart Union

from pydantic import BaseModel


class Post(BaseModel):
   field_1: bool | int
   field_2: int | str
   content: Article | Video

p = Post(field_1=1, field_2="1", content={"video_id": 1})
print(p.field_1)
# OUTPUT: True
print(type(p.field_2))
# OUTPUT: int
print(type(p.content))
# OUTPUT: Article

With Smart Union

class Post(BaseModel):
   field_1: bool | int
   field_2: int | str
   content: Article | Video

   class Config:
      smart_union = True


p = Post(field_1=1, field_2="1", content={"video_id": 1})
print(p.field_1)
# OUTPUT: 1
print(type(p.field_2))
# OUTPUT: str
print(type(p.content))
# OUTPUT: Article, because smart_union doesn't work for complex fields like classes
  3. Fast Workaround

Order field types properly: from the most strict ones to loose ones.

class Post(BaseModel):
   content: Video | Article

19. SQL-first, Pydantic-second

  • Usually, the database handles data processing much faster and cleaner than CPython ever will.
  • It's preferable to do all the complex joins and simple data manipulations with SQL.
  • It's preferable to aggregate JSONs in DB for responses with nested objects.
# src.posts.service
from typing import Mapping

from pydantic import UUID4
from sqlalchemy import desc, func, select, text
from sqlalchemy.sql.functions import coalesce

from src.database import database, posts, profiles, post_review, products

async def get_posts(
    creator_id: UUID4, *, limit: int = 10, offset: int = 0
) -> list[Mapping]: 
    select_query = (
        select(
            (
                posts.c.id,
                posts.c.type,
                posts.c.slug,
                posts.c.title,
                func.json_build_object(
                   text("'id', profiles.id"),
                   text("'first_name', profiles.first_name"),
                   text("'last_name', profiles.last_name"),
                   text("'username', profiles.username"),
                ).label("creator"),
            )
        )
        .select_from(posts.join(profiles, posts.c.owner_id == profiles.c.id))
        .where(posts.c.owner_id == creator_id)
        .limit(limit)
        .offset(offset)
        .group_by(
            posts.c.id,
            posts.c.type,
            posts.c.slug,
            posts.c.title,
            profiles.c.id,
            profiles.c.first_name,
            profiles.c.last_name,
            profiles.c.username,
            profiles.c.avatar,
        )
        .order_by(
            desc(coalesce(posts.c.updated_at, posts.c.published_at, posts.c.created_at))
        )
    )
    
    return await database.fetch_all(select_query)

# src.posts.schemas
import orjson
from enum import Enum

from pydantic import BaseModel, UUID4, validator


class PostType(str, Enum):
    ARTICLE = "ARTICLE"
    COURSE = "COURSE"

   
class Creator(BaseModel):
    id: UUID4
    first_name: str
    last_name: str
    username: str


class Post(BaseModel):
    id: UUID4
    type: PostType
    slug: str
    title: str
    creator: Creator

    @validator("creator", pre=True)  # before default validation
    def parse_json(cls, creator: str | dict | Creator) -> dict | Creator:
       if isinstance(creator, str):  # i.e. json
          return orjson.loads(creator)

       return creator
    
# src.posts.router
from fastapi import APIRouter, Depends

router = APIRouter()


@router.get("/creators/{creator_id}/posts", response_model=list[Post])
async def get_creator_posts(creator: Mapping = Depends(valid_creator_id)):
   posts = await service.get_posts(creator["id"])

   return posts

If aggregated data from the DB is a simple JSON, then take a look at Pydantic's Json field type, which will load raw JSON first.

from pydantic import BaseModel, Json

class A(BaseModel):
    numbers: Json[list[int]]
    dicts: Json[dict[str, int]]

valid_a = A(numbers="[1, 2, 3]", dicts='{"key": 1000}')  # becomes A(numbers=[1,2,3], dicts={"key": 1000})
invalid_a = A(numbers='["a", "b", "c"]', dicts='{"key": "str instead of int"}')  # raises ValueError

20. Validate hosts, if users can send publicly available URLs

For example, we have a specific endpoint which:

  1. accepts media file from the user,
  2. generates unique url for this file,
  3. returns url to user,
    1. which they will use in other endpoints like PUT /profiles/me, POST /posts
    2. these endpoints accept files only from whitelisted hosts
  4. uploads file to AWS with this name and matching URL.

If we don't whitelist URL hosts, then malicious users will have a chance to submit dangerous links.

from pydantic import AnyUrl, BaseModel

ALLOWED_MEDIA_URLS = {"mysite.com", "mysite.org"}

class CompanyMediaUrl(AnyUrl):
    @classmethod
    def validate_host(cls, parts: dict) -> tuple[str, str, str, bool]:  # pydantic v1
        """Extend pydantic's AnyUrl validation to whitelist URL hosts."""
        host, tld, host_type, rebuild = super().validate_host(parts)
        if host not in ALLOWED_MEDIA_URLS:
            raise ValueError(
                "Forbidden host url. Upload files only to internal services."
            )

        return host, tld, host_type, rebuild


class Profile(BaseModel):
    avatar_url: CompanyMediaUrl  # only whitelisted urls for avatar
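
Since validate_host is a pydantic v1 hook, here is a version-independent sketch of the same check using only the standard library. validate_media_url is a hypothetical helper that could be called from a custom validator, and the scheme check is an extra assumption, not part of the original code.

```python
from urllib.parse import urlparse

ALLOWED_MEDIA_HOSTS = {"mysite.com", "mysite.org"}


def validate_media_url(url: str) -> str:
    """Return the URL unchanged if its host is allowlisted, otherwise raise ValueError."""
    parsed = urlparse(url)
    if parsed.scheme not in {"http", "https"}:
        raise ValueError("Only http(s) URLs are accepted.")
    # Compare the exact hostname; substring checks would accept mysite.com.evil.io
    if parsed.hostname not in ALLOWED_MEDIA_HOSTS:
        raise ValueError("Forbidden host url. Upload files only to internal services.")
    return url


print(validate_media_url("https://mysite.com/media/avatar.png"))

for bad_url in ("https://mysite.com.evil.io/a.png", "https://evil.io/?next=mysite.com"):
    try:
        validate_media_url(bad_url)
    except ValueError as exc:
        print(bad_url, "->", exc)
```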

21. Raise a ValueError in custom pydantic validators, if schema directly faces the client

It will return a nice detailed response to users.

# src.profiles.schemas
from pydantic import BaseModel, validator

class ProfileCreate(BaseModel):
    username: str
    
    @validator("username")  # pydantic v1
    def validate_bad_words(cls, username: str):
        if username == "me":
            raise ValueError("bad username, choose another")
        
        return username


# src.profiles.routes
from fastapi import APIRouter

router = APIRouter()


@router.post("/profiles")
async def create_profile(profile_data: ProfileCreate):
    pass

Response Example:

22. FastAPI converts Pydantic objects to dict, then to Pydantic object, then to JSON

If you think that returning a Pydantic object that matches your route's response_model is an optimization, you are wrong.

FastAPI first converts that Pydantic object to a dict with its jsonable_encoder, then validates the data against your response_model, and only then serializes the result to JSON.

from fastapi import FastAPI
from pydantic import BaseModel, root_validator

app = FastAPI()


class ProfileResponse(BaseModel):
    @root_validator
    def debug_usage(cls, data: dict):
        print("created pydantic model")

        return data

    def dict(self, *args, **kwargs):
        print("called dict")
        return super().dict(*args, **kwargs)


@app.get("/", response_model=ProfileResponse)
async def root():
    return ProfileResponse()

Logs Output:

[INFO] [2022-08-28 12:00:00.000000] created pydantic model
[INFO] [2022-08-28 12:00:00.000010] called dict
[INFO] [2022-08-28 12:00:00.000020] created pydantic model
[INFO] [2022-08-28 12:00:00.000030] called dict

23. If you must use sync SDK, then run it in a thread pool.

If you must use a library to interact with external services, and it's not async, then make the HTTP calls in an external worker thread.

For a simple example, we could use our well-known run_in_threadpool from starlette.

from fastapi import FastAPI
from fastapi.concurrency import run_in_threadpool
from my_sync_library import SyncAPIClient 

app = FastAPI()


@app.get("/")
async def call_my_sync_library():
    my_data = await service.get_my_data()

    client = SyncAPIClient()
    await run_in_threadpool(client.make_request, data=my_data)
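
To see why this matters, here is a minimal standard-library sketch of the same idea. It uses `asyncio.to_thread` instead of `run_in_threadpool`, and `blocking_request` is a hypothetical stand-in for a sync SDK call. The background ticker only advances when the event loop is free, so a non-zero count shows the blocking call did not stall the loop.

```python
import asyncio
import time


def blocking_request(data: str) -> str:
    """Hypothetical stand-in for a synchronous SDK call."""
    time.sleep(0.1)  # simulates blocking network I/O
    return f"sent:{data}"


async def handler() -> tuple:
    ticks = 0

    async def ticker() -> None:
        # Advances only while the event loop is free to schedule it.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    task = asyncio.create_task(ticker())
    # Off-load the blocking call to a worker thread and await its result.
    result = await asyncio.to_thread(blocking_request, "payload")
    task.cancel()
    return result, ticks


result, ticks = asyncio.run(handler())
```

Calling `blocking_request("payload")` directly inside `handler` would leave `ticks` at zero, because the loop could not run the ticker while the call was sleeping.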

24. Use linters (black, ruff)

With linters, you can forget about formatting the code and focus on writing the business logic.

Black is the uncompromising code formatter that eliminates so many small decisions you have to make during development. Ruff is a "blazingly-fast" new linter that replaces autoflake and isort, and supports more than 600 lint rules.

It's a popular good practice to use pre-commit hooks, but just using the script was ok for us.

#!/bin/sh -e
set -x

ruff check --fix src tests
black src tests

Bonus Section

Some very kind people shared their own experience and best practices that are definitely worth reading. Check them out in the issues section of the project.

For instance, lowercase00 has described in detail their best practices for working with permissions & auth, class-based services & views, task queues, custom response serializers, configuration with dynaconf, etc.

If you have something to share about your experience working with FastAPI, whether it's good or bad, you are very welcome to create a new issue. It is our pleasure to read it.

fastapi-best-practices's Issues

Pydantic base model to parse and return camelCase JSON

Credit: This idea has been inspired by this article and this comment.

I want to be able to both...

  • parse incoming JSON with camelCase naming style, and
  • return camelCase JSON in my responses

For this I'm using a custom base model for pretty much all my Pydantic schemas:

from humps import camelize
from pydantic import BaseModel

class BaseSchema(BaseModel):
    class Config:
        # enable sqlalchemy model parsing
        orm_mode = True

        # enable camelCase JSON parsing
        alias_generator = camelize
        allow_population_by_field_name = True

    # enable camelCase json response
    def json(self, *args, **kwargs):
        kwargs.setdefault("by_alias", True)
        return super().json(*args, **kwargs)

(This requires the pyHumps package.)


With this BaseSchema, I can now create pydantic schemas with the intended behavior:

class Device(BaseSchema):
    name: str | None
    serial_number: str

# ingest camelCase
device = Device.parse_raw("""{"name": "Device XYZ", "serialNumber": "XYZ-123-ABC-000"}""")

# return camelCase
print(device.json())
# {"name": "Device XYZ", "serialNumber": "XYZ-123-ABC-000"}

My solution above uses the camelize function from the pyhumps package. Alternatively, you can create the function yourself like this:

def camelize(string: str) -> str:
    string_split = string.split("_")
    return string_split[0] + "".join(word.capitalize() for word in string_split[1:])
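
A quick check of how a hand-rolled version behaves on a few inputs, as a sketch only (the function is restated so the snippet runs on its own; the edge cases are illustrations, not behavior promised by pyhumps):

```python
def camelize(string: str) -> str:
    first, *rest = string.split("_")
    return first + "".join(word.capitalize() for word in rest)


examples = {
    "serial_number": camelize("serial_number"),  # -> "serialNumber"
    "name": camelize("name"),                    # -> "name"
    # Edge cases worth knowing about:
    "_private": camelize("_private"),            # -> "Private" (leading underscore is lost)
    "http_URL": camelize("http_URL"),            # -> "httpUrl" (capitalize() lowercases the rest)
}
```

If field names with leading underscores or acronyms matter to your API, the library version or an explicit alias may be the safer choice.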

Circular import

How do you guys manage to avoid circular import errors? I am following this structure, but a circular import just came up.

Our Experiences with a similar structure

@zhanymkanov thanks for the write up. It’s great to have some benchmarks on professional implementations; this is awesome and one of the most valuable repositories, with a lot of production-ready and architecture tips. Great stuff, thanks a lot for sharing this!

To our (very positive) surprise, this is very similar to what we are doing on our side. I thought it was worth sharing our experiences and the choices we've made along the way, good and bad.

⚠️ This ended up being a lot longer than what I expected, my apologies.

Project Structure

This is very similar to what we are doing. The functional way of splitting things doesn’t really work except for really small projects, so we also have a “module” based approach. Our application looks something like:

ourproject-backend
├── alembic/
├── app
│   ├── auth
│   │   ├── routes.py
│   │   ├── schemas.py  # pydantic models
│   │   ├── models.py  # db models
│   │   ├── permissions.py # our decorator
│   │   ├── exceptions.py
│   │   ├── service.py
│   │   └── utils.py
│   ├── core
│   │   ├── routes.py
│   │   ├── services.py
│   │   ├── ....
│   ├── users
│   │   ├── routes.py
│   │   ├── services.py
│   │   ├── ....
│   ├── tenants
│   │   ├── routes.py
│   │   ├── services.py
│   │   ├── ....
│   ├── extensions
│   │   ├── logs.py # JSON Logger etc
│   │   ├── middleware.py # correlation ID & request tracker
│   │   ├── ....
│   ├── services
│   │   ├── mailer.py # a client to SES
│   │   ├── filesystem.py #  a wrapper over S3
│   │   ├── ....
│   ├── db
│   │   ├── mixin.py
│   │   ├── base.py
│   │   ├── engine.py
│   │   ├── ....
│   ├── utils
│   │   ├── schemas.py
│   │   ├── helpers.py
│   │   ├── ....
│   ├── modules
│   │   ├── module_a
│   │   │   ├── models.py
│   │   │   ├── routes.py
│   │   │   ├── schemas.py
│   │   │   ├── ....
│   │   ├── module_b
│   │   │   ├── models.py
│   │   │   ├── routes.py
│   │   │   ├── schemas.py
│   │   │   ├── ....
│   ├── config.py # where the Dynaconf singleton lives
│   ├── exceptions.py
│   ├── routes.py # registration of all system routes
│   ├── hub.py # our event hub
│   └── main.py
├── tests/
│   ├── users
│   ├── tenants
│   └── module_a
├── .env
├── .secrets.toml
├── .gitignore
├── settings.toml
├── mypy.ini
└── alembic.ini

A few comments:

  • We use a sort of “mixed” structure in the sense that some global/generic modules (like Users/Tenants/Auth) have all the same structure and are in the top level, but the application specific business logic is in the modules module. We have been using this structure for the past couple of years and have been pretty happy with the separation of concerns it brings. We even reuse the same blueprint for different projects, we mostly just change the modules which is great.
  • Having a specific db module on the top level has helped a lot giving us flexibility to have more robust Mixin classes, better engine configuration and some other goodies.
  • We also are really happy with having a core module on the top level. This gives us flexibility to do things like a specific mock service, a taskStatus route or more generic resources.
  • We really like how predictable this is and how much boilerplate code we can just copy around from module to module. We have dramatically sped up our development of new modules with this. This also helped new devs a lot in understanding the codebase logic.

Permissions & Auth

Although the “recommended” way of doing authentication in FastAPI would be the dependency injection, we have chosen to use a class-based decorator to control access on the route level.
So our routes look something like:

@route.get('/me')
@access_control(Resources.users_view_self)  # this is an enum
def myroute(self):
    ...

@route.get('/superuser_only')
@access_control(superuser=True)
def myroute(self):
    ...


@route.get('/open')
@access_control(open=True)
def myroute(self):
    ...

And our access_control class looks like:

import functools
from typing import Any, Callable, Optional

from fastapi import HTTPException, Request

# AppModules, AppActions, UserResponse, RequestArgs and exc are project-specific.


class access_control:  # pylint: disable=invalid-name
    MASTER_USER_ID = 0

    def __init__(
        self,
        module: Optional[AppModules] = None,
        resource: Optional[AppActions] = None,
        superuser: bool = False,
        open: bool = False,
    ) -> None:
        self.module = module
        self.resource = resource
        self.superuser = superuser
        self.open: bool = open
        self.tenant_id: Optional[int] = None
        self.object_id: Optional[int] = None
        self.current_user: Optional[UserResponse] = None
        self.request: Optional[Request] = None
        self.headers: Optional[dict[Any, Any]] = None
        self.auth_header: Optional[str] = None
        self.token: Optional[str] = None

    def __call__(self, function) -> Callable[..., Any]:
        @functools.wraps(function)
        async def decorated(*args, **kwargs):
            try:
                await self.parse_request(**kwargs)
                is_allowed = await self.verify_request(*args, **kwargs)
                if not is_allowed:
                    raise HTTPException(403, "Not allowed.")
                return await function(*args, **kwargs)
            except exc.NotAllowed as error:
                raise HTTPException(403, str(error)) from error

        return decorated

    async def parse_request(self, **kwargs) -> None:
        """Get the current user from the request"""
        dependencies = kwargs.get("self", kwargs.get("base_args"))
        base_args: Optional[RequestArgs] = getattr(dependencies, "base_args", None)
        if not base_args:
            return
        self.tenant_id = base_args.tenant_id
        self.current_user = base_args.current_user

    async def verify_request(self, *args, **kwargs) -> bool:
        """Actually check for permission based on route, user, tenant etc"""
        ...

A few benefits we encountered, and few drawbacks:

  • This is great to accept multiple parameters like module or action or superuser=True and things like that.
  • The permission controller (the access_control class itself) is fairly easy to work on, being very powerful at the same time, since it has the *args and **kwargs from the request, and the full context (current user, path, tenant, etc), so all sorts of checks can be used. As we increase the granularity of access control, we have been considering implementing a permissions decorator for each module, so we can have more specific control over a given resource. But that's still WIP.
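
For readers unfamiliar with class-based decorators, here is a stripped-down sketch of the mechanism, standard library only. It is not the class above: the `permissions` keyword, the string resource names and the use of `PermissionError` instead of `HTTPException` are assumptions made to keep the example self-contained.

```python
import asyncio
import functools
from typing import Any, Awaitable, Callable, Optional


class access_control:  # hypothetical, simplified stand-in
    def __init__(self, resource: Optional[str] = None, open: bool = False) -> None:
        # Runs once, when the decorator is applied to a route function.
        self.resource = resource
        self.open = open

    def __call__(self, function: Callable[..., Awaitable[Any]]) -> Callable[..., Awaitable[Any]]:
        @functools.wraps(function)
        async def decorated(*args: Any, **kwargs: Any) -> Any:
            # `permissions` is a hypothetical keyword carrying the caller's grants.
            granted = kwargs.pop("permissions", set())
            if not (self.open or self.resource in granted):
                raise PermissionError("Not allowed.")
            return await function(*args, **kwargs)

        return decorated


@access_control("users:view_self")
async def me() -> str:
    return "profile"


@access_control(open=True)
async def health() -> str:
    return "ok"
```

`__init__` captures the decorator arguments and `__call__` receives the function being wrapped, which is what lets a single class accept different parameter combinations such as `superuser=True` or `open=True`.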

Class-based Services

Our service module service.py started to get big and turned into a mess of functions, so we started using a few class-based services, which have been working very well: something like TenantService, UserService. For simple modules this almost looks like a repository (in some cases, for more complex business logic, we even split the service into a service and a repository). Now each service module has anywhere from 1 to 10 service classes, which greatly improved our organization and readability.

Class-based views

Earlier this year we refactored all of our routes to use the class-based view that is included in the fastapi-utils package, and this made our code a lot cleaner. The main benefit for us is that the basic authentication process (reading the token and the X-Tenant-ID from the header) is done in one place only, so we don’t have to repeat the dependencies.
What we’ve done is, we have a custom commons_deps function, and at the beginning of each route class we do something like:

@cbv(router)
class MyModuleRouter:
    commons = Depends(commons_deps)
    service = MyModuleService()

    @router.get('/me')
    @access_control(Resources.users_view_self)
    def myroute(self):
        # And now here we can access the common deps & the service
        current_user = self.commons.current_user
        tenant_id = self.commons.tenant_id
        response = self.service.get_module_resource(tenant_id)

We have been experimenting with something slightly different nowadays, which is having the service being instantiated with the tenant_id and current_user in a dependency injection, so that our service starts up a bit more complete.

Task Queues

We are long-time Celery users, but Celery is overwhelming and fairly difficult to reason about when you get to the internals and specifics. We just switched to RQ and couldn’t be happier, with a few caveats. The logic is amazing: the Queue and Job objects are really intuitive and easy to work with, as are dependency chains with depends_on. The thing is that there’s an issue with async functions: they work if you use the worker, but won’t work if you run them in the same process, which is kind of a pain when debugging. We haven’t experimented with Starlette’s background tasks, as we always valued having a centralized dashboard for tasks and an easy way to get a task status, for example. As we deploy most of our applications in Kubernetes, being able to scale the workers easily and indefinitely is awesome and we are really glad with it. I have been experimenting with a few different snippets to try to open a PR and make RQ compatible in every scenario.

The fancy architecture

In some cases (actually, projects) we slightly changed our module architecture to account for a proper business-oriented Model object.

...
│   ├── modules
│   │   ├── module_a
│   │   │   ├── routes.py
│   │   │   ├── services.py
│   │   │   ├── orm.py # the sqlalchemy classes
│   │   │   ├── models.py # "pure" modules (are also pydantic)
│   │   │   ├── schemas.py # the pydantic API schemas
│   │   │   ├── adapters.py
│   │   │   ├── builders.py
│   │   │   ├── interfaces.py
│   │   │   ├── repository.py

For fancier implementations this worked very well, although it is a lot more complex to start with. This gives us a proper EntityModel and great separation of concerns, but it gets a lot more verbose really quickly, so we found it was only worth it for very complex projects. Still, it’s a possibility.

Custom Response Serializers & BaseSchema

We found that the response_class in FastAPI also serializes the data in Pydantic, so it’s not purely for documentation. You can, however, overwrite the default response behavior by making a custom response class, which we did, gaining a bit of performance (anywhere from 50-100ms) and flexibility. So we have something like:

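# utils/schemas.py
import typing as t
from uuid import UUID

import orjson
from fastapi import BackgroundTasks, Response

# BaseSchema is the project's shared Pydantic base model.


def uuid_encoder(obj: t.Any) -> str:
    """orjson `default` hook: serialize UUIDs, reject anything else."""
    if isinstance(obj, UUID):
        return str(obj)
    raise TypeError(f"Type is not JSON serializable: {type(obj)}")


class JSONResponse(Response):
    media_type = "application/json"

    def __init__(
        self,
        content: t.Any = None,
        status_code: int = 200,
        headers: t.Optional[t.Mapping[str, str]] = None,
        media_type: t.Optional[str] = None,
        background: t.Optional[BackgroundTasks] = None,
    ) -> None:
        self.status_code = status_code
        if media_type is not None:
            self.media_type = media_type
        self.background = background
        self.body = self.render(content)
        self.init_headers(headers)

    def render(self, content: t.Any) -> bytes:
        # This is not 100% battle proof, but as our services are controlled (only return Pydantic models) works fine
        if isinstance(content, BaseSchema):
            return content.json().encode("utf-8")
        # Guard against empty lists before peeking at the first item.
        if isinstance(content, list) and content and isinstance(content[0], BaseSchema):
            return orjson.dumps([item.dict() for item in content], default=uuid_encoder)
        # Anything else falls back to plain orjson serialization instead of returning None.
        return orjson.dumps(content, default=uuid_encoder)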

And then we use the response directly like:

@cbv(router)
class MyModuleRouter:
    commons = Depends(commons_deps)
    service = MyModuleService()

    @router.get('/me', response_class=[...])
    @access_control(Users.view_self)  # this is an enum
    def myroute(self):
        # And now here we can access the commons
        current_user = self.commons.current_user
        tenant_id = self.commons.tenant_id
        response = self.service.get_module_resource(tenant_id)
        return JSONResponse(response, 200)

This gave us a cleaner router since we can set the status code on the response itself, which was more intuitive for us; we gained a bit of performance with the orjson encoder, and we just like it better. The (big) downside is that we face the risk of having documentation/API inconsistencies; in our case it happened once or twice, but we think it’s still worth it.
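
The serialization step on its own can be sketched with the standard library. This is an illustration under assumptions, not our production class: `json` replaces orjson, a dataclass named `Device` stands in for a Pydantic schema, and `encode_default` is a hypothetical hook name.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any
from uuid import UUID


@dataclass
class Device:  # hypothetical stand-in for a Pydantic schema
    id: UUID
    name: str


def encode_default(obj: Any) -> str:
    """`default` hook: called only for objects json cannot serialize itself."""
    if isinstance(obj, UUID):
        return str(obj)
    # Raising keeps unsupported types from being silently written as null.
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


def render(content: Any) -> bytes:
    if isinstance(content, list):
        payload: Any = [asdict(item) for item in content]  # also fine for []
    else:
        payload = asdict(content)
    return json.dumps(payload, default=encode_default).encode("utf-8")
```

A `default` hook that returns nothing for unknown types would emit `null` for them, so raising `TypeError` is usually the safer contract.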

Just like you guys, we also have a BaseSchema base for all the Pydantic schemas we use, which holds a couple of configurations like orm_mode, enum, etc.

Using a DefaultResponse class

On several occasions the response is kind of generic, so we make heavy use of a schema called DefaultResponse:

class DefaultResponse(BaseSchema):
    status: bool
    msg: str
    details: Optional[dict[Any, Any]] = {}

This is a kind of standardized way of communicating with our client (we have a React frontend) so the front devs always know what to look for when getting a DefaultResponse.

Configuration

Although Pydantic is nice for configuration as well, we couldn’t be happier using the amazing @dynaconf lib, developed and maintained by @BrunoRocha. This was a game changer in our settings management.

All of our settings/secrets went to .toml files and a few things happened:
- Only one file for multiple environments, using toml headers
- Only one place to manage keys (in Flask we were used to having multiple configuration classes, which were a pain to maintain)
- A singleton with global access; our settings file has ~10 lines:

#app/config.py

from dynaconf import Dynaconf

settings = Dynaconf(
    settings_files=[".settings.toml", ".secrets.toml"],
    envvar_prefix="MYAPP",
    env_switcher="MYAPP_APP_ENV",
    load_dotenv=True,
    environments=True,
)

And now everywhere we can just:

from app.config import settings

myvar = settings['MYVAR']
myvar_a = settings.MYVAR_A

And we don’t need to change anything when deploying to K8S since we already inject everything with env vars (config). Can’t recommend it more. We still have to experiment with the Vault integration, which is the next step.

The Message Hub

This helped a lot while we were trying to further decouple our services.
The hub is a centralized place to share messages between modules, something like:

class MessageHub:
    """Message hub for events"""

    handlers = {
        module_a.ResourceCreated: [
            module_b.handle_resource_created,
            module_c.handle_resource_created,
        ],
        module_d.ResourceDeleted: [
            module_b.handle_resource_deleted,
            module_c.handle_resource_deleted,
        ],
    }  # type: dict[Type["Event"], list[Callable[..., Any]]]

    @classmethod
    async def track(cls, event: ApplicationEvent):
        """Tracks the Application activity.
        Receives the application event that will be used by the AuditService.

        Args:
            event (ApplicationEvent): The ApplicationEvent
        """
        await AuditService.save(event)

    @classmethod
    async def handle(cls, event: Event):
        """
        Handles an arbitrary event.
        It will receive the event, and get the handlers that should handle
        the event. The order in which the handlers will execute the event may vary.
        If the event is sent to the worker, the handlers are async, meaning they can run at the same time.
        If the event is synchronous, then each handler will handle the event sequentially.

        Args:
            event (Event): The Event.
        """
        if type(event) not in cls.handlers:
            logger.info("No handlers for event: %s", event.__class__.__name__)
            return

        # Call listeners functions
        for fn in cls.handlers[type(event)]:
            if event.is_async:
                # Enqueue and move on to the next handler; returning here
                # would skip every handler after the first one.
                worker.enqueue(fn, event)
                continue

            await fn(event)

And in most modules we have a handlers.py module with a few functions that handle events. The services themselves usually dispatch events, like hub.MessageHub.handle(event_created_by_the_service), and we also use the hub to track application activity, normally called by the route: hub.MessageHub.track(application_activity_schema).
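
The dispatch idea can be tried out in isolation with a small in-process sketch. Everything here is hypothetical (`MiniHub`, `ResourceCreated`, the two handlers), and it leaves out the worker queue and audit tracking entirely.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, DefaultDict, List, Type

Handler = Callable[[Any], Awaitable[None]]


@dataclass
class ResourceCreated:  # hypothetical event type
    resource_id: int


class MiniHub:
    """Maps event types to the async handlers subscribed to them."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[Type[Any], List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: Type[Any], handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    async def handle(self, event: Any) -> int:
        """Run every handler for the event's type; return how many ran."""
        handlers = self._handlers.get(type(event), [])
        for fn in handlers:
            await fn(event)
        return len(handlers)


seen: List[str] = []


async def notify_billing(event: ResourceCreated) -> None:
    seen.append(f"billing:{event.resource_id}")


async def notify_search(event: ResourceCreated) -> None:
    seen.append(f"search:{event.resource_id}")


hub = MiniHub()
hub.subscribe(ResourceCreated, notify_billing)
hub.subscribe(ResourceCreated, notify_search)
handled = asyncio.run(hub.handle(ResourceCreated(resource_id=7)))
```

The module that raises `ResourceCreated` never imports the billing or search handlers, which is the decoupling the hub is meant to provide.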

Types & Docs

100% of arguments are typed and 100% of methods/functions have docstrings. I honestly can't live without them anymore. Now wondering if we could just compile the whole code to C and make it fly? Nuitka, MyPyC maybe? TBC...


Now the bad part, and our (really) bad practices

Local Session Management

For a couple of reasons we didn’t implement the request-coupled session management (inject the session through FastAPI’s dependency injection system) and we ended up having a lot of services that handle the session locally, which is not cool and not recommended by SQLAlchemy. Think of:

class ModuleService:
    ...

    async def module_method(self, *args):
        # Terrible: the session is opened locally, inside the service method
        async with async_session() as session:
            ...
        return something

Managing the session lifecycle itself is fairly OK and it works for really simple services, but what we found is that with more complex service methods that call one another, you end up nesting sessions, which is terrible. Imagine calling other_method from module_method, which also has the same session lifecycle management: now you just opened a session within another session. Just terrible. We are gradually moving to better session management, but we are still trying to find better ways of handling it.
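
The nesting problem is easy to reproduce with a toy example. The sketch below uses no SQLAlchemy at all: `FakeDatabase` is a hypothetical stand-in that only counts how many sessions are open at once, and "pass the session down" is just one possible alternative, not a pattern we have settled on.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator


class FakeDatabase:
    """Hypothetical stand-in that only counts concurrently open sessions."""

    def __init__(self) -> None:
        self.open_now = 0
        self.peak = 0

    @asynccontextmanager
    async def session(self) -> AsyncIterator[object]:
        self.open_now += 1
        self.peak = max(self.peak, self.open_now)
        try:
            yield object()
        finally:
            self.open_now -= 1


class ModuleService:
    def __init__(self, db: FakeDatabase) -> None:
        self.db = db

    # Pattern described above: every method opens its own session.
    async def nested_outer(self) -> str:
        async with self.db.session():
            return await self.nested_inner()

    async def nested_inner(self) -> str:
        async with self.db.session():
            return "done"

    # Alternative: the caller owns the session and passes it down.
    async def outer(self, session: object) -> str:
        return await self.inner(session)

    async def inner(self, session: object) -> str:
        return "done"


async def compare() -> tuple:
    nested_db, shared_db = FakeDatabase(), FakeDatabase()
    await ModuleService(nested_db).nested_outer()
    async with shared_db.session() as session:
        await ModuleService(shared_db).outer(session)
    return nested_db.peak, shared_db.peak


nested_peak, shared_peak = asyncio.run(compare())
```

Here `nested_peak` ends up at 2 and `shared_peak` at 1: one session per request instead of one per method call.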

Little use of the Dependency Injection

Your write-up has a lot of great examples of how to properly use and leverage the power of dependency injection. We don’t use many of those, and we definitely should.

Lack of Context in Services

Sometimes we found ourselves having a Service class that didn’t even have an initializer and was purely for organization. This is fine, but we are missing a lot of the benefits of having some context in the service (example: tenant_id and session), which would save us from passing the tenant_id to every single method in a service class. So there’s definitely a lot to improve here.
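
A minimal sketch of what a context-carrying service could look like, with hypothetical names (`TenantContext`, `ResourceService`) and an in-memory dict in place of the database:

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical in-memory data: tenant_id -> resource names.
RESOURCES: Dict[int, List[str]] = {1: ["a", "b"], 2: ["c"]}


@dataclass(frozen=True)
class TenantContext:
    tenant_id: int
    current_user: str


class ResourceService:
    def __init__(self, context: TenantContext) -> None:
        # The context is set once; methods no longer take tenant_id.
        self.context = context

    def list_resources(self) -> List[str]:
        return list(RESOURCES.get(self.context.tenant_id, []))

    def describe(self) -> str:
        count = len(self.list_resources())
        return f"{self.context.current_user} sees {count} resource(s)"


service = ResourceService(TenantContext(tenant_id=1, current_user="alice"))
```

In a FastAPI app the context would presumably be built by a dependency, so each request gets its own service instance.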

There's obviously a lot to improve and a whole lot more bad things that I probably forgot to mention, but again, I thought it was worth sharing the experience. And to finish, our Dockerfile, which is also pretty simple (using poetry and leveraging its dev-dependencies logic, something that was mentioned here as well: #1):

FROM python:3.10-slim
WORKDIR /app

COPY pyproject.toml .
COPY poetry.lock* .

RUN apt-get update -y && \
    apt-get install gcc -y && \
    apt-get install libpq-dev -y && \
    python -m venv .venv && \
    .venv/bin/pip install poetry && \
    .venv/bin/poetry export -f requirements.txt --output requirements.txt --no-dev --without-hashes && \
    .venv/bin/pip install -r requirements.txt && \
    apt-get remove gcc -y && \
    apt autoremove -y

ADD . /app
EXPOSE 8000
CMD [".venv/bin/uvicorn", "app.asgi:app", "--host", "0.0.0.0"]

Handling the nested response

Hey, this article is fantastic and inspired me a lot, much appreciated!

For No. 19 (https://github.com/zhanymkanov/fastapi-best-practices#19-sql-first-pydantic-second), I have made some attempts.

Recently I'm working on a small toolkit to handle the nested part.

TLDR: https://github.com/allmonday/pydantic_resolve#demo-2-integrated-with-aiodataloader

We use GraphQL in our project, which is very flexible and allows for easy definition of new fields. When combined with dataloader, it can solve the potential N+1 query problem.

However, as an internal API entry, I feel that GraphQL is too flexible. FastAPI's JSON schema, combined with various client-codegen tools, can reduce a lot of front-end and back-end workload, such as type definitions in front-end.

codegen: https://fastapi.tiangolo.com/advanced/generate-clients/

Therefore, I thought of leveraging the advantages of dataloader and Pydantic:

Pydantic can provide schema, while dataloader can solve the N+1 problem for nested field queries.

from typing import Tuple

from aiodataloader import DataLoader
from pydantic import BaseModel

# `resolve` is provided by pydantic_resolve.

BOOKS_DB = {
    1: [{'name': 'book1'}, {'name': 'book2'}],
    2: [{'name': 'book3'}, {'name': 'book4'}],
}

class Book(BaseModel):
    name: str

class BookLoader(DataLoader):
    async def batch_load_fn(self, keys):
        # One batched lookup for all requested student ids
        books = [[Book(**bb) for bb in BOOKS_DB.get(k, [])] for k in keys]
        return books

book_loader = BookLoader()

class Student(BaseModel):
    id: int
    name: str

    books: Tuple[Book, ...] = tuple()
    def resolve_books(self):
        return book_loader.load(self.id)

# usage (inside an async function)
students = [Student(id=1, name="jack"), Student(id=2, name="mike")]
results = await resolve(students)

This approach also simplifies the SQL part by moving the nested part of the query into the loader.

And another bonus is you can reuse the loader anywhere.
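
To make the N+1 point concrete, here is a small standard-library sketch that counts queries. It does not use aiodataloader or pydantic_resolve; `FakeDB`, `BOOK_ROWS` and both loader functions are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical rows of a `books` table: (student_id, name).
BOOK_ROWS: List[Tuple[int, str]] = [
    (1, "book1"), (1, "book2"), (2, "book3"), (2, "book4"),
]


class FakeDB:
    def __init__(self) -> None:
        self.queries = 0

    def books_for(self, student_ids: Iterable[int]) -> List[Tuple[int, str]]:
        """One call stands for one SQL query (WHERE student_id IN ...)."""
        self.queries += 1
        wanted = set(student_ids)
        return [row for row in BOOK_ROWS if row[0] in wanted]


def load_one_by_one(db: FakeDB, student_ids: List[int]) -> Dict[int, List[str]]:
    # N+1 style: one query per student.
    return {sid: [name for _, name in db.books_for([sid])] for sid in student_ids}


def load_batched(db: FakeDB, student_ids: List[int]) -> Dict[int, List[str]]:
    # Loader style: one query for all students, grouped in memory.
    grouped: Dict[int, List[str]] = defaultdict(list)
    for sid, name in db.books_for(student_ids):
        grouped[sid].append(name)
    return {sid: grouped[sid] for sid in student_ids}


slow_db, fast_db = FakeDB(), FakeDB()
slow = load_one_by_one(slow_db, [1, 2])
fast = load_batched(fast_db, [1, 2])
```

Both calls return the same mapping, but `slow_db.queries` grows with the number of students while `fast_db.queries` stays at 1.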

Adding `exception_handlers` for mapping from exceptions to the error responses

Thank you for sharing these best practices!

In our projects, we usually define a set of custom exceptions. These are mostly translated into unified error responses.
Eg:

# exceptions.py
class InvalidInputError(Exception):
    error_code = ErrorCode.INVALID_INPUT
    error_message = "Invalid input error"

# Response
400 BadRequest
{
     "error": {
            "error_code": "INVALID_INPUT",
            "error_message": "Missing required field 'abc' ..."
      }
}

I think it would be great if we could have an exception_handlers.py file to handle the mappings from the exceptions to the corresponding error responses.
Eg:

# exceptions.py
class InvalidInputError(Exception):
    error_code = ErrorCode.INVALID_INPUT
    error_message = "Invalid input error"

# exception_handlers.py
from fastapi import FastAPI, Request, status
from fastapi.encoders import jsonable_encoder
from fastapi.responses import JSONResponse

# ErrorItem and ErrorResponse are the project's own response schemas.


def invalid_input_exception_handler(_: Request, exc: InvalidInputError):
    error = ErrorItem(
        error_code=exc.error_code, error_message=exc.error_message
    )
    return JSONResponse(
        status_code=status.HTTP_400_BAD_REQUEST,
        content=jsonable_encoder(ErrorResponse(error=error)),
    )

def register_error_handlers(app: FastAPI) -> None:
    app.add_exception_handler(InvalidInputError, invalid_input_exception_handler)

# main.py
from exception_handlers import register_error_handlers

...
register_error_handlers(app=app)
...
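
The mapping itself can be prototyped without FastAPI. In this sketch the names `AppError`, `HANDLERS` and `to_response` are hypothetical, error codes are plain strings rather than an `ErrorCode` enum, and a response is just a `(status, body)` tuple.

```python
from typing import Callable, Dict, Tuple, Type

Response = Tuple[int, dict]  # (status code, JSON body)
Handler = Callable[[Exception], Response]


class AppError(Exception):
    error_code = "INTERNAL_ERROR"
    error_message = "Internal error"


class InvalidInputError(AppError):
    error_code = "INVALID_INPUT"
    error_message = "Invalid input error"


def _error_body(exc: AppError) -> dict:
    return {"error": {"error_code": exc.error_code, "error_message": exc.error_message}}


# One registry: exception class -> function building the response.
HANDLERS: Dict[Type[Exception], Handler] = {
    InvalidInputError: lambda exc: (400, _error_body(exc)),
    AppError: lambda exc: (500, _error_body(exc)),
}


def to_response(exc: Exception) -> Response:
    """Pick the handler registered for the most specific matching class."""
    for cls in type(exc).__mro__:
        if cls in HANDLERS:
            return HANDLERS[cls](exc)
    raise exc  # nothing registered: let it propagate
```

Walking the MRO means a new `AppError` subclass gets the generic 500 response until someone registers a more specific handler for it.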

Pydantic 2+

Is it possible to update section 8 to use pydantic 2+? I'm not pro enough to figure out the migration path for that.

Where to place OAuth2 functions

from jose import JWTError, jwt
from datetime import datetime, timedelta
from . import schemas, database, models
from fastapi import Depends, status, HTTPException
from fastapi.security import OAuth2PasswordBearer
from sqlalchemy.orm import Session
from .config import settings

oauth2_scheme = OAuth2PasswordBearer(tokenUrl='login')

# SECRET_KEY
# Algorithm
# Expiration time

SECRET_KEY = settings.secret_key
ALGORITHM = settings.algorithm
ACCESS_TOKEN_EXPIRE_MINUTES = settings.access_token_expire_minutes


def create_access_token(data: dict):
    to_encode = data.copy()

    expire = datetime.utcnow() + timedelta(minutes=ACCESS_TOKEN_EXPIRE_MINUTES)
    to_encode.update({"exp": expire})

    encoded_jwt = jwt.encode(to_encode, SECRET_KEY, algorithm=ALGORITHM)

    return encoded_jwt


def verify_access_token(token: str, credentials_exception):

    try:

        payload = jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
        id: str = payload.get("user_id")
        if id is None:
            raise credentials_exception
        token_data = schemas.TokenData(id=id)
    except JWTError:
        raise credentials_exception

    return token_data


def get_current_user(token: str = Depends(oauth2_scheme), db: Session = Depends(database.get_db)):
    credentials_exception = HTTPException(status_code=status.HTTP_401_UNAUTHORIZED,
                                          detail="Could not validate credentials", headers={"WWW-Authenticate": "Bearer"})

    token = verify_access_token(token, credentials_exception)

    user = db.query(models.User).filter(models.User.id == token.id).first()

    return user

Pydantic2 & double conversions (#22)

Hello,

Love this repo!

I was trying to modify your example from #22 to check if nested models are round-tripped unnecessarily if contained within the dict I return in my endpoint function. Before I could do that, I had to update the example code in #22 to work with pydantic 2.0 (fastapi==0.100.1, pydantic==2.4.2). I first changed the root_validator to model_validator:

    @model_validator(mode="before")
    @classmethod
    def debug_usage(cls, data: dict):
        print("created pydantic model")

        return data

and when I run the app and hit that endpoint, I see "created pydantic model" once, and do not get "called dict" logged at all.

The dict method is deprecated in favor of model_dump, but if I also override model_dump and model_dump_json:

    def model_dump(self, *args, **kwargs):
        print("called model_dump")
        return super().model_dump(*args, **kwargs)

    def model_dump_json(self, *args, **kwargs):
        print("called model_dump_json")
        return super().model_dump_json(*args, **kwargs)

I don't get any of those "called ..." messages printed. If I use jsonable_encoder on a model in a terminal, I can see it uses model_dump_json, but FastAPI doesn't seem to use any of these!

So my questions are:

  1. in recent versions of pydantic and fastapi, what happens if I return an object whose type matches response_model?
  2. is this double-encoding problem still a problem? That I only see the object created once makes me think it isn't, but since I can't replicate the full example I'm a bit unsure what's going on.

Where to place CRUD operations?

First of all - thanks for this beautiful repo!
I noticed that there is no mention of where to store any crud operations in the project structure.
I wonder how you would implement these?
Some implementations I've considered.

  1. No separate crud operations - would lead to duplicated code.
  2. Following @tiangolo's full stack example structure, there is a separate crud folder. I could create a crud folder for each module.
  3. Another implementation I've seen is to declare the crud operations directly in the model's Base class. E.g.:
@classmethod
async def create(cls, **kwargs):
    obj = cls(**kwargs)
    db.add(obj)
    try:
        await db.commit()
    except Exception:
        await db.rollback()
        raise
    return obj

However, this way you don't get any autocompletion.

So far, we're using a separate crud folder for each module, but I would love to hear any other recommendations.
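
If the missing autocompletion concerns the object returned by `create`, annotating `cls` with a bound `TypeVar` may help. The sketch below is an assumption-laden illustration: `CRUDMixin` and `Post` are hypothetical, and an in-memory list replaces the database session.

```python
from dataclasses import dataclass
from typing import Any, ClassVar, List, Type, TypeVar

T = TypeVar("T", bound="CRUDMixin")


class CRUDMixin:
    # Hypothetical in-memory stand-in for the database session.
    _rows: ClassVar[List[Any]] = []

    @classmethod
    def create(cls: Type[T], **kwargs: Any) -> T:
        """Build and store an instance; the return type follows the subclass."""
        obj = cls(**kwargs)
        CRUDMixin._rows.append(obj)
        return obj


@dataclass
class Post(CRUDMixin):
    title: str


post = Post.create(title="hello")  # type checkers infer `Post`, not `CRUDMixin`
```

The keyword arguments stay untyped with this approach, so it only addresses completion on the returned object.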

Idea: filename structure

Filename suggestion

Filenames inside an app/module could carry the module name as a prefix, for example: {module}_service.py.
At some point it would be nice to create a generator for these, like Django's startapp command:
django-admin startapp auth.

Reason

We are already importing them as from src.auth import constants as auth_constants. It is easier to just change the filename, so we don't have to keep track of aliases and it doesn't create any conflicts.

PR - #12


Potential Improvements

First of all - it's an amazing set of best practices. I also want to share some things that I use:

Project Structure

src could be added to PYTHONPATH to avoid prefixing every app import with src. IDEs like PyCharm also support that.

Models could also be stored in the same package. That makes it easier to import your models and to make sure all model modules are executed when you generate your migrations:

src/
  db/
    models/
      __init__.py
      comments.py
      posts.py
      tags.py
    base.py
    dependencies.py

# __init__.py
from .comments import Comment
from .posts import Post
from .tags import Tag

__all__ = ["Comment", "Post", "Tag"]

# Anywhere in the code
from db.models import Post

Continuous Integration

Absolutely use CI in gitlab/github to automate your tests and linters!

Dependency Management

Use poetry instead of requirements.txt, it's awesome!

Custom base model from day 0

This could also be used to enable orm_mode and set up a custom alias_generator if the client (for example a JS app) requires it.

Use Pydantic BaseSettings instead of Starlette's Config object!

Pydantic has its own class to manage environment variables:

from pydantic import BaseSettings  # pydantic v1; in v2 it lives in pydantic_settings


class AppSettings(BaseSettings):
    class Config:
        env_prefix = "app_"

    domain: str

Need for a template

IMO, there is a need for a starter template to get accustomed to these practices or for better reference. Not compulsory, just my personal opinion.

Initialization of global clients

@zhanymkanov thanks for the write up. This article is very useful for me!

In our work, we often deal with a large number of integrations with third-party systems, and for this we need to create global clients. What is the most correct way to do this, in your opinion?
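
One common pattern for this is to create each client once at application startup and close it at shutdown. The sketch below is a minimal stdlib-only illustration under stated assumptions: PaymentsClient, its URL, the clients registry, and get_payments_client are all hypothetical names, and the real client would wrap an HTTP session or SDK object.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, Dict


class PaymentsClient:
    """Hypothetical third-party client; a real one would wrap an HTTP session."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url
        self.is_open = False

    async def connect(self) -> None:
        self.is_open = True

    async def close(self) -> None:
        self.is_open = False


clients: Dict[str, PaymentsClient] = {}  # module-level registry


@asynccontextmanager
async def lifespan(app: object) -> AsyncIterator[None]:
    # Startup: create each client once and share it across requests.
    client = PaymentsClient("https://payments.example.com")
    await client.connect()
    clients["payments"] = client
    try:
        yield
    finally:
        # Shutdown: release connections even if the app exits with an error.
        await client.close()
        clients.clear()


def get_payments_client() -> PaymentsClient:
    """Accessor that routes could receive through a dependency."""
    return clients["payments"]


async def demo() -> bool:
    # Simulates the application running between startup and shutdown.
    async with lifespan(app=None):
        return get_payments_client().is_open


opened_during_lifespan = asyncio.run(demo())
```

In FastAPI, an async context manager of this shape can be passed as FastAPI(lifespan=lifespan), and the accessor can be injected with Depends(get_payments_client), which also makes the client easy to override in tests.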

Could you recommend an actual project that employs this project structure?

I'm eager to explore the overall arrangement in practical application. Primarily, I'm interested in examining the logical separation of dependencies and services. For instance, while working on the login functionality, I placed JWT within utils and encapsulated user-related operations within a service class, such as get_current_user. However, later on, I realized that this might be categorized as dependencies. I'm familiar with Django but not particularly well-versed in FastAPI's dependencies.

thx

No module named 'src'

I am following the folder structure and importing from other folders like this:

from src.dashboard.schemas import Audio

But I get the error below, and if I remove the src prefix I get a circular import:
ModuleNotFoundError: No module named 'src'
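
A likely cause of this error is that the directory that contains src/ is not on Python's import path, for example because the server was started from inside src/. The sketch below reproduces the layout in a temporary directory and shows the import resolving once the project root is on sys.path. The file contents are made up for the demonstration; only the module path comes from the question.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway project: <root>/src/dashboard/schemas.py
root = Path(tempfile.mkdtemp())
package = root / "src" / "dashboard"
package.mkdir(parents=True)
(root / "src" / "__init__.py").write_text("")
(package / "__init__.py").write_text("")
(package / "schemas.py").write_text("class Audio:\n    pass\n")

# `import src...` resolves only if the directory that CONTAINS src/
# (the project root) is on sys.path, not src/ itself.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

schemas = importlib.import_module("src.dashboard.schemas")
Audio = schemas.Audio
```

In a real project the equivalent is usually to start the server from the project root (the directory that contains src/), or to set PYTHONPATH to that directory, rather than editing sys.path in code.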

How to start the project

How should I start the project? The docs say that the main.py file will run the app, but what should main.py contain, and what should the run command point to?

Example of how I did it in a recent course:
uvicorn app:app --reload --host 0.0.0.0 --port 7070

app.py content:
    import fastapi as _fastapi
    import fastapi.security as _security
    import sqlalchemy.orm as _orm
    import schemas as _schemas
    import services as _services
    from typing import List
    from fastapi.middleware.cors import CORSMiddleware

    app = _fastapi.FastAPI()

    app.add_middleware(
        CORSMiddleware,
        allow_origins=["*"],
        allow_credentials=True,
        allow_methods=["*"],
        allow_headers=["*"],
    )

    @app.post("/api/v1/users")
    async def register_user(user: _schemas.UserRequest, db: _orm.Session = _fastapi.Depends(_services.get_db)):
        db_user = await _services.get_user_by_email(email=user.email, db=db)
        if db_user:
            raise _fastapi.HTTPException(status_code=400, detail="Email already exists")
        # create user and return token
        db_user = await _services.create_user(user=user, db=db)
        return await _services.create_token(user=db_user)

    @app.post("/api/v1/login")
    async def login_user(
        form_data: _security.OAuth2PasswordRequestForm = _fastapi.Depends(),
        db: _orm.Session = _fastapi.Depends(_services.get_db),
    ):
        db_user = await _services.login(email=form_data.username, password=form_data.password, db=db)
        if not db_user:
            raise _fastapi.HTTPException(status_code=401, detail="Wrong login credentials")
        return await _services.create_token(db_user)

In this case the command points to app.py, which is linked to services. So the question is how to structure main.py in fastapi-best-practices: where should I point to?

Where should this code (@app.post("/api/v1/users") and @app.post("/api/v1/login")) live, and what should the file be named according to fastapi-best-practices?

Why does Dispatch use src/dispatch/ as the highest level of the app?

In this repo the highest level of the app is src/:

fast-api-project/
└── src/
    ├── domain1
    └── domain2

But in the Dispatch repo mentioned in the description, the highest level of the app is src/dispatch/:

dispatch/
└── src/
    └── dispatch/
        ├── domain1
        └── domain2

Is the dispatch folder inside src also considered a domain?
Which structure should I choose?

Questions about 3. Use dependencies for data validation vs DB

async def valid_post_id(post_id: UUID4) -> Mapping:
    post = await service.get_by_id(post_id)
    if not post:
        raise PostNotFound()

    return post

How should the service be written? My question is how to maintain the session when you don't pass the session into dependencies. Is there a way to put the session and the validated data into the same dependency?

Better sqla core

Since you are using SQLAlchemy Core to make raw SQL queries, you might like a more maintainable approach to declaring and using tables and columns. ORMs aren't very flexible, but they are a huge help for type-driven development. For example, using obj.column is more maintainable than using table.c.column, since the latter is dynamic and lacks type hints.

We can take the middle ground by re-declaring the columns like column = table.c.column, and using the redeclared version everywhere. But that will add a lot of boilerplate code.

One way to avoid boilerplate code is using a factory class like the one in https://github.com/sayanarijit/sqla-fancy-core, which lets us declare columns almost the same way we declare them in ORMs, but without the ORM magic.

We can also subclass the factory to create custom column type like:

import sqlalchemy as sa
from sqla_fancy_core import TableFactory as _TableFactory

class TableFactory(_TableFactory):
    def col(self, *args, nullable=False, **kwargs):
        kwargs["nullable"] = nullable
        return super().col(*args, **kwargs)

    def foreign_key(
        self,
        name: str,
        ref: str | sa.Column,
        *args,
        onupdate="CASCADE",
        ondelete="CASCADE",
        **kwargs
    ):
        fk = sa.ForeignKey(ref, onupdate=onupdate, ondelete=ondelete)
        return self.col(name, fk, *args, **kwargs)

    def name(self, *args, **kwargs):
        return self.string("name", *args, **kwargs)

    def slug(self, *args, **kwargs):
        return self.text("slug", *args, **kwargs)

What are some best practices for Fastapi + Mongodb

Almost all of us here have used some ORM library like SQLAlchemy, which is equivalent to how Django does things. I find myself using MongoDB as the main database for most of my projects. Also, in all the company projects that I've been working on since this year, I have not used any SQL database. I usually use pymongo, and it's always a hassle to configure everything for every new project. What I've done is create a DBConnection class that has all the collections I need as class attributes, and possibly some methods that do some DB operations.

The problem

As you have already figured out, it's nothing close to using an ORM. Are there libraries that provide an ODM I can use in FastAPI, the way SQLAlchemy is used for SQL? I mean relationships and, most importantly, still being able to use my favourite MongoDB operation, aggregate searches.

Response Handling

I am planning to use Response Structure like this (https://google.github.io/styleguide/jsoncstyleguide.xml?showone=error#error)

{
  "error": {
    "code": 404,
    "message": "File Not Found",
    "errors": [{
      "domain": "Calendar",
      "reason": "ResourceNotFoundException",
      "message": "File Not Found"
    }]
  }
}

Can anyone suggest the best way to handle this?
I handled it for FastAPI's RequestValidationError, but for other exceptions I am not sure how to handle it.

For RequestValidationError

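# Imports the handler needs; `app` is the FastAPI application instance.
from fastapi import Request, status
from fastapi.encoders import jsonable_encoder
from fastapi.exceptions import RequestValidationError
from fastapi.responses import JSONResponse

@app.exception_handler(RequestValidationError)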
async def validation_exception_handler(request: Request, exc: RequestValidationError):
    error_response = {
        "error": {
            "code": status.HTTP_422_UNPROCESSABLE_ENTITY,
            "message": "Request Validation Failed",
            "errors": exc.errors(),
        }
    }
    return JSONResponse(
        content=jsonable_encoder(error_response),
        status_code=status.HTTP_422_UNPROCESSABLE_ENTITY,
    )
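
For the other exceptions, one possible approach is to give the application's own errors a common base class that knows how to render itself into the same envelope. This is a stdlib-only sketch, not an established convention of this repo: build_error_response, AppError, and FileNotFound are hypothetical names chosen for illustration.

```python
from typing import Any, Dict, List, Optional


def build_error_response(
    code: int,
    message: str,
    errors: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Return the {"error": {...}} envelope from the JSON example above."""
    return {"error": {"code": code, "message": message, "errors": errors or []}}


class AppError(Exception):
    """Hypothetical base class for domain errors raised by the application."""

    status_code = 500
    domain = "App"

    def to_response(self) -> Dict[str, Any]:
        # Fall back to the class name when the exception carries no message.
        message = str(self) or self.__class__.__name__
        detail = {
            "domain": self.domain,
            "reason": self.__class__.__name__,
            "message": message,
        }
        return build_error_response(self.status_code, message, [detail])


class FileNotFound(AppError):
    status_code = 404
    domain = "Calendar"


body = FileNotFound("File Not Found").to_response()
```

A single handler registered with @app.exception_handler(AppError) could then return JSONResponse(content=exc.to_response(), status_code=exc.status_code), mirroring the RequestValidationError handler, so every domain error shares one code path.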

API Rate Limits

Really like this work, guys! I want to limit my API to, say, 10 requests per minute for each IP. Do you have any best practices for achieving this in FastAPI?
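
To make the requirement concrete, here is a minimal sliding-window limiter keyed by client IP, using only the standard library. It is a sketch under assumptions: SlidingWindowLimiter is a hypothetical helper, the state lives in process memory (so it only works with a single worker), and the injectable clock exists purely to make the example deterministic.

```python
import time
from collections import defaultdict, deque
from typing import Callable, DefaultDict, Deque


class SlidingWindowLimiter:
    """Allow at most `limit` hits per client within the last `window_seconds`."""

    def __init__(
        self,
        limit: int = 10,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self._hits: DefaultDict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str) -> bool:
        now = self.clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Deterministic demo with a fake clock instead of real time.
fake_now = [0.0]
limiter = SlidingWindowLimiter(limit=10, window_seconds=60.0, clock=lambda: fake_now[0])
first_ten = [limiter.allow("203.0.113.7") for _ in range(10)]
eleventh = limiter.allow("203.0.113.7")    # over the limit for this IP
other_ip = limiter.allow("198.51.100.2")   # separate budget per IP
fake_now[0] = 61.0
after_window = limiter.allow("203.0.113.7")  # old hits have expired
```

In FastAPI, a check like this could run in a dependency that reads request.client.host and raises HTTPException(status_code=429) when allow returns False. With several workers or instances, the counters would need to live in a shared store such as Redis, or the limit could be enforced at the reverse proxy.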

Use `run_in_threadpool` or not?

Thank you again for this useful repo.
I have a question regarding your tip 23. I wonder what the difference is between using a def function and using run_in_threadpool in an async function. They work the same way, to my knowledge. My use case is that I'm using google_storage_python to download a file from GCS, but it's a blocking I/O library.
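
For context: FastAPI runs a plain def route entirely in a worker thread, while run_in_threadpool offloads one blocking call from inside an async def route, which matters when the rest of the route awaits other things. The sketch below illustrates the second case with the standard library's asyncio.to_thread as a stand-in for run_in_threadpool (an assumption made to keep the example dependency-free); download_blob is a hypothetical blocking function, not a real SDK call.

```python
import asyncio
import threading
import time


def download_blob(name: str) -> str:
    """Hypothetical blocking call, standing in for a sync SDK download."""
    time.sleep(0.05)  # blocks only the thread it runs in
    on_main = threading.current_thread() is threading.main_thread()
    return f"{name}:{on_main}"


async def handler() -> list:
    # Offload each blocking call so the event loop stays free for other
    # requests; both downloads run in worker threads, not the main thread.
    return await asyncio.gather(
        asyncio.to_thread(download_blob, "a.bin"),
        asyncio.to_thread(download_blob, "b.bin"),
    )


results = asyncio.run(handler())
```

If the route does nothing but the blocking download, a plain def route is the simpler choice. Offloading individual calls is mainly useful when the same route also needs to await async code before or after the blocking part.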

Debugging

I'm fairly new to both Python and FastAPI, so I might be missing an obvious solution here. I joined a team that's already working with said stack and they've set up a dev Docker image that runs the app through uvicorn. I see a similar setup in your repo.

Coming from Java, I'd prefer to run and debug the application without the need to run it inside a container (and also setting up remote debugging looks like a pain). Why is there a preference to run it in a container? I tried following FastAPI's suggested setup, but I quickly ran into some module import issues (probably due to my lack of deep Python understanding).

Any guidance on the reason for the design choice and possible best practices would be welcome :)
