Architecture

This document describes the architecture and design decisions behind Codex.

Overview

Codex is built with Rust for performance and safety. It follows a modular architecture that separates concerns and enables horizontal scaling.

Core Principles

Stateless Design

Codex is designed to be stateless:

No server-side sessions
All state stored in database
JWT tokens for authentication
Perfect for horizontal scaling

Database Agnostic

Supports multiple database backends:

SQLite - Simple, embedded, perfect for small deployments
PostgreSQL - Production-grade, supports complex queries and scaling

Modular Parser System

Format parsers are pluggable and isolated:

Each format has its own parser module
Easy to add new formats
Failures in one parser don't affect others

System Architecture

┌─────────────────────────────────────────┐
│           HTTP API Layer                │
│         (Axum Web Framework)           │
└──────────────┬──────────────────────────┘
               │
┌──────────────▼──────────────────────────┐
│         Request Handlers                │
│  (Libraries, Books, Series, Users)     │
└──────────────┬──────────────────────────┘
               │
┌──────────────▼──────────────────────────┐
│      Repository Layer                    │
│   (Database Abstraction)                │
└──────────────┬──────────────────────────┘
               │
┌──────────────▼──────────────────────────┐
│      Database Layer                     │
│   (SeaORM - SQLite/PostgreSQL)          │
└─────────────────────────────────────────┘

┌─────────────────────────────────────────┐
│         Scanner System                  │
│  (File Detection & Analysis)            │
└──────────────┬──────────────────────────┘
               │
┌──────────────▼──────────────────────────┐
│         Parser System                    │
│  (CBZ, CBR, EPUB, PDF)                  │
└─────────────────────────────────────────┘

Component Overview

API Layer

Built with Axum, a modern Rust web framework:

Type-safe routing - Compile-time route validation
Middleware support - Authentication, CORS, logging
Async/await - Non-blocking I/O
OpenAPI/Scalar - Auto-generated API docs

Database Layer

Uses SeaORM for database operations:

Type-safe queries - Compile-time SQL validation
Migration system - Version-controlled schema changes
Multi-database support - SQLite and PostgreSQL
Connection pooling - Efficient resource usage

Parser System

Modular parser architecture:

parsers/
├── mod.rs           # Parser registry
├── traits.rs        # Parser trait definition
├── cbz/            # CBZ parser
├── cbr/            # CBR parser (optional)
├── epub/           # EPUB parser
├── pdf/            # PDF parser
└── metadata.rs     # Metadata extraction

Each parser implements the Parser trait:

pub trait Parser: Send + Sync {
    fn detect(&self, path: &Path) -> Result<bool>;
    fn parse(&self, path: &Path) -> Result<ParsedBook>;
    fn extract_pages(&self, path: &Path) -> Result<Vec<Page>>;
}

Scanner System

The scanner discovers and analyzes files:

Detector - Identifies file types
Analyzer - Extracts metadata and structure
Repository - Stores results in database

Authentication System

JWT-based authentication:

Argon2 password hashing
JWT tokens for stateless auth
Permission system - Role-based access control
API keys - For programmatic access

Data Models

Core Entities

Library
  ├── Series
  │     └── Book
  │           └── Page
  │
  └── User
        └── API Key

Relationships

Library → Series (one-to-many)
Series → Book (one-to-many)
Book → Page (one-to-many)
User → API Key (one-to-many)

Metadata Sources

Books can have metadata from multiple sources:

ComicInfo.xml - Embedded metadata
Filename parsing - Extracted from file names
Database - User-provided metadata
External APIs - Future: integration with metadata providers

Scalability

Horizontal Scaling

Codex is designed for horizontal scaling:

Stateless servers - Any instance can handle any request
Shared database - All instances share the same database
Load balancing - Use any load balancer (Nginx, HAProxy, etc.)
Kubernetes ready - Deploy multiple replicas

Database Scaling

SQLite:

Single-file database
Good for small to medium deployments
Limited concurrent writes

PostgreSQL:

Supports read replicas
Connection pooling
Can handle high concurrency
Supports sharding (future)

Caching Strategy

Current:

Database query caching (via SeaORM)
File system caching for extracted pages

Planned:

Redis for distributed caching
CDN integration for media files
In-memory metadata cache

Security

Authentication

Password hashing - Argon2 with configurable cost
JWT tokens - Stateless, signed tokens
Token expiration - Configurable expiry
API keys - For programmatic access

Authorization

Permission system - Fine-grained permissions
Role-based access - Admin, user, read-only roles
Resource-level permissions - Per-library access control

Input Validation

Type-safe parsing - Rust's type system
Request validation - Validate all inputs
SQL injection protection - Parameterized queries via SeaORM
Path traversal protection - Validate file paths

Error Handling

Error Types

Codex uses a hierarchical error system:

pub enum CodexError {
    Database(DatabaseError),
    Parser(ParserError),
    Authentication(AuthError),
    Validation(ValidationError),
    // ...
}

Error Responses

All API errors follow a consistent format:

{
  "error": "ErrorType",
  "message": "Human-readable message",
  "details": {}
}

Logging

Structured logging with tracing:

Log levels - error, warn, info, debug, trace
Structured fields - JSON-formatted logs
File and console - Configurable outputs
Request tracing - Track requests across services

Testing

Test Structure

tests/
├── api/              # API integration tests
├── db/               # Database tests
├── parsers/          # Parser tests
└── scanner/          # Scanner tests

Test Databases

SQLite - Fast, in-memory for unit tests
PostgreSQL - Integration tests with real database

Performance Optimizations

Database

Connection pooling - Reuse database connections
Prepared statements - Via SeaORM
Indexes - Optimized database indexes
Query optimization - Efficient queries

File Processing

Parallel processing - Process multiple files concurrently
Lazy extraction - Extract pages on-demand
Caching - Cache extracted metadata
Streaming - Stream large files

Memory Management

Zero-copy where possible - Rust's ownership system
Efficient data structures - Minimal allocations
Resource cleanup - Proper cleanup of file handles

Future Enhancements

Planned Features

Full-text search - Search book content
Thumbnail generation - Automatic thumbnails
Webhook support - Event notifications
Plugin system - Extensible architecture
Metadata providers - Integration with external APIs
Multi-language support - i18n for UI

Performance Improvements

Async file I/O - Non-blocking file operations
Distributed caching - Redis integration
CDN support - Serve media from CDN
Background jobs - Async task processing

Technology Stack

Core

Rust - Systems programming language
Tokio - Async runtime
Axum - Web framework
SeaORM - ORM framework

Database

SQLite - Embedded database
PostgreSQL - Production database
SeaORM - Database abstraction

Parsing

zip - ZIP archive support
unrar - RAR archive support (optional)
lopdf - PDF parsing
quick-xml - XML/EPUB parsing
image - Image processing

Security

argon2 - Password hashing
jsonwebtoken - JWT tokens
sha2 - File hashing

Overview​

Core Principles​

Stateless Design​

Database Agnostic​

Modular Parser System​

System Architecture​

Component Overview​

API Layer​

Database Layer​

Parser System​

Scanner System​

Authentication System​

Data Models​

Core Entities​

Relationships​

Metadata Sources​

Scalability​

Horizontal Scaling​

Database Scaling​

Caching Strategy​

Security​

Authentication​

Authorization​

Input Validation​

Error Handling​

Error Types​

Error Responses​

Logging​

Testing​

Test Structure​

Test Databases​

Performance Optimizations​

Database​

File Processing​

Memory Management​

Future Enhancements​

Planned Features​

Performance Improvements​

Technology Stack​

Core​

Database​

Parsing​

Security​

Next Steps​

Overview

Core Principles

Stateless Design

Database Agnostic

Modular Parser System

System Architecture

Component Overview

API Layer

Database Layer

Parser System

Scanner System

Authentication System

Data Models

Core Entities

Relationships

Metadata Sources

Scalability

Horizontal Scaling

Database Scaling

Caching Strategy

Security

Authentication

Authorization

Input Validation

Error Handling

Error Types

Error Responses

Logging

Testing

Test Structure

Test Databases

Performance Optimizations

Database

File Processing

Memory Management

Future Enhancements

Planned Features

Performance Improvements

Technology Stack

Core

Database

Parsing

Security

Next Steps