Code Style Guide

This guide defines the coding standards and conventions for the CalcBridge codebase. Consistent code style improves readability, reduces cognitive load during code review, and helps maintain a professional codebase.


Python Code Style

Formatter and Linter

CalcBridge uses Ruff for both linting and formatting. Ruff combines the functionality of multiple tools (Black, isort, Flake8) into a single, fast tool.

# Check for style issues
ruff check src/ tests/

# Auto-fix fixable issues
ruff check src/ tests/ --fix

# Format code
ruff format src/ tests/

Configuration

Ruff is configured in pyproject.toml:

pyproject.toml
[tool.ruff]
line-length = 120
target-version = "py311"

[tool.ruff.lint]
select = [
    "E",    # pycodestyle errors
    "W",    # pycodestyle warnings
    "F",    # Pyflakes
    "I",    # isort
    "B",    # flake8-bugbear
    "C4",   # flake8-comprehensions
    "UP",   # pyupgrade
    "SIM",  # flake8-simplify
]
ignore = [
    "E501",  # Line too long (handled by formatter)
]

[tool.ruff.lint.isort]
known-first-party = ["src"]

Line Length

  • Maximum line length: 120 characters
  • Use line continuation for longer expressions

# Good: Break long function calls across lines
result = calculate_weighted_average(
    values=portfolio_values,
    weights=position_weights,
    skip_nulls=True,
    decimal_places=4,
)

# Good: Break long strings with implicit concatenation
error_message = (
    f"Validation failed for workbook '{workbook_id}': "
    f"sheet '{sheet_name}' contains {error_count} errors"
)

Type Hints

Requirements

Type hints are required for all public functions and methods. They improve code documentation and enable static analysis.

# Good: Full type annotations
def calculate_cushion(
    current_value: float,
    threshold: float,
    direction: Literal["above", "below"] = "below",
) -> float:
    """Calculate the cushion between current value and threshold."""
    if direction == "below":
        return threshold - current_value
    return current_value - threshold


# Good: Complex types with Union and Optional
def get_cell_value(
    sheet_data: dict[str, Any],
    cell_ref: str,
    default: str | int | float | None = None,
) -> str | int | float | None:
    """Get a cell value from sheet data with optional default."""
    return sheet_data.get(cell_ref, default)

Type Import Style

Use Python 3.11+ native type syntax (no from typing import for basic types):

# Good: Python 3.11+ style
def process_data(
    items: list[dict[str, Any]],
    filters: dict[str, str] | None = None,
) -> list[str]:
    ...

# Avoid: Legacy typing aliases for built-in containers (reserve the typing
# module for constructs that still need it, e.g. Any, Literal, TypeAlias)
from typing import List, Dict, Optional  # Use list, dict, and X | None instead

Complex Types

For complex type definitions, create type aliases:

from typing import TypeAlias

# Type aliases for clarity
CellValue: TypeAlias = str | int | float | bool | None
SheetData: TypeAlias = dict[str, dict[str, CellValue]]
FormulaResult: TypeAlias = tuple[CellValue, str, str | None]  # (value, type, error)
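
Aliases like these can then be used directly in signatures. The function below is a hypothetical illustration of that usage, not code from the CalcBridge codebase:

def get_sheet_cell(sheet_data: SheetData, sheet: str, cell_ref: str) -> CellValue:
    """Look up a single cell, returning None when the sheet or cell is missing."""
    # Missing sheets fall back to an empty dict; missing cells fall back to None,
    # which is itself a valid CellValue.
    return sheet_data.get(sheet, {}).get(cell_ref)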

Naming Conventions

General Rules

Type        Convention           Example
Modules     snake_case           formula_parser.py
Classes     PascalCase           WorkbookRepository
Functions   snake_case           calculate_weighted_average
Variables   snake_case           tenant_id
Constants   UPPER_SNAKE_CASE     MAX_FILE_SIZE_MB
Private     Leading underscore   _internal_helper

Naming Guidelines

# Classes: Noun or noun phrase
class WorkbookParser:
    ...

class ComplianceTestRunner:
    ...

# Functions: Verb or verb phrase
def parse_formula(formula_text: str) -> AST:
    ...

def validate_workbook(workbook_id: str) -> ValidationResult:
    ...

# Boolean variables: Use is_, has_, can_, should_
is_valid = True
has_errors = False
can_edit = user.role in ["admin", "owner"]
should_retry = attempt < max_retries

# Collections: Use plural names
workbook_ids: list[str] = []
sheets: dict[str, SheetData] = {}

Import Organization

Imports are organized in three groups, separated by blank lines:

  1. Standard library imports
  2. Third-party imports
  3. Local application imports

# Standard library
import asyncio
import logging
from datetime import datetime, timezone
from typing import Any

# Third-party
import numpy as np
import pandas as pd
from fastapi import APIRouter, Depends, HTTPException
from pydantic import BaseModel, Field

# Local application
from src.core.config import get_settings
from src.db.repositories.workbook import WorkbookRepository
from src.calculations.parser import parse_formula

Ruff handles import sorting automatically with ruff check --fix.


Documentation Standards

Docstrings

Use Google-style docstrings for all public functions, classes, and modules:

def evaluate_formula(
    formula: str,
    context: dict[str, Any],
    sheet_name: str | None = None,
) -> tuple[Any, str, str | None]:
    """Evaluate an Excel formula with the given context.

    Parses the formula into an AST and evaluates it using the provided
    cell values and sheet data as context.

    Args:
        formula: The Excel formula to evaluate (e.g., "=SUM(A1:A10)").
        context: Dictionary mapping cell references to their values.
        sheet_name: Optional sheet name for cross-sheet references.

    Returns:
        A tuple of (result, result_type, error):
            - result: The computed value (number, string, or boolean)
            - result_type: Type classification ("number", "text", "boolean", "error")
            - error: Error message if evaluation failed, None otherwise

    Raises:
        FormulaParseError: If the formula syntax is invalid.
        CircularReferenceError: If the formula contains circular references.

    Example:
        >>> result, rtype, error = evaluate_formula(
        ...     "=SUM(A1, A2)",
        ...     {"A1": 10, "A2": 20}
        ... )
        >>> print(result)
        30
    """

Module Docstrings

Every module should have a docstring explaining its purpose:

"""Vectorized calculation functions for Excel formula translation.

This module provides numpy/pandas-based vectorized implementations
of common Excel functions (IF, XLOOKUP, SUMIFS, etc.) optimized
for processing large financial datasets.

Performance Note:
    Vectorized operations are typically 100x faster than iterating
    with df.iterrows() or df.apply().
"""

Comments

# Good: Explain WHY, not WHAT
# Use vectorized operations to avoid O(n^2) complexity
# when processing large portfolios (10k+ positions)
df["weighted"] = df["value"] / total_value * df["weight"]

# Avoid: Obvious comments
# Loop through items
for item in items:
    ...

Code Patterns

Error Handling

# Good: Specific exceptions with context
from src.api.exceptions import NotFoundError, ValidationError

async def get_workbook(workbook_id: str) -> Workbook:
    workbook = await repo.get_by_id(workbook_id)
    if not workbook:
        raise NotFoundError(
            resource_type="Workbook",
            resource_id=workbook_id,
        )
    return workbook


# Good: Use try/except for expected errors
try:
    result = parse_formula(formula_text)
except FormulaParseError as e:
    logger.warning(f"Invalid formula: {e}")
    return {"error": str(e), "valid": False}

Async Patterns

# Good: Use async for I/O operations
async def fetch_workbook_data(workbook_id: str) -> dict:
    async with get_db_session() as session:
        workbook = await repo.get_by_id(session, workbook_id)
        sheets = await repo.get_sheets(session, workbook_id)
        return {"workbook": workbook, "sheets": sheets}


# Good: Parallel async operations
async def get_compliance_data(workbook_ids: list[str]) -> list[dict]:
    tasks = [get_workbook_compliance(wid) for wid in workbook_ids]
    return await asyncio.gather(*tasks)

Repository Pattern

# Good: Repository encapsulates data access
class WorkbookRepository:
    def __init__(self, session: AsyncSession):
        self.session = session

    async def get_by_id(self, workbook_id: str) -> Workbook | None:
        stmt = select(Workbook).where(Workbook.id == workbook_id)
        result = await self.session.execute(stmt)
        return result.scalar_one_or_none()

    async def create(self, data: WorkbookCreate) -> Workbook:
        workbook = Workbook(**data.model_dump())
        self.session.add(workbook)
        await self.session.flush()
        return workbook

Frontend Style (TypeScript)

ESLint and Prettier

The frontend uses ESLint and Prettier for consistent code style:

cd frontend

# Lint check
npm run lint

# Format code
npm run format
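
These npm scripts are assumed to wrap the standard ESLint and Prettier CLIs; the exact script definitions live in the frontend's package.json. Invoked directly, the equivalent commands would look roughly like:

# Assumed direct equivalents of the npm scripts above
npx eslint .
npx prettier --write .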

TypeScript Guidelines

// Good: Explicit types for function parameters and returns
interface WorkbookSummary {
  id: string;
  name: string;
  sheetCount: number;
  lastModified: Date;
}

async function fetchWorkbooks(tenantId: string): Promise<WorkbookSummary[]> {
  const response = await fetch(`/api/v1/workbooks?tenant=${tenantId}`);
  return response.json();
}

// Good: Use const for values that don't change
const MAX_UPLOAD_SIZE = 100 * 1024 * 1024; // 100MB

// Good: Prefer interfaces over type aliases for object shapes
interface ComplianceTest {
  id: string;
  name: string;
  threshold: number;
  operator: 'lt' | 'gt' | 'eq' | 'lte' | 'gte';
}

Pre-commit Checks

Install pre-commit hooks to catch issues before committing:

# Install pre-commit
pip install pre-commit

# Install hooks
pre-commit install

# Run manually on all files
pre-commit run --all-files

The .pre-commit-config.yaml includes:

  • Ruff (lint and format)
  • Type checking with mypy
  • YAML and JSON validation
  • Trailing whitespace removal
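
A representative .pre-commit-config.yaml covering these hooks is sketched below; the rev values are placeholders, and the actual file in the repository is authoritative:

repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.4  # placeholder revision
    hooks:
      - id: ruff
        args: [--fix]
      - id: ruff-format
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.10.0  # placeholder revision
    hooks:
      - id: mypy
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0  # placeholder revision
    hooks:
      - id: check-yaml
      - id: check-json
      - id: trailing-whitespace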

Summary Checklist

Before submitting code, verify:

  • Code passes ruff check with no errors
  • Code is formatted with ruff format
  • All public functions have type hints
  • All public functions have docstrings
  • Variable names are descriptive and follow conventions
  • Imports are organized correctly
  • No hardcoded secrets or credentials
  • Tests pass locally
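
The automated parts of this checklist can be run in one pass. The commands below assume the paths used earlier in this guide and that the test suite runs under pytest:

# Lint, formatting, and type checks (paths as used elsewhere in this guide)
ruff check src/ tests/
ruff format --check src/ tests/
mypy src/

# Run the test suite (assumes pytest)
pytest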