Code Style Guide

This guide defines the coding standards and conventions for the CalcBridge codebase. Consistent code style improves readability, reduces cognitive load during code review, and helps maintain a professional codebase.


Python Code Style

Formatter and Linter

CalcBridge uses Ruff for both linting and formatting. Ruff combines the functionality of multiple tools (Black, isort, Flake8) into a single, fast tool.

# Check for style issues
ruff check src/ tests/

# Auto-fix fixable issues
ruff check src/ tests/ --fix

# Format code
ruff format src/ tests/

Configuration

Ruff is configured in pyproject.toml:

pyproject.toml
[tool.ruff]
line-length = 120
target-version = "py311"

[tool.ruff.lint]
select = [
    "E",    # pycodestyle errors
    "W",    # pycodestyle warnings
    "F",    # Pyflakes
    "I",    # isort
    "B",    # flake8-bugbear
    "C4",   # flake8-comprehensions
    "UP",   # pyupgrade
    "SIM",  # flake8-simplify
]
ignore = [
    "E501",  # Line too long (handled by formatter)
]

[tool.ruff.lint.isort]
known-first-party = ["src"]

Line Length

  • Maximum line length: 120 characters
  • Use line continuation for longer expressions

# Good: Break long function calls across lines
result = calculate_weighted_average(
    values=portfolio_values,
    weights=position_weights,
    skip_nulls=True,
    decimal_places=4,
)

# Good: Break long strings with implicit concatenation
error_message = (
    f"Validation failed for workbook '{workbook_id}': "
    f"sheet '{sheet_name}' contains {error_count} errors"
)

Type Hints

Requirements

Type hints are required for all public functions and methods. They improve code documentation and enable static analysis.

# Good: Full type annotations
def calculate_cushion(
    current_value: float,
    threshold: float,
    direction: Literal["above", "below"] = "below",
) -> float:
    """Calculate the cushion between current value and threshold."""
    if direction == "below":
        return threshold - current_value
    return current_value - threshold


# Good: Complex types with Union and Optional
def get_cell_value(
    sheet_data: dict[str, Any],
    cell_ref: str,
    default: str | int | float | None = None,
) -> str | int | float | None:
    """Get a cell value from sheet data with optional default."""
    return sheet_data.get(cell_ref, default)

Type Import Style

Use Python 3.11+ native type syntax (no from typing import for basic types):

# Good: Python 3.11+ style
def process_data(
    items: list[dict[str, Any]],
    filters: dict[str, str] | None = None,
) -> list[str]:
    ...

# Avoid: Legacy typing aliases for built-in containers (reserve the typing
# module for constructs that still need it, e.g. Any, Literal, TypeAlias)
from typing import List, Dict, Optional  # Use list, dict, and X | None instead

Complex Types

For complex type definitions, create type aliases:

from typing import TypeAlias

# Type aliases for clarity
CellValue: TypeAlias = str | int | float | bool | None
SheetData: TypeAlias = dict[str, dict[str, CellValue]]
FormulaResult: TypeAlias = tuple[CellValue, str, str | None]  # (value, type, error)
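
Aliases like these can then be used directly in signatures. The function below is a hypothetical illustration of that usage, not code from the CalcBridge codebase:

def get_sheet_cell(sheet_data: SheetData, sheet: str, cell_ref: str) -> CellValue:
    """Look up a single cell, returning None when the sheet or cell is missing."""
    # Missing sheets fall back to an empty dict; missing cells fall back to None,
    # which is itself a valid CellValue.
    return sheet_data.get(sheet, {}).get(cell_ref)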

Naming Conventions

General Rules

Type        Convention           Example
Modules     snake_case           formula_parser.py
Classes     PascalCase           WorkbookRepository
Functions   snake_case           calculate_weighted_average
Variables   snake_case           tenant_id
Constants   UPPER_SNAKE_CASE     MAX_FILE_SIZE_MB
Private     Leading underscore   _internal_helper

Naming Guidelines

# Classes: Noun or noun phrase
class WorkbookParser:
    ...

class ComplianceTestRunner:
    ...

# Functions: Verb or verb phrase
def parse_formula(formula_text: str) -> AST:
    ...

def validate_workbook(workbook_id: str) -> ValidationResult:
    ...

# Boolean variables: Use is_, has_, can_, should_
is_valid = True
has_errors = False
can_edit = user.role in ["admin", "owner"]
should_retry = attempt < max_retries

# Collections: Use plural names
workbook_ids: list[str] = []
sheets: dict[str, SheetData] = {}

Import Organization

Imports are organized in three groups, separated by blank lines:

  1. Standard library imports
  2. Third-party imports
  3. Local application imports

# Standard library
import asyncio
import logging
from datetime import datetime, timezone
from typing import Any

# Third-party
import numpy as np
import pandas as pd
from fastapi import APIRouter, Depends, HTTPException
from pydantic import BaseModel, Field

# Local application
from src.core.config import get_settings
from src.db.repositories.workbook import WorkbookRepository
from src.calculations.parser import parse_formula

Ruff handles import sorting automatically with ruff check --fix.


Documentation Standards

Docstrings

Use Google-style docstrings for all public functions, classes, and modules:

def evaluate_formula(
    formula: str,
    context: dict[str, Any],
    sheet_name: str | None = None,
) -> tuple[Any, str, str | None]:
    """Evaluate an Excel formula with the given context.

    Parses the formula into an AST and evaluates it using the provided
    cell values and sheet data as context.

    Args:
        formula: The Excel formula to evaluate (e.g., "=SUM(A1:A10)").
        context: Dictionary mapping cell references to their values.
        sheet_name: Optional sheet name for cross-sheet references.

    Returns:
        A tuple of (result, result_type, error):
            - result: The computed value (number, string, or boolean)
            - result_type: Type classification ("number", "text", "boolean", "error")
            - error: Error message if evaluation failed, None otherwise

    Raises:
        FormulaParseError: If the formula syntax is invalid.
        CircularReferenceError: If the formula contains circular references.

    Example:
        >>> result, rtype, error = evaluate_formula(
        ...     "=SUM(A1, A2)",
        ...     {"A1": 10, "A2": 20}
        ... )
        >>> print(result)
        30
    """

Module Docstrings

Every module should have a docstring explaining its purpose:

"""Vectorized calculation functions for Excel formula translation.

This module provides numpy/pandas-based vectorized implementations
of common Excel functions (IF, XLOOKUP, SUMIFS, etc.) optimized
for processing large financial datasets.

Performance Note:
    Vectorized operations are typically 100x faster than iterating
    with df.iterrows() or df.apply().
"""

Comments

# Good: Explain WHY, not WHAT
# Use vectorized operations to avoid O(n^2) complexity
# when processing large portfolios (10k+ positions)
df["weighted"] = df["value"] / total_value * df["weight"]

# Avoid: Obvious comments
# Loop through items
for item in items:
    ...

Code Patterns

Error Handling

# Good: Specific exceptions with context
from src.api.exceptions import NotFoundError, ValidationError

async def get_workbook(workbook_id: str) -> Workbook:
    workbook = await repo.get_by_id(workbook_id)
    if not workbook:
        raise NotFoundError(
            resource_type="Workbook",
            resource_id=workbook_id,
        )
    return workbook


# Good: Use try/except for expected errors
try:
    result = parse_formula(formula_text)
except FormulaParseError as e:
    logger.warning(f"Invalid formula: {e}")
    return {"error": str(e), "valid": False}

Async Patterns

# Good: Use async for I/O operations
async def fetch_workbook_data(workbook_id: str) -> dict:
    async with get_db_session() as session:
        workbook = await repo.get_by_id(session, workbook_id)
        sheets = await repo.get_sheets(session, workbook_id)
        return {"workbook": workbook, "sheets": sheets}


# Good: Parallel async operations
async def get_compliance_data(workbook_ids: list[str]) -> list[dict]:
    tasks = [get_workbook_compliance(wid) for wid in workbook_ids]
    return await asyncio.gather(*tasks)

Repository Pattern

# Good: Repository encapsulates data access
class WorkbookRepository:
    def __init__(self, session: AsyncSession):
        self.session = session

    async def get_by_id(self, workbook_id: str) -> Workbook | None:
        stmt = select(Workbook).where(Workbook.id == workbook_id)
        result = await self.session.execute(stmt)
        return result.scalar_one_or_none()

    async def create(self, data: WorkbookCreate) -> Workbook:
        workbook = Workbook(**data.model_dump())
        self.session.add(workbook)
        await self.session.flush()
        return workbook

Frontend Style (TypeScript)

ESLint and Prettier

The frontend uses ESLint and Prettier for consistent code style:

cd frontend

# Lint check
npm run lint

# Format code
npm run format
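
These npm scripts are assumed to wrap the standard ESLint and Prettier CLIs; the exact script definitions live in the frontend's package.json. Invoked directly, the equivalent commands would look roughly like:

# Assumed direct equivalents of the npm scripts above
npx eslint .
npx prettier --write .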

TypeScript Guidelines

// Good: Explicit types for function parameters and returns
interface WorkbookSummary {
  id: string;
  name: string;
  sheetCount: number;
  lastModified: Date;
}

async function fetchWorkbooks(tenantId: string): Promise<WorkbookSummary[]> {
  const response = await fetch(`/api/v1/workbooks?tenant=${tenantId}`);
  return response.json();
}

// Good: Use const for values that don't change
const MAX_UPLOAD_SIZE = 100 * 1024 * 1024; // 100MB

// Good: Prefer interfaces over type aliases for object shapes
interface ComplianceTest {
  id: string;
  name: string;
  threshold: number;
  operator: 'lt' | 'gt' | 'eq' | 'lte' | 'gte';
}

Pre-commit Checks

Install pre-commit hooks to catch issues before committing:

# Install pre-commit
pip install pre-commit

# Install hooks
pre-commit install

# Run manually on all files
pre-commit run --all-files

The .pre-commit-config.yaml includes:

  • Ruff (lint and format)
  • Type checking with mypy
  • YAML and JSON validation
  • Trailing whitespace removal
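
A representative .pre-commit-config.yaml covering these hooks is sketched below; the rev values are placeholders, and the actual file in the repository is authoritative:

repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.4  # placeholder revision
    hooks:
      - id: ruff
        args: [--fix]
      - id: ruff-format
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.10.0  # placeholder revision
    hooks:
      - id: mypy
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0  # placeholder revision
    hooks:
      - id: check-yaml
      - id: check-json
      - id: trailing-whitespace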

Summary Checklist

Before submitting code, verify:

  • Code passes ruff check with no errors
  • Code is formatted with ruff format
  • All public functions have type hints
  • All public functions have docstrings
  • Variable names are descriptive and follow conventions
  • Imports are organized correctly
  • No hardcoded secrets or credentials
  • Tests pass locally
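
The automated parts of this checklist can be run in one pass. The commands below assume the paths used earlier in this guide and that the test suite runs under pytest:

# Lint, formatting, and type checks (paths as used elsewhere in this guide)
ruff check src/ tests/
ruff format --check src/ tests/
mypy src/

# Run the test suite (assumes pytest)
pytest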