Architecture Primer — Your 10-Minute Mental Model

Welcome to LCO DMA. Read this once and you'll understand the shape of the whole app well enough to navigate it. It's the gentle on-ramp to the dense docs/ARCHITECTURE.md — start here, go there when you need depth.

What is this app?

LCO DMA (Data Management Application) is a full-stack construction cost estimation and project-controls platform for LCO Construction Consulting. It replaces spreadsheet-based estimating with a single source of truth. The stack is a React 18 single-page app talking over REST to a FastAPI backend backed by Azure Cosmos DB (NoSQL). It serves four service types — estimation, loan-monitoring, project-controls, and claims — and models material takeoffs (MTOs) across 9 engineering disciplines (piping, electrical, mechanical, civil, concrete, structural, instrumentation, and more). Analysts sign in with Microsoft Entra ID; automation integrates via API keys.

The system at a glance

flowchart TD
    Browser["React 18 SPA<br/>(Vite, TanStack Query, Zustand)"]
    Browser -->|HTTPS REST + Bearer JWT / X-API-Key| API

    subgraph FastAPI["FastAPI backend (Azure Container Apps)"]
        Auth["require_auth<br/>(Entra JWT + API keys)"]
        Routers["Routers<br/>(~50 included)"]
        Repos["Repositories<br/>(BaseRepository[T])"]
        Auth --> Routers --> Repos
    end

    API[" "]:::hidden --> Auth
    Browser --> Auth
    Repos -->|parameterized SQL| Cosmos[("Azure Cosmos DB<br/>~30 containers")]

    subgraph Async["Async import pipeline"]
        Blob["Azure Blob Storage<br/>(staged Excel file)"]
        Bus["Azure Service Bus<br/>(task message)"]
        Worker["Azure Function worker<br/>backend/functions/import-processor"]
        Blob --> Bus --> Worker
    end

    Routers -->|upload| Blob
    Worker -->|100-item batches| Cosmos

    classDef hidden fill:none,stroke:none;

Two paths into the data layer: a synchronous request path (left) for normal CRUD, and an asynchronous import path (right) for large MTO Excel files that would time out an HTTP request.

The four services (LCO's lines of business)

The product exists to serve LCO Construction Consulting's four lines of business (LOBs). These map 1:1 to the four Service.serviceType values — a Project can host one or more Services, and the service's type gates which features, dashboards, and routers it can use. Understanding these four is the business core of the whole app.

flowchart LR
    EST["①  Estimation<br/><i>build the cost estimate</i>"]
    PC["②  Project Controls<br/><i>track budget vs actuals</i>"]
    LM["③  Loan Monitoring<br/><i>audit for the bank</i>"]
    CL["④  Claims<br/><i>end-of-project discrepancy analysis</i>"]

    EST -->|"deep-copy<br/>(BudgetDeepCopyService)"| PC
    PC -.->|final budget vs actuals| CL
    EST -.->|baseline reference| LM

    classDef est fill:#1f6feb,stroke:#1f6feb,color:#fff;
    classDef pc fill:#2da44e,stroke:#2da44e,color:#fff;
    classDef lm fill:#bf8700,stroke:#bf8700,color:#fff;
    classDef cl fill:#8957e5,stroke:#8957e5,color:#fff;
    class EST est; class PC pc; class LM lm; class CL cl;

| Service (LOB) | What LCO does | Key deliverables / notes |
| --- | --- | --- |
| ① Estimation | The cost-estimation core. An analyst builds and maintains line-item estimates for a project: importing the Excel Estimation Master, reviewing items across 9 MTO disciplines, and assembling WBS / discipline / KPI structure. A project can hold several estimation revisions (Rev A/B/…). | The Estimation Master is the headline deliverable. Estimation is the source of truth that seeds Project Controls. |
| ② Project Controls | Tracks a live project's budget vs. actuals: budgets, change orders, forecasting (ETC), and the reporting suite (S-curve, waterfall, ECR, executive summary). | Requires an estimation source: a project-controls service is created by deep-copying an approved estimation (BudgetDeepCopyService snapshots EstimationItems → BudgetItems, preserving the numbers exactly). |
| ③ Loan Monitoring | A bank hires LCO to monitor a project it is co-financing. LCO audits the developer's budget, schedule, and progress on the bank's behalf. | Opens with the IPR (Initial Project Review, a ~40h one-time baseline audit), then issues periodic Draw Reports (~16h each) that validate the developer's request to draw the next tranche of funding before the bank releases it. A downstream variant of the estimation/PC baseline. |
| ④ Claims | End-of-project discrepancy analysis. A claims-type service consumes the final budget vs. actuals to analyze and substantiate cost claims. | Downstream consumer of the completed project's budget data. |

The dependency that matters: Estimation feeds everything. Project Controls can't exist without an estimation to deep-copy from, and Loan Monitoring / Claims are downstream variants that build on that baseline. When in doubt, follow the estimate.
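To make the deep-copy dependency concrete, here is a minimal sketch of snapshotting estimation items into budget items. It is not the real BudgetDeepCopyService: the function name `snapshot_to_budget`, the field names (`sourceItemId`, `serviceId`, `totalCost`, `breakdown`), and the `type` values are all hypothetical, chosen only to show that the budget becomes an independent copy of the estimate.

```python
import copy
from decimal import Decimal


def snapshot_to_budget(estimation_items: list[dict], service_id: str) -> list[dict]:
    """Return independent budget-item dicts copied from estimation-item dicts."""
    budget_items = []
    for item in estimation_items:
        budget = copy.deepcopy(item)          # nested structures are copied, not shared
        budget["type"] = "budgetItem"         # hypothetical discriminator value
        budget["serviceId"] = service_id      # hypothetical link to the project-controls service
        budget["sourceItemId"] = item["id"]   # hypothetical pointer back to the estimate
        budget["id"] = f"budget-{item['id']}"
        budget_items.append(budget)
    return budget_items


estimate = [{
    "id": "e1",
    "type": "estimationItem",
    "totalCost": Decimal("1250.40"),
    "breakdown": {"labor": Decimal("800.40"), "material": Decimal("450.00")},
}]
budget = snapshot_to_budget(estimate, service_id="svc-pc-1")

# A later edit to the estimate does not leak into the budget snapshot.
estimate[0]["breakdown"]["labor"] = Decimal("999.99")
```

Using `Decimal` rather than `float` in the sketch is one way to keep copied amounts exact; whether the real service does this is not stated in this primer.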

(Business definitions sourced from the lco-domain-mcp glossary — see lco_glossary for LOB, IPR, and Draw Report, all cross-referenced to the Confluence intro-to-lcos-business page.)

The monorepo in 60 seconds

| Directory | What lives here |
| --- | --- |
| frontend/ | React 18 + TypeScript + Vite SPA. All UI, hooks, stores, API clients. |
| backend/ | FastAPI app under backend/app/: routers, schemas, repositories, services, the Cosmos client. |
| backend/functions/import-processor/ | The Azure Function worker (function_app.py) that drains the Service Bus import queue. |
| tools/ | Standalone dev/ops tools: daily-report, epic-generator, jira sync, the Cosmos MCP. |
| infra/ | Terraform / Azure infrastructure-as-code. Read the Terraform-safety notes in CLAUDE.md before any apply. |
| docs/ | Human-facing docs (this file, ARCHITECTURE.md, per-layer guides, diagrams). |
| wiki/ | Confluence mirror: business context, FSDs, PRDs. Generated; don't hand-edit. |
| jira/ | Per-ticket folders (jira/LCO-NNN/) with spec.md / plan.md / notes.md. |
| .agent-context/ | URM-maintained source of truth for product, tech, structure, conventions (read these first as an agent). |
| .claude/ | Skills, agents, and hard codegen rules (.claude/rules/*.md) for AI-assisted development. |

All dev workflows run through the justfile: just dev, just quality, just test, just ci. Run just --list to see everything. Python always runs through uv (uv run python), never bare pip/python.

Request lifecycle — a GET from click to Cosmos

When a page needs data, it doesn't call fetch directly. It uses a custom hook (often built from createInfiniteResourceHook) that wraps TanStack Query, which calls a typed API client (frontend/src/services/api/*.ts). The client is an Axios instance (config.ts) that attaches the auth header (silent Entra token acquisition, or an API key). On the backend, every endpoint declares _user: require_auth, so auth happens before your handler runs. The router calls a repository, which issues a parameterized Cosmos query scoped to a partition key, and the response flows back as camelCase JSON.

sequenceDiagram
    participant C as Component
    participant H as Hook (TanStack Query)
    participant A as apiClient (Axios)
    participant R as FastAPI Router
    participant D as require_auth
    participant Repo as Repository
    participant Cosmos as Cosmos DB

    C->>H: useInfiniteMTOs(filters)
    H->>A: mtoApi.getMTOsByProject(...)
    A->>R: GET /api/v1/... (Bearer JWT / X-API-Key)
    R->>D: resolve _user
    D-->>R: AuthenticatedUser
    R->>Repo: query_with_pagination(...)
    Repo->>Cosmos: SELECT ... WHERE c.projectId = @pid (parameterized)
    Cosmos-->>Repo: documents
    Repo-->>R: list[Model]
    R-->>A: { data, total, skip, limit, hasMore }
    A-->>H: typed response
    H-->>C: items, loadMore, hasNextPage

Key contract: the API speaks camelCase JSON; Python code is snake_case. Pydantic Field(alias=...) bridges the two automatically.
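The naming bridge can be illustrated without Pydantic. The sketch below uses only the standard library to show the snake_case to camelCase mapping that the aliases achieve; `to_camel` and `to_api_json` are hypothetical helpers, not functions from the codebase, and the real app relies on Pydantic aliases rather than a conversion step like this.

```python
def to_camel(name: str) -> str:
    """Convert a snake_case identifier to camelCase."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def to_api_json(record: dict) -> dict:
    """Rename the top-level keys of a snake_case dict to camelCase."""
    return {to_camel(key): value for key, value in record.items()}


# The pagination envelope from the sequence diagram, as Python code would hold it.
page = to_api_json({"data": [], "total": 0, "skip": 0, "limit": 50, "has_more": False})
```

Here `page` ends up with the keys `data`, `total`, `skip`, `limit`, and `hasMore`, matching the envelope the router returns in the diagram above.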

The data layer

The database is Azure Cosmos DB (Core/SQL API, Session consistency, serverless). There are roughly 30 containers (the canonical CONTAINERS config lives in backend/app/db/cosmos.py). Cosmos is partitioned, so:

  • Every container has a partition key (e.g. projects/clientId, EstimationItems/projectId, api_keys/keyPrefix). Queries that include the partition key are cheap; cross-partition queries cost more.
  • Data access goes through the repository pattern: every repo extends BaseRepository[T] and overrides get_partition_key(). Repos own all SQL, always parameterized — never string-interpolate values into a query (injection risk, enforced by .claude/rules/backend-repositories.md).
  • Containers are shared across entity types: documents carry a type discriminator field so several entity types can share one container (SELECT * FROM c WHERE c.type = 'project').
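The three rules above can be sketched together in a few lines. This is a simplified stand-in, not the app's actual BaseRepository[T]: the `query_by_project` method, the `EstimationItem` dataclass, and the `estimationItem` discriminator value are hypothetical. The `parameters` list uses the name/value shape that the azure-cosmos Python SDK accepts in `query_items`; no SDK call is made here.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")


class BaseRepository(ABC, Generic[T]):
    """Hypothetical, simplified stand-in for the app's BaseRepository[T]."""

    doc_type: str  # value of the `type` discriminator for this entity

    @abstractmethod
    def get_partition_key(self, item: T) -> str:
        """Return the partition key value for one item."""

    def query_by_project(self, project_id: str) -> dict:
        # The SQL text is a constant; caller-supplied values travel only in `parameters`.
        return {
            "query": "SELECT * FROM c WHERE c.type = @type AND c.projectId = @pid",
            "parameters": [
                {"name": "@type", "value": self.doc_type},
                {"name": "@pid", "value": project_id},
            ],
        }


@dataclass
class EstimationItem:
    id: str
    project_id: str


class EstimationItemRepository(BaseRepository[EstimationItem]):
    doc_type = "estimationItem"  # hypothetical discriminator value

    def get_partition_key(self, item: EstimationItem) -> str:
        return item.project_id


repo = EstimationItemRepository()
spec = repo.query_by_project("p-1' OR 1=1 --")  # hostile input stays a plain value
```

Because the hostile string never becomes part of the query text, it can only ever be compared against `c.projectId` as a literal value.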

One model worth knowing early: the Estimation Master tab and the MTO discipline tabs are two views of one document in the EstimationItems container. There's no sync logic between them — both read and write the same row. See the "Unified Estimation-Item Model" section of docs/ARCHITECTURE.md.

Container reference: docs/database/INDEX.md and docs/database/CONTAINERS_OVERVIEW.md.

The async import pipeline

MTO Excel files can carry thousands of rows across 9 disciplines (piping alone has 140+ columns). Parsing and inserting that inside one HTTP request would time out, so imports are asynchronous:

  1. User uploads an Excel file → FastAPI streams it to Azure Blob Storage and writes a task record (status queued).
  2. FastAPI enqueues a Service Bus message carrying taskId, projectId, blobUrl, and taskType.
  3. The Azure Function worker (backend/functions/import-processor/function_app.py) is Service Bus-triggered. It downloads the blob, parses the discipline-specific sheet, and writes rows to Cosmos in 100-item batches.
  4. The frontend polls the task record until it reports complete or failed.
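The message in step 2 can be pictured as a small typed payload. The sketch below assumes the body is JSON with exactly the four fields named above; the class name `ImportTaskMessage`, its methods, the example URL, and the `mto-piping` task type are hypothetical, and the real wire format may differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ImportTaskMessage:
    # Field names are camelCase on purpose so they match the wire names.
    taskId: str
    projectId: str
    blobUrl: str
    taskType: str

    def to_body(self) -> str:
        """Serialize to the JSON string a producer would enqueue."""
        return json.dumps(asdict(self))

    @classmethod
    def from_body(cls, body: str) -> "ImportTaskMessage":
        """Parse the JSON string a worker would receive."""
        return cls(**json.loads(body))


message = ImportTaskMessage(
    taskId="task-123",
    projectId="proj-1",
    blobUrl="https://example.invalid/imports/task-123.xlsx",  # placeholder URL
    taskType="mto-piping",  # hypothetical task type value
)
round_tripped = ImportTaskMessage.from_body(message.to_body())
```

Note that the message carries a pointer to the blob rather than the file contents, which keeps the queue payload small regardless of the Excel file's size.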

Resilience is layered (3-level retry): each batch retries up to 3× on Cosmos throttling (429); if the message handler raises, Service Bus redelivers up to 10×; and progress is checkpointed to Cosmos so a redelivery resumes rather than restarts. One subtle rule (see .claude/rules/em-template-columns.md): the parser raises ValueError, never HTTPException, because the worker context dead-letters on unhandled exceptions.
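The batch-level retry and the checkpoint can be sketched as follows. This is an illustration under stated assumptions, not the worker's code: `Throttled` stands in for a Cosmos 429 error, "up to 3×" is read as three attempts in total, the checkpoint is a plain dict rather than a Cosmos document, and all function names are hypothetical. Service Bus redelivery is not modeled; re-raising after the last attempt is what would trigger it.

```python
import time
from typing import Callable, Iterator, Sequence

BATCH_SIZE = 100
MAX_BATCH_ATTEMPTS = 3  # assumption: three attempts in total per batch


class Throttled(Exception):
    """Hypothetical stand-in for a Cosmos 429 (request rate too large) error."""


def batches(rows: Sequence[dict], size: int = BATCH_SIZE) -> Iterator[Sequence[dict]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def import_rows(rows: Sequence[dict], write_batch: Callable[[Sequence[dict]], None],
                checkpoint: dict, backoff_seconds: float = 0.0) -> int:
    """Write rows in batches, skipping the rows a previous delivery already wrote."""
    done = checkpoint.get("done", 0)
    for batch in batches(rows[done:]):
        for attempt in range(1, MAX_BATCH_ATTEMPTS + 1):
            try:
                write_batch(batch)
                break
            except Throttled:
                if attempt == MAX_BATCH_ATTEMPTS:
                    raise  # give up so the message fails and the queue can redeliver it
                time.sleep(backoff_seconds * attempt)
        done += len(batch)
        checkpoint["done"] = done  # in-memory here; the source persists progress to Cosmos
    return done


written: list[dict] = []
calls = {"n": 0}


def flaky_write(batch: Sequence[dict]) -> None:
    calls["n"] += 1
    if calls["n"] == 2:  # throttle exactly once, on the second call
        raise Throttled()
    written.extend(batch)


rows = [{"id": i} for i in range(250)]
checkpoint: dict = {}
total = import_rows(rows, flaky_write, checkpoint)
```

With 250 rows the demo produces batches of 100, 100, and 50; the one throttled call is retried, so `flaky_write` runs four times and all 250 rows are written. Starting from a checkpoint of `{"done": 200}` would write only the final 50.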

Deep dive: docs/backend/IMPORT_PROCESSING_ARCHITECTURE.md.

Auth in one minute

Authentication is dual, and a single dependency handles both:

  • Microsoft Entra ID JWT — interactive users. The SPA uses MSAL to get a token, sends it as Authorization: Bearer …, and the backend validates it via fastapi-azure-auth. Roles (Reader / Contributor) come from token claims.
  • API keys — programmatic clients. Format lco_ + random chars, sent as X-API-Key. Only the bcrypt hash is stored (the prefix is the partition key).
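The API-key scheme can be sketched with the standard library. Two things here are substitutions, not the app's behavior: the source stores a bcrypt hash, but bcrypt is a third-party package, so this sketch swaps in `hashlib.pbkdf2_hmac` purely to stay self-contained; and the prefix length (`KEY_PREFIX_LEN = 12`), the record field names `salt` and `hash`, and both function names are hypothetical.

```python
import hashlib
import hmac
import secrets

KEY_PREFIX_LEN = 12   # assumption: how many leading characters form keyPrefix
PBKDF2_ROUNDS = 200_000


def generate_api_key() -> tuple[str, dict]:
    """Return (plaintext key, shown once) and the record that would be stored."""
    key = "lco_" + secrets.token_urlsafe(32)
    salt = secrets.token_bytes(16)
    # PBKDF2 is a stand-in for bcrypt; the plaintext key is never stored.
    digest = hashlib.pbkdf2_hmac("sha256", key.encode(), salt, PBKDF2_ROUNDS)
    record = {"keyPrefix": key[:KEY_PREFIX_LEN], "salt": salt.hex(), "hash": digest.hex()}
    return key, record


def verify_api_key(presented: str, record: dict) -> bool:
    """Check a presented key against a stored record."""
    if presented[:KEY_PREFIX_LEN] != record["keyPrefix"]:
        return False  # the prefix locates the record; a mismatch means a wrong record
    digest = hashlib.pbkdf2_hmac(
        "sha256", presented.encode(), bytes.fromhex(record["salt"]), PBKDF2_ROUNDS
    )
    return hmac.compare_digest(digest.hex(), record["hash"])


key, record = generate_api_key()
```

Storing the prefix in the clear is what lets a lookup go straight to one partition before the slow hash comparison runs.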

require_auth (in backend/app/core/auth.py) is the canonical dependency — it resolves API key, Entra JWT, or a mock user in one step, and every endpoint must include _user: require_auth. For local dev, auth can be relaxed (AZURE_AUTH_ENABLED=false) so you don't need a live token to hit the API. Role gates layer on top: require_reader_role, require_contributor_role.

backend/app/core/auth.py is a sacred path — breaking it locks out every endpoint. Details: docs/auth/AUTH_IMPLEMENTATION.md.

Frontend patterns you'll hit immediately

  • Hook factories. Paginated lists are built with createInfiniteResourceHook (in hooks/factories/), which gives you pagination, ID-based dedup, request abortion on filter change, and a uniform { items, loadMore, refetch, hasNextPage } API. Lookup maps use createLookupMapHook. Don't hand-roll fetch state — reach for the factory.
  • Only 3 Zustand stores. useAuthStore, useProjectStore, useDemoDataStore. That's the whole list. Before adding a fourth, ask whether a custom hook suffices (it usually does — see .claude/rules/frontend-stores.md). Server data is TanStack Query's job, not Zustand's.
  • TanStack Table. The reusable DataTable (components/table/) handles column visibility, drag-reorder, query-builder filters, virtualized rows, selection, and editable cells. Most data screens are configurations of it.
  • Design tokens — not raw Tailwind. Typography is always text-lco-* (text-lco-body-02, text-lco-heading-01, …), never text-sm/text-xl. Colors are semantic (bg-card, text-foreground, text-muted-foreground, border-border), never bg-white/text-neutral-*. Merge classes with cn() from @/lib/utils. This is enforced by .claude/rules/frontend-styling.md.
  • Lazy routes. Heavy analysis pages are React.lazy + Suspense in App.tsx to keep the initial bundle small.

Reference: docs/frontend/ARCHITECTURE.md, docs/frontend/STYLES.md, and run the /design-system skill before any UI work.

Where to go deeper

| Topic | Doc |
| --- | --- |
| Full system architecture | docs/ARCHITECTURE.md |
| Backend (routers, schemas, repos, services) | docs/backend/ARCHITECTURE.md |
| Frontend (hooks, stores, tables, design system) | docs/frontend/ARCHITECTURE.md |
| Database (containers, partition keys, schemas) | docs/database/INDEX.md |
| Authentication implementation | docs/auth/AUTH_IMPLEMENTATION.md |
| Async import pipeline | docs/backend/IMPORT_PROCESSING_ARCHITECTURE.md |
| Rendered diagrams (system, ERD, auth, import, CI/CD) | docs/diagrams/ |
| Full documentation index | docs/KNOWLEDGE_BASE.md |

When you're ready to make a change, the codegen rules in .claude/rules/ and the skills listed in CLAUDE.md (e.g. /api-endpoint, /backend-standards, /design-system) encode the conventions this primer only summarizes.