Skip to content

MTO Import/Export System

Version: 4.0.0 Last Updated: 2025-11-14 Status: ✅ Production Ready


Table of Contents

  1. Overview
  2. Async Import System (NEW)
  3. Composite Key System
  4. Import Strategies
  5. Export API
  6. Template System
  7. Validation Rules
  8. Data Model
  9. Workflows
  10. Troubleshooting
  11. Performance
  12. Testing
  13. File Locations
  14. Version History

Overview

The MTO (Material Take-Off) system manages construction material lists across 9 disciplines using Excel files with automatic validation, smart merging, and asynchronous background processing.

Key Features: - Async background import with real-time progress tracking - Multi-sheet Excel import with auto-discipline detection - 4-field composite key: (projectId, sourceFileName, discipline, itemNo) - MERGE strategy: intelligent updates preserving existing data - REPLACE strategy: complete refresh per discipline - Export → Edit → Import round-trip workflow - Progress callbacks with percentage and messages - Idempotency protection (prevents duplicate imports)

Supported Disciplines: Electrical Equipment, Mechanical Equipment, Bulk Electrical, Instrumentation, Piping, Civil, Concrete, Structural, Architectural


Async Import System (NEW)

Why Async Import?

Problem: Large Excel files (500+ rows) can take 30+ seconds to import, causing: - HTTP timeout errors (browser/server) - Unresponsive UI (no progress feedback) - Failed imports that can't be retried

Solution: Background processing with task tracking: - ✅ API returns immediately (< 1 second) - ✅ Import runs in background - ✅ Real-time progress tracking (percentage + messages) - ✅ No HTTP timeouts - ✅ Retry-safe with idempotency protection - ✅ UI remains responsive

Workflow Diagram

┌─────────────┐
│   Frontend  │
└──────┬──────┘
       │ 1. POST /api/v1/mto/import/async (file, projectId, strategy)
┌──────────────────────────┐
│   FastAPI Router         │
│  ─ Save file to memory   │
│  ─ Create task record    │
│  ─ Queue background job  │
└──────┬───────────────────┘
       │ 2. Returns task ID immediately (< 1 sec)
┌─────────────┐
│   Frontend  │
│  Polling... │
└──────┬──────┘
       │ 3. GET /api/v1/mto/import/status/{taskId}?projectId=... (every 2-3 sec)
┌──────────────────────────┐
│   Background Worker      │
│  ─ Parse Excel sheets    │
│  ─ Validate all data     │
│  ─ Import with strategy  │
│  ─ Update progress       │ ───> Task record in Cosmos DB (Tasks container)
└──────────────────────────┘
       │ 4. Status: pending → running → completed
┌─────────────┐
│   Frontend  │
│  Success!   │
└─────────────┘

Async Import API

POST /api/v1/mto/import/async

Upload Excel/CSV file for async background import.

Request:

curl -X POST "http://localhost:8000/api/v1/mto/import/async" \
  -F "file=@mto_data.xlsx" \
  -F "projectId=proj-123" \
  -F "strategy=merge"

Parameters: - file - Excel (.xlsx) or CSV file (required) - projectId - Project identifier (required) - strategy - merge (default) or replace (optional) - serviceId - Optional service ID (optional)

Response (202 Accepted):

{
  "taskId": "abc-123-def-456",
  "status": "pending",
  "message": "Import queued successfully",
  "pollUrl": "/api/v1/mto/import/status/abc-123-def-456?projectId=proj-123"
}

Idempotency Protection:

If you submit the same file twice within 10 minutes, you'll get the existing task back:

{
  "taskId": "abc-123-def-456",
  "status": "running",
  "message": "Import already in progress. Returning existing task.",
  "pollUrl": "/api/v1/mto/import/status/abc-123-def-456?projectId=proj-123"
}

Duplicate Detection Criteria: - Same projectId - Same filename - Same file_size (bytes) - Same strategy - Created within last 10 minutes


GET /api/v1/mto/import/status/{taskId}

Check the status of an async import task.

Request:

curl "http://localhost:8000/api/v1/mto/import/status/abc-123?projectId=proj-123"

Parameters: - taskId - Task identifier from submit response (path parameter, required) - projectId - Project identifier (query parameter, required)

Polling Strategy: - Poll every 2-3 seconds while status is "pending" or "running" - Stop polling when status is "completed" or "failed" - Show progress bar using progress.percentage - Display message using progress.message

Response States:

1. Pending (Task queued, not started yet):

{
  "taskId": "abc-123",
  "taskType": "mto_import",
  "projectId": "proj-123",
  "status": "pending",
  "createdAt": "2025-01-10T10:00:00Z",
  "startedAt": null,
  "completedAt": null,
  "progress": null,
  "result": null,
  "error": null,
  "retryCount": 0,
  "metadata": {
    "filename": "mto_data.xlsx",
    "strategy": "merge",
    "file_size": 1024000
  }
}

2. Running (Import in progress):

{
  "taskId": "abc-123",
  "taskType": "mto_import",
  "projectId": "proj-123",
  "status": "running",
  "createdAt": "2025-01-10T10:00:00Z",
  "startedAt": "2025-01-10T10:00:01Z",
  "completedAt": null,
  "progress": {
    "current": 3,
    "total": 5,
    "percentage": 60.0,
    "message": "Processing Electrical Equipment (3/5)"
  },
  "result": null,
  "error": null,
  "retryCount": 0
}

3. Completed (Import successful):

{
  "taskId": "abc-123",
  "taskType": "mto_import",
  "projectId": "proj-123",
  "status": "completed",
  "createdAt": "2025-01-10T10:00:00Z",
  "startedAt": "2025-01-10T10:00:01Z",
  "completedAt": "2025-01-10T10:00:45Z",
  "progress": null,
  "result": {
    "success": true,
    "totalSheets": 5,
    "processedSheets": 5,
    "results": [
      {
        "discipline": "electrical_equipment",
        "success": true,
        "imported": 150,
        "created": 20,
        "updated": 130,
        "errors": []
      },
      {
        "discipline": "piping",
        "success": true,
        "imported": 200,
        "created": 50,
        "updated": 150,
        "errors": []
      }
    ]
  },
  "error": null,
  "retryCount": 0
}

4. Failed (Import error):

{
  "taskId": "abc-123",
  "taskType": "mto_import",
  "projectId": "proj-123",
  "status": "failed",
  "createdAt": "2025-01-10T10:00:00Z",
  "startedAt": "2025-01-10T10:00:01Z",
  "completedAt": "2025-01-10T10:00:15Z",
  "progress": null,
  "result": null,
  "error": "Validation failed for discipline 'electrical_equipment': Field 'itemNo' is required at row 5",
  "retryCount": 0
}


Frontend Implementation Example

React + TypeScript:

// 1. Submit import
const submitImport = async (file: File, projectId: string) => {
  const formData = new FormData();
  formData.append('file', file);
  formData.append('projectId', projectId);
  formData.append('strategy', 'merge');

  const response = await fetch('/api/v1/mto/import/async', {
    method: 'POST',
    body: formData
  });

  const data = await response.json();
  return data.taskId;
};

// 2. Poll for status
const pollImportStatus = async (taskId: string, projectId: string) => {
  const response = await fetch(
    `/api/v1/mto/import/status/${taskId}?projectId=${projectId}`
  );
  return await response.json();
};

// 3. Complete workflow
const handleImport = async (file: File, projectId: string) => {
  // Submit
  const taskId = await submitImport(file, projectId);
  console.log(`Import started: ${taskId}`);

  // Poll until complete
  let status = 'pending';
  while (status === 'pending' || status === 'running') {
    await new Promise(resolve => setTimeout(resolve, 2000)); // Wait 2 seconds

    const task = await pollImportStatus(taskId, projectId);
    status = task.status;

    // Update UI with progress
    if (task.progress) {
      console.log(`Progress: ${task.progress.percentage}% - ${task.progress.message}`);
      // Update progress bar in UI
      setProgress(task.progress.percentage);
      setProgressMessage(task.progress.message);
    }

    // Check if complete
    if (status === 'completed') {
      console.log('Import successful:', task.result);
      // Show success message
      showSuccess(`Imported ${task.result.processedSheets} disciplines`);
      return task.result;
    }

    if (status === 'failed') {
      console.error('Import failed:', task.error);
      // Show error message
      showError(task.error);
      throw new Error(task.error);
    }
  }
};

Vue Composable:

// composables/useAsyncImport.ts
import { ref } from 'vue';

export function useAsyncImport() {
  const isImporting = ref(false);
  const progress = ref(0);
  const progressMessage = ref('');
  const error = ref<string | null>(null);

  const importMTO = async (file: File, projectId: string, strategy = 'merge') => {
    isImporting.value = true;
    progress.value = 0;
    progressMessage.value = 'Uploading file...';
    error.value = null;

    try {
      // Step 1: Submit import
      const formData = new FormData();
      formData.append('file', file);
      formData.append('projectId', projectId);
      formData.append('strategy', strategy);

      const submitResponse = await fetch('/api/v1/mto/import/async', {
        method: 'POST',
        body: formData
      });

      const { taskId } = await submitResponse.json();
      progressMessage.value = 'Processing import...';

      // Step 2: Poll for status
      let taskStatus = 'pending';
      while (taskStatus === 'pending' || taskStatus === 'running') {
        await new Promise(resolve => setTimeout(resolve, 2500));

        const statusResponse = await fetch(
          `/api/v1/mto/import/status/${taskId}?projectId=${projectId}`
        );
        const task = await statusResponse.json();
        taskStatus = task.status;

        // Update progress
        if (task.progress) {
          progress.value = task.progress.percentage;
          progressMessage.value = task.progress.message;
        }

        // Handle completion
        if (taskStatus === 'completed') {
          progress.value = 100;
          progressMessage.value = 'Import completed successfully!';
          isImporting.value = false;
          return task.result;
        }

        if (taskStatus === 'failed') {
          throw new Error(task.error);
        }
      }
    } catch (err) {
      error.value = err instanceof Error ? err.message : 'Import failed';
      isImporting.value = false;
      throw err;
    }
  };

  return {
    importMTO,
    isImporting,
    progress,
    progressMessage,
    error
  };
}

Composite Key System

How Uniqueness Works

Every MTO record is uniquely identified by 4 fields:

(projectId, sourceFileName, discipline, itemNo)

Field Sources: - projectId - From API form parameter (required) - sourceFileName - From Excel column "Source File Name" (required) - discipline - Auto-detected from sheet name (required) - itemNo - From Excel column "Item No" (required)

Example:

Project: "proj-123"
Source: "equipment_list_v1.xlsx"
Discipline: "electrical_equipment"
Item No: "1"

Key Behavior: - Same itemNo can exist in different source files - Same itemNo can exist in different disciplines - MERGE only updates when ALL 4 fields match


Import Strategies

Behavior: 1. Extract composite key: (projectId, sourceFileName, discipline, itemNo) 2. Query: WHERE projectId=X AND sourceFileName=Y AND discipline=Z AND itemNo=W 3. If all 4 match → UPDATE (partial, preserves unmentioned fields) 4. If not found → CREATE new record

Use Cases: - ✅ Incremental updates - ✅ Fixing errors in existing data - ✅ Export → Edit → Import workflow - ✅ Adding new items

Example:

Database:

{
  "projectId": "proj-123",
  "sourceFileName": "equip_v1.xlsx",
  "discipline": "electrical_equipment",
  "itemNo": "1",
  "description": "Motor 100HP",
  "quantity": 10,
  "manufacturer": "Siemens"
}

Import File:

| Source File Name  | Item No | Quantity |
|-------------------|---------|----------|
| equip_v1.xlsx     | 1       | 15       |

Result: Quantity updated to 15, all other fields preserved.

REPLACE (Destructive)

Behavior: 1. For each discipline in file: - DELETE all existing items for that discipline - CREATE all items from import file

Use Cases: - Complete data refresh - Starting from scratch - Removing orphaned data

⚠️ Warning: Irreversible deletion. Not for incremental updates.


Export API

GET /api/v1/mto/export

Export MTO data to Excel with one sheet per discipline (multi-sheet format).

Request:

curl "http://localhost:8000/api/v1/mto/export?projectId=proj-123" \
  -o mto_export.xlsx

Parameters: - projectId - Project identifier (required) - serviceId - Filter by service (optional) - discipline - Filter by discipline (optional) - format - multi-sheet (default) or single-sheet (optional)

Export Formats:

Multi-Sheet (Default): - One sheet per discipline - All disciplines in a single file - Best for complete project export

Single-Sheet (Requires discipline filter):

curl "http://localhost:8000/api/v1/mto/export?projectId=proj-123&discipline=piping&format=single-sheet" \
  -o piping_only.xlsx

Column Headers: - Uses space-separated format: "Source File Name", "Item No", "Description" - NOT camelCase: sourceFileName, itemNo, description

Excluded Fields: - id - Internal Cosmos DB ID - mtoId - Internal UUID - projectId - Already known from query - createdAt, updatedAt - Timestamps - createdBy, updatedBy - Audit trails

Included Fields: - sourceFileName - REQUIRED for MERGE - itemNo - REQUIRED for MERGE - All data fields (description, quantity, prices, etc.)

Round-Trip: Export → Edit → Import with MERGE preserves data perfectly.


Template System

Download Templates

Unified Template (All 9 Disciplines):

curl "http://localhost:8000/api/v1/mto/template" -o mto_template.xlsx

Single Discipline:

curl "http://localhost:8000/api/v1/mto/template/electrical_equipment" \
  -o electrical_template.xlsx

Available Templates: - electrical_equipment - mechanical_equipment - bulk_electrical - instrumentation - piping - civil - concrete - structural - architectural

Required Columns

CRITICAL: Use space-separated format (NOT camelCase):

CORRECT:

Source File Name | Item No | Description | Quantity

WRONG:

sourceFileName | itemNo | description | quantity

Minimum Required for MERGE: - Source File Name (composite key) - Item No (composite key)


Validation Rules

Required Fields

RULE 1: itemNo is REQUIRED - Must not be empty - Numeric values converted to strings automatically - Error: 'itemNo' is required (unique identifier within project)

RULE 2: sourceFileName is REQUIRED - Must not be empty - Numeric values converted to strings automatically - Error: 'sourceFileName' is required (part of composite key: projectId, sourceFileName, discipline, itemNo)

RULE 3: Duplicate itemNo Detection - Checked within same discipline in import file - Prevents ambiguous MERGE operations - Error: Duplicate itemNo '1' (also in row 3)

Validation Flow

Excel File → FileParser → Validator → Pydantic Schema → Database
            ↓             ↓           ↓
         Normalize    Check Rules  Type Check
         Headers                    & Convert

All-or-Nothing Validation: - If ANY sheet fails validation, entire import aborts - No partial imports - Prevents inconsistent data


Data Model

Common Fields

Composite Key Fields (All Required): - projectId - Project identifier (partition key) - sourceFileName - Source file name (REQUIRED for MERGE) - discipline - Discipline enum (from sheet name) - itemNo - Item number (REQUIRED for MERGE)

Identification: - tagNo, lineNo, pfdNumber, drawingNoOrPid, etc.

Quantities & Costs: - quantity, uom, materialUnitPrice, laborUnitPrice, etc.

Assignment Fields: - wbsPath, wbsText - Work Breakdown Structure - crewCode, crewName - Assigned crew - coaCode, coaName - Chart of Accounts - kpiCode, kpiName - Key Performance Indicator

Field Count by Discipline: - Piping: 143 fields (most comprehensive) - Mechanical Equipment: 85 fields - Electrical Equipment: 83 fields - Others: 60-80 fields


Workflows

# Step 1: Export current data
curl "http://localhost:8000/api/v1/mto/export?projectId=proj-123" \
  -o mto_current.xlsx

# Step 2: Edit in Excel
# - Modify quantities, descriptions, etc.
# - Keep "Source File Name" and "Item No" columns
# - Add/remove rows as needed

# Step 3: Submit async import with MERGE
curl -X POST "http://localhost:8000/api/v1/mto/import/async" \
  -F "file=@mto_current.xlsx" \
  -F "projectId=proj-123" \
  -F "strategy=merge"

# Response:
# {
#   "taskId": "abc-123",
#   "status": "pending",
#   "pollUrl": "/api/v1/mto/import/status/abc-123?projectId=proj-123"
# }

# Step 4: Poll for status (every 2-3 seconds)
curl "http://localhost:8000/api/v1/mto/import/status/abc-123?projectId=proj-123"

# Keep polling until status = "completed" or "failed"

2. New Project from Template

# Step 1: Download template
curl "http://localhost:8000/api/v1/mto/template" -o template.xlsx

# Step 2: Fill with data
# - Add "Source File Name" value (e.g., "initial_import.xlsx")
# - Fill "Item No" sequentially (1, 2, 3...)
# - Add descriptions, quantities, etc.

# Step 3: Submit async import
curl -X POST "http://localhost:8000/api/v1/mto/import/async" \
  -F "file=@template.xlsx" \
  -F "projectId=new-proj" \
  -F "strategy=merge"

# Step 4: Poll for completion
curl "http://localhost:8000/api/v1/mto/import/status/{taskId}?projectId=new-proj"

3. Complete Refresh

# REPLACE strategy: Deletes existing, creates new
curl -X POST "http://localhost:8000/api/v1/mto/import/async" \
  -F "file=@new_data.xlsx" \
  -F "projectId=proj-123" \
  -F "strategy=replace"

Troubleshooting

Error: "itemNo is required"

Cause: Missing or empty "Item No" column

Solution: 1. Check Excel has column named "Item No" (not "itemNo") 2. Ensure all rows have values 3. Numeric values are OK (auto-converted to strings)

Error: "sourceFileName is required"

Cause: Missing or empty "Source File Name" column

Solution: 1. Add column "Source File Name" (not "sourceFileName") 2. Fill with source identifier (e.g., "equipment_list_v1.xlsx") 3. Use same value for items from same source

Error: "No sheets were processed"

Cause: Sheet names don't match expected discipline names

Solution: Rename sheets to exact matches: - "Electrical Equipment" (not "Electrical" or "EE") - "Mechanical Equipment" - "Piping" - etc.

Error: Duplicate Items Created

Cause: Composite key fields don't match exactly

Check: - Same projectId in API call? - Same sourceFileName in Excel? - Correct sheet name (discipline)? - Same itemNo value?

All 4 must match for UPDATE vs CREATE.

Task Stuck in "pending" Status

Cause: Background worker crashed or not running

Solution: 1. Check backend logs for errors 2. Restart backend server 3. Resubmit import (idempotency protection will handle duplicates)

Task Status "failed" with No Error Message

Cause: Unexpected exception in background worker

Solution: 1. Check backend logs for stack trace 2. Verify Excel file format is valid 3. Ensure all required environment variables are set (COSMOS_ENDPOINT, COSMOS_KEY)

Import Takes Too Long (> 5 minutes)

Cause: Very large file or slow database connection

Solution: 1. Break file into smaller batches (< 1000 rows per discipline) 2. Check Cosmos DB RU/s provisioning 3. Use MERGE instead of REPLACE (faster for updates)


Performance

Limits: - Max file size: 10MB (5MB recommended) - Recommended batch: <1000 items per discipline - Timeout: 5 minutes per background task - Polling interval: 2-3 seconds (optimal balance)

Optimization: - Break large files into smaller batches - Use MERGE for incremental updates (faster than REPLACE) - Minimize columns in import files (less data to process) - Use batch operations (100 items per transaction)

Batch Operations: - Cosmos DB Batch API: Up to 100 items per transaction - Automatic chunking for >100 items - 500ms delay between batches to avoid throttling - Exponential backoff retry for 429 errors (rate limiting)

Progress Tracking Granularity: - Progress updates after each discipline sheet - Not per-row (too expensive) - Percentage calculated as: (completedSheets / totalSheets) * 100


Testing

Test Files

Location: backend/tests/

  • test_mto_import.sh - 14 import scenarios
  • test_mto_export.sh - Export validation
  • test_mto_roundtrip.sh - Complete workflow + composite key testing
  • test_mto_batch.sh - Batch operation testing
  • test_mto_async.sh - Async import workflow testing

Run Tests

cd backend/tests

# Test async import workflow
./test_mto_async.sh [PROJECT_ID]

# Test round-trip (export → import → verify)
./test_mto_roundtrip.sh [PROJECT_ID]

# Test batch operations
./test_mto_batch.sh [PROJECT_ID]

# Run all MTO tests
./test_mto_import.sh && ./test_mto_export.sh && ./test_mto_roundtrip.sh

Phase 8 Tests Composite Key: - Creates same itemNo in different sources/disciplines - Verifies MERGE updates only exact 4-field match - Confirms other items remain unchanged


File Locations

Code: - Router: backend/app/api/v1/routers/mto.py (1344 lines) - Repository: backend/app/repositories/mto.py - Schemas: backend/app/schemas/mto.py (1361 lines) - Validator: backend/app/services/validators/mto_validator.py - File Parser: backend/app/services/parsers/file_parser.py - Excel Writer: backend/app/services/parsers/excel_writer.py - MTO Export Service: backend/app/services/exporters/mto_export_service.py - Task Service: backend/app/services/tasks/task_service.py - Task Schemas: backend/app/schemas/task.py

Templates: - backend/templates/mto/mto_template.xlsx - Unified template - backend/templates/mto/electrical_equipment_template.xlsx - backend/templates/mto/mechanical_equipment_template.xlsx - backend/templates/mto/bulk_electrical_template.xlsx - backend/templates/mto/instrumentation_template.xlsx - backend/templates/mto/piping_template.xlsx - backend/templates/mto/civil_template.xlsx - backend/templates/mto/concrete_template.xlsx - backend/templates/mto/structural_template.xlsx - backend/templates/mto/architectural_template.xlsx

Tests: - backend/tests/test_mto_*.sh - backend/tests/test_data/ - backend/tests/helpers/create_test_excel.py - backend/tests/helpers/generate_mto_perf_data.py

Scripts: - Template generation: backend/scripts/create_*_template.py


API Reference Summary

Async Import Endpoints

Method Endpoint Description Response Time
POST /api/v1/mto/import/async Submit import (returns task ID) < 1 second
GET /api/v1/mto/import/status/{taskId} Poll task status < 500ms

Export Endpoints

Method Endpoint Description
GET /api/v1/mto/export Export to Excel (multi-sheet)
GET /api/v1/mto/export?format=single-sheet&discipline=piping Export single discipline

Template Endpoints

Method Endpoint Description
GET /api/v1/mto/template Download unified template (9 sheets)
GET /api/v1/mto/template/{discipline} Download discipline template

Other MTO Endpoints

Method Endpoint Description
POST /api/v1/mto Create single MTO item
GET /api/v1/mto List MTOs (paginated)
GET /api/v1/mto/{mtoId} Get MTO by ID
PUT /api/v1/mto/{mtoId} Update MTO
DELETE /api/v1/mto/{mtoId} Delete MTO
POST /api/v1/mto/bulk Bulk create
DELETE /api/v1/mto/bulk Bulk delete
POST /api/v1/mto/search Advanced search
GET /api/v1/mto/project/{projectId}/disciplines List disciplines
GET /api/v1/mto/project/{projectId}/summary Project summary
DELETE /api/v1/mto/project/{projectId}/all Delete all MTOs

Version History

4.0.0 (2025-11-14)

  • Async import system with background task processing
  • ✅ Real-time progress tracking (percentage + messages)
  • ✅ Task polling API with status endpoint
  • ✅ Idempotency protection (prevents duplicate imports)
  • ✅ Batch operations with progress callbacks
  • ✅ Retry logic with exponential backoff
  • ✅ Comprehensive documentation update

3.0.0 (2025-11-04)

  • 4-field composite key: Added sourceFileName and discipline to uniqueness
  • ✅ Made sourceFileName required field
  • ✅ Updated validator to enforce sourceFileName
  • ✅ Updated documentation for composite key

2.1.0 (2025-10-30)

  • ✅ itemNo-based MERGE strategy
  • ✅ Excluded mtoId from exports
  • ✅ Round-trip testing (829 lines)
  • ✅ Comprehensive documentation (2,530 lines)

2.0.0 (2025-10-29)

  • ✅ Multi-sheet Excel support
  • ✅ Auto-discipline detection
  • ✅ Unified template system

Additional Resources

API Documentation: - Swagger UI: http://localhost:8000/api/docs - ReDoc: http://localhost:8000/api/redoc - OpenAPI Spec: http://localhost:8000/api/openapi.json

Related Documentation: - Backend Architecture - Overall architecture - Backend README - Getting started - Quick Start Guide - Quick reference

Database: - Container: MaterialTakeoff - Partition Key: /projectId - Document Type: type: "mto"

Task Tracking: - Container: Tasks - Partition Key: /projectId - Document Type: type: "task"


End of Guide

For questions or issues: 1. Check this documentation 2. Review the Architecture Guide 3. Read the Quick Start 4. Check Swagger UI for endpoint details 5. Review backend logs for error messages