MTO Import/Export System¶
Version: 4.0.0 Last Updated: 2025-11-14 Status: ✅ Production Ready
Table of Contents¶
- Overview
- Async Import System (NEW)
- Composite Key System
- Import Strategies
- Export API
- Template System
- Validation Rules
- Data Model
- Workflows
- Troubleshooting
- Performance
- Testing
- File Locations
- Version History
Overview¶
The MTO (Material Take-Off) system manages construction material lists across 9 disciplines using Excel files with automatic validation, smart merging, and asynchronous background processing.
Key Features:
- Async background import with real-time progress tracking
- Multi-sheet Excel import with auto-discipline detection
- 4-field composite key: (projectId, sourceFileName, discipline, itemNo)
- MERGE strategy: intelligent updates preserving existing data
- REPLACE strategy: complete refresh per discipline
- Export → Edit → Import round-trip workflow
- Progress callbacks with percentage and messages
- Idempotency protection (prevents duplicate imports)
Supported Disciplines: Electrical Equipment, Mechanical Equipment, Bulk Electrical, Instrumentation, Piping, Civil, Concrete, Structural, Architectural
Async Import System (NEW)¶
Why Async Import?¶
Problem: Large Excel files (500+ rows) can take 30+ seconds to import, causing: - HTTP timeout errors (browser/server) - Unresponsive UI (no progress feedback) - Failed imports that can't be retried
Solution: Background processing with task tracking: - ✅ API returns immediately (< 1 second) - ✅ Import runs in background - ✅ Real-time progress tracking (percentage + messages) - ✅ No HTTP timeouts - ✅ Retry-safe with idempotency protection - ✅ UI remains responsive
Workflow Diagram¶
┌─────────────┐
│ Frontend │
└──────┬──────┘
│ 1. POST /api/v1/mto/import/async (file, projectId, strategy)
▼
┌──────────────────────────┐
│ FastAPI Router │
│ ─ Save file to memory │
│ ─ Create task record │
│ ─ Queue background job │
└──────┬───────────────────┘
│ 2. Returns task ID immediately (< 1 sec)
▼
┌─────────────┐
│ Frontend │
│ Polling... │
└──────┬──────┘
│ 3. GET /api/v1/mto/import/status/{taskId}?projectId=... (every 2-3 sec)
▼
┌──────────────────────────┐
│ Background Worker │
│ ─ Parse Excel sheets │
│ ─ Validate all data │
│ ─ Import with strategy │
│ ─ Update progress │ ───> Task record in Cosmos DB (Tasks container)
└──────────────────────────┘
│ 4. Status: pending → running → completed
▼
┌─────────────┐
│ Frontend │
│ Success! │
└─────────────┘
Async Import API¶
POST /api/v1/mto/import/async¶
Upload Excel/CSV file for async background import.
Request:
curl -X POST "http://localhost:8000/api/v1/mto/import/async" \
-F "file=@mto_data.xlsx" \
-F "projectId=proj-123" \
-F "strategy=merge"
Parameters:
- file - Excel (.xlsx) or CSV file (required)
- projectId - Project identifier (required)
- strategy - merge (default) or replace (optional)
- serviceId - Optional service ID (optional)
Response (202 Accepted):
{
"taskId": "abc-123-def-456",
"status": "pending",
"message": "Import queued successfully",
"pollUrl": "/api/v1/mto/import/status/abc-123-def-456?projectId=proj-123"
}
Idempotency Protection:
If you submit the same file twice within 10 minutes, you'll get the existing task back:
{
"taskId": "abc-123-def-456",
"status": "running",
"message": "Import already in progress. Returning existing task.",
"pollUrl": "/api/v1/mto/import/status/abc-123-def-456?projectId=proj-123"
}
Duplicate Detection Criteria:
- Same projectId
- Same filename
- Same file_size (bytes)
- Same strategy
- Created within last 10 minutes
GET /api/v1/mto/import/status/{taskId}¶
Check the status of an async import task.
Request:
Parameters:
- taskId - Task identifier from submit response (path parameter, required)
- projectId - Project identifier (query parameter, required)
Polling Strategy:
- Poll every 2-3 seconds while status is "pending" or "running"
- Stop polling when status is "completed" or "failed"
- Show progress bar using progress.percentage
- Display message using progress.message
Response States:
1. Pending (Task queued, not started yet):
{
"taskId": "abc-123",
"taskType": "mto_import",
"projectId": "proj-123",
"status": "pending",
"createdAt": "2025-01-10T10:00:00Z",
"startedAt": null,
"completedAt": null,
"progress": null,
"result": null,
"error": null,
"retryCount": 0,
"metadata": {
"filename": "mto_data.xlsx",
"strategy": "merge",
"file_size": 1024000
}
}
2. Running (Import in progress):
{
"taskId": "abc-123",
"taskType": "mto_import",
"projectId": "proj-123",
"status": "running",
"createdAt": "2025-01-10T10:00:00Z",
"startedAt": "2025-01-10T10:00:01Z",
"completedAt": null,
"progress": {
"current": 3,
"total": 5,
"percentage": 60.0,
"message": "Processing Electrical Equipment (3/5)"
},
"result": null,
"error": null,
"retryCount": 0
}
3. Completed (Import successful):
{
"taskId": "abc-123",
"taskType": "mto_import",
"projectId": "proj-123",
"status": "completed",
"createdAt": "2025-01-10T10:00:00Z",
"startedAt": "2025-01-10T10:00:01Z",
"completedAt": "2025-01-10T10:00:45Z",
"progress": null,
"result": {
"success": true,
"totalSheets": 5,
"processedSheets": 5,
"results": [
{
"discipline": "electrical_equipment",
"success": true,
"imported": 150,
"created": 20,
"updated": 130,
"errors": []
},
{
"discipline": "piping",
"success": true,
"imported": 200,
"created": 50,
"updated": 150,
"errors": []
}
]
},
"error": null,
"retryCount": 0
}
4. Failed (Import error):
{
"taskId": "abc-123",
"taskType": "mto_import",
"projectId": "proj-123",
"status": "failed",
"createdAt": "2025-01-10T10:00:00Z",
"startedAt": "2025-01-10T10:00:01Z",
"completedAt": "2025-01-10T10:00:15Z",
"progress": null,
"result": null,
"error": "Validation failed for discipline 'electrical_equipment': Field 'itemNo' is required at row 5",
"retryCount": 0
}
Frontend Implementation Example¶
React + TypeScript:
// 1. Submit import
const submitImport = async (file: File, projectId: string) => {
const formData = new FormData();
formData.append('file', file);
formData.append('projectId', projectId);
formData.append('strategy', 'merge');
const response = await fetch('/api/v1/mto/import/async', {
method: 'POST',
body: formData
});
const data = await response.json();
return data.taskId;
};
// 2. Poll for status
const pollImportStatus = async (taskId: string, projectId: string) => {
const response = await fetch(
`/api/v1/mto/import/status/${taskId}?projectId=${projectId}`
);
return await response.json();
};
// 3. Complete workflow
const handleImport = async (file: File, projectId: string) => {
// Submit
const taskId = await submitImport(file, projectId);
console.log(`Import started: ${taskId}`);
// Poll until complete
let status = 'pending';
while (status === 'pending' || status === 'running') {
await new Promise(resolve => setTimeout(resolve, 2000)); // Wait 2 seconds
const task = await pollImportStatus(taskId, projectId);
status = task.status;
// Update UI with progress
if (task.progress) {
console.log(`Progress: ${task.progress.percentage}% - ${task.progress.message}`);
// Update progress bar in UI
setProgress(task.progress.percentage);
setProgressMessage(task.progress.message);
}
// Check if complete
if (status === 'completed') {
console.log('Import successful:', task.result);
// Show success message
showSuccess(`Imported ${task.result.processedSheets} disciplines`);
return task.result;
}
if (status === 'failed') {
console.error('Import failed:', task.error);
// Show error message
showError(task.error);
throw new Error(task.error);
}
}
};
Vue Composable:
// composables/useAsyncImport.ts
import { ref } from 'vue';
export function useAsyncImport() {
const isImporting = ref(false);
const progress = ref(0);
const progressMessage = ref('');
const error = ref<string | null>(null);
const importMTO = async (file: File, projectId: string, strategy = 'merge') => {
isImporting.value = true;
progress.value = 0;
progressMessage.value = 'Uploading file...';
error.value = null;
try {
// Step 1: Submit import
const formData = new FormData();
formData.append('file', file);
formData.append('projectId', projectId);
formData.append('strategy', strategy);
const submitResponse = await fetch('/api/v1/mto/import/async', {
method: 'POST',
body: formData
});
const { taskId } = await submitResponse.json();
progressMessage.value = 'Processing import...';
// Step 2: Poll for status
let taskStatus = 'pending';
while (taskStatus === 'pending' || taskStatus === 'running') {
await new Promise(resolve => setTimeout(resolve, 2500));
const statusResponse = await fetch(
`/api/v1/mto/import/status/${taskId}?projectId=${projectId}`
);
const task = await statusResponse.json();
taskStatus = task.status;
// Update progress
if (task.progress) {
progress.value = task.progress.percentage;
progressMessage.value = task.progress.message;
}
// Handle completion
if (taskStatus === 'completed') {
progress.value = 100;
progressMessage.value = 'Import completed successfully!';
isImporting.value = false;
return task.result;
}
if (taskStatus === 'failed') {
throw new Error(task.error);
}
}
} catch (err) {
error.value = err instanceof Error ? err.message : 'Import failed';
isImporting.value = false;
throw err;
}
};
return {
importMTO,
isImporting,
progress,
progressMessage,
error
};
}
Composite Key System¶
How Uniqueness Works¶
Every MTO record is uniquely identified by 4 fields:
Field Sources:
- projectId - From API form parameter (required)
- sourceFileName - From Excel column "Source File Name" (required)
- discipline - Auto-detected from sheet name (required)
- itemNo - From Excel column "Item No" (required)
Example:
Project: "proj-123"
Source: "equipment_list_v1.xlsx"
Discipline: "electrical_equipment"
Item No: "1"
Key Behavior:
- Same itemNo can exist in different source files
- Same itemNo can exist in different disciplines
- MERGE only updates when ALL 4 fields match
Import Strategies¶
MERGE (Default, Recommended)¶
Behavior:
1. Extract composite key: (projectId, sourceFileName, discipline, itemNo)
2. Query: WHERE projectId=X AND sourceFileName=Y AND discipline=Z AND itemNo=W
3. If all 4 match → UPDATE (partial, preserves unmentioned fields)
4. If not found → CREATE new record
Use Cases: - ✅ Incremental updates - ✅ Fixing errors in existing data - ✅ Export → Edit → Import workflow - ✅ Adding new items
Example:
Database:
{
"projectId": "proj-123",
"sourceFileName": "equip_v1.xlsx",
"discipline": "electrical_equipment",
"itemNo": "1",
"description": "Motor 100HP",
"quantity": 10,
"manufacturer": "Siemens"
}
Import File:
| Source File Name | Item No | Quantity |
|-------------------|---------|----------|
| equip_v1.xlsx | 1 | 15 |
Result: Quantity updated to 15, all other fields preserved.
REPLACE (Destructive)¶
Behavior: 1. For each discipline in file: - DELETE all existing items for that discipline - CREATE all items from import file
Use Cases: - Complete data refresh - Starting from scratch - Removing orphaned data
⚠️ Warning: Irreversible deletion. Not for incremental updates.
Export API¶
GET /api/v1/mto/export¶
Export MTO data to Excel with one sheet per discipline (multi-sheet format).
Request:
Parameters:
- projectId - Project identifier (required)
- serviceId - Filter by service (optional)
- discipline - Filter by discipline (optional)
- format - multi-sheet (default) or single-sheet (optional)
Export Formats:
Multi-Sheet (Default): - One sheet per discipline - All disciplines in a single file - Best for complete project export
Single-Sheet (Requires discipline filter):
curl "http://localhost:8000/api/v1/mto/export?projectId=proj-123&discipline=piping&format=single-sheet" \
-o piping_only.xlsx
Column Headers: - Uses space-separated format: "Source File Name", "Item No", "Description" - NOT camelCase: sourceFileName, itemNo, description
Excluded Fields:
- id - Internal Cosmos DB ID
- mtoId - Internal UUID
- projectId - Already known from query
- createdAt, updatedAt - Timestamps
- createdBy, updatedBy - Audit trails
Included Fields:
- sourceFileName - REQUIRED for MERGE
- itemNo - REQUIRED for MERGE
- All data fields (description, quantity, prices, etc.)
Round-Trip: Export → Edit → Import with MERGE preserves data perfectly.
Template System¶
Download Templates¶
Unified Template (All 9 Disciplines):
Single Discipline:
Available Templates:
- electrical_equipment
- mechanical_equipment
- bulk_electrical
- instrumentation
- piping
- civil
- concrete
- structural
- architectural
Required Columns¶
CRITICAL: Use space-separated format (NOT camelCase):
✅ CORRECT:
❌ WRONG:
Minimum Required for MERGE: - Source File Name (composite key) - Item No (composite key)
Validation Rules¶
Required Fields¶
RULE 1: itemNo is REQUIRED
- Must not be empty
- Numeric values converted to strings automatically
- Error: 'itemNo' is required (unique identifier within project)
RULE 2: sourceFileName is REQUIRED
- Must not be empty
- Numeric values converted to strings automatically
- Error: 'sourceFileName' is required (part of composite key: projectId, sourceFileName, discipline, itemNo)
RULE 3: Duplicate itemNo Detection
- Checked within same discipline in import file
- Prevents ambiguous MERGE operations
- Error: Duplicate itemNo '1' (also in row 3)
Validation Flow¶
Excel File → FileParser → Validator → Pydantic Schema → Database
↓ ↓ ↓
Normalize Check Rules Type Check
Headers & Convert
All-or-Nothing Validation: - If ANY sheet fails validation, entire import aborts - No partial imports - Prevents inconsistent data
Data Model¶
Common Fields¶
Composite Key Fields (All Required):
- projectId - Project identifier (partition key)
- sourceFileName - Source file name (REQUIRED for MERGE)
- discipline - Discipline enum (from sheet name)
- itemNo - Item number (REQUIRED for MERGE)
Identification:
- tagNo, lineNo, pfdNumber, drawingNoOrPid, etc.
Quantities & Costs:
- quantity, uom, materialUnitPrice, laborUnitPrice, etc.
Assignment Fields:
- wbsPath, wbsText - Work Breakdown Structure
- crewCode, crewName - Assigned crew
- coaCode, coaName - Chart of Accounts
- kpiCode, kpiName - Key Performance Indicator
Field Count by Discipline: - Piping: 143 fields (most comprehensive) - Mechanical Equipment: 85 fields - Electrical Equipment: 83 fields - Others: 60-80 fields
Workflows¶
1. Async Export → Modify → Async Import (Recommended)¶
# Step 1: Export current data
curl "http://localhost:8000/api/v1/mto/export?projectId=proj-123" \
-o mto_current.xlsx
# Step 2: Edit in Excel
# - Modify quantities, descriptions, etc.
# - Keep "Source File Name" and "Item No" columns
# - Add/remove rows as needed
# Step 3: Submit async import with MERGE
curl -X POST "http://localhost:8000/api/v1/mto/import/async" \
-F "file=@mto_current.xlsx" \
-F "projectId=proj-123" \
-F "strategy=merge"
# Response:
# {
# "taskId": "abc-123",
# "status": "pending",
# "pollUrl": "/api/v1/mto/import/status/abc-123?projectId=proj-123"
# }
# Step 4: Poll for status (every 2-3 seconds)
curl "http://localhost:8000/api/v1/mto/import/status/abc-123?projectId=proj-123"
# Keep polling until status = "completed" or "failed"
2. New Project from Template¶
# Step 1: Download template
curl "http://localhost:8000/api/v1/mto/template" -o template.xlsx
# Step 2: Fill with data
# - Add "Source File Name" value (e.g., "initial_import.xlsx")
# - Fill "Item No" sequentially (1, 2, 3...)
# - Add descriptions, quantities, etc.
# Step 3: Submit async import
curl -X POST "http://localhost:8000/api/v1/mto/import/async" \
-F "file=@template.xlsx" \
-F "projectId=new-proj" \
-F "strategy=merge"
# Step 4: Poll for completion
curl "http://localhost:8000/api/v1/mto/import/status/{taskId}?projectId=new-proj"
3. Complete Refresh¶
# REPLACE strategy: Deletes existing, creates new
curl -X POST "http://localhost:8000/api/v1/mto/import/async" \
-F "file=@new_data.xlsx" \
-F "projectId=proj-123" \
-F "strategy=replace"
Troubleshooting¶
Error: "itemNo is required"¶
Cause: Missing or empty "Item No" column
Solution: 1. Check Excel has column named "Item No" (not "itemNo") 2. Ensure all rows have values 3. Numeric values are OK (auto-converted to strings)
Error: "sourceFileName is required"¶
Cause: Missing or empty "Source File Name" column
Solution: 1. Add column "Source File Name" (not "sourceFileName") 2. Fill with source identifier (e.g., "equipment_list_v1.xlsx") 3. Use same value for items from same source
Error: "No sheets were processed"¶
Cause: Sheet names don't match expected discipline names
Solution: Rename sheets to exact matches: - "Electrical Equipment" (not "Electrical" or "EE") - "Mechanical Equipment" - "Piping" - etc.
Error: Duplicate Items Created¶
Cause: Composite key fields don't match exactly
Check:
- Same projectId in API call?
- Same sourceFileName in Excel?
- Correct sheet name (discipline)?
- Same itemNo value?
All 4 must match for UPDATE vs CREATE.
Task Stuck in "pending" Status¶
Cause: Background worker crashed or not running
Solution: 1. Check backend logs for errors 2. Restart backend server 3. Resubmit import (idempotency protection will handle duplicates)
Task Status "failed" with No Error Message¶
Cause: Unexpected exception in background worker
Solution: 1. Check backend logs for stack trace 2. Verify Excel file format is valid 3. Ensure all required environment variables are set (COSMOS_ENDPOINT, COSMOS_KEY)
Import Takes Too Long (> 5 minutes)¶
Cause: Very large file or slow database connection
Solution: 1. Break file into smaller batches (< 1000 rows per discipline) 2. Check Cosmos DB RU/s provisioning 3. Use MERGE instead of REPLACE (faster for updates)
Performance¶
Limits: - Max file size: 10MB (5MB recommended) - Recommended batch: <1000 items per discipline - Timeout: 5 minutes per background task - Polling interval: 2-3 seconds (optimal balance)
Optimization: - Break large files into smaller batches - Use MERGE for incremental updates (faster than REPLACE) - Minimize columns in import files (less data to process) - Use batch operations (100 items per transaction)
Batch Operations: - Cosmos DB Batch API: Up to 100 items per transaction - Automatic chunking for >100 items - 500ms delay between batches to avoid throttling - Exponential backoff retry for 429 errors (rate limiting)
Progress Tracking Granularity: - Progress updates after each discipline sheet - Not per-row (too expensive) - Percentage calculated as: (completedSheets / totalSheets) * 100
Testing¶
Test Files¶
Location: backend/tests/
test_mto_import.sh- 14 import scenariostest_mto_export.sh- Export validationtest_mto_roundtrip.sh- Complete workflow + composite key testingtest_mto_batch.sh- Batch operation testingtest_mto_async.sh- Async import workflow testing
Run Tests¶
cd backend/tests
# Test async import workflow
./test_mto_async.sh [PROJECT_ID]
# Test round-trip (export → import → verify)
./test_mto_roundtrip.sh [PROJECT_ID]
# Test batch operations
./test_mto_batch.sh [PROJECT_ID]
# Run all MTO tests
./test_mto_import.sh && ./test_mto_export.sh && ./test_mto_roundtrip.sh
Phase 8 Tests Composite Key: - Creates same itemNo in different sources/disciplines - Verifies MERGE updates only exact 4-field match - Confirms other items remain unchanged
File Locations¶
Code:
- Router: backend/app/api/v1/routers/mto.py (1344 lines)
- Repository: backend/app/repositories/mto.py
- Schemas: backend/app/schemas/mto.py (1361 lines)
- Validator: backend/app/services/validators/mto_validator.py
- File Parser: backend/app/services/parsers/file_parser.py
- Excel Writer: backend/app/services/parsers/excel_writer.py
- MTO Export Service: backend/app/services/exporters/mto_export_service.py
- Task Service: backend/app/services/tasks/task_service.py
- Task Schemas: backend/app/schemas/task.py
Templates:
- backend/templates/mto/mto_template.xlsx - Unified template
- backend/templates/mto/electrical_equipment_template.xlsx
- backend/templates/mto/mechanical_equipment_template.xlsx
- backend/templates/mto/bulk_electrical_template.xlsx
- backend/templates/mto/instrumentation_template.xlsx
- backend/templates/mto/piping_template.xlsx
- backend/templates/mto/civil_template.xlsx
- backend/templates/mto/concrete_template.xlsx
- backend/templates/mto/structural_template.xlsx
- backend/templates/mto/architectural_template.xlsx
Tests:
- backend/tests/test_mto_*.sh
- backend/tests/test_data/
- backend/tests/helpers/create_test_excel.py
- backend/tests/helpers/generate_mto_perf_data.py
Scripts:
- Template generation: backend/scripts/create_*_template.py
API Reference Summary¶
Async Import Endpoints¶
| Method | Endpoint | Description | Response Time |
|---|---|---|---|
| POST | /api/v1/mto/import/async |
Submit import (returns task ID) | < 1 second |
| GET | /api/v1/mto/import/status/{taskId} |
Poll task status | < 500ms |
Export Endpoints¶
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/mto/export |
Export to Excel (multi-sheet) |
| GET | /api/v1/mto/export?format=single-sheet&discipline=piping |
Export single discipline |
Template Endpoints¶
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/mto/template |
Download unified template (9 sheets) |
| GET | /api/v1/mto/template/{discipline} |
Download discipline template |
Other MTO Endpoints¶
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/mto |
Create single MTO item |
| GET | /api/v1/mto |
List MTOs (paginated) |
| GET | /api/v1/mto/{mtoId} |
Get MTO by ID |
| PUT | /api/v1/mto/{mtoId} |
Update MTO |
| DELETE | /api/v1/mto/{mtoId} |
Delete MTO |
| POST | /api/v1/mto/bulk |
Bulk create |
| DELETE | /api/v1/mto/bulk |
Bulk delete |
| POST | /api/v1/mto/search |
Advanced search |
| GET | /api/v1/mto/project/{projectId}/disciplines |
List disciplines |
| GET | /api/v1/mto/project/{projectId}/summary |
Project summary |
| DELETE | /api/v1/mto/project/{projectId}/all |
Delete all MTOs |
Version History¶
4.0.0 (2025-11-14)¶
- ✅ Async import system with background task processing
- ✅ Real-time progress tracking (percentage + messages)
- ✅ Task polling API with status endpoint
- ✅ Idempotency protection (prevents duplicate imports)
- ✅ Batch operations with progress callbacks
- ✅ Retry logic with exponential backoff
- ✅ Comprehensive documentation update
3.0.0 (2025-11-04)¶
- ✅ 4-field composite key: Added
sourceFileNameanddisciplineto uniqueness - ✅ Made
sourceFileNamerequired field - ✅ Updated validator to enforce sourceFileName
- ✅ Updated documentation for composite key
2.1.0 (2025-10-30)¶
- ✅ itemNo-based MERGE strategy
- ✅ Excluded mtoId from exports
- ✅ Round-trip testing (829 lines)
- ✅ Comprehensive documentation (2,530 lines)
2.0.0 (2025-10-29)¶
- ✅ Multi-sheet Excel support
- ✅ Auto-discipline detection
- ✅ Unified template system
Additional Resources¶
API Documentation: - Swagger UI: http://localhost:8000/api/docs - ReDoc: http://localhost:8000/api/redoc - OpenAPI Spec: http://localhost:8000/api/openapi.json
Related Documentation: - Backend Architecture - Overall architecture - Backend README - Getting started - Quick Start Guide - Quick reference
Database:
- Container: MaterialTakeoff
- Partition Key: /projectId
- Document Type: type: "mto"
Task Tracking:
- Container: Tasks
- Partition Key: /projectId
- Document Type: type: "task"
End of Guide
For questions or issues: 1. Check this documentation 2. Review the Architecture Guide 3. Read the Quick Start 4. Check Swagger UI for endpoint details 5. Review backend logs for error messages