Enterprise Salesforce Org Consolidation
Built a production-grade data migration framework to consolidate merchant data across multiple Salesforce organizations into a unified platform—with full data lineage, automated validation, and zero business disruption during the migration window.
Multiple orgs, fragmented data, operational complexity
The merchant business had grown through acquisitions and organic expansion, resulting in multiple Salesforce organizations with overlapping data models, inconsistent field mappings, and no unified view of merchant relationships.
The consolidation required migrating millions of records across complex object hierarchies—primary entities, related contacts, pipeline objects, quotes, orders, contracts, and historical activities—while maintaining referential integrity and ensuring zero disruption to active business operations.
Complex Object Dependencies
40+ objects with intricate parent-child relationships requiring precise sequencing
Limited Migration Window
Production migrations restricted to 03:00–06:00 EST to minimize business impact
Data Quality Variance
Inconsistent field formats, duplicate records, and missing referential keys across source orgs
Plan-driven migration framework
A modular Python framework orchestrated by Airflow, with Snowflake as the staging layer and Salesforce Bulk API v2 for high-throughput loads.
Source Systems
Figment / MINT
Snowflake RAW
Landing Zone
dbt Models
Transform Layer
Migration Engine
Python + Airflow
Salesforce
Bulk API v2
Core Engine
- Step registry with decorators
- DAG runner with sequencing
- Window guard (production hours)
- Retry with exponential backoff
- Validation hooks per phase
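The core-engine pieces listed above can be sketched in a few lines. This is a minimal illustration, not the production implementation; `StepRegistry`, `within_window`, and `with_retry` are hypothetical names chosen to mirror the bullets:

```python
import datetime
import functools
import time
from zoneinfo import ZoneInfo


class StepRegistry:
    """Maps step-type strings (referenced from YAML plans) to callables."""
    _steps = {}

    @classmethod
    def register(cls, step_type):
        def decorator(fn):
            cls._steps[step_type] = fn
            return fn
        return decorator

    @classmethod
    def get(cls, step_type):
        return cls._steps[step_type]


def within_window(now=None, start_hour=3, end_hour=6):
    """Window guard: only allow production runs between 03:00 and 06:00 Eastern."""
    now = now or datetime.datetime.now(ZoneInfo("America/New_York"))
    return start_hour <= now.hour < end_hour


def with_retry(max_attempts=3, base_delay=1.0):
    """Retry decorator with exponential backoff: 1s, 2s, 4s, ..."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise
                    time.sleep(base_delay * 2 ** attempt)
        return wrapper
    return decorator
```

Keeping these concerns in decorators means individual migration steps stay free of scheduling and retry boilerplate.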
Connectors
- Snowflake read/write helpers
- Salesforce Bulk API v2 client
- Error table persistence
- CDC incremental patterns
- Rate limit handling
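The Bulk API v2 ingest flow the connector wraps is: create a job, upload a CSV body, then mark the job `UploadComplete`. A rough sketch of that flow (the `BulkV2Client` wrapper and its method names are illustrative; the endpoints shown are the documented Bulk API 2.0 ingest routes):

```python
import csv
import io


def records_to_csv(records):
    """Serialize a list of dicts into the CSV body Bulk API v2 ingest expects."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


class BulkV2Client:
    """Thin wrapper over the Bulk API v2 ingest flow:
    create job -> upload CSV -> mark UploadComplete."""

    def __init__(self, instance_url, access_token, api_version="v58.0"):
        self.base = f"{instance_url}/services/data/{api_version}/jobs/ingest"
        self.headers = {"Authorization": f"Bearer {access_token}"}

    def upsert(self, sobject, external_id_field, records):
        # Imported lazily so the pure CSV helper above has no dependency.
        import requests

        job = requests.post(
            self.base,
            headers={**self.headers, "Content-Type": "application/json"},
            json={"object": sobject, "operation": "upsert",
                  "externalIdFieldName": external_id_field},
        ).json()
        requests.put(
            f"{self.base}/{job['id']}/batches",
            headers={**self.headers, "Content-Type": "text/csv"},
            data=records_to_csv(records),
        )
        requests.patch(
            f"{self.base}/{job['id']}",
            headers={**self.headers, "Content-Type": "application/json"},
            json={"state": "UploadComplete"},
        )
        return job["id"]
```

Error-table persistence and rate-limit handling would layer on top of this, polling the job's `failedResults` endpoint after completion.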
Utilities
- YAML config loader
- JSON structured logging
- Field mapping engine
- Lookup cache (RecordTypes)
- Audit logger to Snowflake
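The field-mapping engine can be reduced to a small, testable core: a declarative map of source fields to target fields, plus optional per-field transforms. A minimal sketch with hypothetical names:

```python
def apply_field_mapping(record, mapping, transforms=None):
    """Translate one source record into target-field shape.

    mapping:    {source_field: target_field}
    transforms: optional {target_field: callable} for per-field cleanup
    """
    transforms = transforms or {}
    out = {}
    for src, tgt in mapping.items():
        value = record.get(src)
        if tgt in transforms:
            value = transforms[tgt](value)
        out[tgt] = value
    return out
```

Because the mapping is plain data, it can live alongside the YAML plan and be reviewed without reading code.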
How we solved it
Technical decisions that enabled reliable, auditable, and repeatable migrations at scale.
Plan-Driven Execution
YAML plans as single source of truth
Migration plans define phases, parallel groups, and step dependencies in declarative YAML. The engine reads the plan and executes steps in the correct order—making reviews focused on the plan file, not scattered code.
```yaml
phases:
  - name: "Phase_1"
    parallel_groups:
      - name: "Group_A"
        steps:
          - { type: domain.entity_primary }
          - { type: domain.entity_related }
```
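An engine that walks such a plan is small: phases execute in order, and steps inside a parallel group run concurrently. A sketch under those assumptions (`run_plan` takes the already-parsed plan dict, e.g. from `yaml.safe_load`; the name is illustrative):

```python
from concurrent.futures import ThreadPoolExecutor


def run_plan(plan, step_fn):
    """Walk a parsed plan: phases in order, steps inside a parallel group
    concurrently. Any step failure stops the run before the next group."""
    executed = []
    for phase in plan["phases"]:
        for group in phase["parallel_groups"]:
            with ThreadPoolExecutor() as pool:
                futures = [pool.submit(step_fn, step["type"])
                           for step in group["steps"]]
                for future in futures:
                    future.result()  # re-raises the step's exception, if any
            executed.extend(step["type"] for step in group["steps"])
    return executed
```

Sequencing lives entirely in the plan file, so changing the order of a migration is a YAML diff, not a code change.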
Domain-First Architecture
Business domains as code modules
Each business domain (Foundation, Pre-Sales, CPQ, Orders, Contracts) is a separate Python module with registered steps. This keeps business logic organized and allows teams to own their domain without conflicts.
```python
@StepRegistry.register("domain.entity_primary")
def migrate_entities(env, batch_id, **kwargs):
    # Load from staging view
    df = read_view(env, "STG_ENTITY_VIEW")
    # Upsert to target via Bulk API
    bulk_upsert(env, "TargetObject", df)
```
Adapter Pattern
Preserve existing contributor code
Rather than rewriting existing migration scripts, we wrapped them with a thin adapter shim. The framework calls the contributor's function, handling logging, retries, and sequencing automatically—zero rewrite required.
```yaml
# Plan references existing code
- type: domain.entity_primary
  args:
    adapter:
      module: "adapters.team.migrate_entity"
      fn: "run"
```
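The shim behind that config is essentially dynamic import plus a managed call. A minimal sketch (the function names are hypothetical; in the real framework this call would be wrapped with the logging and retry machinery):

```python
import importlib


def resolve_adapter(adapter_cfg):
    """Load a contributor's existing function from the plan's adapter
    config ({"module": ..., "fn": ...}) without modifying their code."""
    module = importlib.import_module(adapter_cfg["module"])
    return getattr(module, adapter_cfg["fn"])


def run_adapted_step(adapter_cfg, env, batch_id, **kwargs):
    """Framework-owned entry point: resolve the contributor function and
    invoke it with the standard step arguments."""
    fn = resolve_adapter(adapter_cfg)
    return fn(env=env, batch_id=batch_id, **kwargs)
```

Contributors keep ownership of their scripts; the framework only owns how and when they are called.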
Reconciliation Built-In
Audit-grade validation at every phase
After each phase, the engine runs validation hooks that compare source counts against attempted, successful, and failed loads. Referential integrity checks ensure all lookups resolve correctly before proceeding to dependent objects.
Automatic reconciliation output:

```json
{
  "source_count": 50000,
  "attempted": 50000,
  "success": 49995,
  "failed": 5,
  "ri_check": "PASS"
}
```
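The gate that produces such a report reduces to two checks: the counts must balance, and there must be no unresolved lookups. A sketch with illustrative names (`reconcile` and `ri_failures` are not the production API):

```python
def reconcile(source_count, attempted, success, failed, ri_failures=0):
    """Phase gate: counts must balance and all lookups must resolve
    before dependent objects are allowed to load."""
    return {
        "source_count": source_count,
        "attempted": attempted,
        "success": success,
        "failed": failed,
        "ri_check": "PASS" if ri_failures == 0 else "FAIL",
        "balanced": attempted == source_count and success + failed == attempted,
    }
```

Persisting each report to Snowflake is what turns the per-phase checks into the audit trail described later.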
Domain-first workflow
Object dependencies determine execution order. The framework respects these relationships and runs parallel groups where safe.
Foundation
Core entities, relationships
Pre-Sales
Lead objects, contacts
Sales
Pipeline objects
CPQ
Quote objects
Orders
Transaction records
Contracts
Agreement records
Activities
Historical records
10+ Foundation
- Primary entity records
- Related contact records
- Team assignments
- Reference data objects
- Cross-reference lookups
8+ CPQ
- Opportunity records
- Quote headers
- Line item groups
- Product configurations
- Pricing & discounts
6+ Post-Sales
- Order headers & lines
- Fulfillment records
- Contract records
- Agreement terms
- Recurring revenue objects
4+ Activities
- Task records
- Calendar events
- Activity history
- Communication logs
Serverless execution on AWS Fargate
The migration framework runs as containerized tasks on AWS Fargate, triggered by EventBridge schedules during approved production windows.
EventBridge
Scheduled trigger
03:00 EST daily
ECS Fargate
Serverless containers
Auto-scaling
Secrets Manager
Credentials injection
Zero secrets in code
Migration Engine
nv run --plan --env prod
Containerized Execution
Docker images built via CI/CD, scanned for vulnerabilities, and pushed to ECR. Fargate pulls the latest image on each scheduled run.
Window-Guarded Scheduling
EventBridge triggers at 03:00 EST. The engine validates it's within the approved window before executing—double-protection against off-hours runs.
Centralized Logging
All logs stream to CloudWatch with structured JSON format. Batch ID, phase, step, and timing in every log line for easy debugging and auditing.
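Attaching batch ID, phase, and step to every log line is straightforward with a custom `logging.Formatter`. A minimal sketch (the `JsonFormatter` class and its context fields are illustrative, not the production logger):

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line, with migration context on every record."""

    def __init__(self, batch_id, phase, step):
        super().__init__()
        self.context = {"batch_id": batch_id, "phase": phase, "step": step}

    def format(self, record):
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
            **self.context,
        }
        return json.dumps(payload)
```

Because every line is a complete JSON object, CloudWatch Logs Insights can filter and aggregate on `batch_id` or `step` directly.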
Zero Infrastructure Management
No servers to patch or maintain. Fargate handles provisioning, scaling, and teardown. Pay only for the compute time used during migration windows.
Measured outcomes
Migration completed on schedule with full data integrity and zero business disruption.
Orgs Consolidated
Into unified platform
Data Accuracy
Validated via reconciliation
Objects Migrated
With full RI checks
Business Disruption
Window-guarded execution
Unified Merchant View
All merchant data now lives in a single Salesforce org, providing a 360° view of customer relationships and eliminating data silos between business units.
Reusable Framework
The migration framework is now a reusable asset for future consolidations. Plan-driven architecture means new migrations require YAML changes, not code rewrites.
Complete Audit Trail
Every migration run is fully auditable: batch IDs, phase logs, reconciliation counts, and error records persisted to Snowflake for compliance and debugging.
Operational Efficiency
Reduced Salesforce license costs, eliminated duplicate integrations, and streamlined reporting with a single source of truth for merchant data.
What we used
Orchestration
Data Platform
Salesforce
Infrastructure
Observability
CI/CD
Planning a data migration?
We build migration frameworks that are repeatable, auditable, and production-ready. Let's talk about your consolidation challenge.