
Enterprise Salesforce Org Consolidation

Built a production-grade data migration framework to consolidate merchant data across multiple Salesforce organizations into a unified platform—with full data lineage, automated validation, and zero business disruption during the migration window.

Industry Food Delivery / Tech
Timeline 4 months
Engagement Project Delivery
4+ Orgs Consolidated
99.9% Data Accuracy
Zero Business Disruption

Multiple orgs, fragmented data, operational complexity

The merchant business had grown through acquisitions and organic expansion, resulting in multiple Salesforce organizations with overlapping data models, inconsistent field mappings, and no unified view of merchant relationships.

The consolidation required migrating millions of records across complex object hierarchies—primary entities, related contacts, pipeline objects, quotes, orders, contracts, and historical activities—while maintaining referential integrity and ensuring zero disruption to active business operations.

  • Complex Object Dependencies

    40+ objects with intricate parent-child relationships requiring precise sequencing

  • Limited Migration Window

    Production migrations restricted to 03:00–06:00 EST to minimize business impact

  • Data Quality Variance

    Inconsistent field formats, duplicate records, and missing referential keys across source orgs

Before: Fragmented Landscape
Source Org A
Source Org B
Source Org C
Source Org D
↓ Consolidation
Unified Target Org

Plan-driven migration framework

A modular Python framework orchestrated by Airflow, with Snowflake as the staging layer and Salesforce Bulk API v2 for high-throughput loads.

Source Systems (Figment / MINT)
↓
Snowflake RAW (Landing Zone)
↓
dbt Models (Transform Layer)
↓
Migration Engine (Python + Airflow)
↓
Salesforce (Bulk API v2)

Core Engine

  • Step registry with decorators (sketched below)
  • DAG runner with sequencing
  • Window guard (production hours)
  • Retry with exponential backoff
  • Validation hooks per phase
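
For illustration, a minimal sketch of how the step registry might work; the class and method names are assumptions consistent with the @StepRegistry.register usage shown later.

# Hypothetical sketch of the step registry; names are illustrative.
from typing import Callable, Dict

class StepRegistry:
    _steps: Dict[str, Callable] = {}

    @classmethod
    def register(cls, step_type: str):
        """Register a migration step under a dotted type name from the plan."""
        def decorator(fn: Callable) -> Callable:
            cls._steps[step_type] = fn
            return fn
        return decorator

    @classmethod
    def get(cls, step_type: str) -> Callable:
        return cls._steps[step_type]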

Connectors

  • Snowflake read/write helpers
  • Salesforce Bulk API v2 client (sketched below)
  • Error table persistence
  • CDC incremental patterns
  • Rate limit handling
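
As a sketch of what the connector wraps, here is the standard Bulk API 2.0 ingest flow using plain requests; the API version, object, and field names are illustrative, and the production client also handled polling, rate limits, and error-table persistence.

# Illustrative Bulk API 2.0 ingest flow; version and names are assumptions.
import requests

def bulk_upsert_csv(instance_url, token, sobject, external_id, csv_payload):
    headers = {"Authorization": f"Bearer {token}",
               "Content-Type": "application/json"}
    base = f"{instance_url}/services/data/v58.0/jobs/ingest"

    # 1. Create an ingest job for the target object.
    job = requests.post(base, headers=headers, json={
        "object": sobject,
        "operation": "upsert",
        "externalIdFieldName": external_id,
        "contentType": "CSV",
        "lineEnding": "LF",
    }).json()

    # 2. Upload the CSV batch data.
    requests.put(f"{base}/{job['id']}/batches", data=csv_payload.encode(),
                 headers={"Authorization": f"Bearer {token}",
                          "Content-Type": "text/csv"})

    # 3. Close the job so Salesforce starts processing.
    requests.patch(f"{base}/{job['id']}", headers=headers,
                   json={"state": "UploadComplete"})
    return job["id"]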

Utilities

  • YAML config loader
  • JSON structured logging
  • Field mapping engine
  • Lookup cache (RecordTypes, sketched below)
  • Audit logger to Snowflake
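
A small sketch of the RecordType lookup cache, assuming simple-salesforce; the function name and shape are ours, but caching once per run avoids a query per record.

# Illustrative sketch; function name and return shape are assumptions.
from functools import lru_cache
from simple_salesforce import Salesforce

@lru_cache(maxsize=1)
def record_type_ids(sf: Salesforce) -> dict:
    """Cache RecordType Ids once per run so field mappings avoid per-row queries."""
    rows = sf.query_all(
        "SELECT Id, SobjectType, DeveloperName FROM RecordType"
    )["records"]
    return {(r["SobjectType"], r["DeveloperName"]): r["Id"] for r in rows}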

How we solved it

Technical decisions that enabled reliable, auditable, and repeatable migrations at scale.

Plan-Driven Execution

YAML plans as single source of truth

Migration plans define phases, parallel groups, and step dependencies in declarative YAML. The engine reads the plan and executes steps in the correct order, so reviews focus on the plan file, not scattered code.

phases:
  - name: "Phase_1"
    parallel_groups:
      - name: "Group_A"
        steps:
          - { type: domain.entity_primary }
          - { type: domain.entity_related }
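
A minimal sketch of how the engine might execute such a plan, reusing the StepRegistry sketched earlier and assuming PyYAML; treating groups as the unit of parallelism is our reading of the plan format.

# Hypothetical plan runner; retries and validation hooks omitted for brevity.
from concurrent.futures import ThreadPoolExecutor
import yaml

def run_group(group, env, batch_id):
    # Steps inside a group run in order; a failure stops the group.
    for step in group["steps"]:
        fn = StepRegistry.get(step["type"])
        fn(env=env, batch_id=batch_id, **step.get("args", {}))

def run_plan(path, env, batch_id):
    with open(path) as f:
        plan = yaml.safe_load(f)
    for phase in plan["phases"]:
        # Groups within a phase run in parallel; phases run sequentially.
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(run_group, g, env, batch_id)
                       for g in phase["parallel_groups"]]
            for fut in futures:
                fut.result()  # surface any step failure before the next phase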

Domain-First Architecture

Business domains as code modules

Each business domain (Foundation, Pre-Sales, CPQ, Orders, Contracts) is a separate Python module with registered steps. This keeps business logic organized and allows teams to own their domain without conflicts.

@StepRegistry.register("domain.entity_primary")
def migrate_entities(env, batch_id, **kwargs):
    # Load from staging view
    df = read_view(env, "STG_ENTITY_VIEW")
    # Upsert to target via Bulk API
    bulk_upsert(env, "TargetObject", df)

Adapter Pattern

Preserve existing contributor code

Rather than rewriting existing migration scripts, we wrapped them with a thin adapter shim. The framework calls the contributor's function, handling logging, retries, and sequencing automatically—zero rewrite required.

# Plan references existing code
- type: domain.entity_primary
  args:
    adapter:
      module: "adapters.team.migrate_entity"
      fn: "run"

Reconciliation Built-In

Audit-grade validation at every phase

After each phase, the engine runs validation hooks: source counts vs. attempted vs. success vs. failed. Referential integrity checks ensure all lookups resolve correctly before proceeding to dependent objects.

# Automatic reconciliation output
{
  "source_count": 50000,
  "attempted": 50000,
  "success": 49995,
  "failed": 5,
  "ri_check": "PASS"
}
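
For illustration, a reconciliation hook along these lines could produce that output; the count and RI helpers here are hypothetical framework internals.

# Sketch of a post-phase reconciliation hook; helper functions are hypothetical.
def reconcile_phase(env, batch_id, step_name):
    source = count_source_rows(env, step_name)            # staging view count
    attempted, success, failed = load_result_counts(env, batch_id, step_name)
    ri_ok = referential_integrity_check(env, step_name)   # all lookups resolve

    report = {
        "source_count": source,
        "attempted": attempted,
        "success": success,
        "failed": failed,
        "ri_check": "PASS" if ri_ok else "FAIL",
    }
    if source != attempted or not ri_ok:
        raise RuntimeError(f"Reconciliation failed for {step_name}: {report}")
    return report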

Domain-first workflow

Object dependencies determine execution order. The framework respects these relationships and runs parallel groups where safe.

Phase 1: Foundation (core entities, relationships)
Phase 2: Pre-Sales (lead objects, contacts)
Phase 3: Sales (pipeline objects)
Phase 4: CPQ (quote objects)
Phase 5: Orders (transaction records)
Phase 6: Contracts (agreement records)
Phase 7: Activities (historical records)

Foundation (10+ objects)

  • Primary entity records
  • Related contact records
  • Team assignments
  • Reference data objects
  • Cross-reference lookups

CPQ (8+ objects)

  • Opportunity records
  • Quote headers
  • Line item groups
  • Product configurations
  • Pricing & discounts

Post-Sales (6+ objects)

  • Order headers & lines
  • Fulfillment records
  • Contract records
  • Agreement terms
  • Recurring revenue objects

Activities (4+ objects)

  • Task records
  • Calendar events
  • Activity history
  • Communication logs

Serverless execution on AWS Fargate

The migration framework runs as containerized tasks on AWS Fargate, triggered by EventBridge schedules during approved production windows.

  • EventBridge: scheduled trigger, 03:00 EST daily
  • ECS Fargate: serverless containers, auto-scaling
  • Secrets Manager: credentials injection, zero secrets in code
  • Migration Engine: nv run --plan --env prod

Containerized Execution

Docker images built via CI/CD, scanned for vulnerabilities, and pushed to ECR. Fargate pulls the latest image on each scheduled run.

Window-Guarded Scheduling

EventBridge triggers at 03:00 EST. The engine also validates that it is within the approved window before executing, providing double protection against off-hours runs.
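
A minimal sketch of that in-engine check, assuming Python's zoneinfo and a fixed 03:00-06:00 America/New_York window; the exact boundaries would be configurable in practice.

# Hypothetical window guard; boundaries hard-coded here for illustration.
from datetime import datetime, time
from zoneinfo import ZoneInfo

def within_window(now=None):
    """Return True only inside the approved 03:00-06:00 New York window."""
    now = now or datetime.now(ZoneInfo("America/New_York"))
    return time(3, 0) <= now.time() < time(6, 0)

# Guard executed before any production step runs (illustrative):
# if env == "prod" and not within_window():
#     raise RuntimeError("Outside approved migration window; aborting run.")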

Centralized Logging

All logs stream to CloudWatch in structured JSON format, with batch ID, phase, step, and timing in every log line for easy debugging and auditing.
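
One way such a formatter could look, sketched with the standard logging module; the field names are assumptions based on the description above.

# Illustrative JSON log formatter; field names are assumptions.
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line with migration context attached."""
    def format(self, record):
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "msg": record.getMessage(),
            "batch_id": getattr(record, "batch_id", None),
            "phase": getattr(record, "phase", None),
            "step": getattr(record, "step", None),
        }
        return json.dumps(payload)

# Usage: logger.info("step complete",
#                    extra={"batch_id": bid, "phase": "Phase_1", "step": "entity_primary"})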

Zero Infrastructure Management

No servers to patch or maintain. Fargate handles provisioning, scaling, and teardown. Pay only for the compute time used during migration windows.

Measured outcomes

Migration completed on schedule with full data integrity and zero business disruption.

  • 4+ Orgs Consolidated (into a unified platform)
  • 99.9% Data Accuracy (validated via reconciliation)
  • 40+ Objects Migrated (with full RI checks)
  • Zero Business Disruption (window-guarded execution)

Unified Merchant View

All merchant data now lives in a single Salesforce org, providing a 360° view of customer relationships and eliminating data silos between business units.

Reusable Framework

The migration framework is now a reusable asset for future consolidations. Plan-driven architecture means new migrations require YAML changes, not code rewrites.

Complete Audit Trail

Every migration run is fully auditable: batch IDs, phase logs, reconciliation counts, and error records persisted to Snowflake for compliance and debugging.

Operational Efficiency

Reduced Salesforce license costs, eliminated duplicate integrations, and streamlined reporting with a single source of truth for merchant data.

What we used

Orchestration: Apache Airflow, Python 3.11, Click CLI

Data Platform: Snowflake, dbt, Parquet

Salesforce: Bulk API v2, simple-salesforce, CPQ

Infrastructure: AWS ECS Fargate, EventBridge, Secrets Manager

Observability: CloudWatch Logs, JSON Logging, Audit Tables

CI/CD: GitHub Actions, Docker, ECR

Planning a data migration?

We build migration frameworks that are repeatable, auditable, and production-ready. Let's talk about your consolidation challenge.