Enterprise Salesforce Org Consolidation
Built a production-grade data migration framework to consolidate merchant data across multiple Salesforce organizations into a unified platform—with full data lineage, automated validation, and zero business disruption during the migration window.
Multiple orgs, fragmented data, operational complexity
The merchant business had grown through acquisitions and organic expansion, resulting in multiple Salesforce organizations with overlapping data models, inconsistent field mappings, and no unified view of merchant relationships.
The consolidation required migrating millions of records across complex object hierarchies—primary entities, related contacts, pipeline objects, quotes, orders, contracts, and historical activities—while maintaining referential integrity and ensuring zero disruption to active business operations.
Complex Object Dependencies
40+ objects with intricate parent-child relationships requiring precise sequencing
Limited Migration Window
Production migrations restricted to 03:00–06:00 EST to minimize business impact
Data Quality Variance
Inconsistent field formats, duplicate records, and missing referential keys across source orgs
Plan-driven migration framework
A modular Python framework orchestrated by Airflow, with Snowflake as the staging layer and Salesforce Bulk API v2 for high-throughput loads.
Source Systems
Figment / MINT
Snowflake RAW
Landing Zone
dbt Models
Transform Layer
Migration Engine
Python + Airflow
Salesforce
Bulk API v2
Core Engine
- Step registry with decorators
- DAG runner with sequencing
- Window guard (production hours)
- Retry with exponential backoff
- Validation hooks per phase
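The core-engine pieces listed above can be sketched in a few lines. This is a minimal illustration, not the production implementation; `StepRegistry`, `within_window`, and `with_retry` are hypothetical names chosen to mirror the bullets:

```python
import datetime
import functools
import time
from zoneinfo import ZoneInfo


class StepRegistry:
    """Maps step-type strings (referenced from YAML plans) to callables."""
    _steps = {}

    @classmethod
    def register(cls, step_type):
        def decorator(fn):
            cls._steps[step_type] = fn
            return fn
        return decorator

    @classmethod
    def get(cls, step_type):
        return cls._steps[step_type]


def within_window(now=None, start_hour=3, end_hour=6):
    """Window guard: only allow production runs between 03:00 and 06:00 Eastern."""
    now = now or datetime.datetime.now(ZoneInfo("America/New_York"))
    return start_hour <= now.hour < end_hour


def with_retry(max_attempts=3, base_delay=1.0):
    """Retry decorator with exponential backoff: 1s, 2s, 4s, ..."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise
                    time.sleep(base_delay * 2 ** attempt)
        return wrapper
    return decorator
```

Keeping these concerns in decorators means individual migration steps stay free of scheduling and retry boilerplate.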
Connectors
- Snowflake read/write helpers
- Salesforce Bulk API v2 client
- Error table persistence
- CDC incremental patterns
- Rate limit handling
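The Bulk API v2 ingest flow the connector wraps is: create a job, upload a CSV body, then mark the job `UploadComplete`. A rough sketch of that flow (the `BulkV2Client` wrapper and its method names are illustrative; the endpoints shown are the documented Bulk API 2.0 ingest routes):

```python
import csv
import io


def records_to_csv(records):
    """Serialize a list of dicts into the CSV body Bulk API v2 ingest expects."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


class BulkV2Client:
    """Thin wrapper over the Bulk API v2 ingest flow:
    create job -> upload CSV -> mark UploadComplete."""

    def __init__(self, instance_url, access_token, api_version="v58.0"):
        self.base = f"{instance_url}/services/data/{api_version}/jobs/ingest"
        self.headers = {"Authorization": f"Bearer {access_token}"}

    def upsert(self, sobject, external_id_field, records):
        # Imported lazily so the pure CSV helper above has no dependency.
        import requests

        job = requests.post(
            self.base,
            headers={**self.headers, "Content-Type": "application/json"},
            json={"object": sobject, "operation": "upsert",
                  "externalIdFieldName": external_id_field},
        ).json()
        requests.put(
            f"{self.base}/{job['id']}/batches",
            headers={**self.headers, "Content-Type": "text/csv"},
            data=records_to_csv(records),
        )
        requests.patch(
            f"{self.base}/{job['id']}",
            headers={**self.headers, "Content-Type": "application/json"},
            json={"state": "UploadComplete"},
        )
        return job["id"]
```

Error-table persistence and rate-limit handling would layer on top of this, polling the job's `failedResults` endpoint after completion.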
Utilities
- YAML config loader
- JSON structured logging
- Field mapping engine
- Lookup cache (RecordTypes)
- Audit logger to Snowflake
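The field-mapping engine can be reduced to a small, testable core: a declarative map of source fields to target fields, plus optional per-field transforms. A minimal sketch with hypothetical names:

```python
def apply_field_mapping(record, mapping, transforms=None):
    """Translate one source record into target-field shape.

    mapping:    {source_field: target_field}
    transforms: optional {target_field: callable} for per-field cleanup
    """
    transforms = transforms or {}
    out = {}
    for src, tgt in mapping.items():
        value = record.get(src)
        if tgt in transforms:
            value = transforms[tgt](value)
        out[tgt] = value
    return out
```

Because the mapping is plain data, it can live alongside the YAML plan and be reviewed without reading code.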
How we solved it
Technical decisions that enabled reliable, auditable, and repeatable migrations at scale.
Plan-Driven Execution
YAML plans as single source of truth
Migration plans define phases, parallel groups, and step dependencies in declarative YAML. The engine reads the plan and executes steps in the correct order—making reviews focused on the plan file, not scattered code.
```yaml
phases:
  - name: "Phase_1"
    parallel_groups:
      - name: "Group_A"
        steps:
          - { type: domain.entity_primary }
          - { type: domain.entity_related }
```
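An engine that walks such a plan is small: phases execute in order, and steps inside a parallel group run concurrently. A sketch under those assumptions (`run_plan` takes the already-parsed plan dict, e.g. from `yaml.safe_load`; the name is illustrative):

```python
from concurrent.futures import ThreadPoolExecutor


def run_plan(plan, step_fn):
    """Walk a parsed plan: phases in order, steps inside a parallel group
    concurrently. Any step failure stops the run before the next group."""
    executed = []
    for phase in plan["phases"]:
        for group in phase["parallel_groups"]:
            with ThreadPoolExecutor() as pool:
                futures = [pool.submit(step_fn, step["type"])
                           for step in group["steps"]]
                for future in futures:
                    future.result()  # re-raises the step's exception, if any
            executed.extend(step["type"] for step in group["steps"])
    return executed
```

Sequencing lives entirely in the plan file, so changing the order of a migration is a YAML diff, not a code change.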
Domain-First Architecture
Business domains as code modules
Each business domain (Foundation, Pre-Sales, CPQ, Orders, Contracts) is a separate Python module with registered steps. This keeps business logic organized and allows teams to own their domain without conflicts.
```python
@StepRegistry.register("domain.entity_primary")
def migrate_entities(env, batch_id, **kwargs):
    # Load from staging view
    df = read_view(env, "STG_ENTITY_VIEW")
    # Upsert to target via Bulk API
    bulk_upsert(env, "TargetObject", df)
```
Adapter Pattern
Preserve existing contributor code
Rather than rewriting existing migration scripts, we wrapped them with a thin adapter shim. The framework calls the contributor's function, handling logging, retries, and sequencing automatically—zero rewrite required.
```yaml
# Plan references existing code
- type: domain.entity_primary
  args:
    adapter:
      module: "adapters.team.migrate_entity"
      fn: "run"
```
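The shim behind that config is essentially dynamic import plus a managed call. A minimal sketch (the function names are hypothetical; in the real framework this call would be wrapped with the logging and retry machinery):

```python
import importlib


def resolve_adapter(adapter_cfg):
    """Load a contributor's existing function from the plan's adapter
    config ({"module": ..., "fn": ...}) without modifying their code."""
    module = importlib.import_module(adapter_cfg["module"])
    return getattr(module, adapter_cfg["fn"])


def run_adapted_step(adapter_cfg, env, batch_id, **kwargs):
    """Framework-owned entry point: resolve the contributor function and
    invoke it with the standard step arguments."""
    fn = resolve_adapter(adapter_cfg)
    return fn(env=env, batch_id=batch_id, **kwargs)
```

Contributors keep ownership of their scripts; the framework only owns how and when they are called.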
Reconciliation Built-In
Audit-grade validation at every phase
After each phase, the engine runs validation hooks that compare source counts against attempted, successful, and failed loads. Referential integrity checks ensure all lookups resolve correctly before proceeding to dependent objects.
Automatic reconciliation output:

```json
{
  "source_count": 50000,
  "attempted": 50000,
  "success": 49995,
  "failed": 5,
  "ri_check": "PASS"
}
```
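The gate that produces such a report reduces to two checks: the counts must balance, and there must be no unresolved lookups. A sketch with illustrative names (`reconcile` and `ri_failures` are not the production API):

```python
def reconcile(source_count, attempted, success, failed, ri_failures=0):
    """Phase gate: counts must balance and all lookups must resolve
    before dependent objects are allowed to load."""
    return {
        "source_count": source_count,
        "attempted": attempted,
        "success": success,
        "failed": failed,
        "ri_check": "PASS" if ri_failures == 0 else "FAIL",
        "balanced": attempted == source_count and success + failed == attempted,
    }
```

Persisting each report to Snowflake is what turns the per-phase checks into the audit trail described later.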
Domain-first workflow
Object dependencies determine execution order. The framework respects these relationships and runs parallel groups where safe.
Foundation
Core entities, relationships
Pre-Sales
Lead objects, contacts
Sales
Pipeline objects
CPQ
Quote objects
Orders
Transaction records
Contracts
Agreement records
Activities
Historical records
10+ Foundation
- Primary entity records
- Related contact records
- Team assignments
- Reference data objects
- Cross-reference lookups
8+ CPQ
- Opportunity records
- Quote headers
- Line item groups
- Product configurations
- Pricing & discounts
6+ Post-Sales
- Order headers & lines
- Fulfillment records
- Contract records
- Agreement terms
- Recurring revenue objects
4+ Activities
- Task records
- Calendar events
- Activity history
- Communication logs
Serverless execution on AWS Fargate
The migration framework runs as containerized tasks on AWS Fargate, triggered by EventBridge schedules during approved production windows.
EventBridge
Scheduled trigger
03:00 EST daily
ECS Fargate
Serverless containers
Auto-scaling
Secrets Manager
Credentials injection
Zero secrets in code
Migration Engine
nv run --plan --env prod
Containerized Execution
Docker images built via CI/CD, scanned for vulnerabilities, and pushed to ECR. Fargate pulls the latest image on each scheduled run.
Window-Guarded Scheduling
EventBridge triggers at 03:00 EST. The engine validates it's within the approved window before executing—double-protection against off-hours runs.
Centralized Logging
All logs stream to CloudWatch with structured JSON format. Batch ID, phase, step, and timing in every log line for easy debugging and auditing.
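Attaching batch ID, phase, and step to every log line is straightforward with a custom `logging.Formatter`. A minimal sketch (the `JsonFormatter` class and its context fields are illustrative, not the production logger):

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line, with migration context on every record."""

    def __init__(self, batch_id, phase, step):
        super().__init__()
        self.context = {"batch_id": batch_id, "phase": phase, "step": step}

    def format(self, record):
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
            **self.context,
        }
        return json.dumps(payload)
```

Because every line is a complete JSON object, CloudWatch Logs Insights can filter and aggregate on `batch_id` or `step` directly.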
Zero Infrastructure Management
No servers to patch or maintain. Fargate handles provisioning, scaling, and teardown. Pay only for the compute time used during migration windows.
Measured outcomes
Migration completed on schedule with full data integrity and zero business disruption.
Orgs Consolidated
Into unified platform
Data Accuracy
Validated via reconciliation
Objects Migrated
With full RI checks
Business Disruption
Window-guarded execution
Unified Merchant View
All merchant data now lives in a single Salesforce org, providing a 360° view of customer relationships and eliminating data silos between business units.
Reusable Framework
The migration framework is now a reusable asset for future consolidations. Plan-driven architecture means new migrations require YAML changes, not code rewrites.
Complete Audit Trail
Every migration run is fully auditable: batch IDs, phase logs, reconciliation counts, and error records persisted to Snowflake for compliance and debugging.
Operational Efficiency
Reduced Salesforce license costs, eliminated duplicate integrations, and streamlined reporting with a single source of truth for merchant data.
What we used
Orchestration
Data Platform
Salesforce
Infrastructure
Observability
CI/CD
Planning a data migration?
We build migration frameworks that are repeatable, auditable, and production-ready. Let's talk about your consolidation challenge.