An automated cloud data platform connecting 5+ source systems into a unified data lake with quality gates, delta detection, and a medallion architecture, powering real-time dashboards for a large professional services firm.
Azure Data Factory · Data Lake Gen2 · Parquet · SQL Server · REST APIs · Power BI · Key Vault · Self-Hosted IR
Before: Data Trapped in Silos
Critical business data was scattered across 5+ disconnected systems. Finance teams manually exported spreadsheets, and reports were built by copy-pasting into Excel: slow, error-prone, and not scalable.
Manual Exports
Finance manually downloading spreadsheets from the ERP every week
Survey Copy-Paste
Client feedback manually downloaded and reformatted
No Automation
People & Culture had no automated access to partner data
Result: No single source of truth, stale data, Excel-based reporting, and no firm-wide visibility.
Sources: 5 Connected Systems
The platform ingests data daily from 5 source systems: cloud APIs, on-premises databases, and file servers.
Survey Platform API
Client surveys & feedback via REST API with async export polling
Finance ERP
Actuals, budgets, org structure via SQL Server
Practice Management
Partners & clients via SQL Server
On-Prem File Servers
Excel & CSV exports via Self-Hosted IR
Legacy ERP
Org structure from legacy accounting system
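The async export polling used for the Survey Platform API can be sketched as below. This is a minimal illustration, not the platform's actual code: the status values ("in_progress", "complete", "failed"), the function name poll_export, and the retry settings are all assumptions, and the real HTTP call is replaced by a caller-supplied function.

```python
import time
from typing import Callable


def poll_export(get_status: Callable[[], str],
                max_attempts: int = 30,
                delay_seconds: float = 2.0,
                sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll an export job until it completes; return the number of checks made.

    `get_status` is a hypothetical callable that wraps the real status request.
    """
    for attempt in range(1, max_attempts + 1):
        status = get_status()
        if status == "complete":
            return attempt
        if status == "failed":
            raise RuntimeError("export failed")
        # Still running: wait before asking again.
        sleep(delay_seconds)
    raise TimeoutError(f"export not complete after {max_attempts} checks")
```

Passing the sleep function in as a parameter keeps the loop easy to test without real waiting.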
Pipeline: Azure Data Factory Orchestration
Three business domain pipelines run in parallel, each with quality gates, delta detection, and domain-specific logic.
Finance
Row Count → Quality Gate ±15% → Copy to Parquet → Delta Hash → Dim & Fact Models
Clients & Markets
List Surveys → ForEach ×4 → Start Export → Poll Status → Download ZIP
People & Culture
Extract Partners → Copy to Parquet → Delta Processing
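The quality gate and delta hash steps in the Finance pipeline could look roughly like the sketch below. It assumes the ±15% gate compares the current row count with the previous load, and that delta detection hashes each row and compares it with a stored hash per business key. Neither detail is confirmed by the description above, and every function name here is hypothetical.

```python
import hashlib
from typing import Any, Dict, Iterable, List


def passes_quality_gate(current_rows: int, previous_rows: int,
                        tolerance: float = 0.15) -> bool:
    """Return True when the row count moved by at most `tolerance` (15%)."""
    if previous_rows == 0:
        # Assumption: with no baseline there is nothing to compare against.
        return True
    change = abs(current_rows - previous_rows) / previous_rows
    return change <= tolerance


def row_hash(row: Dict[str, Any]) -> str:
    """Hash a row's columns in sorted order so that key order does not matter."""
    canonical = "|".join(f"{name}={row[name]}" for name in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def detect_changes(rows: Iterable[Dict[str, Any]], key: str,
                   known_hashes: Dict[Any, str]) -> List[Dict[str, Any]]:
    """Return rows that are new or whose hash differs from the stored one."""
    return [row for row in rows if known_hashes.get(row[key]) != row_hash(row)]
```

Only the rows returned by detect_changes would need to flow on to the dimension and fact models; unchanged rows can be skipped.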
Data Lake: Medallion Architecture
Data lands in a three-tier medallion architecture: Bronze (raw), Silver (modelled), Gold (analytics-ready).
Bronze: Raw / Landing
Date-partitioned raw data in original format
Parquet · JSON · CSV · Date-partitioned
Silver: Cleansed & Modelled
Star schema with dimension and fact tables, delta-detected
Dim_OrgStructure · Dim_Dates · Fact_Actuals · Fact_Budget
Gold: Analytics Ready
Optimised views powering real-time Power BI dashboards across the firm
Power BI · Daily @ 2:00 AM AEST · Firm-wide KPIs
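As an illustration of the date partitioning in the Bronze tier, the sketch below builds a landing path from a load date. The folder layout (bronze/domain/dataset/year=/month=/day=) and the function name bronze_path are assumptions chosen for the example, not the platform's documented layout.

```python
from datetime import date
from pathlib import PurePosixPath


def bronze_path(domain: str, dataset: str, load_date: date,
                extension: str = "parquet") -> str:
    """Build a date-partitioned landing path for one daily load."""
    return str(PurePosixPath(
        "bronze", domain, dataset,
        f"year={load_date:%Y}",
        f"month={load_date:%m}",
        f"day={load_date:%d}",
        f"{dataset}.{extension}",
    ))


print(bronze_path("finance", "actuals", date(2024, 1, 15)))
# bronze/finance/actuals/year=2024/month=01/day=15/actuals.parquet
```

Writing each daily load to its own folder keeps earlier loads intact, so a single day can be reprocessed without touching the rest.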
Results: Architecture & Impact
Automated daily data platform replacing manual Excel-based reporting across the entire firm.
Azure Data Factory
Orchestration engine with parallel pipeline execution