
Cloud & Data Engineer | AWS | Data Vault 2.0 | Python | SQL

  • on request
  • 12207 Berlin
  • National
  • Urdu  |  German  |  English
  • 10.01.2026
  • Contract ready

Brief introduction

Cloud Architect & Data Engineer specializing in AWS/Azure migrations, Databricks, and enterprise data platforms. Expert in data modeling, governance, IaC (Terraform), and DevOps. Project management for cloud transformations.

Business details

Freelance
Tax number on record
Professional liability insurance active

Qualifications

  • Amazon Web Services (AWS) – 4 yrs
  • Data Vault 2.0
  • Python – 2 yrs
  • Apache Kafka – 1 yr
  • API Developer – 3 yrs
  • Big Data
  • Cloud Services
  • Cloud Computing – 2 yrs
  • Cloud Specialist – 3 yrs
  • Data Engineer – 1 yr
  • Data Warehousing – 5 yrs
  • Databricks – 3 yrs
  • Database Analyst – 1 yr
  • Data Management – 2 yrs
  • dbt
  • ETL – 4 yrs
  • Java Developer – 1 yr
  • MicroStrategy – 2 yrs
  • Oracle Database – 2 yrs
  • IT Project Management – 2 yrs
  • Python Programmer – 4 yrs
  • SAP BusinessObjects (BO) – 4 yrs
  • SqlDBM

Project & work experience

Project Data Manager / Data Engineer
Circle K, Berlin
4/2025 – 7/2025 (4 Monate)
Oil and gas industry

Project description

Architected and executed an AWS-to-multicloud migration of Circle K's data platform (AWS/Azure Databricks), designing cross-cloud data transfer solutions that modernized the infrastructure while maintaining zero-downtime operations for 600+ retail locations.
Cross-Cloud Migration Architecture:

Architected an S3-to-Azure Blob pipeline via AWS Databricks for cross-cloud data transfer from the legacy AWS estate to Circle K's Azure environment (see the sketch below)
Defined a phased migration strategy that kept a single legacy AWS account for synchronization while workloads transitioned to Circle K environments
Collaborated with Circle K metastore admins on Unity Catalog volumes, defining schema structures and access patterns
Migrated from low-level Spark RDDs to high-level DataFrame APIs, enabling Photon acceleration for 3-5x performance gains
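
The cross-cloud copy follows the pattern in this minimal PySpark sketch; bucket, container, and column names are hypothetical, and it assumes the cluster already holds credentials for both clouds.

    # Minimal sketch of the S3 -> Azure Blob (ABFS) transfer pattern.
    # Paths and column names are hypothetical placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Read the legacy dataset from S3 with the high-level DataFrame API
    # (Photon accelerates DataFrame plans; RDD code paths do not qualify).
    df = spark.read.format("delta").load("s3://legacy-pos-bucket/pos/daily/")

    # Keep transformations vectorizable so the plan stays Photon-friendly.
    df = df.dropDuplicates(["transaction_id"]).filter("amount IS NOT NULL")

    # Write to Azure storage via the abfss:// scheme; Delta keeps the target
    # transactional, so consumers never observe partial loads.
    (df.write.format("delta")
       .mode("append")
       .save("abfss://pos@targetdatalake.dfs.core.windows.net/pos/daily/"))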

Infrastructure as Code & Automation:

Implemented a Terraform Redshift stack and modules with RA3 node configuration
Implemented Redshift-native schedules with associated IAM roles for automated ETL workflows (see the sketch below)
Managed database scripts for database and schema definitions and user access management
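
For illustration, a sketch of the kind of statement such a schedule fires, expressed here through the boto3 Redshift Data API; the cluster, database, user, and procedure names are hypothetical stand-ins.

    # Sketch of a scheduled ETL statement against Redshift via the Data API.
    # Identifiers are hypothetical; in the project, Redshift-native schedules
    # with dedicated IAM roles triggered the equivalent statement.
    import boto3

    client = boto3.client("redshift-data")
    response = client.execute_statement(
        ClusterIdentifier="analytics-ra3",   # hypothetical RA3 cluster
        Database="dwh",
        DbUser="etl_scheduler",              # DB user mapped to the IAM role
        Sql="CALL etl.load_daily_pos();",    # hypothetical ETL procedure
    )
    print(response["Id"])  # statement id, usable for status polling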

Data Pipeline Modernization:

Refactored multiple Python notebooks from AWS Databricks to Azure Databricks, optimizing for the Photon accelerator
Implemented Unity Catalog Delta tables for daily POS data processing with automatic schema evolution
Created weekly KPI reports using Databricks views, analyzing product volumes and sales metrics
Migrated data pipelines from sequential processing to parallel Spark operations, reducing runtime by 60% (see the sketch below)
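
The refactor pattern, as a hedged sketch with hypothetical paths: replace a driver-side loop that launches one Spark job per store with a single job over all store paths.

    # Sketch of the sequential-to-parallel refactor; paths are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    stores_df = spark.read.format("delta").load("/mnt/ref/stores")
    store_ids = [r.store_id for r in stores_df.select("store_id").collect()]

    # Sequential anti-pattern: one Spark job per store.
    #   for store in store_ids:
    #       spark.read.parquet(f"/mnt/raw/pos/{store}/").write...

    # Parallel version: a single job over all store paths lets Spark
    # schedule the reads and writes across the whole cluster.
    paths = [f"/mnt/raw/pos/{store}/" for store in store_ids]
    df = spark.read.parquet(*paths)
    df.write.format("delta").mode("append").saveAsTable("pos.daily_all_stores")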

Outcome: Successfully orchestrated a zero-downtime migration serving 6M+ daily transactions, delivered a scalable cross-cloud architecture supporting both AWS and Azure workloads, and reduced operational costs by 40% through infrastructure consolidation.

Technologies: Azure Databricks, AWS Databricks, Terraform, Unity Catalog, Delta Lake, Redshift (RA3), S3 Cross-Account Access, IAM AssumeRole, Photon Engine, PySpark, Azure Blob Storage

Skills used

Data Engineer, Amazon Web Services (AWS), Apache Spark, Cloud Specialist, Databricks, Microsoft Azure, Python Programmer

Lead Data Engineer - Data Platform for Appian
Client name anonymized, Frankfurt
11/2024 – ongoing (1 year, 5 months)
Public administration

Project description

Leading the design and implementation of enterprise data platform solutions, focusing on scalable data architecture patterns and data product development. Responsible for end-to-end solution architecture from conceptualization to hands-on implementation.
Enterprise Architecture & Solution Design:

Designed comprehensive data lake architecture supporting multiple data domains and products
Created architectural blueprints for event-driven data ingestion using AWS services
Established data integration patterns using API Gateway and Lambda for real-time data flows (see the sketch below)
Defined data product boundaries and interfaces following data mesh principles
Designed streaming and batch ETL architectures using AWS Glue
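
A minimal sketch of that API Gateway to Lambda ingestion path; the bucket name and payload fields are hypothetical, and validation is intentionally simplified.

    # Minimal sketch of the API Gateway -> Lambda ingestion path; bucket and
    # field names are hypothetical, and validation is deliberately simple.
    import base64
    import json
    import uuid

    import boto3

    s3 = boto3.client("s3")

    def handler(event, context):
        # API Gateway (proxy integration) delivers the body as a string.
        body = event.get("body") or "{}"
        if event.get("isBase64Encoded"):
            body = base64.b64decode(body).decode("utf-8")
        record = json.loads(body)

        # Structural validation before the record enters the lake.
        if "domain" not in record or "payload" not in record:
            return {"statusCode": 400,
                    "body": json.dumps({"error": "missing fields"})}

        # Land the event in the raw zone, partitioned by domain.
        key = f"raw/{record['domain']}/{uuid.uuid4()}.json"
        s3.put_object(Bucket="data-platform-raw",  # hypothetical bucket
                      Key=key,
                      Body=json.dumps(record).encode("utf-8"))
        return {"statusCode": 202, "body": json.dumps({"stored": key})}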

Data Platform Implementation:

Implemented serverless data processing pipelines using Python-based Lambda functions
Developed AWS Glue ETL jobs for structured and semi-structured data transformations (see the sketch below)
Built real-time data streaming solutions using Glue Streaming jobs
Created a Java-based Lambda function using Apache Tika for advanced file-type validation
Architected Aurora RDS PostgreSQL clusters with writer/reader instance optimization
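
A skeleton of such a Glue ETL job on Glue's managed Spark runtime; the catalog database, table, and output path are hypothetical.

    # Skeleton of a Glue ETL job of the kind described above; database,
    # table, and output path are hypothetical placeholders.
    import sys

    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    sc = SparkContext()
    glue_context = GlueContext(sc)
    spark = glue_context.spark_session
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read from the Glue Data Catalog, flatten, and write partitioned Parquet.
    dyf = glue_context.create_dynamic_frame.from_catalog(
        database="raw_zone", table_name="events")      # hypothetical refs
    df = dyf.toDF().select("event_id", "domain", "payload", "ingested_at")
    (df.write.mode("append")
       .partitionBy("domain")
       .parquet("s3://data-platform-curated/events/"))  # hypothetical target

    job.commit()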

Infrastructure as Code & Governance:

Implemented Terraform modules for automated infrastructure provisioning
Designed S3 data lake structure with lifecycle policies for cost optimization
Configured KMS encryption keys for data security and compliance
Established Aurora RDS PostgreSQL cluster setup with high availability
Created reusable infrastructure templates promoting consistency across data products

Technical Advisory & Best Practices:

Provided architectural guidance to data engineering teams
Established data modeling standards for the data platform
Defined API design patterns for data product interfaces
Mentored teams on AWS best practices and cloud-native architectures
Conducted architectural reviews ensuring alignment with enterprise standards

Outcome: Enabled a scalable data platform supporting multiple business domains, with improved data product delivery velocity and standardized architectural patterns.

Technologies: AWS (Lambda, Glue, S3, API Gateway, Aurora RDS), Python, Java, Terraform, PostgreSQL, Apache TIKA, Data Lake Architecture, Event-driven Architecture

Skills used

Apache Kafka, Database Analyst, API Developer, Java Developer, Python Programmer, Data Engineer, Infrastructure Architecture, Amazon Web Services (AWS)

Project Data Manager
Couche-Tard, Berlin
11/2023 – 3/2025 (1 year, 5 months)
Chemical industry

Project description

Managed cloud data infrastructure, focusing on Redshift optimization, Unity Catalog implementation, and AWS Databricks. Led performance tuning initiatives reducing query execution times across Power BI workloads.
Redshift Performance Engineering:

Implemented distribution keys and sort/interleaved sort keys based on query pattern analysis (see the sketch below)
Established Workload Management (WLM) queues with memory allocation optimization for concurrent Power BI refresh jobs and pipeline workloads
Configured RA3 node clusters with managed storage enabling independent compute/storage scaling
Created admin monitoring views for table statistics, query performance, and disk usage tracking
Advised on VACUUM strategies scheduled around pipeline runs, optimizing table performance
Implemented automatic snapshot schedules for recovery scenarios
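
A sketch of the query-pattern-driven physical design, executed here through the redshift_connector driver; the table, columns, and connection details are hypothetical.

    # Sketch of DISTKEY/SORTKEY design driven by query patterns; all
    # identifiers and credentials are hypothetical.
    import redshift_connector

    conn = redshift_connector.connect(
        host="analytics-ra3.example.eu-central-1.redshift.amazonaws.com",
        database="dwh", user="admin", password="***")
    cursor = conn.cursor()

    # Fact table co-located with its biggest join partner via DISTKEY; the
    # compound SORTKEY matches the dominant filter columns in BI queries.
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS sales_fact (
            sale_id      BIGINT,
            store_id     INTEGER,
            sale_date    DATE,
            amount       DECIMAL(12,2)
        )
        DISTKEY (store_id)
        COMPOUND SORTKEY (sale_date, store_id);
    """)
    conn.commit()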

Databricks Platform Establishment:

Secured AWS Databricks access from parent organization through governance approval process
Implemented Terraform modules for Unity Catalog setup with three-level namespace (catalog.schema.table)
Developed POCs for AWS Databricks adoption over Redshift, demonstrating performance and cost improvements using Delta tables
Built data pipelines for public and internal data sources using Spark RDD parallel processing
Built data quality monitoring covering completeness, uniqueness, and timeliness KPIs (see the sketch below)
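
A minimal PySpark sketch of those three KPIs; the Delta path and column names are hypothetical.

    # Sketch of completeness/uniqueness/timeliness KPIs over a Delta table;
    # the path and column names are hypothetical.
    from pyspark.sql import SparkSession
    import pyspark.sql.functions as F

    spark = SparkSession.builder.getOrCreate()
    df = spark.read.format("delta").load("/mnt/curated/pos_daily")

    total = df.count() or 1  # guard against an empty table
    kpis = {
        # completeness: share of rows with a non-null business key
        "completeness": df.filter(F.col("transaction_id").isNotNull()).count() / total,
        # uniqueness: distinct business keys vs. total rows
        "uniqueness": df.select("transaction_id").distinct().count() / total,
        # timeliness: share of rows ingested within 24h of the event
        "timeliness": df.filter(
            F.col("ingested_at") <= F.col("event_ts") + F.expr("INTERVAL 24 HOURS")
        ).count() / total,
    }
    print(kpis)  # in practice these feed dashboards and alerting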

AWS Infrastructure Management:

Managed subnet groups and CIDR block allocations for Redshift cluster isolation
Configured cluster resize operations from dc2.large to RA3.xlplus nodes with minimal downtime
Managed Power BI Gateway data sources and network firewall whitelisting between on-prem and cloud using FQDNs

BOXI to Cloud Data Integration:

Integrated SAP BOXI SFTP exports with the AWS cloud, using AWS Transfer Family and a file-fingerprint calculation approach (see the sketch below)
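
A sketch of the file-fingerprint idea: a streamed hash identifies changed or duplicate exports before loading. The path and chunk size are illustrative.

    # Sketch of the file-fingerprint approach used to detect changed or
    # duplicate SFTP exports; path and chunk size are illustrative.
    import hashlib

    def file_fingerprint(path: str, chunk_size: int = 1 << 20) -> str:
        """Streamed SHA-256, so large BOXI exports never load fully into memory."""
        digest = hashlib.sha256()
        with open(path, "rb") as fh:
            for chunk in iter(lambda: fh.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()

    # A new export is processed only if its fingerprint has not been seen,
    # e.g. by comparing against fingerprints recorded for previous loads.
    print(file_fingerprint("/data/inbound/boxi_export.csv"))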

Technologies: AWS Redshift (RA3), Databricks, Unity Catalog, Terraform, WLM, Distribution Keys, Interleaved Sort Keys, Power BI, S3, Spark RDD, Python, Delta Lake

Skills used

Data Manager, Amazon Web Services (AWS), Apache Spark, Cloud Specialist, Databricks, Power BI, Python Programmer, SAP BusinessObjects (BO)

Project Data Manager AWS (S3, Databricks)
TotalEnergies, Berlin
4/2022 – 6/2024 (2 years, 3 months)
Chemical industry

Project description

Advised on AWS cloud adoption, data management, and infrastructure architecture. Proposed architecture solutions, evaluating AWS building blocks against client needs, and supported project planning and team communications.
Project and Incident Management:

Coordinated with Cyber Security and Group Admins for secure implementations and incident management
Managed IT demands and change requests in ServiceNow
Handled GitHub team permissions
Facilitated Databricks GitHub app requests, managing repos for various environments

Data Lake Management and AWS Setup:

S3 Data Collection: Led sessions to define data categories for S3, per company classifications
AWS Transfer Family with Lambda: Implemented Transfer Family with Lambda for custom authentication (see the sketch below)
IAM Policies: Consulted on IAM policy setup for secure, cross-account S3 access
Redshift Setup: Managed Redshift cluster setup, consulting on keys and Copy Commands for optimization
Liquibase: Advised on Liquibase adoption for database versioning in CI/CD
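
A sketch of a Transfer Family custom-authentication Lambda; the user store, role ARN, and home directory are hypothetical, and a real setup would resolve credentials from a secret store.

    # Sketch of a Transfer Family custom-identity-provider Lambda; an empty
    # response signals "access denied". User store and ARNs are hypothetical.
    import os

    # Illustrative stand-in for a real secret store (e.g. Secrets Manager).
    USERS = {"boxi_export": {"password": os.environ.get("BOXI_PW", ""),
                             "home": "/sftp-landing/boxi"}}

    def handler(event, context):
        user = USERS.get(event.get("username", ""))
        if not user or event.get("password") != user["password"]:
            return {}  # deny

        return {
            "Role": "arn:aws:iam::123456789012:role/sftp-write-role",  # hypothetical
            "HomeDirectory": user["home"],
        }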

Architecture and Security Management:

AWS Account Documentation: environment classification, IAM policies, S3 access, and VPC connections to on-prem via Transit Gateway
Subnet and CIDR Blocks: Designed and documented subnet layouts for network security
Firewall Management: Managed firewall openings/closings per compliance standards
Resource Monitoring: Monitored AWS resources, optimizing costs
Databricks Access: Configured SCIM passthrough and meta-IAM roles
GitHub OIDC Integration: IAM roles for GitHub-to-AWS access via OIDC

Databricks Data Engineering and Data Transfer:

Unity Catalog: Deployed with Terraform for data governance
Delta Tables and Workflows: Created Delta Tables and workflows for data processing
Python Notebooks: Built notebooks for analysis, including data cleanup and public datasets (ELWIS, PEGELONLINE WSV, DWD)
Data Transfers: Developed Shell/SFTP scripts enabling BOXI exports over SFTP for data collection

Technologies: AWS, Databricks, Unity Catalog, Terraform, S3, Redshift, IAM Policies, Lambda, AWS Transfer Family, Delta Lake, Python, ServiceNow, GitHub OIDC, Transit Gateway, BOXI, Shell/SFTP

Skills used

Power BI, Amazon Web Services (AWS), Cloud Computing, Cloud Specialist, Databricks, Data Manager, Data Management, IT Consultant, IT Project Management, Python Programmer, SAP BusinessObjects (BO)

Data Modeler, Group Data Office
Insurance industry, Munich
2/2022 – 9/2024 (2 years, 8 months)
Insurance

Project description

Database design and structures following Data Vault design, 3NF, and data marts (consumption layer).

Engaged as a Data Vault 2.0 modeler with the Group Data Office of an insurance and reinsurance business, working on the group-wide implementation of Commercial Insurance (PLC) and reinsurance data collection, sourced from operating entities across multiple geographies.

• Updating and enhancing the Group Commercial Common/Unified Data Model (per Data Vault 2.0; hash-key conventions sketched below).
• Updating data standards for the data model.
• Enhancing global definitions (Global Business Glossary).
• Developing business data examples for operating entities.
• Writing mapping guidelines.
• Unifying the CAT Risk 3NF data model into the group Commercial DV 2.0 data model.
• Participating in Enterprise Ontology clarification workshops.
• Preparing business examples and slide decks for various scopes in the model: Reinsurance (FAC/Treaty/Quota Share), incident/claim classifications, policy and object terms (L&Ds), and IIP (International Insurance Programs), among others.
• Preparing data mapping and validation examples, showcasing bi-directional mapping between the group Commercial DV 2.0 data model and the CAT Risk and Cyber 3NF data models.
• Writing a detailed mapping guideline for data deliveries into the Group Commercial Common Ingestion DV 2.0 data model.
• Leading solutioning in data model workshops.
• Unifying the claims data model into the group Commercial DV 2.0 data model.
• Participating in workshops with data architects from other business units.
• Member of the Community of Practitioners on Business Intelligence, Architecture and Data Modeling.
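
For illustration, a sketch of Data Vault 2.0 hash-key and hash-diff derivation as commonly standardized in such models; the delimiter and casing rules shown are illustrative, not the client's actual convention.

    # Sketch of DV 2.0 hash-key/hash-diff derivation; the normalization
    # rules (trim, uppercase, "||" delimiter) are illustrative conventions.
    import hashlib

    def hash_key(*business_keys: str, delimiter: str = "||") -> str:
        """Deterministic hub/link key: trim, uppercase, concatenate, hash."""
        normalized = delimiter.join(k.strip().upper() for k in business_keys)
        return hashlib.md5(normalized.encode("utf-8")).hexdigest()

    def hash_diff(attributes: dict) -> str:
        """Satellite change-detection hash over attributes in a fixed order."""
        payload = "||".join(str(attributes[k]).strip() for k in sorted(attributes))
        return hashlib.md5(payload.encode("utf-8")).hexdigest()

    # A policy hub key and a satellite hash diff (hypothetical values):
    hk = hash_key("POL-000123", "OE-DE")
    hd = hash_diff({"status": "active", "premium": 1250.00})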

Outcome: Enabled group data reporting for portfolio steering based on common data standards and the Global Business Glossary.

Tools: SqlDBM, PostgreSQL, Azure Synapse Analytics SQL pool, GitHub, Confluence, Informatica Axon (GBG), SharePoint, Excel, PowerPoint, Enterprise Ontology

Skills used

Data Analysis, Azure Synapse Analytics, Data Vault, Data Architect, Data Modeling, PostgreSQL

Data Vault Consultant
Insurance industry, Munich
10/2021 – 1/2022 (4 months)
Insurance

Project description

Engaged as a Data Vault 2.0 developer implementing IFRS 17 accounting standard requirements and adaptations at an insurance company in Munich. As an Integration Layer developer, I was mainly responsible for:

• Overseeing operations in the Integration Layer
• Analyzing and developing new Raw Vault and Business Vault hubs, links, and satellites (load pattern sketched below)
• Ensuring relationships between entities loaded from different source systems
• Integrating a new file-based data source for top adjustments at month-end closings
• Working with effectivity satellites
• Meeting requirements for ledger-specific postings (GAAP codes) and automated reversals
• Providing loading templates for premiums (policy), claims, and cash transactions
• Analyzing data quality errors
• Generating SQL packages and carrying out deployments
• Preparing Visual Data Vault diagrams
• Documenting in Confluence
• Exchanging with other IL developers, the Engineering Manager, and Product Managers
• Raising pull requests in GitHub and merging them to master after successful review
• Creating test cases in Tricentis TOSCA
• Working in agile 10-day sprints, creating user stories and allocating story points
• Participating in feature grooming sessions
• Using Jenkins to manage and schedule runtime jobs
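
A sketch of the insert-only satellite load pattern typical of this Raw Vault work, executed here via python-oracledb; the table and column names are illustrative, not the client's schema.

    # Sketch of an insert-only satellite load: stage rows are inserted only
    # when the key is new or the hash diff changed. Names are illustrative.
    import oracledb

    SAT_LOAD_SQL = """
    INSERT INTO sat_policy_details
          (policy_hk, load_dts, record_source, hash_diff, status, premium)
    SELECT s.policy_hk, s.load_dts, s.record_source, s.hash_diff, s.status, s.premium
    FROM   stg_policy s
    LEFT JOIN (
        -- latest satellite row per hub key
        SELECT policy_hk, hash_diff,
               ROW_NUMBER() OVER (PARTITION BY policy_hk ORDER BY load_dts DESC) rn
        FROM   sat_policy_details
    ) latest ON latest.policy_hk = s.policy_hk AND latest.rn = 1
    WHERE  latest.policy_hk IS NULL           -- brand-new key
       OR  latest.hash_diff <> s.hash_diff    -- attributes changed
    """

    with oracledb.connect(user="il_dev", password="***", dsn="dwh_high") as conn:
        with conn.cursor() as cursor:
            cursor.execute(SAT_LOAD_SQL)
        conn.commit()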

Tools: SQL Developer, Oracle, GitHub, Tricentis TOSCA, Eclipse, Jenkins

Skills used

Data Modeling, Eclipse, Oracle Database, SQL Developer

Project Manager GDPR Data Lake
Real estate platform, Berlin
5/2021 – 10/2021 (6 months)
Real estate

Project description

Worked as project manager for the GDPR implementation in data lake storage at a real estate platform company.

• Responsible for coordination between the tech team and legal.
• Engagement with the external DPO to clarify GDPR requirements.
• Overseeing team planning.
• Stakeholder management.
• Communication with data consumers and producers.

Tools: Miro, Confluence, Jira, project documentation

Skills used

Data Analysis, Amazon Web Services (AWS), Data Protection, Project Management

Python Data Migration Engineer
Fashion eCommerce, Berlin
1/2021 – 7/2021 (7 months)
Fashion eCommerce

Project description

Worked as a data engineer for a fashion eCommerce shop on a migration project, migrating fashion product data from the Product Information Management system (PIM, Akeneo) Community Edition to the new Enterprise Edition.
The solution was developed in Python to pull data from the old PIM via APIs, join and transform it, and publish it into the new PIM via APIs.

• Performed data mapping between the old and new data structures.
• Finalized the data transformation requirements analysis.
• Wrote data processing and transformation modules.
• Wrote the module responsible for interacting with the APIs (see the sketch below).
• Implemented API authentication.
• Managed mapping and transformation rules in JSON files.
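
A sketch of the pull-transform-publish flow; the endpoints follow Akeneo's documented REST paths, while hosts, credentials, and the transform itself are illustrative.

    # Sketch of the pull-transform-publish flow against the Akeneo REST API;
    # hosts, credentials, and the transform are illustrative.
    import requests

    OLD, NEW = "https://old-pim.example.com", "https://new-pim.example.com"

    def token(base: str, client_id: str, secret: str, user: str, pw: str) -> str:
        resp = requests.post(f"{base}/api/oauth/v1/token",
                             auth=(client_id, secret),
                             json={"grant_type": "password",
                                   "username": user, "password": pw})
        resp.raise_for_status()
        return resp.json()["access_token"]

    def migrate_product(identifier: str, old_tok: str, new_tok: str) -> None:
        # Pull from the Community Edition instance ...
        product = requests.get(f"{OLD}/api/rest/v1/products/{identifier}",
                               headers={"Authorization": f"Bearer {old_tok}"}).json()
        # ... apply mapping/transformation rules (kept in JSON files) ...
        product["values"].pop("legacy_attribute", None)  # illustrative transform
        # ... and publish into the Enterprise Edition instance.
        requests.patch(f"{NEW}/api/rest/v1/products/{identifier}",
                       headers={"Authorization": f"Bearer {new_tok}"},
                       json=product).raise_for_status()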

Python Libraries:
requests, multiprocessing, json, logging

Technologies: Python 3.7, PyCharm, Akeneo PIM, RESTful APIs, JSON, CSV

Skills used

Product Information Management, API Developer, JSON, Product/Assortment Development, Python, Representational State Transfer (REST)

Middleware Data Integration Specialist
Telecom industry, Berlin
7/2020 – 9/2021 (1 year, 3 months)
Telecommunications

Project description

Developed and maintained middleware RESTful APIs for integration use cases, such as:

• between SAP and Salesforce
• between DocuSign and Salesforce
• between microservices, legacy ERPs, and internal systems that maintain parts information
• GCP

Worked with JSON, XML, and IDoc (SAP) formats to develop integrations and gateways. Analyzed source SAP data structures and data models, Salesforce, and real-time microservices. Performed data mapping between SAP, Salesforce, and the microservices.
Implemented HMAC-based message authentication (see the sketch below).
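
A minimal sketch of HMAC-based message authentication; the header convention, key handling, and payload are illustrative (the production logic lived in the middleware/SQL layer).

    # Sketch of HMAC signing/verification for middleware API calls;
    # the secret source and payload shape are illustrative.
    import hashlib
    import hmac

    SECRET = b"shared-secret-from-vault"  # hypothetical key material

    def sign(payload: bytes) -> str:
        return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()

    def verify(payload: bytes, signature: str) -> bool:
        # constant-time comparison prevents timing attacks
        return hmac.compare_digest(sign(payload), signature)

    body = b'{"order_id": "4711", "status": "shipped"}'
    sig = sign(body)          # sender attaches e.g. an X-Signature header
    assert verify(body, sig)  # receiver verifies before processing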

• SQL development
• Stored procedure development
• Use of transaction management
• Exception handling

Regularly prepared Swagger specifications, test evidence, and UAT documentation.

Technologies: MSSQL Server 2014 (SQL/T-SQL), Middleware (IBM App Connect Studio), JSON/XML, CSV, Swagger, GCP, RESTful APIs

Skills used

Transact-SQL, Microsoft SQL Server (MS SQL), API Developer, XML, JSON

DWH/Data Vault Developer
Payment provider, Berlin
5/2020 – 7/2020 (3 months)
Financial services

Project description

As a Data Vault developer, I was responsible for enriching the Raw Vault and Business Vault with new satellites containing SAP bookings data.

• Analyzed the existing Data Vault model and data loading routines.
• Extended the existing Data Vault model.
• Created new links and satellites in the Raw and Business Vault.
• Worked with event data processing in Data Vault.

Created stored procedures for the daily loading of the new data source; enabled data extraction through BCP and PowerShell.

Skills used

Data Warehousing, Transact-SQL, Data Vault, Microsoft SQL Server (MS SQL)

Python Data Engineer
Consulting firm, Heidelberg
11/2019 – 2/2020 (4 months)
IT & Development

Project description

Fast-paced, intensive development of multiple data integration modules in Python on Linux in Docker.

These developments enabled data integration and provisioning for a new web-based software product:

Integration with NiFi over nifi-api: works with JSON retrieved from the NiFi REST API to traverse the NiFi flow, process groups, and processors.
Metadata handling component: handles data type conversions for the target MySQL database against metadata based on Big Data AVRO primitive types.
Component to download large CSV files over REST API with streams (using requests' iter_content) in parallel threads (see the sketch below).
Loading of CSV files into MySQL using the LOAD DATA INFILE command over SQLAlchemy (+pymysql).
Data integration job: a main Python job that combines the other components.
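
A sketch combining the streamed download and the bulk load; URL, file path, table, and DSN are illustrative.

    # Sketch of a streamed CSV download followed by a MySQL bulk load;
    # URL, path, table, and DSN are illustrative.
    import requests
    from sqlalchemy import create_engine, text

    def download_csv(url: str, path: str) -> str:
        # Stream in 1 MiB chunks so large files never sit fully in memory.
        with requests.get(url, stream=True, timeout=300) as resp:
            resp.raise_for_status()
            with open(path, "wb") as fh:
                for chunk in resp.iter_content(chunk_size=1 << 20):
                    fh.write(chunk)
        return path

    engine = create_engine(
        "mysql+pymysql://etl:***@dbhost/dwh",   # hypothetical DSN
        connect_args={"local_infile": 1},        # client-side enablement
    )

    csv_path = download_csv("https://example.com/export/large.csv",
                            "/tmp/large.csv")
    with engine.begin() as conn:
        # Server-side bulk load is far faster than row-by-row INSERTs.
        conn.execute(text(
            "LOAD DATA LOCAL INFILE :path INTO TABLE staging_events "
            "FIELDS TERMINATED BY ',' IGNORE 1 LINES"),
            {"path": csv_path})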

Python Libraries:
requests, SQLAlchemy, multiprocessing, pandas, dotenv, logging

Technologies: Python 3.7, Linux, Docker, PyCharm, Liquibase, PuTTY, RealVNC, Citrix.

Skills used

Python Programmer, Data Engineer, ETL, MySQL, Python

Data Vault and Big Data Consultant and Developer
Insurance company, Cologne
4/2019 – 7/2019 (4 months)
Insurance

Project description

Data Vault 2.0-based data lake development and consulting in Python, with Hadoop Cloudera (CDH) and Amazon stacks.
Hadoop Cloudera (CDH):
- GDPR-compliant HDFS data lake using the AVRO file format.
- Hive/Impala-based Data Vault entities & Information Mart.
Amazon S3 and Redshift:
- S3-based data lake and external Athena/Redshift tables.
- Redshift-based Data Vault and virtualized Information Mart.
Pre-computed hash keys materialized as AVRO files in the lake.
Technologies: Python 3.7, AWS, S3, Redshift, DMS, SQS, Cloudera, Avro, Hive, Impala

Skills used

Data Modeling, Amazon Web Services (AWS), Apache Hadoop, Big Data, Python

Middleware Specialist
Telecom solution provider, Berlin
9/2018 – 8/2019 (1 year)
Telecommunications

Project description

● REST APIs and gateways with JSON, XML, IDocs, and JavaScript.
● Data structure/model analysis between SAP, Salesforce, and real-time microservices, and the respective data mappings.
● Development in MS SQL Server 2014 SSMS.
● SQL and stored procedure development with transaction management and exception handling.
● Test evidence and UAT documentation.

Technologies: MSSQL Server 2014, Middleware, JSON/XML, CSV

Skills used

API Developer, IDoc, JSON, Microsoft SQL Server (MS SQL), Representational State Transfer (REST)

MicroStrategy Developer
Retail, Ruhr area
8/2017 – 3/2018 (8 months)
Wholesale

Project description

Part of the FE team, responsible for implementing MicroStrategy use cases for the retail business.

Responsibilities included:

Business validation of requirements with the RE & architecture team.
Solution concept workshops with the architecture & business teams.
Implementation of MicroStrategy use cases (packages 2 & 3).
Liaising between backend and frontend teams.

Extensive Development Experience with MSTR Documents.
Use of Panel Stacks, Selectors, Grids and Graph components.
Use of Multiple Datasets.

Extensive experience with Visual Insights and OLAP Metrics.

Datasets with Level and Derived Metrics.

Technical feats include:
● Use of Transaction Services.
● Mapping of Attributes (IDs, Forms).
● Parent-Child relationships & Hierarchies.
● Use of multiple Datasets, based on multiple Data Marts.

Advanced Topics include:
● Setting up MicroStrategy Job Prioritisations
● iCube Optimisation & Incremental Refresh reports.

Operational tasks include bi-weekly deployments.

Skills used

MicroStrategy, Data Warehousing, Oracle Applications

Head of Data Technology
Crosslend GmbH, Berlin
9/2015 – 2/2017 (1 year, 6 months)
FinTech, Consumer Lending

Project description

I led the BI and analytics function of the company as a member of the management team, in close cooperation with other heads, team leads, and C-levels. Handled vendor management (MicroStrategy). Streamlined many data acquisition, processing, and KPI calculation challenges (e.g. payment processing). Built visualizations and dashboards, together with math-intensive calculations for returns and portfolio performance.

• Responsible for leading the BI and analytics function of Crosslend.
• Close collaboration with Executives, Marketing, Operations, Finance, Product, Engineering, and DevOps.
• Enabled self-service BI; rolled out MicroStrategy.
• Investor fact sheets and pitch decks.
• Financial metrics: IRRs, annualized net returns (unadjusted), and default curves.
• Marketing performance dashboards and reports (per channel).
• Customer insights for Operations and the CC team.
• Payment processing and overdue-related KPIs.
• Visualizations, simulations, and correlations.
• Successful closing of audits (positive opinion).

Skills used

MicroStrategy, Data Warehousing, Business Intelligence (BI), MySQL, ETL

Data Warehouse Architect
Kreditech Holdings SSL, Hamburg
7/2014 – 8/2015 (1 year, 2 months)
Fintech, Consumer Lending

Project description

I was responsible for the Data Warehouse architecture and for managing the company's relationship with Exasol (service provider). Trained and hired DWH engineers, built the overall DWH architecture and infrastructure, integrated unstructured NoSQL data (MongoDB), modeled the company's core business tables, wrote a finite state machine (for IFRS-based classification), and contributed to the successful closing of audits (a prerequisite for Series B funding).

• Responsible for Data Warehouse architecture and the Data Engineering team.
• Managing Data Warehouse technology infrastructure and service providers.
• Data modeling of the company's core revenue and accounting fact tables.
• Marketing data mart: performance data at campaign and keyword level (hierarchy).
• Finite state machine (for IFRS-based classification) and payment waterfall calculations (see the sketch below).
• Data historization design concepts.
• Integration of unstructured NoSQL data (MongoDB).
• Successful closing of audits and Series B funding.
• Tech stack: Exasol, MongoDB, Postgres, Pentaho Kettle, Python, and Lua.
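
A toy sketch of such a finite state machine; the states and transition rules are illustrative, not the audited classification logic.

    # Sketch of a finite state machine for IFRS-style loan classification;
    # states and transitions are illustrative, not the audited rules.
    TRANSITIONS = {
        ("performing", "payment_missed"): "past_due",
        ("past_due", "payment_received"): "performing",
        ("past_due", "90_days_overdue"): "defaulted",
        ("defaulted", "fully_recovered"): "performing",
    }

    def classify(events: list, state: str = "performing") -> str:
        """Fold a loan's payment events into its final classification state."""
        for event in events:
            state = TRANSITIONS.get((state, event), state)  # unknown events keep state
        return state

    # A loan that misses a payment and then crosses 90 days overdue:
    print(classify(["payment_missed", "90_days_overdue"]))  # -> defaulted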

Skills used

Online Analytical Processing, Data Warehousing, Open Source, PostgreSQL, MongoDB, ETL, Database Development, Lua Scripting, Python

Senior Manager BI
Zalando SE, Berlin
8/2012 – 6/2014 (1 year, 11 months)
E-Commerce

Project description

I was part of the ERP/MIS team, responsible for the customer analytics pipeline, covering a wide set of responsibilities and functions. Handled aggregation requirements using Hadoop (Java Map/Reduce) on an Oracle, Exasol, and Pentaho Kettle technology stack. Led the Oracle DWH migration to new hardware, rewrote legacy ETLs, migrated IBM Unica CRM in-house, and managed freelancers.

• Responsible for the customer pipeline within Zalando BI.
• Cohort trend analysis for customers.
• Analysis of website click log files using Hadoop (Java Map/Reduce).
• Design and development of customer survey data (Oracle PL/SQL).
• Interfacing an operational subset for forecast analysis (Exasol).
• Migration of IBM Unica CRM in-house; redesigning the CRM data model and simplifying ETLs.
• Leading the migration of the Oracle DB to new hardware, improving backup and recovery options.
• Tech stack: Hadoop, Oracle, Exasol, Pentaho Kettle, PostgreSQL, and BusinessObjects.

Skills used

Apache Hadoop, Data Warehousing, SAP BusinessObjects (BO), PostgreSQL, Oracle Database, ETL, CRM Consulting (general), Enterprise Resource Planning

Certificates

SqlDBM Fundamentals
SqlDBM
2024
dbt Fundamentals
dbt Labs
2024
Certified Data Vault 2.0 Practitioner
2018
Oracle Certified Professional Database 11g Administrator
2010

Education

Computer Science
Bachelor's degree
4
Lahore, Pakistan

About me

Cloud Architecture & Migration
Expert in designing and executing complex cloud migrations and multi-cloud architectures. Specialized in AWS-to-Azure migrations, cross-cloud data transfer solutions, and zero-downtime migration strategies. Successfully architected enterprise data platforms for Appian, Circle K (600+ locations, 6M+ daily transactions), Couche-Tard, and TotalEnergies with proven results: cost reduction, performance improvement, and process automation.

Data Platform Engineering
Advanced expertise in building scalable data lake architectures, event-driven data ingestion patterns, and serverless data processing pipelines. Proficient in AWS (Lambda, Glue, S3, Redshift, Aurora RDS), Azure Databricks, Unity Catalog, and Delta Lake. Experienced in implementing API Gateway integrations and streaming/batch ETL architectures.

Infrastructure as Code & DevOps
Experienced with Terraform/Terraspace for automated infrastructure provisioning, including Redshift clusters, Unity Catalog deployments, and Aurora RDS PostgreSQL cluster setups. Skilled in GitHub OIDC integration, IAM role management, ECS/ECR container orchestration, and implementing CI/CD pipelines (GitHub/GitLab) with Liquibase for database versioning.

Performance Optimization & Cost Management
Proven track record in database performance engineering: implementing distribution keys, sort strategies, and WLM queues. Achieved a 60% runtime reduction through Spark optimization, 3-5x performance gains with Photon acceleration, and a 40% cost reduction through infrastructure consolidation.

Python & Data Processing
Advanced Python development for data engineering, including Lambda functions, Spark/PySpark transformations, and data quality monitoring. Experience with Apache TIKA, Shell/SFTP scripting, and building notebooks for complex data analysis (ELWIS, PEGELONLINE WSV, DWD datasets).

Data Governance & Security
Strong expertise in Unity Catalog implementation, SCIM passthrough configuration, IAM policies, and KMS encryption. Experienced in GDPR compliance, data classification strategies, and establishing data product boundaries with comprehensive documentation standards.

Database & Analytics Platforms
Proficient in Redshift (RA3), PostgreSQL, Aurora RDS, Delta Tables, and workflow orchestration. Extensive experience with Power BI integration, SAP BOXI, and implementing real-time analytics solutions. Skilled in SQL optimization and managing cross-account data access patterns.

Enterprise Architecture & Advisory
Experienced in providing architectural guidance, conducting technical reviews, and mentoring engineering teams. Skilled in designing data modeling standards, API design patterns, and establishing cloud-native best practices for enterprise environments.

Additional skills

● AWS: Glue (Batch/Streaming), Lambda, API Gateway, S3, Kinesis, DMS, RDS, Aurora PostgreSQL
● Databricks: AWS & Azure platforms, Unity Catalog, Volumes, Spark/PySpark, Photon, Delta Lake, AWS→Azure migrations
● IaC/DevOps: Terraform/Terraspace, GitHub Actions, CI/CD, Liquibase, Container orchestration (ECS/ECR)
● Data Architecture: Data Vault 2.0 (Certified), Data Modeling, Governance, GDPR compliance
● Databases: Redshift, PostgreSQL, Aurora, Exasol, Oracle 11g (DBA Certified)
● Analytics: Power BI, MicroStrategy, SAP BOXI, dbt (Fundamentals Certified)
● Programming: Python, PySpark, SQL optimization, Shell scripting
● Certifications: Data Vault 2.0, SqlDBM Fundamentals, dbt Fundamentals, Oracle 11g DBA

Personal details

Languages
  • German (fluent)
  • English (fluent)
  • Urdu (native)
Willingness to travel
National
Work permit
  • European Union
Home office
Preferred
Profile views
7635
Age
42
Professional experience
21 years and 1 month (since 02/2005)
Project leadership
2 years
