
Data Scientist and Architect
- 1 reference
- €90/hour
- 10777 Berlin
- on request
- de | en
- 21.05.2025
Short introduction
Excerpt of references (1)
"Very pleasant collaboration. C. understood our project requirements well and supported us with advice on the implementation."
6/2023 – 6/2024
Role description: Automation and optimization of MS Power Query-based workflows to structure and prepare multi-channel marketing campaigns (Google, Instagram, YouTube, etc.).
Skills used: Power BI, Google Cloud, Microsoft Excel
Qualifications
Project & professional experience
6/2024 – 3/2025
Role description
Two-stage project focusing on (1) building a scalable data warehouse for integrating and processing heterogeneous biodiversity data, and (2) implementing advanced machine learning workflows to model species distributions under varying environmental conditions.
Design and implementation of a scalable database architecture
- Developed a DuckDB-based solution for local data storage and processing
- Designed a normalized schema to support diverse ecological data types
- Implemented spatial operations for geographic data processing
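For illustration only, a minimal Python sketch of the storage pattern described above: a local DuckDB file with the spatial extension, a small normalized schema, and a spatial filter. Table names, columns, and the polygon are hypothetical, and the project itself drove DuckDB from R.

```python
# Illustrative sketch only: a local DuckDB store with the spatial extension,
# a small normalized schema, and one spatial query. Names are placeholders.
import duckdb

con = duckdb.connect("biodiversity.duckdb")
con.install_extension("spatial")
con.load_extension("spatial")

# Taxonomy kept separate from occurrence records (normalized schema).
con.execute("""
    CREATE TABLE IF NOT EXISTS taxon (
        taxon_id        INTEGER PRIMARY KEY,
        scientific_name VARCHAR NOT NULL
    )
""")
con.execute("""
    CREATE TABLE IF NOT EXISTS occurrence (
        occurrence_id BIGINT PRIMARY KEY,
        taxon_id      INTEGER REFERENCES taxon (taxon_id),
        observed_on   DATE,
        geom          GEOMETRY
    )
""")

# Example spatial operation: occurrences inside a polygon of interest.
rows = con.execute("""
    SELECT o.occurrence_id, t.scientific_name
    FROM occurrence o
    JOIN taxon t USING (taxon_id)
    WHERE ST_Within(o.geom,
                    ST_GeomFromText('POLYGON((5 45, 15 45, 15 55, 5 55, 5 45))'))
""").fetchall()
```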
Development of automated data pipelines
- Created an R package offering a standardized interface for data ingestion, processing, and extraction
- Built automated workflows for orchestration and reproducibility using the targets R package
- Integrated heterogeneous data sources, including species occurrences, taxonomy, and environmental raster layers
Predictive modeling using machine learning and deep learning
- Built an end-to-end modeling pipeline to predict species distributions at continental scales
- Deployed multiple model types under a shared feature-processing, cross-validation, and model-evaluation regime
- Developed an embedding architecture to improve information sharing across species in neural networks
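For illustration only, a minimal PyTorch sketch of the embedding idea mentioned above: a learned per-species embedding concatenated with environmental features, so the network can share information across species. Layer sizes, names, and the binary-occurrence framing are assumptions, not the project's actual architecture.

```python
# Illustrative sketch: species embedding + environmental features feeding a
# small MLP that outputs an occurrence logit. Dimensions are placeholders.
import torch
import torch.nn as nn

class SpeciesDistributionNet(nn.Module):
    def __init__(self, n_species: int, n_env_features: int, emb_dim: int = 16):
        super().__init__()
        self.species_emb = nn.Embedding(n_species, emb_dim)
        self.mlp = nn.Sequential(
            nn.Linear(emb_dim + n_env_features, 64),
            nn.ReLU(),
            nn.Linear(64, 1),  # logit of occurrence probability
        )

    def forward(self, species_id: torch.Tensor, env: torch.Tensor) -> torch.Tensor:
        x = torch.cat([self.species_emb(species_id), env], dim=-1)
        return self.mlp(x).squeeze(-1)

model = SpeciesDistributionNet(n_species=500, n_env_features=12)
logits = model(torch.randint(0, 500, (8,)), torch.randn(8, 12))
```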
Skills used: R (programming language), data modelling, SQL, neural networks, Torch, Python
5/2024 – ongoing
Role description
Development of a data exploration and visualization platform to support plant breeding decisions at KWS Saat. The system provides an intuitive interface to internal data sources and enables staff researchers to explore complex interactions between gene variants, crop characteristics and physiological pathways.
Development of a REST API for graph database integration
- Built a FastAPI-based service layer for interacting with a Neo4j graph database
- Designed efficient, targeted query endpoints
- Integrated authentication and access control mechanisms
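For illustration only, a minimal sketch of this kind of FastAPI service layer over Neo4j, using the official Python driver. The route, node labels, Cypher query, and credentials are placeholders, and the authentication and access-control layer is omitted here.

```python
# Illustrative sketch: one targeted query endpoint over a Neo4j graph.
# Labels, relationship types, and credentials are placeholders.
from fastapi import FastAPI, HTTPException
from neo4j import GraphDatabase

app = FastAPI()
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

@app.get("/genes/{gene_id}/traits")
def traits_for_gene(gene_id: str, limit: int = 25):
    # Targeted, parameterized query instead of exposing arbitrary Cypher.
    query = (
        "MATCH (g:Gene {id: $gene_id})-[:AFFECTS]->(t:Trait) "
        "RETURN t.name AS trait LIMIT $limit"
    )
    with driver.session() as session:
        records = session.run(query, gene_id=gene_id, limit=limit).data()
    if not records:
        raise HTTPException(status_code=404, detail="gene not found or no traits")
    return {"gene": gene_id, "traits": [r["trait"] for r in records]}
```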
Design and implementation of an interactive analytics dashboard
- Developed a web interface for interactive data exploration using Plotly Dash
- Implemented network visualizations with Cytoscape for complex relationships
- Enabled real-time data access via API integration
- Incorporated AI-assisted features through external API services
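For illustration only, a minimal Plotly Dash sketch with a dash-cytoscape network view, in the spirit of the dashboard described above. The graph elements are hard-coded placeholders; in the real system they would be fetched from the REST API.

```python
# Illustrative sketch: a Dash app rendering a small gene-trait network.
# Elements are dummy data standing in for API responses.
from dash import Dash, html
import dash_cytoscape as cyto

app = Dash(__name__)

elements = [
    {"data": {"id": "geneA", "label": "Gene A"}},
    {"data": {"id": "trait1", "label": "Trait 1"}},
    {"data": {"source": "geneA", "target": "trait1", "label": "AFFECTS"}},
]

app.layout = html.Div([
    html.H3("Gene-trait network (demo data)"),
    cyto.Cytoscape(
        id="network",
        elements=elements,
        layout={"name": "cose"},  # force-directed layout
        style={"width": "100%", "height": "500px"},
    ),
])

if __name__ == "__main__":
    app.run(debug=True)
```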
Skills used: API development, database analysis, Docker, Python, web development
6/2023 – 6/2024
Role description: Automation and optimization of MS Power Query-based workflows to structure and prepare multi-channel marketing campaigns (Google, Instagram, YouTube, etc.).
Skills used: Power BI, Google Cloud, Microsoft Excel
8/2022 – 2/2024
Role description
Large-scale project for the European Central Bank, managed by T-Systems, to migrate the ECB’s existing infrastructure for storing, aggregating and publishing economic data to a modernized technology stack. As part of the reports migration team, I developed infrastructure to support the generation of parameterized, standardized reports and documents across departments.
Development of a custom R-package ecosystem
- Created dynamic tools for the generation of ECB press releases and reports in various formats (HTML, PDF, TXT, DOCX)
- Designed and implemented an R Shiny web application to enable collaborative report production across ECB teams
Integration with enterprise systems
- Integrated R-based solutions with internal systems, including time-series databases, document management platforms, and business process metadata stores
- Embedded reporting workflows into Camunda for seamless process automation
Stakeholder engagement and presentation
- Operated within a Scrum-based agile team, contributing to sprint planning, development and review cycles
- Delivered regular presentations to ECB product management and end users to ensure alignment with business needs and technical feasibility
Skills used: agile methodology, Git, Jira, Python, R (programming language), SQL
11/2021 – 6/2022
Role description: I built and deployed a web application that helps researchers standardize and share vegetation data. The application provides an intuitive interface for uploading and editing various data types and converting them into a highly standardized XML exchange format. This project required a deep understanding of XML and R Shiny as well as a consistent implementation of software design principles and performance optimizations. The resulting application, VegXshiny, now provides a key tool for sharing and integrating vegetation data across the scientific community.
Skills used: Docker, R (programming language), server administration, web development, XML
Education
Georg-August University of Göttingen, Göttingen
Georg-August University of Göttingen, Göttingen
HTW Dresden, Dresden
About me
Professional Skills
Analytical thinking
Independent problem solving
Creativity
Effective communication
Scientific expertise (biodiversity & climate)
Programming
Python, R, SQL, JavaScript, M (Power Query language)
HTML, CSS, XML
Package development (Python & R)
API design (FastAPI, Flask)
ETL / ELT pipelines
Geospatial processing (GIS, terra, sf, raster)
Scripting & workflow automation
Machine Learning
Classification & regression: linear models, tree-based methods, neural networks
Hyperparameter tuning, cross-validation, model evaluation
Bayesian modeling
Dimensionality reduction, clustering
Technologies: XGBoost, scikit-learn, caret, PyTorch, NumPy, dplyr, pandas, JAGS, Stan
Databases
Data modelling, schema design, data normalization
Query design and optimization
OLTP & OLAP systems
Technologies: MySQL, PostgreSQL, DuckDB, OracleDB, Neo4j
Visualization & Reporting
Interactive reports
Dashboards, graph visualization
Automated documentation
Technologies: R Shiny, Plotly Dash, Quarto, Markdown, plotly, ggplot
Development & Orchestration
Version control: Git, GitHub, GitLab
Cloud: AWS, GCP
Containers & environments: Docker, pip, conda, venv, renv
IDEs & notebooks: VS Code, RStudio, Jupyter, Positron
Workflow automation: make, targets, Camunda
OS & scripting: Linux, Bash, Windows
Agile tools: Jira, Confluence, Slack, Teams
Personal details
- German (native speaker)
- English (fluent)
- European Union