- Data Mining
- Data Science
- Enterprise Guide
- Enterprise Miner
- Predictive analytics
- SAS Base
- SAS Macro
- Statistics (general)
Project & Professional Experience
Client name anonymized, Düsseldorf
10/2014 – 6/2017: Activity description
Roles: SAS Program Manager, Project Manager Data Mining
Impact: Developed business logic for monitoring cancelling brokers: as soon as brokers announce their intention to cancel, the system initiates timely action to prevent the customer relationship from shifting to a competitor. Business impact: €900+ million p.a. Designed and implemented a future monetary risk warning system (KPI approach) that enables the sales force to identify and target high-value customers in time, before they leave. Business impact: €100+ million p.a. Implemented and extended several projects computing purchase probabilities (data mining approach); the business case alone generated €12+ million (wins). Several marketing campaigns confirmed the reliability of these models and estimates.
Responsibilities: Supported management and team with advanced data management and data mining/analysis skills; adapted to fast-changing requirements and multi-tasked in a fast-paced insurance environment. Tools: Enterprise Miner (EM), Enterprise Guide (EG).
Tasks and Achievements: (1) KPIs for Future Monetary Risk Warning System: identifies and ranks potential churners according to overall value and the time left for the sales force to respond. A special feature automatically presents the Top 5 high-value customers to the sales force, including data for immediate contact. (2) Data Mining: (i.) Elected team member to design and implement an early warning system for lapse/persistency risk (Stornofrühwarnsystem) based on SAS EM and ABTs. (ii.) Identification and analysis of sales force agents intending to churn; calculation of financial risk parameters, quality of customer care, fraud indicators, and client churn probability. (iii.) Migration of third-party EM projects; implementation and automation of a fully automated EM/EG data mining approach to forecast purchases of specific insurance products, extended from yearly to monthly forecasts. (iv.) Back-testing of predicted sales against actually occurring events in a proof of concept (e.g. ASEs, ROCs, binning, and lift charts); transformation of PoC results into a business case; computation of several KPIs, including added wins and saved investment; risk analysis of various customer-investment sustainability scenarios. The business case received a GO from top management. Developed several models totaling up to €12.5 million (wins) or €5 million (savings). (3) Reverse engineering of a strategic business reporting system for the sales force: porting (migration) of dysfunctional ETL and analysis programs (SPSS) to a high-performance SAS version, with on-the-fly debugging, tuning, and enhancement. (4) Analysis of a bonus program for preferred insurance portfolios on behalf of Corporate Development, e.g. ABC, top client, insurance classes, and at organizational level (unit, distribution channels, prior year).
(5) Fraud detection: identification of fraudulent sales agents by pattern analysis of terminating old and procuring new contracts, special pattern analysis of contract shifting, and detection of fake address data. (6) Evaluation of on-site Customer Contact Management by sampling, ETL, and export of contact data for cases and controls to host (TSO), SAS, and Excel according to specification. (7) Selection of customer data by advanced random sampling according to specifications for a partner mailing, including technical and statistical documentation. (8) Others: (i.) Migration of SAS code and EG projects from SAS 9.3 to SAS 9.4. (ii.) Multi-layered, random-based anonymization of SAS data sets. (iii.) Deduplication projects, e.g. of an e-mail address database. (iv.) Phonetics-based merging of customer data from multiple SAS sources using fuzzy matching. (v.) Implementation of a top-client key on host (TSO) and in all end systems, including communication, testing, acceptance, and documentation of the various organizational, technical, and personal interfaces. (vi.) Transfer of Enterprise Guide/Miner project functionality into SAS batch code, and vice versa.
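The phonetics-based merging of customer data mentioned above typically keys on a phonetic code; SAS provides the SOUNDEX function and the sounds-like operator (=*) for exactly this. As a minimal illustration of the idea (a sketch in Python with my own naming, not the project code), the classic American Soundex algorithm reduces a name to a letter plus three digits so that similar-sounding spellings collide:

```python
def soundex(name):
    """American Soundex code (letter + 3 digits) for phonetics-based fuzzy matching."""
    codes = {**dict.fromkeys("BFPV", "1"), **dict.fromkeys("CGJKQSXZ", "2"),
             **dict.fromkeys("DT", "3"), "L": "4",
             **dict.fromkeys("MN", "5"), "R": "6"}
    name = "".join(c for c in name.upper() if c.isalpha())
    if not name:
        return ""
    first = name[0]
    digits = []
    prev = codes.get(first, "")
    for c in name[1:]:
        d = codes.get(c, "")
        if c in "HW":
            continue          # H and W do not break runs of equal codes
        if d and d != prev:   # vowels (empty code) act as separators
            digits.append(d)
        prev = d
    return (first + "".join(digits) + "000")[:4]
```

Records whose names share a Soundex code become merge candidates and are then compared on further attributes (e.g. address, date of birth) before deduplication.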
Major insurance company in the Munich area, München
5/2014 – 8/2014: Activity description
Roles: SAS Project Manager (Programmer, Analyst, Consultant)
Impact: Designed an Issue Map pinpointing threats and risks (some previously unknown) in the client’s system; condensed complex information into concise interim results presented to senior management.
Responsibilities: “Task Force”. From System Analysis to Management Consulting.
Management had noticed that a selected process in an enterprise-wide system was steadily losing performance, reliability, and stability. The initial responsibility was to analyze possible causes, at first limited to the Korn shell scripts, SAS programs, and macros of this process. Analysis showed that the selected process was only a symptom, not the root cause. The role was then extended to gathering information about causes potentially threatening the whole system. This involved identifying and communicating with stakeholders, gathering and validating information, designing an Issue Map pinpointing threats and risks in the client’s system, and condensing complex information into concise interim results presented to senior management. Methods applied: business analysis (stakeholders, data flows, OEs, etc.), downtime threat/impact analysis, risk analysis, architectural analysis, and cost estimates for project management tackling identified issues for quick wins. Responsibility shifted again to developing a technical concept, road map, and project plan to stabilize the whole enterprise-wide system, including a comprehensive solution made sustainable by state-of-the-art hardware, software, and computing.
Achievements: Positive feedback for a multi-faceted job: solving some issues on the fly, successfully identifying threats (some previously unknown), and conveying professionalism, perspective, and confidence. The short-term emergency contract was extended several times.
Environment: Complex: SAS v9.1.3 programs, DB2 on host, files in EBCDIC and other formats, several scheduling and transfer systems (UC4 etc.).
Tools: MS Project v2013, BIP (Batch Import Procedure) Tool v1.1.4, Alerting and Monitoring (AMT) Tool v1.6 (AZD)
SAS Business Intelligence (BI)
Pharma Intelligence Company, Frankfurt
3/2014 – 8/2014: Activity description
Pharma Intelligence Company in Frankfurt area 2014.03.06–2014.08.22
Role: SAS Statistical Programmer
Impact: Developed several SAS programs for automated statistical analysis of diabetes data.
Responsibilities: Developing, testing, and running SAS programs for statistical analysis of pharmaceutical data from the diabetes area. Scientific consultancy included statistical analysis; development and testing of complex ETL routines to access, process, and analyse data from different sources and formats; diabetes-related input from previously monitored studies for international meetings; quality monitoring of third-party data deliveries, including intervention to protect the client from dirty data; and evaluation of third-party SAS program performance, with trouble-shooting and tuning of those programs where necessary.
Achievements: Professional SAS programming and program management delivered on the basis of just an informal call, plus development of a professional IT manual. SAS programming followed SOP BIO5 (KKS); validation of the SAS programs followed SOP BIO6 (KKS). The satisfied customer extended the short-term contract several times.
Languages: SAS Base, SAS Macro Facility and PROC SQL.
Environment: SAS v9.3 on a Windows environment.
SAS Business Intelligence (BI)
Zensus 2011, IT.NRW, Düsseldorf
11/2011 – 2/2012: Activity description
Zensus 2011, IT.NRW, Düsseldorf 2012.11.02–2013.01.31
Role: SAS Statistical Programmer, Statistical Analyst
Impact: Developed the Zensus 2011 statistical algorithm as powerful, yet user-friendly SAS macros.
Responsibilities: Adjustment of cells to projected margins for 1,440 communities in 65 model variants (volume: 5.5+ billion data rows), a total of 93,600 models, plus visualization of the numerous GoF parameters. The procedure followed Bishop, Fienberg and Holland (2007): model pre-selection by comparing the estimated table with the reference table based on AIC, Pearson chi-square, and log-likelihood for log-linear models 1 to 65; model fine selection based on minimal deviation (deviance) of the pre-selected models from the reference table’s cell counts (combinations of age, nationality, marital status, and gender), additionally taking community size into account to exclude estimation errors and possibly disproportionate cell frequencies. The application combines numerous functions in a single ETL module that can be run as a SAS Stored Process as an unattended macro. This complex ETL module consists of two functionally disparate phases: in the first phase, PROC IML (CALL IPF) iteratively produces the data tables for each municipality for each of the 65 models; in the second phase, GoF tests are calculated iteratively, and essential parameters are aggregated and formatted as SAS data sets. In addition, the macro writes criteria for (un)successful convergence (e.g. chi-square, maximum difference, number of iterations) as well as the pre-set stop criteria (maximum difference, maximum iterations) to a separate SAS file. On top of that, development of a module delivering intuitively interpretable visual analysis of GoF and deviance values: a "cockpit" with various "switches" allows fine-tuning of the preferred visualization as well as of the input (local, state, all).
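The first phase rests on iterative proportional fitting, which SAS/IML exposes as CALL IPF. A minimal two-dimensional Python sketch of the underlying algorithm (an illustration with my own function name and simplifications; the census application fits far larger, higher-dimensional tables):

```python
def ipf(seed, row_margins, col_margins, tol=1e-9, max_iter=1000):
    """Iterative proportional fitting for a 2-D table.

    Rescales `seed` so that its row and column sums match the target
    margins while preserving the seed's interaction structure.
    """
    table = [row[:] for row in seed]
    for _ in range(max_iter):
        # Fit rows: scale each row to its target margin.
        for i, target in enumerate(row_margins):
            s = sum(table[i])
            if s > 0:
                table[i] = [v * target / s for v in table[i]]
        # Fit columns: scale each column to its target margin.
        for j, target in enumerate(col_margins):
            s = sum(row[j] for row in table)
            if s > 0:
                for row in table:
                    row[j] *= target / s
        # Convergence: maximum absolute deviation from both margin sets.
        diff = max(
            max(abs(sum(table[i]) - m) for i, m in enumerate(row_margins)),
            max(abs(sum(r[j] for r in table) - m) for j, m in enumerate(col_margins)),
        )
        if diff < tol:
            break
    return table
```

Each sweep rescales rows and then columns toward the target margins; the maximum absolute margin deviation doubles as the convergence check, mirroring the "maximum difference" stop criterion mentioned above.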
Achievements: Please see positive reference.
Technical environment: Front-end: Enterprise Guide v4; back-end: SAS 9.2 servers, accessed directly or via CITRIX. Data sets: SAS data sets (some ORACLE, some Teradata); N: 100,000+; size: 250 GB; data volume: > 5.5 billion data rows (main application). Programming languages: SAS Macro Facility, Base SAS, PROC IML, SAS hash programming, PROC SQL, and the SAS procedures MEANS, TABULATE, and GRAPH. Length of the two SAS programs for the main application: 60 A4 pages (ETL: 40, analysis: 20). Program execution: main application as a SAS Stored Process directly on the server; development and testing on CITRIX. Running time: 96 hours (main application), spread over four cores.
SAS Business Intelligence (BI)
More than 25 years of hands-on application of research methods and statistics at an advanced level:
• Theory of measurement and testing, hands-on experience in Applied Statistics.
• Design of Studies, Surveys and Experiments (incl. sampling and weighting).
• Statistical and mathematical multivariate methods and analysis (incl. Bayes statistics).
• Preferred statistical analysis systems: SAS, SPSS, and some R.
• Scientific Consulting incl. concepts, demonstrations, appetizer analyses, and presentation.
• Development of professional statistics courses on a high scientific level.
• Teaching / lecturing and publication activities.
• Publications by DeGruyter/Oldenbourg. Translations in Chinese in progress.
For example, lectures and development of teaching materials:
• Development of numerous courses including didactics, materials, and scientific state-of-the-art.
• 3,390+ pages of teaching materials published. Translations not counted. For details, see Section 7, Appendix 3: Publications and Presentations.
• 4,000+ pages of unpublished teaching material.
SAS Base Programming for SAS 9 * (2016).
SAS Advanced Programming for SAS 9 * (2016).
SAS Statistical Business Analyst Using SAS 9: Regression and Modeling (scheduled for 2017).
SAS Visual Analytics: Introduction by SAS Institute at ERGO (2017.03.13)
Einführung in die Datenanalyse mit der SAS Enterprise Miner Software (AAEM61, 2010).
Predictive Modeling with SAS Enterprise Miner: Practical Solutions (Sarma, 2013).
SAS Enterprise Miner Documentation 4.1 (August 2003).
Basic Statistics Using SAS Software (BSTS, August 1998).
Statistics 1: Introduction to ANOVA, Regression, and Logistic Regression (released 2010).
Statistische Datenanalyse mit der SAS/Software – Einführung und Grundlagen (STAT1, 2000).
Predictive Modeling Using Logistic Regression / Modellbildung mittels logistischer Regression (PMLR, 2000).
Multivariate Statistical Methods: Practical Applications (AMUL, March 1997).
Grundlagen der SAS Software (GKPRO, July 2005).
Neue Features in Version 8 des SAS Systems (NASH, February 2001).
SAS Macro Language 1: Essentials (released 2014).
SAS Programming 1: Essentials (released 2014).
SAS Programming 2: Data Manipulation Techniques (released 2014).
SAS Programming 3: Advanced Techniques and Efficiencies (released 2015).
SAS SQL 1: Essentials (released 2014).
SQL Processing with the SAS System (58231, May 1992).
Programmierung mit der Base SAS Software (PRGR, January 2000).
Advanced SAS Programming Techniques and Efficiencies (APRO, July 1998).
SAS Macro Language (MACR, August 1998).
SAS-Makro-Programmierung: Eine Einführung (URZ, H.Geißler & C.Ortseifen, 1996).
Einführung in SAS auf dem Großrechner (URZ, C.Ortseifen, 1990). *
Introduction to Statistical Concepts (released 2013). *
Statistics 1: Introduction to ANOVA, Regression, and Logistic Regression (released 2015). *
Predictive Modeling Using Logistic Regression (revised 2014). *
SAS Enterprise Guide 1: Querying and Reporting (EG 7.1).
SAS Enterprise Guide 2: Advanced Tasks and Querying (EG 7.1). *
SAS Enterprise Guide: ANOVA, Regression, and Logistic Regression.
Creating Reports and Graphs with SAS Enterprise Guide (EG 7.1).
SAS Enterprise Guide: Writing and Submitting SAS Code: SAS Enterprise Guide Editor.
SAS Enterprise Miner: Replacing unwanted numeric values using the HP Transform node.
Rev Up Your RPM's: A Modeling Sampler, Parts 1-4.
Profiling a Target Variable Before a Predictive Model.
Imputing Missing Values.
Modernizing Your SAS Code: Intermediate and Advanced Topics Video Library.
SAS Distributed Processing Video Library.
SAS Parallel Processing Video Library.
SAS Programming 1 - 3: Additional Topics Video Library.
Other Training: Analytics:
2012.10.31 PISA Workshop: Computations with weights (University of Bern, CH).
2010.01.14–15 Methods of Market/Media Research (GfK Switzerland, CH).
2008.11.05–06 Roche® Role “Statistician” at Roche Diagnostics, Penzberg DE.
Training on the job.
DFG Deutsche Forschungsgemeinschaft
CRISP-DM 1.0 Cross Industry Standard Process for Data Mining