Big Data Architect

Freelance Big Data Architect on freelance.de
on request
es  |  en  |  de
on request
46021 Valencia
30.10.2018

Brief introduction

I work in the development and architecture of Big Data environments, ECM architecture and consultancy, administration of Oracle databases, and analysis and development in different languages and programming environments, including Scala, J2EE and Python.

I offer

IT, Development
  • Apache Hadoop

Project & professional experience

Big Data Architect (permanent position)
Client name anonymized, Madrid
2/2016 – 10/2017 (1 year, 9 months)
Services sector
Period of activity

2/2016 – 10/2017

Description of activities

Big Data Architect at Stratio. The company develops its own platform based on PaaS and Spark, written in Scala; the platform is certified for Apache Spark.
I worked on the following projects:
Customer: Carrefour, Data Lake de Marketing project
Built a lambda-architecture Big Data "Data Lake" ecosystem to host an initial ticket load of 70 TB and to support the online ingestion of 700 GB of new tickets per month, coming from all the Carrefour supermarkets in Spain. The raw information is stored in HDFS in Parquet format, and the serving layer of the lake is filled with transformed data by Spark Streaming processes that continuously pull data frames from several Kafka queues, which in turn are fed from a Tibco ESB.
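A minimal sketch of this kind of ingestion path, assuming a Spark 1.6-era DStream job and JSON ticket messages; the broker addresses, topic name and HDFS path are illustrative, not taken from the project:

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.kafka.KafkaUtils
import org.apache.spark.streaming.{Seconds, StreamingContext}

object TicketIngestion {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("ticket-ingestion"), Seconds(30))

    // Direct stream from the Kafka topic fed by the ESB (brokers and topic are illustrative)
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092,broker2:9092")
    val tickets = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("tickets"))

    // Assume each message is one JSON ticket; append every micro-batch to the raw zone as Parquet
    tickets.map(_._2).foreachRDD { rdd =>
      if (!rdd.isEmpty()) {
        val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
        sqlContext.read.json(rdd).write.mode("append").parquet("hdfs:///datalake/raw/tickets")
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
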
Trained clustering and regression machine learning models to detect customer behaviour patterns and to generate online vouchers directly at the till where the customer is paying, pricing each customer according to their profile (gold, silver) and preferences.
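As an illustration of this kind of model training, a minimal batch k-means sketch with Spark MLlib; the feature file, its layout and the number of segments are assumptions for illustration, not details of the Carrefour project:

import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.{SparkConf, SparkContext}

object CustomerSegmentation {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("customer-segmentation"))

    // Hypothetical feature file: customerId;monthlySpend;visitsPerMonth;avgBasketSize
    val features = sc.textFile("hdfs:///datalake/features/customers.csv")
      .map(_.split(";"))
      .map(f => Vectors.dense(f(1).toDouble, f(2).toDouble, f(3).toDouble))
      .cache()

    // Train k-means with one cluster per commercial segment (4 is an arbitrary choice here)
    val model = KMeans.train(features, 4, 20)

    // The segment predicted for a purchase vector decides which voucher is issued at the till
    val segment = model.predict(Vectors.dense(320.0, 6.0, 18.5))
    println(s"Customer falls into segment $segment")
  }
}
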
Customer: Aguas de Valencia
This project aims to replace an AS/400 system with a scalable, highly available Big Data platform based on a lambda architecture combined with high-performance microservices. These microservices reimplement the algorithms of the old COBOL code and add new functionality, e.g. use cases to detect water leaks and fraudulent consumption.
Another goal is to design the data models for an HDFS Data Lake and for NoSQL databases to store the water consumption figures and details, read every 8 seconds and sent to the central system by 700,000 electronic water meters installed across some 420 towns in several provinces of Spain.
My tasks were much the same in both projects:
• Collecting business requirements: identifying the sources of information and the data volumes relevant to the business area.
• Choosing the most suitable tools according to their capabilities and specifications.
• Predicting the throughput of every stage of the ETL process, identifying possible bottlenecks and scaling the environment according to those predictions.
• Designing the ingestion, transformation and classification ecosystem to extract and refine relevant features, and training different classification, regression and clustering models using real-time machine learning with Apache Spark Streaming and the Spark MLlib library (see the sketch after this list).
• Developing the ingestion code: Spark Streaming processes for ingestion and transformation, Akka Streams ETL (Kafka Streams and Kafka connectors are planned as soon as they are stable), and REST client web services to monitor the state of the frameworks.
• Developing business microservices with the Fabric8 framework, Spring Boot and Spring Cloud, AngularJS for the web interfaces, and Activiti as the workflow engine.
• Putting together a DevOps environment for continuous delivery: Git for the code repository, Jenkins for automated tests and deployments, and Docker on the Mesosphere PaaS for a scalable deployment environment.
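A minimal sketch of the real-time machine learning mentioned in the list above, using Spark Streaming with MLlib's StreamingKMeans; the feature directory, the dimensionality and the number of clusters are illustrative assumptions:

import org.apache.spark.SparkConf
import org.apache.spark.mllib.clustering.StreamingKMeans
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.streaming.{Seconds, StreamingContext}

object OnlineClustering {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("online-clustering"), Seconds(10))

    // Hypothetical directory where refined feature vectors land, one semicolon-separated line per event
    val features = ssc.textFileStream("hdfs:///datalake/refined/features")
      .map(line => Vectors.dense(line.split(";").map(_.toDouble)))

    // Streaming k-means: the model is updated with every micro-batch that arrives
    val model = new StreamingKMeans()
      .setK(4)
      .setDecayFactor(1.0)
      .setRandomCenters(3, 0.0)

    model.trainOn(features)
    model.predictOn(features).print() // cluster id for each incoming vector

    ssc.start()
    ssc.awaitTermination()
  }
}
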

Skills used

TIBCO Spotfire, Apache Hadoop, Spring, Git, AngularJS


ECM Architect (permanent position)
Client name anonymized, Spain
7/2012 – 2/2016 (3 years, 8 months)
High-tech and electronics industry
Period of activity

7/2012 – 2/2016

Description of activities

I worked in the following areas:

• Function: Big Data pre-sales engineering.
Period: 2014-01-11 - 2016-02-11
I prepared POCs for the following projects:
Customer: POLICÍA NACIONAL - SISTEMA DE REGISTRO DE NOMBRE DE PASAJEROS (PNR PASSENGER NAME RECORD)
I was a member of the administration team of a Hadoop cluster within a project that the National Police awarded to Fujitsu. The cluster consists of a master NameNode, a secondary NameNode and ten DataNodes of 250 GB each.
Following this initial collaboration, the National Police contacted us to prepare a proof of concept for a new project called Passenger Name Record (PNR). Some of the requirements to be met were:
o Capture and processing of structured and unstructured information, originating mainly from social networks. The content can be documents in any common format, including plain text.
o The platform must store historical records of the extracted information, allowing search and dynamic taxonomies that can be configured by authorized users.
o It should also generate alerts based on user-configurable thresholds and text patterns, discover similarities between different sources based on the searched patterns, and generate reports and graphics with the results.
I used these tools to prepare the POC:
o Apache Flume, as a connector to stream unstructured data into the Hadoop file system (HDFS); the data source connects to Twitter
o HCatalog, to build views on top of the imported data
o Sqoop, to import the generated structured data
o Spark SQL, Hive and Pig, to query data and export the results in CSV mode (see the sketch after this list)
o Neo4j and OrientDB, to load the extracted data into a graph and query the resulting network with Cypher over REST web services
o HP Haven, for OCR, document classification and text analysis, as well as language identification, concept search, text tokenization and sentiment analysis
o J2EE and JavaScript
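As referenced in the Spark SQL item above, a minimal sketch of querying the landed tweets and exporting the result in CSV mode; the HDFS paths, field names and the searched pattern are assumptions for illustration:

import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkConf, SparkContext}

object TweetReport {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("tweet-report"))
    val sqlContext = new SQLContext(sc)

    // Tweets landed in HDFS by the Flume Twitter source (path and fields are illustrative)
    val tweets = sqlContext.read.json("hdfs:///pnr/raw/tweets")
    tweets.registerTempTable("tweets")

    // Count mentions of a searched pattern per language
    val hits = sqlContext.sql(
      """SELECT lang, COUNT(*) AS mentions
        |FROM tweets
        |WHERE text LIKE '%searched pattern%'
        |GROUP BY lang""".stripMargin)

    // Export the result in CSV mode without external packages
    hits.rdd.map(_.mkString(",")).saveAsTextFile("hdfs:///pnr/reports/mentions_csv")
  }
}
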

Customer: EMT VALENCIA - SMARTCITY
EMT is the municipal public transport company of Valencia. They asked Fujitsu for a proof of concept in the context of one of their projects, called SmartCity, and I have worked on it since then.

The requirements to be met are:
o Capture and processing of structured information from historical tables containing bus route information, obtained from the signals emitted by the GPS devices installed on the buses.
o The platform should allow the administrator to define the metrics to analyse, such as the completion time of a route, the start time of a trip, the number of traffic incidents, etc. It should also allow alert thresholds to be assigned to these metrics, so that the end user has a control panel displaying a status colour for each analysed element (for example buses or routes), indicating whether it is in the correct position, whether there is any anomaly, or whether its status is critical (see the sketch after this list).
o The control panel must allow the administrator to click on any item shown, opening a second-level window with the components of the clicked item. These components are the set of metrics that make up the rule defining the threshold of the main element; they in turn have their own thresholds, so their status is displayed in a normal, warning or danger colour.
o The system must be capable of preventive analysis, such as detecting pending vehicle inspections or calculating alternative routes.
o The system must be able to generate alerts and perform reactive actions. In later stages, these alerts should be coupled to a messaging system that delivers them to the drivers through different channels.
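A minimal sketch of the metric and threshold rule described above; the type names, fields and aggregation are illustrative assumptions, not the actual data model of the demo:

// Status of a metric or panel item, derived from configurable thresholds
sealed trait Status
case object Ok extends Status
case object Warning extends Status
case object Danger extends Status

case class Metric(name: String, value: Double, warnAt: Double, dangerAt: Double) {
  def status: Status =
    if (value >= dangerAt) Danger
    else if (value >= warnAt) Warning
    else Ok
}

// A panel item (e.g. a bus or a route) aggregates its metrics: the worst status wins
case class PanelItem(name: String, metrics: Seq[Metric]) {
  def status: Status = metrics.map(_.status).foldLeft[Status](Ok) {
    case (Danger, _) | (_, Danger)   => Danger
    case (Warning, _) | (_, Warning) => Warning
    case _                           => Ok
  }
}
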
For the preparation of the demo I used the following tools:
o Sqoop, to import the structured data of the bus routes
o Spark SQL, Hive and Pig, to query data and export the results in CSV format
o Neo4j, to load the extracted data into a graph and query the resulting network with Cypher via REST web services, through JSON client invocations (see the sketch after this list)
o D3.js, AngularJS, Google Maps and Unity, for the graphic interface
o J2EE and JavaScript
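As referenced in the Neo4j item above, a minimal sketch of a Cypher query sent as a JSON client invocation to Neo4j's transactional REST endpoint; the host, node labels and properties are illustrative assumptions:

import java.io.OutputStreamWriter
import java.net.{HttpURLConnection, URL}
import scala.io.Source

object RouteGraphQuery {
  def main(args: Array[String]): Unit = {
    // Neo4j transactional Cypher endpoint; host, labels and properties are illustrative
    val conn = new URL("http://localhost:7474/db/data/transaction/commit")
      .openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestMethod("POST")
    conn.setRequestProperty("Content-Type", "application/json")
    conn.setDoOutput(true)

    // Cypher statement wrapped in the JSON body expected by the endpoint
    val cypher  = "MATCH (b:Bus)-[:SERVES]->(r:Route {id: 'L19'}) RETURN b.plate, b.incidents"
    val payload = s"""{"statements":[{"statement":"$cypher"}]}"""

    val out = new OutputStreamWriter(conn.getOutputStream)
    out.write(payload)
    out.close()

    // The JSON response carries one row per bus serving the route
    println(Source.fromInputStream(conn.getInputStream).mkString)
  }
}
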

• Function: Oracle DBA and Oracle WebLogic 11g application server administrator.
Period: 2014-10-18 - 2014-01-10
Customer: Ministry of Justice
Common tasks:
DBA of an Oracle grid farm with 400 Oracle 10g database servers and 500 Oracle 10g application servers on Windows 2003 Enterprise.

• Function: Oracle DBA and Oracle WebLogic 11g application server administrator.
Period: 2013-05-27 - 2014-10-17
Customer: Ministry of Education
Common tasks:
DBA of a six-node clustered Oracle 11g database and a ten-node Oracle WebLogic 11g application server cluster, plus one WebLogic Server 12c Oracle Grid Control.
• Function: ECM Architect
ECM Architect for the following customers:
o BBVA
o Telemadrid
o Universitat de Valencia
o Ministry of Education
o Instituto de Medicina Legal
Common tasks:
EMC2 Documentum
Sizing estimation; operating system and database installation, patching and upgrades; Documentum software installation, patching and upgrades with clustered Content Servers; defining and creating the document map; creating and installing xCP2 and Captiva Dispatcher processes. Developing DFS web services with a CMIS interface. The main feature of these web services is that they offer a standard interface and avoid the dependency on the EMC2 Documentum Productivity Layer. This allows the customer to keep their client web services decoupled from Documentum libraries such as dfc.jar, because the web service generates and manages temporary Content Server tokens.
Alfresco Enterprise Edition
Sizing estimation; operating system and database installation, patching and upgrades; Alfresco software installation on a clustered active-passive JBoss application server environment; product patching and upgrades; defining and creating the document map; creating document lifecycles; creating users, groups, roles and permission sets.

Skills used

Actuate, Apache Hadoop, Oracle Database, Oracle WebLogic Server, J2EE (Java EE), Oracle Grid Engine, JavaScript Object Notation (JSON), Representational State Transfer (REST), JavaScript


Documentum Consultant (ECM Consultant) (permanent position)
Client name anonymized, London
12/2011 – 7/2012 (8 months)
Public administration
Period of activity

12/2011 – 7/2012

Description of activities

Worked as an ECM Consultant. I designed and deployed, from development to production, these three greenfield projects:
• St. George's Hospital
• Admin Re
• University of Leeds
The tasks that I carried out in these projects were:
• Installing a two-node Oracle 11g cluster database
• Installing a two-node Documentum 6.7 cluster, activating Kerberos authentication for DFS and Webtop, generating the document map, creating document types, roles and permissions, etc.
• Installing Captiva InputAccel and Dispatcher and developing their processes for capture, OCR extraction, classification and storage in the Documentum repository
• Configuring Citrix remote desktop environments for remote document scanning

Skills used

Oracle Database, Citrix Presentation Server


Documentum Architect
Client name anonymized, London
8/2011 – 12/2011 (5 months)
Healthcare
Period of activity

8/2011 – 12/2011

Description of activities

Function: Documentum Architect
Integration architect for a multimillion-dollar project for the National Health Service in the UK. The project consists of a content management system to capture documents generated by the scanners located in primary care centres, as well as from emails and network drives. Ingestion is done through Kofax and Captiva InputAccel processes, and documents are processed in a business process defined in IBM Business Process Manager. Finally, the documents and the extracted metadata make up case files that are stored in Documentum 6.7.
DBA administration of the clustered Oracle 11g database where the Documentum repository was installed.

Skills used

Oracle Database


Documentum Consultant & Oracle DBA in the Health Ministry datacenter
Client name anonymized, London
10/2010 – 8/2011 (11 months)
Services sector
Period of activity

10/2010 – 8/2011

Description of activities

Feb 2011 – Aug 2011, Insa
Function: Documentum Consultant
Documentum Consultant on an Iberdrola (Scottish Power) project called "Employee Portal", with a distributed server environment on Solaris 10 and Windows 2003 Enterprise Edition clients. It is a high-concurrency system, with more than 500 content contributors and more than 8,000 Iberdrola employees based in Europe and the Americas.
• Development of jobs, methods, TBOs and SBOs to fulfil business requirements and provide web service catalogues between different Docbases.
• Developing code to invoke DFC and DFS at the UCF level, and generating workflows, lifecycles and ACLs to manage user and group permissions.
• Backups with "dump and load", and generation of docbase replicas with Oracle export and filesystem copies.
• Generating DARs with Composer and installing them in the Global Registry Docbase and secondary Docbases.
• Installation, updating and patching of Documentum Content Server, Web Publisher and DA; troubleshooting problems in the production environment.
• Oracle database administration.
Oct 2010 – Feb 2011, Insa
Function: Oracle DBA in the Health Ministry datacenter
Database administrator of a 4-node Oracle 10g RAC cluster holding 1.4 TB of data, with seven Oracle 11g application servers.
• Actively working on improving the performance of the database and application servers, periodically performing high-availability tests and disaster-recovery simulations.
• Patching and updating databases and application servers.
• Developing and programming backup scripts for RMAN, using Veritas NetBackup as the control system.
• Developing bash scripts for Solaris and AIX to monitor resource consumption and detect bottlenecks.
• 24x7x365 on-call duty.

Skills used

Actuate, Oracle Database, Oracle Solaris (SunOS)


Senior Java Analyst and Developer
Client name anonymized, Spain
2/2009 – 10/2010 (1 year, 9 months)
Services sector
Period of activity

2/2009 – 10/2010

Description of activities

Function: Senior Java Analyst and Developer
Analyst and senior Java developer on a greenfield project called Amara.
This is a BPM system developed by Indra. It can be embedded in any public or private portal and enables users to carry out administrative procedures with the owner of the virtual agency portal.
• Technologies used: Spring, Hibernate, EJB 3, DWR Ajax, Axis web services, CSS, JavaScript, Maven, Archiva, Ant, CVS, Subversion, Tortoise and Eclipse.
• I was responsible for the design of the application, making it modular enough to allow the documents of the registration process to be stored in any content management system, such as Documentum, FileNet, Alfresco or any other platform on the market. The connection to third-party applications was resolved through Axis2 web services.
• I installed the Alfresco, FileNet and Documentum repositories and developed Spring connector APIs, so that the application could connect to those content management system endpoints simply by wiring the Spring controllers inside the API client module.

Skills used

Eclipse, Spring, JavaScript


Technical support engineer and WDK developer
Client name anonymized, Brentford
10/2007 – 2/2009 (1 year, 5 months)
High-tech and electronics industry
Period of activity

10/2007 – 2/2009

Description of activities

Function: Technical Support Engineer and WDK Developer
Documentum technical support for the EMEA region. Patch development and delivery for the logical layer and the presentation layer of several Documentum products. My tasks were:
• Reproducing the customer's environment and troubleshooting the problems they reported in Powerlink.
• Development and delivery of patches and customizations for Documentum WDK products, to solve some of the identified problems.
• Installing and configuring Windows, Solaris, AIX and Linux machines with Oracle databases

Skills used

Oracle Database, Oracle Solaris (SunOS), AIX, Linux development


Oracle DBA
Steria, Spain
2/2005 – 10/2007 (2 years, 9 months)
Services sector
Period of activity

2/2005 – 10/2007

Description of activities

Database administrator of a 2-node Oracle 10g RAC cluster holding 0.5 TB of data, with four Oracle 11g application servers.
• Actively working on improving the performance of the database and application servers, periodically performing high-availability tests and disaster-recovery simulations.
• Patching and updating databases and application servers.
• Developing and programming backup scripts for RMAN, using Veritas NetBackup as the control system.
• Developing bash scripts for Solaris and AIX to monitor resource consumption and detect bottlenecks.
• 24x7x365 on-call duty.

Skills used

Oracle Database, Oracle Solaris (SunOS), AIX


Senior PL/SQL Analyst and Developer
Bull, Spain
5/2001 – 2/2005 (3 years, 10 months)
Services sector
Period of activity

5/2001 – 2/2005

Description of activities

Development of the Population Information System, a highly transactional application that handles the population register of Valencia. I deployed and maintained the application on a high-performance system on IBM AIX with an Oracle 8i database that was later migrated to 9i.

Tasks:
• I implemented the entity-relationship diagram with Oracle Designer, and loaded the population data into the repository tables with PL/SQL scripts and SQL*Loader.
• Development technologies were PL/SQL, HTML, CSS, Developer 6 (Forms and Reports) and Pro*C.
• Later migration of the application to the Struts framework.
• I installed and maintained the code repository environment with Microsoft SourceSafe.

Skills used

Oracle (allg.), AIX, PL/SQL, CSS (Cascading Style Sheet), HTML, Struts


Senior Java Analyst and Developer
Client name anonymized, Spain
1/1993 – 5/2001 (8 years, 5 months)
Services sector
Period of activity

1/1993 – 5/2001

Description of activities

Function: Senior Java Analyst and Developer
Analysis and development of the virtual banking portal "Bancaja Proxima".
The application architecture was designed, analysed, implemented and deployed to meet the needs of a high-availability, high-transaction, real-time banking environment, running on IBM AIX machines with an Oracle 8i database and application servers.
Technologies used:
• Java servlets and core Java for the business logic layer, JDBC for the data access layer, and JSP with JavaScript for the presentation layer.
• PL/SQL and SQL Loader scripts to load table content
• Unix bash scripting

Skills used

UNIX, Java (allg.), PL/SQL, JavaScript, JSP (Java Server Pages)


Education

IT Engineering
(Bachelor)
Year: 1992
Location: Valencia

Qualifications

Technical expertise:
Machine Learning: Near-real time Machine Learning with Apache Spark Streaming and MLlib
Big Data: Hadoop, Neo4j, Spark, Cassandra, Titan, OrientDB, HP Haven, ELK (Elasticsearch, Logstash, Kibana).
NoSQL: Cypher, MongoDB, Gremlin, Spark SQL, Tinkerpop
Microservices: JHipster + Spring Boot + Kafka + AngularJS + Rest Web Services
Programming Languages: Java, Scala, Python, JSP, HTML, C++, Visual C++, Visual Basic, PL/SQL
Content Management Systems: Documentum, Alfresco, Captiva InputAccel, Captiva InputAccel for Invoices, Captiva Dispatcher, xCP 2.2, ABBYY FlexiCapture, SAP, Portals, Kofax and IBM Process Manager.
Relational Databases: Oracle, MySQL, PostgreSQL, SQL Server.
Operating Systems: Installation and configuration of Linux, Windows, AIX and Solaris. Unix bash shell scripting.
Architectural Frameworks: J2EE, J2SE, Spring, J2ME, SOAP
Application Frameworks: Spring, Hibernate, Struts, Swing, Axis2, JUnit, XML JAXB, XML JDOM, DWR Ajax
Web Technologies: D3.js, JSP, JavaScript, CSS, HTTP, HTML, JSON, jQuery, REST, XML, Servlets.
Application Servers: Oracle Application Server, Oracle Bea Weblogic, JBoss, Apache Tomcat and Websphere
Tools: Eclipse, Netbeans, JDeveloper, Ant, Maven, Archiva, Visual Studio, CVS, Tortoise, Subversion, Oracle Designer, Oracle Developer 6i Release 2 and 10g (Oracle Forms and Reports), Oracle Portal, Toad, Topstyle Pro, Pl/sql Developer.
Methodologies: UML, Object Oriented Programming


CERTIFICATIONS
• Oracle Database 11g R2 Administrator Certified Associate (OCA)
• Documentum System Architect
• Fujitsu's Import-Export Capture and Compliance Solution for Banking
• ABBYY FlexiCapture 11 Certified Consultant
• ITIL Foundation v3

SECTORS
• Public Administration
• Virtual Banking

TRAINING
• Hadoop
• Neo4j
• Spark
• Cassandra
• HP Haven
• ELK (Elasticsearch, Logstash, Kibana)
• Oracle University XML Applications
• Oracle University Oracle Application Server Administration I
• Oracle University Oracle Database Administration I and II
• Oracle University Database 11g Performance Tuning
• Advanced Recovery with RMAN
• Oracle 9i SQL Tuning
• SOAP Web Services
• EMC2 Documentum Content Services for SAP
• EMC2 Documentum System Fundamentals
• EMC2 Documentum Web Development Kit Admin
• EMC2 Documentum Foundation Services (SOA) Admin
• EMC2 Documentum Compliance Manager Administrator
• EMC2 Documentum xCP 2.1 Process Designer
• Captiva InputAccel Administrator
• Captiva Dispatcher Fundamentals & Development
• ABBYY Administration
• Sun Developing Applications for the Java EE Platform

About me

My profile covers the areas of development and architecture in Big Data environments, Enterprise Content Management architecture and consultancy, administration of Oracle databases, and analysis and development with different languages and programming environments, including J2EE, Scala and Python.
During the last three years I have been working on Big Data projects as an architect, using the following tools:
Apache Hadoop, Neo4j, Apache Spark, Cassandra, HP Haven, ELK (Elasticsearch, Logstash, Kibana).
Also, during the last year I have been improving my data science skills, using near-real-time machine learning with Apache Spark Streaming and MLlib.
I also have the following working experience:
• More than 8 years in architect and consultant roles with Documentum, Alfresco, Liferay, Oracle Content Server, Captiva InputAccel, Captiva Dispatcher, Kofax and ABBYY FlexiCapture.
• 8 years as an Oracle database administrator on Unix, Linux and Windows servers.
• 4 years as a J2EE analyst and developer.
• 3 years as a PL/SQL developer.
I hold these certifications: Documentum Architect, ABBYY FlexiCapture 11, Oracle Administration (OCA) and ITIL v3.0.
I am also certified in "Fujitsu's Import-Export Capture and Compliance Solution for Banking".
I contribute top quality in all phases of a project's lifecycle, guaranteeing that the delivered versions fulfil the requirements and respecting the agreed deadlines.
Conversational level of English; I lived in London for more than four years.

Personal details

Languages
  • Spanish (native speaker)
  • English (native speaker)
  • German (basic knowledge)
Willingness to travel
on request
Work permit
  • European Union
Home office
essential
Profile views
1124
Age
51
Professional experience
27 years and 4 months (since 01/1993)
