
ML / NLP Data Scientist
- 0 references
- €125/hour
- 75019 Paris
- on request
- fr | en
- 21.08.2025
- Contract ready
Project & Professional Experience
5/2024 – 6/2025
Job description
At the national railway company, I worked on the two most used customer-facing chatbots, focusing on the evolution of the underlying intent and entity models:
•Supervise the migration of existing MaxEnt NLP models to transformer-based models.
•Oversee data gathering, annotation, and synthetic data generation and enhancement using generative models.
•Train models, assess quality and performance, and conduct traffic simulations prior to deployment on an optimized AWS infrastructure.
Additionally, I planned and supervised the integration of generative AI into the chatbot architecture:
•Design a new architecture incorporating Retrieval-Augmented Generation (RAG) and rephrasing techniques.
•Scrape and process data, optimize retrieval, and integrate with Claude and Cohere models on Amazon Bedrock and with MongoDB.
•Enhance the bot infrastructure by adding memory and context capabilities and integrating them into the Kotlin codebase.
I also built the deployment infrastructure using SageMaker and Terraform, handling autoscaling, cron-based automatic shutdown of dev environments, rolling deployments, and automatic rollback.
Git, Amazon Web Services (AWS), Document Retrieval, Generative AI, MongoDB, Natural Language Generation, Natural Language Processing, Natural Language Understanding, Python, Transformer, TypeScript
1/2024 – 4/2024
Job description
In parallel with my Iris mission, I served as the sole NLP/LLM engineer at Apicil, where I led the implementation of a RAG assistant for insurance brokers. The assistant was designed to help brokers find answers to their questions while presenting their offers. My responsibilities included:
•Planning and organizing the project roadmap based on stakeholder expectations and available infrastructure.
•Conducting meetings with business clients to identify and analyze recurring questions.
•Building a corpus of the most frequent and relevant questions.
•Creating a test set for evaluation and monitoring purposes.
•Identifying and adapting the most suitable embedding model for the client's use case.
•Setting up Mistral and engineering prompts.
•Building and integrating the RAG tool into the current production infrastructure.
Document Retrieval, Decision Tree Learning, Generative AI, Natural Language Understanding, Natural Language Processing, Git, Google Cloud, Docker, Python, Transformer
12/2021 – 12/2023
Job description
As part of an AI assistant project aimed at executing user tasks by interacting with the internet and Windows OS programs, I performed the following tasks:
•Plan and organize the project roadmap, orchestrating the development of the various microservices on the machine learning side of the project.
•Collaborate with the interface development teams to ensure cohesion and seamless integration of all AI microservices with the graphical interface.
•Design a variety of models to generate instructions and system commands from natural language prompts:
  •Develop a speech-to-text model based on Whisper for transcribing user commands.
  •Develop an NLP model for translating instructions from Spanish and French into English.
  •Fine-tune and quantize an LLM to decompose user instructions precisely into elementary commands.
  •Develop a solution based on a Mixture of Experts (MoE), including a supervised spaCy pipeline for key information identification and an SVM classifier for action type classification.
  •Develop a computer vision model for object detection, enabling the identification of interface components from hand-drawn sketches.
•Implement a pipeline for synthetic data to augment datasets and enhance project performance.
•Containerize and industrialize models using FastAPI, Docker, and GCP, and develop Bash scripts and crontab entries to automate and orchestrate microservices.
•Adhere to design patterns for modular and scalable code.
•Adopt an agile approach and perform versioning using Git. Conduct regular code reviews and mentor junior team members to improve code quality and ensure adherence to best practices.
•Maintain technical awareness by reading and presenting scientific articles on NLP, NLU, and Transfer Learning topics.
--------
As part of a project to digitally transform functions within a public-sector enterprise using state-of-the-art NLP technologies, my responsibilities were as follows:
•Work closely with the enterprise to understand its processes and produce the functional and technical design.
•Deliver an end-to-end Proof of Concept (POC) using Transformers, with a Streamlit interface.
•Design and implement data architectures for large-scale, complex systems, and industrialize MLOps solutions on AWS.
•Build and develop various types of artificial intelligence models, including:
  •A Retrieval-Augmented Generation (RAG) pipeline built on a retriever, a reranker, and the Mistral large language model (LLM), using the ReAct method to extract information from appeals in PDF format.
  •An NLP model for summarizing multipage legal documents.
  •OCR-based detection and anonymization of personal information in legal submissions (case tracking, appeals, filing requests) using Tesseract OCR.
  •Classical approaches such as TF-IDF and token classification, as well as endpoints based on language models (LMs) and large language models (LLMs).
•Conduct prompt engineering and establish alternative solutions using the OpenAI GPT API.
•Optimize the trade-off between model performance and resource efficiency through thorough hyperparameter optimization and techniques such as quantization and instruction tuning.
•Develop data pipelines including pre-processing and post-processing scripts on AWS SageMaker.
•Follow an agile approach and use Git rigorously for version control.
Gradient Boosting, LangChain, Computer Vision, Docker, Generative AI, Git, GPT, Jira, Natural Language Processing, Natural Language Understanding, PostgreSQL, Python, Scikit-learn, Transformer, TypeScript
Certificates
TensorFlow
Education
Ecole Centrale de Lille
Lille, France
Jules Ferry Preparatory Classes
About me
As an NLP Engineer, I bring deep expertise in designing and developing Machine Learning architectures, with a specialized focus on natural language processing. My skills include building robust systems that use LLMs for text understanding and generation. I have proven experience in service deployment and in optimizing content search within document databases, particularly for RAG applications.
My R&D background in NLP, especially with Transformers, provides solid capabilities to deliver mature, comprehensive machine learning solutions. I continuously evaluate and enhance model performance in production, ensuring regular updates to maintain relevance and efficiency.
Feel free to reach out for specific needs or questions.
Personal details
- French (native)
- English (fluent)
- European Union