levi.bernadine.pro
About the candidate
AWS Specialist | ML Engineer | Full Stack Data
Location
Education
Work & Experience
Developed and industrialized an end-to-end solution to help the business identify duplicate products within the catalog. Key responsibilities included:
- Industrialized the data pipeline: set up robust data workflows, including processing inputs for embedding models.
- Optimized the embeddings pipeline: developed an automated processing pipeline for product embeddings, incorporating image, text, analytics session, basket, and price data.
- Automated the clustering model inference: built a scalable pipeline for generating clustering models.
- Feedback campaigns with AWS SageMaker Ground Truth: launched campaigns to gather product similarity annotations from both business and casual users.
- Custom feedback app development: created and deployed a custom internal feedback application using Streamlit, which enabled:
  - Real-time monitoring of model performance.
  - Optimization of the choice of embedding weights.
  - Fine-tuning of clustering model parameters and prediction accuracy.
Technologies: Python, Databricks, AWS SageMaker, MLflow, PyTorch, Hugging Face, Spark, GitHub Actions, Airflow, Docker, EKS, Streamlit
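As an illustration of how weighted multi-source embeddings can surface duplicate candidates, here is a minimal sketch. It is not the production pipeline: the SKU names, vectors, weights, and threshold are hypothetical, and a simple pairwise cosine threshold stands in for the clustering model described above.

```python
import math
from itertools import combinations

def l2_normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def combine_embeddings(modalities, weights):
    """Concatenate per-modality embeddings, each normalized and scaled by sqrt(weight).

    With this scaling, the dot product of two combined vectors equals the
    weighted sum of the per-modality cosine similarities.
    """
    combined = []
    for name in sorted(weights):  # fixed order so vectors line up across products
        scale = math.sqrt(weights[name])
        combined.extend(scale * x for x in l2_normalize(modalities[name]))
    return combined

def find_duplicate_pairs(products, weights, threshold=0.9):
    """Return product-id pairs whose weighted similarity reaches the threshold."""
    vectors = {pid: combine_embeddings(m, weights) for pid, m in products.items()}
    pairs = []
    for a, b in combinations(sorted(vectors), 2):
        score = sum(x * y for x, y in zip(vectors[a], vectors[b]))
        if score >= threshold:
            pairs.append((a, b))
    return pairs

# Hypothetical toy catalog: two modalities per product (assumed data).
products = {
    "sku-1": {"text": [1.0, 0.0], "image": [0.0, 1.0]},
    "sku-2": {"text": [2.0, 0.0], "image": [0.0, 3.0]},
    "sku-3": {"text": [0.0, 1.0], "image": [1.0, 0.0]},
}
weights = {"text": 0.6, "image": 0.4}  # assumed weights, chosen to sum to 1
duplicate_pairs = find_duplicate_pairs(products, weights)
```

Keeping the weights as an explicit parameter is what makes them tunable from a feedback loop: annotated pairs can be re-scored under different weightings without recomputing the underlying embeddings.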
Developed and industrialized a high-performance solution to predict user churn under strict operational constraints. Improved an existing model's performance significantly (F1 score from 20
- Enhanced sampling techniques: refined the training data sampling method to improve model accuracy.
- Feature engineering: conducted impactful feature engineering to boost model performance.
- Integrated machine learning pipeline: built a parametric pipeline in SageMaker to automate preprocessing, sampling, training, and optimization steps.
- Applied MLOps best practices: implemented robust MLOps practices, including:
  - Model registry for efficient model versioning and management.
  - Data drift and concept drift monitoring to maintain model reliability.
  - Model explainability using SHAP (SHapley Additive exPlanations) values for transparency in decision-making.
Technologies: AWS SageMaker, Python, GitLab CI/CD, Spark, scikit-learn, pandas, Airtable
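The data drift monitoring mentioned above can take many forms. One common option is the Population Stability Index (PSI), sketched below for a single numeric feature. This is an assumption for illustration, not necessarily the metric used in the project; the function name, bin count, and sample data are hypothetical.

```python
import math

def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """Compare two samples of one numeric feature using equal-width bins.

    Bin edges come from the reference (expected) sample; values outside that
    range are clamped into the first or last bin. Empty bins are floored at
    eps so the logarithm stays defined.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Hypothetical feature samples: training-time reference vs. shifted serving data.
reference = [i / 100 for i in range(100)]
shifted = [0.5 + i / 200 for i in range(100)]

psi_same = population_stability_index(reference, reference)
psi_shifted = population_stability_index(reference, shifted)
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the cutoff is a convention and would need tuning per feature.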
- Built core application components for a data lake on the AWS platform, focusing on accessibility so that non-technical users can launch EMR and batch jobs seamlessly.
- Developed a cross-account table sharing solution to facilitate secure data sharing across different accounts.
- Created a custom processing framework for incremental ingestion of HDS tables from SAP, enabling efficient data updates.
- Designed and implemented data pipelines to support reliable data flow and processing.
- Managed platform operations by handling platform runs and resolving bugs to maintain system stability.
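Incremental ingestion, as in the framework listed above, is often built around a watermark: only rows changed since the last successful run are pulled and merged into the target. The sketch below assumes that pattern. The column names (`id`, `changed_at`) and the in-memory dictionary standing in for a table are hypothetical and do not reflect the actual SAP or HDS schema.

```python
def incremental_batch(rows, last_watermark, key="changed_at"):
    """Return the rows changed after the watermark, plus the new watermark."""
    fresh = [r for r in rows if r[key] > last_watermark]
    new_watermark = max((r[key] for r in fresh), default=last_watermark)
    return fresh, new_watermark

def upsert(target, rows, pk="id"):
    """Merge rows into the target keyed by primary key; later rows overwrite."""
    for r in rows:
        target[r[pk]] = r
    return target

# Hypothetical source extract; ISO dates compare correctly as strings.
source_rows = [
    {"id": 1, "changed_at": "2024-01-01", "value": "a"},
    {"id": 2, "changed_at": "2024-01-03", "value": "b"},
    {"id": 1, "changed_at": "2024-01-05", "value": "a2"},
]
target_table = {1: {"id": 1, "changed_at": "2024-01-01", "value": "a"}}

batch, watermark = incremental_batch(source_rows, last_watermark="2024-01-02")
upsert(target_table, batch)
```

Persisting the returned watermark only after the merge succeeds keeps a failed run safe to retry, since the same rows are simply selected again.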