Spent the last eight years convincing computers to make decent predictions and executives that we cannot predict next week's lottery numbers. Expert at explaining why your favorite metric is probably wrong and why that 99.9% accuracy model definitely won't work in production. Fluent in the art of setting realistic expectations and building pipelines that don't wake you up at 3am. Has mastered the dark art of productive async collaboration across time zones without losing sanity.
Keywords: Python, SQL, Spark, AWS, GCP, Docker, MLflow, TensorFlow, Sklearn, XGBoost, LightGBM, NLP, Survival Analysis, LTV, Recommender Systems, A/B Testing
Experience
Senior Data Scientist · RevenueCat · Apr 2025 - present
Led LTV modeling initiatives, from ideation to deployment.
Redesigned the LTV evaluation framework, migrating from a random to a time-based split to improve robustness and interpretability.
Developed a survival model to estimate user LTV, achieving a +5% improvement in the main business metric.
Guided the implementation of the framework used to compute statistical significance for conversion A/B tests.
Authored a methodology to compute confidence intervals for LTV estimates, enabling statistical significance testing.
Senior Data Scientist · Wallapop · Sep 2023 - Mar 2025
Led ML initiatives for the search team, focusing on matching, ranking, and software best practices. Used Solr as the search engine.
Trained a ranker ML model and deployed it to Solr, improving search-to-transaction ratio by +1.5%, adding 600K€ annually.
Trained and deployed a BERT model for a query classification service, improving search-to-transaction ratio by +1%, adding 400K€ annually.
Trained a PoC ranker model that used real-time features such as item popularity. Improved offline NDCG metrics by +10%.
Developed PoC solutions for query understanding (intent extraction from queries and structured attribute extraction from descriptions) using LLMs.
Refactored ETLs from manually executed notebooks to Spark jobs. Reduced execution time from days to hours, improving developer experience and scalability.
Organized events to increase machine learning visibility: internal hackathons, Meetups, and conferences.
Senior Data Scientist · Stuart · Nov 2019 - Sep 2023
Deployed a service that improved ETA accuracy by +30% using a deep learning model. Achieved a +28% improvement in cold-start locations.
Designed and developed pipelines to automatically train, evaluate, and deploy ETA models.
Built a distributed pipeline that processes, on a daily basis, all events dumped from Kafka to S3, allowing data scientists to analyze them and train models on them.
Designed an experimental dispatcher engine to solve the assignment problem using Python and OR-Tools.
Planned a PoC using LightGBM to estimate the probability of a courier accepting a specific delivery.
Mentored a senior software engineer who wanted to specialize in machine learning and data science.
Additional Data Science Experience
21 Buttons (Jun 2019 - Oct 2019): Built a recommender system with implicit data, and an image + text-based clothing classifier.
Privalia (Veepee) (May 2018 - Jun 2019): Built a forecasting model for clearance sales. Created a pricing engine on top of the model.
Gauss&Neumann (Oct 2017 - Feb 2018): Developed tools for monitoring and optimizing SEM campaigns using Google AdWords and Python.
Projects
www.alexmolas.com: Data science blog, maintained since 2020. Over 70k visits during 2023.