Iván Seldas — Data Scientist

BCN Noise Predictions Time Series

What if your ML model could hear the city and predict its next move?
I've developed a real-time noise forecasting system that leverages urban noise data.

Production pipeline: feature engineering, experiment tracking (MLflow), model registry & deployment.
Real-time stack: FastAPI service + Streamlit dashboard (Google Cloud Run) with Docker + CI/CD.

Results

70% error reduction vs. baseline models (RMSE 3.0 dB, MAE ~1.2 dB), accurately predicting when noise exceeds the 65-70 dB human safety threshold.
Forecast ranges capture 93–95% of real outcomes.
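
For readers unfamiliar with the metrics above, here is a minimal sketch of how RMSE, MAE, forecast-range coverage, and threshold exceedance could be computed. The function names and the sample readings are made up for illustration; this is not the project's evaluation code.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the inputs (dB here)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def interval_coverage(actual, lower, upper):
    """Share of observations that fall inside their forecast range."""
    hits = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    return hits / len(actual)

def exceeds_threshold(predicted, threshold_db=65.0):
    """Flag forecasts above a noise threshold (65 dB is the lower bound quoted above)."""
    return [p > threshold_db for p in predicted]

# Hypothetical readings in dB, not taken from the Barcelona data set.
actual = [62.0, 66.0, 70.0, 58.0]
predicted = [61.0, 67.0, 68.0, 59.0]
lower = [p - 1.5 for p in predicted]  # assumed symmetric forecast range
upper = [p + 1.5 for p in predicted]

print(round(rmse(actual, predicted), 3))
print(mae(actual, predicted))
print(interval_coverage(actual, lower, upper))
print(exceeds_threshold(predicted))
```

A coverage of 93-95% would mean that roughly 19 out of 20 observed values land inside the forecast range.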

Demo GitHub

Screenshot of the Barcelona Noise Forecasting project

Time Series Forecasting FastAPI Streamlit MLflow Docker CI/CD Google Cloud Platform

Sentiment-Driven Video Recommendation System

Ever wondered what AI fans really want to watch next?
I turned 180K+ YouTube comments into personalized video suggestions.

Video recommendation system that combines clustering over 2,500+ videos, PySpark sentiment analysis of 180,000+ comments, and TF‑IDF content similarity to generate personalized suggestions based on the current video.
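
The content-similarity step can be illustrated with a small, dependency-free TF-IDF and cosine similarity sketch. The smoothed IDF formula, the whitespace tokenizer, and the sample titles are assumptions chosen for brevity; the project itself may rely on a library implementation.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse {term: weight} TF-IDF vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency (an assumed variant).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: (cnt / len(tokens)) * idf[t] for t, cnt in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical video titles.
titles = [
    "intro to neural networks",
    "neural networks explained",
    "cooking pasta at home",
]
vecs = tfidf_vectors(titles)
print(cosine(vecs[0], vecs[1]) > cosine(vecs[0], vecs[2]))
```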

Results

Web app in production, deployed with Docker on Google Cloud Platform.
Recommendations computed via a final score conditioned on the currently watched video.
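
One plausible shape for such a final score is a weighted blend of content similarity to the current video, comment sentiment, and a same-cluster bonus. The sketch below assumes exactly that; the `Candidate` fields, the weights, and the sample values are hypothetical and not the formula used in the deployed app.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    video_id: str
    similarity: float   # content similarity to the current video, 0..1
    sentiment: float    # mean comment sentiment, rescaled to 0..1
    same_cluster: bool  # shares a cluster with the current video

def final_score(c, w_sim=0.6, w_sent=0.3, w_cluster=0.1):
    """Weighted blend; the weights are illustrative assumptions."""
    cluster_bonus = 1.0 if c.same_cluster else 0.0
    return w_sim * c.similarity + w_sent * c.sentiment + w_cluster * cluster_bonus

def recommend(candidates, top_n=2):
    """Rank candidates for the current video by descending final score."""
    return sorted(candidates, key=final_score, reverse=True)[:top_n]

# Made-up candidates scored against one currently watched video.
candidates = [
    Candidate("a", similarity=0.9, sentiment=0.5, same_cluster=True),
    Candidate("b", similarity=0.4, sentiment=0.9, same_cluster=False),
    Candidate("c", similarity=0.7, sentiment=0.8, same_cluster=True),
]
top = recommend(candidates)
print([c.video_id for c in top])
```

Because similarity is always measured against the video being watched, the same candidate can rank differently from one video to the next.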

Try Demo GitHub

Python RoBERTa Transformers Classification Docker Google Cloud Platform Embeddings CI/CD

Automated Content Generator - Agentic RAG Workflow

Manual drafts are slow. Generate long-form, cited content from your internal docs.

Automated research and article generation using your own content base, retrieved from a vectorized PostgreSQL database.
Delivered structured, verifiable content with citation tracking and quality controls.
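
A minimal sketch of retrieval with citation tracking follows. An in-memory list stands in for the vectorized PostgreSQL table, and the three-dimensional embeddings, the `Chunk` type, and the `min_score` cutoff are all hypothetical. The idea is that every retrieved passage keeps its source id, and nothing is handed to the generator when no stored passage is close enough.

```python
import math
from dataclasses import dataclass

@dataclass
class Chunk:
    doc_id: str
    text: str
    embedding: list

def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def retrieve(query_embedding, store, k=2, min_score=0.5):
    """Return up to k (score, chunk) pairs above the similarity cutoff."""
    scored = [(cosine(query_embedding, c.embedding), c) for c in store]
    scored = [(s, c) for s, c in scored if s >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

def build_context(hits):
    """Number each passage and attach its source so the draft can cite it."""
    if not hits:
        return None  # nothing relevant stored: better to stop than to invent
    return "\n".join(
        f"[{i}] {chunk.text} (source: {chunk.doc_id})"
        for i, (_, chunk) in enumerate(hits, start=1)
    )

# Toy store with made-up embeddings.
store = [
    Chunk("doc-a", "Pricing policy overview.", [1.0, 0.0, 0.0]),
    Chunk("doc-b", "Pricing exceptions for partners.", [0.8, 0.6, 0.0]),
    Chunk("doc-c", "Office parking rules.", [0.0, 0.0, 1.0]),
]
hits = retrieve([1.0, 0.0, 0.0], store)
print(build_context(hits))
```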

Results

Cuts content creation time by automating research and drafting
Maintains content structure and quality standards
Grounds output in stored documents only, with citations, to limit hallucination

n8n AI Agents LLM Fine-Tuning Supabase Semantic Search Embeddings RAG Pipelines

Customer Churn Prediction

Can you predict when a customer is about to churn?
I developed a supervised ML pipeline (LogReg, Random Forest, XGBoost) to detect and rank customers at highest risk of churn.

EDA + Feature Engineering to expose churn patterns across categorical and numeric signals.
Trained and tuned Logistic Regression, Random Forest, and XGBoost models optimizing for recall and ROC-AUC.

Results

Recall 0.773, Precision 0.94, ROC-AUC 0.928 on validation data (tuned XGBoost).
Potential 20% churn reduction by targeting the 500 highest-risk customers ranked by the model with a retention campaign assumed to be 30% effective.
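
The arithmetic behind a campaign estimate of this kind can be sketched as follows. Two things here are assumptions rather than facts from the project: that the reported precision also holds for the top 500 customers, and the total number of churners, which the text does not state and is a made-up placeholder.

```python
def expected_retained(top_k, precision, effectiveness):
    """Customers expected to be saved: true churners reached, times campaign success rate."""
    return top_k * precision * effectiveness

def churn_reduction(retained, total_churners):
    """Fraction of all churn avoided."""
    return retained / total_churners

# 500 contacted customers, precision 0.94, 30% campaign effectiveness (from the text).
retained = expected_retained(top_k=500, precision=0.94, effectiveness=0.30)

# HYPOTHETICAL base of churners, used only to show the last step of the calculation.
assumed_total_churners = 700
reduction = churn_reduction(retained, assumed_total_churners)

print(round(retained), round(reduction, 3))
```

Under these assumptions about 141 customers are retained; the resulting percentage depends entirely on the real number of churners in the customer base.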

Notebook

Screenshot of the Telecom Churn Prediction project

Classification

More details

Feature store powered pipeline with experiment tracking and SHAP-driven explanations. Calibrated ensemble for actionable retention strategies.

Segment-specific thresholds for contact prioritization.
Top drivers surfaced for each subscriber for tailored offers.
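
Segment-specific thresholds could look roughly like the sketch below. The segment names, threshold values, and customer records are hypothetical; the point is only that each segment gets its own cutoff before customers are ranked for contact.

```python
# Assumed cutoffs per segment; a lower value means the segment is contacted more readily.
SEGMENT_THRESHOLDS = {"prepaid": 0.40, "postpaid": 0.55}
DEFAULT_THRESHOLD = 0.50  # fallback for segments without a tuned cutoff

def contact_list(customers):
    """Keep customers at or above their segment's threshold, highest risk first."""
    flagged = [
        c for c in customers
        if c["churn_prob"] >= SEGMENT_THRESHOLDS.get(c["segment"], DEFAULT_THRESHOLD)
    ]
    return sorted(flagged, key=lambda c: c["churn_prob"], reverse=True)

# Made-up scored customers.
customers = [
    {"id": "A", "segment": "prepaid", "churn_prob": 0.45},
    {"id": "B", "segment": "postpaid", "churn_prob": 0.50},
    {"id": "C", "segment": "postpaid", "churn_prob": 0.70},
    {"id": "D", "segment": "business", "churn_prob": 0.52},
]
priority = contact_list(customers)
print([c["id"] for c in priority])
```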

Hi, I'm Iván Seldas.
Welcome to my Portfolio.

Programming

Geospatial Analysis

Machine Learning

Generative AI & NLP

Visualization

Cloud & Big Data

Automation & MLOps

Who I Am

Experience

Education & Certifications
