Emotion-Aware Recommendations: Building a YouTube Recommender That Understands Feelings

Ever wondered what AI fans really want to watch next?
I turned 180K+ YouTube comments into personalized video suggestions using sentiment analysis + clustering.

In an era where algorithms suggest what we watch, listen to, and read, I wanted to ask a deeper question: What if we could recommend content not just based on what it's about — but on how it makes people feel?

This idea led me to build the Sentiment-Driven Video Recommendation System, a fully deployed machine learning pipeline that recommends YouTube videos using not only content similarity but also viewer emotions extracted from comments, plus unsupervised video clustering.

Let me walk you through how I engineered it — and what I learned along the way.


The Problem

Most recommender systems ignore emotions. A viewer can like a video yet come away confused or disappointed, and the algorithm never knows. I built a system that captures that emotional impact using NLP.

The Approach

The system computes a final score for each candidate video using the following formula:

Final Score = Content Similarity × Sentiment Score × Cluster Boost
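
Each factor is unpacked in the three steps below. As a quick toy illustration of how the multiplication plays out (the video names and numbers here are invented for the example, not pipeline output):

```python
# Toy illustration of the scoring formula; values are made up.
def final_score(content_sim, sentiment, boost):
    return content_sim * sentiment * boost

# (content similarity, sentiment score, cluster boost) per candidate
candidates = {
    "video_a": (0.62, 1.15, 1.2),  # similar, strongly positive, same cluster
    "video_b": (0.80, 1.02, 1.0),  # more similar, lukewarm, different cluster
}

ranked = sorted(candidates, key=lambda v: final_score(*candidates[v]), reverse=True)
print(ranked)  # ['video_a', 'video_b'] -> 0.856 beats 0.816
```

Because the score is multiplicative, a slightly less similar candidate can still win on audience emotion and cluster membership.
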
  1. Content Similarity (TF-IDF + Cosine Similarity)

    To capture semantic similarity between videos, I transformed transcripts into high‑dimensional numeric vectors using TF‑IDF (Term Frequency–Inverse Document Frequency).

    Key steps:

    • Text cleaning: lowercasing, stopword removal, punctuation stripping
    • Vectorization (TfidfVectorizer): ngram_range=(1,2) → unigrams and bigrams; max_features → dimensionality control
    • Result: TF‑IDF matrix ∈ ℝⁿˣᵐ where n = videos and m = unique terms
    • Cosine similarity: Symmetric matrix; each cell (i, j) reflects topical similarity
    • Content baseline: Serves as the base of the content‑driven recommender, independent of user behavior
    Cosine similarity matrix (TF‑IDF)
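
Here's a minimal sketch of this step, assuming transcripts are already cleaned strings (the sample texts and the max_features cap are illustrative):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Stand-in transcripts; the real pipeline uses cleaned video transcripts.
transcripts = [
    "intro to neural networks and backpropagation",
    "how transformers changed natural language processing",
    "training a neural network from scratch in python",
]

vectorizer = TfidfVectorizer(
    lowercase=True,
    stop_words="english",
    ngram_range=(1, 2),   # unigrams and bigrams, as above
    max_features=20_000,  # illustrative dimensionality cap
)
tfidf = vectorizer.fit_transform(transcripts)  # TF-IDF matrix: n videos x m terms

# Symmetric matrix; cell (i, j) is the topical similarity of videos i and j.
content_sim = cosine_similarity(tfidf)
```
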
  2. Sentiment Analysis (RoBERTa + PySpark)

    To incorporate audience perception, I used a RoBERTa model fine‑tuned on GoEmotions to run multi‑class sentiment analysis.

    Technical pipeline:

    • Parallelization: per‑video comments processed in parallel with PySpark
    • Inference: each comment yields a distribution over 29 emotions (softmax)
    • Aggregation: per‑video mean → one emotion vector per video

    Sentiment Score calculation:

    Sentiment Scoreᵢ = 1 + Σⱼ wⱼ · pᵢⱼ
    • pᵢⱼ: probability of emotion j for video i
    • wⱼ: weight of each positively associated emotion
    • Effect: acts as a multiplier on semantic similarity, favoring videos with positive reactions
    Wordcloud of positive comments (sentiment features)
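
Below is a single-machine sketch of the inference and aggregation steps. The Hugging Face checkpoint and the emotion weights wⱼ are stand-in assumptions; the real pipeline distributes this loop with PySpark and uses my own fine-tuned model:

```python
import numpy as np
from transformers import pipeline

# Stand-in GoEmotions checkpoint; any GoEmotions-tuned RoBERTa is used the
# same way. In production this loop runs per video, in parallel via PySpark.
clf = pipeline(
    "text-classification",
    model="SamLowe/roberta-base-go_emotions",
    top_k=None,  # return the full emotion distribution per comment
)

LABELS = sorted(clf.model.config.id2label.values())

def video_emotion_vector(comments):
    """Mean emotion distribution over all comments of one video."""
    probs = []
    for out in clf(comments, truncation=True):
        scores = {d["label"]: d["score"] for d in out}
        probs.append([scores[label] for label in LABELS])
    return np.mean(probs, axis=0)

# Illustrative weights for positively associated emotions (the w_j above).
POSITIVE_WEIGHTS = {"admiration": 0.3, "joy": 0.4, "gratitude": 0.3}

def sentiment_score(emotion_vec):
    """1 + sum_j w_j * p_ij: a multiplier >= 1 that rewards positive reactions."""
    return 1.0 + sum(
        POSITIVE_WEIGHTS.get(label, 0.0) * p
        for label, p in zip(LABELS, emotion_vec)
    )
```
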
  3. Clustering (K-means + DBSCAN)

    To avoid redundant recommendations and ensure topical diversity, I applied unsupervised clustering on combined content, emotion, and engagement features.

    Features used:

    • Emotion vector: 29D (mean of per‑emotion probabilities)
    • Engagement: view count (log), like ratio, duration
    • Category: one‑hot/category embeddings

    Techniques applied:

    • K‑means: normalization with StandardScaler; k via elbow method + Silhouette Score
    • DBSCAN: detects dense clusters and outlier/viral content
    • Cluster Boost: ×1.2 if candidate and seed share a cluster → balances relevance and variety
    Video clusters (K‑means)
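
A condensed sketch of the clustering and boost logic, run here on a random stand-in for the real feature matrix (eps, min_samples, and the k search range are illustrative choices):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans, DBSCAN
from sklearn.metrics import silhouette_score

# Stand-in features: emotion vector + engagement columns per video.
rng = np.random.default_rng(0)
features = rng.random((200, 32))

X = StandardScaler().fit_transform(features)

# Choose k by silhouette score (paired with the elbow method in practice).
def kmeans_labels(k):
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)

best_k = max(range(2, 10), key=lambda k: silhouette_score(X, kmeans_labels(k)))
labels = kmeans_labels(best_k)

# DBSCAN separates dense groups from outliers (label -1), e.g. viral one-offs.
db_labels = DBSCAN(eps=6.0, min_samples=5).fit_predict(X)

def cluster_boost(seed, candidate):
    """x1.2 when seed and candidate videos share a K-means cluster."""
    return 1.2 if labels[seed] == labels[candidate] else 1.0
```
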

Tech Stack

• Natural Language Processing (NLP): TF‑IDF vectorization; RoBERTa fine‑tuned on GoEmotions for emotion classification
• Data Ingestion & Enrichment: 180K+ YouTube comments, video transcripts, and engagement metadata
• Scale‑out Processing: PySpark for parallel per‑video comment inference
• Production Deployment: fully deployed as a live app

Key Results