Behind the scenes

The Algo Logic

Vibe Canvas is powered by a real transformer model, trained on human-labelled emotion data. Here's how it works.

The Dataset: GoEmotions

Google Research's GoEmotions contains 58,009 Reddit comments, each annotated with one or more of 27 fine-grained emotion labels or marked neutral. A label-grouping scheme maps these labels down to the 7 emotions displayed in the app: anger, disgust, fear, joy, neutral, sadness, and surprise.

58,009 annotated comments

27 original labels → 7 mapped emotions

Source: Google Research
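The exact grouping used by the app is not listed on this page. As a minimal sketch, the snippet below assumes an Ekman-style grouping like the one commonly used with GoEmotions; the names `GROUPS`, `LABEL_TO_GROUP`, and `map_labels` are illustrative and not taken from the app's code.

```python
# Assumed Ekman-style grouping of the 27 GoEmotions labels (plus neutral).
# This is an illustration, not necessarily the mapping the app uses.
GROUPS = {
    "anger": ["anger", "annoyance", "disapproval"],
    "disgust": ["disgust"],
    "fear": ["fear", "nervousness"],
    "joy": ["joy", "amusement", "approval", "excitement", "gratitude", "love",
            "optimism", "relief", "pride", "admiration", "desire", "caring"],
    "sadness": ["sadness", "disappointment", "embarrassment", "grief", "remorse"],
    "surprise": ["surprise", "realization", "confusion", "curiosity"],
    "neutral": ["neutral"],
}

# Invert the grouping so each fine-grained label points at its coarse emotion.
LABEL_TO_GROUP = {fine: coarse for coarse, fines in GROUPS.items() for fine in fines}


def map_labels(fine_labels):
    """Map a comment's fine-grained labels to a sorted list of coarse emotions."""
    return sorted({LABEL_TO_GROUP[label] for label in fine_labels})
```

A comment can carry several fine-grained labels, so it may still map to more than one coarse emotion, for example `map_labels(["gratitude", "curiosity"])` yields both joy and surprise.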

The Model: DistilRoBERTa

emotion-english-distilroberta-base is a DistilRoBERTa model fine-tuned on six diverse emotion datasets, drawn from sources such as Twitter and Reddit. For each input it outputs a probability score for every one of the 7 emotion classes.

Architecture: DistilRoBERTa-base

Parameters: ~82M

Training data: 6 emotion datasets

Output: 7-class probabilities

The model runs on CPU in the FastAPI backend. There are no paid inference APIs; everything is self-hosted.
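As a rough sketch of CPU inference, the snippet below loads the model through the Hugging Face `transformers` text-classification pipeline and returns all class scores. The Hub id `j-hartmann/emotion-english-distilroberta-base` and the helper names `load_classifier`, `to_scores`, and `classify` are assumptions for illustration; the app's backend code may be organised differently.

```python
# Assumed Hugging Face Hub id for the model named above.
MODEL_NAME = "j-hartmann/emotion-english-distilroberta-base"


def load_classifier():
    """Build a text-classification pipeline pinned to CPU (device=-1)."""
    # Imported lazily so the pure helpers below work without transformers installed.
    from transformers import pipeline
    return pipeline("text-classification", model=MODEL_NAME, device=-1)


def to_scores(raw):
    """Turn [{'label': ..., 'score': ...}, ...] into a dict sorted by descending score."""
    ranked = sorted(raw, key=lambda item: item["score"], reverse=True)
    return {item["label"]: round(item["score"], 4) for item in ranked}


def classify(classifier, text):
    """Score one text; top_k=None asks the pipeline for every class, not just the best."""
    raw = classifier([text], top_k=None)[0]
    return to_scores(raw)
```

In a backend, `load_classifier()` would typically be called once at startup and the resulting object reused across requests, since loading the weights is far slower than scoring a single text.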

Evaluation Results

Evaluated on the GoEmotions validation split (mapped to 7 classes).

Overall accuracy: 65.4%

Macro F1: 61.2%

Per-class F1:

joy: 76%

anger: 69%

fear: 64%

sadness: 65%

neutral: 63%

disgust: 55%

surprise: 55%
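For reference, accuracy, per-class F1, and macro F1 can be computed as in the sketch below. It assumes each example has been reduced to a single gold label and a single predicted label; how multi-label GoEmotions comments were handled in the actual evaluation is not stated here, and the function names are illustrative.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of examples where the prediction equals the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_f1(y_true, y_pred, labels):
    """F1 per label, computed as 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label gets a false positive
            fn[t] += 1  # gold label gets a false negative
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)
```

Macro F1 weights every class equally, so rarer classes such as disgust and surprise affect it as much as frequent ones like neutral.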

Interpretability

The “Key signals detected” tokens shown on the result screen are extracted using a keyword-matching heuristic rather than read from the model itself: we maintain a curated vocabulary of emotionally charged words per class, then scan the input text for matches. Token weight is proportional to word length and specificity.

The reasoning sentence is a template filled with the top matching tokens, giving users a plain-language explanation of why the model landed on a given emotion.
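The heuristic and the template can be sketched as follows. The vocabulary, the specificity values, the weight formula (word length times specificity), and the sentence wording are all hypothetical stand-ins, since the app's curated lists and template are not shown here.

```python
import re

# Hypothetical vocabulary: emotion -> {word: specificity in (0, 1]}.
VOCAB = {
    "joy": {"happy": 0.6, "thrilled": 0.9, "delighted": 0.9},
    "anger": {"furious": 0.9, "annoyed": 0.6, "hate": 0.7},
    "sadness": {"heartbroken": 1.0, "sad": 0.5, "miserable": 0.8},
}


def key_signals(text, emotion, top_n=3):
    """Return up to top_n (word, weight) pairs for vocabulary words found in text."""
    vocab = VOCAB.get(emotion, {})
    tokens = re.findall(r"[a-z']+", text.lower())
    weights = {}
    for token in tokens:
        if token in vocab:
            # Assumed weighting: longer and more specific words score higher.
            weights[token] = len(token) * vocab[token]
    # Sort by descending weight, breaking ties alphabetically.
    return sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


def reasoning(emotion, signals):
    """Fill a plain-language template with the top matching tokens."""
    if not signals:
        return f"The model detected {emotion}, but no key signal words were found."
    words = ", ".join(f'"{word}"' for word, _ in signals)
    return f"The model detected {emotion} based on words like {words}."
```

Because the signals come from a fixed word list, they describe the input text, not the model's internal computation; a prediction can be correct even when no vocabulary word matches.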

Future versions will use attention-weight visualization or SHAP values for deeper explainability.