Behind the scenes
The Algo Logic
Vibe Canvas is powered by a real transformer model, trained on human-labelled emotion data. Here's how it works.
The Dataset: GoEmotions
Google Research's GoEmotions contains 58,009 Reddit comments, each annotated with one or more of 27 fine-grained emotion labels or marked neutral. We collapse these labels into the 7 emotions displayed in the app (anger, disgust, fear, joy, neutral, sadness, surprise) using a label-grouping scheme.
58,009 annotated comments
27 original labels → 7 mapped emotions
Source: Google Research
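The label grouping can be sketched roughly as follows. The app's actual mapping is not documented on this page, so the groups below are an assumption, loosely following the Ekman-style grouping commonly used with GoEmotions. The names GROUPS, FINE_TO_COARSE, and map_labels are hypothetical, and the dataset's own neutral label is passed through unchanged.

```python
# Hypothetical grouping of GoEmotions labels into the app's 7 classes.
# This is an illustrative assumption, not the app's confirmed mapping.
GROUPS = {
    "anger": ["anger", "annoyance", "disapproval"],
    "disgust": ["disgust"],
    "fear": ["fear", "nervousness"],
    "joy": [
        "admiration", "amusement", "approval", "caring", "desire",
        "excitement", "gratitude", "joy", "love", "optimism",
        "pride", "relief",
    ],
    "sadness": ["sadness", "disappointment", "embarrassment", "grief", "remorse"],
    "surprise": ["surprise", "realization", "confusion", "curiosity"],
    "neutral": ["neutral"],
}

# Invert the groups into a lookup table: fine-grained label -> coarse class.
FINE_TO_COARSE = {
    fine: coarse for coarse, fines in GROUPS.items() for fine in fines
}


def map_labels(fine_labels):
    """Map one comment's fine-grained labels to a sorted list of coarse classes."""
    return sorted({FINE_TO_COARSE[label] for label in fine_labels})


print(map_labels(["annoyance", "disapproval"]))
print(map_labels(["gratitude", "surprise"]))
```

Because a comment can carry several fine-grained labels, it may still map to more than one of the 7 classes after grouping; how the app resolves such cases is not stated here.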
The Model: DistilRoBERTa
emotion-english-distilroberta-base is a DistilRoBERTa model fine-tuned on six diverse emotion datasets drawn from sources such as Twitter and Reddit. For every input it outputs a probability score for each of the 7 emotion classes.
Architecture: DistilRoBERTa-base
Parameters: ~82M
Training data: 6 emotion datasets
Output: 7-class probabilities
The model runs on CPU in the FastAPI backend. No paid inference APIs are involved; everything is self-hosted.
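To make the "7-class probabilities" output concrete, here is a minimal sketch of how raw per-class scores become probabilities. It assumes the classification head produces one logit per class and that softmax is applied, as is usual for single-label classifiers. The label order, the example logits, and the helper names softmax and to_scores are illustrative assumptions, not values taken from the app.

```python
import math

# Assumed label order (alphabetical); the real model config may differ.
LABELS = ["anger", "disgust", "fear", "joy", "neutral", "sadness", "surprise"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_scores(logits):
    """Pair each class label with its probability."""
    if len(logits) != len(LABELS):
        raise ValueError("expected one logit per emotion class")
    return dict(zip(LABELS, softmax(logits)))


# Made-up logits for a single input text.
scores = to_scores([0.2, -1.0, -0.5, 3.1, 1.0, -0.3, 0.4])
top = max(scores, key=scores.get)
print(top, round(scores[top], 3))
```

Every class receives a score, so the app can show the full distribution rather than only the winning emotion.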
Evaluation Results
Evaluated on the GoEmotions validation split (mapped to 7 classes).
Overall Accuracy: 65.4%
Macro F1: 61.2%
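For readers unfamiliar with the two metrics, the sketch below shows how accuracy and macro F1 are computed from gold and predicted labels. It assumes each evaluation example has been reduced to a single gold class after mapping; the toy labels are invented for illustration and are unrelated to the figures reported above.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy data, invented for illustration only.
y_true = ["joy", "joy", "anger", "sadness"]
y_pred = ["joy", "anger", "anger", "sadness"]
labels = ["anger", "joy", "sadness"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred, labels)
print(f"accuracy={acc:.3f} macro_f1={f1:.3f}")
```

Macro F1 averages the per-class scores without weighting by class frequency, which is why it can sit below accuracy when less common emotions are harder to predict.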
Interpretability
The “Key signals detected” tokens shown on the result screen are extracted using a keyword-matching heuristic: we maintain a curated vocabulary of emotionally charged words for each class, then scan the input text for matches. Each token's weight is proportional to its word length and specificity.
The reasoning sentence is a template filled with the top matching tokens, giving users a plain-language explanation of why the model landed on a given emotion.
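A minimal sketch of this heuristic is shown below. The vocabulary, the specificity values, the weight formula (word length multiplied by specificity), and the sentence template are all hypothetical stand-ins, since the app's curated word lists and exact wording are not given here.

```python
import re

# Hypothetical vocabulary: emotion -> {word: specificity}. Values are made up.
VOCAB = {
    "joy": {"happy": 1.0, "delighted": 1.5, "thrilled": 1.5},
    "anger": {"angry": 1.0, "furious": 1.5},
    "sadness": {"sad": 1.0, "heartbroken": 1.5},
}


def key_signals(text, emotion, top_n=3):
    """Return up to top_n (word, weight) pairs for vocabulary words found in text."""
    words = re.findall(r"[a-z']+", text.lower())
    vocab = VOCAB.get(emotion, {})
    weights = {}
    for word in words:
        if word in vocab:
            # Assumed weighting: word length scaled by specificity.
            weights[word] = len(word) * vocab[word]
    # Highest weight first; ties broken alphabetically.
    return sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


def reasoning(text, emotion):
    """Fill a plain-language template with the top matching tokens."""
    signals = key_signals(text, emotion)
    if not signals:
        return f"The model detected {emotion}, but no key signal words were found."
    tokens = ", ".join(f'"{word}"' for word, _ in signals)
    return f"The model detected {emotion} because of words like {tokens}."


print(reasoning("I am so happy and absolutely thrilled!", "joy"))
```

Note that this explanation is produced independently of the transformer: it describes which vocabulary words appear in the text, not what the model actually attended to.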
Future versions will use attention-weight visualization or SHAP values for deeper explainability.