Behind the scenes

The Algo Logic

Vibe Canvas is powered by a real transformer model, trained on human-labelled emotion data. Here's how it works.

01

The Dataset: GoEmotions

Google Research's GoEmotions contains 58,009 Reddit comments, each annotated with one or more of 27 fine-grained emotion labels, or with a neutral label. We map these labels down to the 7 emotions displayed in the app (anger, disgust, fear, joy, neutral, sadness, surprise) using a label-grouping scheme.

58,009

annotated comments

27

original labels

→ 7

mapped emotions

Source: Google Research
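To make the label grouping concrete, here is a minimal sketch of one possible scheme. It is modelled on the Ekman-style grouping published alongside GoEmotions. The names `LABEL_GROUPS`, `FINE_TO_APP`, and `map_labels` are illustrative assumptions, and the app's actual mapping may differ.

```python
# Hypothetical grouping of GoEmotions labels into the app's 7 emotions.
# Modelled on the Ekman-style mapping released with the dataset; the
# scheme actually used by the app is an assumption here.
LABEL_GROUPS = {
    "anger": ["anger", "annoyance", "disapproval"],
    "disgust": ["disgust"],
    "fear": ["fear", "nervousness"],
    "joy": ["admiration", "amusement", "approval", "caring", "desire",
            "excitement", "gratitude", "joy", "love", "optimism",
            "pride", "relief"],
    "sadness": ["disappointment", "embarrassment", "grief", "remorse",
                "sadness"],
    "surprise": ["confusion", "curiosity", "realization", "surprise"],
    "neutral": ["neutral"],
}

# Invert the grouping: fine-grained label -> app emotion.
FINE_TO_APP = {
    fine: app for app, fines in LABEL_GROUPS.items() for fine in fines
}


def map_labels(fine_labels):
    """Map a comment's fine-grained labels to a sorted list of app emotions."""
    return sorted({FINE_TO_APP[label] for label in fine_labels})
```

For example, `map_labels(["annoyance", "disapproval"])` collapses to a single `anger` label. A comment whose labels fall into different groups maps to more than one emotion, so a single-label evaluation would need an extra tie-breaking rule that this sketch does not cover.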

02

The Model: DistilRoBERTa

emotion-english-distilroberta-base is a DistilRoBERTa model fine-tuned on six diverse emotion datasets, drawn from sources including Twitter and Reddit. For each input it outputs a probability score for every one of the 7 emotion classes at once.

Architecture

DistilRoBERTa-base

Parameters

~82M

Training data

6 emotion datasets

Output

7-class probabilities

The model runs on CPU in the FastAPI backend. No paid inference APIs are used; everything is self-hosted.
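The 7-class probabilities can be illustrated with a small sketch. It assumes the scores come from a softmax over the classifier's raw outputs (logits), which is the usual choice for single-label classifiers. The label order and the example logits below are made up for illustration and are not real model output.

```python
import math

# Assumed label order, for illustration only; a real model defines its own
# index-to-label mapping in its configuration.
LABELS = ["anger", "disgust", "fear", "joy", "neutral", "sadness", "surprise"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_emotions(logits):
    """Pair each label with its probability, highest probability first."""
    probs = softmax(logits)
    return sorted(zip(LABELS, probs), key=lambda pair: pair[1], reverse=True)


# Made-up logits standing in for the model's classification-head output.
example_logits = [0.2, -1.0, -0.5, 3.1, 1.4, -0.3, 0.8]
ranking = rank_emotions(example_logits)
```

Here `ranking[0]` is the top emotion, and the remaining entries give the full distribution that a result screen could display.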

03

Evaluation Results

All metrics are computed on the GoEmotions validation split, with its labels mapped to the 7 classes.

Overall Accuracy

65.4%

Macro F1

61.2%

joy: F1 76%
anger: F1 69%
fear: F1 64%
sadness: F1 65%
neutral: F1 63%
disgust: F1 55%
surprise: F1 55%
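For readers unfamiliar with these metrics, the sketch below shows how accuracy, per-class F1, and macro F1 are conventionally computed for single-label predictions. Macro F1 is the unweighted mean of the per-class F1 scores. The toy labels are invented for illustration and are unrelated to the figures above, and the function names are assumptions rather than the app's evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_f1(y_true, y_pred, labels):
    """F1 per label, using F1 = 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred, labels)
    return sum(scores.values()) / len(scores)


# Toy data, invented for illustration.
toy_labels = ["anger", "fear", "joy"]
y_true = ["joy", "joy", "anger", "fear", "anger", "joy"]
y_pred = ["joy", "anger", "anger", "fear", "joy", "joy"]

toy_accuracy = accuracy(y_true, y_pred)
toy_f1 = per_class_f1(y_true, y_pred, toy_labels)
toy_macro = macro_f1(y_true, y_pred, toy_labels)
```

Because macro F1 weights every class equally, weaker classes such as disgust and surprise pull it down as much as stronger classes lift it, regardless of how often each class occurs.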
04

Interpretability

The “Key signals detected” tokens shown on the result screen are extracted using a keyword-matching heuristic: we maintain a curated vocabulary of emotionally charged words per class, then scan the input text for matches. Token weight is proportional to word length and specificity.

The reasoning sentence is a template filled with the top matching tokens, giving users a plain-language explanation of why the model landed on a given emotion.
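The heuristic and the template can be sketched together as follows. This is a minimal illustration under stated assumptions: the vocabulary, the function names, and the sentence wording are hypothetical, and the weight uses word length only because the specificity score is not described here.

```python
import re

# Hypothetical vocabulary; the app's curated word lists are not shown here.
EMOTION_VOCAB = {
    "joy": {"happy", "delighted", "thrilled", "love"},
    "anger": {"furious", "angry", "outraged", "hate"},
    "sadness": {"sad", "heartbroken", "miserable", "lonely"},
}


def key_signals(text, emotion, top_n=3):
    """Return up to top_n (token, weight) pairs matched for the emotion."""
    vocab = EMOTION_VOCAB.get(emotion, set())
    tokens = re.findall(r"[a-z']+", text.lower())
    matches = {tok for tok in tokens if tok in vocab}
    if not matches:
        return []
    longest = max(len(tok) for tok in matches)
    # Assumption: weight by relative word length; specificity is omitted.
    weighted = [(tok, len(tok) / longest) for tok in matches]
    # Highest weight first, ties broken alphabetically for stable output.
    return sorted(weighted, key=lambda pair: (-pair[1], pair[0]))[:top_n]


def reasoning_sentence(emotion, signals):
    """Fill a hypothetical template with the top matching tokens."""
    if not signals:
        return f"The model detected {emotion}, but no key signal words were matched."
    words = ", ".join(f'"{tok}"' for tok, _ in signals)
    return f"The model detected {emotion}, supported by words such as {words}."


signals = key_signals("I am so happy and absolutely thrilled today!", "joy")
sentence = reasoning_sentence("joy", signals)
```

Note that a heuristic of this kind explains the prediction after the fact using a word list; it does not inspect the model itself, which is why attribution methods give a more faithful picture.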

Future versions will use attention-weight visualization or SHAP values for deeper explainability.