MrBERT-es ESG News Classifier (Spanish)
Model summary
Fine-tuned ESG news classifier for Spanish equity market headlines. Based on BSC-LT/MrBERT-es (ModernBERT, 150M parameters, bilingual ES/EN). Classifies Spanish financial news headlines into ESG pillars (Environmental, Social, Governance) and sentiment (Positive / Negative / Neutral / NA). Training regime: human-gold annotations (1,688 events) augmented with LLaMA-3.1-8B SFT silver labels (62,800 events).
Repository layout
This repository ships four sibling sub-models, one per task head,
because the fine-tuned architecture is a shared BSC-LT/MrBERT-es
encoder with separate classification heads (three binary pillar heads
plus one 4-class sentiment head), not a single multi-class classifier.
mrbert-es-esg/
├── esg_E/       binary head: Environmental pillar (0 = not-E, 1 = E)
├── esg_S/       binary head: Social pillar (0 = not-S, 1 = S)
├── esg_G/       binary head: Governance pillar (0 = not-G, 1 = G)
└── sentiment/   4-class head: {Pos, Neg, Neu, NA}
Each sub-folder contains:
- model.safetensors: TAPT-adapted ModernBERT-es encoder (~599 MB)
- separate_head.pt: classification head weights for this task
- config.json: ModernBertModel encoder config
- separate_classifier_config.json: head metadata (pillar, hidden_size, num_sentiment_classes)
- tokenizer.json, tokenizer_config.json: fast tokenizer (ModernBERT uses a single tokenizer.json; no vocab.txt / special_tokens_map.json)
ESG pillar predictions are independent binary classifications: the same headline can be flagged on any combination of E, S, G, or none (multi-label by design, per the project codebook).
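Because the three pillar heads are independent, their outputs can be combined into a label set with a per-head threshold. The sketch below assumes each binary head's output has already been reduced to a positive-class probability. The helper name `flag_pillars`, the 0.5 threshold, and the example probabilities are illustrative assumptions, not part of the released code.

```python
from typing import Dict, Set


def flag_pillars(probs: Dict[str, float], threshold: float = 0.5) -> Set[str]:
    """Return the pillars whose positive-class probability meets the threshold.

    Each pillar is judged on its own, so the result can be empty,
    a single pillar, or any combination of E, S and G.
    """
    return {pillar for pillar, p in probs.items() if p >= threshold}


# Hypothetical probabilities for one headline (not real model output).
example = {"E": 0.91, "S": 0.12, "G": 0.67}
flagged = flag_pillars(example)
print(sorted(flagged))  # ['E', 'G']
```

A single shared threshold is the simplest choice; per-pillar thresholds tuned on held-out gold data may suit the G pillar better, given its lower MCC.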
Intended use
- ESG signal extraction from Spanish business press
- Event study research on ESG news and equity market response
- Spanish financial NLP benchmarking
Out of scope: high-stakes automated decisions without human review; languages other than Spanish.
How to use
The classification head is a custom nn.Module stored in
separate_head.pt, separate from the encoder weights, so the standard
AutoModelForSequenceClassification / pipeline("text-classification", ...)
path does not work out of the box. To run inference, load the
encoder with AutoModel and apply the matching separate_head.pt for
each task.
Minimal loading sketch (encoder only; head loading uses the project's
SeparateClassifier class):
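import torch
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer, AutoModel
repo = "DReggio/mrbert-es-esg"
head = "esg_E" # or "esg_S", "esg_G", "sentiment"
# Each task lives in a sub-folder of the repo, so pass it via subfolder=.
tok = AutoTokenizer.from_pretrained(repo, subfolder=head)
enc = AutoModel.from_pretrained(repo, subfolder=head)
# Download the head weights from the Hub before loading them with torch.
head_path = hf_hub_download(repo_id=repo, filename=f"{head}/separate_head.pt")
state = torch.load(head_path, map_location="cpu")
# Instantiate SeparateClassifier(...) from the project repo, load state, then run forward.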
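Once a head has produced logits, turning them into a label is a softmax followed by an argmax. The plain-Python sketch below makes two assumptions that should be checked against the project code: the sentiment head emits four logits, and the index-to-label order passed in `labels` matches the order used in training. The order shown only mirrors the repository layout listing and is a placeholder. `decode_logits` is a hypothetical helper, and the logits are made up.

```python
import math
from typing import List, Sequence, Tuple


def decode_logits(
    logits: Sequence[float], labels: Sequence[str]
) -> Tuple[str, List[float]]:
    """Softmax the logits and return (predicted label, probabilities)."""
    if len(logits) != len(labels):
        raise ValueError("need exactly one label per logit")
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs


# ASSUMED label order; verify against the project's head metadata.
labels = ["Pos", "Neg", "Neu", "NA"]
label, probs = decode_logits([0.2, 2.1, -0.5, -1.0], labels)
print(label)  # Neg
```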
Training data
- Gold set: 1,688 human-annotated Spanish ESG events (E / S / G binary + 4-class sentiment).
- Silver augmentation (R3): 62,800 events labelled by a LLaMA-3.1-8B SFT annotator (variant4 prompt, masked-loss QLoRA, mean κ = 0.761 vs human gold).
R3 was selected over R2 (Qwen3-8B SFT silver) because LLaMA's higher Governance base-rate yields richer positive-class signal for the G pillar; the R3 fine-tune lifts macro-F1 by 0.7–3.1 pp and MCC_G by 2.2–6.6 pp over the gold-only R1b baseline.
Evaluation (gold_test, n = 267)
| Metric | Value |
|---|---|
| Macro F1 | 0.8460 |
| F1_E | 0.8866 |
| F1_S | 0.8485 |
| F1_G | 0.8029 |
| MCC_G | 0.5953 |
| κ_sentiment | 0.7006 |
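The Macro F1 row is consistent with an unweighted mean of the three pillar F1 scores. The card does not state the averaging rule, so that reading is an assumption, but the arithmetic can be checked directly from the table values:

```python
# Pillar F1 scores copied from the evaluation table above.
pillar_f1 = {"E": 0.8866, "S": 0.8485, "G": 0.8029}

# Unweighted (macro) average across the three pillars.
macro_f1 = sum(pillar_f1.values()) / len(pillar_f1)
print(f"{macro_f1:.4f}")  # 0.8460
```

The sentiment head is not part of this average; its agreement is reported separately as κ_sentiment.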
Training procedure
- Base model: BSC-LT/MrBERT-es
- Architecture: separate binary classification head per pillar (E / S / G) + 4-class sentiment head, over a shared TAPT-adapted ModernBERT-es encoder
- Training regimes: R1b (gold only) → R2 (gold + Qwen3 silver) → R3 (gold + LLaMA silver, this checkpoint)
- Framework: HuggingFace Transformers + PyTorch
- Hardware: Google Colab Pro A100 40GB
Limitations
- Governance (G) pillar: F1 0.80, MCC_G 0.60. This is a structurally ambiguous category; use with caution for G-specific downstream tasks.
- Silver-label bias: R3 inherits the labelling distribution of the LLaMA-3.1-8B SFT annotator, which has a higher G base-rate (47.5 %) than the Qwen3 silver corpus (34.0 %); applications sensitive to G prevalence should consider the gold-only R1b checkpoint or the Qwen3-silver R2 checkpoint as sensitivity comparisons.
- Training data: Spanish peninsular financial press (2014–2024); Latin American Spanish and social media not covered.
License
Apache 2.0, derived from BSC-LT/MrBERT-es
(© Barcelona Supercomputing Center, Apache 2.0).
Citation
@mastersthesis{reggio2026esg,
author = {Damien Reggio},
title = {ESG News Classification and Market Response
in Spanish Equity Markets},
school = {FernUni Switzerland},
year = {2026}
}