---
language: en
license: mit
tags:
- text-summarization
- summarization
- bart
- small-model
- synthetic-data
- tanaos
- artifex
base_model:
- facebook/bart-base
datasets:
- tanaos/synthetic-summarization-dataset-v1
library_name: transformers
task:
  type: summarization
  description: "Abstractive text summarization: condenses long documents into concise, fluent summaries."
---


# tanaos-text-summarization-v1: A small but performant text summarization model

This model was created by Tanaos with the [Artifex Python library](https://github.com/tanaos/artifex).

This is an **abstractive text summarization model** based on [facebook/bart-base](https://huggingface.co/facebook/bart-base) and fine-tuned on a synthetic dataset to produce concise, fluent summaries of longer texts. The model uses beam search decoding and is optimized for general-purpose summarization across a variety of domains.

## How to Use

Use this model through the [Artifex library](https://github.com/tanaos/artifex).

Install Artifex with:

```bash
pip install artifex
```

Then use the model with:

```python
from artifex import Artifex

summarizer = Artifex().text_summarization()

text = """
The Amazon rainforest, often referred to as the "lungs of the Earth", produces about
20% of the world's oxygen and is home to an estimated 10% of all species on the planet.
Deforestation driven by agriculture, logging, and infrastructure development has
destroyed roughly 17% of the forest over the last 50 years, raising urgent concerns
among scientists and policymakers about biodiversity loss and climate change.
"""

summary = summarizer(text)
print(summary)

# >>> "The Amazon rainforest produces 20% of the world's oxygen and harbors 10% of all species, but deforestation has been a major concern."
```

## Model Description

- **Base model:** `facebook/bart-base`
- **Architecture:** `BartForConditionalGeneration` (sequence-to-sequence)
- **Task:** Abstractive text summarization
- **Language:** English
- **Fine-tuning data:** A synthetic, custom dataset of document–summary pairs generated to cover a wide range of topics and writing styles.
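
Since the card lists `transformers` as the library and `BartForConditionalGeneration` as the architecture, the checkpoint can presumably also be loaded without Artifex. The following is a minimal sketch under stated assumptions, not the documented usage: the repository id `tanaos/tanaos-text-summarization-v1` is inferred from the model name and may differ, and the generation settings (`num_beams=4`, `max_new_tokens=128`) are illustrative values, not the ones Artifex uses.

```python
# ASSUMPTION: repository id inferred from the model name; verify before use.
MODEL_ID = "tanaos/tanaos-text-summarization-v1"

# ASSUMPTION: illustrative beam search settings, not the card's actual defaults.
GENERATION_KWARGS = {
    "num_beams": 4,          # beam search, as mentioned in the description
    "max_new_tokens": 128,   # upper bound on summary length
    "early_stopping": True,  # stop once all beams have finished
}


def summarize(text: str, model_id: str = MODEL_ID) -> str:
    """Summarize `text` with a BART checkpoint loaded through transformers."""
    # Imported lazily so the module can be imported without transformers installed.
    from transformers import AutoTokenizer, BartForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = BartForConditionalGeneration.from_pretrained(model_id)

    # facebook/bart-base accepts at most 1024 input tokens; longer inputs are truncated.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(summarize("Replace this string with the document to summarize."))
```

Loading the tokenizer and model inside the function keeps the sketch short; an application that summarizes many texts would load them once and reuse them.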
## Training Details

This model was trained using the [Artifex Python library](https://github.com/tanaos/artifex)

```bash
pip install artifex
```

by providing the following instructions and generating synthetic training samples:

```python
from artifex import Artifex

summarizer = Artifex().text_summarization()

summarizer.train(
    domain="general",
    num_samples=20000
)
```

## Intended Uses

This model is intended to:

- Condense long documents, articles, or reports into short, readable summaries.
- Be used in applications such as news aggregators, document review tools, and content digests.
- Serve as a general-purpose summarization model applicable across various industries and domains.

Not intended for:

- Highly technical or domain-specific texts where specialized terminology requires domain-adapted models.
- Very short inputs (a few sentences) where summarization adds little value.
- Tasks requiring factual grounding or citations.
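
The input-length limits listed above can be checked before the model is called. The following is a minimal sketch, assuming a crude punctuation-based sentence count and word-based chunking. `MIN_SENTENCES`, `MAX_WORDS`, `count_sentences`, `is_worth_summarizing`, and `chunk_words` are hypothetical helpers written for illustration, not part of Artifex. The 600-word budget is a conservative stand-in for the 1,024-token input limit of `facebook/bart-base`, not an exact token count.

```python
import re

# ASSUMPTION: illustrative thresholds, not values recommended by the model authors.
MIN_SENTENCES = 4   # below this, summarization adds little value
MAX_WORDS = 600     # rough word budget meant to stay under BART's 1,024-token limit


def count_sentences(text: str) -> int:
    """Count sentences by splitting on ., ! or ? followed by whitespace or end of text."""
    parts = re.split(r"[.!?]+(?:\s+|$)", text.strip())
    return sum(1 for part in parts if part.strip())


def is_worth_summarizing(text: str, min_sentences: int = MIN_SENTENCES) -> bool:
    """Return True when the input is long enough for a summary to be useful."""
    return count_sentences(text) >= min_sentences


def chunk_words(text: str, max_words: int = MAX_WORDS) -> list[str]:
    """Split a long document into consecutive chunks of at most `max_words` words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]
```

A caller could skip inputs for which `is_worth_summarizing` returns `False`, and summarize each chunk from `chunk_words` separately when a document exceeds the budget. Splitting on word boundaries can cut a sentence in half, so a production pipeline would more likely chunk on sentence or paragraph boundaries using the tokenizer's own token counts.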