Maintained by deepset

Integration: Hugging Face Transformers

Run Transformers models locally in your Haystack pipelines

Authors
deepset

Overview

Transformers is Hugging Face's library for state-of-the-art machine learning models. This integration lets you run models from the Hugging Face Hub on your own machine as part of your Haystack pipelines.

Haystack supports Hugging Face models in other ways too:

  • Sentence Transformers for local embedding and ranking models
  • Hugging Face API to call models via Inference Providers, Inference Endpoints, or self-hosted TGI/TEI
  • Optimum for high-performance inference with ONNX Runtime

Installation

pip install transformers-haystack

Usage

Components

This integration provides several components that run Transformers models locally:

Chat Generation

Use TransformersChatGenerator to run a chat model locally:

from haystack_integrations.components.generators.transformers import TransformersChatGenerator
from haystack.dataclasses import ChatMessage

generator = TransformersChatGenerator(model="Qwen/Qwen3-0.6B")

messages = [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
print(generator.run(messages))

Extractive Question Answering

Use TransformersExtractiveReader to extract answer spans from the documents returned by a retriever:

from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack_integrations.components.readers.transformers import TransformersExtractiveReader

docs = [Document(content="Paris is the capital of France."),
        Document(content="Berlin is the capital of Germany."),
        Document(content="Rome is the capital of Italy."),
        Document(content="Madrid is the capital of Spain.")]
document_store = InMemoryDocumentStore()
document_store.write_documents(docs)

retriever = InMemoryBM25Retriever(document_store=document_store)
reader = TransformersExtractiveReader(model="deepset/roberta-base-squad2-distilled")

extractive_qa_pipeline = Pipeline()
extractive_qa_pipeline.add_component(instance=retriever, name="retriever")
extractive_qa_pipeline.add_component(instance=reader, name="reader")
extractive_qa_pipeline.connect("retriever.documents", "reader.documents")

query = "What is the capital of France?"
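# Keep the pipeline output so the reader's results can be inspected.
result = extractive_qa_pipeline.run(data={"retriever": {"query": query, "top_k": 3},
                                          "reader": {"query": query, "top_k": 2}})
print(result["reader"])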
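
For intuition about what "extractive" means here, the sketch below shows the usual span-selection idea behind extractive QA models: the model scores every token as a possible answer start and answer end, and the answer is the highest-scoring valid span. This is a simplified illustration, not the integration's actual implementation; the function `best_span`, the `max_len` parameter, and all scores are made up for the example.

```python
from typing import List, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_len: int = 8) -> Tuple[int, int, float]:
    """Return (start, end, score) maximizing start + end score with start <= end."""
    best = (0, 0, float("-inf"))
    for s, s_score in enumerate(start_scores):
        # Only consider spans that end at or after the start and stay short.
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best[2]:
                best = (s, e, score)
    return best


# Hypothetical per-token scores for the question "What is the capital of France?"
tokens = ["Paris", "is", "the", "capital", "of", "France", "."]
start_scores = [4.0, 0.1, 0.2, 0.5, 0.0, 1.5, -1.0]
end_scores = [3.5, 0.0, 0.1, 0.3, 0.0, 2.0, -1.0]

start, end, score = best_span(start_scores, end_scores)
answer = " ".join(tokens[start:end + 1])
print(answer, score)
```

Real readers work on subword tokens and normalize scores into probabilities, so treat this only as a mental model for why the reader returns short spans copied from the documents.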

Zero-Shot Document Classification

Use TransformersZeroShotDocumentClassifier to classify documents with labels of your choice, without fine-tuning:

from haystack import Document
from haystack_integrations.components.classifiers.transformers import TransformersZeroShotDocumentClassifier

documents = [Document(content="Today was a nice day!"),
             Document(content="Yesterday was a bad day!")]

classifier = TransformersZeroShotDocumentClassifier(
    model="cross-encoder/nli-deberta-v3-xsmall",
    labels=["positive", "negative"],
)

result = classifier.run(documents=documents)
print([doc.meta["classification"]["label"] for doc in result["documents"]])
# ['positive', 'negative']
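
The model in this example is a natural language inference (NLI) model. A common way to do zero-shot classification with such a model is to turn each label into a hypothesis sentence, score how strongly the document entails it, and normalize the scores across labels. The sketch below illustrates only that last step with made-up numbers; the `classify` helper, the hypothesis wording, and the logits are assumptions for illustration, not the component's internals.

```python
import math
from typing import Dict, List


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def classify(entailment_logits: Dict[str, float]) -> Dict[str, object]:
    """Pick the label whose hypothesis is most strongly entailed."""
    labels = list(entailment_logits)
    scores = softmax([entailment_logits[label] for label in labels])
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return {"label": ranked[0][0], "scores": dict(ranked)}


# One hypothesis per candidate label (assumed template).
hypotheses = [f"This example is {label}." for label in ["positive", "negative"]]

# Made-up entailment logits for "Today was a nice day!" against each hypothesis.
made_up_logits = {"positive": 2.3, "negative": -1.1}
result = classify(made_up_logits)
print(result["label"])
```

Because the labels are only used to build hypotheses, you can change them at any time without retraining the model.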

Named Entity Recognition

Use TransformersNamedEntityExtractor to annotate named entities in documents:

from haystack import Document
from haystack_integrations.components.extractors.transformers import TransformersNamedEntityExtractor

documents = [
    Document(content="I'm Merlin, the happy pig!"),
    Document(content="My name is Clara and I live in Berkeley, California."),
]
extractor = TransformersNamedEntityExtractor(model="dslim/bert-base-NER")

results = extractor.run(documents=documents)["documents"]
annotations = [TransformersNamedEntityExtractor.get_stored_annotations(doc) for doc in results]
print(annotations)
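
Token-classification models for NER typically emit one tag per token in a BIO scheme (`B-` begins an entity, `I-` continues it, `O` is outside any entity), and those tags are then merged into entity spans. The sketch below shows that merging step in plain Python. The `group_entities` helper and the tags are hand-written for illustration; they are not output from the model above, and the format of the stored annotations may differ.

```python
from typing import List, Optional, Tuple


def group_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge BIO-tagged tokens into (entity text, entity type) pairs."""
    entities: List[Tuple[str, str]] = []
    words: List[str] = []
    current: Optional[str] = None

    def flush() -> None:
        # Store the entity collected so far, if any.
        if words and current is not None:
            entities.append((" ".join(words), current))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-") or (tag.startswith("I-") and tag[2:] != current):
            # A new entity starts (an I- tag of a different type also starts one).
            flush()
            words, current = [token], tag[2:]
        elif tag.startswith("I-"):
            words.append(token)
        else:
            # "O": close any open entity.
            flush()
            words, current = [], None
    flush()
    return entities


tokens = ["My", "name", "is", "Clara", "and", "I", "live", "in",
          "Berkeley", ",", "California", "."]
# Hand-written tags, assumed for this example.
tags = ["O", "O", "O", "B-PER", "O", "O", "O", "O",
        "B-LOC", "O", "B-LOC", "O"]

entities = group_entities(tokens, tags)
print(entities)
```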