{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "T8zE8SbZ0wUz"
      },
      "source": [
        "## Streaming model explorer for Haystack\n",
        "\n",
        "*notebook by Tilde Thurium:\n",
        " [Mastodon](https://tech.lgbt/@annthurium) || [Twitter](https://twitter.com/annthurium) || [LinkedIn](https://www.linkedin.com/in/annthurium/)*\n",
        "\n",
        "*Problem*: there are so many LLMs these days! Which model is the best for my use case?\n",
        "\n",
        "This notebook uses [Haystack](https://docs.haystack.deepset.ai/docs/intro) to compare the results of sending the same prompt to several different models.\n",
        "\n",
        "This is a basic demo that compares only a few models, all of which support streaming responses. I'd like to support more models in the future, so watch this space for updates.\n",
        "\n",
        "\n",
        "### Models\n",
        "\n",
        "All generators use Haystack's chat generator components, which support streaming out of the box:\n",
        "\n",
        "- [OpenAIChatGenerator](https://docs.haystack.deepset.ai/docs/openaichatgenerator) for OpenAI models.\n",
        "- [CohereChatGenerator](https://docs.haystack.deepset.ai/docs/coherechatgenerator) from the `cohere-haystack` integration.\n",
        "- [HuggingFaceAPIChatGenerator](https://docs.haystack.deepset.ai/docs/huggingfaceapichatgenerator) from the `huggingface-api-haystack` integration, used for three models hosted on the HuggingFace Serverless Inference API: `Qwen/Qwen3-32B`, `openai/gpt-oss-20b`, and `deepseek-ai/DeepSeek-V3-0324`.\n",
        "\n",
        "### Prerequisites\n",
        "\n",
        "- You need [HuggingFace](https://huggingface.co/docs/hub/security-tokens), [Cohere](https://docs.cohere.com/docs/connector-authentication), and [OpenAI](https://help.openai.com/en/articles/4936850-where-do-i-find-my-api-key) API keys. Save them as secrets in your Colab. Click on the key icon in the left menu or [see detailed instructions here](https://medium.com/@parthdasawant/how-to-use-secrets-in-google-colab-450c38e3ec75)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "AKjba4xoWzdX",
        "outputId": "6ca3534b-ffa5-4fd7-cb0e-f3a972bcaff4"
      },
      "outputs": [],
      "source": "!pip install -U haystack-ai cohere-haystack huggingface-api-haystack"
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "lZWcQd7xvoIp"
      },
      "source": [
        "The next cell reads the keys with `userdata.get`, which only works if they are saved as Colab secrets named `OPENAI_API_KEY`, `COHERE_API_KEY`, and `HF_API_KEY`, with notebook access enabled for each. Click on the key icon in the left menu or [see detailed instructions here](https://medium.com/@parthdasawant/how-to-use-secrets-in-google-colab-450c38e3ec75)."
      ]
    },
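    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Outside Colab, `google.colab` cannot be imported, so `userdata.get` is unavailable. A minimal sketch of a fallback, assuming the same key names are exported as environment variables; `get_api_key` is a hypothetical helper for illustration, not part of Colab or Haystack:\n",
        "\n",
        "```python\n",
        "import os\n",
        "\n",
        "\n",
        "def get_api_key(name):\n",
        "    # Hypothetical helper: prefer a Colab secret, fall back to an environment variable.\n",
        "    try:\n",
        "        from google.colab import userdata  # only importable inside Colab\n",
        "        return userdata.get(name)\n",
        "    except Exception:\n",
        "        # Not running in Colab, or the secret is missing or not shared with this notebook.\n",
        "        pass\n",
        "    value = os.environ.get(name)\n",
        "    if not value:\n",
        "        raise RuntimeError(f'Set {name} as a Colab secret or an environment variable')\n",
        "    return value\n",
        "```\n",
        "\n",
        "With a helper like this, `Secret.from_token(get_api_key('OPENAI_API_KEY'))` could stand in for the direct `userdata.get` calls."
      ]
    },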
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 337,
          "referenced_widgets": [
            "2e10135bc8e9494fac3aab173d4c2078",
            "a522c5d1bd8e49caae0800c9e18d39aa",
            "e481fa1615494e398f175627d7853958",
            "bf9b998e18ec4986a2ac78bd78cd4e7b",
            "7dc3369ca1e448c69981e462ddf018c6",
            "6359d627a68841ddbc363a7f356d2800",
            "8fbf421eab3b4a11b4c3e03dc30b98d5",
            "7102509d75ca4192a091a955f86710a2",
            "8247a42839ea43ba8db840f41eb14af6",
            "b3d85ead0ec6401aaa1b9771ac841687",
            "3abbf4bc3f704903a44b3679af12cc1c",
            "ff93186428c3412a9b88e3452ca4641f",
            "dc335416c2714c109636ce1104d04e78",
            "fdcc70feed0c4bedba4661088d7bc35a",
            "1bad7e451e3045838ed00decb755bcab",
            "db4d9c55f3fd464391766c42cfe9a46d",
            "815aa58d79a8420e90cc8c4931d09a69",
            "e6917b53a5484ebaade18b7c1273eca3",
            "962a8b0e360045b59cbcf30a3ae280b1",
            "7996b171bab842c2bb11db3d5d641956",
            "7ada1f6b6b8c480489333dd8f7d638d9",
            "8256d386740e482197076450fc29f7e5",
            "331026a0809140988ea4b1d680333c0b",
            "de412979dfa74945b1dfc3a9656c227c",
            "521114d351054104996f66e20e92a22a",
            "e5e8034f0afc48d89c48d4ace9f36d60",
            "491b2070c8e4404aab029061a5404428",
            "eb56f17825074741ab515e3532c49cfe",
            "0b900011a59146888ee27a389125d9d2",
            "80bbb90c18fc4ddcbd92ad603b032a5e",
            "3490d5bbde854aea8e566d4dfcde14ff",
            "a0c343d0ce294fa69e5a37d553d6e7b8",
            "5bc7d1547542406b95e0cae0a8d36958",
            "f11d1c4699d74aac9c36627037f7572c",
            "7cb978ba84ec41fe83e50abe8e52760c",
            "ce5ad1bb209e4c38837a2eed0a1d49ea",
            "978a31f14c664792928267235bedc7ab",
            "5f617626bcf24752b96b2a246c4b65d2",
            "44794f7082d24d1889029a9688f29854",
            "5078c4d07ea54c869c2f5b2d9ab8e27d",
            "0e35c7d9624e449bb1dfbb966c1a76d0",
            "03212b24eb3044b2be732359e7156bae",
            "99b4fdc589e941a0872fdf6b4ab9a497",
            "fc0570d0098d445f859ec694477b16bf",
            "10c123cfa15b44daa699aa2734440616",
            "b5d51cd8a50f4f4cb9068bc9188d599a",
            "d79f07c11bb8422eb1392ad29007f2d2",
            "48310d20da0e4e778b0d4d7dbda0c6a0",
            "4b0e811882de4bffa1da49b5ed498376",
            "e3b8e44a48f5453999f139075a8d3787",
            "112afd67581a44d795243c2260c8fb74",
            "4b99747fdf2746658bb86ce98ab25e1d",
            "85c7ef514fab4f09956b97fbc2deb391",
            "8e0e39185bea4af39aa930a3752df9aa",
            "cb15eb5ba16046f7b1bcbacd559de513",
            "09593eee10394f81ac031b77dbc72748",
            "bdcee6c5ba504c04881c34eb95d168a2",
            "b62310f725b74829ad02e5ec553ca2f2",
            "ddee9f4520cf44cf9931c95198a8ef0f",
            "4b0e2d3b24564b7390a4c9e9faba2727",
            "0d16352a156147578ec91cf5d0b7a2e4",
            "13063b52a1204abd8c91b2fa9d03b2fb",
            "989f73109ac4465faa126f0971d95600",
            "5a48a28a1f604dce9035322d92893ded",
            "8270ee28471949efbbda6723aa7aaf34",
            "e2c0c47a44cd485fb70f9d5daf79ac2f",
            "88cef145e1a940ce826518c040fac414",
            "28585ecc3ec741109a4644c2c0388c6e",
            "bc550e75b67e460dae22b24a4b63c334",
            "0b8328c37dc54284a97d6884641384f9",
            "845847e8aa4749e3950d89b65f12ca47",
            "f75cc063718640fa8bfff81e0284025f",
            "7122fe13a5124759b59842c9d6436a0e",
            "130c8a2aa0234203a1554336e1e8657b",
            "eae143ac81384ed88b02ec603ef8d97e",
            "378357e232014fbda8ad35eb6773f82e",
            "f7c4efbc37a842128a451adeebf76fcd",
            "8ca51f53d0b247b4ba9a27d617866ac0",
            "b460c8004db8474d954b799eb4508b46",
            "5f6d260e36b54e5eb53878f991da3435",
            "b7d8c72b754642d7a9759e9fc0804386",
            "edd6352483fb4c79b602dc02a27de77f",
            "36de93a7183b4ffe83d065a2265bdcb1",
            "6c3b4fd6eef345d6a81019e599bc6a82",
            "a6c6d445609b43b6a904c80accfdacbe",
            "77522a182e9c4408a68b011d2b69a311",
            "fcc10fc0f5234b8792c904b8d7896317",
            "a636c93c67d6483bacdc64a2749476a7",
            "0d1a5b74ba8444bface089e86f4eec7b",
            "34b1dd13bd1e478d999cd89ebc397316",
            "c11c346d5a5c44fc8383acfffff893e5",
            "a662e816583c46fa8441616ac2ce0eb8",
            "8f7a87db386547bfafc60c82966f93d7",
            "41c95bd3766140b9af6949b596999413",
            "93584f6a311f4795bf84f6ff2398b78b",
            "7f5ca9aee0354ff3806c4cc8b94d8f4a",
            "74e95fa99c40412a84ff636d54a3433f",
            "f59931950d3c44fbbdcc09a14e70c9f6",
            "f705472b7a0c4e249f89608d7aaaa100",
            "f3d638dee8bc45f4a1b143fe92a76e04",
            "5ddfd095212b4b6983c5090d74ced482",
            "6a63a02666d14be7b3f931113c5af21a",
            "7d7c58f40a57445eb25097259ec7ae63",
            "e53992b679904d86a0d771fc1cdd81f0",
            "b1f2e890c1314e1ea3633e634fb56a7f",
            "fa175478d3a64aa09f103f7e0b5853c8",
            "1cb0a448fe4d4b5b89d33ef3fa425d8e",
            "3f6b1af8626849b99a83a6bab131de04",
            "1b4b2634dc42426a8a066e01be061e05",
            "d5513a59e3d748358d06910331008cc4"
          ]
        },
        "id": "XWsAANJxXLaH",
        "outputId": "508d0343-417c-4b18-b864-70648467f005"
      },
      "outputs": [],
      "source": [
        "from haystack.components.generators.chat import OpenAIChatGenerator\n",
        "from haystack_integrations.components.generators.cohere import CohereChatGenerator\n",
        "from haystack_integrations.components.generators.huggingface_api import HuggingFaceAPIChatGenerator\n",
        "from haystack.dataclasses import ChatMessage\n",
        "from haystack.utils import Secret\n",
        "from google.colab import userdata\n",
        "\n",
        "open_ai_generator = OpenAIChatGenerator(api_key=Secret.from_token(userdata.get('OPENAI_API_KEY')))\n",
        "\n",
        "cohere_generator = CohereChatGenerator(api_key=Secret.from_token(userdata.get('COHERE_API_KEY')))\n",
        "\n",
        "hf_generator = HuggingFaceAPIChatGenerator(\n",
        "    api_type=\"serverless_inference_api\",\n",
        "    api_params={\"model\": \"Qwen/Qwen3-32B\"},\n",
        "    token=Secret.from_token(userdata.get('HF_API_KEY')))\n",
        "\n",
        "\n",
        "hf_generator_2 = HuggingFaceAPIChatGenerator(\n",
        "    api_type=\"serverless_inference_api\",\n",
        "    api_params={\"model\": \"openai/gpt-oss-20b\"},\n",
        "    token=Secret.from_token(userdata.get('HF_API_KEY')))\n",
        "\n",
        "\n",
        "hf_generator_3 = HuggingFaceAPIChatGenerator(\n",
        "    api_type=\"serverless_inference_api\",\n",
        "    api_params={\"model\": \"deepseek-ai/DeepSeek-V3-0324\"},\n",
        "    token=Secret.from_token(userdata.get('HF_API_KEY')))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "9_xznJXWe1M8"
      },
      "outputs": [],
      "source": [
        "MODELS = [open_ai_generator, cohere_generator, hf_generator, hf_generator_2, hf_generator_3]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "zZsKIVCQ5mJ1"
      },
      "source": [
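        "The `AppendToken` dataclass is the streaming callback passed to each generator. It buffers the streamed chunks and appends them to that model's output widget five at a time. `multiprompt` prints each model's name above its response."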
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "g1wwjOJVejJF"
      },
      "outputs": [],
      "source": [
        "from dataclasses import dataclass, field\n",
        "import ipywidgets as widgets\n",
        "from IPython.display import display\n",
        "from haystack.dataclasses import ChatMessage\n",
        "\n",
        "@dataclass\n",
        "class AppendToken:\n",
        "  output: widgets.Output\n",
        "  # default_factory gives each instance its own buffer; a bare `chunks = []` would be shared by every model\n",
        "  chunks: list = field(default_factory=list)\n",
        "  chunk_size: int = 5\n",
        "\n",
        "  def __call__(self, chunk):\n",
        "    # Buffer the streamed text and display it once chunk_size pieces have arrived\n",
        "    self.chunks.append(getattr(chunk, 'content', '') or '')\n",
        "    if len(self.chunks) >= self.chunk_size:\n",
        "      self.flush()\n",
        "\n",
        "  def flush(self):\n",
        "    # Streamed pieces normally carry their own whitespace, so join them without a separator\n",
        "    if self.chunks:\n",
        "      self.output.append_display_data(''.join(self.chunks))\n",
        "      self.chunks.clear()\n",
        "\n",
        "def multiprompt(prompt, models=MODELS):\n",
        "  outputs = [widgets.Output(layout={'border': '1px solid black'}) for _ in models]\n",
        "  display(widgets.HBox(children=outputs))\n",
        "\n",
        "  for model, out in zip(models, outputs):\n",
        "    model_name = getattr(model, 'model', '')\n",
        "    out.append_display_data(f'Model name: {model_name}')\n",
        "    callback = AppendToken(out)\n",
        "    model.streaming_callback = callback\n",
        "    model.run(messages=[ChatMessage.from_user(prompt)])\n",
        "    callback.flush()  # show any text left over after the last full batch\n"
      ]
    },
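    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The buffering idea behind `AppendToken` can be tried without widgets or API keys. The following is a minimal sketch, assuming each streamed chunk exposes its text as `content`; `FakeChunk`, `BufferedCallback`, and `emit` are hypothetical names for illustration and are not part of Haystack:\n",
        "\n",
        "```python\n",
        "from dataclasses import dataclass, field\n",
        "from typing import Callable, List\n",
        "\n",
        "\n",
        "@dataclass\n",
        "class FakeChunk:\n",
        "    # Hypothetical stand-in for a streamed chunk; only `content` is assumed.\n",
        "    content: str\n",
        "\n",
        "\n",
        "@dataclass\n",
        "class BufferedCallback:\n",
        "    # Hypothetical, widget-free analogue of AppendToken: `emit` receives each joined batch.\n",
        "    emit: Callable[[str], None]\n",
        "    chunk_size: int = 5\n",
        "    # default_factory gives every instance its own buffer.\n",
        "    chunks: List[str] = field(default_factory=list)\n",
        "\n",
        "    def __call__(self, chunk) -> None:\n",
        "        self.chunks.append(getattr(chunk, 'content', '') or '')\n",
        "        if len(self.chunks) >= self.chunk_size:\n",
        "            self.flush()\n",
        "\n",
        "    def flush(self) -> None:\n",
        "        # Emit whatever is buffered, including a final partial batch.\n",
        "        if self.chunks:\n",
        "            self.emit(''.join(self.chunks))\n",
        "            self.chunks.clear()\n",
        "\n",
        "\n",
        "lines: List[str] = []\n",
        "callback = BufferedCallback(emit=lines.append, chunk_size=3)\n",
        "for piece in ['Neon', ' rain', ' fell', ' on', ' the', ' cat', '.']:\n",
        "    callback(FakeChunk(piece))\n",
        "callback.flush()  # without this, the trailing '.' would never be shown\n",
        "print(lines)  # → ['Neon rain fell', ' on the cat', '.']\n",
        "```\n",
        "\n",
        "Taking the sink as a plain callable keeps the buffering logic separate from the display, so the same callback can feed a widget, a list, or `print`."
      ]
    },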
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 1000,
          "referenced_widgets": [
            "35ebe0b3e2a1419cb5452b7b69662927",
            "178875466d004edb9134a1c96cbe4e79",
            "2dfdda7cb1f749d289695a2fa28c859b",
            "80564696d1a24476bf2948fe329d838e",
            "3e0d66ba5ae64c66a07d4a7bcb661449",
            "f0eefc3ebf9d42fdaf1becb2fc38e420",
            "1d75c21aae47449f879936ee9b1372ff",
            "ebbb5a89069f4f4a9e96df8e77a80a72",
            "4101c6be4549498cae703928d38225bf",
            "ea5cf56989064a139fe9061be34a5aa0",
            "07b7c04ba6bc48baa2c52364d4f7f837",
            "af4d0052c0fe4ff194a7fac28386d1c5"
          ]
        },
        "id": "O0CSf3f2gLfe",
        "outputId": "50fdb860-5512-4520-e070-00682cb3432f"
      },
      "outputs": [],
      "source": [
        "multiprompt(\"Tell me a cyberpunk story about a black cat.\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "t2IUAyLyOfeO"
      },
      "source": [
        "This was a very silly example prompt. If you found this demo useful, let me know the kinds of prompts you tested it with!\n",
        "\n",
        " [Mastodon](https://tech.lgbt/@annthurium) || [Twitter](https://twitter.com/annthurium) || [LinkedIn](https://www.linkedin.com/in/annthurium/)\n",
        "\n",
        "Thanks for following along."
      ]
    }
  ],
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "gpuType": "T4",
      "provenance": []
    },
    "kernelspec": {
      "display_name": ".venv",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.14.1"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
