Haystack review: A flexible LLM app builder

Monday, September 9, 2024, 10:30 AM, from InfoWorld

Haystack is an open-source framework for building applications based on large language models (LLMs), including retrieval-augmented generation (RAG) applications, intelligent search systems for large document collections, and much more. Haystack is currently implemented exclusively in Python.

Haystack is also the foundation for deepset Cloud. deepset is the primary sponsor of Haystack, and several deepset employees are heavy contributors to the Haystack project.

Integrations with Haystack include models hosted on platforms such as Hugging Face, OpenAI, and Cohere; models deployed on platforms such as Amazon SageMaker, Microsoft Azure AI, and Google Vertex AI; and vector and document stores such as Elasticsearch, OpenSearch, Pinecone, and Qdrant. In addition, the Haystack community has contributed integrations for tools that perform model evaluation, monitoring, and data ingestion.

Use cases for Haystack include RAG, chatbots, agents, generative multi-modal question answering, and information extraction from documents. Haystack provides functionality for the full scope of LLM projects, such as data source integration, data cleaning and preprocessing, models, logging, and instrumentation.

Haystack components and pipelines help you to assemble applications easily. While Haystack has many pre-built components, adding a custom component is as simple as writing a Python class. Pipelines connect components into graphs or multi-graphs (the graphs don’t need to be acyclic), and Haystack offers many example pipelines for common use cases. deepset Studio, a new product that allows AI developers to design and visualize custom AI pipelines, was announced on August 12, 2024.

Haystack is one of the four most popular open-source frameworks for building LLM applications, along with LlamaIndex, LangChain, and Semantic Kernel.

What is Haystack?

As an open-source framework for building LLM applications, Haystack tries to do the important things correctly, rather than doing all the things. Haystack doesn’t have as many first-party integrations as, say, LangChain, but it owns and fully supports the 34 integrations it currently has. Haystack also offers 28 community-contributed integrations to tie the framework to less-popular data and document stores, models, tools, and APIs. I mention integrations before discussing the core framework architecture (see “Haystack concepts” below) because the integrations actually require more development effort than core orchestration capabilities.

In addition to valuing doing things right over doing all the things, Haystack tries to be explicit rather than implicit. That may mean writing more code the first time you create a pipeline, but in return for that extra initial effort you’ll find it much easier to debug, update, and maintain your pipeline. To counter the grind of writing a lot of explicit code, you can create your pipeline graphs visually with deepset Studio, discussed below.

The four major design goals of Haystack are to be technology agnostic, to be explicit (as we just discussed), to be flexible, and to be extensible. The Haystack repo README describes these as follows:

Technology agnostic: Allow users the flexibility to decide what vendor or technology they want and make it easy to switch out any component for another. Haystack allows you to use and compare models available from OpenAI, Cohere, and Hugging Face, as well as your own local models or models hosted on Azure, Bedrock, and SageMaker.

Explicit: Make it transparent how different moving parts can “talk” to each other so it’s easier to fit your tech stack and use case.

Flexible: Haystack provides all tooling in one place: database access, file conversion, cleaning, splitting, training, eval, inference, and more. And whenever custom behavior is desirable, it’s easy to create custom components.

Extensible: Provide a uniform and easy way for the community and third parties to build their own components and foster an open ecosystem around Haystack.

In addition to building retrieval-augmented generation applications using a vector database, Haystack can be used to expand RAG apps to create agents (what Microsoft calls copilots), as well as for question answering, information extraction, semantic search, and building applications that resolve complex queries. Haystack applications can use both off-the-shelf models and custom fine-tuned models. If you use multi-modal models, Haystack can also be used to perform image generation, image captioning, and audio transcription.

Haystack concepts

Essentially, Haystack gives you a way to build custom RAG pipelines with LLMs and vector stores. It is organized into components, document stores, data classes, and pipelines. Note that Haystack is agnostic about vector search and vector embedding: it doesn’t tie you to a particular vector database or embedding model.

Haystack components

Pipeline components in Haystack are Python classes with methods that you can call directly. They implement a wide variety of functionality, from audio transcription to document writers. If there is functionality you need to add for your application, you can write a new custom component class using the Haystack Component API. If there is a third-party API or database that does what you need, the Component API makes it easy to connect it to your pipelines, and Haystack will validate the connections between components before running the pipeline.
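To make the idea of validating connections before a run concrete, here is a minimal sketch in plain Python. It is not Haystack's Component API: the `outputs` attribute, the `check_connection` helper, and both toy components are hypothetical names invented for this illustration. The sketch only shows the general technique of comparing a sender's declared output type with a receiver's declared input type before any data flows.

```python
from typing import get_type_hints


class Upper:
    """Toy component: turns text into upper case."""

    outputs = {"text": str}  # declared output names and types (hypothetical convention)

    def run(self, text: str) -> dict:
        return {"text": text.upper()}


class WordCount:
    """Toy component: counts the words in a piece of text."""

    outputs = {"count": int}

    def run(self, text: str) -> dict:
        return {"count": len(text.split())}


def check_connection(sender, output_name, receiver, input_name):
    """Raise TypeError unless the sender's output type matches the receiver's input type."""
    out_type = sender.outputs[output_name]
    in_type = get_type_hints(type(receiver).run)[input_name]
    if out_type is not in_type:
        raise TypeError(
            f"{type(sender).__name__}.{output_name} is {out_type.__name__}, "
            f"but {type(receiver).__name__}.{input_name} expects {in_type.__name__}"
        )


upper, counter = Upper(), WordCount()
check_connection(upper, "text", counter, "text")  # accepted: str feeds str

try:
    check_connection(counter, "count", upper, "text")  # rejected: int cannot feed str
except TypeError as err:
    print(err)
```

Catching a mismatch like this at wiring time, rather than halfway through a run, is the practical payoff of declaring inputs and outputs explicitly.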

Generators are the components responsible for generating text responses after you give them a prompt. Generators are specific to the LLM they call, and there are two kinds of them: chat and non-chat. Chat generators are designed for conversations, and expect a list of messages for context. Non-chat generators are designed for simple text generation, for example summarization and translation.
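The difference between the two kinds of generators is mostly a difference in input shape. The following sketch, in plain Python, uses a stand-in `fake_llm` function instead of a real model so that it runs offline. `ToyMessage`, `ToyTextGenerator`, and `ToyChatGenerator` are hypothetical names, not Haystack classes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyMessage:
    role: str      # "system", "user", or "assistant"
    content: str


def fake_llm(text: str) -> str:
    """Stand-in for a real model call so the sketch runs without a network."""
    return f"<reply to {len(text.split())} words>"


class ToyTextGenerator:
    """Non-chat style: one prompt string in, a list of reply strings out."""

    def run(self, prompt: str) -> dict:
        return {"replies": [fake_llm(prompt)]}


class ToyChatGenerator:
    """Chat style: a list of messages in, a list of assistant messages out."""

    def run(self, messages: List[ToyMessage]) -> dict:
        transcript = "\n".join(f"{m.role}: {m.content}" for m in messages)
        return {"replies": [ToyMessage("assistant", fake_llm(transcript))]}


summary = ToyTextGenerator().run("Summarize: Haystack is an LLM framework.")

history = [
    ToyMessage("system", "You are a helpful assistant."),
    ToyMessage("user", "What is Haystack?"),
]
answer = ToyChatGenerator().run(history)
history.append(answer["replies"][0])  # keep the reply as context for the next turn
```

Appending each reply to the message list is what gives a chat generator its conversational context on the next call.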

Retrievers in Haystack extract documents from a document store that may be relevant to a user query, and pass that context to the next component in a pipeline, which in the simplest case is a generator. Retrievers are specific to the document store they use.
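The hand-off from retriever to generator can be sketched in a few lines of plain Python. This is a toy under stated assumptions: `ToyKeywordRetriever` and `build_prompt` are hypothetical names, the scoring is simple word overlap rather than anything Haystack uses, and the three sample sentences are made up for the illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


class ToyKeywordRetriever:
    """Toy retriever: ranks documents by how many query words they share."""

    def __init__(self, documents, top_k=2):
        self.documents = documents
        self.top_k = top_k

    def run(self, query):
        query_words = tokenize(query)
        scored = [(len(query_words & tokenize(doc)), doc) for doc in self.documents]
        scored.sort(key=lambda pair: pair[0], reverse=True)  # best match first
        hits = [doc for score, doc in scored if score > 0]
        return {"documents": hits[: self.top_k]}


def build_prompt(query, documents):
    """Place the retrieved documents into the prompt as context for a generator."""
    context = "\n".join(f"- {doc}" for doc in documents)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


docs = [
    "My name is Jean and I live in Paris.",
    "My name is Mark and I live in Berlin.",
    "My name is Giorgio and I live in Rome.",
]
retriever = ToyKeywordRetriever(docs, top_k=1)
question = "Who lives in Paris?"
retrieved = retriever.run(question)["documents"]
prompt = build_prompt(question, retrieved)
print(prompt)
```

The generator never sees the whole collection, only the few documents the retriever judged relevant, which keeps prompts short and grounded.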

Haystack document stores

Document stores are Haystack objects that store your documents for later retrieval. They are specific to vector databases, except for the InMemoryDocumentStore, which implements document storage, vector embedding, and retrieval all by itself. The InMemoryDocumentStore component is meant for development and testing; it’s ephemeral and doesn’t scale to production.

Document store components support methods such as write_documents() and bm25_retrieval(). You can also use a DocumentWriter component to add a list of documents into a document store of your choice. DocumentWriter components are typically used in an indexing pipeline.
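To show what a write-then-retrieve cycle involves, here is a toy in-memory store in plain Python. It borrows the method names `write_documents()` and `bm25_retrieval()` from the text above, but everything else is an assumption: `ToyDocumentStore` is a hypothetical class, documents are plain strings, and the scoring is textbook Okapi BM25 with `k1=1.5` and `b=0.75`, which is not necessarily the variant or the defaults that Haystack uses.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


class ToyDocumentStore:
    """In-memory store with a textbook Okapi BM25 keyword search."""

    def __init__(self, k1=1.5, b=0.75):
        self.docs = []    # original texts
        self.tokens = []  # token list per document
        self.k1, self.b = k1, b

    def write_documents(self, documents):
        """Add documents and return how many were written."""
        for text in documents:
            self.docs.append(text)
            self.tokens.append(tokenize(text))
        return len(documents)

    def bm25_retrieval(self, query, top_k=3):
        """Return up to top_k (score, text) pairs, best first, skipping zero scores."""
        n_docs = len(self.docs)
        if n_docs == 0:
            return []
        avg_len = sum(len(toks) for toks in self.tokens) / n_docs
        # Document frequency: in how many documents each term appears.
        doc_freq = Counter(term for toks in self.tokens for term in set(toks))
        query_terms = set(tokenize(query))
        results = []
        for text, toks in zip(self.docs, self.tokens):
            term_freq = Counter(toks)
            score = 0.0
            for term in query_terms:
                tf = term_freq.get(term, 0)
                if tf == 0:
                    continue
                df = doc_freq[term]
                idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
                length_norm = 1 - self.b + self.b * len(toks) / avg_len
                score += idf * tf * (self.k1 + 1) / (tf + self.k1 * length_norm)
            if score > 0:
                results.append((score, text))
        results.sort(key=lambda pair: pair[0], reverse=True)
        return results[:top_k]


store = ToyDocumentStore()
written = store.write_documents([
    "Haystack pipelines connect components into graphs.",
    "Document stores hold documents for later retrieval.",
    "Retrievers fetch documents from a document store.",
])
hits = store.bm25_retrieval("document store", top_k=2)
```

In this example the third sentence ranks first because it contains both query terms, and the rarer term ("store") contributes more to the score than the common one ("document").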

Haystack data classes

Data classes help Haystack components to communicate with each other, allowing data to flow through pipelines. Haystack data classes include ByteStream; Answer and its subclasses ExtractedAnswer, ExtractedTableAnswer, and GeneratedAnswer; ChatMessage; Document; and StreamingChunk. The Document class can contain text, dataframes, blobs, metadata, a score, and an embedding vector. The StreamingChunk represents a partially streamed LLM response.
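As a rough picture of what such data classes carry, here is a sketch using Python's standard `dataclasses` module. `ToyDocument` and `ToyStreamingChunk` are hypothetical stand-ins with a subset of the fields mentioned above; they are not Haystack's definitions, and the `collect` helper is invented for the illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ToyDocument:
    """Toy stand-in holding text plus the metadata a pipeline passes along."""
    content: Optional[str] = None
    meta: Dict[str, Any] = field(default_factory=dict)
    score: Optional[float] = None            # filled in by a retriever or ranker
    embedding: Optional[List[float]] = None  # filled in by an embedder


@dataclass
class ToyStreamingChunk:
    """One piece of a reply that arrives while the model is still generating."""
    content: str


def collect(chunks):
    """Join streamed chunks, in arrival order, into the full reply text."""
    return "".join(chunk.content for chunk in chunks)


doc = ToyDocument(content="Jean lives in Paris.", meta={"source": "notes.txt"}, score=0.87)
reply = collect([
    ToyStreamingChunk("Jean "),
    ToyStreamingChunk("lives in "),
    ToyStreamingChunk("Paris."),
])
```

Because every component reads and writes the same small set of types, a retriever from one integration can feed a generator from another without glue code.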

Haystack pipelines

Pipelines combine components, document stores, and integrations into custom systems. They can be as simple as a basic RAG pipeline that queries a vector database for relevant data, uses that data to prompt an LLM, and returns the LLM’s response, or they can be arbitrarily complex graphs or multi-graphs that may include simultaneous flows, standalone components, and loops.
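A pipeline with a loop is easier to picture with a small example. The sketch below is plain Python, not Haystack's Pipeline class: `ToyPipeline`, its `add_component` signature, and the three step functions are hypothetical. It shows one common way to run a cyclic graph safely, namely letting each node choose its successor and capping the number of steps so a loop cannot run forever.

```python
class ToyPipeline:
    """Toy graph runner: each component names the next one, so cycles are allowed."""

    def __init__(self, max_steps=20):
        self.components = {}  # name -> function taking and returning a dict
        self.routes = {}      # name -> function picking the next name, or None to stop
        self.max_steps = max_steps

    def add_component(self, name, func, next_step):
        self.components[name] = func
        self.routes[name] = next_step

    def run(self, start, data):
        name, steps = start, 0
        while name is not None:
            if steps >= self.max_steps:
                raise RuntimeError("loop did not terminate within max_steps")
            data = self.components[name](data)
            name = self.routes[name](data)
            steps += 1
        return data


def draft(data):
    return {**data, "text": "Haystack is a very flexible open-source LLM framework"}


def check(data):
    return {**data, "ok": len(data["text"].split()) <= data["max_words"]}


def shorten(data):
    words = data["text"].split()
    return {**data, "text": " ".join(words[:-1]), "rounds": data.get("rounds", 0) + 1}


pipe = ToyPipeline()
pipe.add_component("draft", draft, lambda d: "check")
pipe.add_component("check", check, lambda d: None if d["ok"] else "shorten")
pipe.add_component("shorten", shorten, lambda d: "check")  # edge back to "check" closes the loop
result = pipe.run("draft", {"max_words": 5})
print(result["text"], result["rounds"])
```

The same shape (generate, validate, retry) underlies self-correcting pipelines, where a validator sends output back to a generator until it passes.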

What is deepset Cloud?

deepset Cloud is a SaaS platform for building LLM applications and managing them across the whole life cycle, from prototyping to production. In a nutshell, it’s Haystack in the cloud with a nice GUI for development and test, and a REST interface for production.

With deepset Cloud you can preprocess your data and prepare it for search, design and evaluate pipelines, and run experiments to collect metrics about your pipeline performance and iterate. You can also share your pipelines with others for demonstration and testing. deepset Cloud includes Prompt Studio for prompt engineering, automatic scaling of deployed pipelines, and deepset Studio for visual pipeline design. Pipelines also can be created from a template, specified in YAML, or coded in Python using the API.

deepset Cloud Home. Note the functionality menu at the left, and the list of latest searches at the bottom.
IDG

deepset Studio

deepset Studio is a new visual pipeline designer, currently in controlled beta and available for free. I tried it within deepset Cloud, but it’s also available as a standalone product.

deepset Studio lets you use a drag-and-drop user interface to access Haystack’s library of components and integrations. You can see all the components and their properties, combine them into pipelines, visualize their architectures, switch between 1:1 code and visual views, and export the final setup as a YAML file for use in different environments.

deepset Studio running in deepset Cloud. This pipeline is a fairly simple RAG implementation that handles half a dozen file formats for input documents, and (offscreen to the right) allows chat with gpt4-turbo.
IDG

Using pipeline templates is a fast way to get started with Haystack on deepset Cloud. Once you’ve created a pipeline from a template, you can edit it for your own application.
IDG

Getting started with Haystack

As described in the “get started” documentation, you can install Haystack with:

pip install haystack-ai

Then you should set a Haystack Secret or an OPENAI_API_KEY environment variable to the value of your OpenAI key. If you don’t have an OpenAI key yet, you can obtain one from the OpenAI platform. That may require you to sign up with OpenAI and supply a credit card. Don’t confuse signing up for ChatGPT access with signing up for OpenAI API access, because the two are separate. Using the API is relatively inexpensive.

Copy the Python code for the very simple starter RAG application from the Haystack documentation and paste it into an editor. I used Visual Studio Code.

As provided, the Python code at line 31 expects a Haystack Secret. If you chose to set an environment variable instead, the simplest way to modify line 31 is to remove the body of the function call, to read:

llm = OpenAIGenerator()

With no arguments, OpenAIGenerator falls back to its default, which reads the key from the OPENAI_API_KEY environment variable. If you want to be more specific, you can use the api_key=Secret.from_env_var("OPENAI_API_KEY") form inside the OpenAIGenerator() call.
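The general pattern here, holding a reference to an environment variable and resolving it only when the key is needed, can be sketched in plain Python. `ToyEnvSecret` is a hypothetical class written for this illustration, not Haystack's Secret, and the variable names and the `sk-demo` value are made up.

```python
import os


class ToyEnvSecret:
    """Toy secret that remembers which environment variables to read, not the key itself."""

    def __init__(self, *names):
        self.names = names

    def resolve(self):
        """Return the value of the first variable that is set, or raise if none is."""
        for name in self.names:
            value = os.environ.get(name)
            if value:
                return value
        raise ValueError(
            "None of these environment variables are set: " + ", ".join(self.names)
        )


# Demo value only; a real key should come from your shell or a secrets manager.
os.environ["TOY_DEMO_API_KEY"] = "sk-demo"

secret = ToyEnvSecret("TOY_UNSET_API_KEY", "TOY_DEMO_API_KEY")
api_key = secret.resolve()  # skips the unset variable and uses the second one
```

Because the object stores only variable names, it can be logged or saved alongside a pipeline definition without exposing the key.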

Run the program, and you should see output in the terminal:

['Jean lives in Paris.']

There is a pop-up explanation (or recipe) for the code that you can access from the documentation by clicking on the box below the code labeled “2.0 RAG Pipeline.” You can also learn a lot by debugging the code and tracing into each Haystack function call to see how it works. Finally, you can follow the link to learn how to add your own custom data using document stores.

Haystack quick start Python code running in Visual Studio Code. I previously exported an OPENAI_API_KEY environment variable set to my OpenAI secret key value. I also modified line 31 as described in the text above to use the environment variable.
IDG

Haystack learning resources

You can learn more about Haystack from the documentation, the API reference, the tutorials and walkthroughs, the cookbook, the blog posts, and the source code repository. You can discuss Haystack on its Discord. You can also sign up for the free one-hour course “Building AI Applications with Haystack” on DeepLearning.AI, taught by Tuana Çelik, Developer Relations Lead at Haystack by deepset.

Overall, Haystack is a very good open-source framework for building LLM applications, and deepset Cloud is a very good SaaS platform for building LLM applications and managing them across the whole life cycle. deepset Studio is a good visual pipeline designer; the standalone version can convert pipeline diagrams to Python code, and it lets you view pipelines as YAML as well as diagrams.

Haystack competes with LlamaIndex, LangChain, and Semantic Kernel. Honestly, all four frameworks will do the job for most LLM application use cases. As they are all open source, you can try and use them all for free. Their debugging tools differ, their programming language support differs, and the ways they have implemented cloud versions also differ. I’d advise you to try each one for a day or three with a clear but simple use case of your own as a goal and see which one works best for you.

Bottom Line

Haystack is a very good open-source framework for building LLM applications, and deepset Cloud is a very good SaaS platform for building LLM applications and managing them across the whole life cycle. deepset Studio is a good visual pipeline designer; the standalone version can convert pipeline diagrams to Python code.

Pros

Open-source framework for building production-ready LLM applications

Implemented in Python

Good cloud SaaS implementation

Good set of integrations with models, vector search, and tools

Supports monitoring with Chainlit and Traceloop

Cons

No implementation in programming languages other than Python

deepset Studio is still in controlled beta except on deepset Cloud

Cost

Haystack 2.0: Free. deepset Cloud: Contact deepset.

Platforms

Python 3.8 or later.