Boosting AI: Build Your Chatbot Over Your Data With MongoDB Atlas Vector Search and LangChain Templates Using the RAG Pattern
In this tutorial, I will show you the simplest way to implement an AI chatbot-style application using MongoDB Atlas Vector Search with LangChain Templates and the retrieval-augmented generation (RAG) pattern for more precise chat responses.
The retrieval-augmented generation (RAG) pattern enhances LLMs by supplementing them with additional, relevant data, ensuring grounded and precise responses for business purposes. Through vector search, RAG identifies and retrieves pertinent documents from databases and sends them as context to the LLM along with the query, thereby improving the quality of the LLM's response. This approach decreases inaccuracies by anchoring responses in factual content and keeps responses relevant by drawing on the most current data. RAG also optimizes token use without expanding an LLM's token limit, focusing on the most relevant documents to inform the response process.
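As a minimal illustration of the retrieval step, here is a toy sketch in plain Python: it ranks a handful of made-up documents by cosine similarity against a query vector and builds a prompt from the best match. The documents and embedding vectors here are entirely hypothetical; a real system would use an embedding model and a vector database such as Atlas.

```python
import math

# Toy 3-dimensional "embeddings" -- a real system would obtain these
# from an embedding model such as OpenAI's text-embedding-ada-002.
documents = {
    "MongoDB Atlas supports vector search via the $vectorSearch stage.": [0.9, 0.1, 0.2],
    "Paris is the capital of France.": [0.1, 0.9, 0.3],
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vector, k=1):
    # Rank documents by similarity to the query vector (the "R" in RAG).
    ranked = sorted(
        documents,
        key=lambda d: cosine_similarity(documents[d], query_vector),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical embedding of "How do I run vector search in Atlas?"
query_vector = [0.8, 0.2, 0.1]
context = retrieve(query_vector)[0]

# The retrieved context is prepended to the question before calling the LLM.
prompt = (
    f"Answer using this context:\n{context}\n\n"
    "Question: How do I run vector search in Atlas?"
)
print(context)
```

The retrieved document about `$vectorSearch`, not the unrelated one about Paris, ends up in the prompt, which is exactly how RAG grounds the LLM's answer.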
To elaborate on the implementation: Consider a question-answering system that uses an LLM like OpenAI, operating with the RAG model. It starts with Atlas Vector Search to pinpoint relevant documents or text snippets within a database, providing the necessary context for the question. This context, along with the question, is then processed through OpenAI's API, enabling a more informed and accurate response.
Atlas Vector Search plays a vital role for developers within the retrieval-augmented generation framework. A key technology in funnelling external data into LLMs is LangChain. This framework facilitates the development of applications that integrate LLMs, covering a range of uses that align with the capabilities of language models themselves. These uses encompass tasks like document analysis and summarization, the operation of chatbots, and the scrutiny of code.
MongoDB has streamlined the process for developers to integrate AI into their applications by teaming up with LangChain for the introduction of LangChain Templates. This collaboration has produced a retrieval-augmented generation template that capitalizes on the strengths of MongoDB Atlas Vector Search along with OpenAI's technologies. The template offers a developer-friendly approach to crafting and deploying chatbot applications tailored to specific data sets. The LangChain templates serve as a deployable reference framework, accessible as a REST API via LangServe.
The alliance has also been instrumental in showcasing the latest Atlas Vector Search advancements, notably the `$vectorSearch` aggregation stage, now embedded within LangChain's Python and JavaScript offerings. The joint venture is committed to ongoing development, with plans to unveil more templates. These future additions are intended to further accelerate developers' abilities to realise and launch their creative projects.

LangChain Templates present a selection of reference architectures that are designed for quick deployment, available to any user. These templates introduce an innovative system for the crafting, exchanging, refreshing, acquiring, and tailoring of diverse chains and agents. They are crafted in a uniform format for smooth integration with LangServe, enabling the swift deployment of production-ready APIs. Additionally, these templates provide a free sandbox for experimental and developmental purposes.
The `rag-mongo` template is specifically designed to perform retrieval-augmented generation utilizing MongoDB and OpenAI technologies. We will take a closer look at the `rag-mongo` template in the following section of this tutorial.

To get started, you only need to install the `langchain-cli`. Use the LangChain CLI to bootstrap a LangServe project quickly. The application will be named `my-blog-article`, and the name of the template must also be specified; I'll name it `rag-mongo`.

This will create a new directory called `my-blog-article` with two folders:

- `app`: This is where the LangServe code will live.
- `packages`: This is where your chains or agents will live.
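The bootstrap steps above can be run from a terminal. A sketch, assuming `pip` is available; the application and package names are the ones chosen in this tutorial:

```shell
# Install the LangChain CLI.
pip install -U langchain-cli

# Create a new LangServe application named my-blog-article,
# pre-loaded with the rag-mongo template.
langchain app new my-blog-article --package rag-mongo
```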
We will need to insert data into MongoDB Atlas. In our exercise, we utilize a publicly accessible PDF document titled "MongoDB Atlas Best Practices" as a data source for constructing a text-searchable vector space. The data will be ingested into the MongoDB `langchain.vectorSearch` namespace.

In order to do it, navigate to the directory `my-blog-article/packages/rag-mongo` and, in the file `ingest.py`, change the default names of the MongoDB database and collection. Additionally, modify the URL of the document you wish to use for generating embeddings.

Creating and inserting embeddings into MongoDB Atlas using LangChain Templates is very easy. You just need to run the `ingest.py` script. It will first load a document from a specified URL using the `PyPDFLoader`. Then, it splits the text into manageable chunks using the `RecursiveCharacterTextSplitter`. Finally, the script uses the OpenAI Embeddings API to generate embeddings for each chunk and inserts them into the MongoDB Atlas `langchain.vectorSearch` namespace.

Now, it's time to initialize Atlas Vector Search. We will do this through the Atlas UI. In the Atlas UI, choose Search and then Create Search Index. Afterwards, choose the JSON Editor to declare the index parameters as well as the database and collection where the Atlas Vector Search will be established (`langchain.vectorSearch`). Set the index name as `default`. The index definition declares a `knnVector` field on the path that stores the embeddings, with the number of dimensions matching the OpenAI embedding model (1536) and cosine similarity.

Let's now take a closer look at the central component of the LangChain
`rag-mongo` template: the `chain.py` script. This script utilizes the `MongoDBAtlasVectorSearch` class to create an object — `vectorstore` — that interfaces with MongoDB Atlas's vector search capabilities for semantic similarity searches. A `retriever` is then configured from `vectorstore` to perform these searches, specifying the search type as "similarity." This configuration ensures the most contextually relevant documents are retrieved from the database. Upon retrieval, the script merges these documents with the user's query and leverages the `ChatOpenAI` class to process the input through OpenAI's GPT models, crafting a coherent answer. To further enhance this process, the `ChatOpenAI` class is initialized with the `gpt-3.5-turbo-16k-0613` model, chosen for its optimal performance, and the temperature is set to 0, promoting consistent, deterministic outputs for a streamlined and precise user experience.

The `ChatOpenAI` class also permits tailoring API requests, offering control over retry attempts, token limits, and response temperature. It adeptly manages multiple response generations, response caching, and callback operations. Additionally, it facilitates asynchronous tasks to streamline response generation and incorporates metadata and tagging for comprehensive API run tracking.
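Condensed, the chain described above follows this shape. This is a sketch based on the classic LangChain Python API, not a verbatim copy of the template's `chain.py`; the `MONGO_URI` environment variable and the prompt wording are placeholders, and the real code ships in the `rag-mongo` package.

```python
import os

from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import ChatPromptTemplate
from langchain.schema.output_parser import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough
from langchain.vectorstores import MongoDBAtlasVectorSearch

# Connect the vector store to the namespace populated by ingest.py,
# using the Atlas Vector Search index named "default".
vectorstore = MongoDBAtlasVectorSearch.from_connection_string(
    os.environ["MONGO_URI"],          # placeholder connection string
    "langchain.vectorSearch",
    OpenAIEmbeddings(),
    index_name="default",
)
retriever = vectorstore.as_retriever(search_type="similarity")

# Merge the retrieved context with the user's question.
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
model = ChatOpenAI(model="gpt-3.5-turbo-16k-0613", temperature=0)

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)
# chain.invoke("How should I size a MongoDB Atlas cluster?")
```

Running the chain requires a reachable Atlas cluster and an OpenAI API key, which is why the invocation is left commented out.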
After successfully creating and storing embeddings in MongoDB Atlas, you can start utilizing the LangServe Playground by executing the `langchain serve` command, which grants you access to your chatbot. This will start the FastAPI application, with a server running locally at `http://127.0.0.1:8000`. All templates can be viewed at `http://127.0.0.1:8000/docs`, and the playground can be accessed at `http://127.0.0.1:8000/rag-mongo/playground/`.

The chatbot will answer questions about best practices for using MongoDB Atlas with the help of context provided through vector search. Questions on other topics will not be considered by the chatbot.
Go to the playground URL (`http://127.0.0.1:8000/rag-mongo/playground/`) and start using your template! You can ask questions related to MongoDB Atlas in the chat.
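The same chain is also exposed as a REST API by LangServe. Sketches of `curl` calls against the locally running server; the exact request schema is documented at `http://127.0.0.1:8000/docs`:

```shell
# Single question via the invoke endpoint.
curl -X POST http://127.0.0.1:8000/rag-mongo/invoke \
  -H "Content-Type: application/json" \
  -d '{"input": "What are MongoDB Atlas best practices for indexing?"}'

# Several questions at once via the batch endpoint.
curl -X POST http://127.0.0.1:8000/rag-mongo/batch \
  -H "Content-Type: application/json" \
  -d '{"inputs": ["What is an Atlas cluster?", "How do I secure Atlas?"]}'
```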
By expanding the Intermediate steps menu, you can trace the entire process of formulating a response to your question: searching for the most pertinent documents related to your query and forwarding them to the OpenAI API to serve as the context for the query. This methodology aligns with the RAG pattern, wherein relevant documents are retrieved to furnish context for generating a well-informed response to a specific inquiry.

We can also use `curl` to interact with the LangServe REST API and contact endpoints such as `/rag-mongo/invoke`. Batch requests can be sent to the API through the `/rag-mongo/batch` endpoint. For comprehensive documentation and further details, please visit `http://127.0.0.1:8000/docs`.

In this article, we've explored the synergy of MongoDB Atlas Vector Search with LangChain Templates and the RAG pattern to significantly improve chatbot response quality. By implementing these tools, developers can ensure their AI chatbots deliver highly accurate and contextually relevant answers. Step into the future of chatbot technology by applying the insights and instructions provided here. Elevate your AI and engage users like never before. Don't just build chatbots — craft intelligent conversational experiences. Start now with MongoDB Atlas and LangChain!