Summary
In today's world, we have an abundance of options when it comes to selecting large language models (LLMs) for developing AI-driven applications. These LLMs differ greatly in their architecture, training data, intended use cases, and prompting techniques. Additionally, they often need to integrate with other systems for data retrieval or performance monitoring, adding another layer of complexity. This work focuses on exploring the application of LangChain in the development and utilization of LLMs. Python functions and data files needed to run this notebook are available via this link.
The figure below shows the current state of large language models (LLMs):
Harrison Chase created LangChain in October 2022 to change all of that. LangChain is an open-source framework that helps developers connect LLMs, data sources, and other functionality under a single, unified syntax. With LangChain, developers can create scalable, modular LLM applications for almost any use.
LangChain encompasses an entire ecosystem of tools, but in this course, we'll focus on the core functionality of the LangChain library. We will learn about chains and tools, which we can use to improve the quality of our LLM's output, and also discuss troubleshooting and evaluation techniques. Note that LangChain is available in Python and JavaScript, but this course will only cover the Python version.
LangChain has three main components:
LLMs, for interacting with open-source and proprietary language models;
prompts, for turning user inputs into model inputs;
parsers, for organizing model outputs for easy retrieval.
The system also includes chains and agents for creating workflows that combine these components; the sketch below shows how they fit together.
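A minimal sketch, assuming the langchain, langchain-core, and langchain-openai packages are installed and an OpenAI API key is available; the prompt text and question are placeholders, not part of the course material:

from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import OpenAI

prompt = PromptTemplate.from_template(
    "Answer the question: {question}"  # prompt: turns the user's input into a model input
)
llm = OpenAI(openai_api_key="...")     # LLM: the model that generates the text
parser = StrOutputParser()             # parser: returns the raw output as a clean string

chain = prompt | llm | parser          # chain: wires the components into one workflow
# chain.invoke({"question": "What is LangChain?"})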
Hugging Face is a huge repository of open source datasets, tools, and most importantly for us, models! Using language models hosted on Hugging Face is free, but using them in LangChain requires a Hugging Face API token.
To create one, log in or create a Hugging Face account, and navigate to the URL shown under settings https://huggingface.co/settings/tokens. Here, click on 'New token' and copy your key.
With our key ready, let's leverage LangChain to utilize a model from Hugging Face and compare it to an OpenAI model. LangChain provides an OpenAI class and a HuggingFaceHub class to interact seamlessly with their respective APIs. After importing these classes, we define the LLM using the model name and API key.
For Hugging Face, we'll use the Falcon 7B instruction-optimized model. We'll set up an unfinished sentence as input and have both models predict the next words. Finally, we'll print the results to compare the outputs. Despite using entirely different models from separate platforms, LangChain harmonizes their usage into a unified, modular workflow. In this case, the OpenAI model produces a longer response compared to the Falcon 7B instruction model. However, it’s important to note that a longer response doesn't necessarily equate to a better answer in all scenarios.
#pip install langchain
#pip install langchain-community
#pip install huggingface_hub
from langchain_community.llms import HuggingFaceHub
huggingfacehub_api_token = 'hf_.....'
llm = HuggingFaceHub(repo_id='tiiuae/falcon-7b-instruct',
huggingfacehub_api_token=huggingfacehub_api_token)
question = 'Can you still have fun if'
output = llm.invoke(question)
print(output)
Can you still have fun if you don't win? Yes, you can still have fun even if you don't win. Winning is not the only way to have fun. You can have fun by participating in activities you enjoy, spending time with friends and family, and finding new ways to challenge yourself and learn new skills.
OpenAI's models are particularly well-regarded in the AI/LLM community; their high performance is largely due to their proprietary technology and carefully curated training data. In contrast to the open-source models on Hugging Face, OpenAI's models do have costs associated with their use, but for many applications, they are currently the best choice to build on.
Due to LangChain's unified syntax, swapping one model for another only requires changing a small amount of code. In this exercise, you'll do just that!
To use OpenAI's models, you'll need an OpenAI API key. If you haven't created one of these before, first, visit their signup page. Next, navigate to the API keys page to create your secret key. If you've lost your key, you can create a new one here, too.
#pip install langchain-openai
from langchain_openai import OpenAI
import os
openai_api_key = "....."
llm = OpenAI(openai_api_key=openai_api_key)
question = 'Take care of your mental health if'
output = llm.invoke(question)
print(output)
you’re feeling alone 1. Practice self-care: Make sure you are taking care of yourself physically, emotionally, and mentally. This can include getting enough rest, eating well, exercising, and engaging in activities that bring you joy. 2. Reach out to loved ones: Don't be afraid to reach out to friends and family for support. They may not know how you are feeling unless you tell them. Talking to someone you trust can help you feel less alone. 3. Join a support group: Consider joining a support group where you can connect with others who are going through similar experiences. It can be comforting to know that you are not alone in your struggles. 4. Seek professional help: If you are feeling overwhelmed and struggling to cope, don't hesitate to seek help from a mental health professional. They can provide you with the tools and support you need to manage your feelings of loneliness. 5. Find a hobby or activity: Engaging in a hobby or activity that you enjoy can help you feel more connected to yourself and others. It can also give you a sense of purpose and fulfillment. 6. Practice mindfulness: Mindfulness techniques, such as meditation or deep breathing, can help you stay present and calm your mind. This can be especially helpful if you are
In this case, the OpenAI model produces a longer response than the Falcon 7B Instruct model. However, a longer response does not always mean a better answer. LangChain's tools can be used to optimize these outputs for our particular use case, and this will be one of the main focuses of this course.
In summary, LangChain is an excellent tool for working with natural language. In real-world production development, it enables intelligent interactions with documents, helping companies make informed business decisions, automate tasks, and explore new ways to analyze text data. LangChain simplifies AI integration while offering enhanced control over the entire workflow. For this course, we will use a specific version of LangChain: langchain==0.1.0. If you choose to work with a newer version on your own system, consult the LangChain documentation for any updates or changes.
Let’s use LangChain to implement prompting strategies for chatbots, applicable to both OpenAI chat models and open-source chat models available on Hugging Face. In addition to OpenAI's chat models, LangChain provides access to thousands of chat-optimized language models via the Hugging Face Hub API.
To find chat-optimized language models, visit the models section on Hugging Face and filter by the Question Answering task. Note down the model name for reference in your code. New models are regularly added to Hugging Face, expanding your options.
Many of these models are fine-tuned on domain-specific datasets, making them adept at understanding the nuances of particular regions, cultures, or tasks. Taking the time to search for the most suitable model for your specific use case is highly beneficial.
https://huggingface.co/models?pipeline_tag=question-answering&sort=trending
After selecting a model, we can begin prompting it by using a prompt template. Prompt templates serve as flexible and modular frameworks for generating prompts from user inputs. A template can include instructions, examples for few-shot prompting, and any additional context to help the model complete the task effectively.
Prompt templates are built using LangChain's PromptTemplate class. First, we create a structured template string designed to guide the AI in answering a question. The {question} field is defined for dynamic input insertion at runtime. To convert this string into a prompt template compatible with our model, we use the PromptTemplate class, specifying the variables representing inputs via the input_variables argument. Once the prompt template is ready, we can seamlessly integrate it into our model.
from langchain.prompts import PromptTemplate
template = "As an artificial intelligence assistant, please answer the question: {question}"
prompt = PromptTemplate(template=template, input_variables=["question"])
prompt
PromptTemplate(input_variables=['question'], input_types={}, partial_variables={}, template='As an artificial intelligence assistant, please answer the question: {question}')
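Before wiring the template into a chain, we can preview how it renders by filling the placeholder with .format(); the question string here is only an illustration:

# Fill the {question} placeholder to preview the final prompt string
print(prompt.format(question="What is LangChain?"))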
To begin passing user inputs, we combine the prompt template and the model into an LLMChain, then call the chain's .run() method, passing the input string.
from langchain.chains import LLMChain
from langchain_community.llms import HuggingFaceHub
llm = HuggingFaceHub(repo_id='tiiuae/falcon-7b-instruct',
huggingfacehub_api_token=huggingfacehub_api_token)
llm_chain = LLMChain(prompt=prompt, llm=llm)
question = "What is LangChain and how it can be used?"
print(llm_chain.run(question))
As an artificial intelligence assistant, please answer the question: What is LangChain and how it can be used? LangChain is a blockchain-based language learning platform that allows users to learn and earn cryptocurrency while learning a new language. It offers a gamified approach to language learning, where users can earn tokens by completing courses and interacting with the platform. The platform offers courses in various languages, including English, Spanish, French, and Mandarin. Users can also earn tokens by creating and sharing their own courses, as well as by participating in various community-driven activities.
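The LLMChain syntax above still works in this LangChain version, but the library also supports composing the same prompt and model directly with the runnable pipe syntax; a minimal equivalent sketch using the prompt and llm objects defined above:

# Equivalent chain built with the runnable (pipe) syntax instead of LLMChain
lcel_chain = prompt | llm
print(lcel_chain.invoke({"question": question}))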
LangChain offers specialized classes for working with chat models, such as ChatPromptTemplate and ChatOpenAI. The ChatOpenAI class provides chat-specific functionality beyond what the standard OpenAI class offers. You can instantiate the model as usual, ensuring to include your OpenAI API key.
To create a chat prompt template for the model, use the .from_messages() method of the ChatPromptTemplate class. This allows you to specify messages for the various OpenAI chat roles, including system, human, and ai. Like the standard PromptTemplate, input variables are denoted using curly brackets. Once the template is set up, inputs can be passed to it using the .format_messages() method. Finally, you can send the formatted prompt to the model to observe its response.
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
llm = ChatOpenAI(temperature=0, openai_api_key=openai_api_key)
prompt_template = ChatPromptTemplate.from_messages(
[
("system", "You are omnipotent."),
("human", "Answer this question: {question}")
]
)
full_prompt = prompt_template.format_messages(question='What is the reason for creating snakes?')
llm(full_prompt)
AIMessage(content='Snakes were created as part of the diverse ecosystem on Earth. They play important roles in controlling populations of rodents and other pests, helping to maintain balance in the food chain. Additionally, snakes have unique adaptations and characteristics that make them fascinating creatures to study and appreciate.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 53, 'prompt_tokens': 29, 'total_tokens': 82, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-bdea1407-d4b7-452f-aa51-acbaa32e8e7e-0', usage_metadata={'input_tokens': 29, 'output_tokens': 53, 'total_tokens': 82, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})
Memory plays a crucial role in conversations with chat models. It enables features like follow-up questions, iterative refinement of model responses, and the ability for chatbots to adapt to user preferences and behaviors. While LangChain provides tools to customize and enhance in-conversation memory for chatbots, it remains constrained by the model's context window.
An LLM's context window refers to the maximum amount of input text the model can process at once when generating a response. The size of this window varies across different models. LangChain offers a standardized approach for optimizing model memory, and we’ll explore three key LangChain classes designed for implementing chatbot memory:
ChatMessageHistory
The message history stores the entire conversation between the user and the model, enabling follow-up questions and iterative refinement of responses. Let's implement this feature with an OpenAI model. First, we import the ChatMessageHistory and ChatOpenAI classes and define the LLM.
To initialize the conversation history, create an instance of ChatMessageHistory and assign it to a variable. We'll kick off the conversation with an AI message, which helps establish the tone and direction. Use the .add_ai_message() method to add the AI message to the history. Similarly, user messages can be added using the .add_user_message() method.
To pass these messages to the model, simply call the model on the messages attribute of the history object. And that's it: our conversational history is now integrated!
from langchain.memory import ChatMessageHistory
from langchain.chat_models import ChatOpenAI
chat = ChatOpenAI(temperature=0.2, openai_api_key=openai_api_key)
history = ChatMessageHistory()
history.add_ai_message("Hi! Ask me anything about Math.")
history.add_user_message("Describe cintral limit theorem?")
chat(history.messages)
AIMessage(content='The Central Limit Theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size increases, regardless of the shape of the population distribution. In other words, if you take multiple random samples from a population and calculate the mean of each sample, the distribution of those sample means will be approximately normally distributed, even if the original population is not normally distributed. This theorem is a fundamental concept in statistics and is used in various statistical analyses and hypothesis testing.', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 95, 'prompt_tokens': 26, 'total_tokens': 121, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-faba9346-58fb-4779-93e9-02a049e89355-0')
history.add_user_message("Does this rule apply for all distributions like uniform, trangular and random?")
chat(history.messages)
AIMessage(content='The Central Limit Theorem (CLT) is a fundamental concept in statistics that states that the sampling distribution of the sample mean approaches a normal distribution as the sample size increases, regardless of the shape of the population distribution. This means that even if the original population distribution is not normal (e.g., uniform, triangular, exponential, etc.), the distribution of sample means will tend to be normal as the sample size increases.\n\nIn other words, the Central Limit Theorem applies to a wide range of distributions, not just normal distributions. As long as the sample size is sufficiently large (usually n ≥ 30 is considered a rule of thumb), the distribution of sample means will approximate a normal distribution, regardless of the shape of the original population distribution. This property makes the Central Limit Theorem a powerful tool in statistical inference and hypothesis testing.', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 167, 'prompt_tokens': 45, 'total_tokens': 212, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-4b4d84a3-2165-4631-ba38-9bc746b3406d-0')
history.add_user_message("Summarize the conversations")
chat(history.messages)
AIMessage(content='The Central Limit Theorem states that the sampling distribution of the sample mean will be approximately normally distributed, regardless of the shape of the original population distribution, as long as the sample size is sufficiently large. This theorem applies to a wide range of distributions, including uniform, triangular, and random distributions. In summary, the Central Limit Theorem allows us to make inferences about population parameters based on sample means, even if the original population distribution is not normal.', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 91, 'prompt_tokens': 54, 'total_tokens': 145, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-48305e52-6cb0-4b14-bbe9-dc9ef4a4327b-0')
We can use different tools to manage memory usage in LLM applications, and we can even integrate external data to give the models even more context. These tools store and process information in different ways and at different speeds as responses are generated, so no single solution fits every case.
Retrieved from DataCamp course: "Developing LLM Applications with LangChain"
ConversationBufferMemory
The first memory tool we'll explore is ConversationBufferMemory. It provides the application with a buffer of the conversation's messages, which is passed back to the model as context for each new response. Note that ConversationBufferMemory keeps the full history; for a rolling window that retains only the most recent exchanges and discards older ones, LangChain provides ConversationBufferWindowMemory with a k argument (see the sketch after this example).
To integrate this memory type with a model, we use a specialized chain designed for conversations: ConversationChain. Additionally, setting verbose=True allows the model to display its decision-making process alongside its results, providing greater transparency.
from langchain.memory import ConversationBufferMemory
from langchain_openai import OpenAI
from langchain.chains import ConversationChain
chat = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=1, openai_api_key=openai_api_key)
memory = ConversationBufferMemory(size=4)
chain_buffer = ConversationChain(llm=chat, memory=memory, verbose=True)
Let's pass the chain a series of inputs. With verbose=True, the chain prints the formatted prompt, including the current memory state, before generating each response, so you can see exactly what context the model had. The prompt for the final question contains the context accumulated from the previous messages.
chain_buffer.predict(input="Describe a logistic regression in two sentence")
chain_buffer.predict(input="When it can be applied?")
chain_buffer.predict(input="What are its limitation?")
chain_buffer.predict(input="What was my second question? I forgot.")
> Entering new ConversationChain chain... Prompt after formatting: The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know. Current conversation: Human: Describe a logistic regression in two sentence AI: > Finished chain. > Entering new ConversationChain chain... Prompt after formatting: The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know. Current conversation: Human: Describe a logistic regression in two sentence AI: Logistic regression is a statistical method used for predicting the probability of a binary outcome based on one or more independent variables. It uses a logistic function to map the input variables to a linear regression model and outputs a probability between 0 and 1, with values closer to 1 indicating a higher likelihood of the event occurring. Human: When it can be applied? AI: > Finished chain. > Entering new ConversationChain chain... Prompt after formatting: The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know. Current conversation: Human: Describe a logistic regression in two sentence AI: Logistic regression is a statistical method used for predicting the probability of a binary outcome based on one or more independent variables. It uses a logistic function to map the input variables to a linear regression model and outputs a probability between 0 and 1, with values closer to 1 indicating a higher likelihood of the event occurring. Human: When it can be applied? AI: Logistic regression can be applied in various fields such as marketing, finance, biostatistics, and social sciences. It is commonly used for predicting binary outcomes, such as whether a customer will make a purchase or if a patient will respond to a new medication. Human: What are its limitation? AI: > Finished chain. > Entering new ConversationChain chain... Prompt after formatting: The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know. Current conversation: Human: Describe a logistic regression in two sentence AI: Logistic regression is a statistical method used for predicting the probability of a binary outcome based on one or more independent variables. It uses a logistic function to map the input variables to a linear regression model and outputs a probability between 0 and 1, with values closer to 1 indicating a higher likelihood of the event occurring. Human: When it can be applied? AI: Logistic regression can be applied in various fields such as marketing, finance, biostatistics, and social sciences. It is commonly used for predicting binary outcomes, such as whether a customer will make a purchase or if a patient will respond to a new medication. Human: What are its limitation? AI: Like any statistical model, logistic regression has its limitations. It assumes that the relationship between the independent variables and the outcome is linear, and it is not suitable for predicting continuous outcomes. 
It also assumes that the observations are independent, which may not be true in some cases. Additionally, it is sensitive to overfitting and requires a large sample size to accurately estimate the coefficients. Human: What was my second question? I forgot. AI: > Finished chain.
' Your second question was "When can it be applied?" and I provided several examples of potential applications for logistic regression. Did you have any other questions for me?'
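As noted above, ConversationBufferMemory keeps the entire history. If you want a true rolling window that retains only the last few exchanges, LangChain provides ConversationBufferWindowMemory; a minimal sketch reusing the chat model defined above (the k value here is only an illustration):

# Rolling-window variant: keep only the most recent exchanges in memory
from langchain.memory import ConversationBufferWindowMemory

window_memory = ConversationBufferWindowMemory(k=2)  # retain the last two exchanges
chain_window = ConversationChain(llm=chat, memory=window_memory, verbose=True)
# chain_window.predict(input="Describe a logistic regression in two sentences")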
ConversationSummaryMemory
Summarizing key points from a conversation is another effective way to optimize memory. The ConversationSummaryMemory class condenses the conversation over time, allowing the chat model to retain important context without needing to store the entire conversation history.
To implement this, we'll use the ConversationChain again. First, we instantiate the model that will handle the conversation. Unlike ConversationBufferMemory, ConversationSummaryMemory requires an LLM as an argument to generate summaries. Essentially, with each new message, an LLM call is made to summarize the conversation history. Defining the conversation chain is as straightforward as before, showcasing the simplicity and modularity of the LangChain framework.
from langchain.memory import ConversationSummaryMemory
chat = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0, openai_api_key=openai_api_key)
memory = ConversationSummaryMemory(llm=OpenAI(model_name="gpt-3.5-turbo-instruct",
openai_api_key=openai_api_key))
chain_summary = ConversationChain(llm=chat, memory=memory, verbose=True)
Let's pass the model a series of inputs using ConversationSummaryMemory. The verbose output shows the memory used to respond to each input; for the final question, it is a running summary of the earlier inputs and responses.
chain_summary.predict(input="Today is very rainy but had to go outside to buy some grocery")
chain_summary.predict(input="I had invited some of my frineds to my house for dinner party but did not have meat, wine and fruits?")
chain_summary.predict(input="Unfortunately, although I prepared all the foods for dinner, no one showed up because of sever thonder storm ?")
chain_summary.predict(input="I do not know what I should do with all the foods. Can you help?")
chain_summary.predict(input="Any suggenstion to ensure something like this will not happen to me?")
> Entering new ConversationChain chain... Prompt after formatting: The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know. Current conversation: Human: Today is very rainy but had to go outside to buy some grocery AI: > Finished chain. > Entering new ConversationChain chain... Prompt after formatting: The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know. Current conversation: The human mentions going outside in the rain to buy groceries. The AI expresses sympathy and asks what kind of groceries were needed. Human: I had invited some of my frineds to my house for dinner party but did not have meat, wine and fruits? AI: > Finished chain. > Entering new ConversationChain chain... Prompt after formatting: The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know. Current conversation: The human mentions going outside in the rain to buy groceries. The AI expresses sympathy and asks what kind of groceries were needed. The human reveals they were planning a dinner party but didn't have meat, wine, and fruits. The AI expresses further sympathy and asks what specific types of these items were needed. Human: Unfortunately, although I prepared all the foods for dinner, no one showed up because of sever thonder storm ? AI: > Finished chain. > Entering new ConversationChain chain... Prompt after formatting: The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know. Current conversation: The human mentions going outside in the rain to buy groceries. The AI expresses sympathy and asks what kind of groceries were needed. The human reveals they were planning a dinner party but didn't have meat, wine, and fruits. The AI expresses further sympathy and asks what specific types of these items were needed. The human then explains that the dinner party was cancelled due to a severe thunderstorm and the AI expresses even more sympathy, asking about the specific groceries that were needed for the party. Human: I do not know what I should do with all the foods. Can you help? AI: > Finished chain. > Entering new ConversationChain chain... Prompt after formatting: The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know. Current conversation: The human mentions going outside in the rain to buy groceries. The AI expresses sympathy and asks what kind of groceries were needed. The human reveals they were planning a dinner party but didn't have meat, wine, and fruits. The AI expresses further sympathy and asks what specific types of these items were needed. The human then explains that the dinner party was cancelled due to a severe thunderstorm and the AI expresses even more sympathy, asking about the specific groceries that were needed for the party. 
The human expresses uncertainty about what to do with the food and the AI offers to help by suggesting creative recipes using the ingredients available. Human: Any suggenstion to ensure something like this will not happen to me? AI: > Finished chain.
" I'm sorry to hear that your dinner party was cancelled due to the thunderstorm. What specific groceries were you planning to buy for the party? Perhaps I can help you come up with some creative recipes using the ingredients you already have."
from langchain.memory import ConversationSummaryMemory
chat = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0, openai_api_key=openai_api_key)
memory_new = ConversationSummaryMemory(llm=OpenAI(model_name="gpt-3.5-turbo-instruct",
openai_api_key=openai_api_key))
chain_summary_new = ConversationChain(llm=chat, memory=memory_new, verbose=True)
chain_summary_new.predict(input="""Mia had always loved the quaint little bookshop at
the corner of Maple Street. Every Saturday morning, she would wander through its
narrow aisles, the scent of aged paper and ink a comforting presence. It was
on one such morning, while perusing the dusty shelves in the back, that she
stumbled upon an old, leather-bound journal. The cover was worn, and the pages
yellowed with time, but something about it called to her. As she opened it,
she realized it wasn't just any journal—it was filled with intricate drawings and
cryptic notes, seemingly leading to a hidden treasure.
""")
chain_summary_new.predict(input="""Intrigued by the mystery, Mia decided to follow the
clues detailed within the journal. Each drawing seemed to depict a different landmark
in her town, places she had known all her life but never looked at closely. She spent
the next few weeks deciphering the codes and visiting these locations, discovering hidden
messages and secret symbols carved into stone and wood. As the pieces of the puzzle started
to come together, Mia felt a sense of adventure she hadn't experienced since childhood.
The thrill of the hunt consumed her, and she began to dream of what the treasure might be.
""")
chain_summary_new.predict(input="""One evening, as the sun set in a blaze of orange and pink,
Mia found herself standing in front of an old, abandoned lighthouse at the edge of town,
the final location marked in the journal. Heart pounding with anticipation, she climbed
the rickety stairs to the top, where she discovered a small, rusted box hidden under a
loose floorboard. Inside, instead of gold or jewels, she found a collection of letters
and photographs from the early 1900s, telling the love story of a young couple separated
by war. Mia realized that the true treasure was not material wealth, but the poignant
history and love preserved in those letters, a testament to enduring love and the
passage of time. As she read through the heartfelt words, she felt a deep connection
to the past and a newfound appreciation for the stories hidden in her own town.
""")
chain_summary_new.predict(input="make summary of two sentences?")
> Entering new ConversationChain chain... Prompt after formatting: The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know. Current conversation: Human: Mia had always loved the quaint little bookshop at the corner of Maple Street. Every Saturday morning, she would wander through its narrow aisles, the scent of aged paper and ink a comforting presence. It was on one such morning, while perusing the dusty shelves in the back, that she stumbled upon an old, leather-bound journal. The cover was worn, and the pages yellowed with time, but something about it called to her. As she opened it, she realized it wasn't just any journal—it was filled with intricate drawings and cryptic notes, seemingly leading to a hidden treasure. AI: > Finished chain. > Entering new ConversationChain chain... Prompt after formatting: The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know. Current conversation: The human describes Mia's love for a quaint bookshop on Maple Street and her chance discovery of an old journal filled with cryptic notes and drawings. The AI expresses fascination and offers to provide more information about the bookshop. Human: Intrigued by the mystery, Mia decided to follow the clues detailed within the journal. Each drawing seemed to depict a different landmark in her town, places she had known all her life but never looked at closely. She spent the next few weeks deciphering the codes and visiting these locations, discovering hidden messages and secret symbols carved into stone and wood. As the pieces of the puzzle started to come together, Mia felt a sense of adventure she hadn't experienced since childhood. The thrill of the hunt consumed her, and she began to dream of what the treasure might be. AI: > Finished chain. > Entering new ConversationChain chain... Prompt after formatting: The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know. Current conversation: The human describes Mia's discovery of a mysterious journal in a quaint bookshop on Maple Street. Intrigued, Mia follows the clues within the journal and uncovers hidden messages in familiar landmarks. The AI expresses fascination and reveals that the bookshop has a history of puzzles and riddles, possibly left behind by the previous owner. The human and AI discuss the excitement and adventure that such a simple place can hold. Human: One evening, as the sun set in a blaze of orange and pink, Mia found herself standing in front of an old, abandoned lighthouse at the edge of town, the final location marked in the journal. Heart pounding with anticipation, she climbed the rickety stairs to the top, where she discovered a small, rusted box hidden under a loose floorboard. Inside, instead of gold or jewels, she found a collection of letters and photographs from the early 1900s, telling the love story of a young couple separated by war. Mia realized that the true treasure was not material wealth, but the poignant history and love preserved in those letters, a testament to enduring love and the passage of time. 
As she read through the heartfelt words, she felt a deep connection to the past and a newfound appreciation for the stories hidden in her own town. AI: > Finished chain. > Entering new ConversationChain chain... Prompt after formatting: The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know. Current conversation: The human describes Mia's discovery of a mysterious journal in a quaint bookshop on Maple Street. Intrigued, Mia follows the clues within the journal and uncovers hidden messages in familiar landmarks. The AI expresses fascination and reveals that the bookshop has a history of puzzles and riddles, possibly left behind by the previous owner. The human and AI discuss the excitement and adventure that such a simple place can hold. They also discuss Mia's later discovery of a collection of letters and photographs hidden in an old lighthouse, showing the power of stories and history in unexpected places. Human: make summary of two sentences? AI: > Finished chain.
' Sure, the summary of the conversation is that Mia discovered a mysterious journal in a quaint bookshop on Maple Street and followed its clues to uncover hidden messages in familiar landmarks. The AI reveals that the bookshop has a history of puzzles and riddles, possibly left behind by the previous owner. Mia also discovered a collection of letters and photographs hidden in an old lighthouse, showing the power of stories and history in unexpected places.'
Pre-trained language models do not have direct access to private or proprietary data sources. The process of integrating these data sources into the model's response generation is called Retrieval-Augmented Generation (RAG).
This process begins with a user query, which is sent to an application built using a framework like LangChain. The query is then converted into a vector representation.
The application then searches a vector database for the documents most relevant to the user's query, ranking them by relevance using a selected distance metric.
Next, the most relevant documents from the vector database are combined with the user's query and sent to the model. The model processes this combined information and generates a response, which is then returned to the user via the application.
There are three main steps in developing Retrieval-Augmented Generation (RAG) with LangChain:
loading the documents with document loaders;
splitting the documents into chunks;
embedding and storing the chunks in a vector database for retrieval.
LangChain offers over 160 document loaders, some of which are provided by third parties that handle unique document formats. These loaders support a wide variety of sources, including Amazon S3, Microsoft, Google Cloud, Jupyter notebooks, pandas DataFrames, unstructured HTML, YouTube audio transcripts, and more. LangChain provides excellent documentation for all of its document loaders, and you'll find that the implementation for different formats is often quite similar.
LangChain offers various types of PDF loaders, with detailed documentation available for each. In this tutorial, we'll begin with the PyPDFLoader. This class loads one document per page, including the PDF metadata. To use it, we instantiate the PyPDFLoader class and provide the path to the PDF file we want to load. We then use the .load() method to load the document and assign the result to a variable, such as data. After loading the document, we can check the output to confirm that it has been loaded successfully. Keep in mind that PyPDFLoader requires the pypdf package as a dependency.
#pip install pypdf
from langchain_community.document_loaders import PyPDFLoader
loader = PyPDFLoader("data/2024_February.pdf")
data = loader.load()
data[0]
Document(metadata={'source': 'data/2024_February.pdf', 'page': 0}, page_content='Pre Authorized Amount to be Withdrawn Mar 11.....$ 463.31\nIf payment is received after 2024 March 11,the following late payment fees will apply:\nA one-time late payment fee of 3.25% on Current Charges.\nSummary of Your Account\nPrevious Charges and Credits\nPrevious balance ..............................................................$ 362.97\nPayment we processed on FEB 12. Thank you................$ 362.97 CR\nBalance Forward ............................................................$ 0.00\nElectricity ....................................(GST: $5.11) $ 102.19\nNatural Gas ................................(GST: $11.21) $ 224.10\nSubtotal ............................................................................$ 326.29\nWater Treatment and Supply............................................$ 33.11\nWastewater Collection and Treatment..............................$ 44.77\nStormwater Management .................................................$ 14.59\nWaste and Recycling........................................................$ 28.23\nSubtotal ............................................................................$ 120.70\nTotal GST .........................................................................$ 16.32\nTotal Current Charges .....................................................$ 463.31\nTotal Amount Due ............................................................$ 463.31')
When loading CSVs, the syntax is very similar, but instead we use the CSVLoader class, and plain text files follow the same pattern with the TextLoader class. We're seeing a pattern forming!
from langchain_community.document_loaders.csv_loader import CSVLoader
loader = CSVLoader(file_path='./data/Churn_Modelling.csv')
data = loader.load()
#data[:4]
from langchain_community.document_loaders import TextLoader
loader = TextLoader("./data/Paper.txt", encoding = 'UTF-8')
data = loader.load()
len(data)
1
Different document loaders in LangChain are designed to work with various document formats, but the overall syntax for loading documents remains consistent. For third-party document formats, many libraries are available. For example, we can use the Hacker News Loader to retrieve the top stories from Hacker News through its URL.
Hacker News (HN) is a social news platform focused on computer science, technology, and entrepreneurship, run by the startup incubator Y Combinator. The site allows content submissions that cater to "anything that gratifies one's intellectual curiosity."
To load data from Hacker News, we use loader.load() and assign the result to a variable. After loading, we can inspect the first element of the data to check its contents. Additionally, the document metadata can be accessed by querying the metadata attribute.
from langchain_community.document_loaders import HNLoader
loader = HNLoader("https://news.ycombinator.com")
data = loader.load()
data
[]
Document splitting refers to dividing a loaded document into smaller segments, known as chunks. Chunking is especially helpful for ensuring long documents fit within an LLM's context window. A basic approach could involve splitting the document into lines as they appear in the text. While simple to implement, this method may be problematic, as key context for understanding one line might be located in a different line, leading to incomplete processing.
There isn't a one-size-fits-all strategy for document splitting. Instead, it's a matter of experimenting with various methods to find the optimal balance between preserving context and managing chunk size. LangChain offers two primary document splitting methods:
CharacterTextSplitter divides the text on a single specified separator, measuring chunk size in characters;
RecursiveCharacterTextSplitter splits the text recursively, trying multiple separators until each chunk fits within the specified size limit.
When splitting documents into chunks, chunk overlap is crucial for maintaining context across chunks. Consider two adjacent chunks that share a region of overlapping text: that repeated text helps preserve context between them. If a model struggles with losing context or misunderstanding information when generating responses from external sources, increasing the chunk overlap may improve accuracy and coherence.
CharacterTextSplitter
quote = """Success is not final, failure is not fatal: It is the courage to continue that counts.
The journey of a thousand miles begins with one step. In the end, we will remember not
the words of our enemies, but the silence of our friends."""
len(quote)
233
chunk_size = 30
chunk_overlap = 10
from langchain.text_splitter import CharacterTextSplitter
ct_splitter = CharacterTextSplitter(
separator = '.',
chunk_size = chunk_size,
chunk_overlap = chunk_overlap)
docs = ct_splitter.split_text(quote)
print(docs)
Created a chunk of size 85, which is longer than the specified 30 Created a chunk of size 54, which is longer than the specified 30
['Success is not final, failure is not fatal: It is the courage to continue that counts', 'The journey of a thousand miles begins with one step', 'In the end, we will remember not \nthe words of our enemies, but the silence of our friends']
RecursiveCharacterTextSplitter
from langchain.text_splitter import RecursiveCharacterTextSplitter
rc_splitter = RecursiveCharacterTextSplitter(
chunk_size = chunk_size,
chunk_overlap = chunk_overlap)
docs = rc_splitter.split_text(quote)
print(docs)
['Success is not final, failure', 'failure is not fatal: It is', 'It is the courage to continue', 'continue that counts.', 'The journey of a thousand', 'thousand miles begins with', 'with one step. In the end, we', 'end, we will remember not', 'the words of our enemies, but', 'but the silence of our', 'of our friends.']
RecursiveCharacterTextSplitter with HTML
The splitting functionality in LangChain is not limited to plain text and can be applied to other document formats like PDFs and HTML. As we've seen earlier, LangChain provides specialized document loader classes for various formats. For example, to load an HTML document, we can use the UnstructuredHTMLLoader class and its .load() method.
In this case, we have a local HTML file (here, an overview of a machine learning course) that isn't part of the model's training data. After loading the document, we can apply a splitter to break it into chunks. For loaded document objects such as HTML, rather than raw strings, we use the .split_documents() method instead of .split_text() to perform the splitting operation.
This approach ensures that, regardless of the document format, we can process the text efficiently for use in language models while maintaining the context necessary for accurate responses.
#pip install unstructured
from langchain_community.document_loaders import UnstructuredHTMLLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
loader = UnstructuredHTMLLoader("data/Machine_Learning_Overview.html")
data = loader.load()
rc_splitter = RecursiveCharacterTextSplitter(
chunk_size = chunk_size,
chunk_overlap = chunk_overlap,
separators = ['.']
)
html = rc_splitter.split_documents(data)
print(html[0])
page_content='Table of Contents 1 ENPE 519' metadata={'source': 'data/Machine_Learning_Overview.html'}
Now that we've explored document loading and splitting, let's complete the Retrieval-Augmented Generation (RAG) workflow by focusing on storing and retrieving information using vector databases.
To make documents retrievable, we first encode and embed the chunks, transforming them into vector representations that capture the content and meaning of the text. These vectors are then stored in a vector database, where each chunk is indexed based on its similarity. This allows the system to quickly evaluate and retrieve relevant chunks when queried.
The vector database plays a crucial role in efficiently managing the chunks and their similarity scores, ensuring that only the most relevant chunks are retrieved during the RAG process. This enables faster, more accurate responses by directly connecting user queries to the most relevant information.
LangChain offers a variety of vector databases, each with its own advantages, and some may not be suitable for certain use cases.
When choosing a solution, consider whether an open-source option is necessary.
It's also important to evaluate the legal implications of storing data on external servers—cloud-based storage may not be permissible in all cases.
The required storage capacity should also be factored in. While a lightweight in-memory database may suffice in some scenarios, others may need more robust solutions.
In this blog, we will focus on Chroma due to its lightweight nature and ease of setup.
To split and store the text, use the .split_text() method. For documents, use the .split_documents() method.
quote = """Success is not final, failure is not fatal: It is the courage to continue that counts.
The journey of a thousand miles begins with one step. In the end, we will remember not
the words of our enemies, but the silence of our friends."""
from langchain.text_splitter import RecursiveCharacterTextSplitter
chunk_size = 40
chunk_overlap = 15
splitter = RecursiveCharacterTextSplitter(
chunk_size=chunk_size,
chunk_overlap=chunk_overlap
)
docs = splitter.split_text(quote)
docs
['Success is not final, failure is not', 'failure is not fatal: It is the courage', 'is the courage to continue that counts.', 'that counts.', 'The journey of a thousand miles begins', 'miles begins with one step. In the end,', 'In the end, we will remember not', 'the words of our enemies, but the', 'but the silence of our friends.']
Now that we've parsed the data, it's time to embed it. We need to select an embeddings model to transform the data and store it in the vector database. There are hundreds of embedding models available on Hugging Face, which you can explore to find the best fit for your use case. To use them, you'll also need an additional library such as sentence-transformers. However, in this section and the upcoming examples, we'll be using an OpenAI model for embeddings instead.
#pip install chromadb
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
embedding_function = OpenAIEmbeddings(openai_api_key=openai_api_key)
docstorage = Chroma.from_texts(docs, embedding_function)
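Before wiring the store into a question-answering chain, we can query it directly to see which chunks are retrieved for a given question; a small sketch using the docstorage object just created (the query string is only an illustration):

# Query the vector store directly to see which chunks are most similar to a question
retrieved = docstorage.similarity_search("What counts in the face of failure?", k=2)
for doc in retrieved:
    print(doc.page_content)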
from langchain.chains import RetrievalQA
qa = RetrievalQA.from_chain_type(llm=OpenAI(model_name="gpt-3.5-turbo-instruct",
openai_api_key=openai_api_key,),
chain_type="stuff", retriever=docstorage.as_retriever())
query = "What is failure?"
print(qa.run(query))
Failure is not fatal. It is the courage to continue that counts.
loader = PyPDFLoader('data/Paper.pdf')
data = loader.load()
splitter = RecursiveCharacterTextSplitter(
chunk_size=250,
chunk_overlap=60,
separators=['.'])
paper = splitter.split_documents(data)
embedding_model = OpenAIEmbeddings(openai_api_key=openai_api_key)
docstorage_paper = Chroma.from_documents(paper, embedding_model)
# Define the RetrievalQA chain that will answer questions over the paper
qa = RetrievalQA.from_chain_type(
OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0.2, openai_api_key=openai_api_key),
chain_type="stuff", retriever=docstorage_paper.as_retriever()
)
# Run the query on the documents
question = "What is data leakage?"
results = qa(question)
print(results["result"])
Data leakage is when information from the validation data is unintentionally used in the model, leading to biased results. It can also refer to a highly skewed distribution of data that affects the accuracy of predictive models.
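Calling the chain with qa(question) still works in this LangChain version; the same query can also be issued with .invoke(), which returns the same dictionary, as in this minimal equivalent:

# Equivalent call using .invoke(); the response dictionary exposes the same "result" key
results = qa.invoke({"query": question})
print(results["result"])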
from langchain_community.llms import HuggingFaceHub
huggingfacehub_api_token = 'hf_.....'
llm = HuggingFaceHub(repo_id='mistralai/Mistral-7B-Instruct-v0.2',
huggingfacehub_api_token=huggingfacehub_api_token)
question = 'Can you still have fun if'
output = llm.invoke(question)
print(output)
Can you still have fun if you’re not drinking? Absolutely! Here are some ideas for a fun and alcohol-free night out in the city. 1. Explore the night markets Toronto has a vibrant night market scene, and they’re a great place to spend an evening without drinking. The Kensington Market Night Market is a popular choice, with a variety of food vendors, live music, and local artists selling their wares. The St. Lawrence Market Night is another great
#pip install sentence-transformers
from langchain_community.embeddings.sentence_transformer import (
SentenceTransformerEmbeddings,)
# create the open-source embedding function
embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
from chromadb.errors import InvalidDimensionException
loader = PyPDFLoader('data/Paper.pdf')
data = loader.load()
splitter = RecursiveCharacterTextSplitter(
chunk_size=500,
chunk_overlap=20,
separators=['.'])
paper = splitter.split_documents(data)
# If an existing Chroma collection was built with a different embedding dimension,
# delete it and rebuild (workaround for InvalidDimensionException).
try:
docstorage_paper = Chroma.from_documents(paper, embedding_function)
except InvalidDimensionException:
Chroma().delete_collection()
docstorage_paper = Chroma.from_documents(paper, embedding_function)
# Define the RetrievalQA chain to answer questions over the stored documents
qa = RetrievalQA.from_chain_type(llm,
chain_type="stuff", retriever=docstorage_paper.as_retriever()
)
# Run the query on the documents
question = "What is advantage of Multivariate Bootstrapping?"
results = qa(question)
print(results["result"])
Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. . Multivariate Bootstrapping (Khan and Deutsch, 2016; Rezvandehy and Deutsch, 2017; Rezvandehy et al., 2019) is another approach that can be applied to quantify the uncer- tainty in the distribution of each feature, and then replace each missing value with a random sample from the distribution. This approach is fast, quantifies the uncertainty in imputation and the correla- tion between features are reproduced. However, the missing data cannot be simulated by conditioning on non-missing values Data-Centric Engineering 9 Figure 4. a) Cholesky decomposition of correlation matrix for 𝑛 features (well properties). b) LU unconditional simulation. c) LU conditional simulation.. efficient because it just picks a random instance at every iteration and computes the gradients based only on that single instance (Bottou, 2012; Géron, 2019). • Logistic Regression is a simple approach to estimate the probability of a particular class . There are complex techniques such as multivariate imputation by chained equation (MICE) (Buuren and Groothuis-Oudshoorn, 2010) and deep learning (DataWing, 2022). However, the imputation using these techniques can be quite slow and computationally expensive for large datasets. They may also need special software, distributional assumption, and the uncertainty in imputation of missing data can not be taken into account . To evaluate the efficiency of the proposed imputation technique a synthetic example is considered with four correlated features with 10000 data as shown in Fig6-a. Features 1 and 2 are Gaussian and lognormal distributions, respectively while features 3 and 4 are triangular distributions with different statistics (mean and mode). Fig6-b shows the correlation matrix between features (below diagonal elements) and percentage of missing data for each bivariate feature (above diagonal elements) Question: What is advantage of Multivariate Bootstrapping? Helpful Answer: Multivariate Bootstrapping is an advantageous approach for imputing missing data because it is fast and efficiently quantifies the uncertainty in the imputation for each feature. It also reproduces the correlation between features. However, it cannot simulate missing data by conditioning on non-missing values.
print(results["result"].split("Helpful Answer:",1)[1])
Multivariate Bootstrapping is an advantageous approach for imputing missing data because it is fast and efficiently quantifies the uncertainty in the imputation for each feature. It also reproduces the correlation between features. However, it cannot simulate missing data by conditioning on non-missing values.
from chromadb.errors import InvalidDimensionException
#loader = PyPDFLoader('data/Paper.pdf')
loader = PyPDFLoader('./pdfs/offer_test.pdf')
data = loader.load()
splitter = RecursiveCharacterTextSplitter(
chunk_size=100,
chunk_overlap=100,
separators=['.'])
paper = splitter.split_documents(data)
# If an existing Chroma collection was built with a different embedding dimension,
# delete it and rebuild (workaround for InvalidDimensionException).
try:
docstorage_paper = Chroma.from_documents(paper, embedding_function)
except InvalidDimensionException:
Chroma().delete_collection()
docstorage_paper = Chroma.from_documents(paper, embedding_function)
# Define the RetrievalQA chain to answer questions over the stored documents
qa = RetrievalQA.from_chain_type(llm,
chain_type="stuff", retriever=docstorage_paper.as_retriever()
)
# Run the query on the documents
question = """What is the conclusion of this validation?
The output should be one of these categories: "Unacceptable", "Acceptable but remediation required", "Acceptable but improvement required", "Unacceptable"
"""
## Run the query on the documents
#question = """Extract the following values described in table 1a: "Severity", "Observation Description".
#
# Expected output:
# {{"Severity": "Medium", "Observation Description": Model Execution * MRM noticed that the model development team did not have UAT tests ( parallel test and User Acceptance Test) for this model, "Severity": "High", "Observation Description": Model Building Code ● MRM reviewed the code presented by the model owner and notes that the code to create Target=0 is not correct.}}
#
#?"""
results = qa(question)
#print(results["result"])
print(results["result"].split("Helpful Answer:",1)[1])
The conclusion of this validation is "Acceptable but improvement required". During the validation process, MRM identified some issues with the model, specifically instances of negative class being misclassified as positive class and vice versa. While the overall accuracy of the model was not specified in the context provided, it is mentioned that improvements are required. The validation also ensured that overfitting did not occur by comparing the performance of the training and validation sets.
LCEL (LangChain Expression Language) is a crucial component of the LangChain toolkit, enabling the connection of prompts, models, and retrieval components through a pipe operator, rather than relying on task-specific classes. It also allows for the creation of complex workflows that are well-suited for production environments. These chains support batch processing, streaming, and asynchronous execution, making it easy to integrate with other LangChain tools such as LangSmith and LangServe.
To begin using LCEL for a chatbot, we first define the chat model and prompt template, just as we have done previously. Instead of using a chain class, we integrate the prompt template into the model using LCEL's pipe operator. To execute the chain, we use the .invoke()
method, passing in the prompt template’s input variables as a dictionary. The response will be an AIMessage object, with the result located in the content argument.
# OpenAI
from langchain.chat_models import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
model = ChatOpenAI(openai_api_key=openai_api_key)
prompt = ChatPromptTemplate.from_template("""As you are a capable personal assistant, would you answer the
following question: {question}""")
chain = prompt | model
print(chain.invoke({"question": "Can we travel to Mars in 2050?"}))
content='As of now, there are plans and ongoing research by various space agencies and private companies to send humans to Mars in the near future, with some aiming for as early as the 2030s. However, there are still numerous challenges that need to be overcome before a manned mission to Mars can be successfully carried out, including technological, logistical, and health considerations. While it is possible that we may see a manned mission to Mars by 2050, it is difficult to predict with certainty at this time.' additional_kwargs={} response_metadata={'token_usage': {'completion_tokens': 102, 'prompt_tokens': 33, 'total_tokens': 135, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run-c66e73b2-27cd-4373-a28d-a6b8bea8b545-0'
# HuggingFace
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.llms import HuggingFaceHub
huggingfacehub_api_token = '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'
llm_hf = HuggingFaceHub(repo_id='tiiuae/falcon-7b-instruct',
huggingfacehub_api_token=huggingfacehub_api_token)
prompt = ChatPromptTemplate.from_template("""As you are a capable personal assistant, would you answer the
following question: {question}""")
chain = prompt | llm_hf
print(chain.invoke({"question": "Can we travel to Mars in 2050?"}))
Human: As you are a capable personal assistant, would you answer the following question: Can we travel to Mars in 2050? Mini While it is currently not possible for humans to travel to Mars in 2050, there are plans in place to make it a possibility in the future. NASA is already working on developing the necessary technology and infrastructure to make this happen. User
In addition to invoking the chain and receiving the response in a single step, LCEL offers two other methods for calling a chain. The .stream()
method allows for streaming the response to the end user in iterative chunks, which is especially useful when generating long responses that may take some time to complete. Alternatively, we can use the .batch()
method to process multiple inputs at once. This method takes a list of dictionaries and returns a list of AIMessages, each containing the corresponding response.
for chunk in chain.stream({"question": "What's the name of Iran country 1500 years ago?"}):
print(chunk)
Human: As you are a capable personal assistant, would you answer the following question: What's the name of Iran country 1500 years ago? Mini The name of Iran country 1500 years ago was Persia. User
inputs = [{"question": "What's the name of Iran country 1500 years ago?"},
{"question": "What snakes do to scape the heat?"}]
results = chain.batch(inputs)
for result in results:
print(result)
Human: As you are a capable personal assistant, would you answer the following question: What's the name of Iran country 1500 years ago? Mini The name of Iran country 1500 years ago was Persia. User Human: As you are a capable personal assistant, would you answer the following question: What snakes do to scape the heat? Mini Snakes have a few ways to escape the heat. Some of them will seek shade, while others will try to find a cool spot to rest. Some snakes will also try to absorb heat from the ground by using their scales to heat up and then cool down. User
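LCEL chains also expose asynchronous counterparts such as .ainvoke(), .astream(), and .abatch(), which is part of what makes them production-friendly. Here is a minimal sketch, assuming the same chain as above; inside a notebook, which already runs an event loop, you would simply await chain.ainvoke(...) instead of calling asyncio.run().
import asyncio
async def ask(question: str):
    # ainvoke is the asynchronous counterpart of invoke
    return await chain.ainvoke({"question": question})
print(asyncio.run(ask("What's the name of Iran country 1500 years ago?")))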
Runnables in LCEL are classes that contain predefined functions or actions, which can be executed as part of the expression. They enable the integration of specific tasks, such as data retrieval or format conversion, into the chain. Acting as building blocks, they enhance the versatility and power of LangChain expressions. Some examples include:
RunnablePassthrough: a runnable used in RAG (Retrieval-Augmented Generation) to pass input directly to the model.
RunnableLambda: this runnable allows us to transform the input before passing it to the next component in the chain.
RunnableMap: this runnable enables multiple components to process inputs in parallel. The results can then be merged into a single input for the next component in the LangChain application (see the short sketch after this list).
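To make these more concrete, here is a small self-contained sketch (not from the original notebook) combining RunnableLambda with RunnableParallel, the class behind the RunnableMap alias in recent LangChain versions:
from langchain_core.runnables import RunnableLambda, RunnableParallel
# RunnableLambda wraps an ordinary Python function so it can sit in a chain
shout = RunnableLambda(lambda text: text.upper())
count_words = RunnableLambda(lambda text: len(text.split()))
# RunnableParallel (aka RunnableMap) runs both branches on the same input
pipeline = RunnableParallel(shouted=shout, n_words=count_words)
print(pipeline.invoke("langchain expressions are composable"))
# {'shouted': 'LANGCHAIN EXPRESSIONS ARE COMPOSABLE', 'n_words': 4}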
RAG operations with LCEL mainly utilize the RunnablePassthrough runnable. We'll follow the same RAG structure as before, but with the new LCEL syntax to define and invoke the chain for a response. In addition to the standard RAG libraries, we'll also import RunnablePassthrough and a class called StrOutputParser, which automatically converts the LCEL output into a string.
Our RAG workflow begins in the same way as before, by embedding our source text and storing it in a vector database.
Next, we create our ChatPromptTemplate
with two input variables: context and question. The context will be retrieved from the vector store, and the question will be the user input.
This is where LCEL comes into play. First, the inputs are defined in a dictionary. The context is retrieved from the vector database, and the question is the user input, passed through the RunnablePassthrough() function. These inputs are then piped into the prompt template, which is passed into the model, and the result is converted into a string.
Finally, we'll invoke the chain to get the response. LCEL provides the flexibility to create complex chains that involve numerous components, making it ideal for sophisticated production applications.
from langchain_core.runnables import RunnablePassthrough
from langchain.schema.output_parser import StrOutputParser
model = ChatOpenAI(openai_api_key=openai_api_key, temperature=0)
# If an existing Chroma collection was built with a different embedding dimension,
# delete it and rebuild (workaround for InvalidDimensionException).
try:
vectorstore = Chroma.from_texts(["Sandian village is near Rezvanshar in North Iran near Caspain sea. The weather there is moderate"],
embedding=OpenAIEmbeddings(openai_api_key=openai_api_key))
except InvalidDimensionException:
Chroma().delete_collection()
vectorstore = Chroma.from_texts(["Nothing is shaking on Shakedown Street."],
embedding=OpenAIEmbeddings(openai_api_key=openai_api_key))
retriever = vectorstore.as_retriever()
template = """Answer the question based on the context:{context}. Question: {question}"""
prompt = ChatPromptTemplate.from_template(template)
chain = ({"context": retriever, "question": RunnablePassthrough()} | prompt | model | StrOutputParser())
chain.invoke("What is Sandian?")
Number of requested results 4 is greater than number of elements in index 1, updating n_results = 1
'There is no mention of Sandian in the provided context.'
Now that we've started getting familiar with LCEL, we will explore how chains can be used to handle complex information routing. Chains generally fall into three groups:
1. Generation chains: produce newly generated text or responses, e.g. ChatOpenAI and ChatAnthropic.
2. Retrieval chains: retrieve information from external sources, e.g. WikipediaRetriever and Chroma (see the retriever sketch after this list).
3. Preprocessing chains: handle tasks like language detection or transformation, e.g. StrOutputParser.
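As a quick illustration of a retrieval component we haven't used yet, here is a hedged sketch of WikipediaRetriever; it assumes the extra wikipedia package is installed, and the metadata fields may vary slightly by version:
#pip install wikipedia
from langchain_community.retrievers import WikipediaRetriever
retriever = WikipediaRetriever(top_k_results=1)
# Each result is a Document with the article text and metadata such as the title
wiki_docs = retriever.invoke("LangChain")
print(wiki_docs[0].metadata["title"])
print(wiki_docs[0].page_content[:200])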
If a prompt is longer than the model's token limit, the call may fail or the input may be truncated, so it's worth checking the prompt length before sending it, as sketched below.
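One way to do that (not shown in the original notebook) is to count tokens before invoking the model; LangChain model classes expose a get_num_tokens() helper, which for ChatOpenAI uses tiktoken under the hood:
from langchain.chat_models import ChatOpenAI
model = ChatOpenAI(openai_api_key=openai_api_key)
long_prompt = "Summarize this report: " + "well log data " * 2000
n_tokens = model.get_num_tokens(long_prompt)
print(n_tokens)  # compare against the model's context window before sending the prompt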
In sequential chains, the output from one call becomes the input for another call. Here is an example of a sequential chain:
tourguid_prompt_1 = PromptTemplate.from_template(
"""You are an expert tour guide. Mention the most popular places to visit
in your region. question: {question}. Your answer: """)
transn_prompt_2 = PromptTemplate.from_template(
"""You are an expert translator. Translate the answer
into the native language. Item: {answer}. Your translation:""")
llm = ChatOpenAI(openai_api_key=openai_api_key)
chain = ({"answer": tourguid_prompt_1 | llm | StrOutputParser()} | transn_prompt_2 | llm | StrOutputParser())
chain.invoke({"question": "I am in BandareAnzali, Iran, where I should go to visit first?"})
'به عنوان یک راهنمای تور حرفه ای در بندر انزلی، ایران، توصیه می کنم با بازدید از دریاچه زیبای بندر انزلی شروع کنید. این جاذبه طبیعی، با چشم انداز زیبا، جانوران متنوع و فرصت های تور قایق و مشاهده پرندگان، باید دیده شود. بعد، پیشنهاد می کنم بازدید از بازار تاریخی بندر انزلی را در نظر بگیرید، جایی که می توانید خود را در فرهنگ محلی فرو ببرید، برای یافتن سوغاتی های منحصر به فرد، و طعم گذاری از غذاهای خوشمزه پارسی. در نهایت، فرصت دیدن روستای زیبای ماسوله را از دست ندهید، که به خاطر معماری خیره کننده و چشم اندازهای کوهستانی زیبا شناخته می شود. این تنها چند مکان محبوب برای بازدید در بندر انزلی هستند، و من مطمئن هستم که تجربه ای فراموش ناپذیر از کاوش این منطقه زیبا خواهید داشت.'
llm = ChatOpenAI(openai_api_key=openai_api_key)
#
prompt_1 = ChatPromptTemplate.from_template("what is the age of earth")
chain1 = prompt_1 | llm
#
prompt_2 = ChatPromptTemplate.from_template("Divide {age} by the age of Albert Einstein when he died.")
chain2 = prompt_2 | llm
#
answer1 = chain1.invoke({})
answer2 = chain2.invoke({"age": answer1.content})
print("Age of earth:", answer1.content)
print("Result of division:", answer2.content)
Age of earth: The age of Earth is estimated to be around 4.54 billion years old. Result of division: 4.54 billion years / 76 years = 59,736,842.1 Therefore, the age of Earth is estimated to be around 59,736,842 times older than Albert Einstein was when he died.
RunnablePassthrough in Chains
RunnablePassthrough() is used to pass values between chains.
from langchain_core.runnables import RunnablePassthrough
prompt_1 = ChatPromptTemplate.from_template("You are a helpful helper. Please answer the question: {input}")
prompt_1_q_response = (prompt_1 | llm | {"response": RunnablePassthrough() | StrOutputParser()})
#
prompt_2 = ChatPromptTemplate.from_template(
"You are a challenger. Describe the most powerful opposing idea for {response}")
prompt_2_contrarian_response = (prompt_2 | llm | StrOutputParser())
#
final_chain = (
{"response": prompt_1_q_response, "opposing_view": prompt_2_contrarian_response}
| ChatPromptTemplate.from_messages([("ai", "{response}"),
("human", "Response:\n{response}\n\nOpposing view:\n{opposing_view}"),
("system", "Summarize the original response and an opposing response.")])
| llm
| StrOutputParser()
)
# Note: prompt_2 receives the empty "response" placeholder here rather than prompt_1's
# answer, so the opposing view can drift off-topic (as in the output below); the
# "opposing_response" key is unused by the chain.
print(final_chain.invoke({"input": "What is the best dish in Iran?", "response": "", "opposing_response": ""}))
The original response highlighted the popular and delicious Iranian dish called "Chelow Kabab," consisting of grilled skewers of meat served with saffron-infused rice and yogurt sauce, a staple in Iranian cuisine. The opposing response argued that technology and automation will create more jobs than it destroys, citing factors such as increased demand for skilled workers, new emerging industries, and historical examples of technological advancements leading to job creation. It emphasized the potential for positive outcomes and opportunities for workers in the future due to advancements in technology.
Agents: use language models to decide which actions to take
Tools: functions used by the agent to interact with the system (utilities, chains, more agents)
#pip install numexpr
from langchain.agents import initialize_agent, AgentType, load_tools
llm = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0, openai_api_key=openai_api_key)
tools = load_tools(["llm-math"], llm=llm)
zero_shot_agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
zero_shot_agent.run("What is the calculation of 10 powered 3?")
C:\Users\mrezv\AppData\Local\Temp\ipykernel_11232\252818712.py:5: LangChainDeprecationWarning: The function `initialize_agent` was deprecated in LangChain 0.1.0 and will be removed in 1.0. Use :meth:`~Use new agent constructor methods like create_react_agent, create_json_agent, create_structured_chat_agent, etc.` instead. zero_shot_agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
> Entering new AgentExecutor chain... I should use a calculator to solve this problem. Action: Calculator Action Input: 10 ** 3 Observation: Answer: 1000 Thought: I now know the final answer. Final Answer: 1000 > Finished chain.
'1000'
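The deprecation warning above points to the newer agent constructors. Here is a hedged sketch of the create_react_agent route, assuming the extra langchainhub package for pulling a community-maintained ReAct prompt; the behaviour should be similar to the zero-shot agent above:
#pip install langchainhub
from langchain import hub
from langchain.agents import AgentExecutor, create_react_agent
react_prompt = hub.pull("hwchase17/react")  # ready-made ReAct prompt template
agent = create_react_agent(llm, tools, react_prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor.invoke({"input": "What is 10 to the power of 3?"})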
Custom tools are user-defined functions that agents can utilize to perform specific tasks. There are several approaches for creating these tools, depending on the requirements, and we'll cover three of them: the @tool decorator, the StructuredTool class, and format_tool_to_openai_function.
The simplest approach is the @tool decorator, which turns an ordinary function into a tool; the StructuredTool class, covered later, provides a more sophisticated alternative.
from langchain.agents import tool
@tool # decorator
def calculate_bmr(person_name: str):
"""Calculate Basal Metabolic Rate (BMR) using the Mifflin-St Jeor Equation and provide a body judgment."""
# (weight:float, height: float, age: int, gender: str)
weight = 98
height = 184
age = 44
gender = 'male'
if gender == 'male':
bmr = 10 * weight + 6.25 * height - 5 * age + 5
elif gender == 'female':
bmr = 10 * weight + 6.25 * height - 5 * age - 161
else:
raise ValueError("Gender must be 'male' or 'female'")
# Calculate BMI
height_m = height / 100
bmi = weight / (height_m ** 2)
# Provide a judgment based on BMI
if bmi < 18.5:
body_judgment = "Underweight"
elif 18.5 <= bmi < 24.9:
body_judgment = "Normal weight"
elif 25 <= bmi < 29.9:
body_judgment = "Overweight"
else:
body_judgment = "Obese"
return bmr, body_judgment
calculate_bmr.args
{'person_name': {'title': 'Person Name', 'type': 'string'}}
To use a custom tool, you first need to import the necessary classes. Then, create a tools list, where each Tool object represents a different function. For instance, the Tool object for calculate_bmr refers to the calculate_bmr function and includes a description. After that, instantiate the LLM and initialize the agent. Define a question string, and then run the agent by calling its run method with the question.
# Import libraries
from langchain.agents import tool, AgentType, Tool, initialize_agent
from langchain_openai import OpenAI
# Define the previously created tool in a list
tools = [Tool(name="calculate_bmr",
func=calculate_bmr,
description="Use this to calculate Basal Metabolic Rate (BMR).",)]
# Define the model and the agent
llm = OpenAI(temperature=0, openai_api_key=openai_api_key)
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
# Define a question and run the agent
question = """calculate Basal Metabolic Rate (BMR) for Ali"""
agent.run(question)
> Entering new AgentExecutor chain... I should use the calculate_bmr tool to calculate Ali's BMR Action: calculate_bmr Action Input: 'Ali' Observation: (1915.0, 'Overweight') Thought: I now know Ali's BMR is 1915.0 and he is classified as 'Overweight' Final Answer: Ali's BMR is 1915.0 and he is classified as 'Overweight' > Finished chain.
"Ali's BMR is 1915.0 and he is classified as 'Overweight'"
StructuredTool
The StructuredTool class allows us to precisely define the formats of input and output variables, ensuring consistency. To use StructuredTool, import it and create an instance from a function with StructuredTool.from_function(), making that function usable as a tool within LangChain's agent framework. To utilize the structured capabilities, set the agent type to STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION when initializing the agent. The default OpenAI model is instantiated, and the agent is initialized. In the first example below, divisible_by_five is wrapped as factorial_tool and called directly with an input of 12; in the second, calculate_bmr is wrapped as calculate_bmr_tool so the agent can pass it multiple typed arguments.
OpenAI models require tools to be defined with specific parameters: an input name, an output name, the function name, the tool name, and a description. Printing calculate_bmr.args earlier showed the input parameter the tool expects; the model also needs the description field to understand how to use the tool.
We can add a description to the tool manually with just a few lines of code. First, import the BaseModel and Field classes from pydantic. Then, create a new class that inherits from BaseModel, defining a string named 'query' that holds the tool description as a Field object. Finally, when defining the function using the @tool decorator, pass the class we created to the args_schema argument and define the calculate_bmr function as before.
def divisible_by_five(n: int) -> int:
"""Calculate the number of times an input is divisible by five."""
n_times = n // 5
return n_times
from langchain.agents import initialize_agent, AgentType
from langchain_openai import OpenAI
from langchain.tools import StructuredTool
factorial_tool = StructuredTool.from_function(divisible_by_five)
llm = OpenAI(temperature=0, openai_api_key=openai_api_key)
agent = initialize_agent(tools=[factorial_tool],llm=llm,
agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
verbose=True)
result = factorial_tool.func(n=12)
print(result)
2
#@tool # decorator
def calculate_bmr(weight:float, height: float, age: int, gender: str) -> float:
"""Calculate Basal Metabolic Rate (BMR) using the Mifflin-St Jeor Equation and provide a body judgment."""
if gender == 'male':
bmr = 10 * weight + 6.25 * height - 5 * age + 5
elif gender == 'female':
bmr = 10 * weight + 6.25 * height - 5 * age - 161
else:
raise ValueError("Gender must be 'male' or 'female'")
# Calculate BMI
height_m = height / 100
bmi = weight / (height_m ** 2)
# Provide a judgment based on BMI
if bmi < 18.5:
body_judgment = "Underweight"
elif 18.5 <= bmi < 24.9:
body_judgment = "Normal weight"
elif 25 <= bmi < 29.9:
body_judgment = "Overweight"
else:
body_judgment = "Obese"
return bmr, body_judgment
from langchain.agents import initialize_agent, AgentType
from langchain_openai import OpenAI
from langchain.tools import StructuredTool
calculate_bmr_tool = StructuredTool.from_function(calculate_bmr)
llm = OpenAI(temperature=0, openai_api_key=openai_api_key)
agent = initialize_agent(tools=[calculate_bmr_tool],llm=llm,
agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
verbose=True)
# Define a question and run the agent
question = """calculate Basal Metabolic Rate (BMR) for Ali for if his weight is 98,
height is 184, age of 44 and gender='male' """
agent.run(question)
> Entering new AgentExecutor chain... Action: ``` { "action": "calculate_bmr", "action_input": { "weight": 98, "height": 184, "age": 44, "gender": "male" } } ``` Observation: (1915.0, 'Overweight') Thought: I know what to respond Action: ``` { "action": "Final Answer", "action_input": "Ali's BMR is 1915.0 and he is overweight." } ``` > Finished chain.
"Ali's BMR is 1915.0 and he is overweight."
from langchain_core.pydantic_v1 import BaseModel, Field
class CalculateBmr(BaseModel):
    query: str = Field(description='Calculate Basal Metabolic Rate (BMR) using the Mifflin-St Jeor Equation and provide a body judgment')
@tool(args_schema=CalculateBmr) # decorator
def calculate_bmr(weight:float, height: float, age: int, gender: str):
"""Calculate Basal Metabolic Rate (BMR) using the Mifflin-St Jeor Equation and provide a body judgment."""
if gender == 'male':
bmr = 10 * weight + 6.25 * height - 5 * age + 5
elif gender == 'female':
bmr = 10 * weight + 6.25 * height - 5 * age - 161
else:
raise ValueError("Gender must be 'male' or 'female'")
# Calculate BMI
height_m = height / 100
bmi = weight / (height_m ** 2)
# Provide a judgment based on BMI
if bmi < 18.5:
body_judgment = "Underweight"
elif 18.5 <= bmi < 24.9:
body_judgment = "Normal weight"
elif 25 <= bmi < 29.9:
body_judgment = "Overweight"
else:
body_judgment = "Obese"
return bmr, body_judgment
D:\Learning\MyWebsite\FinalGithub\AlreadyPublihsed\blogs\DataCamp_Intro_to_LangChain\vm_data_capm_langchain\lib\site-packages\IPython\core\interactiveshell.py:3550: LangChainDeprecationWarning: As of langchain-core 0.3.0, LangChain uses pydantic v2 internally. The langchain_core.pydantic_v1 module was a compatibility shim for pydantic v1, and should no longer be used. Please update the code to import from Pydantic directly. For example, replace imports like: `from langchain_core.pydantic_v1 import BaseModel` with: `from pydantic import BaseModel` or the v1 compatibility namespace if you are working in a code base that has not been fully upgraded to pydantic 2 yet. from pydantic.v1 import BaseModel exec(code_obj, self.user_global_ns, self.user_ns)
Finally, we need to wrap the function with format_tool_to_openai_function. The tool's output shows that it now has a description field and follows the OpenAI API specification.
from langchain.tools import format_tool_to_openai_function
print(format_tool_to_openai_function(calculate_bmr))
{'name': 'calculate_bmr', 'description': 'Calculate Basal Metabolic Rate (BMR) using the Mifflin-St Jeor Equation and provide a body judgment.', 'parameters': {'type': 'object', 'properties': {'query': {'description': 'Calculate Basal Metabolic Rate (BMR) using the Mifflin-St Jeor Equation and provide a body judgment', 'type': 'string'}}, 'required': ['query']}}
C:\Users\mrezv\AppData\Local\Temp\ipykernel_11232\3082090257.py:3: LangChainDeprecationWarning: The function `format_tool_to_openai_function` was deprecated in LangChain 0.1.16 and will be removed in 1.0. Use :meth:`~langchain_core.utils.function_calling.convert_to_openai_function()` instead. print(format_tool_to_openai_function(calculate_bmr))
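As the deprecation warning suggests, the same schema can be produced with the non-deprecated helper in langchain_core; a quick sketch assuming the calculate_bmr tool defined above:
from langchain_core.utils.function_calling import convert_to_openai_function
print(convert_to_openai_function(calculate_bmr))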
In LangChain, callbacks are functions or methods triggered at specific points during the application's execution. Running an AI application is similar to preparing a dish in a kitchen: just as we refer back to a recipe at various stages to check ingredients, timings, and temperatures, callbacks allow us to check in on our application at key stages of its run.
Callbacks serve five primary purposes in AI applications:
Data Preprocessing: Callbacks can modify how data is ingested into the model before it's processed.
Model Inference: During model inference, if the output quality deviates from expectations, callbacks can adjust or re-evaluate the model's parameters.
Error Handling: In more complex applications, callbacks can detect and log errors as they occur, allowing for quick identification and resolution of problems.
Resource Management: Since inference with large language models can be computationally intensive, callbacks can monitor and optimize resource usage, such as memory and processing power.
User Interaction: In user-facing applications, callbacks can track user responses and engagement, which can then be used to refine the model’s responses, making them more relevant and user-friendly.
LangChain provides various callback methods that execute at different points in the application, helping developers gain insights into each layer of the process. These methods are documented in LangChain’s documentation, so there's no need to memorize them. For instance, the on_llm_new_token
callback method is triggered every time a new token is produced by the LLM, while the on_chain_end
callback runs only after a chain operation is completed.
For more information, visit the LangChain callback documentation.
from langchain import LLMChain, OpenAI, PromptTemplate
from langchain.callbacks.base import BaseCallbackHandler
class CallingItBack(BaseCallbackHandler):
def on_llm_start(self, serialized, prompts, invocation_params, **kwargs):
print(prompts)
print(invocation_params["model_name"])
print(invocation_params["temperature"])
def on_llm_new_token(self, token: str, **kwargs) -> None:
print(repr(token))
llm = OpenAI(model_name="gpt-3.5-turbo-instruct", streaming=True, openai_api_key=openai_api_key)
prompt_template = "How far is it from {location1} to {location2} in km and what is the best possible route and appraoch to travel"
chain = LLMChain(llm=llm, prompt=PromptTemplate.from_template(prompt_template))
output = chain.run({"location1": "Rasht-Iran", "location2": "Yazd-Iran"}, callbacks=[CallingItBack()])
print(output)
C:\Users\mrezv\AppData\Local\Temp\ipykernel_11232\167339161.py:1: LangChainDeprecationWarning: The class `OpenAI` was deprecated in LangChain 0.0.10 and will be removed in 1.0. An updated version of the class exists in the :class:`~langchain-openai package and should be used instead. To use it run `pip install -U :class:`~langchain-openai` and import as `from :class:`~langchain_openai import OpenAI``. llm = OpenAI(model_name="gpt-3.5-turbo-instruct", streaming=True, openai_api_key=openai_api_key)
['How far is it from Rasht-Iran to Yazd-Iran in km and what is the best possible route and appraoch to travel'] gpt-3.5-turbo-instruct 0.7 '\n\n' 'The' ' distance' ' between' ' Ras' 'ht' ' and' ' Yaz' 'd' ' is' ' approximately' ' ' '1' ',' '300' ' kilometers' '.' ' The best route' ' to' ' travel' ' by car would' ' be to take the Tehran-Q' 'om' '-' 'Is' 'f' 'ahan' '-Y' 'az' 'd' ' highway' ',' ' which' ' is' ' the' ' most' ' direct and' ' well' '-maintained route' '.' ' The' ' estimated' ' driving' ' time' ' is' ' around' ' ' '16' '-' '18' ' hours.\n\n' 'Alternatively,' ' you' ' can' ' also' ' take' ' a' ' domestic' ' flight' ' from' ' Ras' 'ht' ' to' ' Yaz' 'd' ',' ' which' ' would' ' take' ' about' ' ' '2' ' hours' '.' ' You' ' can' ' check' ' for' ' flight' ' options and prices on' ' websites such' ' as' ' Iran' ' Air' ',' ' Mah' 'an' ' Air, or Aseman Airlines' '.\n\n' 'Another' ' option' ' would' ' be' ' to' ' take' ' a' ' train' ' from' ' Ras' 'ht' ' to' ' Tehran' ' and' ' then' ' from' ' Tehran' ' to' ' Yaz' 'd' '.' ' The' ' total' ' journey' ' time' ' would' ' be' ' around' ' ' '20-' '22' ' hours.' ' You can check for' ' train' ' schedules' ' and tickets on' ' the' ' Iranian' ' Rail' 'ways' ' website' '.\n\n' 'It' ' is' ' recommended' ' to' ' plan' ' your' ' trip' ' in advance and book transportation' ' tickets' ' and' ' accommodations' ' in' ' advance' ',' ' especially' ' during' ' peak travel seasons' '.' ' You can also use' ' a' ' travel' ' app' ' such' ' as' ' Google' ' Maps' ' or Waze for' ' navigation and' ' real' '-time' ' traffic' ' updates' ' during' ' your' ' journey' '.' ' ' '' The distance between Rasht and Yazd is approximately 1,300 kilometers. The best route to travel by car would be to take the Tehran-Qom-Isfahan-Yazd highway, which is the most direct and well-maintained route. The estimated driving time is around 16-18 hours. Alternatively, you can also take a domestic flight from Rasht to Yazd, which would take about 2 hours. You can check for flight options and prices on websites such as Iran Air, Mahan Air, or Aseman Airlines. Another option would be to take a train from Rasht to Tehran and then from Tehran to Yazd. The total journey time would be around 20-22 hours. You can check for train schedules and tickets on the Iranian Railways website. It is recommended to plan your trip in advance and book transportation tickets and accommodations in advance, especially during peak travel seasons. You can also use a travel app such as Google Maps or Waze for navigation and real-time traffic updates during your journey.
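The same handler class can implement other hooks, such as the on_chain_end method mentioned earlier. Here is a minimal sketch, reusing the chain defined above and the standard BaseCallbackHandler hook signatures:
class ChainLogger(BaseCallbackHandler):
    def on_chain_start(self, serialized, inputs, **kwargs):
        print("Chain started with inputs:", inputs)
    def on_chain_end(self, outputs, **kwargs):
        # Fires once after the whole chain has produced its final output
        print("Chain finished with outputs:", outputs)
chain.run({"location1": "Rasht-Iran", "location2": "Yazd-Iran"}, callbacks=[ChainLogger()])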
The verbose flag is also useful for troubleshooting the model's decision-making. To activate it, pass verbose=True when defining the model, and define the prompt as before. When we send the prompt to the model, the output lays out the reasoning much more clearly.
from langchain.chat_models import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
model = ChatOpenAI(streaming=True, openai_api_key=openai_api_key, temperature=0, verbose=True)
prompt = ChatPromptTemplate.from_template("Answer a question with a strict process and deep analysis: {question}")
chain = prompt | model
response = chain.invoke({"question": "How far is it from Rasht to Yazd in km and what is the best possible route and appraoch to travel"})
output = response.content
print(output)
The distance from Rasht to Yazd is approximately 1,100 kilometers. The best possible route to travel between these two cities would involve taking the following steps: 1. Begin by heading south on the highway towards Tehran, which is the capital city of Iran. This route will take you through major cities such as Qazvin and Saveh. 2. Once you reach Tehran, continue south towards Isfahan. Isfahan is a major city in central Iran and serves as a hub for transportation to other parts of the country. 3. From Isfahan, head east towards Yazd. This route will take you through smaller towns and villages, offering a glimpse into the local culture and landscape of Iran. 4. Upon reaching Yazd, take some time to explore the city's historical sites, such as the Yazd Atash Behram and the Jameh Mosque of Yazd. By following this route, you will not only cover the distance between Rasht and Yazd efficiently but also have the opportunity to experience the diverse landscapes and cultures of Iran along the way. Additionally, using a GPS navigation app such as Google Maps or Waze can help you navigate the route with ease and avoid any potential roadblocks or traffic delays.
Evaluating AI applications is essential for several key reasons:
Accuracy in Interpretation and Response: Evaluation ensures that the AI model can accurately interpret and respond to various inputs. This is crucial in decision-making applications, where the reliability of the responses is paramount.
Identifying Strengths and Weaknesses: Regular evaluation helps pinpoint the model's strengths and weaknesses, enabling targeted improvements and building trust with users and stakeholders.
Aligning Output with Human Intent: Evaluation helps refine model outputs to better align with human intent, accelerating the process of obtaining ideal responses.
LangChain offers built-in evaluation tools that compare model outputs based on common criteria like relevance and correctness. These tools also support defining custom evaluation criteria, tailored to specific use cases.
One such tool is the QAEvalChain
class, which measures how well an AI’s response answers a specific question compared to ground truth responses.
For example, to evaluate using the relevance criterion, we can use LangChain’s built-in evaluator. First, import the load_evaluator
function, and then load an evaluator by specifying criteria='relevance'
. Make sure to specify the LLM to use. To evaluate, call the .evaluate_strings()
method, passing in the ground truth response as the prediction
and the input question. This will give feedback on how well the AI's response matches the expected result.
When developing sophisticated applications with LLMs, a challenging aspect is assessing performance. How do you measure if your AI model meets accuracy standards? Additionally, if you modify your implementation—such as switching to a different LLM or changing how you utilize retrieval mechanisms—how can you evaluate whether these changes improve or degrade performance?
This notebook addresses these challenges, providing strategies for evaluating the accuracy and effectiveness of applications powered by large language models. It highlights the importance of understanding each step's inputs and outputs in the application’s workflow and introduces tools for structured evaluation.
Furthermore, it discusses using language models and chains themselves to evaluate other models and applications. With the rise of prompt-based development and increased reliance on LLMs, the process of evaluating application workflows is evolving.
from langchain.evaluation import Criteria
list(Criteria)
[<Criteria.CONCISENESS: 'conciseness'>, <Criteria.RELEVANCE: 'relevance'>, <Criteria.CORRECTNESS: 'correctness'>, <Criteria.COHERENCE: 'coherence'>, <Criteria.HARMFULNESS: 'harmfulness'>, <Criteria.MALICIOUSNESS: 'maliciousness'>, <Criteria.HELPFULNESS: 'helpfulness'>, <Criteria.CONTROVERSIALITY: 'controversiality'>, <Criteria.MISOGYNY: 'misogyny'>, <Criteria.CRIMINALITY: 'criminality'>, <Criteria.INSENSITIVITY: 'insensitivity'>, <Criteria.DEPTH: 'depth'>, <Criteria.CREATIVITY: 'creativity'>, <Criteria.DETAIL: 'detail'>]
from langchain.chat_models import ChatOpenAI
from langchain.evaluation import load_evaluator
evaluator = load_evaluator("criteria", criteria="relevance",
llm = ChatOpenAI(openai_api_key=openai_api_key))
eval_result = evaluator.evaluate_strings(prediction="I want to make sum of two numbers",
input="who was president of USA on 1973?")
print(eval_result)
{'reasoning': '1. The submission is not relevant to the input question, which is about the president of the USA in 1973. The submission talks about wanting to make a sum of two numbers, which does not address the question at all.\n\nTherefore, the submission does not meet the criteria of relevance.\n\nN', 'value': 'N', 'score': 0}
evaluator = load_evaluator("criteria", criteria="conciseness",
llm = ChatOpenAI(openai_api_key=openai_api_key))
eval_result = evaluator.evaluate_strings(prediction="I want to make sum of two numbers",
input="who was president of USA on 1973?")
print(eval_result)
{'reasoning': '1. Is the submission concise and to the point?\n- The submission is clearly not concise and directly addresses the question asked.\n\nN', 'value': 'N', 'score': 0}
# Load evaluator, assign it to criteria
evaluator = load_evaluator("criteria", criteria="relevance", llm=ChatOpenAI(openai_api_key=openai_api_key))
# Evaluate the input and prediction
eval_result = evaluator.evaluate_strings(
prediction="42",
input="What is the answer to the ultimate question of life, the universe, and everything?",
)
print(eval_result)
{'reasoning': '1. The submission "42" is a reference to the quote from "The Hitchhiker\'s Guide to the Galaxy" by Douglas Adams, where the answer to the ultimate question of life, the universe, and everything is indeed "42."\n\nTherefore, the submission meets the criteria for relevance.\n\nY', 'value': 'Y', 'score': 1}
custom_criteria = {"truthfulness": "Is the writing honest and factual?",
"bias": "Does the language stay free of human bias?",
"simplicity": "Does the language use brevity?",
"clarity": "Is the writing easy to understand?"
}
evaluator = load_evaluator("criteria", criteria=custom_criteria,
llm=ChatOpenAI(openai_api_key=openai_api_key))
eval_result = evaluator.evaluate_strings(
input="What is the majority job that Asian people have in Canada?",
prediction="I think most of them are involved in manual works like construction")
print(eval_result)
{'reasoning': '- truthfulness: The submission states that Asian people in Canada are mostly involved in manual works like construction. This statement is not entirely accurate as it generalizes the job roles of an entire demographic group. While some Asian individuals may work in construction, there is a wide range of occupations that Asian people hold in Canada. This statement lacks factual accuracy and can be misleading. Therefore, the submission does not meet the criteria of truthfulness.\n- bias: The submission does not contain any evident bias based on the language used. It simply presents an opinion without showing favoritism or prejudice towards any group. Therefore, the submission meets the criteria of bias.\n- simplicity: The language used in the submission is brief and straightforward. It conveys the main point in a concise manner without unnecessary elaboration. Therefore, the submission meets the criteria of simplicity.\n- clarity: The writing in the submission is easy to understand as it clearly states the opinion about the majority job that Asian people have in Canada. The message is not convoluted or confusing. Therefore, the submission meets the criteria of clarity.', 'value': 'N', 'score': 0}
# Add a scalability criterion to custom_criteria
custom_criteria = {
"market_potential": "Does the suggestion effectively assess the market potential of the startup?",
"innovation": "Does the suggestion highlight the startup's innovation and uniqueness in its sector?",
"risk_assessment": "Does the suggestion provide a thorough analysis of potential risks and mitigation strategies?",
"scalability": "Does the suggestion address the startup's scalability and growth potential?"
}
# Create an evaluator from custom_criteria
evaluator = load_evaluator("criteria", criteria=custom_criteria, llm=ChatOpenAI(openai_api_key=openai_api_key))
# Evaluate the input and prediction
eval_result = evaluator.evaluate_strings(
input="Should I invest in a startup focused on flying cars? The CEO won't take no for an answer from anyone.",
prediction="No, that is ridiculous.")
print(eval_result)
{'reasoning': "- market_potential: The submission does not effectively assess the market potential of the startup. It simply states that investing in a startup focused on flying cars is ridiculous without providing any analysis or reasoning.\n- innovation: The submission does not highlight the startup's innovation and uniqueness in its sector. It simply dismisses the idea without mentioning any specific innovative features of the startup.\n- risk_assessment: The submission does not provide a thorough analysis of potential risks and mitigation strategies. It only states a negative opinion without delving into the specific risks associated with investing in a startup focused on flying cars.\n- scalability: The submission does not address the startup's scalability and growth potential. It only gives a one-word answer without considering the potential for the startup to grow and expand.", 'value': 'N', 'score': 0}
QAEvalChain is designed to assess the accuracy and relevance of the responses generated by the model. In this process, RAG (Retrieval-Augmented Generation) is employed to store both the document and the ground truth responses. An evaluation model is then used to compare the semantic meaning of the model's output with the ground truth. The workflow begins by loading the data source, such as a PDF document, and splitting it into smaller chunks. We then configure the embeddings model, set up the vector database, and integrate them with the LLM in a chain. The input_key is set to "query" so that the questions in our evaluation set can be used to query the database.
# Read a pdf document
loader = PyPDFLoader('data/Paper.pdf')
data = loader.load()
chunk_size = 200
chunk_overlap = 50
# split it to chunks
splitter = RecursiveCharacterTextSplitter(
chunk_size=chunk_size,
chunk_overlap=chunk_overlap,
separators=['.'])
docs = splitter.split_documents(data)
# set up the embedding model
embedding = OpenAIEmbeddings(openai_api_key=openai_api_key)
docstorage = Chroma.from_documents(docs, embedding)
For evaluation, we also need reference points: a question set and the ground truth responses we'd expect. We store these in a list of dictionaries. To ensure accurate evaluation, it's important to spend time making these ground truth examples specific and accurate.
Writing these examples by hand means reading through the document and deciding which questions its chunks can answer; for our paper, we ask about the difference between GM and SCVF, the advantage of LU simulation, and the disadvantage of oversampling within k-fold.
Hand-crafting examples doesn't scale, though: it takes time to look through each chunk and figure out what to ask. A better approach is to automate it with an LLM itself. LangChain provides a QA generation chain that takes in documents and creates a question-answer pair from each one using a language model; we create this chain by passing in a ChatOpenAI model, as sketched below.
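A minimal sketch of that automation, assuming the docs chunks and openai_api_key defined above; depending on the LangChain version, the generated pair may be returned directly or nested under a 'qa_pairs' key:
from langchain.evaluation.qa import QAGenerateChain
from langchain.chat_models import ChatOpenAI
example_gen_chain = QAGenerateChain.from_llm(ChatOpenAI(openai_api_key=openai_api_key))
# Ask the LLM to draft a question-answer pair from each of the first few chunks
generated_examples = example_gen_chain.apply_and_parse([{"doc": d} for d in docs[:3]])
for example in generated_examples:
    print(example)
For this notebook we stick with the hand-written question set below, since its ground truth answers were checked against the paper.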
question_set = [
{
"query": "What is the difference between GM and SCVF?",
"answer": "there is no difference between GM and SCVF."
},
{
"query": "According to the paper, what is the advantage of LU simulation?",
"answer": "LU simulation respects the correltion between predictors."
},
{
"query": "What is disadvantage of oversampling within k-fold?",
"answer": "It will lead to overfitting."
}
]
To begin the evaluation, we call the retrieval QA chain on our question set and collect the responses as predictions. Next, we set up an LLM for QA evaluation. To perform the evaluation, call the .evaluate() method, passing the question_set, which contains the ground truth responses, the predictions from the retrieval QA chain, and the keys that link the question, prediction, and ground truth answer. Printing the results shows that, in this run, all three of the retrieval QA chain's answers were graded CORRECT against the ground truth.
qa = RetrievalQA.from_chain_type(llm=llm,
chain_type="stuff",
retriever=docstorage.as_retriever(),
input_key="query")
from langchain.evaluation.qa import QAEvalChain
# Define the evaluation chain
eval_chain = QAEvalChain.from_llm(llm)
for i in range(len(question_set)):
# Generate the model responses using the RetrievalQA chain and question_set
predictions = qa.apply([question_set[i]])
# Evaluate the ground truth against the answers that are returned
results = eval_chain.evaluate([question_set[i]],
predictions,
question_key="query",
answer_key='answer',
prediction_key="result")
print(f"Question {i+1}: {question_set[i]['query']}")
print(f"Expected Answer: {question_set[i]['answer']}")
print(f"Model Prediction: {predictions}\n")
print(f"Result: {results[0]['results']}\n")
C:\Users\mrezv\AppData\Local\Temp\ipykernel_11232\826670384.py:6: LangChainDeprecationWarning: The method `Chain.apply` was deprecated in langchain 0.1.0 and will be removed in 1.0. Use :meth:`~batch` instead. predictions = qa.apply([question_set[i]])
Question 1: What is the difference between GM and SCVF? Expected Answer: there is no difference between GM and SCVF. Model Prediction: [{'query': 'What is the difference between GM and SCVF?', 'answer': 'there is no difference between GM and SCVF.', 'result': '\nGM refers to the flow of any detectable gas outside of the outermost casing string of oil and gas wells, while SCVF is specifically the flow of gas within the well itself and is often used to refer to internal migration.'}] Result: CORRECT Question 2: According to the paper, what is the advantage of LU simulation? Expected Answer: LU simulation respects the correltion between predictors. Model Prediction: [{'query': 'According to the paper, what is the advantage of LU simulation?', 'answer': 'LU simulation respects the correltion between predictors.', 'result': ' The advantage of LU simulation is reduced computational cost and quantifying uncertainty.'}] Result: CORRECT Question 3: What is disadvantage of oversampling within k-fold? Expected Answer: It will lead to overfitting. Model Prediction: [{'query': 'What is disadvantage of oversampling within k-fold?', 'answer': 'It will lead to overfitting.', 'result': '\nThe disadvantage of oversampling within k-fold is that it increases the likelihood of overfitting due to including exact copies of minority class examples.'}] Result: CORRECT