Building a Local RAG Application with Ollama + LangChain

AI Hands-on Tutorials | Posted: 5 months ago | AI Sharing Circle

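This tutorial assumes you are already familiar with the following concepts:

Chat models
Chaining runnables
Embeddings
Vector stores
Retrieval-augmented generation

The popularity of projects such as llama.cpp, Ollama, and llamafile shows how important it has become to run large language models locally.

LangChain integrates with many open-source LLM providers that can run locally, and Ollama is one of them.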

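Environment setup

First, we need to set up the environment.

The Ollama GitHub repository has detailed instructions; in brief:

Download and run the Ollama application
From the command line, pull models from the Ollama model list and the list of text embedding models. This tutorial uses llama3.1:8b and nomic-embed-text as examples.
- Run ollama pull llama3.1:8b to pull the general-purpose open-source large language model llama3.1:8b
- Run ollama pull nomic-embed-text to pull the text embedding model nomic-embed-text
While the application is running, all models are served automatically at localhost:11434
Note that your choice of model should take your local hardware into account. The reference configuration for this tutorial is GPU memory > 8GB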
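
Before moving on, it can help to confirm that the server is reachable and that both models were actually pulled. The following is a minimal sketch using only the Python standard library. It assumes the default address http://localhost:11434 and Ollama's /api/tags endpoint, which lists locally available models. The helper names (model_names, missing_models, fetch_tags) are illustrative and not part of Ollama or LangChain, and the handling of untagged names as name:latest is an assumption about how the server reports them.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default address; adjust if you changed it


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from the JSON body returned by GET /api/tags."""
    return [m["name"] for m in tags_payload.get("models", [])]


def missing_models(required: list[str], available: list[str]) -> list[str]:
    """Return the required models that do not appear in the available list.

    Assumption: a model pulled without an explicit tag is listed as "name:latest".
    """
    def with_tag(name: str) -> str:
        return name if ":" in name else f"{name}:latest"

    available_tagged = {with_tag(name) for name in available}
    return [name for name in required if with_tag(name) not in available_tagged]


def fetch_tags(base_url: str = OLLAMA_URL, timeout: float = 3.0) -> dict:
    """Ask the local Ollama server which models it has; raises if unreachable."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    try:
        available = model_names(fetch_tags())
    except (urllib.error.URLError, OSError, ValueError) as exc:
        print(f"Ollama does not seem to be reachable at {OLLAMA_URL}: {exc}")
    else:
        print("missing:", missing_models(["llama3.1:8b", "nomic-embed-text"], available))
```

If the script reports a missing model, running the matching ollama pull command from the list above should resolve it.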

Next, install the packages needed for local embeddings, vector storage, and model inference.

# langchain_community
%pip install -qU langchain langchain_community
# Chroma
%pip install -qU langchain_chroma
# Ollama
%pip install -qU langchain_ollama

Note: you may need to restart the kernel to use updated packages.

You can also see this page for the full list of available embedding models.

Document loading

Now let's load and split a sample document.

We will use Lilian Weng's blog post on agents as an example.

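# Recent LangChain releases provide the splitter in langchain_text_splitters
# (install langchain-text-splitters if it is not already present)
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader

# Load the blog post as LangChain documents
loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
data = loader.load()

# Split into chunks of at most 500 characters with no overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)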
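
To make the chunk_size and chunk_overlap parameters concrete, here is a simplified sketch of fixed-window splitting in plain Python. It is not how RecursiveCharacterTextSplitter works internally: the real splitter first tries to break on separators such as paragraph and line breaks before falling back to smaller units. The function name split_fixed and the sample text are made up for illustration.

```python
def split_fixed(text: str, chunk_size: int = 500, chunk_overlap: int = 0) -> list[str]:
    """Cut text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


# Illustrative text only; the tutorial splits the loaded blog post instead.
sample = "Task decomposition breaks a large task into smaller subgoals. " * 20
chunks = split_fixed(sample, chunk_size=500, chunk_overlap=0)
print(len(sample), [len(c) for c in chunks])
```

With chunk_overlap=0, as in the tutorial, the chunks tile the text without repetition. A non-zero overlap repeats the tail of each chunk at the start of the next, which can help when an answer straddles a chunk boundary.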

Next, initialize the vector store. We use nomic-embed-text as the text embedding model.

from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings
local_embeddings = OllamaEmbeddings(model="nomic-embed-text")
vectorstore = Chroma.from_documents(documents=all_splits, embedding=local_embeddings)

We now have a local vector database! Let's quickly test a similarity search.

question = "What are the approaches to Task Decomposition?"
docs = vectorstore.similarity_search(question)
len(docs)

docs[0]

Document(metadata={'description': 'Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\nAgent System Overview In a LLM-powered autonomous agent system, LLM functions as the agent’s brain, complemented by several key components:', 'language': 'en', 'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'title': "LLM Powered Autonomous Agents | Lil'Log"}, page_content='Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.')
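
Conceptually, a similarity search embeds the question and returns the stored chunks whose vectors are closest to it. The sketch below illustrates that ranking step with cosine similarity over hand-written three-dimensional vectors. The vectors, the names toy_index and top_k, and the choice of cosine similarity are all assumptions made for illustration; the metric Chroma actually uses depends on how the collection is configured, and real embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], doc_vecs: dict[str, list[float]], k: int = 4) -> list[str]:
    """Return the names of the k documents most similar to the query vector."""
    ranked = sorted(
        doc_vecs,
        key=lambda name: cosine_similarity(query_vec, doc_vecs[name]),
        reverse=True,
    )
    return ranked[:k]


# Hand-written toy "embeddings"; a real index stores model-generated vectors.
toy_index = {
    "task decomposition": [0.9, 0.1, 0.0],
    "memory": [0.1, 0.9, 0.0],
    "tool use": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]
print(top_k(query, toy_index, k=2))  # ['task decomposition', 'memory']
```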

Next, instantiate the large language model llama3.1:8b and test that model inference works correctly:

from langchain_ollama import ChatOllama

model = ChatOllama(
    model="llama3.1:8b",
)

response_message = model.invoke(
    "Simulate a rap battle between Stephen Colbert and John Oliver"
)
print(response_message.content)

**The scene is set: a packed arena, the crowd on their feet. In the blue corner, we have Stephen Colbert, aka "The O'Reilly Factor" himself. In the red corner, the challenger, John Oliver. The judges are announced as Tina Fey, Larry Wilmore, and Patton Oswalt. The crowd roars as the two opponents face off.**
**Stephen Colbert (aka "The Truth with a Twist"):**
Yo, I'm the king of satire, the one they all fear
My show's on late, but my jokes are clear
I skewer the politicians, with precision and might
They tremble at my wit, day and night
**John Oliver:**
Hold up, Stevie boy, you may have had your time
But I'm the new kid on the block, with a different prime
Time to wake up from that 90s coma, son
My show's got bite, and my facts are never done
**Stephen Colbert:**
Oh, so you think you're the one, with the "Last Week" crown
But your jokes are stale, like the ones I wore down
I'm the master of absurdity, the lord of the spin
You're just a British import, trying to fit in
**John Oliver:**
Stevie, my friend, you may have been the first
But I've got the skill and the wit, that's never blurred
My show's not afraid, to take on the fray
I'm the one who'll make you think, come what may
**Stephen Colbert:**
Well, it's time for a showdown, like two old friends
Let's see whose satire reigns supreme, till the very end
But I've got a secret, that might just seal your fate
My humor's contagious, and it's already too late!
**John Oliver:**
Bring it on, Stevie! I'm ready for you
I'll take on your jokes, and show them what to do
My sarcasm's sharp, like a scalpel in the night
You're just a relic of the past, without a fight
**The judges deliberate, weighing the rhymes and the flow. Finally, they announce their decision:**
Tina Fey: I've got to go with John Oliver. His jokes were sharper, and his delivery was smoother.
Larry Wilmore: Agreed! But Stephen Colbert's still got that old-school charm.
Patton Oswalt: You know what? It's a tie. Both of them brought the heat!
**The crowd goes wild as both opponents take a bow. The rap battle may be over, but the satire war is just beginning...

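Building a chain

We can create a summarization chain by passing in the retrieved documents and a simple prompt.

The chain formats the prompt template with the provided input key values and passes the formatted string to the specified model: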

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
    "Summarize the main themes in these retrieved docs: {docs}"
)


# Convert the passed-in documents into a single string
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


chain = {"docs": format_docs} | prompt | model | StrOutputParser()

question = "What are the approaches to Task Decomposition?"
docs = vectorstore.similarity_search(question)
chain.invoke(docs)

'The main themes in these documents are:\n\n1. **Task Decomposition**: The process of breaking down complex tasks into smaller, manageable subgoals is crucial for efficient task handling.\n2. **Autonomous Agent System**: A system powered by Large Language Models (LLMs) that can perform planning, reflection, and refinement to improve the quality of final results.\n3. **Challenges in Planning and Decomposition**:\n\t* Long-term planning and task decomposition are challenging for LLMs.\n\t* Adjusting plans when faced with unexpected errors is difficult for LLMs.\n\t* Humans learn from trial and error, making them more robust than LLMs in certain situations.\n\nOverall, the documents highlight the importance of task decomposition and planning in autonomous agent systems powered by LLMs, as well as the challenges that still need to be addressed.'
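
The | operator in the chain above composes steps so that the output of one becomes the input of the next. The toy class below mimics only that sequencing idea in plain Python. Step, fake_model, and toy_chain are hypothetical names, and the stand-in model just reports the prompt length; LangChain's real runnables also handle batching, streaming, and async execution, none of which is modelled here.

```python
class Step:
    """A tiny stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b builds a new Step that runs a, then feeds its output into b
        return Step(lambda value: other.invoke(self.invoke(value)))


# Each stage mirrors one stage of the tutorial chain, with plain strings as "documents".
format_step = Step(lambda docs: {"docs": "\n\n".join(docs)})
prompt_step = Step(
    lambda d: "Summarize the main themes in these retrieved docs: {docs}".format(**d)
)
fake_model = Step(lambda prompt: f"[model saw {len(prompt)} characters]")  # not a real LLM

toy_chain = format_step | prompt_step | fake_model
print(toy_chain.invoke(["first chunk", "second chunk"]))
```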

Simple QA

from langchain_core.runnables import RunnablePassthrough
RAG_TEMPLATE = """
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
<context>
{context}
</context>
Answer the following question:
{question}"""
rag_prompt = ChatPromptTemplate.from_template(RAG_TEMPLATE)
chain = (
    RunnablePassthrough.assign(context=lambda input: format_docs(input["context"]))
    | rag_prompt
    | model
    | StrOutputParser()
)
question = "What are the approaches to Task Decomposition?"
docs = vectorstore.similarity_search(question)
# Run
chain.invoke({"context": docs, "question": question})

'Task decomposition can be done through (1) simple prompting using LLM, (2) task-specific instructions, or (3) human inputs. This approach helps break down large tasks into smaller, manageable subgoals for efficient handling of complex tasks. It enables agents to plan ahead and improve the quality of final results through reflection and refinement.'
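
In the chain above, RunnablePassthrough.assign keeps the incoming dictionary and replaces the context key with the formatted string, so the prompt template receives both context and question. The sketch below imitates just that dictionary behavior with a plain function and str.format. The names assign, join_chunks, and RAG_TEMPLATE_SKETCH are illustrative, the template is shortened, and plain strings stand in for Document objects.

```python
RAG_TEMPLATE_SKETCH = """<context>
{context}
</context>
Answer the following question:
{question}"""


def assign(inputs: dict, **computed) -> dict:
    """Copy inputs, then add or overwrite keys computed from the original dict."""
    result = dict(inputs)
    for key, func in computed.items():
        result[key] = func(inputs)
    return result


def join_chunks(chunks: list[str]) -> str:
    """Plain-string counterpart of format_docs."""
    return "\n\n".join(chunks)


inputs = {"context": ["chunk A", "chunk B"], "question": "What is task decomposition?"}
prepared = assign(inputs, context=lambda d: join_chunks(d["context"]))
print(RAG_TEMPLATE_SKETCH.format(**prepared))
```

Printing the formatted prompt like this is also a convenient way to check what the model will actually see before involving the LLM.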

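QA with retrieval

Finally, our QA application with semantic search (a local RAG application) can automatically retrieve the most semantically similar document fragments from the vector database based on the user's question: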

retriever = vectorstore.as_retriever()

qa_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | model
    | StrOutputParser()
)

question = "What are the approaches to Task Decomposition?"
qa_chain.invoke(question)

'Task decomposition can be done through (1) simple prompting in Large Language Models (LLM), (2) using task-specific instructions, or (3) with human inputs. This process involves breaking down large tasks into smaller, manageable subgoals for efficient handling of complex tasks.'
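
The dictionary at the head of qa_chain sends the same question down two branches: one retrieves and formats context, the other passes the question through unchanged. The sketch below reproduces that fan-out in plain Python. The keyword-overlap toy_retriever and the TOY_CHUNKS list are hypothetical stand-ins for the vector store retriever, chosen only so the example runs without Ollama or Chroma.

```python
def fan_out(steps: dict, value) -> dict:
    """Run every step on the same input and collect the results under each key."""
    return {key: step(value) for key, step in steps.items()}


# Made-up chunks standing in for the indexed blog post.
TOY_CHUNKS = [
    "Task decomposition can be done by simple prompting.",
    "Memory can be short-term or long-term.",
    "Task decomposition can also use human inputs.",
]


def _words(text: str) -> set[str]:
    return {w.strip("?.,").lower() for w in text.split()}


def toy_retriever(question: str) -> list[str]:
    """Hypothetical keyword retriever: keep chunks sharing a word with the question."""
    return [chunk for chunk in TOY_CHUNKS if _words(question) & _words(chunk)]


question = "What are the approaches to Task Decomposition?"
prompt_inputs = fan_out(
    {
        "context": lambda q: "\n\n".join(toy_retriever(q)),
        "question": lambda q: q,  # plays the role of RunnablePassthrough()
    },
    question,
)
print(prompt_inputs)
```

The resulting dictionary has exactly the two keys that the RAG prompt template expects, which is why the user only needs to pass the question string when invoking the chain.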

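Summary

Congratulations! At this point you have fully implemented a RAG application built with the LangChain framework and local models. Building on this tutorial, you can swap in different local models to compare their behavior and capabilities, or extend the application with richer and more useful features.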