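LangChain's built-in prompts and wrappers often do not take languages other than English into account, and the search tools in particular frequently pull in US sites.

This is a serious problem for the ChatGPT API, which has a token limit. When US sites dominate the top search results, less useful information is obtained, and raising the number of results to compensate consumes correspondingly more tokens. Naturally, exceeding the token limit raises an error and the run terminates abnormally.

Many of these problems arise from using the classes that the LangChain library provides as they are.

In addition, because LangChain makes the basic functionality available just by defining an agent, it is hard to understand what logic the agent actually follows.

This article digs a little deeper into LangChain and presents tips for solving these problems.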

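Use the BaseTool class for tool calls

As mentioned above, the tools and wrappers that LangChain provides currently lack flexibility.

With API wrappers, parameters that the underlying API supports often cannot be used.

For example, the Bing Search API lets you specify the market (region) of the results with 'mkt' and the UI language with 'setLang', but LangChain's BingSearchAPIWrapper has no such arguments.

The quickest fix for this kind of problem is to write the tool yourself.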

import os

import requests
from langchain.tools import BaseTool


class BingSearch(BaseTool):
    name: str = "BingSearch"
    description: str = "useful for when you need to answer questions about current events"

    def _run(self, query: str) -> str:
        """Use the tool."""
        mkt = "ja-JP"    # market: return results for Japan
        set_lang = "ja"  # language code for UI strings
        params = {"count": 5, "q": query, "mkt": mkt, "safeSearch": "Strict", "setLang": set_lang}
        headers = {"Ocp-Apim-Subscription-Key": os.getenv("BING_SUBSCRIPTION_KEY")}

        # BING_SEARCH_URL is expected to hold the Bing Web Search endpoint URL.
        search_site = requests.get(os.getenv("BING_SEARCH_URL"), headers=headers, params=params)
        search_site.raise_for_status()
        search_result = search_site.json()

        # Keep only the URL and snippet of each hit to save tokens.
        response = ""
        for item in search_result.get("webPages", {}).get("value", []):
            response += f"\n{item['url']} : {item['snippet']}"

        return response

    async def _arun(self, query: str) -> str:
        """Use the tool asynchronously."""
        raise NotImplementedError("BingSearch does not support async")

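So I wrote my own search tool (using the Bing Search API).

Nothing difficult is going on, so it is a simple one, but it specifies the language and narrows the returned values down to 'url' and 'snippet'.

The language setting strongly affects the reliability, freshness, and perspective of the information. Specifying Japan is not always the right choice, but it is worth setting, especially for freshness and real-time relevance. It is also important to keep the return value as small as possible with the token limit in mind.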
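One way to keep the return value small is to stop adding results once a size budget is reached. The following is a minimal sketch, not part of the tool above: the helper name format_results and the max_chars parameter are hypothetical, and character count is only a crude stand-in for token count.

```python
def format_results(items, max_chars=1000):
    """Join 'url : snippet' lines, stopping before the character budget is exceeded."""
    lines = []
    used = 0
    for item in items:
        line = f"{item['url']} : {item['snippet']}"
        # +1 accounts for the newline that separates this line from the others.
        if used + len(line) + 1 > max_chars:
            break
        lines.append(line)
        used += len(line) + 1
    return "\n".join(lines)


# Example input shaped like the entries of search_result["webPages"]["value"].
sample_items = [
    {"url": "https://a.example", "snippet": "x" * 10},
    {"url": "https://b.example", "snippet": "y" * 10},
]
print(format_results(sample_items, max_chars=40))
```

With a budget of 40 characters only the first result fits, so the second one is dropped instead of being cut off in the middle.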

BaseTool class:
langchain/libs/langchain/langchain/tools/base.py at 61dd92f8215daef3d9cf1734b0d1f8c70c1571c3 · langchain-ai/langchain (github.com)

BaseTool class documentation (Python):

Defining Custom Tools | 🦜️🔗 Langchain

Don't use AgentType

Each AgentType comes with its own prompt, and specifying an AgentType in initialize_agent means using that prepared prompt as is.

By default, the agent prompts look like this.

ZERO_SHOT_REACT_DESCRIPTION

PREFIX = """Answer the following questions as best you can. You have access to the following tools:"""
FORMAT_INSTRUCTIONS = """Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question"""
SUFFIX = """Begin!
Question: {input}
Thought:{agent_scratchpad}"""

CHAT_ZERO_SHOT_REACT_DESCRIPTION

SYSTEM_MESSAGE_PREFIX = """Answer the following questions as best you can. You have access to the following tools:"""
FORMAT_INSTRUCTIONS = """The way you use the tools is by specifying a json blob.
Specifically, this json should have a `action` key (with the name of the tool to use) and a `action_input` key (with the input to the tool going here).
The only values that should be in the "action" field are: {tool_names}
The $JSON_BLOB should only contain a SINGLE action, do NOT return a list of multiple actions. Here is an example of a valid $JSON_BLOB:
```
{{{{
 "action": $TOOL_NAME,
 "action_input": $INPUT
}}}}
```
ALWAYS use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action:
```
$JSON_BLOB
```
Observation: the result of the action
... (this Thought/Action/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question"""
SYSTEM_MESSAGE_SUFFIX = """Begin! Reminder to always use the exact characters `Final Answer` when responding."""
HUMAN_MESSAGE = "{input}\n\n{agent_scratchpad}"

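These alone are hard to follow, so it is worth reading the code of the classes behind each AgentType, but for now it is enough to know that the list of tools and their descriptions is inserted after PREFIX.

The important part is FORMAT_INSTRUCTIONS, which is the foundation of the agent's behavior.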
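To make the assembly concrete, here is a simplified imitation of how the pieces are joined into one template. It is a sketch written for this article and not the library's code: build_template is a hypothetical helper, FORMAT_INSTRUCTIONS is shortened to a single line, tools are plain (name, description) pairs, and the Calculator tool is made up for illustration.

```python
PREFIX = "Answer the following questions as best you can. You have access to the following tools:"
# Shortened for the example; the real FORMAT_INSTRUCTIONS is the longer text shown above.
FORMAT_INSTRUCTIONS = "Action: the action to take, should be one of [{tool_names}]"
SUFFIX = "Begin!\n\nQuestion: {input}\nThought:{agent_scratchpad}"


def build_template(tools, prefix=PREFIX, format_instructions=FORMAT_INSTRUCTIONS, suffix=SUFFIX):
    """Join prefix, tool list, format instructions and suffix into one template string."""
    tool_strings = "\n".join(f"{name}: {description}" for name, description in tools)
    tool_names = ", ".join(name for name, _ in tools)
    # Only {tool_names} is filled in here; {input} and {agent_scratchpad}
    # stay as placeholders to be filled on every agent step.
    filled_instructions = format_instructions.format(tool_names=tool_names)
    return "\n\n".join([prefix, tool_strings, filled_instructions, suffix])


tools = [
    ("BingSearch", "useful for when you need to answer questions about current events"),
    ("Calculator", "useful for arithmetic"),  # hypothetical second tool
]
template = build_template(tools)
print(template)
```

Because the tool names and descriptions end up verbatim in the prompt, the wording of each description directly affects which tool the model chooses.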

The prompt can be changed through the create_prompt method of the class behind each AgentType.

So let's actually change the prompt for ZERO_SHOT_REACT_DESCRIPTION.

from langchain.agents import AgentExecutor, ZeroShotAgent
from langchain.chains import LLMChain

# `llm` and `tools` (here including tools named BingSearch_1 and BingSearch_2)
# are assumed to be defined beforehand.
suffix = """
Follow the conditions:
- Please respond in Japanese.
- If there is not enough information to generate a "Final Answer" from the "BingSearch_1" snippet, "Thought" is set to "I use BingSearch_2 because BingSearch_1 didn't give me enough information.".

Let's get started!
Question: {input}
{agent_scratchpad}"""

prompt = ZeroShotAgent.create_prompt(
    tools=tools,
    suffix=suffix,
    input_variables=["input", "agent_scratchpad"],
)

# Check the assembled prompt.
print(prompt.template)

llm_chain = LLMChain(llm=llm, prompt=prompt)
tool_names = [tool.name for tool in tools]

agent_executor = AgentExecutor.from_agent_and_tools(
    agent=ZeroShotAgent(llm_chain=llm_chain, allowed_tools=tool_names),
    tools=tools,
    verbose=True,
    max_iterations=5,
    handle_parsing_errors=True,
)

agent_executor.run("Enter your question here")

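Only the end of the prompt was changed slightly, but this is enough to make the agent respond in Japanese.

It also lets you influence tool calls to some extent.

If you then set the description of a tool (BingSearch_2 in the code above) to something like "Use only when explicitly named in a Thought", you get a tool that is called only under specific conditions.

As a handy feature, passing max_iterations when defining the agent caps the number of iterations of the chain. Setting handle_parsing_errors=True stops output-parser errors from aborting the run (the error is fed back to the LLM as an observation instead), so it is worth setting if parser errors occur often (it can also be used with initialize_agent).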
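To show what these two options control, the loop below is a toy imitation of an agent's Thought/Action cycle. It is a sketch under my own assumptions and not LangChain's implementation: parse_step and run_toy_agent are hypothetical names, the LLM is any callable taking (question, scratchpad), tools are a plain dict of callables, and the stop message is invented.

```python
import re


def parse_step(text):
    """Return ("final", answer) or ("action", tool, tool_input); raise ValueError otherwise."""
    if "Final Answer:" in text:
        return ("final", text.split("Final Answer:")[-1].strip())
    match = re.search(r"Action:\s*(.*?)\nAction Input:\s*(.*)", text, re.DOTALL)
    if match is None:
        raise ValueError(f"Could not parse LLM output: {text!r}")
    return ("action", match.group(1).strip(), match.group(2).strip())


def run_toy_agent(llm, tools, question, max_iterations=5, handle_parsing_errors=True):
    scratchpad = ""
    for _ in range(max_iterations):
        output = llm(question, scratchpad)
        try:
            step = parse_step(output)
        except ValueError:
            if not handle_parsing_errors:
                raise  # without the option, a malformed reply ends the run
            # With the option, the failure becomes an observation and the loop continues.
            scratchpad += f"{output}\nObservation: Invalid or incomplete response\nThought:"
            continue
        if step[0] == "final":
            return step[1]
        _, tool_name, tool_input = step
        observation = tools[tool_name](tool_input)
        scratchpad += f"{output}\nObservation: {observation}\nThought:"
    return "Agent stopped: iteration limit reached."


# Scripted stand-in for an LLM: one malformed reply, one tool call, one final answer.
scripted = iter([
    "garbage",
    "I should search\nAction: BingSearch\nAction Input: weather",
    "I now know the final answer\nFinal Answer: sunny",
])
answer = run_toy_agent(lambda q, s: next(scripted), {"BingSearch": lambda q: f"result for {q}"}, "weather?")
print(answer)
```

Note that a malformed reply still uses up one iteration in this sketch, so a model that keeps producing unparsable output runs into max_iterations rather than looping forever.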

The assembled prompt is available through the prompt's template attribute, so print it when you need to check it.

ZERO_SHOT_REACT_DESCRIPTION (ZeroShotAgent class):
langchain/libs/langchain/langchain/agents/mrkl/base.py at bd2e298468447845d4d8de3b5d2f6772e862973e · langchain-ai/langchain (github.com)

CHAT_ZERO_SHOT_REACT_DESCRIPTION (ChatAgent class):
langchain/libs/langchain/langchain/agents/chat/base.py at bd2e298468447845d4d8de3b5d2f6772e862973e · langchain-ai/langchain (github.com)

Prompt template for ZERO_SHOT_REACT_DESCRIPTION:
langchain/libs/langchain/langchain/agents/mrkl/prompt.py at e83250cc5f4dc5edd1ae8fb0a41c40454d13fb9d · langchain-ai/langchain (github.com)

Prompt template for CHAT_ZERO_SHOT_REACT_DESCRIPTION:
langchain/libs/langchain/langchain/agents/chat/prompt.py at e83250cc5f4dc5edd1ae8fb0a41c40454d13fb9d · langchain-ai/langchain (github.com)

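Capturing logs

Running an agent prints most of the information to standard output, but I often wanted to capture it in a variable, so here is how.

Pass return_intermediate_steps=True when defining the agent, as shown below. In this case, do not use the run method, which supports only a single output key; call the AgentExecutor directly with the input instead (this also works with initialize_agent).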

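# `llm_chain`, `tool_names` and `tools` are the ones defined in the previous example.
agent_executor = AgentExecutor.from_agent_and_tools(
    agent=ZeroShotAgent(llm_chain=llm_chain, allowed_tools=tool_names),
    tools=tools,
    verbose=True,
    max_iterations=7,
    handle_parsing_errors=True,
    return_intermediate_steps=True,
)

# Call the executor directly with a dict instead of using run().
response = agent_executor({"input": "Enter your question here"})

# Single quotes inside the f-string avoid a syntax error on Python versions before 3.12.
print(f"log : {response['intermediate_steps']}")
print(f"answer : {response['output']}")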
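Once the steps are in a variable, they can be reshaped for logging. The sketch below assumes that each entry of intermediate_steps is an (action, observation) pair whose action exposes tool, tool_input and log attributes, which is my understanding of LangChain's AgentAction. To stay runnable without LangChain, it uses a namedtuple stand-in, and summarize_steps is a hypothetical helper.

```python
from collections import namedtuple

# Stand-in with the same attribute names as LangChain's AgentAction (an assumption of this sketch).
AgentAction = namedtuple("AgentAction", ["tool", "tool_input", "log"])


def summarize_steps(intermediate_steps):
    """Turn (action, observation) pairs into plain dicts that are easy to store or print."""
    rows = []
    for action, observation in intermediate_steps:
        rows.append({
            "tool": action.tool,
            "input": action.tool_input,
            "observation": observation,
        })
    return rows


# Hand-written sample shaped like response["intermediate_steps"].
sample_steps = [
    (
        AgentAction(
            tool="BingSearch",
            tool_input="LangChain",
            log="I should search.\nAction: BingSearch\nAction Input: LangChain",
        ),
        "https://example.com : sample snippet",
    ),
]
for row in summarize_steps(sample_steps):
    print(row)
```

Plain dicts like these can be dumped to JSON or appended to a log file, which is harder to do with the raw printed output.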

Access intermediate steps documentation (Python):

Access intermediate steps | 🦜️🔗 Langchain

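Conclusion

LangChain is a convenient library, but partly because it has not been public for long, there are many situations where it does not quite reach the itchy spots.

The official documentation also does not cover much of the lower-level detail, so there are many cases where the only option is to read the code on GitHub.

I hope this article is helpful to as many people as possible.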