複数のエージェントを協調させてタスクを実行: openai/swarmを試す

2024年10月12日 19:35

OpenAIが発表した、複数のエージェントが協調してタスクを遂行するための実験的なフレームワークSwarmを試してみました。

個別のエージェントに特定の小さな「指示」と「機能（ツール）」を持たせて、必要に応じて他のエージェントにタスクを引き渡すことで、協調的にタスク処理を行うコンセプトのようです。

なお、Swarm エージェントは、便宜上、同様の名前が付けられていますが、アシスタント API のアシスタントとは関係なく、ステートレスなChat Completions API で動作します。このためエージェント間でデータや状態が保存されませんが、会話の履歴やコンテキストを保持して使いたい場合、会話の履歴など必要な情報を context_variables に含めることで保持させるようです。

ともかく、さくっとGoogle Colabで触ってみます。

環境準備

ライブラリのインストール

!pip install git+https://github.com/openai/swarm.git

環境変数の準備

import os
from google.colab import userdata
os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")

サンプル１：ツールの利用する単一エージェントの例

import json
from swarm import Agent


# 天気を取得するツール(モック)
def get_weather(location, time="now"):
    """Get the current weather in a given location. Location MUST be a city."""
    return json.dumps({"location": location, "temperature": "20℃", "time": time})

# メール送信ツール(モック)
def send_email(recipient, subject, body):
    print("Sending email...")
    print(f"To: {recipient}")
    print(f"Subject: {subject}")
    print(f"Body: {body}")
    return "Sent!"

# エージェント
weather_agent = Agent(
    name="Weather Agent",
    instructions="あなたは親切なエージェントです",
    # 使用可能なツール
    functions=[get_weather, send_email],
)

from swarm.repl import run_demo_loop

run_demo_loop(weather_agent, stream=True)

いい感じに、ツール（天気予報、メール送信）を呼び出して動作していることがわかります。

サンプル2：複数エージェントの協調例

機能（ツール）とエージェントの定義

from swarm import Agent


# 返品処理
def process_refund(item_id, reason="NOT SPECIFIED"):
    """Refund an item. Refund an item. Make sure you have the item_id of the form item_... Ask for user confirmation before processing the refund."""
    print(f"[mock] Refunding item {item_id} because {reason}...")
    return "Success!"

# 割引処理
def apply_discount():
    """Apply a discount to the user's cart."""
    print("[mock] Applying discount...")
    return "Applied discount of 11%"

# エージェント
triage_agent = Agent(
    name="Triage Agent",
    instructions="Determine which agent is best suited to handle the user's request, and transfer the conversation to that agent.",
)
sales_agent = Agent(
    name="Sales Agent",
    instructions="Be super enthusiastic about selling bees.",
)
refunds_agent = Agent(
    name="Refunds Agent",
    instructions="Help the user with a refund. If the reason is that it was too expensive, offer the user a refund code. If they insist, then process the refund.",
    functions=[process_refund, apply_discount],
)


def transfer_back_to_triage():
    """Call this function if a user is asking about a topic that is not handled by the current agent."""
    return triage_agent


def transfer_to_sales():
    return sales_agent


def transfer_to_refunds():
    return refunds_agent


triage_agent.functions = [transfer_to_sales, transfer_to_refunds]
sales_agent.functions.append(transfer_back_to_triage)
refunds_agent.functions.append(transfer_back_to_triage)

run_demo_loop(triage_agent)

最初トリアージエージェントがユーザーの目的を聞いてから、要望に応じて担当エージェントにバトンタッチしています。

シンプルな記述で各エージェントが使えるツールを絞ることで比較的確実に動く複数のエージェントとツールを組み合わせて、高機能なエージェントを実現するアイデアとしてとても良さそうです。

個別の機能の実装部分は、この前ためしたellを組み合わせて使っても面白いかも。

最後までお読みいただき、ありがとうございました。