google/gemma-2b-itで拡張プロンプト

2025年1月12日 01:32

小型LLMを使用

Googleの小型LLM「Gemma-2b-it」を使ってプロンプトを拡張するコードを作成してみました。
ChatGPTのcanvasを使って作成したコードが以下です。

# Google ColabでGemma-2-2b-itを動かすサンプルコード（躍動感のある動物用プロンプト拡張版）

# 必要なライブラリをインストール
# transformers: Hugging FaceのNLPモデルを扱うライブラリ
# accelerate: モデルの高速化とマルチGPU対応をサポートするライブラリ
!pip install transformers accelerate

# 必要なライブラリのインポート
from transformers import AutoTokenizer, AutoModelForCausalLM  # モデルとトークナイザーのインポート
import torch  # PyTorchライブラリ（深層学習フレームワーク）のインポート

# モデルとトークナイザーのロード
# Hugging Faceの「google/gemma-2b-it」モデルとトークナイザーをロード
# トークナイザーは入力テキストをトークン化（数値化）し、モデルは因果言語モデル（Causal Language Model）
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it")

# GPUまたはCPUのデバイス設定
# CUDAが使用可能であればGPUを、使用不可の場合はCPUを選択
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)  # モデルを選択したデバイスに転送

### 人物生成用プロンプト拡張関数
# ユーザーの入力プロンプトをより詳細で豊かな人物描写に変換
# キャラクターの外見、服装、国籍、感情表現を強調して描写
def expand_prompt_for_character(prompt):
    expanded_prompt = (f"Create a highly detailed and emotionally expressive character: {prompt}. "
                       "Describe the character's appearance, clothing, nationality, and expression vividly. "
                       "Focus on defining unique personality traits, background, and subtle facial details. "
                       "Ensure the character's emotions are clearly conveyed through body language and expressions.")
    return expanded_prompt

### テキスト生成関数
def generate_text(prompt, max_length=256):
    # プロンプトを拡張
    expanded_prompt = expand_prompt_for_character(prompt)
    # トークナイズしてテンソル化し、デバイスに転送
    inputs = tokenizer(expanded_prompt, return_tensors="pt").to(device)
    # モデルを用いてテキスト生成
    outputs = model.generate(
        **inputs,  # トークナイズ済みのデータをモデルに入力
        max_length=max_length,  # 最大生成トークン数
        num_return_sequences=1,  # 生成する文章の数
        do_sample=True,  # サンプリングを有効化して多様な出力を生成
        temperature=0.7,  # 創造性の制御（高いほど創造的な出力）
        #eos_token_id=tokenizer.eos_token_id,  # 文の終了を示すトークン（コメントアウト中）
        #pad_token_id=tokenizer.pad_token_id,  # パディングトークン（コメントアウト中）
        early_stopping=True,  # 文の自然な終了を促進
        repetition_penalty=1.1  # 同じフレーズの繰り返しを抑制
    )
    # 生成されたトークンをデコードし、特殊トークンを除外
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# ユーザー入力の受け付け
# ユーザーからプロンプトを受け取り、テキスト生成を実行
prompt = input("Enter your prompt: ")
result = generate_text(prompt)

# 生成結果の表示
print("\nGenerated Text:\n", result)

プロンプト拡張例

「Japanese girl, long hair, 18 year's old」というキーワードを提供して得られたプロンプトの例が次の通りです。
実際には文章の途中で切れたので、区切りのよいところまでを使っています。

Akari has long, flowing silver hair that cascades down her back in a gentle breeze. Her eyes are the color of the midnight sky, and her skin is the color of the twilight sky. Her eyes are always filled with a mixture of curiosity and wisdom.

Akari's clothing is a blend of traditional Japanese aesthetics and modern practicality. She wears a long, flowing kimono adorned with delicate embroidery and floral motifs. The kimono is paired with a pair of sleek, high-waisted pants that hug her curves perfectly.

Akari is a natural performer and loves to express herself through dance and music. Her clothing reflects her vibrant personality and the joy she finds in life. She often wears a traditional Japanese dance dress that accentuates her long, flowing hair and graceful movements.

画像生成

上機能プロンプトを使って、Gensparkで画像を生成しました。

動画生成

OpenAIのSoraを使って動画も生成しました。

#ChatGPTcanvas
↓#GoogleColab
↓#Genspark #FLUX 1.1 [pro] Ultra
↓#Sora pic.twitter.com/tp5WNa9eM4
— @makemediaxx (@makemediaxyz) January 11, 2025