How to Quickly Run DeepSeek-R1 Locally in a Google Colab Runtime

January 29, 2025, 18:11

The sample code is available here:

https://colab.research.google.com/drive/1GeDvF2jJqBGEX_ppWkXrftlVMYvsqZvv?usp=sharing

Explanation

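DeepSeek-R1 is said to be low-cost, but the original model is still enormous: 685 billion parameters (over 600 GB). Running it takes at least eight Nvidia H100 GPUs at roughly 5 million yen each, so running the original model as an individual is not realistic.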

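Instead, this time we will run its distilled version, "DeepSeek-R1-Distill-Qwen-1.5B", on Google Colab. In short, it is a much smaller Qwen-based model distilled from the original, and as the name suggests it has been cut down to 1.5 billion parameters. In terms of size, about 4 GB of GPU memory is enough, so it runs within Google Colab's free tier.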
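As a rough sanity check on these sizes, the memory needed for the weights alone is approximately the parameter count times the bytes per parameter. The sketch below assumes 16-bit weights for the distilled model and 8-bit weights for the original; both precisions are assumptions for illustration, and the helper name `weight_memory_gb` is hypothetical. Activations and the KV cache need extra memory on top of this, which is why a figure like 4 GB leaves some headroom.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights only, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Distilled model: 1.5 billion parameters, assumed 2 bytes each (16-bit)
print(weight_memory_gb(1.5e9, 2))  # 3.0

# Original model: 685 billion parameters, assumed 1 byte each (8-bit)
print(weight_memory_gb(685e9, 1))  # 685.0
```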

Preparation: Setting up Google Colab

Open Google Colab
Create a new notebook
From the top menu, go to "Runtime" → "Change runtime type"
Select "T4 GPU" as the hardware accelerator
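To confirm that the runtime really has a GPU attached after these steps, a quick check like the following can be run in the first cell. This is a minimal sketch; the function name `describe_gpu` is made up for this example, and it falls back to a message if PyTorch or a GPU is missing.

```python
def describe_gpu() -> str:
    """Return a one-line description of the first visible CUDA GPU."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "No CUDA GPU visible; check the runtime type"
    props = torch.cuda.get_device_properties(0)
    total_gib = props.total_memory / 1024**3
    return f"{props.name}: {total_gib:.1f} GiB"


print(describe_gpu())
```

If this prints the "No CUDA GPU visible" message, the runtime type change probably was not saved.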

Running it

When a Google Colab runtime starts, the required libraries such as PyTorch and Transformers are already installed, so no additional installation is needed.
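If you want to verify what the runtime ships with, the installed versions can be listed with the standard library. This is a small sketch; the helper `installed_version` is hypothetical, and `accelerate` is included on the assumption that `device_map="auto"` relies on it.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


for name in ("torch", "transformers", "accelerate"):
    print(name, installed_version(name) or "not installed")
```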

Step 1: Download and load the model

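from transformers import AutoModelForCausalLM, AutoTokenizer

# Hugging Face Hub ID of the distilled model
model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

# Download the weights (first run only) and load them
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",  # use the dtype stored in the checkpoint
    device_map="auto"    # place the model on the available device automatically
)

# Load the matching tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
print("Model loaded successfully!")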

Downloading the model takes about five minutes.

Step 2: Run inference

prompt = "Why is the sky blue?"

# Wrap the prompt in the chat format the model expects
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Sample up to 1024 new tokens
generated_ids = model.generate(
    **model_inputs,
    pad_token_id=tokenizer.eos_token_id,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
# Drop the prompt tokens so only the newly generated part remains
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

# Convert the token IDs back to text
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)

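You can change the question by editing the prompt = "Why is the sky blue?" line at the top. For reference, the answer to "Why is the sky blue?" was as follows. Despite being a distilled model, it gave a reasonable answer.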

<think> </think> The blue color of the sky is primarily due to a combination of factors, including atmospheric conditions and the color of the Earth's atmosphere. Here is a detailed explanation: 1. **Atmospheric Composition**: The Earth's atmosphere is composed primarily of nitrogen (78%) and oxygen (21%), with trace amounts of other gases. As sunlight enters the Earth's atmosphere, it passes through the atmosphere, and the shorter blue light is refracted (bent) more than the longer red light when entering the atmosphere. This effect is known as "refraction," and it causes the red light to appear more concentrated near the horizon and the blue light to appear more spread out in the sky. 2. **Scattering of Light**: In the Earth's atmosphere, sunlig … (truncated)
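The output starts with a <think> … </think> block that holds the model's reasoning (empty in this run), followed by the answer itself. If you only want the answer, the two parts can be separated with plain string handling. This is a minimal sketch; the helper name `split_reasoning` is made up here, and it assumes the closing </think> tag appears at most once in the response.

```python
from typing import Tuple


def split_reasoning(response: str) -> Tuple[str, str]:
    """Split a response into (reasoning, answer) around the </think> tag."""
    head, sep, tail = response.partition("</think>")
    if not sep:
        # No closing tag found: treat the whole text as the answer
        return "", response.strip()
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


reasoning, answer = split_reasoning("<think> </think> The blue color of the sky ...")
print(repr(reasoning))  # ''
print(answer)           # The blue color of the sky ...
```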

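Impressions of the distilled model

Even though it has only about 0.2% of the original model's parameters, it gave a reasonably appropriate answer. My impression is that it is roughly at the level of the first ChatGPT.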

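Other distilled models

Besides the 1.5B model tried here, distilled DeepSeek-R1 models are also available in 7B, 8B, 14B, 32B, and 70B versions. If you subscribe to Google Colab Pro (the paid plan), you should be able to select an Nvidia A100 GPU and run models up to 14B.

To switch models, just change the model name at the top of the code.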

model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
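As a rough guide to which size to pick, the sketch below chooses the largest distilled model whose 16-bit weights fit in a given amount of GPU memory. The 2 bytes per parameter and the 20% headroom are assumptions for illustration, not measured values, and the function name `largest_model_that_fits` is hypothetical. The IDs are the distilled models' Hugging Face Hub names.

```python
# Nominal size in billions of parameters -> Hugging Face Hub ID
DISTILL_MODELS = {
    1.5: "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    7: "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
    8: "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    14: "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B",
    32: "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
    70: "deepseek-ai/DeepSeek-R1-Distill-Llama-70B",
}


def largest_model_that_fits(vram_gb, bytes_per_param=2, headroom=1.2):
    """Return the Hub ID of the largest model expected to fit, or None."""
    fitting = [
        size for size in DISTILL_MODELS
        if size * bytes_per_param * headroom <= vram_gb
    ]
    return DISTILL_MODELS[max(fitting)] if fitting else None


# Example: a 40 GB GPU (an assumed A100 configuration)
model_name = largest_model_that_fits(40)
print(model_name)  # deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
```

With these assumptions a 40 GB GPU lands on the 14B model, which matches the estimate above.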