Release Timeline of Local LLMs

npaka

September 20, 2023, 20:49

A release timeline of the major local LLMs.

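2022

November 30: OpenAI - ChatGPT released

2023

February 24: LLaMA

The 7B, 13B, 33B, and 65B models were released in limited form to researchers.

March 13: Alpaca

Drew attention to the effectiveness of training on an instruction dataset.

March 14: OpenAI - GPT-4 released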

March 19: llama.cpp

Drew attention for fast Llama inference.

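March 20: Alpaca-LoRA

Drew attention to the effectiveness of training on an instruction dataset with LoRA.

March 30: Vicuna

Drew attention to the effectiveness of training on ShareGPT data (ChatGPT conversation logs).

April 17: RedPajama-Data-1T

A large-scale dataset created in order to build an open implementation of Llama.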

May 15: RWKV-Raven

RWKV

・RWKV/rwkv-raven-14b
・RWKV/rwkv-raven-7b
・RWKV/rwkv-raven-3b
・RWKV/rwkv-raven-1b5
・RWKV/rwkv-4-14b-pile
・RWKV/rwkv-4-7b-pile
・RWKV/rwkv-4-3b-pile
・RWKV/rwkv-4-1b5-pile
・RWKV/rwkv-4-430m-pile
・RWKV/rwkv-4-169m-pile

May 17: Rinna-3.6B

・rinna/japanese-gpt-neox-3.6b
・rinna/japanese-gpt-neox-3.6b-instruction-sft

May 18: OpenCALM

・cyberagent/open-calm-7b
・cyberagent/open-calm-3b
・cyberagent/open-calm-1b
・cyberagent/open-calm-large
・cyberagent/open-calm-small
・cyberagent/open-calm-medium

May 31: Rinna-3.6B-instruction-ppo

・rinna/japanese-gpt-neox-3.6b-instruction-ppo

July 14: RWKV-4-World

RWKV

・BlinkDL/rwkv-4-world

July 17: OpenAI - Code Interpreter released

July 19: Llama 2

・meta-llama/Llama-2-7b-hf
・meta-llama/Llama-2-13b-hf
・meta-llama/Llama-2-70b-hf
・meta-llama/Llama-2-7b-chat-hf
・meta-llama/Llama-2-13b-chat-hf
・meta-llama/Llama-2-70b-chat-hf
・meta-llama/Llama-2-7b
・meta-llama/Llama-2-13b
・meta-llama/Llama-2-70b
・meta-llama/Llama-2-7b-chat
・meta-llama/Llama-2-13b-chat
・meta-llama/Llama-2-70b-chat

July 31: Rinna-4B

・rinna/bilingual-gpt-neox-4b
・rinna/bilingual-gpt-neox-4b-8k
・rinna/bilingual-gpt-neox-4b-instruction-sft
・rinna/bilingual-gpt-neox-4b-instruction-ppo
・rinna/bilingual-gpt-neox-4b-minigpt4

August 10: Japanese StableLM Alpha-7B

・stabilityai/japanese-stablelm-base-alpha-7b
・stabilityai/japanese-stablelm-instruct-alpha-7b

August 11: AIBunCho-6B

・AIBunCho/japanese-novel-gpt-j-6b

August 14: Line-3.6B

・line-corporation/japanese-large-lm-3.6b

August 17: Japanese InstructBLIP Alpha

・stabilityai/japanese-instructblip-alpha

August 18: Line-3.6B-instruction-sft

・line-corporation/japanese-large-lm-3.6b-instruction-sft

August 22: WebLab-10B

・matsuo-lab/weblab-10b
・matsuo-lab/weblab-10b-instruction-sft

August 24: CodeLlama

Code

・codellama/CodeLlama-34b-hf
・codellama/CodeLlama-34b-Instruct-hf
・codellama/CodeLlama-34b-Python-hf
・codellama/CodeLlama-13b-hf
・codellama/CodeLlama-13b-Instruct-hf
・codellama/CodeLlama-13b-Python-hf
・codellama/CodeLlama-7b-hf
・codellama/CodeLlama-7b-Instruct-hf
・codellama/CodeLlama-7b-Python-hf

August 29: ELYZA-7B

・elyza/ELYZA-japanese-Llama-2-7b-instruct
・elyza/ELYZA-japanese-Llama-2-7b-fast-instruct
・elyza/ELYZA-japanese-Llama-2-7b
・elyza/ELYZA-japanese-Llama-2-7b-fast

September 6: Open Interpreter

Makes functionality equivalent to OpenAI's "Code Interpreter" runnable in a local environment.

September 6: Falcon 180B

The first local LLM with 180B parameters, exceeding GPT-3 (175B).

・tiiuae/falcon-180B

September 7: Heron

VLM

・turing-motors/heron-preliminary-git-Llama-2-70b-v0
・turing-motors/heron-chat-blip-ja-stablelm-base-7b-v0
・turing-motors/heron-chat-git-ELYZA-fast-7b-v0
・turing-motors/heron-chat-git-ja-stablelm-base-7b-v0

September 21: Xwin-LM

A model that overtook GPT-4 to take first place on the AlpacaEval benchmark.

・Xwin-LM/Xwin-LM-70B-V0.1
・Xwin-LM/Xwin-LM-13B-V0.1
・Xwin-LM/Xwin-LM-7B-V0.1

September 25: OpenAI - GPT-4V released

September 27: Mistral-7B-v0.1

An LLM developed by Mistral AI. Despite having only 7B parameters, it scored higher on benchmarks than larger models such as Llama 2 13B and Llama 1 34B.

・mistralai/Mistral-7B-v0.1
・mistralai/Mistral-7B-Instruct-v0.1

September 28: PLaMo-13B

・pfnet/plamo-13b

October 3: Qwen-14B

・Qwen/Qwen-14B
・Qwen/Qwen-14B-Chat

October 5: LLaVA-1.5

VLM

・liuhaotian/llava-v1.5-13b

October 10: Japanese StableLM Instruct Alpha-7B-v2

・stabilityai/japanese-stablelm-instruct-alpha-7b-v2

October 20: LLM-jp-13B

・llm-jp-13b-instruct-full-jaster-v1.0
・llm-jp-13b-instruct-full-jaster-dolly-oasst-v1.0
・llm-jp-13b-instruct-full-dolly-oasst-v1.0
・llm-jp-13b-instruct-lora-jaster-v1.0
・llm-jp-13b-instruct-lora-jaster-dolly-oasst-v1.0
・llm-jp-13b-instruct-lora-dolly-oasst-v1.0
・llm-jp-13b-v1.0
・llm-jp-1.3b-v1.0

October 25: Japanese Stable LM 3B-4E1T

・Japanese Stable LM 3B-4E1T Base
・Japanese Stable LM 3B-4E1T Instruct

October 25: Japanese Stable LM Gamma 7B

・Japanese Stable LM Base Gamma 7B
・Japanese Stable LM Instruct Gamma 7B

October 26: Stockmark-13B

・stockmark/stockmark-13b

October 27: Zephyr-7B-Beta

・HuggingFaceH4/zephyr-7b-beta

October 25: RWKV-5-World

RWKV

・BlinkDL/rwkv-5-world

October 31: Youri-7B

・rinna/youri-7b
・rinna/youri-7b-instruction
・rinna/youri-7b-chat
・rinna/youri-7b-gptq
・rinna/youri-7b-instruction-gptq
・rinna/youri-7b-chat-gptq

November 2: Japanese Stable LM Beta

・stabilityai/japanese-stablelm-base-beta-7b
・stabilityai/japanese-stablelm-base-beta-70b
・stabilityai/japanese-stablelm-instruct-beta-7b
・stabilityai/japanese-stablelm-instruct-beta-70b
・stabilityai/japanese-stablelm-base-ja_vocab-beta-7b
・stabilityai/japanese-stablelm-instruct-ja_vocab-beta-7b

November 2: CALM2

・cyberagent/calm2-7b
・cyberagent/calm2-7b-chat

November 6: OpenAI DevDay

November 4: DeepSeek Coder

Code

・deepseek-ai/deepseek-coder-33b-instruct
・deepseek-ai/deepseek-coder-33b-base
・deepseek-ai/deepseek-coder-6.7b-instruct
・deepseek-ai/deepseek-coder-6.7b-base
・deepseek-ai/deepseek-coder-5.7bmqa-base
・deepseek-ai/deepseek-coder-1.3b-instruct
・deepseek-ai/deepseek-coder-1.3b-base

November 7: PLaMo-13B-Instruct

・pfnet/plamo-13b-instruct
・pfnet/plamo-13b-instruct-nc

November 13: Japanese Stable VLM

VLM

・stabilityai/japanese-stable-vlm

November 15: ELYZA-japanese-CodeLlama-7b

Code

・elyza/ELYZA-japanese-CodeLlama-7b
・elyza/ELYZA-japanese-CodeLlama-7b-instruct

November 15: Japanese Stable CLIP

・stabilityai/japanese-stable-clip-vit-l-16

November 28: Starling-7B

・berkeley-nest/Starling-LM-7B-alpha

November 30: DeepSeek LLM

・deepseek-ai/deepseek-llm-67b-chat
・deepseek-ai/deepseek-llm-67b-base
・deepseek-ai/deepseek-llm-7b-chat
・deepseek-ai/deepseek-llm-7b-base

December 1: Qwen-72B and Qwen-Audio

・Qwen/Qwen-72B
・Qwen/Qwen-72B-Chat
・Qwen/Qwen-Audio-Chat

December 6: Shisa-7B

・augmxnt/shisa-base-7b-v1
・augmxnt/shisa-7b-v1

December 8: StableLM Zephyr 3B

・stabilityai/stablelm-zephyr-3b

December 8: StripedHyena-7B

Hyena

・togethercomputer/StripedHyena-Hessian-7B
・togethercomputer/StripedHyena-Nous-7B

December 9: Mixtral-8x7B-v0.1

・mistralai/Mixtral-8x7B-v0.1
・mistralai/Mixtral-8x7B-Instruct-v0.1

December 11: Mistral-7B-Instruct-v0.2

・mistralai/Mistral-7B-Instruct-v0.2

Note: this is v0.2 of the Instruct model based on Mistral-7B-v0.1.

December 13: phi-2

・microsoft/phi-2

December 19: Swallow

・tokyotech-llm/Swallow-7b-hf
・tokyotech-llm/Swallow-7b-instruct-hf
・tokyotech-llm/Swallow-13b-hf
・tokyotech-llm/Swallow-13b-instruct-hf
・tokyotech-llm/Swallow-70b-hf
・tokyotech-llm/Swallow-70b-instruct-hf

December 19: PowerInfer

Inference

December 21: Nekomata

・rinna/nekomata-14b
・rinna/nekomata-14b-instruction
・rinna/nekomata-7b
・rinna/nekomata-7b-instruction

December 27: ELYZA-japanese-Llama-2-13B

・elyza/ELYZA-japanese-Llama-2-13b
・elyza/ELYZA-japanese-Llama-2-13b-instruct
・elyza/ELYZA-japanese-Llama-2-13b-fast
・elyza/ELYZA-japanese-Llama-2-13b-fast-instruct

December 29: Karasu and Qarasu

・lightblue/qarasu-14B-chat-plus-unleashed
・lightblue/karasu-7B-chat-plus-unleashed
・lightblue/karasu-7B-chat
・lightblue/karasu-7B

2024

January 3: M2UGen

MLM

・M2UGen/M2UGen-MusicGen-small
・M2UGen/M2UGen-MusicGen-medium
・M2UGen/M2UGen-AudioLDM2

January 10: Phixtral

・mlabonne/phixtral-4x2_8
・mlabonne/phixtral-2x2_8

🔀 Phixtral

I made the first efficient Mixture of Experts with phi-2 models. 🥳

It combines 2 to 4 fine-tuned models and is better than each individual expert.

🤗 phixtral-2x2_8: https://t.co/XbPpsF76vN
🤗 phixtral-4x2_8: https://t.co/9xfRd46585 pic.twitter.com/coRpRIxG2V
— Maxime Labonne (@maximelabonne) January 9, 2024

January 16: Stable Code 3B

Code

・stabilityai/stable-code-3b

January 20: StableLM 2 1.6B

・stabilityai/stablelm-2-1_6b

January 22: Stable LM 2 1.6B

・stabilityai/stablelm-2-1_6b
・stabilityai/stablelm-2-zephyr-1_6b

January 23: Orion-14B

・OrionStarAI/Orion-14B-Base
・OrionStarAI/Orion-14B-Chat
・OrionStarAI/Orion-14B-LongChat
・OrionStarAI/Orion-14B-Chat-RAG
・OrionStarAI/Orion-14B-Chat-Plugin
・OrionStarAI/Orion-14B-Base-Int4
・OrionStarAI/Orion-14B-Chat-Int4

January 23: Yi-VL-34B

・01-ai/Yi-VL-34B
・01-ai/Yi-VL-6B

January 29: RWKV-Eagle-7B

RWKV

・RWKV/v5-Eagle-7B

January 29: CodeLlama-70B

Code

・codellama/CodeLlama-70b-hf
・codellama/CodeLlama-70b-Instruct-hf
・codellama/CodeLlama-70b-Python-hf

January 30: LLaVA-1.6

VLM

・liuhaotian/llava-v1.6-34b
・liuhaotian/llava-v1.6-mistral-7b
・liuhaotian/llava-v1.6-vicuna-13b
・liuhaotian/llava-v1.6-vicuna-7b

January 31: KARAKURI LM

・karakuri-ai/karakuri-lm-70b-v0.1
・karakuri-ai/karakuri-lm-70b-chat-v0.1

February 4: Qwen1.5

・Qwen/Qwen1.5-0.5B
・Qwen/Qwen1.5-1.8B
・Qwen/Qwen1.5-4B
・Qwen/Qwen1.5-7B
・Qwen/Qwen1.5-14B
・Qwen/Qwen1.5-72B
・Qwen/Qwen1.5-0.5B-Chat
・Qwen/Qwen1.5-1.8B-Chat
・Qwen/Qwen1.5-4B-Chat
・Qwen/Qwen1.5-7B-Chat
・Qwen/Qwen1.5-14B-Chat
・Qwen/Qwen1.5-72B-Chat

February 7: MobileVLM V2

・mtgv/MobileVLM_V2-7B
・mtgv/MobileVLM_V2-3B
・mtgv/MobileVLM_V2-1.7B

February 9: LLM-jp 13B v1.1

・llm-jp/llm-jp-13b-dpo-lora-hh_rlhf_ja-v1.1
・llm-jp/llm-jp-13b-instruct-full-dolly_en-dolly_ja-ichikara_003_001-oasst_en-oasst_ja-v1.1
・llm-jp/llm-jp-13b-instruct-lora-dolly_en-dolly_ja-ichikara_003_001-oasst_en-oasst_ja-v1.1

February 19: kotomamba

Mamba

・kotoba-tech/kotomamba-2.8B-v1.0
・kotoba-tech/kotomamba-2.8B-CL-v1.0

February 21: Gemma

・google/gemma-7b
・google/gemma-7b-it
・google/gemma-2b
・google/gemma-2b-it

February 28: StarCoder 2

Code

・bigcode/starcoder2-3b
・bigcode/starcoder2-7b
・bigcode/starcoder2-15b

February 28: BitNet

March 3: Swallow-7B-plus

・tokyotech-llm/Swallow-7b-plus-hf

March 6: heron-blip-v1

VLM

・turing-motors/heron-chat-blip-ja-stablelm-base-7b-v1

March 11: Swallow-MS 7B

・tokyotech-llm/Swallow-MS-7b-v0.1

March 11: Swallow-MX 8x7B

・tokyotech-llm/Swallow-MX-8x7b-NVE-v0.1

March 11: Command R

・CohereForAI/c4ai-command-r-v01
・CohereForAI/c4ai-command-r-v01-4bit

March 17: Grok-1

・xai-org/grok-1

March 21: EvoVLM-JP-v1

VLM

・SakanaAI/EvoVLM-JP-v1-7B
・SakanaAI/EvoLLM-JP-v1-10B

March 21: EvoLLM-JP-v1

・SakanaAI/EvoLLM-JP-A-v1-7B
・SakanaAI/EvoLLM-JP-v1-7B

March 21: RakutenAI-7B

・Rakuten/RakutenAI-7B-instruct
・Rakuten/RakutenAI-7B-chat
・Rakuten/RakutenAI-7B

March 22: ao-Karasu-72B

・lightblue/ao-karasu-72B
・lightblue/ao-karasu-72B-AWQ-4bit

March 24: Mistral-7B-v0.2

Mistral just announced at @SHACK15sf that they will release a new model today:

Mistral 7B v0.2 Base Model

- 32k instead of 8k context window
- Rope Theta = 1e6
- No sliding window pic.twitter.com/iAuEUEOw5K
— Marvin von Hagen (@marvinvonhagen) March 23, 2024

March 27: DBRX

・databricks/dbrx-base
・databricks/dbrx-instruct

March 28: Qwen1.5-MoE

・Qwen/Qwen1.5-MoE-A2.7B-Chat
・Qwen/Qwen1.5-MoE-A2.7B
・Qwen/Qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4

March 28: Jamba

Mamba

・ai21labs/Jamba-v0.1

April 2: Qwen1.5-32B

・Qwen/Qwen1.5-32B
・Qwen/Qwen1.5-32B-Chat

April 3: LightChatAssistant

・Sdff-Ltba/LightChatAssistant-2x7B

April 4: Command R+

・CohereForAI/c4ai-command-r-plus

April 5: JetMoE-8B

・jetmoe/jetmoe-8b
・jetmoe/jetmoe-8b-sft
・jetmoe/jetmoe-8b-chat

April 5: Gemma-1.1

・google/gemma-1.1-7b-it
・google/gemma-1.1-2b-it

April 8: Stable LM 2 12B

・stabilityai/stablelm-2-12b
・stabilityai/stablelm-2-12b-chat

April 9: CodeGemma

Code

・google/codegemma-7b
・google/codegemma-7b-it
・google/codegemma-2b

April 9: RecurrentGemma

・google/recurrentgemma-2b
・google/recurrentgemma-2b-it

April 15: Idefics2

VLM

・HuggingFaceM4/idefics2-8b
・HuggingFaceM4/idefics2-8b-base

April 15: Japanese-Starling-ChatV-7B

・TFMC/Japanese-Starling-ChatV-7B
・TFMC/Japanese-Starling-ChatV-7B-GGUF

April 16: WizardLM-2 8x22B, 70B, 7B

🔥Today we are announcing WizardLM-2, our next generation state-of-the-art LLM.

New family includes three cutting-edge models: WizardLM-2 8x22B, 70B, and 7B - demonstrates highly competitive performance compared to leading proprietary LLMs.

📙Release Blog:… pic.twitter.com/bclr4aBib1
— WizardLM (@WizardLM_AI) April 15, 2024

April 17: Mixtral-8x22B-v0.1

・mistralai/Mixtral-8x22B-v0.1
・mistralai/Mixtral-8x22B-Instruct-v0.1

April 18: Llama 3

・meta-llama/Meta-Llama-3-8B
・meta-llama/Meta-Llama-3-8B-Instruct
・meta-llama/Meta-Llama-3-70B
・meta-llama/Meta-Llama-3-70B-Instruct

April 23: Suzume-Llama-3-8B

・lightblue/suzume-llama-3-8B-japanese

April 23: Phi-3-mini

・microsoft/Phi-3-mini-4k-instruct
・microsoft/Phi-3-mini-128k-instruct

April 24: OpenELM

・apple/OpenELM-270M
・apple/OpenELM-450M
・apple/OpenELM-1_1B
・apple/OpenELM-3B
・apple/OpenELM-270M-Instruct
・apple/OpenELM-450M-Instruct
・apple/OpenELM-1_1B-Instruct
・apple/OpenELM-3B-Instruct

April 24: LEIA

・leia-llm/Leia-Swallow-7b
・leia-llm/Leia-Swallow-13b

April 24: Snowflake Arctic

・Snowflake/snowflake-arctic-base
・Snowflake/snowflake-arctic-instruct

April 24: Antler-7B-Novel-Writing

・Aratako/Antler-7B-Novel-Writing
・Aratako/Antler-7B-Novel-Writing-GGUF

April 26: SniffyOtter-7B-Novel-Writing-NSFW

・Aratako/SniffyOtter-7B-Novel-Writing-NSFW
・Aratako/SniffyOtter-7B-Novel-Writing-NSFW-GGUF

April 26: Qwen1.5-110B

・Qwen/Qwen1.5-110B
・Qwen/Qwen1.5-110B-Chat

April 26: Swallow-MS-7b-instruct v0.1

・tokyotech-llm/Swallow-MS-7b-instruct-v0.1

April 29: StarCoder2-Instruct

Code

・bigcode/starcoder2-15b-instruct-v0.1

April 30: LLM-jp-13B v2.0

・llm-jp/llm-jp-13b-instruct-full-ac_001_16x-dolly-ichikara_004_001_single-oasst-oasst2-v2.0
・llm-jp/llm-jp-13b-instruct-full-ac_001-dolly-ichikara_004_001_single-oasst-oasst2-v2.0
・llm-jp/llm-jp-13b-instruct-full-dolly-ichikara_004_001_single-oasst-oasst2-v2.0
・llm-jp/llm-jp-13b-v2.0

May 1: Llama-3-Youko-8B

・rinna/llama-3-youko-8b

May 1: Ninja-v1 and Vecteus-v1

・Local-Novel-LLM-project/Ninja-v1
・Local-Novel-LLM-project/Ninja-v1-128k
・Local-Novel-LLM-project/Ninja-v1-NSFW
・Local-Novel-LLM-project/Ninja-v1-NSFW-128k
・Local-Novel-LLM-project/Vecteus-v1

May 3: Assistance

・Local-Novel-LLM-project/Assistance

May 7: DeepSeek-V2

・deepseek-ai/DeepSeek-V2

May 7: KARAKURI LM 8x7B Chat v0.1

・karakuri-ai/karakuri-lm-8x7b-chat-v0.1

We have released KARAKURI LM 8x7B Chat v0.1!

model: https://t.co/bJJ9Tad1mH
demo: https://t.co/QlWZ8W2i9n

It is probably the world's first MoE model trained on AWS Trainium.
Details are in the thread. pic.twitter.com/2wuBGPZJL5
— Tomofumi Nakayama (@txmy) May 7, 2024

May 7: KARAKURI LM 7B APM v0.1

・karakuri-ai/karakuri-lm-7b-apm-v0.1

May 9: Japanese Stable LM 2 1.6B

・stabilityai/japanese-stablelm-2-base-1_6b
・stabilityai/japanese-stablelm-2-instruct-1_6b

May 9: ArrowPro-7B-KUJIRA

・DataPilot/ArrowPro-7B-KUJIRA

May 10: ArrowPro-7B-RobinHood

・DataPilot/ArrowPro-7B-RobinHood

May 10: Ocuteus-v1

・Local-Novel-LLM-project/Ocuteus-v1

May 10: Fugaku-LLM-13B

・Fugaku-LLM/Fugaku-LLM-13B
・Fugaku-LLM/Fugaku-LLM-13B-instruct

May 13: OpenAI - GPT-4o released

May 13: Yi-1.5

・01-ai/Yi-1.5-34B-Chat
・01-ai/Yi-1.5-34B-Chat-16K
・01-ai/Yi-1.5-34B
・01-ai/Yi-1.5-34B-32K
・01-ai/Yi-1.5-9B-Chat
・01-ai/Yi-1.5-9B-Chat-16K
・01-ai/Yi-1.5-9B
・01-ai/Yi-1.5-9B-32K
・01-ai/Yi-1.5-6B-Chat
・01-ai/Yi-1.5-6B

May 14: PaliGemma

VLM

・google/paligemma-3b-pt-224
・google/paligemma-3b-pt-448
・google/paligemma-3b-pt-896
・google/paligemma-3b-mix-224
・google/paligemma-3b-mix-448

May 16: Stockmark-100b

・stockmark/stockmark-100b
・stockmark/stockmark-100b-instruct-v0.1

May 20: MiniCPM-Llama3-V 2.5

VLM

・openbmb/MiniCPM-Llama3-V-2_5

May 21: Phi-3-small (7B)

・microsoft/Phi-3-small-128k-instruct
・microsoft/Phi-3-small-8k-instruct

May 21: Phi-3-medium (14B)

・microsoft/Phi-3-medium-128k-instruct
・microsoft/Phi-3-medium-4k-instruct

May 21: Phi-3-vision

VLM

・microsoft/Phi-3-vision-128k-instruct

May 21: Ninja-v1-RP-expressive

・Aratako/Ninja-v1-RP-expressive

May 22: Mistral-7B-v0.3

・mistralai/Mistral-7B-Instruct-v0.3
・mistralai/Mistral-7B-v0.3

May 23: Aya-23

・CohereForAI/aya-23-35B
・CohereForAI/aya-23-8B

May 26: ArrowPro-7B-KillerWhale

・DataPilot/ArrowPro-7B-KillerWhale

May 29: Codestral-22B-v0.1

Code

・mistralai/Codestral-22B-v0.1

May 29: Umievo-itr012-Gleipnir-7B

・umiyuki/Umievo-itr012-Gleipnir-7B

June 1: Tanuki-8B

・hatakeyama-llm-team/Tanuki-8B
・hatakeyama-llm-team/Tanuki-8B-Instruct
・hatakeyama-llm-team/Tanuki-8B-Instruct-without-DPO

June 1: Oumuamua-7B

・nitky/Oumuamua-7b-instruct-v2

June 5: GLM-4-9B

・THUDM/glm-4-9b
・THUDM/glm-4-9b-chat
・THUDM/glm-4-9b-chat-1m
・THUDM/glm-4v-9b

June 7: Qwen2

・Qwen/Qwen2-72B-Instruct
・Qwen/Qwen2-72B
・Qwen/Qwen2-57B-A14B-Instruct
・Qwen/Qwen2-57B-A14B
・Qwen/Qwen2-7B-Instruct
・Qwen/Qwen2-7B
・Qwen/Qwen2-1.5B-Instruct
・Qwen/Qwen2-1.5B
・Qwen/Qwen2-0.5B-Instruct
・Qwen/Qwen2-0.5B

June 12: RecurrentGemma-9B

・google/recurrentgemma-9b
・google/recurrentgemma-9b-it

June 13: Llava Calm2 Siglip

VLM

・cyberagent/llava-calm2-siglip

June 14: Nemotron-4-340B

・nvidia/Nemotron-4-340B-Instruct

June 14: Sarashina1

・sbintuitions/sarashina1-65b
・sbintuitions/sarashina1-13b
・sbintuitions/sarashina1-7b

June 14: Sarashina2

・sbintuitions/sarashina2-13b
・sbintuitions/sarashina2-7b

June 20: KARAKURI LM 8x7B Instruct v0.1

・karakuri-ai/karakuri-lm-8x7b-instruct-v0.1

June 26: Llama-3-ELYZA-JP-8B

・elyza/Llama-3-ELYZA-JP-8B

June 27: Gemma 2

・google/gemma-2-27b-it
・google/gemma-2-27b
・google/gemma-2-9b-it
・google/gemma-2-9b

July 1: Llama-3-Swallow

・Llama-3-Swallow-8B-v0.1
・Llama-3-Swallow-8B-Instruct-v0.1
・Llama-3-Swallow-70B-v0.1
・Llama-3-Swallow-70B-Instruct-v0.1

July 3: CALM3-22B-Chat

・cyberagent/calm3-22b-chat

July 3: InternLM 2.5

・internlm/internlm2_5-7b-chat
・internlm/internlm2_5-7b-chat-1m
・internlm/internlm2_5-7b

July 3: InternLM-XComposer2.5

VLM

・internlm/internlm-xcomposer2d5-7b

July 16: Mathstral-7B-v0.1

Math

・mistralai/mathstral-7B-v0.1

July 16: Mamba-Codestral-7B-v0.1

Mamba, Code

・mistralai/mamba-codestral-7B-v0.1

July 18: Mistral NeMo

・mistralai/Mistral-Nemo-Base-2407
・mistralai/Mistral-Nemo-Instruct-2407

July 19: DCLM-7B

・apple/DCLM-7B
・apple/DCLM-7B-8k

July 19: Athene-70B

July 23: Llama-3.1-405B, 70B, 8B

July 24: Mistral Large 2

・mistralai/Mistral-Large-Instruct-2407

July 25: Llama 3 Youko

・rinna/llama-3-youko-70b
・rinna/llama-3-youko-70b-instruct
・rinna/llama-3-youko-8b
・rinna/llama-3-youko-8b-instruct

July 26: Llama-3.1-70B-Japanese-Instruct-2407

・cyberagent/Llama-3.1-70B-Japanese-Instruct-2407

July 30: Llama-3.1-70B-EZO-1.1-it and Llama-3.1-8B-EZO-1.1-it

・HODACHI/Llama-3.1-70B-EZO-1.1-it
・HODACHI/Llama-3.1-8B-EZO-1.1-it

July 31: Gemma 2 2B

・google/gemma-2-2b
・google/gemma-2-2b-it

July 31: ShieldGemma

Moderation

・google/shieldgemma-2b

August 1: EZO-Common-T2-2B-gemma-2-it

・HODACHI/EZO-Common-T2-2B-gemma-2-it

August 2: Llama-3-EvoVLM-JP-v2

VLM

・SakanaAI/Llama-3-EvoVLM-JP-v2

August 5: Llama-3-EZO-VLM-1

VLM

・HODACHI/Llama-3-EZO-VLM-1

August 7: MiniCPM-V2.6

VLM

・openbmb/MiniCPM-V-2_6

August 7: Sarashina2-70B

・sbintuitions/sarashina2-70b

August 12: FalconMamba 7B

Mamba

・tiiuae/falcon-mamba-7b

August 13: LongWriter

・THUDM/LongWriter-llama3.1-8b
・THUDM/LongWriter-glm4-9b

August 19: EZO-InternVL2-26B

VLM

・HODACHI/EZO-InternVL2-26B

August 20: Phi-3.5-mini-instruct

・microsoft/Phi-3.5-mini-instruct

August 20: Phi-3.5-MoE-instruct

・microsoft/Phi-3.5-MoE-instruct

August 20: Phi-3.5-vision-instruct

VLM

・microsoft/Phi-3.5-vision-instruct

August 21: Borea-Phi-3.5-mini-Instruct

・HODACHI/Borea-Phi-3.5-mini-Instruct-Jp
・HODACHI/Borea-Phi-3.5-mini-Instruct-Common
・HODACHI/Borea-Phi-3.5-mini-Instruct-Coding

August 22: Jamba-1.5

Mamba

・ai21labs/AI21-Jamba-1.5-Mini
・ai21labs/AI21-Jamba-1.5-Large

August 29: Qwen2-VL

VLM

・Qwen/Qwen2-VL-7B-Instruct
・Qwen/Qwen2-VL-2B-Instruct

August 30: Tanuki-8x8B

・weblab-GENIAC/Tanuki-8x8B-dpo-v1.0
・weblab-GENIAC/Tanuki-8B-dpo-v1.0

August 30: Command-R-plus-08-2024 and Command-R-08-2024

・CohereForAI/c4ai-command-r-plus-08-2024
・CohereForAI/c4ai-command-r-08-2024

September 6: DeepSeek-V2.5

・deepseek-ai/DeepSeek-V2.5

September 11: Reader-LM

HTML-to-Markdown

・jinaai/reader-lm-1.5b
・jinaai/reader-lm-0.5b

September 11: Pixtral-12B

VLM

・mistralai/Pixtral-12B-2409

September 11: LLaMA-Omni

Speech-to-Speech

・ICTNLP/Llama-3.1-8B-Omni

September 12: DataGemma

・google/datagemma-rig-27b-it
・google/datagemma-rag-27b-it

September 17: LLM-jp-3 172B beta1

・llm-jp/llm-jp-3-172b-beta1-instruct
・llm-jp/llm-jp-3-172b-beta1

September 17: Mistral-Small-Instruct-2409

・mistralai/Mistral-Small-Instruct-2409

September 18: CogVideoX-5b-I2V

・THUDM/CogVideoX-5b-I2V

September 18: Qwen2.5

・Qwen2.5 : 0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B
・Qwen2.5-Coder : 1.5B, 7B
・Qwen2.5-Math : 1.5B, 7B, 72B

September 18: Moshi

September 19: Kurage

September 24: EZO-Qwen2.5 and EZO-AutoCoTRAG-Qwen2.5

・AXCXEPT/EZO-Qwen2.5-32B-Instruct
・AXCXEPT/EZO-Qwen2.5-72B-Instruct
・AXCXEPT/EZO-AutoCoTRAG-Qwen2.5-32B-Instruct
・AXCXEPT/EZO-AutoCoTRAG-Qwen2.5-72B-Instruct_q4

September 25: LLM-jp-3 1.8B, 3.7B, 13B

・llm-jp/llm-jp-3-1.8b
・llm-jp/llm-jp-3-1.8b-instruct
・llm-jp/llm-jp-3-3.7b
・llm-jp/llm-jp-3-3.7b-instruct
・llm-jp/llm-jp-3-13b
・llm-jp/llm-jp-3-13b-instruct

September 25: Llama 3.2 Vision

VLM

・meta-llama/Llama-3.2-11B-Vision
・meta-llama/Llama-3.2-11B-Vision-Instruct
・meta-llama/Llama-3.2-90B-Vision
・meta-llama/Llama-3.2-90B-Vision-Instruct

September 25: Llama 3.2 1B and 3B

・meta-llama/Llama-3.2-1B
・meta-llama/Llama-3.2-1B-Instruct
・meta-llama/Llama-3.2-3B
・meta-llama/Llama-3.2-3B-Instruct

September 25: Molmo

VLM

・allenai/Molmo-72B-0924
・allenai/Molmo-7B-D-0924
・allenai/Molmo-7B-O-0924
・allenai/MolmoE-1B-0924

September 30: llm-jp-3-3.7b-instruct-EZO-Humanities and llm-jp-3-3.7b-instruct-EZO-Common

・AXCXEPT/llm-jp-3-3.7b-instruct-EZO-Humanities
・AXCXEPT/llm-jp-3-3.7b-instruct-EZO-Common

October 3: Gemma 2 Baku 2B

・rinna/gemma-2-baku-2b
・rinna/gemma-2-baku-2b-it

October 3: Gemma 2 JPN

・google/gemma-2-2b-jpn-it

October 8: Llama-3.1-Swallow v0.1

・tokyotech-llm/Llama-3.1-Swallow-8B-v0.1
・tokyotech-llm/Llama-3.1-Swallow-70B-v0.1
・tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.1
・tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.1

October 10: Entropix

・xjdr-alt/entropix

October 10: ARIA

Multimodal Native MoE Model

October 15: Ichigo Llama 3.1

Real Time Voice AI

October 15: Zamba2-7B-Instruct

Mamba

・Zyphra/Zamba2-7B-Instruct

October 15: PLaMo-100B

・pfnet/plamo-100b

October 16: Gemma-APS

October 16: Ministral

・mistralai/Ministral-8B-Instruct-2410

October 16: Llama-3.1-Nemotron-70B

・nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
・nvidia/Llama-3.1-Nemotron-70B-Reward-HF

October 18: bitnet.cpp

BitNet

・microsoft/BitNet

October 18: Janus-1.3B

Supports both multimodal understanding (Image+Text→Text) and generation (Text→Image).

・deepseek-ai/Janus

October 18: Meta Spirit LM

・facebookresearch/spiritlm

October 21: Granite 3.0

October 24: Aya Expanse

・CohereForAI/aya-expanse-8b
・CohereForAI/aya-expanse-32b

October 31: SmolLM2

November 8: Sarashina2-8x70B

November 11: Llama-3.1-Swallow v0.2

・tokyotech-llm/Llama-3.1-Swallow-8B-v0.2
・tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.2

November 12: Qwen2.5-Coder 32B

November 14: Athene-V2 72B

・Nexusflow/Athene-V2-Chat
・Nexusflow/Athene-V2-Agent

November 15: LLM-jp-3 172B beta2

・llm-jp/llm-jp-3-172b-beta2
・llm-jp/llm-jp-3-172b-beta2-instruct2

November 18: Pixtral Large and Mistral Large 2411

・mistralai/Pixtral-Large-Instruct-2411
・mistralai/Mistral-Large-Instruct-2411

November 20: LLM-jp-3 VILA 14B

・llm-jp/llm-jp-3-vila-14b

November 28: QwQ-32B-Preview

・Qwen/QwQ-32B-Preview

December 6: Llama 3.3 70B

・meta-llama/Llama-3.3-70B-Instruct

As we continue to explore new post-training techniques, today we're releasing Llama 3.3 — a new open source model that delivers leading performance and quality across text-based use cases such as synthetic data generation at a fraction of the inference cost. pic.twitter.com/BNoV2czGKL
— AI at Meta (@AIatMeta) December 6, 2024

December 6: Qwen2-VL-72B

・Qwen/Qwen2-VL-72B

December 10: Sarashina2.1-1B and Sarashina-Embedding-v1-1B

・sbintuitions/sarashina2.1-1b
・sbintuitions/sarashina-embedding-v1-1b

December 11: Sarashina2.1-1B-SFT

・Aratako/sarashina2.1-1b-sft

December 13: Phi-4

・Introducing Phi-4: Microsoft’s Newest Small Language Model Specializing in Complex Reasoning

December 13: DeepSeek-VL2

・deepseek-ai/deepseek-vl2
・deepseek-ai/deepseek-vl2-small
・deepseek-ai/deepseek-vl2-tiny

December 14: Command-R7B

・CohereForAI/c4ai-command-r7b-12-2024

December 16: Apollo

December 17: Falcon 3

December 23: Llama-3.1-Swallow-8B-Instruct-v0.3

・tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3

December 24: llm-jp-3-172b-instruct3

・llm-jp/llm-jp-3-172b-instruct3

December 25: QVQ-72B-Preview

・Qwen/QVQ-72B-Preview

December 25: DeepSeek-V3

・deepseek-ai/DeepSeek-V3-Base

・chat.deepseek.com

December 30: Llama-3.1-Swallow-70B-Instruct-v0.3

・tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.3

2025

January 8: phi-4 (MIT License)

・microsoft/phi-4
・microsoft/phi-4-gguf

January 14: MiniMax-Text-01 and MiniMax-VL-01

・MiniMaxAI/MiniMax-Text-01
・MiniMaxAI/MiniMax-VL-01

January 20: DeepSeek-R1

・deepseek-ai/DeepSeek-R1

🚀 DeepSeek-R1 is here!

⚡ Performance on par with OpenAI-o1
📖 Fully open-source model & technical report
🏆 MIT licensed: Distill & commercialize freely!

🌐 Website & API are live now! Try DeepThink at https://t.co/v1TFy7LHNy today!

🐋 1/n pic.twitter.com/7BlpWAPu6y
— DeepSeek (@deepseek_ai) January 20, 2025

January 20: DeepSeek-R1-Zero

・deepseek-ai/DeepSeek-R1-Zero

January 20: DeepSeek-R1-Distill

・deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
・deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
・deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
・deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
・deepseek-ai/DeepSeek-R1-Distill-Llama-70B
・deepseek-ai/DeepSeek-R1-Distill-Llama-8B

January 24: KARAKURI LM 32B Thinking 2501 Experimental

・karakuri-ai/karakuri-lm-32b-thinking-2501-exp

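This is the Japanese QwQ model I briefly experimented with last month. https://t.co/GKtdMUrJx3

It can reason in consistent Japanese, but I have not done any fine-grained tuning, so repetition occasionally occurs.
I think the only reason to use it is when you want to see reasoning in consistent Japanese at the 32B class, but DeepSeek…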
— Tomofumi Nakayama (@txmy) January 23, 2025

January 24: J-Moshi

・nu-dialogue/j-moshi-ext

January 27: Qwen2.5-1M

・Qwen/Qwen2.5-14B-Instruct-1M
・Qwen/Qwen2.5-7B-Instruct-1M

January 27: DeepSeek-R1-Distill-Qwen-14B-Japanese

・cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese
・cyberagent/DeepSeek-R1-Distill-Qwen-14B-Japanese

January 27: ABEJA-Qwen2.5-32b-Japanese-v0.1

・abeja/ABEJA-Qwen2.5-32b-Japanese-v0.1

January 27: phi-4-open-R1-Distill-EZOv1

・AXCXEPT/phi-4-open-R1-Distill-EZOv1

January 27: DeepSeek-R1-GGUF 1.58bit

・unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-IQ1_S

・Run DeepSeek R1 Dynamic 1.58-bit

January 27: Janus-Pro

・deepseek-ai/Janus-Pro-7B
・deepseek-ai/Janus-Pro-1B

January 27: Qwen2.5-VL

・Qwen/Qwen2.5-VL-72B-Instruct
・Qwen/Qwen2.5-VL-7B-Instruct
・Qwen/Qwen2.5-VL-3B-Instruct

January 29: DeepSeek-R1-Distill-Qwen-7B-Japanese

・lightblue/DeepSeek-R1-Distill-Qwen-7B-Japanese

January 30: TinySwallow

・SakanaAI/TinySwallow-1.5B
・SakanaAI/TinySwallow-1.5B-Instruct
・SakanaAI/TinySwallow-1.5B-Instruct-q4f32_1-MLC
・SakanaAI/TinySwallow-1.5B-Instruct-GGUF

January 31: Mistral Small 3

・mistralai/Mistral-Small-24B-Instruct-2501
・mistralai/Mistral-Small-24B-Base-2501

February 5: LLM-jp-3 instruct3 150M, 440M, 980M, 7.2B

・llm-jp/llm-jp-3-150m
・llm-jp/llm-jp-3-440m
・llm-jp/llm-jp-3-980m
・llm-jp/llm-jp-3-7.2b

February 8: PLaMo-2-1B

・pfnet/plamo-2-1b

February 12: RakutenAI-2.0

・Rakuten/RakutenAI-2.0-8x7B-instruct
・Rakuten/RakutenAI-2.0-mini-instruct
・Rakuten/RakutenAI-2.0-8x7B
・Rakuten/RakutenAI-2.0-mini

February 13: Qwen2.5 Bakeneko 32B

・rinna/qwen2.5-bakeneko-32b
・rinna/qwen2.5-bakeneko-32b-instruct
・rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b

February 19: R1 1776

・perplexity-ai/r1-1776

February 25: PLaMo-2-8B

・pfnet/plamo-2-8b

February 25: Asagi

・llm-jp/llm-jp-3-1.8b-instruct
・llm-jp/llm-jp-3-3.7b-instruct
・llm-jp/llm-jp-3-7.2b-instruct
・llm-jp/llm-jp-3-13b-instruct