Popular Articles

Observational Scaling Laws and the Predictability of Language Model Performance

8 months ago

AgentBench: Evaluating LLMs as Agents