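[Matt Wolfe's AI News] Google and OpenAI Clash! Consumers Come Out the Winners (December 21, 2024 | @Matt Wolfe)

December 22, 2024, 01:06

As the development race between Google and OpenAI heats up, both companies have made a string of major announcements. Google released its video generation model Veo 2, which is being praised as the best video model currently available. It also updated its image generation model Imagen 3 and announced Whisk, which blends multiple images into a new one. NotebookLM, which can turn documents into podcasts, received a major update that adds an interactive mode. In addition, a 2.0 model arrived in Gemini Advanced, along with Gemini 2.0 Flash Thinking, which offers stronger reasoning. Meanwhile, OpenAI ran a 12-day series of daily announcements, including Projects in ChatGPT, general availability of search, and 1-800-ChatGPT for using ChatGPT by phone. Meta added real-time translation in an update to the Ray-Ban Meta smart glasses, and Instagram previewed a new AI video editing tool. In AI video generation, Pika announced its 2.0 model with a free usage period, and competition among vendors is intensifying. Odyssey added Pixar co-founder Ed Catmull to its board and showed off a generative 3D world model. NVIDIA announced a $249 compact supercomputer aimed at providing powerful processing for robotics development.

Note: We recommend playing the video before reading.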

It's been another busy week in the world of AI with Google and OpenAI sort of competing to make the biggest announcements of the week.

AIの世界では、GoogleとOpenAIが今週最大の発表を競うかのように、またもや忙しい一週間でした。

It's been really fun to watch.

見ているだけでとても面白かったです。

So let's just go ahead and dig right in, starting with the announcements that Google made this week.

それでは、今週Googleが行った発表から始めて、さっそく詳しく見ていきましょう。

Probably the biggest thing that everybody has been talking about this week has been Google's Veo 2 AI video model.

今週、誰もが話題にしている最大のことは、おそらくGoogleのVeo 2 AIビデオモデルでしょう。

I actually managed to get my hands on early access to it, and I made an entire breakdown of it, testing prompts that you guys gave me on X.

実は私は早期アクセスを手に入れることができ、Xで皆さんからいただいたプロンプトをテストして、その全体的な分析を行いました。

You can find the entire breakdown video of it right here on my YouTube channel.

その詳細な分析動画は、私のYouTubeチャンネルでご覧いただけます。

It's called, This New AI Video Tool is the Best I've Seen, Hands-On with Veo.

タイトルは「This New AI Video Tool is the Best I've Seen, Hands-On with Veo」です。

And that title really says it all.

そしてこのタイトルが全てを物語っています。

This is the best video model that I've seen so far.

これは私がこれまでに見た中で最高のビデオモデルです。

It makes really realistic videos.

非常にリアルな動画を作成します。

It seems to understand physics better.

物理法則をよりよく理解しているように見えます。

It seems to understand people movement a lot better.

人の動きについてもずっとよく理解しているように見えます。

Everything about it is just really, really good.

全ての面において本当に、本当に優れています。

Every single prompt that you give it generates four videos for you.

入力する各プロンプトに対して4つの動画を生成してくれます。

So even if one of the videos is a bit wonky, you still have three more to choose from, making it more likely you're going to get a video that you actually like.

そのため、1つの動画が少しおかしくても、まだ3つの選択肢が残っているので、実際に気に入った動画を手に入れられる可能性が高くなります。
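
The point about getting four videos per prompt can be made concrete with a little arithmetic. This is a minimal sketch, not something from the video: it assumes each generated video independently has the same chance `p_good` of being one you like (a hypothetical number; the video gives no such figure), and computes the chance that at least one of the four is usable.

```python
def chance_of_at_least_one_keeper(p_good: float, n_videos: int = 4) -> float:
    """Probability that at least one of n_videos independent generations is usable.

    p_good is a hypothetical per-video success rate, assumed identical and
    independent across generations (an illustrative simplification).
    """
    if not 0.0 <= p_good <= 1.0:
        raise ValueError("p_good must be between 0 and 1")
    # The only way to end up with no keeper is for every video to miss.
    return 1.0 - (1.0 - p_good) ** n_videos


if __name__ == "__main__":
    for p in (0.25, 0.5, 0.75):
        print(f"p_good={p:.2f}: one video {p:.0%}, four videos "
              f"{chance_of_at_least_one_keeper(p):.1%}")
```

Under those assumptions, even a coin-flip success rate per video leaves you with a usable clip most of the time when you get four tries.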

Now, I don't know exactly when this is going to be rolled out publicly, but they do have a waitlist over at labs.google/fx/tools/video-fx.

正確な一般公開時期はわかりませんが、labs.google/fx/tools/video-fxでウェイトリストに登録することができます。

If you get on the waitlist, supposedly they're rolling it out to the general public fairly soon.

ウェイトリストに登録すれば、かなり近い将来に一般公開される予定とのことです。

They also have a text to image to video model, which first generates an image.

また、彼らはテキストから画像を経由して動画を生成するモデルも持っています。

And then, once the image is generated, it converts that image into a video.

そして、画像が生成されると、その画像を動画に変換します。

But at the moment, there's no way to upload your own image and convert it to a video.

しかし現時点では、自分の画像をアップロードして動画に変換する方法はありません。

Again, make sure you check out my deeper dive video all about Veo 2.

繰り返しになりますが、Veo 2についての詳細な解説動画をぜひご覧ください。

Google also made some updates to their Imagen 3 model.

GoogleはImagen 3モデルもアップデートしました。

It's now generating apparently brighter, better composed images.

現在、より明るく、より構図の良い画像を生成しているようです。

And from what I've seen, all of the images are pretty dang good.

私が見た限りでは、全ての画像がかなり素晴らしいものです。

And this one, most people should have access to it.

そして、このモデルはほとんどの人がアクセスできるはずです。

If you go over to labs.google and you select ImageFX, this is using the newest Imagen 3 model.

labs.googleにアクセスしてImageFXを選択すると、最新のImagen 3モデルを使用することができます。

And just like Veo, it will generate four images with every single prompt and they're pretty dang good.

そして、Veoと同様に、各プロンプトに対して4つの画像を生成し、それらはかなり素晴らしいものです。

I mean, in my mind, I feel like AI image generation is a fairly solved problem.

私の考えでは、AI画像生成はかなり解決された問題だと感じています。

You can pretty much generate anything you can imagine in image form now.

今では、想像できるものはほぼ全て画像形式で生成できます。

And all of these platforms are just adding minor details that make them slightly more improved.

そして、これらのプラットフォームは全て、わずかな改善をもたらす細かな詳細を追加しているだけです。

But for the most part, I feel like we've hit that point where you can kind of generate anything you want.

しかし、大部分において、欲しいものは何でも生成できるところまで来ていると感じています。

It might just take a few re-rolls to get there.

そこに到達するには、数回の再生成が必要かもしれません。

Google also showed off a new platform called Whisk, which is this really cool way of blending multiple images together to create a cool new image.

Googleはまた、Whiskという新しいプラットフォームを披露しました。これは複数の画像をブレンドして、クールな新しい画像を作成する本当に素晴らしい方法です。

This one can be found over at labs.google/fx/tools/whisk.

これはlabs.google/fx/tools/whiskで見つけることができます。

And you might have seen some of these viral examples floating around on social media where people are making plushie versions of themselves.

そして、人々が自分自身のぬいぐるみバージョンを作成している、そんなバイラルな例がソーシャルメディアで流れているのを見たかもしれません。

You can give it a style like this plushie, toss in an image of yourself like I just did with this image, and it spits out a plushie version of the image.

このぬいぐるみのようなスタイルを指定し、私が今したように自分の画像を投入すると、その画像のぬいぐるみバージョンが出力されます。

If I click open in tool, you can see the plushified me.

ツールで開くをクリックすると、ぬいぐるみ化された私を見ることができます。

But if you don't like that style, you can upload images of your own styles, your own scenes, and of course the subject, or you can roll the dice and let it do it for you.

しかし、そのスタイルが気に入らない場合は、自分のスタイル、シーン、そしてもちろん被写体の画像をアップロードすることができます。または、サイコロを振るようにしてAIに任せることもできます。

So let's have it do like a pin with whatever this scene is here.

では、ここにあるこのシーンでピンのようなものを作ってみましょう。

And we get this cool blended pin that has me with this background, like hosting a podcast.

そして、ポッドキャストをホストしているような、この背景と私が組み合わさったクールなブレンドされたピンができました。

It's really fun to play with and it's totally free to use right now.

これで遊ぶのは本当に楽しく、現在は完全に無料で使用できます。

Again, it's called Whisk and you can find it over in Google labs at labs.google.

繰り返しになりますが、これはWhiskと呼ばれ、Google labsのlabs.googleで見つけることができます。

NotebookLM also got some major updates this week.

NotebookLMも今週、大きなアップデートがありました。

If you're not familiar with NotebookLM, it's the platform where you can upload documents.

NotebookLMをご存知ない方のために説明すると、これは文書をアップロードできるプラットフォームです。

And then you can chat with those documents.

そして、それらの文書とチャットすることができます。

But the feature that everybody's really talking about is the feature where you can actually have it turn your documents into a podcast.

しかし、誰もが本当に話題にしている機能は、実際に文書をポッドキャストに変換できる機能です。

Well, NotebookLM got this redesign this week, so it's a completely overhauled, kind of cleaner layout, but they also added this interactive mode.

さて、NotebookLMは今週リデザインされ、完全に一新されたよりクリーンなレイアウトになりましたが、このインタラクティブモードも追加されました。

So I uploaded a transcript from a podcast with Dario Amodei.

そこで、私はDario Amodeiとのポッドキャストの書き起こしをアップロードしました。

Let's go ahead and have it generate a podcast.

では、ポッドキャストを生成してみましょう。

And now when I'm listening back to this podcast, I can go into interactive mode here.

そして今、このポッドキャストを聞き返すとき、ここでインタラクティブモードに入ることができます。

Okay.

はい。

Ready to dive in.

始める準備ができました。

Today we're tackling AI.

今日はAIについて取り上げます。

Oh yeah.

ああ、そうですね。

Specifically, how fast these systems are scaling up.

特に、これらのシステムがいかに急速にスケールアップしているかについてです。

It's incredible.

信じられないほどです。

And how one company, Anthropic.

そして、ある企業、Anthropicが。

Oh, hey, welcome.

ああ、こんにちは、ようこそ。

What's up?

調子はどうですか？

Hey, I'm just curious.

ねえ、ちょっと気になったんですが。

Can we just kind of get right into it?

すぐに本題に入ってもいいですか？

Tell me who this Dario Amodei guy is.

このDario Amodeiという人は誰なのか教えてください。

Oh, hey there.

ああ、こんにちは。

Yeah, absolutely.

はい、もちろんです。

Great question.

良い質問ですね。

I'm glad you asked.

質問してくれて嬉しいです。

So Dario Amodei, he's the CEO of Anthropic, this AI company we've been talking about.

ええと、Dario Amodeiは、私たちが話していたこのAI企業、AnthropicのCEOです。

And he's kind of a big deal in the AI world.

そして、彼はAIの世界では重要人物なんです。

Right.

そうですね。

And what's interesting is that he's not just, oh, go for it.

そして興味深いのは、彼は単なる…ああ、どうぞ。

Yeah.

はい。

What makes him a big deal?

彼が重要視されているのはなぜですか？

I don't really want to get into all the rambly stuff.

長々とした話は避けたいと思います。

I'm just kind of curious.

ただ少し興味があるだけです。

Why is it a big deal?

なぜそれが大きな話題なのですか？

What was he doing before Anthropic?

Anthropic以前は何をしていたのですか？

Okay.

わかりました。

So you want the quick version on why he's a big deal.

つまり、彼が重要視されている理由の要約が知りたいということですね。

Got it.

了解しました。

Yeah.

はい。

Let's uncheck this.

これのチェックを外しましょう。

Basically, he's at the forefront of AI development and safety.

基本的に、彼はAI開発と安全性の最前線にいます。

Before Anthropic, he was deeply involved in scaling laws.

Anthropic以前は、スケーリング則に深く関わっていました。

Which we'll get into, but basically means understanding how these systems get smarter.

詳しく説明しますが、基本的にはこれらのシステムがどのように賢くなるかを理解することを意味します。

So basically, now you can create these podcasts using any content you want.

つまり基本的に、今では好きなコンテンツを使ってこれらのポッドキャストを作成できます。

You can upload text files or PDFs or pull in YouTube links or whatever you want.

テキストファイルやPDF、YouTubeのリンクなど、好きなものをアップロードできます。

And then while the podcast is going, you click the join button, you can interrupt them, and you can ask them questions about the document.

そしてポッドキャストの再生中に参加ボタンをクリックすると、中断して文書について質問することができます。

It is super, super cool.

これは本当に、本当にクールです。

And this is now available to everybody.

そして今やこれは誰でも利用できます。

They've rolled it out into everybody's NotebookLM accounts for free.

すべての人のNotebookLMアカウントに無料で展開されました。

But you know what?

でもね？

Google just kept shipping this week.

Googleは今週も次々と新機能をリリースし続けました。

They had so many cool updates.

とても多くのクールなアップデートがありました。

They also rolled out a new 2.0 Advanced model inside of their Gemini Advanced platform.

また、Gemini Advancedプラットフォーム内に新しい2.0 Advancedモデルもリリースしました。

So in order to have Gemini Advanced, you do have to pay for the Gemini upgrade on your Google account, but you can now use their most advanced large language model.

Gemini Advancedを使用するにはGoogleアカウントでGeminiアップグレードの支払いが必要ですが、最も高度な大規模言語モデルを使用できるようになりました。

So if I head over to gemini.google.com, since I'm a Gemini Advanced member, I do have a dropdown here where you can see multiple options, including 2.0 Experimental Advanced.

gemini.google.comに行くと、私はGemini Advanced会員なので、2.0 Experimental Advancedを含む複数のオプションが表示されるドロップダウンメニューがあります。

And if we take a quick look over at the Chatbot Arena here, where they compare the various large language models, we can see the Gemini experimental model is the top-ranking model right now.

そして、様々な大規模言語モデルを比較するChatbot Arenaを見てみると、Gemini実験モデルが現在トップにランクされているのがわかります。

I'm guessing this is what we're getting when we play around with the new Gemini Advanced 2.0 model.

これが新しいGemini Advanced 2.0モデルを使用しているときに得られるものだと推測しています。

Notice which model is right here, almost tied for first place.

ここにあるモデルに注目してください。ほぼ1位と同点です。

This Gemini 2.0 Flash Thinking.

このGemini 2.0 Flash Thinkingです。

Guess when that one rolled out?

これがいつリリースされたか分かりますか？

That one came out this week as well.

これも今週リリースされました。

Here's Logan Kilpatrick from Google saying, just when you thought it was over, we're introducing Gemini 2.0 Flash Thinking, a new experimental model that unlocks stronger reasoning capabilities and shows its thoughts.

GoogleのLogan Kilpatrickが「もう終わりだと思った時に、より強力な推論能力を持ち、その思考過程を表示する新しい実験モデル、Gemini 2.0 Flash Thinkingを導入します」と述べています。

This is apparently Google's answer to OpenAI's o1 model.

これはどうやらOpenAIのo1モデルに対するGoogleの回答のようです。

You can actually test this one out over in Google's AI Studio.

これはGoogle AI Studioで実際にテストすることができます。

If you go to aistudio.google.com, over on the right, you have the option to choose a model here, and you'll notice that Gemini 2.0 Flash Thinking Experimental is one of the options available, and as of right now, it appears to be free to use.

aistudio.google.comにアクセスすると、右側にモデルを選択するオプションがあり、Gemini 2.0 Flash Thinking Experimentalが利用可能なオプションの1つとして表示され、現時点では無料で使用できるようです。

Let's give it the old strawberry test.

古典的なストロベリーテストをやってみましょう。

How many R's are in the word strawberry?

「strawberry」という単語にはいくつの「R」がありますか？

We can see that it thought through.

その思考過程を見ることができます。

You can actually expand the model thoughts here: identify the target word, the word is strawberry; focus on the target letter, R; examine the word letter by letter: S, not an R; T, not an R; R, yes, count equals one.

ここでモデルの思考を展開できます。対象の単語を特定し、単語はstrawberry、対象の文字Rに注目し、文字を1つずつ確認します。S、Rではない、T、Rではない、R、はい、カウント1。

And you can see it go through and actually count the Rs in the word strawberry, finally summing it up, one plus one plus one equals three.

そして、strawberryの中のRを実際にカウントしていく過程が見え、最終的に1足す1足す1は3と合計しています。

And then stating the answer, strawberry.

そして回答を述べています、strawberry。

There are three Rs in the word strawberry.

strawberryという単語には3つのRがあります。
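
The letter-by-letter procedure the model describes in its thoughts is easy to reproduce by hand. Here is a minimal sketch of that counting loop; the function name `count_letter` is purely illustrative, and this is not a claim about how the model works internally.

```python
def count_letter(word: str, letter: str) -> int:
    """Count how often `letter` appears in `word`, ignoring case."""
    target = letter.lower()
    count = 0
    # Walk the word one character at a time, as in the model's shown thoughts.
    for ch in word.lower():
        if ch == target:
            count += 1
    return count


if __name__ == "__main__":
    print(count_letter("strawberry", "r"))  # prints 3
```

The same result comes from Python's built-in `"strawberry".count("r")`; the explicit loop is only there to mirror the step-by-step check described above.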

Obviously, this isn't the most advanced test, but anybody that has some really complex logic problems or math, I highly recommend checking it out in Google AI Studio and seeing how you think it compares to OpenAI's o1 model.

明らかにこれは最も高度なテストではありませんが、複雑な論理問題や数学の問題を持っている人は、Google AI Studioでチェックして、OpenAIのo1モデルと比較してどう思うか確認することを強くお勧めします。

And while we're on the topic of Google, obviously Google-owned YouTube just rolled out a new feature.

そしてGoogleの話題が出たところで、明らかにGoogleが所有するYouTubeが新機能をリリースしたばかりです。

A lot of people are concerned about having the content that they put on the web used to train AI.

多くの人々がウェブ上に投稿したコンテンツがAIの学習に使用されることを懸念しています。

Well, now YouTube makes that opt-in.

そこで、YouTubeはこれをオプトイン方式にしました。

So if you are okay with AI training on your YouTube videos, you can check a box inside of YouTube now to allow third-party companies to train AI models using your channel content.

YouTubeビデオでのAI学習を許可する場合、YouTube内でチェックボックスをオンにして、サードパーティ企業があなたのチャンネルコンテンツを使用してAIモデルを学習することを許可できるようになりました。

I talk about AI, I'm a proponent of AI.

私はAIについて話し、AIを支持しています。

So of course I said, sure, use my videos to train AI.

そのため当然、私のビデオをAIの学習に使用することを承諾しました。

You could uncheck this box or you can select only the parties that you're willing to let train.

このボックスのチェックを外すか、学習を許可する企業のみを選択することができます。

So if I only want OpenAI to be able to train on my videos, I can select just OpenAI.

たとえば、OpenAIだけに私のビデオでの学習を許可したい場合は、OpenAIのみを選択できます。

But again, I'm totally cool with them training on my content, but totally understand the reasons why some people wouldn't be okay with that.

繰り返しになりますが、私は自分のコンテンツでの学習を全く気にしませんが、一部の人々がそれを望まない理由も完全に理解できます。

Now, the interesting thing about all of these things that Google has been releasing is it feels like it's been a day for day competition with OpenAI because OpenAI originally announced their 12 days of OpenAI announcements where every single day for 12 days, they would make some sort of new announcement or some sort of new rollout.

Googleがリリースしているこれらすべての事柄について興味深いのは、OpenAIとの日々の競争のように感じられることです。なぜなら、OpenAIは当初、12日間のOpenAIアナウンスメントを発表し、12日間毎日何らかの新しい発表や新しいリリースを行うと宣言したからです。

Now the day you're watching this video is the 12th day of OpenAI and they've probably announced something new and it's probably something pretty big.

このビデオを見ている今日はOpenAIの12日目で、おそらく新しい何かを発表しており、それはかなり大きなものかもしれません。

I would imagine whatever they announce on the very last day of the 12 days is going to be one of the bigger announcements that they're gonna make, but I could be totally wrong.

12日間の最終日に発表されるものは、彼らが行う大きな発表の1つになるだろうと想像します。完全に間違っているかもしれませんが。

For all I know on the day you're watching this, they released that you can now send fax messages to OpenAI.

私が知る限り、あなたがこれを見ている日に、OpenAIにファックスメッセージを送信できるようになったと発表したかもしれません。

I don't know, but it's probably going to be something a little bit bigger than that.

分かりませんが、おそらくそれよりも少し大きな何かになるでしょう。

But here's the five announcements they've announced since I made my last AI news breakdown.

しかし、私が最後のAIニュースの要約を作成して以来、彼らが発表した5つの発表をご紹介します。

On day seven, they announced projects inside of ChatGPT, which is essentially folders.

7日目に、彼らはChatGPT内のプロジェクトを発表しました。これは本質的にフォルダのことです。

It's like the same thing as projects inside of Claude.

これはClaudeのプロジェクト機能と同じようなものです。

If you've ever used Claude's projects, you'll know what I mean. If I go to my ChatGPT account, you can see over on projects, I've got a test project.

Claudeのプロジェクトを使ったことがある方なら分かると思いますが、私のChatGPTアカウントのプロジェクトセクションにテストプロジェクトがあります。

I can create a new project.

新しいプロジェクトを作成できます。

I'm going to call it test 2 project.

これをテスト2プロジェクトと名付けることにします。

You can tell I haven't really been using this much yet.

まだあまり使用していないことがお分かりでしょう。

And this project can have its own custom instructions and its own set of files that it uses every single time you prompt it with something.

このプロジェクトには、プロンプトを入力するたびに使用する独自のカスタム指示と独自のファイルセットを設定できます。

So if you want to make a project that is specifically designed just to answer questions about a certain product's documentation, you can upload all the documentation into the files.

特定の製品に関する質問に特化したプロジェクトを作りたい場合は、そのドキュメントをすべてファイルにアップロードできます。

And then anytime you prompt something, it will use that documentation to help with the response.

そうすれば、プロンプトを入力するたびに、そのドキュメントを参照して回答を生成します。

On the eighth day of OpenAI's announcements, they announced that the search functionality is being released for everybody, no matter which ChatGPT plan you're on.

OpenAIの発表8日目には、ChatGPTのプランに関係なく、検索機能が全ユーザーに開放されることが発表されました。

Now, if you're a Pro or Plus user, you've had this for a little while now, but if you were a free user of ChatGPT, you now have this as well.

ProユーザーやPlusユーザーの方は既にしばらく使えていましたが、無料ユーザーの方も今では使えるようになりました。

So let me jump into the GPT-4o model.

それでは、GPT-4oモデルを見てみましょう。

You can see the search the web button.

ウェブ検索ボタンが表示されているのが分かります。

I can select this.

これを選択できます。

And anytime I ask it a question now, it's gonna search the web for the answer.

そして、質問をするたびに、ウェブを検索して回答を提供します。

For instance, if I say, what does Matt Wolfe's YouTube channel talk about?

例えば、「Matt WolfeのYouTubeチャンネルは何について話しているの？」と聞くと、

It will do a search and answer the questions here.

検索を実行してここで質問に答えます。

And funny enough, it actually pulled up a YouTube video that I did on the Hustle and Flowchart podcast with Joe Fier all about what I do on my YouTube channel.

面白いことに、実際にJoe Fierと一緒に出演したHustle and Flowchartポッドキャストの動画が表示されました。そこでは私のYouTubeチャンネルについて話しています。

On the ninth day, they announced that the o1 preview model, the model that's really good at thinking and reasoning, is now available in the API.

9日目には、思考と推論が得意なo1プレビューモデルがAPIで利用可能になったことが発表されました。

So if you're a developer, before you were only able to use the GPT-4o model; now they actually gave you access to o1 to use in whatever software tools you're building.

開発者の方々は、以前はGPT-4oモデルしか使えませんでしたが、今では開発中のソフトウェアツールでo1を使用できるようになりました。

If you go to OpenAI's Playground, over on the right where you select your model, you now have access to o1-mini and o1-preview as part of the API.

OpenAIのPlaygroundで、右側のモデル選択メニューにo1-miniとo1-previewがAPIの一部として表示されるようになりました。

On day 10, they announced 1-800-CHAT-GPT, which is essentially a way to use your phone and call in and have a conversation with ChatGPT.

10日目には1-800-CHAT-GPTが発表されました。これは基本的に電話でChatGPTと会話ができるサービスです。

I saw a lot of people saying that this is really good for people outside of the US that maybe don't have internet access.

インターネットにアクセスできない米国外の人々にとって便利だという声を多く見かけました。

They could call in and use this.

電話をかけてこのサービスを利用できるというわけです。

But if you look at the fine print on the video right here, it actually says US only for phone calls.

しかし、このビデオの細字を見ると、実際には電話は米国内からのみ利用可能と書かれています。

Yeah, I don't really get the thinking here, but it's essentially the advanced voice mode that if you're in the US, you can call in and use.

はい、この考え方はよく分かりませんが、基本的に米国内にいる人が電話で利用できる高度な音声モードということです。

Here's what it sounds like.

こんな感じです。

Hello again, it's ChatGPT, an AI assistant.

こんにちは、AI アシスタントのChatGPTです。

Our conversation may be reviewed for safety.

会話は安全のために確認される場合があります。

How can I help you?

どのようにお手伝いできますか？

Is your refrigerator running?

冷蔵庫は動いていますか？

Yes, it's running smoothly.

はい、スムーズに動いています。

So I better go catch it, right?

じゃあ、追いかけないとね？

Classic.

古典的ですね。

Hey, that was my line.

あれ、それは私のセリフでした。

You set it up so perfectly.

あなたが完璧にセットアップしてくれたので。

I couldn't resist.

我慢できませんでした。

What's next?

次は何でしょう？

Anyway, you can ask it whatever you would ask ChatGPT voice mode.

とにかく、ChatGPTの音声モードに聞くようなことなら何でも質問できます。

You just call 1-800-CHAT-GPT.

1-800-CHAT-GPTに電話するだけです。

They even showed using it on a rotary phone if you own a rotary phone.

ダイヤル式電話をお持ちの方は、それでも使用できることが示されました。

And on day 11, the day that I'm recording this, they announced that the Mac app for ChatGPT now works with more tools.

そして11日目、この動画を録画している日には、ChatGPTのMacアプリがより多くのツールに対応するようになったことが発表されました。

Now I'm not on a Mac right now, so I can't show you the new tools that it works with, but basically you can give it access to different tools on your Mac.

今はMacを使っていないので新しいツールをお見せできませんが、基本的にMac上の様々なツールへのアクセスを許可できます。

They did say that Windows is going to get these features soon, but right now it can only view certain tools on your Mac, which I find funny because their biggest partner is Microsoft.

Windowsでもまもなくこれらの機能が利用可能になると言っていますが、現時点ではMac上の特定のツールしか表示できません。最大のパートナーがMicrosoftなのに、これは面白いですね。

You'd think they'd prioritize PC and Windows users first, but no, they prioritize Mac users first.

PCやWindowsユーザーを優先すると思いきや、実際にはMacユーザーが優先されています。

In my opinion, compared to the previous couple of weeks of OpenAI announcements, none of these were that great, but I really have high hopes for day 12, Friday, the day this video is going live.

私の意見では、過去数週間のOpenAIの発表と比べて、これらはそれほど素晴らしいものではありませんでしたが、この動画が公開される金曜日の12日目には大きな期待を寄せています。

I do think there's going to be some sort of big announcement.

何か大きな発表があると思います。

I don't know what it's going to be yet.

まだ何になるかは分かりません。

So I'm excited for that because this week's announcements were kind of on the weak side.

今週の発表はやや物足りなかったので、それが楽しみです。

If you ask my opinion, I mean, they're all cool.

私の意見では、確かにどれも素晴らしいです。

They're all great advancements, but nothing that really got me excited, like some of the Google stuff that rolled out.

素晴らしい進歩ではありますが、Googleが発表したような興奮するようなものはありませんでした。

Veo, Whisk, NotebookLM updates, Gemini with Thinking, Gemini Advanced.

Veo、Whisk、NotebookLMのアップデート、思考機能付きGemini、Gemini Advancedなど。

I mean, Google crushed it this week and last week with their announcements.

Googleは今週も先週も、その発表で圧倒的な存在感を示しました。

While in the past, OpenAI usually kind of overshadowed Google's announcements.

過去には、OpenAIがGoogleの発表を影で覆い隠すような感じでしたが。

The last two weeks, I feel like Google crushed OpenAI with the announcements.

この2週間は、GoogleがOpenAIを発表で圧倒したと感じています。

It's just really fun to watch.

見ていて本当に面白いです。

I mean, I posted a post up on X and I said, who's winning the week, Google or OpenAI?

Xに投稿して「今週勝っているのはGoogleとOpenAIのどちらか？」と聞いてみました。

And pretty much everybody said Google's winning, but in reality, the consumers are winning because we're watching this battle play out.

ほとんどの人がGoogleが勝っていると言いましたが、実際には消費者が勝者です。なぜなら、私たちはこの戦いの展開を見守っているからです。

And because this battle's playing out, we're getting cooler and cooler tools by the day.

そして、この戦いが展開されているおかげで、日々より素晴らしいツールが登場しています。

You know, there's a lot of great AI video tools out there right now, tools like Sora and Veo.

現在、SoraやVeoのような素晴らしいAIビデオツールが多く存在しています。

They're all very promising, but they're only gonna get you a few second clips and it doesn't come with audio.

どれも非常に有望ですが、数秒のクリップしか得られず、音声も付いていません。

They're not really ideal for making longer video content.

より長い動画コンテンツを作るのには実際には適していません。

That's why for today's video, I partnered with InVideo AI.

そのため、今日の動画ではInVideo AIと提携しました。

It's a single tool that will generate scenes for you, compile them into a full video, add audio for you, and basically do the entire process of creating a short video with AI for you all in a single platform.

シーンを生成し、それらを完全な動画にまとめ、音声を追加し、基本的にAIを使用して短い動画を作成する全過程を1つのプラットフォームで行うことができる単一のツールです。

So check this out.

では、これをご覧ください。

With InVideo's new version 3.0, you can give it a prompt of whatever video you want, and it will create the entire thing for you.

InVideoの新しいバージョン3.0では、望む動画のプロンプトを与えるだけで、すべてを作成してくれます。

I'm talking generate the script for you, generate all of the videos inside of that script for you, add sound effects, add music, and even do the voiceover in your own voice if you want.

スクリプトの生成、そのスクリプトに基づく全ての動画の生成、効果音の追加、音楽の追加、さらに希望すれば自分の声でのナレーション録音まで行ってくれます。

They have simple templates that make it really easy, like create a short video, you just follow the instructions here, an explainer video, animated films, and a ton of other pre-built workflows that you can choose from so you're not starting from scratch at all.

短い動画の作成など、ここの指示に従うだけの簡単なテンプレート、説明動画、アニメーション映画、その他多くの事前構築されたワークフローが用意されているので、まったくのゼロからスタートする必要はありません。

And of course, you can use your own script if you've already got something written up.

もちろん、すでに何か書き上げている場合は、自分のスクリプトを使用することもできます。

Let's say you need an ad for your latest course about how to learn piano.

ピアノの学習方法に関する最新のコースの広告が必要だとしましょう。

Let's go ahead and use this template for an animated ad here.

ここでこのアニメーション広告用のテンプレートを使ってみましょう。

We'll tell it to create a 30 second animated advertisement for YouTube about a beginner's course on learning to play the piano.

ピアノ初心者コースについての30秒のYouTube用アニメーション広告を作成するように指示します。

Now, if you want, you can select longer videos, you can select different platforms, you can select to use only generated clips, only generated images, or only stock footage.

必要に応じて、より長い動画を選んだり、異なるプラットフォームを選んだり、生成されたクリップのみ、生成された画像のみ、またはストック映像のみを使用するように選択できます。

Let's go ahead and use generated clips and we can set the style.

生成されたクリップを使用して、スタイルを設定してみましょう。

Let's do it in a Disney Pixar style.

ディズニーピクサースタイルでやってみましょう。

Now I can go through and select the type of background music, which language I want it to be in, whether I want it to have subtitles, I could pick specific voice actors, add a watermark, select my music preference, or I could just let InVideo figure all that stuff out for me.

バックグラウンドミュージックのタイプ、使用言語、字幕の有無、特定の声優の選択、ウォーターマークの追加、音楽の好みの選択ができますし、もしくはこれらすべてをInVideoに任せることもできます。

So let's see what it comes up with for me for my beginner's piano course.

では、私の初心者ピアノコースについて何を作ってくれるか見てみましょう。

I'll click continue here and we can see it pre-generated a prompt for me.

ここで続行をクリックすると、事前に生成されたプロンプトを確認できます。

Create a 30 second animated advertisement for YouTube about a beginner's course on learning to play the piano.

ピアノ学習初心者コースについての30秒のYouTubeアニメーション広告を作成。

Use only generated clips.

生成されたクリップのみを使用。

Use a Disney Pixar style.

ディズニーピクサースタイルを使用。

Let's leave it that simple and generate the video.

そのシンプルな状態のまま、動画を生成してみましょう。

Now this does take a little bit of time because it is writing the script, generating all of the clips for it, adding the voiceover, adding the background music.

スクリプトの作成、すべてのクリップの生成、ナレーションの追加、バックグラウンドミュージックの追加を行っているため、少し時間がかかります。

It's doing all of that behind the scenes from that single prompt.

それらすべてを単一のプロンプトから舞台裏で処理しています。

And after about 15 minutes of generating, here's the video ad that it created.

約15分の生成時間の後、作成された動画広告がこちらです。

Ever dreamed of playing the piano?

ピアノを弾くことを夢見たことはありませんか？

With Matt's online beginner piano course, that dream is just a keystroke away.

Mattのオンラインピアノ初心者コースで、その夢は一歩先に実現します。

Our fun lessons make learning easy.

楽しいレッスンで学習が簡単になります。

Play your first song in just one week.

たった1週間で最初の曲が弾けるようになります。

Join thousands who've discovered their inner musician.

内なる音楽家を見出した何千人もの人々の仲間入りをしましょう。

Enroll now at mattsbeginnerpiano.com and unlock the melody within you.

mattsbeginnerpiano.comで今すぐ登録して、あなたの中にある音楽性を解き放ちましょう。

I really liked the generation.

生成結果は本当に気に入りました。

The video of it was really good, but the music was almost like this weird, creepy music.

動画は本当に良かったのですが、音楽はどこか奇妙で不気味な感じでした。

And so if we want to change that, I can just give it a prompt, like change the background music to a more upbeat, happy piano song.

そこでこれを変更したい場合は、「バックグラウンドミュージックをよりアップビートで明るいピアノ曲に変更」というようなプロンプトを与えるだけでいいのです。

We'll click generate again.

もう一度生成をクリックします。

And within seconds, we have a new version.

そして数秒で新しいバージョンができあがります。

Ever dreamed of playing the piano? With Matt's online beginner piano...

ピアノを弾くことを夢見たことはありませんか？Mattのオンラインピアノ初心者...

Much better.

ずっと良くなりました。

But you know what?

でも、知っていますか？

I actually have my voice trained in InVideo as well.

実は私の声もInVideoに学習させてあるんです。

Let's change the narration to my voice.

ナレーションを私の声に変更してみましょう。

I'll just give it the prompt, change the narration to Matt's voice and click generate.

「ナレーションをMattの声に変更」というプロンプトを与えて、生成をクリックするだけです。

About a minute later, I've got another variation.

約1分後、別のバージョンができあがりました。

Ever dreamed of playing the piano?

ピアノを弾くことを夢見たことはありませんか？

With Matt's online beginner piano course, that dream is just a keystroke away.

Mattのオンラインピアノ初心者コースで、その夢は一歩先に実現します。

Much better.

ずっと良くなりました。

And this is just scratching the surface of what I can do within InVideo.

これはInVideoでできることのほんの一部に過ぎません。

I can give it a prompt to change pretty much any element I want about the video, or I can click on edit, regenerate any single scene from my video, replace it with stock footage, upload my own video, upload my own music or choose music from their stock music library, change anything about the script and who's actually narrating the script, and so much more.

動画のほぼすべての要素を変更するプロンプトを与えたり、編集をクリックして動画の任意のシーンを再生成したり、ストック映像に置き換えたり、自分の動画をアップロードしたり、自分の音楽をアップロードしたり、ストック音楽ライブラリから音楽を選んだり、スクリプトや実際のナレーターに関するあらゆる要素を変更したりなど、さらに多くのことができます。

Unlike most of the other video platforms that are available right now, you can control pretty much every single element of the output of your video with InVideo AI.

現在利用可能な他のほとんどの動画プラットフォームとは異なり、InVideo AIを使用すると、動画出力のほぼすべての要素を制御できます。

I can even change the whole video to another language if I want and have my ad running to another entire demographic.

必要に応じて動画全体を別の言語に変更し、全く異なる層向けに広告を展開することもできます。

So if you wanna try one of the lowest barrier to entry video tools and one of the most customizable video tools on the market right now, head on over to invideo.io and get started today.

現在市場で最も参入障壁が低く、かつ最もカスタマイズ可能な動画ツールの1つを試してみたい場合は、invideo.ioにアクセスして今すぐ始めてください。

Thank you so much to InVideo for sponsoring this video.

本動画のスポンサーであるInVideoに心から感謝いたします。

If you're a coder, we got a big announcement out of GitHub this week: GitHub Copilot now has a free tier that you can use directly inside of VS Code.

プログラマーの方々へ、今週GitHubから大きな発表がありました。GitHub CopilotがVS Code内で直接使用できる無料プランを導入したのです。

If we take a look at the plans here on the GitHub Copilot page, their free plan allows up to 50 uses per month and their pro plan at only $10 per month allows unlimited uses.

GitHub Copilotのプランページを見ると、無料プランでは月50回まで使用でき、月額わずか10ドルのプロプランでは無制限に使用できます。

For context, Cursor is 20 bucks a month and Devin is $500 per month to use.

参考までに、Cursorは月20ドル、Devinは月500ドルかかります。

As far as code completions, you can get up to 2000 per month for free or unlimited on their $10 a month plan.

コード補完に関しては、無料で月2000回まで、または月額10ドルのプランで無制限に利用できます。

So this is big news for coders who are on an extreme budget and only wanna use free tools.

これは、極めて予算が限られていて無料ツールのみを使用したいプログラマーにとって大きなニュースです。

And since we're talking about GitHub, which is owned by Microsoft, let's talk about a couple of the things that Microsoft rolled out this week, including real-time translation inside of live captions.

GitHubの話が出たついでに、MicrosoftがGitHubを所有していることから、今週Microsoftが発表したいくつかの機能について話しましょう。その中にはライブキャプション内のリアルタイム翻訳も含まれています。

Now you do have to have a Copilot+ PC for this to work, but you can essentially watch videos in any language and it will subtitle it.

これを使用するにはCopilot+ PCが必要ですが、基本的にどの言語の動画でも字幕付きで視聴できます。

And it even works with like video calls.

これはビデオ通話でも機能します。

You can be talking on a Zoom call with somebody else and it will translate in real time what they're saying directly on your screen.

Zoomで誰かと通話している際、相手の発言がリアルタイムで画面上に翻訳されて表示されます。

And Bing Image Creator also got an update this week, using the latest DALL-E model, called PR16.

また、Bing画像クリエイターも今週、PR16と呼ばれる最新のDALL-Eモデルを使用してアップデートされました。

It allows users to create images twice as fast and you can create images directly from within the Bing search bar if you want.

ユーザーは2倍の速さで画像を作成でき、必要に応じてBing検索バーから直接画像を作成することができます。

They've also made it much easier to share your images once they're generated.

生成された画像の共有も、より簡単になりました。

But if I go over to bing.com and tell it to create an image of a monkey on roller skates for me, notice that right at the top, Bing Image Creator starts to create the image.

bing.comにアクセスして、ローラースケートを履いた猿の画像を作成するように指示すると、上部でBing画像クリエイターが画像の作成を開始することに注目してください。

I don't know if it's bogged down or something right now, but every time I'm trying to generate an image, I'm running into an error.

現在システムに負荷がかかっているのかもしれませんが、画像を生成しようとするたびにエラーが発生しています。

Hopefully that's something that Microsoft will work out, but supposedly you'll be able to do this right here inside of the Bing search.

Microsoftが解決してくれることを願っていますが、理論的にはBing検索内で直接これができるはずです。

The Meta Ray-Ban glasses got an update this week as well with real-time video.

MetaのRay-Banグラスも今週、リアルタイムビデオ機能でアップデートされました。

So very similar to what we saw with Project Astra, and even with the new ChatGPT with vision, now your glasses can actually watch a real-time feed of what's going on around you.

Project Astraや新しいChatGPT with visionで見たものと非常によく似ていて、今やあなたのメガネで周囲の出来事をリアルタイムで見ることができます。

It's also going to allow you to have real-time conversations that are translated between languages.

また、言語間でリアルタイムに翻訳された会話をすることも可能になります。

So it says Live AI lets wearers continuously converse with Meta's AI assistant, Meta AI, to reference things they discussed earlier in the conversation.

Live AIを使用すると、着用者はMetaのAIアシスタントであるMeta AIと継続的に会話を行い、以前の会話で議論した内容を参照できるとのことです。

You no longer have to say, hey Meta to get it to start up.

起動するために「hey Meta」と言う必要がなくなりました。

It says Live AI also works with real-time video.

Live AIはリアルタイムビデオでも機能すると述べています。

Wearers can ask questions about what they're seeing in real time.

着用者はリアルタイムで見ているものについて質問することができます。

For example, what's around their neighborhood.

例えば、近所に何があるかなどです。

It's basically OpenAI's Advanced Voice Mode with vision or Project Astra, just built into your Ray-Bans.

基本的にはOpenAIのビジョン対応Advanced Voice ModeやProject Astraが、あなたのRay-Banに組み込まれているようなものです。

I haven't gotten this update yet in my Ray-Bans, but I am excited to test this stuff out.

私のRay-Banにはまだこのアップデートが来ていませんが、これらの機能をテストするのが楽しみです。

It's also got the ability to say Shazam the song and whatever song is going on around you, it'll tell you what song that is.

また、「Shazam the song」と言うと、周りで流れている曲が何かを教えてくれる機能も備えています。

So some cool new features that are supposedly rolling out now.

このように、現在いくつかのクールな新機能が展開されているとのことです。

Since we're talking about Meta, Instagram started teasing their new AI video editing tool.

Metaの話が出たついでに、Instagramが新しいAIビデオ編集ツールのティーザーを開始しました。

You know, Meta has an AI video generator and I was talking about it on a recent podcast with Tim from Theoretically Media and we couldn't figure out what Meta was gonna use this AI video tool for.

ご存知の通り、MetaはAIビデオジェネレーターを持っており、最近Theoretically MediaのTimとのポッドキャストでこれについて話していましたが、Metaがこのビデオツールをどのように使用するつもりなのか分かりませんでした。

Well, Adam Mosseri here, the CEO of Instagram, kind of gave us a hint.

そこで、InstagramのCEOであるAdam Mosseriが、ヒントを与えてくれました。

We're working on some really exciting AI tools for you video creators out there.

私たちは、動画クリエイターの皆さんのために、とてもエキサイティングなAIツールを開発中です。

A lot of you make amazing content that makes Instagram what it is and we wanna give you more tools to help realize your ideas.

多くの方々がInstagramを形作る素晴らしいコンテンツを作成しており、私たちはそのアイデアの実現を助けるためのより多くのツールを提供したいと考えています。

And you should be able to do anything you want with your videos.

そして、あなたは動画で望むことは何でもできるようになるはずです。

You should be able to change your outfit or change the context in which you're sitting or add a chain, whatever you can think of.

服装を変えたり、座っている場所の背景を変えたり、チェーンを追加したり、思いつくことは何でもできるようになるはずです。

Now, all that you're seeing here are some of the early research models that we've been exploring.

現在ご覧いただいているのは、私たちが探求している初期の研究モデルの一部です。

And we're hoping to bring some of these into the Instagram app next year.

そして、これらの一部を来年Instagramアプリに導入することを期待しています。

So keep an eye out, more to come.

今後の展開にご期待ください、さらに多くのものが来ます。

All right, continuing on with the topic of AI video, Pika rolled out their new 2.0 model this week.

さて、AIビデオの話題に続いて、Pikaは今週新しい2.0モデルをリリースしました。

And they have this really cool feature called scene ingredients.

そして、シーンイングレディエンツという本当にクールな機能があります。

If I log into my Pika account, you can see there's a new button here that says ingredients and it allows you to upload multiple scene ingredients, including characters, objects, and locations.

Pikaアカウントにログインすると、イングレディエンツという新しいボタンがあり、キャラクター、オブジェクト、場所など、複数のシーン要素をアップロードできます。

It's almost like that Whisk that we just saw from Google, but for AI video.

これは、先ほどGoogleで見たWhiskのようなものですが、AIビデオ用です。

Let's say I really like this jacket here.

例えば、ここにあるこのジャケットが気に入ったとします。

I could pull in an image of myself and then the image of that jacket, and then give it a prompt like "a man with a beard walking in a blue and orange jacket."

自分の画像とそのジャケットの画像を取り込み、「青とオレンジのジャケットを着てひげを生やした男性が歩いている」というようなプロンプトを与えます。

Ideally, I'll get what looks like me walking with this jacket on.

理想的には、このジャケットを着て歩いている自分のような映像が得られるはずです。

And well, I got the guy walking, but it looks to be the original guy.

まあ、男性が歩いている映像は得られましたが、元の男性のようです。

It didn't really swap the face.

顔の入れ替えは実際にはできませんでした。

I do need to play with this Pika 2 a little bit more.

このPika 2をもう少し使いこなす必要がありますね。

My last attempt at using ingredients also turned out kind of wonky.

前回のイングレディエンツを使用した試みも、ちょっと変な結果になりました。

If you were watching my live stream, you saw me get this result.

ライブストリームを見ていた方は、この結果を見たと思います。

But I'm sure there'll be people in the comments letting me know what I'm doing wrong and how to actually make this new Pika feature work.

でも、コメント欄で私が何を間違えているのか、この新しいPikaの機能を実際にどうやって使えばいいのかを教えてくれる人がいるはずです。

Because when I see some of these other ones that other people have done, like making themselves fly or putting themselves next to the Girl with a Pearl Earring, they all seem to come out really, really good, which just makes me think that I just need to get better at prompting it.

というのも、他の人が作成した例、例えば自分が空を飛んでいたり、「真珠の耳飾りの少女」の隣に自分を配置したりしているものを見ると、どれも本当に素晴らしい出来栄えなので、私はただプロンプトの書き方を上達させる必要があるんだと思います。

Oh, and by the way, if you want to try out Pika, they're actually making it free for the next few days.

そうそう、Pikaを試してみたい方に朗報ですが、実は今後数日間は無料で使えます。

They put out this post on X: "A giant pre-holiday gift from the Pika team."

PikaチームはXにこんな投稿をしました。「Pikaチームからの、休暇前の大きな贈り物です。」

We're giving everyone free unlimited access to Pika 2.0 from today until December 22nd.

本日から12月22日まで、誰でもPika 2.0に無制限で無料アクセスできます。

Anyone on any plan can generate as many videos as they want using all the scene ingredients they want while it's cold outside at pika.art.

外が寒い間、pika.artで、どのプランのユーザーでも好きなだけビデオを生成でき、好きなだけシーンイングレディエンツを使用できます。

Pretty cool of Pika.

Pikaのかなりクールな対応ですね。

The video model Kling also got an update this week.

ビデオモデルのKlingも今週アップデートがありました。

Here on X, they said, we're excited to introduce the Kling AI 1.6 model.

Xで彼らは、「Kling AI 1.6モデルの導入を発表できることを嬉しく思います」と述べています。

With the update, we have significantly improved the response to prompt, the visual aesthetics, and the physical actions, hoping to bring more consistent and vivid results.

このアップデートで、プロンプトへの応答、視覚的な美しさ、物理的な動きを大幅に改善し、より一貫性のある鮮やかな結果をもたらすことを目指しています。

So this new model has improved prompt adherence, more consistent and dynamic results.

この新モデルは、プロンプトへの適合性が向上し、より一貫性のあるダイナミックな結果が得られます。

It supports standard and professional modes, achieving a 190% overall improvement compared with Kling 1.5 and has a 720p model.

標準モードとプロフェッショナルモードをサポートし、Kling 1.5と比較して全体的に190%の改善を達成し、720pモデルを備えています。

Now Kling isn't one that I've played around with a ton, but we did again make a podcast recently with Tim from Theoretically Media, which will be out within the next week or so, where we actually break down all of the various AI video tools and compare them against each other.

私はKlingをそれほど多く使用していませんが、最近Theoretically MediaのTimとポッドキャストを収録し、来週あたりに公開予定です。そこで実際にさまざまなAIビデオツールを分析し、相互に比較しています。

And we put them all on a tier list.

そして、それらすべてをティアリストにまとめました。

And if I remember correctly, Tim put Kling in A-tier, so he finds it to be one of the better AI video models.

記憶が正しければ、TimはKlingをAティアに位置づけており、より優れたAIビデオモデルの1つだと評価しています。

Since we're talking about AI video companies, Runway announced the Runway Talent Network, which is a new media platform where you can discover and hire people that are actively using AI to create art and video for whatever project you need help with.

AIビデオ企業の話題に関連して、RunwayはRunway Talent Networkを発表しました。これは、プロジェクトに必要なアートやビデオの作成にAIを積極的に活用している人材を発見し、雇用できる新しいメディアプラットフォームです。

This Odyssey company showed off their generative world model.

このOdyssey社は彼らの生成的世界モデルを披露しました。

They also announced that Ed Catmull, one of the co-founders of Pixar, is on their board.

また、ピクサーの共同創設者の一人であるエド・キャットムルが彼らの取締役会に参加していることも発表しました。

So more people from the entertainment industry are entering the AI industry on boards of these various AI companies.

このように、エンターテインメント業界からより多くの人々が、様々なAI企業の取締役会を通じてAI業界に参入しています。

But check out this generative world model here.

しかし、この生成的世界モデルをご覧ください。

It appears to generate what looks kind of like a NeRF or a Gaussian splat, all from a prompt.

プロンプトから、NeRFやガウシアンスプラットのようなものを生成しているように見えます。

You can see a 1970s interior destroyed by snow and it generated this.

雪で破壊された1970年代のインテリアを見ることができ、これが生成されました。

An office interior from the 2000s.

2000年代のオフィスインテリア。

An underground workshop with a muscle car covered in a white cloth.

白い布で覆われたマッスルカーのある地下作業場。

And these prompts are generating these 3D spaces here.

そして、これらのプロンプトがここでこれらの3D空間を生成しています。

And then even cooler, here's this outdoor scene, a street corner with a classic American home wide shot.

そしてさらにクールなのは、この屋外シーン、クラシックなアメリカンホームのワイドショットのある街角です。

And it generated this sort of 3D video here: a large archway with bright LED windows, a street in London with a brick wall, a small room overflowing with plants, a creepy haunted house lit by a lamp.

そしてここで、明るいLEDウィンドウのある大きなアーチ道、レンガ壁のあるロンドンの通り、植物であふれる小さな部屋、ランプに照らされた不気味なお化け屋敷といった3Dビデオを生成しました。

Like this is pretty ridiculous.

これはかなり驚異的です。

This is so cool.

これはとてもクールです。

This one even has people in it, a family in the kitchen and it's snowing and it knows that the snow needs to be outside and not in the kitchen, slow moving lava, a serene coral reef.

これには人も登場し、キッチンにいる家族と雪が降っていて、雪はキッチンの中ではなく外にあるべきということを理解しています。そしてゆっくりと流れる溶岩や、穏やかなサンゴ礁も。

I mean, there's a lot of examples here.

つまり、ここには多くの例があります。

I will make sure it's linked up below, but check it out.

下にリンクを貼っておきますので、ぜひチェックしてみてください。

So it is using a Gaussian splat here, and they actually managed to load it into Unreal Engine.

ここではガウシアンスプラットを使用しており、実際にUnreal Engineに読み込むことに成功しています。

So this one looks like a scene that they generated with their tool, loaded into Unreal Engine, and then probably loaded this person in inside of Unreal Engine.

これは彼らのツールで生成したシーンをUnreal Engineに読み込み、そして恐らく、この人物をUnreal Engine内で読み込んだもののように見えます。

Now, if you want to get your hands on it, you do have to apply for early access.

さて、これを試してみたい場合は、アーリーアクセスに申し込む必要があります。

I did apply myself.

私も申し込みました。

And as soon as I get my hands on it, I'm going to show it off because this looks amazing.

手に入り次第、皆さんにお見せするつもりです。これは本当にすごそうなので。

And in the world of even more cool research, I came across this Genesis project this week.

さらに素晴らしい研究の世界では、今週このGenesis（ジェネシス）プロジェクトに出会いました。

It's a generative physics engine that's able to create 4D dynamic worlds.

4次元のダイナミックな世界を作り出すことができる生成物理エンジンです。

And this one, you just need to see the project webpage to see what it's capable of here.

このプロジェクトに何ができるのかは、プロジェクトのウェブページをぜひ実際に見てみてください。

It did this character motion generation here.

ここではこのようなキャラクターの動きを生成しました。

Check out this prompt, a miniature Wukong holding a stick in his hand sprints across the table surface for three seconds, then jumps into the air and swings his right arm downward landing.

このプロンプトを見てください。「手に棒を持った小さな孫悟空が3秒間テーブルの表面を走り、空中に飛び上がり、右腕を振り下ろしながら着地する。」

The camera begins with a closeup of his face, then steadily follows the character while gradually zooming out.

カメラは彼の顔のクローズアップから始まり、徐々にズームアウトしながら安定してキャラクターを追跡します。

When the monkey leaps into the air at the highest point of the jump, the motion pauses for a few seconds.

猿が空中にジャンプし、最高点に達したとき、動きが数秒間止まります。

The camera circles around the character for 360 degrees and slowly ascends before the action resumes.

アクションが再開される前に、カメラはキャラクターの周りを360度回転し、ゆっくりと上昇します。

That was the prompt they gave it.

これが彼らが与えたプロンプトでした。

And you were just watching that video as I was reading it.

私がそれを読んでいる間、あなたはちょうどそのビデオを見ていたんですね。

Holy crap.

すごい。

Here's some more.

他にもあります。

A Japanese samurai performs boxing.

日本の侍がボクシングを披露します。

A Chinese soldier performs Gangnam style dance.

中国の兵士がカンナムスタイルのダンスを披露します。

A Roman soldier walks forward like a zombie.

ローマの兵士がゾンビのように前進します。

And the hope is to be able to use these models to help train robots, it appears.

そして、これらのモデルをロボットの訓練に使用できるようにすることが期待されているようです。

It says, Genesis aims to use generative robotic agent and physics engine to automatically generate robotic policies and demonstration data for various skills under different scenarios.

Genesisは生成的ロボットエージェントと物理エンジンを使用して、さまざまなシナリオ下での様々なスキルに対するロボットの方針とデモンストレーションデータを自動生成することを目指していると述べています。

Here's some other generations of like chocolate going on a candy bar or something.

ここではチョコレートがキャンディバーに載っているような他の生成例があります。

I wish I actually knew the prompts for these, but like this is like a sponge or something twisting and then eventually breaking, some fruit falling, some chocolate melting, a space station as the camera floats through it, a school of fish.

これらのプロンプトを実際に知りたかったのですが、スポンジのようなものがねじれて最終的に壊れたり、果物が落ちたり、チョコレートが溶けたり、カメラが宇宙ステーションを浮遊しながら通り抜けたり、魚の群れがいたりします。

I mean, some of the stuff that it's showing that it could generate is pretty mind-blowing.

つまり、生成できると示されているものの一部は本当に驚くべきものです。

And look at how many people were involved in creating this project.

そして、このプロジェクトの作成に関わった人の数を見てください。

This is absolutely wild.

これは本当に驚くべきことです。

The paper for it isn't out yet, it doesn't appear, but it's being released as open source under the Apache 2.0 license.

論文はまだ公開されていないようですが、Apache 2.0ライセンスのオープンソースとして公開されています。

And then this demo here is kinda going viral.

そして、このデモがある意味バイラルになっています。

So check out this prompt.

このプロンプトを見てください。

"A water droplet drops onto a beer bottle and then slowly slides down along the bottle's surface."

「水滴がビール瓶に落ち、ゆっくりとボトルの表面に沿って滑り落ちます。」

And here's the generation.

そして、これが生成結果です。

It's showing a Heineken bottle and you can see it follow that water droplet down the bottle.

ハイネケンのボトルが表示され、その水滴がボトルを伝って落ちていくのが見えます。

Here's three more prompts.

さらに3つのプロンプトがあります。

Camera revolves around the bottle while gradually moving downward.

カメラが徐々に下方に移動しながらボトルの周りを回転します。

A close-up shot from above tracking the motion of the droplet and camera is positioned on the right side of the scene and tracks the droplet motion.

上方からの水滴の動きを追跡するクローズアップショットと、シーンの右側に配置されたカメラが水滴の動きを追跡します。

It's creating the same video from three different views.

3つの異なる視点から同じビデオを作成しています。

Camera view A, B, and C and it generates all three of these with the Heineken bottle following the water droplet from three separate camera angles.

カメラビューA、B、Cがあり、3つの異なるカメラアングルから水滴を追跡するハイネケンボトルのすべてを生成します。

Visualize all the forces applied to the droplet and its velocity.

水滴にかかるすべての力とその速度を可視化します。

And now it's showing the visualization of that droplet with these arrows showing the effects of gravity, velocity, friction, and the support force.

そして今、重力、速度、摩擦、支持力の効果を示す矢印で、その水滴の可視化を表示しています。

It then zooms into microscopic mode.

その後、顕微鏡モードにズームインします。

Like this is just absolutely wild.

これは本当に信じられないほどすごいです。

I am completely blown away by what I'm seeing.

私は見ているものに完全に圧倒されています。

And they also go on to say it is cross-platform.

さらに、これはクロスプラットフォームだとも述べています。

So it will work if you have an NVIDIA GPU, an AMD GPU, Apple M series chip, or an Intel GPU or CPU.

つまり、NVIDIA GPU、AMD GPU、Apple Mシリーズチップ、またはIntel GPUやCPUがあれば動作します。

It'll work on anything.

どんなものでも動作します。

I have no idea how to use this yet, but I'm going to figure it out because this is something I want to play with more than anything I've come across recently, but I've got to move on from my nerd out fest here.

まだ使い方は分かりませんが、最近出会ったもので一番遊びたいものなので、これを理解しようと思います。でも、このオタク的な興奮から先に進まなければなりません。

Let's talk about some AI image news.

AIイメージに関するニュースについて話しましょう。

Leonardo got an update this week, which yes, I must disclaim.

Leonardoが今週アップデートされました。はい、私は開示しなければなりません。

I am an advisor to Leonardo.

私はLeonardoのアドバイザーです。

So I do own a tiny sliver of equity, but they have some cool new features.

そのため、私は少しの株式を所有していますが、彼らにはいくつかの素晴らしい新機能があります。

They have a few new models.

彼らには新しいモデルがいくつかあります。

There's a brand new Phoenix 1.0 model, which is an improved model over their 0.9 model.

0.9モデルから改良された全く新しいPhoenix 1.0モデルがあります。

And they've also rolled in Flux.

そして、Fluxも組み込まれました。

So if you really like the realism of the Flux model, you can now use Leonardo to generate with Flux.

そのため、Fluxモデルのリアリズムが本当に好きなら、今はLeonardoを使用してFluxで生成できます。

So if I switch to the Flux model and give it a prompt like "a man standing on the ceiling upside down," I can generate images with the realism of Flux.

そのため、Fluxモデルに切り替えて「天井に逆さまに立っている男性」のようなプロンプトを与えると、Fluxのリアリズムで画像を生成できます。

And now switching over to the Phoenix 1.0 model, let's do just a man standing on the ceiling.

そして今、Phoenix 1.0モデルに切り替えて、単に「天井に立っている男性」をやってみましょう。

I'll get rid of the upside down.

「逆さま」は取り除きます。

That's kind of redundant.

それは少し冗長です。

Let's see what it gives us.

何が出てくるか見てみましょう。

We get some slightly more cartoony looking, but still aesthetically pleasing men standing on the ceiling.

少しよりカートゥーン調だけれど、それでも美的に魅力的な天井に立つ男性たちが得られます。

Now, I probably should have put upside down if I actually wanted to get the upside down like this one, but you get an idea of what it can generate here.

このような逆さまを実際に得たかったなら、おそらく「逆さま」を入れるべきでしたが、ここでどのようなものが生成できるかわかると思います。

Midjourney also got a new update with their mood boards this week.

Midjourneyは今週、ムードボード機能の新しいアップデートを行いました。

And this is essentially a way to train Midjourney on a specific style that you wanna see it generate over and over again.

これは基本的に、あなたが繰り返し生成したい特定のスタイルでMidjourneyを訓練する方法です。

So if you head over to midjourney.com and log in and click on personalize, you get this option for create mood board.

midjourney.comにアクセスしてログインし、パーソナライズをクリックすると、ムードボード作成のオプションが表示されます。

So let's create a mood board here.

それでは、ここでムードボードを作成してみましょう。

Once you're in here, you can upload images or add from a link or add from a gallery and then generate images that reference the images you trained into your mood board.

ここでは、画像をアップロードしたり、リンクから追加したり、ギャラリーから追加したりして、ムードボードで訓練した画像を参照して新しい画像を生成することができます。

It's kind of like Midjourney's version of a LoRA.

これは、いわばMidjourneyバージョンのLoRAのようなものです。

Let's go ahead and pull in a bunch of images of my face here and see if I can actually generate based on my look.

では、私の顔の画像をたくさん取り込んで、私の外見をベースに生成できるか試してみましょう。

Let's just call this wolf as the mood board and we'll click copy code.

このムードボードを「wolf」と名付けて、コードをコピーしましょう。

And this is the code that it gives us that we can now add to any prompt.

これが与えられたコードで、どのプロンプトにも追加できるようになりました。

So I'll copy that code.

そのコードをコピーします。

Let's go to create here.

では、ここで作成してみましょう。

And let's just say a man looking out a window along with my little mood board code.

「窓の外を見つめる男性」というプロンプトに、私のムードボードコードを追加してみましょう。

And well, it sort of followed the style of the images, but it didn't really make it look like me.

まあ、画像のスタイルはある程度踏襲していますが、実際に私に似せることはできていませんでした。

So it looks like it's more for like the color and composition of the images, but not yet to be able to train it on like your own face.

どうやら、これは画像の色や構図のためのものであり、まだ自分の顔を学習させるようなものではないようです。

But that's mood boards.

以上がムードボードについてです。

Magnific rolled out a new AI image model called SuperReal, which is designed to create ultra realistic images directly inside of Magnific.

Magnificは、Magnific内で直接超リアルな画像を作成するように設計された「SuperReal」という新しいAI画像モデルを公開しました。

Krea rolled out some new features allowing you to edit images and do some more kind of impressive inpainting and even pull in other images and put jackets on people and things like that.

Kreaは、画像の編集や印象的なインペインティング、さらには他の画像を取り込んで人物にジャケットを着せるなどの新機能をリリースしました。

So they actually put like a Balenciaga jacket on this person and put them in Apple Vision Pro goggles by pulling in images of those things and masking out areas.

実際に、それらのアイテムの画像を取り込んで領域をマスクすることで、この人物にバレンシアガのジャケットを着せ、Apple Vision Proのゴーグルを装着させています。

ElevenLabs just rolled out a new text-to-speech model called Flash, which generates text-to-speech in 75 milliseconds.

ElevenLabsは、75ミリ秒でテキスト読み上げを生成する「Flash」という新しいテキスト読み上げモデルをリリースしました。

So we're gonna be able to have really real-time speech with AI because of models like this, where I can ask AI a question and the response is gonna be near instant because these text-to-speech models are getting so dang fast.

このようなモデルのおかげで、AIに質問をすると、テキスト読み上げモデルが非常に高速になっているため、ほぼ瞬時に応答が得られる、本当のリアルタイムの会話が可能になるでしょう。

This week, researchers at Florida Atlantic University developed an artificial intelligence system that can actually read sign language.

今週、フロリダ大西洋大学の研究者たちが、手話を実際に読み取ることができる人工知能システムを開発しました。

So theoretically, you can be on like a Zoom call with somebody who's signing to you and actually get the translation of what they're signing.

理論的には、手話で話しかけてくる相手とZoomで通話をしながら、その手話の翻訳をリアルタイムで得ることができます。

So that seems like a pretty cool advancement and good use of AI.

これは非常にクールな進歩であり、AIの良い活用方法だと思われます。

The robotics company Apptronik partnered with Google DeepMind this week.

ロボット企業のApptronikは今週、Google DeepMindとパートナーシップを結びました。

The AI-powered humanoid robotics company announced it entered a strategic partnership agreement with Google DeepMind.

このAIを活用したヒューマノイドロボット企業は、Google DeepMindと戦略的パートナーシップ契約を締結したことを発表しました。

It'll bring together the best-in-class artificial intelligence with cutting-edge hardware and embodied intelligence.

これにより、最高クラスの人工知能と最先端のハードウェア、そして実体化された知能が統合されることになります。

I'm a big fan of Google DeepMind and not just because they flew me out to London and showed me their offices and let me experiment with a lot of their stuff, but because I think they actually do really cool work as well.

私はGoogle DeepMindの大ファンです。ロンドンに招待されてオフィスを見せてもらい、多くの製品を試させてもらったからというだけでなく、彼らが実際にとてもクールな仕事をしていると思うからです。

So excited to see how this partnership plays out over time.

このパートナーシップが時間とともにどのように展開されていくのか、とても楽しみです。

And finally, NVIDIA this week unveiled a miniature supercomputer.

そして最後に、NVIDIAは今週、ミニチュアスーパーコンピュータを発表しました。

Our brand new AI computer, the little tiny Jetson Nano, little Orin computer.

私たちの新しいAIコンピュータ、小さなJetson Nano、小さなOrinコンピュータです。

The thing that's really amazing is that a long time ago, starting with Xavier, you guys might know that we created a brand new type of processor.

本当に驚くべきことは、かなり前にXavierから始まって、皆さんもご存知かもしれませんが、私たちが全く新しいタイプのプロセッサを作ったことです。

It was a robotics processor.

それはロボティクス・プロセッサでした。

Nobody understood what we were building at the time.

当時は私たちが何を作っているのか、誰も理解していませんでした。

And we imagined that someday these deep learning models would evolve and we would have robots of all kinds.

そして私たちは、いつかこれらのディープラーニングモデルが進化し、あらゆる種類のロボットが生まれるだろうと想像していました。

Everything that moves would be robotic.

動くものすべてがロボット化されるだろうと。

And now here we are, we're seeing all kinds of amazing robots, robots on wheels, robots on legs, two legs, three legs.

そして今、私たちはあらゆる種類の素晴らしいロボットを目にしています。車輪のロボット、脚のロボット、二足歩行、三足歩行のロボットなどです。

And of course, general humanoid robotics are nearly upon us.

そしてもちろん、汎用ヒューマノイドロボットの時代がもうすぐ訪れようとしています。

This is a brand new Jetson Nano Super, almost 70 trillion operations per second, 25 watts and $249.

これは新しいJetson Nano Superで、1秒間に約70兆回の演算、消費電力25ワット、価格249ドルです。

It runs everything that the HGX does, even runs Large Language Models.

HGXができることはすべて実行でき、大規模言語モデルも動かすことができます。

And can't wait for all of you to try it.

皆さんに試していただけるのが待ち遠しいです。

It's available everywhere.

どこでも入手可能です。

Enjoy robotics.

ロボティクスをお楽しみください。

So as you can see, that's designed to power robots and it's under 300 bucks.

ご覧の通り、これはロボットを動かすために設計されており、価格は300ドル以下です。
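To put the numbers quoted in the clip in perspective, here is a rough back-of-the-envelope sketch in Python. It treats "almost 70 trillion operations per second" as a flat 70 TOPS, which is an approximation taken from the spoken figure and not an official spec; the variable names and the `per_unit` helper are made up for this illustration.

```python
# Rough efficiency figures from the numbers quoted in the keynote clip.
# "Almost 70 trillion operations per second" is rounded to 70 TOPS here;
# this is an assumption for illustration, not an official specification.
APPROX_TOPS = 70.0   # trillions of operations per second (approximate)
POWER_WATTS = 25.0   # quoted power draw
PRICE_USD = 249.0    # quoted price


def per_unit(total: float, units: float) -> float:
    """Return total divided by units, rejecting a zero divisor."""
    if units == 0:
        raise ValueError("units must be non-zero")
    return total / units


tops_per_watt = per_unit(APPROX_TOPS, POWER_WATTS)
tops_per_dollar = per_unit(APPROX_TOPS, PRICE_USD)

print(f"{tops_per_watt:.2f} TOPS per watt")
print(f"{tops_per_dollar:.3f} TOPS per dollar")
```

Under these assumptions that works out to a little under 3 TOPS per watt, which is the kind of figure that matters for a battery-powered robot.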

Super exciting times we're living in.

私たちは非常にエキサイティングな時代に生きています。

As Two Minute Papers says, what a time to be alive.

Two Minute Papersが言うように、「なんて素晴らしい時代に生きているのでしょう」。

That's all I got for you today.

今日はこれで以上です。

I'm imagining things are actually going to slow down the week of Christmas next week, but in the grand scheme of things, the AI world continues to speed up and we're getting more and more huge announcements by the day.

来週のクリスマス週は実際にペースが落ちると想像していますが、大局的に見るとAIの世界は加速し続けており、日々より多くの重大な発表がなされています。

I can't wait to see what 2025 holds.

2025年にどんなことが待っているのか、楽しみでなりません。

There's going to be some big stuff, some stuff I've already been able to see, but I'm not allowed to talk about yet.

大きな動きがいくつも控えていて、中には私がすでに見せてもらったものの、まだ話すことが許されていないものもあります。

It's going to be a fun year.

楽しい一年になるでしょう。

Hopefully you're hanging out with me on this channel for the ride.

このチャンネルで一緒に旅を楽しんでいただければと思います。

I'm going to be at CES in a couple of weeks showing off all the latest, greatest tech from that event.

数週間後にCESに参加し、そのイベントの最新かつ最高のテクノロジーをすべてお見せする予定です。

So if you're not already, make sure you like this video, subscribe to this channel.

まだの方は、このビデオにいいねを押し、このチャンネルを購読してください。

I'll make sure really cool AI tech videos show up in your YouTube feed.

あなたのYouTubeフィードに、とてもクールなAIテック動画が表示されるようにします。

Also, I want to mention real quick, I am doing live streams every single Monday at 11 AM Pacific time.

また、手短にお伝えしたいのですが、毎週月曜日の午前11時（太平洋時間）にライブ配信を行っています。

I'm calling it the AI sandbox where we're going to get in, talk about the latest AI news that happened over the past week.

これをAIサンドボックスと呼んでおり、ここで先週発生した最新のAIニュースについて話し合います。

We're going to play with the tools.

私たちはツールを使って遊びます。

I'm going to get suggestions on some of the tools.

いくつかのツールについて提案をもらう予定です。

I'm going to show you early access to some of the tools, and we're just going to have fun with AI for a couple hours.

いくつかのツールの先行アクセスをお見せし、数時間AIで楽しく過ごす予定です。

We're going to do it live, and you can ask me anything, and I will share advice, share tools, help you with automations in your business, all sorts of cool stuff.

ライブで行い、何でも質問していただけます。アドバイスやツールの共有、ビジネスの自動化のサポートなど、さまざまなクールなことをお手伝いします。

The last one we did was a blast.

前回行ったものは大好評でした。

People loved it.

みんな気に入ってくれました。

You can go watch the replay right now, and we're doing it again this coming Monday at 11 a.m., and ideally every Monday at 11 a.m. Pacific.

今すぐリプレイを見ることができ、今度の月曜日の午前11時に再び開催します。理想的には毎週月曜日の午前11時（太平洋時間）に行います。

So thank you once again for tuning in.

それでは、ご視聴いただき、重ねてお礼申し上げます。

I really, really appreciate you.

本当に、本当に感謝しています。

I will see you in the next video.

次の動画でお会いしましょう。

Bye-bye.

さようなら。
