Downloading Videos

October 17, 2024, 15:18

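What This Does

In lectures and other teaching activities, I sometimes want to use videos from YouTube and similar sites to explain something. There are plenty of online tools for downloading videos to a local machine, but the ads are annoying and downloading several videos one at a time is tedious. I wrote some Python code to make this easier, so here it is.

That said, be careful about copyright, for example by choosing CC-licensed videos!!! Using copyrighted works for teaching purposes also requires care, so always check before you download 💦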

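Script Overview

1. Read video URLs from a CSV file or a URL
• Specify either a single video URL or a CSV file containing multiple video URLs via command-line arguments.
• When a CSV file is used, every URL in it is processed in order.
2. Download the videos
• Downloads videos from services such as YouTube for the given URL (or the list of URLs in the CSV file).
• Audio-only downloads are also possible.
3. Download with a start and end time
• Instead of the whole video, you can download just a specific part (for example, from start time 00:02:00 to end time 00:05:00).
4. Show progress and report the result after processing
• Download progress is shown as a progress bar, followed by a completion message.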
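
To make step 3 concrete, here is a minimal sketch of how a time range such as 00:02:00 to 00:05:00 can be turned into seconds and sanity-checked. The helper names `parse_timestamp` and `clip_range` are hypothetical and are not part of the script below. The sketch also accepts the shorter MM:SS and SS forms, which is an assumption on my part.

```python
def parse_timestamp(text):
    """Convert 'HH:MM:SS', 'MM:SS', or 'SS' into a number of seconds."""
    parts = [int(p) for p in text.strip().split(":")]
    if not 1 <= len(parts) <= 3 or any(p < 0 for p in parts):
        raise ValueError(f"Invalid timestamp: {text!r}")
    seconds = 0
    for part in parts:
        # Each step shifts the running total up by one unit (hours -> minutes -> seconds)
        seconds = seconds * 60 + part
    return seconds


def clip_range(start, end):
    """Return (start_seconds, end_seconds), rejecting empty or reversed ranges."""
    start_s = parse_timestamp(start) if start else 0
    end_s = parse_timestamp(end) if end else float("inf")
    if end_s <= start_s:
        raise ValueError("End time must be later than start time")
    return start_s, end_s


print(clip_range("00:02:00", "00:05:00"))  # (120, 300)
```

Treating a missing start as 0 and a missing end as infinity means "from the beginning" and "to the end" need no special cases later on.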

Environment Setup

Install ffmpeg

# For macOS
brew install ffmpeg

# For Ubuntu
sudo apt update && sudo apt install ffmpeg -y

Create the Conda virtual environment

mkdir video_downloader
cd video_downloader
nano video_downloader.yml

# video_downloader.yml
name: video_downloader
channels:
  - defaults
  - conda-forge
dependencies:
  - python=3.12
  - yt-dlp
  - tqdm

conda env create -f video_downloader.yml

conda activate video_downloader
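
Once the environment is activated, a quick check like the following can confirm that everything is visible from Python. This is only a sketch using the standard library. The function name `check_environment` is hypothetical, and it only checks that the tools can be found, not that they work.

```python
import importlib.util
import shutil


def check_environment():
    """Report whether ffmpeg and the Python packages are visible from this interpreter."""
    return {
        # shutil.which looks for the executable on PATH
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # find_spec returns None when a top-level package is not installed
        "yt_dlp": importlib.util.find_spec("yt_dlp") is not None,
        "tqdm": importlib.util.find_spec("tqdm") is not None,
    }


for name, found in check_environment().items():
    print(f"{name}: {'ok' if found else 'MISSING'}")
```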

Script

import argparse
import os
import yt_dlp
import csv
from tqdm import tqdm

class DownloadProgress:
    def __init__(self):
        self.pbar = None

    def download_hook(self, d):
        if d['status'] == 'downloading':
            if self.pbar is None:
                total = d.get('total_bytes') or d.get('total_bytes_estimate')
                self.pbar = tqdm(total=total, unit='B', unit_scale=True, desc=d.get('filename'))
            downloaded = d.get('downloaded_bytes') or 0
            # yt-dlp reports absolute byte counts, while tqdm expects increments
            self.pbar.update(downloaded - self.pbar.n)
        elif d['status'] == 'finished':
            if self.pbar is not None:
                self.pbar.close()
                # Reset so the next file (e.g. a separate audio stream) gets its own bar
                self.pbar = None
            print("Download completed. Converting...")

def download_media(url, quality, format, audio_only, audio_quality, output_dir, start_time, end_time):
    progress = DownloadProgress()
    if audio_only:
        ydl_opts = {
            'format': 'bestaudio/best',
            'postprocessors': [{
                'key': 'FFmpegExtractAudio',
                'preferredcodec': format,
                'preferredquality': str(audio_quality),
            }],
            'outtmpl': os.path.join(output_dir, '%(title)s.%(ext)s'),
            'quiet': True,
            'no_warnings': True,
            'progress_hooks': [progress.download_hook],
        }
    else:
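        # Cap the height; fall back to the best available file if nothing matches
        height_filter = f'[height<={quality}]' if quality else ''
        ydl_opts = {
            'format': f'bestvideo{height_filter}+bestaudio/best{height_filter}/best',
            # Container used when separate video and audio streams are merged
            'merge_output_format': format,
            'outtmpl': os.path.join(output_dir, '%(title)s.%(ext)s'),
            'quiet': True,
            'no_warnings': True,
            'progress_hooks': [progress.download_hook],
        }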

    # Add time range options if specified
    if start_time or end_time:
        ydl_opts['download_ranges'] = download_range_func(start_time, end_time)
        # Re-encode around the cut points so the clip starts and ends precisely
        ydl_opts['force_keyframes_at_cuts'] = True

    # Error handling options
    ydl_opts.update({
        'ignoreerrors': True,
        'no_color': True,
        'geo_bypass': True,
        'nocheckcertificate': True,
        'extractor_args': {'youtube': {'skip': ['dash', 'hls']}},
    })

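    # With 'ignoreerrors' enabled, download() signals failure through a
    # non-zero return code instead of raising, so check the code explicitly.
    try:
        with yt_dlp.YoutubeDL(ydl_opts) as ydl:
            failed = ydl.download([url]) != 0
        if not failed:
            print(f"Media saved in {output_dir}")
            return
        print("Download failed. Trying alternative method...")
        # Options are read when YoutubeDL is constructed, so build a new instance
        fallback_opts = {**ydl_opts, 'format': 'bestvideo+bestaudio/best'}
        with yt_dlp.YoutubeDL(fallback_opts) as ydl:
            failed = ydl.download([url]) != 0
        if not failed:
            print(f"Media saved in {output_dir} using alternative method")
            return
    except Exception as e:
        print(f"An error occurred: {e}")
    print("Please try updating yt-dlp or check if the video is available in your region.")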

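def download_range_func(start_time, end_time):
    # A missing start means "from the beginning"; a missing end means "to the end"
    start_seconds = time_to_seconds(start_time) or 0
    end_seconds = time_to_seconds(end_time)
    if end_seconds is None:
        end_seconds = float('inf')

    def func(info_dict, ydl_obj):
        return [{
            'start_time': start_seconds,
            'end_time': end_seconds,
        }]
    return func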

def time_to_seconds(time_str):
    if time_str:
        h, m, s = map(int, time_str.split(':'))
        return h * 3600 + m * 60 + s
    return None

def read_urls_from_csv(file_path):
    """Reads a CSV file and extracts URLs from its first column."""
    urls = []
    # utf-8-sig also accepts files saved with a byte order mark (e.g. from Excel)
    with open(file_path, newline='', encoding='utf-8-sig') as csvfile:
        reader = csv.reader(csvfile)
        for row in reader:
            if row and row[0].strip():  # Skip empty rows and blank cells
                urls.append(row[0].strip())
    return urls

def main():
    parser = argparse.ArgumentParser(description="Video/Audio Downloader")
    parser.add_argument("url", nargs='?', help="URL of the video (if not using --file)")
    parser.add_argument("-q", "--quality", type=int, default=1080, 
                        choices=[1080, 720, 480, 360],
                        help="Maximum video quality (height in pixels). "
                             "Choices are 1080 (default), 720, 480, or 360. "
                             "The highest available quality not exceeding "
                             "this value will be downloaded. "
                             "Ignored if --audio-only is used.")
    parser.add_argument("-f", "--format", default="mp4",
                        help="Desired media format (default: mp4). "
                             "For video: mp4, webm, mkv. "
                             "For audio (with --audio-only): mp3, m4a, wav, etc.")
    parser.add_argument("-o", "--output", default=".", 
                        help="Output directory (default: current directory)")
    parser.add_argument("--audio-only", action="store_true",
                        help="Download audio only")
    parser.add_argument("--audio-quality", type=int, default=192,
                        help="Audio bitrate in kbps (default: 192). "
                             "Common values: 128, 192, 256, 320. "
                             "Only applicable with --audio-only.")
    parser.add_argument("--start-time", type=str, 
                        help="Start time of the video (format: HH:MM:SS)")
    parser.add_argument("--end-time", type=str, 
                        help="End time of the video (format: HH:MM:SS)")
    parser.add_argument("--file", type=str, 
                        help="Path to a CSV file containing URLs of videos to download")

    args = parser.parse_args()

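    # Expand a leading "~" (it is not expanded when quoted in the shell) and create the directory
    args.output = os.path.expanduser(args.output)
    os.makedirs(args.output, exist_ok=True)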

    if args.file:
        # If a file is specified, read URLs from the file and download each video
        urls = read_urls_from_csv(args.file)
        for url in urls:
            download_media(url, args.quality, args.format, args.audio_only, args.audio_quality, 
                           args.output, args.start_time, args.end_time)
    elif args.url:
        # If a URL is specified, download the single video
        download_media(args.url, args.quality, args.format, args.audio_only, args.audio_quality, 
                       args.output, args.start_time, args.end_time)
    else:
        print("Error: You must provide either a URL or a CSV file with the --file option.")
        parser.print_help()

if __name__ == "__main__":
    main()
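
The script prints a message for each URL as it goes. If you would rather get a single summary at the end of a CSV batch, one possible approach is sketched below. Note that `run_batch` and `download_one` are hypothetical names. `download_one` is assumed to be any callable that returns a truthy value on success, for instance a small wrapper that checks whether `YoutubeDL.download` returned 0.

```python
def run_batch(urls, download_one):
    """Call download_one(url) for each URL and collect successes and failures."""
    succeeded, failed = [], []
    for url in urls:
        try:
            ok = bool(download_one(url))
        except Exception as exc:
            # Keep going so that one bad URL does not stop the whole batch
            print(f"{url}: {exc}")
            ok = False
        (succeeded if ok else failed).append(url)
    print(f"{len(succeeded)} succeeded, {len(failed)} failed")
    return succeeded, failed


# Stand-in downloader for illustration: pretend every URL ending in "2" fails
ok_urls, bad_urls = run_batch(
    ["https://xxx.be/example1", "https://xxx.be/example2"],
    lambda url: not url.endswith("2"),
)
```

Returning the two lists makes it easy to write the failed URLs back out to a CSV file and retry only those later.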

Usage

python video_downloader.py --help
usage: video_downloader.py [-h] [-q {1080,720,480,360}] [-f FORMAT] [-o OUTPUT] [--audio-only] [--audio-quality AUDIO_QUALITY]
                           [--start-time START_TIME] [--end-time END_TIME] [--file FILE]
                           [url]

Video/Audio Downloader

positional arguments:
  url                   URL of the video (if not using --file)

options:
  -h, --help            show this help message and exit
  -q {1080,720,480,360}, --quality {1080,720,480,360}
                        Maximum video quality (height in pixels). Choices are 1080 (default), 720, 480, or 360. The highest
                        available quality not exceeding this value will be downloaded. Ignored if --audio-only is used.
  -f FORMAT, --format FORMAT
                        Desired media format (default: mp4). For video: mp4, webm, mkv. For audio (with --audio-only): mp3, m4a,
                        wav, etc.
  -o OUTPUT, --output OUTPUT
                        Output directory (default: current directory)
  --audio-only          Download audio only
  --audio-quality AUDIO_QUALITY
                        Audio bitrate in kbps (default: 192). Common values: 128, 192, 256, 320. Only applicable with --audio-only.
  --start-time START_TIME
                        Start time of the video (format: HH:MM:SS)
  --end-time END_TIME   End time of the video (format: HH:MM:SS)
  --file FILE           Path to a CSV file containing URLs of videos to download

Examples

Basic usage (default settings):
python video_downloader.py "https://www.XXX.com/ZZZ"

Batch download of multiple videos:
python video_downloader.py --file xxx.csv

CSV file example
https://xxx.be/example1
https://xxx.be/example2
https://xxx.be/example3
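
Real-world CSV files often come with a header row, blank lines, or duplicate entries. As a rough sketch of how such input could be cleaned up before downloading, the hypothetical helper `extract_urls` below keeps only first-column values that look like http(s) URLs. This filtering is an assumption of the sketch, not something the script itself does.

```python
import csv
import io


def extract_urls(csv_text):
    """Return first-column values that look like http(s) URLs, in order, without duplicates."""
    urls = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:  # csv.reader yields an empty list for a blank line
            continue
        candidate = row[0].strip()
        if candidate.startswith(("http://", "https://")) and candidate not in urls:
            urls.append(candidate)
    return urls


sample = "url\nhttps://xxx.be/example1\n\nhttps://xxx.be/example2\nhttps://xxx.be/example1\n"
print(extract_urls(sample))
# ['https://xxx.be/example1', 'https://xxx.be/example2']
```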

Download with a specified quality:
python video_downloader.py "https://www.XXX.com/ZZZ" -q 360
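
For reference, a height cap like `-q 360` can be expressed as a yt-dlp format selector string. The following is a minimal sketch of one way to build such a string. `build_format_selector` is a hypothetical helper, and this is just one reasonable selector among many.

```python
def build_format_selector(max_height=None):
    """Build a yt-dlp format selector that caps the video height when max_height is given."""
    height_filter = f"[height<={max_height}]" if max_height else ""
    # Prefer separate video+audio streams, then a single file, then anything available
    return f"bestvideo{height_filter}+bestaudio/best{height_filter}/best"


print(build_format_selector(360))
# bestvideo[height<=360]+bestaudio/best[height<=360]/best
```

The trailing `/best` acts as a last resort, so a video that has no stream at or below the requested height is still downloaded rather than skipped.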

Download in a specific format:
python video_downloader.py "https://www.XXX.com/ZZZ" -f webm

Download with a specified quality and format:
python video_downloader.py "https://www.XXX.com/ZZZ" -q 480 -f mp4

Download to a specific directory:
python video_downloader.py "https://www.XXX.com/ZZZ" -o ~/Downloads

Download audio only:
python video_downloader.py "https://www.XXX.com/ZZZ" --audio-only

Download audio only with a specific quality and format:
python video_downloader.py "https://www.XXX.com/ZZZ" --audio-only -f m4a --audio-quality 256

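Download a specific video from a playlist (pass the URL of the video itself; the script has no playlist-selection option):
python video_downloader.py "https://www.XXX.com/ZZZ"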

Download at the lowest quality (to save data):
python video_downloader.py "https://www.XXX.com/ZZZ" -q 360

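Download with a specific file name:
The script always saves files as <title>.<ext>, and -o only sets the output directory, so a yt-dlp output template cannot be passed on the command line. To change the naming, edit the 'outtmpl' value in download_media.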

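Summary

You can now batch-download videos from YouTube and similar sites efficiently. The script can also download just a specified time range, and with a CSV file it can process many videos in one run.

Automating the downloads cuts out the manual work and makes it easier to manage your videos. Hopefully it frees up time for research, hobbies, and other important things!

To repeat: be careful about copyright, for example by choosing CC-licensed videos!!! Using copyrighted works for teaching purposes also requires care, so always check before you download 💦

I'll keep introducing handy scripts and tools, so stay tuned! If you have questions or suggestions for improvement, please let me know in the comments.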