I want to generate articles on AWS S3 whenever I post content to microCMS

September 16, 2024 16:59

※ Note ※
This is a personal memo, so it's rough.

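0. Current setup

I believe I wrote at the end of last year that I had built a diary page with microCMS and S3.

At the moment, the S3 side only holds a script that calls the microCMS API to fetch the articles; the diary content itself is not stored there.
Every page access calls the microCMS API, so it consumes data transfer (the free plan allows 20GB/month).

This is the current setup. API Gateway and Lambda sit in between because I don't want the API key to be visible.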
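For reference, the key-hiding proxy described above might look roughly like the following. This is a sketch rather than the function actually deployed: the environment variable names (MICROCMS_SERVICE, MICROCMS_ENDPOINT, MICROCMS_API_KEY) and the build_request helper are hypothetical names chosen for illustration. The X-MICROCMS-API-KEY header is how the microCMS content API is authenticated.

```python
import os
import urllib.request


def build_request(service, endpoint, api_key):
    """Build the microCMS content API request (helper name is illustrative)."""
    url = f"https://{service}.microcms.io/api/v1/{endpoint}"
    return urllib.request.Request(url, headers={"X-MICROCMS-API-KEY": api_key})


def lambda_handler(event, context):
    # Hypothetical environment variables; the API key never reaches the browser
    req = build_request(
        os.environ["MICROCMS_SERVICE"],
        os.environ["MICROCMS_ENDPOINT"],
        os.environ["MICROCMS_API_KEY"],
    )
    with urllib.request.urlopen(req, timeout=5) as res:
        body = res.read().decode("utf-8")
    # Pass the JSON from microCMS straight through to the caller
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": body,
    }
```

Because every page view goes through this function, every page view also counts against the microCMS data transfer quota, which is the issue the rest of this memo tries to avoid.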

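Even in the middle of the month the data transfer is only about 0.02GB, so there is nothing to worry about on that front, but
I do have a few concerns...

1. Concerns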

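No article is generated when microCMS does not respond
- If the API is down when someone visits the page, no articles are shown at all (though I have never seen it go down so far)
I can't cancel microCMS
- I don't plan to, but if I cancelled, every article would become invisible
Data transfer (20GB/month)
- Not a worry right now, but if I hit the limit I would have to pay a fairly high monthly fee

So I'd like to set things up so that posting an article in microCMS creates the article (stores it in S3). As a trial.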

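2. Implementation

What I want to do

Store the article (html) in S3 via the webhook that fires when an article is posted in microCMS
Update the article list page
Do the same on update/delete (update/delete both the article and the list)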
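The three requirements above boil down to a small dispatch on the webhook's type field. The sketch below uses a sample payload containing only the fields the Lambda function in this post reads (id, type, contents.new.publishValue); the values are made up, and plan_actions is a hypothetical helper for illustration, not part of the deployed code.

```python
import json

# Sample payload with only the fields used in this post (values are made up)
sample_event = {
    "body": json.dumps({
        "id": "abc123",
        "type": "new",
        "contents": {
            "new": {
                "publishValue": {
                    "title": "Hello",
                    "content": "<p>First post</p>",
                    "publishedAt": "2024-09-16T07:59:00.000Z",
                }
            }
        },
    })
}


def plan_actions(event):
    """Return (article_action, list_action, s3_key) for a webhook event."""
    body = json.loads(event["body"])
    key = f"article/{body['id']}.html"
    actions = {
        "new": ("put", "add"),          # create the page, add a list entry
        "edit": ("put", "edit"),        # overwrite the page, fix the title
        "delete": ("delete", "delete"), # remove the page and the list entry
    }
    article_action, list_action = actions[body["type"]]
    return article_action, list_action, key


print(plan_actions(sample_event))
```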

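Prerequisites

The S3 bucket is already set up as a static website
The following article list page is stored in that S3 bucket
- The article list page (the Lambda function below reads it from the key article/pages.html)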

<!DOCTYPE html>
<html lang="ja">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>記事一覧</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            background-color: #f4f4f9;
            color: #333;
            margin: 0;
            padding: 20px;
        }
        h1 {
            text-align: center;
            color: #444;
        }
        ul {
            list-style-type: none;
            padding: 0;
            max-width: 600px;
            margin: 20px auto;
        }
        li {
            background-color: #fff;
            margin: 10px 0;
            padding: 15px;
            border-radius: 8px;
            box-shadow: 0 0 10px rgba(0, 0, 0, 0.1);
        }
        a {
            text-decoration: none;
            color: #007acc;
            font-weight: bold;
            display: block;
        }
        a:hover {
            color: #005f99;
        }
    </style>
</head>
<body>
    <h1>記事一覧</h1>
    <ul>
        <!-- Links are added automatically by the Lambda function -->
    </ul>
</body>
</html>

Create a Lambda function with the following settings

Function name
- Anything is fine
Runtime
- Python

The function is as follows

import json
import boto3
from botocore.exceptions import ClientError
from datetime import datetime, timezone, timedelta

s3 = boto3.client('s3')
# S3 bucket name
bucket_name = 'your-bucket-name'
# Key of the article list file
pages_file = 'article/pages.html'

def lambda_handler(event, context):
    body = json.loads(event['body'])
    # Event type: new / edit / delete
    content_type = body['type']
    # Content ID
    content_id = body['id']
    # Key of the generated page (here: <content ID>.html)
    file_name = f"article/{content_id}.html"

    # When a new article is created
    if content_type == "new":
        # Published content
        content = body['contents']['new']['publishValue']
        # Title
        title = content['title']

        # Convert UTC to JST
        utc_time = datetime.strptime(content['publishedAt'], '%Y-%m-%dT%H:%M:%S.%fZ')
        jst_time = utc_time.replace(tzinfo=timezone.utc).astimezone(timezone(timedelta(hours=9)))
        jst_time_str = jst_time.strftime('%Y-%m-%d %H:%M:%S')

        # Build the HTML, including the publication time
        html_content = f"""
        <html>
        <head>
            <meta charset="UTF-8">
            <title>{title}</title>
        </head>
        <body>
            <h1>{title}</h1>
            {content['content']}
            <p>公開日時: {jst_time_str}</p>
        </body>
        </html>
        """
        # Store it in the S3 bucket
        s3.put_object(
            Bucket=bucket_name,
            Key=file_name,
            Body=html_content,
            ContentType='text/html'
        )
        update_pages_file(content_id, title, 'add')

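    # When an article is updated
    elif content_type == "edit":
        content = body['contents']['new']['publishValue']
        title = content['title']

        # Read the publication time back from the existing file.
        # rfind is used because the marker is written at the end of the page,
        # so the same text inside the article body is not picked up by mistake.
        existing_html = s3.get_object(Bucket=bucket_name, Key=file_name)['Body'].read().decode('utf-8')
        start_index = existing_html.rfind("公開日時: ") + len("公開日時: ")
        end_index = existing_html.find("</p>", start_index)
        jst_time_str = existing_html[start_index:end_index]

        # Build the HTML, including the publication time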
        html_content = f"""
        <html>
        <head>
            <meta charset="UTF-8">
            <title>{title}</title>
        </head>
        <body>
            <h1>{title}</h1>
            {content['content']}
            <p>公開日時: {jst_time_str}</p>
        </body>
        </html>
        """
        # Store it in the S3 bucket
        s3.put_object(
            Bucket=bucket_name,
            Key=file_name,
            Body=html_content,
            ContentType='text/html'
        )
        update_pages_file(content_id, title, 'edit')

    # When an article is deleted
    elif content_type == "delete":
        # Delete it from the S3 bucket
        s3.delete_object(Bucket=bucket_name, Key=file_name)
        update_pages_file(content_id, '', 'delete')
    
    return {
        'statusCode': 200,
        'body': json.dumps('Success')
    }

# Update the article list page
def update_pages_file(content_id, title, action):
    try:
        response = s3.get_object(Bucket=bucket_name, Key=pages_file)
        pages_html = response['Body'].read().decode('utf-8')
    # Create it if it does not exist
    except ClientError as e:
        if e.response['Error']['Code'] == 'NoSuchKey':
            pages_html = "<html><body><h1>記事一覧</h1><ul></ul></body></html>"
        else:
            raise

    start = pages_html.find("<ul>")
    end = pages_html.find("</ul>")
    list_items = pages_html[start+4:end]
    
    # On create, add the article to the top of the list
    if action == 'add':
        new_item = f'<li><a href="{content_id}.html">{title}</a></li>\n'
        list_items = new_item + list_items

    # On edit, replace the title for the same content ID (the title may have changed)
    elif action == 'edit':
        old_item_start = list_items.find(f'<a href="{content_id}.html">')
        new_link = f'<a href="{content_id}.html">{title}</a>'
        if old_item_start == -1:
            # Not listed yet: add it, since replacing an empty match would corrupt the list
            list_items = f'<li>{new_link}</li>\n' + list_items
        else:
            old_item_end = list_items.find("</a>", old_item_start) + 4
            list_items = list_items[:old_item_start] + new_link + list_items[old_item_end:]

    # On delete
    elif action == 'delete':
        # Search by ID only and remove the matching link
        list_items = remove_link_by_id(list_items, content_id)
    
    updated_html = pages_html[:start+4] + list_items + pages_html[end:]
    
    # Store the edited html
    s3.put_object(
        Bucket=bucket_name,
        Key=pages_file,
        Body=updated_html,
        ContentType='text/html'
    )

# I no longer remember the details around here, but it mostly works
def remove_link_by_id(list_items, content_id):
    start = list_items.find(f'<a href="{content_id}.html">')
    if start == -1:
        return list_items
    end = list_items.find("</a>", start) + 4
    item_start = list_items.rfind("<li>", 0, start)
    item_end = list_items.find("</li>", end) + 5
    
    # Also remove the newlines before and after the item
    if item_start > 0 and list_items[item_start-1] == '\n':
        item_start -= 1
    if item_end < len(list_items) and list_items[item_end] == '\n':
        item_end += 1

    return list_items[:item_start] + list_items[item_end:]

My memory of that last part, the one that runs when an article is deleted, is honestly hazy... ()
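Since the deletion helper is the part I'm least sure about, here is a small local check of what it does. The function body is copied from above so the snippet runs on its own; the sample list items are made up.

```python
def remove_link_by_id(list_items, content_id):
    start = list_items.find(f'<a href="{content_id}.html">')
    if start == -1:
        return list_items
    end = list_items.find("</a>", start) + 4
    item_start = list_items.rfind("<li>", 0, start)
    item_end = list_items.find("</li>", end) + 5

    # Also remove the newlines before and after the item
    if item_start > 0 and list_items[item_start - 1] == '\n':
        item_start -= 1
    if item_end < len(list_items) and list_items[item_end] == '\n':
        item_end += 1

    return list_items[:item_start] + list_items[item_end:]


items = (
    '<li><a href="a1.html">First</a></li>\n'
    '<li><a href="b2.html">Second</a></li>\n'
)

# Removing the first entry leaves only the second one
print(repr(remove_link_by_id(items, "a1")))
# An unknown ID leaves the list untouched
print(remove_link_by_id(items, "zzz") == items)
```

One thing this shows: when the removed item has a newline on both sides, both are stripped, so removing a middle entry would join its neighbours onto one line. That only affects the formatting of the HTML source, not the rendered list.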

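Add the following to the Lambda execution role
- It grants permissions on one specific S3 bucket. s3:ListBucket is included because without it a GetObject on a missing key returns AccessDenied instead of NoSuchKey, and the function above checks for NoSuchKey.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:DeleteObject"
            ],
            "Resource": "arn:aws:s3:::your-bucket-name/*"
        },
        {
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::your-bucket-name"
        }
    ]
}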

Create a function URL with no authentication (public)
- Make absolutely sure it never leaks to the outside...
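Since anyone who learns a public function URL can call it, one extra safeguard is to verify a webhook signature inside the function. The sketch below is not part of the setup above and rests on assumptions: that the microCMS webhook is configured with a secret, that microCMS sends an HMAC-SHA256 hex digest of the request body in an X-MICROCMS-Signature header, and that the function URL event exposes header names in lowercase. Please confirm the header name and algorithm against the microCMS webhook documentation before relying on it. The helper names are hypothetical.

```python
import hashlib
import hmac


def is_valid_signature(raw_body, signature, secret):
    """Compare an HMAC-SHA256 hex digest of the raw body with the header value."""
    expected = hmac.new(
        secret.encode("utf-8"), raw_body.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # compare_digest avoids leaking how many characters matched
    return hmac.compare_digest(expected, signature or "")


def check_event(event, secret):
    # Assumption: the signature arrives in this header, with a lowercase name
    signature = event.get("headers", {}).get("x-microcms-signature")
    return is_valid_signature(event.get("body") or "", signature, secret)
```

In lambda_handler, a check like this would run before json.loads(event['body']), returning a 401 response when it fails.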

That's all for the AWS side

Next, configure the webhook in microCMS
Register the function URL obtained above
I believe I wrote about that around here!

3. Testing

↓ Post something like this, and

the article list is created, and

a page with a way-too-big image is created

Editing and deleting work too (images omitted)
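The page generation can also be checked locally, without S3 or a real webhook. The sketch below pulls the time conversion and HTML template out into two hypothetical helpers (to_jst_string and render_article are names introduced here, not in the Lambda function). It also escapes the title with html.escape, which the function above does not do; the article body is left as-is because it is already HTML.

```python
import html
from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=9))


def to_jst_string(published_at):
    """Convert a UTC timestamp like 2024-09-16T07:59:00.000Z to JST text."""
    utc_time = datetime.strptime(published_at, "%Y-%m-%dT%H:%M:%S.%fZ")
    jst_time = utc_time.replace(tzinfo=timezone.utc).astimezone(JST)
    return jst_time.strftime("%Y-%m-%d %H:%M:%S")


def render_article(publish_value):
    """Render the article HTML from a webhook publishValue dict."""
    title = html.escape(publish_value["title"])
    published = to_jst_string(publish_value["publishedAt"])
    return (
        "<html>\n<head>\n"
        '    <meta charset="UTF-8">\n'
        f"    <title>{title}</title>\n"
        "</head>\n<body>\n"
        f"    <h1>{title}</h1>\n"
        f"    {publish_value['content']}\n"
        f"    <p>公開日時: {published}</p>\n"
        "</body>\n</html>\n"
    )


# Made-up sample content
sample = {
    "title": "Tea & coffee",
    "content": "<p>Test post</p>",
    "publishedAt": "2024-09-16T07:59:00.000Z",
}
print(render_article(sample))
```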

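4. Pros and cons

Advantages (compared with the existing setup)
- Webhooks apparently don't count toward data transfer, so this saves data transfer
- The pages remain even if I cancel microCMS
Cons / things I'd like to improve
- The individual pages are far too raw as html
  - Though part of me feels that's fine
- S3 costs add up
  - The more articles there are, the higher the cost
- Images are (probably) still fetched from microCMS
  - Nothing to be done? I may need a way to store them somewhere myself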
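As a starting point for the image issue, the references could be rewritten at generation time so that pages point at copies stored next to the articles. The sketch below only finds the image URLs and rewrites them; actually downloading each image and uploading it to S3 is left out. It assumes images are served from images.microcms-assets.io, and the article/images/ layout and the helper names are hypothetical.

```python
import re
from urllib.parse import urlparse

IMG_SRC = re.compile(r'<img[^>]*?\ssrc="([^"]+)"')
ASSET_HOST = "images.microcms-assets.io"  # assumed microCMS image host


def find_microcms_images(content_html):
    """Return the unique microCMS-hosted image URLs in document order."""
    urls = []
    for url in IMG_SRC.findall(content_html):
        if urlparse(url).netloc == ASSET_HOST and url not in urls:
            urls.append(url)
    return urls


def localize_images(content_html, s3_prefix="article/", rel_dir="images/"):
    """Return (rewritten_html, {original_url: s3_key}) without downloading."""
    mapping = {}
    # Replace longer URLs first so a URL that is a prefix of another is safe
    for url in sorted(find_microcms_images(content_html), key=len, reverse=True):
        file_name = urlparse(url).path.rsplit("/", 1)[-1]
        mapping[url] = s3_prefix + rel_dir + file_name
        content_html = content_html.replace(url, rel_dir + file_name)
    return content_html, mapping
```

Using only the file name as the key is a simplification; two different images with the same file name would collide, so a real version would want part of the original path in the key.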

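So that's how I got it implemented!
Combining it with the article update Lambda could be fun too

By the way, whenever a page is updated,
a Lambda that clears the cache held by CloudFront also runs, but I'll write that up somewhere else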
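Until that write-up exists, here is a rough idea of what such a cache-clearing step can look like with boto3's create_invalidation. It is a sketch under assumptions, not the Lambda actually running: the distribution ID and paths are placeholders, and build_invalidation_batch and invalidate are hypothetical helper names.

```python
import time


def build_invalidation_batch(paths, caller_reference=None):
    """Build the InvalidationBatch argument for CloudFront."""
    return {
        "Paths": {"Quantity": len(paths), "Items": list(paths)},
        # CallerReference must be unique per request; a timestamp is enough here
        "CallerReference": caller_reference or str(time.time()),
    }


def invalidate(distribution_id, paths):
    import boto3  # imported here so the helper above runs without boto3

    cloudfront = boto3.client("cloudfront")
    return cloudfront.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )


# Placeholder usage: paths start with "/" and are relative to the distribution root
# invalidate("YOUR_DISTRIBUTION_ID", ["/article/pages.html", "/article/abc123.html"])
```

The execution role would also need permission for cloudfront:CreateInvalidation on the distribution.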
