Animate a character from a single image: installing Talking-Head-Anime-3 and driving it smoothly with pose data (server edition)
This is the second half of the series: the server edition. Using the TalkingHeadAnimeface class built in the previous article, we construct a server with FastAPI. The biggest benefit of running a server is that it enables fully parallel processing. Whether the server runs on the same PC or on a separate machine, the client-side code is identical, so the system configuration stays flexible. The trade-off is communication overhead: some increase in generation time is unavoidable.
The environment is the one prepared in the previous article. For the server version, I prepared a class for use on the client side. It is compatible with the TalkingHeadAnimeface class and adds the communication layer for talking to the server.
The TalkingHeadAnimefaceInterface class
The available methods are the same as those of TalkingHeadAnimeface; a minimal usage sketch follows the list below.
get_init_dic: gets a template for dict-format pose data.
get_pose: converts pack-format pose data to the repository format.
get_pose_dic: converts dict-format pose data to the repository format.
load_img: uploads the character image to be used (the server registers it in VRAM).
inference: generates from repository-format pose data.
The character image is uploaded on every call.
inference_pos: generates an image from packed pose data.
The image must be preloaded.
inference_img: generates from repository-format pose data.
The image must be preloaded.
inference_dic: generates an image from dict-format pose data.
The image must be preloaded.
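As a minimal sketch of how these fit together (assuming the server described below is already running on port 8001, and using the test program's default image 000001.png):

# Minimal usage sketch: upload a character image, then render one dict-format pose.
from PIL import Image
from poser_client_v1_2_class import TalkingHeadAnimefaceInterface

thi = TalkingHeadAnimefaceInterface("http://127.0.0.1:8001")
user_id = 0
img_number = thi.load_img(Image.open("000001.png"), user_id)  # register the image on the server
pose_dic = thi.get_init_dic()   # fetch the dict-format pose template from the server
pose_dic["mouth"]["val"] = 0.7  # change only the entries you want to move
result, out_image = thi.inference_dic(pose_dic, img_number, user_id)
out_image.show()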
poser_api_v1_2_server
A FastAPI wrapper around TalkingHeadAnimeface.
An endpoint is provided for each method. Image data and pose data are serialized to bytes with pickle in both directions. An encoding such as Base64 would arguably be more appropriate here, but I deliberately left it unchanged to keep the client-side code uniform and because pickle is a proven approach in this setup.
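For reference, a Base64 variant of the image transfer could look like the sketch below. This is a hypothetical alternative, not the code this article actually uses.

# Hypothetical Base64 image transfer (this article's code uses pickle instead).
import base64, io
from PIL import Image

def encode_image_b64(img: Image.Image) -> str:
    buf = io.BytesIO()
    img.save(buf, format="PNG")  # PNG preserves the alpha channel
    return base64.b64encode(buf.getvalue()).decode("ascii")

def decode_image_b64(data: str) -> Image.Image:
    return Image.open(io.BytesIO(base64.b64decode(data))).convert("RGBA")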
Initialization and model loading
Simply instantiating TalkingHeadAnimeface runs initialization and model loading inside the class.
Tkh=TalkingHeadAnimeface()
Example endpoint
Generating an image from dict-format pose data
@app.post("/inference_dic/")
def inference_dic(pose:UploadFile = File(...),img_number:int= Form(...),user_id:int= Form(...)):
current_dic = pose.file.read()
current_pose_dic=(pickle.loads(current_dic))
out_image=Tkh.inference_dic(current_pose_dic,img_number,user_id)
#−−−−−生成画像を返信
images_data = pickle.dumps(out_image, 5) # tx_dataはpklデータ
return Response(content= images_data, media_type="application/octet-stream")
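On the wire this is an ordinary multipart POST, so the endpoint can also be called without the interface class. A minimal sketch, reusing pose_dic, img_number, and user_id from the sketch above:

# Minimal raw call to /inference_dic/ without the wrapper class.
import pickle
import requests

files = {"pose": ("pos.dat", pickle.dumps(pose_dic, 5), "application/octet-stream")}
data = {"img_number": img_number, "user_id": user_id}
response = requests.post("http://127.0.0.1:8001/inference_dic/", files=files, data=data)
out_image = pickle.loads(response.content)  # the PIL image comes back pickled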
Test program
This, too, is the same as when using TalkingHeadAnimeface directly.
Example: generating from packed pose data
# Sample 4: uses inference_pos(); pack format; image preloaded
if test=="4":
    input_image = Image.open(filename)
    input_image.show()
    img_number=Thi.load_img(input_image,user_id)
    packed_pose=["happy", [0.5,0.0], "wink", [1.0,0.0], [0.0,0.0], [0.0,0.0], "ooo", [0.0,0.0], [0.0,0.0], 0.0, 0.0,0.0, 0.0]
    result, out_image=Thi.inference_pos(packed_pose,img_number,user_id)
    out_image.show()
Example: generating an image from dict-format pose data
This is the same as TEST-7 in the version that uses TalkingHeadAnimeface directly.
# Sample 8: inference_dic(); calls the server directly with the dict format; image preloaded; only the selected dict entries are varied continuously
if test=="8":
    div_count=30
    input_image = Image.open(filename)
    imge = np.array(input_image)
    imge = cv2.cvtColor(imge, cv2.COLOR_RGBA2BGRA)
    cv2.imshow("image",imge)
    cv2.waitKey()
    img_number=Thi.load_img(input_image,user_id)
    pose_dic=pose_dic_org  # initial pose
    current_pose_list=[]
    for i in range(int(div_count/2)):
        start_time=time.time()
        current_pose_dic=pose_dic
        current_pose_dic["eye"]["menue"]="wink"
        current_pose_dic["eye"]["left"]=i/(div_count/2)
        current_pose_dic["head"]["y"]=i*3/(div_count/2)
        current_pose_dic["neck"]=i*3/(div_count/2)
        current_pose_dic["body"]["y"]=i*5/(div_count/2)
        start_time=time.time()
        result,imge = Thi.inference_dic(current_pose_dic,img_number,user_id)
        image_show(imge)
        print("Generation time=",(time.time()-start_time)*1000,"ms")
    for i in range(div_count):
        start_time=time.time()
        current_pose_dic["eye"]["left"]=1-i/(div_count/2)
        current_pose_dic["head"]["y"]=3-i*3/(div_count/2)
        current_pose_dic["neck"]=3-i*3/(div_count/2)
        current_pose_dic["body"]["y"]=5-i*5/(div_count/2)
        start_time=time.time()
        result,imge = Thi.inference_dic(current_pose_dic,img_number,user_id)
        image_show(imge)
        print("Generation time=",(time.time()-start_time)*1000,"ms")
    for i in range(div_count):
        start_time=time.time()
        current_pose_dic["eye"]["left"]=i/div_count
        current_pose_dic["eye"]["right"]=i/div_count
        current_pose_dic["head"]["y"]=-3+i*3/(div_count/2)
        current_pose_dic["neck"]=-3+i*3/(div_count/2)
        current_pose_dic["body"]["z"]=i*3/div_count
        current_pose_dic["body"]["y"]=-5+i*5/(div_count/2)
        start_time=time.time()
        result,imge = Thi.inference_dic(current_pose_dic,img_number,user_id)
        image_show(imge)
        print("Generation time=",(time.time()-start_time)*1000,"ms")
    for i in range(div_count):
        start_time=time.time()
        current_pose_dic["eye"]["left"]=0.0
        current_pose_dic["eye"]["right"]=0.0
        current_pose_dic["head"]["y"]=3-i*3/(div_count/2)
        current_pose_dic["neck"]=3-i*3/(div_count/2)
        current_pose_dic["body"]["z"]=3-i*3/div_count
        current_pose_dic["body"]["y"]=5-i*5/(div_count/2)
        start_time=time.time()
        result,imge = Thi.inference_dic(current_pose_dic,img_number,user_id)
        image_show(imge)
        print("Generation time=",(time.time()-start_time)*1000,"ms")
    image_show(imge)
    cv2.waitKey(5000)
Script to run test 8
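The server must be running first. Assuming the server code is saved as poser_api_v1_2_server.py (matching the heading above), start it with:

python poser_api_v1_2_server.py

Then run the client test: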
python poser_client_v1_2_test.py --host http://192.168.11.59:8001 --test 8
Generated video (a recording of the images generated in real time)
Generation-speed comparison
GPU: RTX 4060 Ti
Using the TalkingHeadAnimeface class directly: 22 ms
Using TalkingHeadAnimefaceInterface via localhost: 29 ms
Connecting from a client PC via a USB-LAN adapter (the client side also goes through a USB-LAN adapter): 38 ms
The LAN route is noticeably slower, adding roughly 16 ms over direct calls versus roughly 7 ms for localhost. I would like to switch to 2.5G Ethernet or similar here; in my experience that makes a considerable difference.
Full source code
Server-side source code
import torch
import time
import numpy as np
import pickle
import cv2
from PIL import Image
import argparse
from time import sleep
from poser_api_v1_2_class import TalkingHeadAnimeface
# =================================== FastAPI ==============================
from fastapi import FastAPI, File, UploadFile, Form, Query
from fastapi.responses import HTMLResponse,StreamingResponse,JSONResponse,Response
from pydantic import BaseModel

app = FastAPI()
Tkh=TalkingHeadAnimeface()

@app.post("/get_init_dic/")
def get_init_dic():
    pose_dic_org=Tkh.get_init_dic()
    org_dic = pickle.dumps(pose_dic_org,5)
    return Response(content= org_dic, media_type="application/octet-stream")

@app.post("/load_img/")
def load_img(image: UploadFile = File(...),user_id:int= Form(...)):
    image_data = image.file.read()
    image_data =(pickle.loads(image_data))  # restore the original object with pickle.loads
    image_data = image_data.convert("RGBA")
    img_number=Tkh.load_img(image_data,user_id)
    result="OK"
    print("img_number=",img_number)
    return {'message':result,"img_number":img_number}

@app.post("/inference_org/")
def inference_org(image:UploadFile = File(...),pose:UploadFile = File(...)):  # basic image generation; the image is uploaded on every call
    image_data = image.file.read()
    current_pose = pose.file.read()
    image_data =(pickle.loads(image_data))  # restore the original object with pickle.loads
    input_image = image_data.convert("RGBA")
    current_pose=(pickle.loads(current_pose))
    out_image=Tkh.inference(input_image,current_pose)
    #----- send back the generated image
    images_data = pickle.dumps(out_image, 5)  # pickle the image for transfer
    return Response(content= images_data, media_type="application/octet-stream")

@app.post("/inference_pos/")
def inference_pos(pose:UploadFile = File(...),img_number:int= Form(...),user_id:int= Form(...)):
    packed_pose = pose.file.read()
    packed_pose=(pickle.loads(packed_pose))
    print(packed_pose)
    out_image=Tkh.inference_pos(packed_pose,img_number,user_id)
    #----- send back the generated image
    images_data = pickle.dumps(out_image, 5)  # pickle the image for transfer
    return Response(content= images_data, media_type="application/octet-stream")

@app.post("/inference_dic/")
def inference_dic(pose:UploadFile = File(...),img_number:int= Form(...),user_id:int= Form(...)):
    current_dic = pose.file.read()
    current_pose_dic=(pickle.loads(current_dic))
    out_image=Tkh.inference_dic(current_pose_dic,img_number,user_id)
    #----- send back the generated image
    images_data = pickle.dumps(out_image, 5)  # pickle the image for transfer
    return Response(content= images_data, media_type="application/octet-stream")

@app.post("/inference_img/")
def inference_img(current_pose:list= Form(...),img_number:int= Form(...),user_id:int= Form(...)):
    current_pose = [float(item) for item in current_pose]
    out_image=Tkh.inference_img(current_pose,img_number,user_id)
    #----- send back the generated image
    images_data = pickle.dumps(out_image, 5)  # pickle the image for transfer
    return Response(content= images_data, media_type="application/octet-stream")

@app.post("/get_pose/")
def get_pose(pose:UploadFile = File(...)):
    pkl_pack= pose.file.read()
    pose_pack=(pickle.loads(pkl_pack))
    pose=Tkh.get_pose(pose_pack)
    pose_data = pickle.dumps(pose, 5)  # pickle the pose for transfer
    return Response(content= pose_data, media_type="application/octet-stream")

@app.post("/get_pose_dic/")
def get_pose_dic(pose:UploadFile = File(...)):
    pose= pose.file.read()
    pose_dic=(pickle.loads(pose))
    pose=Tkh.get_pose_dic(pose_dic)
    pose_data = pickle.dumps(pose, 5)  # pickle the pose for transfer
    return Response(content= pose_data, media_type="application/octet-stream")

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8001)
Client-side TalkingHeadAnimefaceInterface class
import numpy as np
import cv2
from PIL import Image
import time
import requests
import pickle

# Variations of the generation methods
#
# inference(self,input_img,current_pose):              # pose = repository format; the image is uploaded on every call
# inference_img(self,current_pose,img_number,user_id): # pose = repository format; image preloaded, multiple images supported
# inference_pos(self,packed_current_pose,img_number,user_id): # pose = pack format; image preloaded, multiple images supported
# inference_dic(self,current_dic,img_number,user_id):  # pose = dict format; image preloaded, multiple images supported
# Utility methods
# get_pose(self,pose_pack):  # pack format => repository format
# get_init_dic(self):        # get the initial values in dict format
# get_pose_dic(self,dic):    # dict format => repository format
# load_img(self,input_img,user_id):  # register an image in VRAM

class TalkingHeadAnimefaceInterface():
    def __init__(self,host):
        userid=0
        self.url=host

    def get_init_dic(self):
        response = requests.post(self.url+"/get_init_dic/")  # send the request
        if response.status_code == 200:
            pose_data = response.content
            org_dic =(pickle.loads(pose_data))  # restore the original object with pickle.loads
        return org_dic

    def get_pose(self,pose_pack):
        #----- pack format
        #0 eyebrow_dropdown: str : "troubled", "angry", "lowered", "raised", "happy", "serious"
        #1 eyebrow_left, eyebrow_right: float:[0.0,0.0]
        #2 eye_dropdown: str: "wink", "happy_wink", "surprised", "relaxed", "unimpressed", "raised_lower_eyelid"
        #3 eye_left, eye_right : float:[0.0,0.0]
        #4 iris_small_left, iris_small_right: float:[0.0,0.0]
        #5 iris_rotation_x, iris_rotation_y : float:[0.0,0.0]
        #6 mouth_dropdown: str: "aaa", "iii", "uuu", "eee", "ooo", "delta", "lowered_corner", "raised_corner", "smirk"
        #7 mouth_left, mouth_right : float:[0.0,0.0]
        #8 head_x, head_y : float:[0.0,0.0]
        #9 neck_z: float
        #10 body_y: float
        #11 body_z: float
        #12 breathing: float
        #
        # Example pose
        # pose=["happy",[0.5,0.0],"wink", [i/50,0.0], [0.0,0.0], [0.0,0.0],"ooo", [0.0,0.0], [0.0,i*3/50],i*3/50, 0.0, 0.0, 0.0]
        pose_pack_pkl = pickle.dumps(pose_pack, 5)
        files = {"pose":("pos.dat",pose_pack_pkl, "application/octet-stream")}  # passing a raw list raises an error
        response = requests.post(self.url+"/get_pose/", files=files)  # send the request
        if response.status_code == 200:
            pose_data = response.content
            pose =(pickle.loads(pose_data))  # restore the original object with pickle.loads
            result = response.status_code
        return pose

    def get_pose_dic(self,dic):
        # Sample dict format
        # "mouth" can be written in two ways; "lowered_corner" and "raised_corner" take left/right values
        #  "mouth":{"menue":"aaa","val":0.0},
        #  "mouth":{"menue":"lowered_corner","left":0.5,"right":0.0},  # this has almost no visible effect
        #
        #pose_dic={"eyebrow":{"menue":"happy","left":0.5,"right":0.0},
        #          "eye":{"menue":"wink","left":0.5,"right":0.0},
        #          "iris_small":{"left":0.0,"right":0.0},
        #          "iris_rotation":{"x":0.0,"y":0.0},
        #          "mouth":{"menue":"aaa","val":0.7},
        #          "head":{"x":0.0,"y":0.0},
        #          "neck":0.0,
        #          "body":{"y":0.0,"z":0.0},
        #          "breathing":0.0
        #          }
        current_dic = pickle.dumps(dic, 5)
        files = {"pose":("pos.dat",current_dic, "application/octet-stream")}  # passing a raw list raises an error
        response = requests.post(self.url+"/get_pose_dic/", files=files)  # send the request
        if response.status_code == 200:
            pose_data = response.content
            pose =(pickle.loads(pose_data))  # restore the original object with pickle.loads
            result = response.status_code
        return pose

    def load_img(self,input_img,user_id):
        print("load_img")
        images_data = pickle.dumps(input_img, 5)
        files = {"image": ("img.dat", images_data, "application/octet-stream")}
        data = {"user_id": user_id}
        response = requests.post(self.url+"/load_img/", files=files, data=data)  # send the request
        if response.status_code == 200:
            response_data = response.json()
            print("response_data =",response_data)
            img_number=response_data["img_number"]
        else:
            img_number=-1
        return img_number

    def inference(self,input_img,current_pose):  # basic image generation; the image is uploaded on every call
        start_time=time.time()
        images_data = pickle.dumps(input_img, 5)
        current_pose2 = pickle.dumps(current_pose, 5)
        files = {"image": ("img.dat",images_data, "application/octet-stream"),
                 "pose":("pos.dat",current_pose2, "application/octet-stream")}  # passing a raw list raises an error
        response = requests.post(self.url+"/inference_org/", files=files)  # send the request
        if response.status_code == 200:
            image_data = response.content
            image =(pickle.loads(image_data))  # restore the original object with pickle.loads
            result = response.status_code
        return result, image

    def inference_pos(self,packed_pose,img_number,user_id):  # image preloaded
        packed_pose = pickle.dumps(packed_pose, 5)
        files={"pose":("pos.dat",packed_pose, "application/octet-stream"),}
        data = {"user_id": user_id,"img_number":img_number,}
        response = requests.post(self.url+"/inference_pos/", files=files, data=data)  # send the request
        if response.status_code == 200:
            image_data = response.content
            image =(pickle.loads(image_data))  # restore the original object with pickle.loads
            result = response.status_code
        return result, image

    def inference_dic(self,current_dic,img_number,user_id):  # image preloaded
        data = {"img_number":img_number,"user_id": user_id,}
        current_dic2 = pickle.dumps(current_dic, 5)
        files={"pose":("pos.dat",current_dic2, "application/octet-stream")}  # passing a raw list raises an error
        response = requests.post(self.url+"/inference_dic/", data=data,files=files)  # send the request
        if response.status_code == 200:
            image_data = response.content
            image =(pickle.loads(image_data))  # restore the original object with pickle.loads
            result = response.status_code
        return result, image

    def inference_img(self,current_pose,img_number,user_id):  # generation with a preloaded image
        data = {"current_pose":current_pose,"img_number":img_number,"user_id": user_id,}
        response = requests.post(self.url+"/inference_img/", data=data)  # send the request
        if response.status_code == 200:
            image_data = response.content
            image =(pickle.loads(image_data))  # restore the original object with pickle.loads
            result = response.status_code
        return result, image
Client-side test program
import numpy as np
import pickle
import cv2
from PIL import Image
import argparse
from time import sleep
import time
from poser_client_v1_2_class import TalkingHeadAnimefaceInterface

# Display PIL-format images as a video stream
def image_show(imge):
    imge = np.array(imge)
    imge = cv2.cvtColor(imge, cv2.COLOR_RGBA2BGRA)
    cv2.imshow("Loaded image",imge)
    cv2.waitKey(1)

def main():
    print("TEST")
    parser = argparse.ArgumentParser(description='Talking Head')
    parser.add_argument('--filename','-i', default='000001.png', type=str)
    parser.add_argument('--test', default="0", type=str)
    parser.add_argument('--host', default='http://0.0.0.0:8001', type=str)
    args = parser.parse_args()
    test =args.test
    print("TEST=",test)
    filename =args.filename
    Thi=TalkingHeadAnimefaceInterface(args.host)
    user_id=0
    # Setting pose_dic_org: get it from the server
    pose_dic_org = Thi.get_init_dic()
    # Setting pose_dic_org: write it yourself
    #pose_dic_org={"eyebrow":{"menue":"happy","left":0.0,"right":0.0},
    #              "eye":{"menue":"wink","left":0.0,"right":0.0},
    #              "iris_small":{"left":0.0,"right":0.0},
    #              "iris_rotation":{"x":0.0,"y":0.0},
    #              "mouth":{"menue":"aaa","val":0.0},
    #              "head":{"x":0.0,"y":0.0},
    #              "neck":0.0,
    #              "body":{"y":0.0,"z":0.0},
    #              "breathing":0.0
    #              }
    #************************* convenient custom pose formats *****************************
    #----- pack format
    #0 eyebrow_dropdown: str : "troubled", "angry", "lowered", "raised", "happy", "serious"
    #1 eyebrow_left, eyebrow_right: float:[0.0,0.0]
    #2 eye_dropdown: str: "wink", "happy_wink", "surprised", "relaxed", "unimpressed", "raised_lower_eyelid"
    #3 eye_left, eye_right : float:[0.0,0.0]
    #4 iris_small_left, iris_small_right: float:[0.0,0.0]
    #5 iris_rotation_x, iris_rotation_y : float:[0.0,0.0]
    #6 mouth_dropdown: str: "aaa", "iii", "uuu", "eee", "ooo", "delta", "lowered_corner", "raised_corner", "smirk"
    #7 mouth_left, mouth_right : float:[0.0,0.0]
    #8 head_x, head_y : float:[0.0,0.0]
    #9 neck_z: float
    #10 body_y: float
    #11 body_z: float
    #12 breathing: float
    #
    # Example pose
    # pose=["happy",[0.5,0.0],"wink", [i/50,0.0], [0.0,0.0], [0.0,0.0],"ooo", [0.0,0.0], [0.0,i*3/50],i*3/50, 0.0, 0.0, 0.0]
    #
    #----- dict format
    # "mouth" can be written in two ways; "lowered_corner" and "raised_corner" take left/right values
    #  "mouth":{"menue":"aaa","val":0.0},
    #  "mouth":{"menue":"lowered_corner","left":0.5,"right":0.0},  # this has almost no visible effect
    #
    #pose_dic={"eyebrow":{"menue":"happy","left":0.5,"right":0.0},
    #          "eye":{"menue":"wink","left":0.5,"right":0.0},
    #          "iris_small":{"left":0.0,"right":0.0},
    #          "iris_rotation":{"x":0.0,"y":0.0},
    #          "mouth":{"menue":"aaa","val":0.7},
    #          "head":{"x":0.0,"y":0.0},
    #          "neck":0.0,
    #          "body":{"y":0.0,"z":0.0},
    #          "breathing":0.0
    #          }
    #************************* repository argument format ***********************
    #----- expanded current_pose format
    #current_pose= [
    #0  troubled_eyebrow_left=0.0,
    #1  troubled_eyebrow_right=0.0,
    #2  angry_eyebrow_left=0.0,
    #3  angry_eyebrow_right=0.0,
    #4  lowered_eyebrow_left=0.0,
    #5  lowered_eyebrow_right=0.0,
    #6  raised_eyebrow_left=0.0,
    #7  raised_eyebrow_right=0.0,
    #8  happy_eyebrow_left=0.0,
    #9  happy_eyebrow_right=0.02,
    #10 serious_eyebrow_left=0.0,
    #11 serious_eyebrow_right=0.0,
    #12 eye_wink_left=0.0,
    #13 eye_wink_right=0.0,
    #14 eye_happy_wink_left=0.0,
    #15 eye_happy_wink_right=0.0,
    #16 eye_surprised_left=0.0,
    #17 eye_surprised_right=0.0,
    #18 eye_relaxed_left=0.0,
    #19 eye_relaxed_right=0.0,
    #20 eye_unimpressed_left=0.0,
    #21 eye_unimpressed_right=0.0,
    #22 eye_raised_lower_eyelid_left=0.0,
    #23 eye_raised_lower_eyelid_right=0.0,
    #24 iris_small_left=0.0,
    #25 iris_small_right=0.0,
    #26 mouth_dropdown_aaa=0.0,
    #27 mouth_dropdown_iii=0.0,
    #28 mouth_dropdown_uuu=0.0,
    #29 mouth_dropdown_eee=0.0,
    #30 mouth_dropdown_ooo=0.0,
    #31 mouth_dropdown_delta=0.0,
    #32 mouth_dropdown_lowered_corner_left=0.0,
    #33 mouth_dropdown_lowered_corner_right=0.0,
    #34 mouth_dropdown_raised_corner_left=0.0,
    #35 mouth_dropdown_raised_corner_right=0.0,
    #36 mouth_dropdown_smirk=0.0,
    #37 iris_rotation_x=0.0,
    #38 iris_rotation_y=0.0,
    #39 head_x=0.0,
    #40 head_y=0.0,
    #41 neck_z=0.0,
    #42 body_y=0.0,
    #43 body_z=0.0,
    #44 breathing=0.0
    #]
if test=="0":
user_id=0
input_image = Image.open(filename)
input_image.show()
image_number = Thi.load_img(input_image,user_id)
print("image_number=",image_number)
#サンプル 1 ベタ書き リポジトリの形式 inference()を使用 イメージは毎回ロード
if test=="1": #inference()のテスト
input_image = Image.open(filename)
input_image.show()
#image_number = Thi.load_img(input_image,user_id)
current_pose = [0.0, 0.0,
0.0, 0.0,
0.0, 0.0,
0.0, 0.0,
0.0, 0.0,
0.0, 0.0,
0.0, 0.5,
0.0, 0.0,
0.0, 0.0,
0.0, 0.0,
0.0, 0.0,
0.0, 0.0,
0.0, 0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0, 0.0,
0.0, 0.0,
0.767,
0.566,
0.626,
0.747,
0.485,
0.444,
0.232,
0.646,
1.0]
result, out_image=Thi.inference(input_image,current_pose)
out_image.show()
#サンプル 2 inference()を使用 パック形式をリポジトリの形式に変換 イメージは毎回ロード #packed_pose=>current_pose2
if test=="2":
input_image = Image.open(filename)
input_image.show()
packed_pose=["happy", [0.5,0.0], "wink", [1.0,0.0], [0.0,0.0], [0.0,0.0], "ooo", [0.0,0.0], [0.0,0.0], 0.0, 0.0,0.0, 0.0]
current_pose2=Thi.get_pose(packed_pose) #packed_pose=>current_pose2
result, out_image=Thi.inference(input_image,current_pose2)
out_image.show()
#サンプル 3 inference()を使用 Dict形式をget_pose_dicdでリポジトリの形式に変換 イメージは毎回ロード
if test=="3":
input_image = Image.open(filename)
input_image.show()
#サンプル Dict形式
#"mouth"には2種類の記述方法がある"lowered_corner"と”raised_corner”は左右がある
# "mouth":{"menue":"aaa","val":0.0},
# "mouth":{"menue":"lowered_corner","left":0.5,"right":0.0}, これはほとんど効果がない
pose_dic={"eyebrow":{"menue":"happy","left":1.0,"right":0.0},
"eye":{"menue":"wink","left":0.5,"right":0.0},
"iris_small":{"left":0.0,"right":0.0},
"iris_rotation":{"x":0.0,"y":0.0},
"mouth":{"menue":"aaa","val":0.7},
"head":{"x":0.0,"y":0.0},
"neck":0.0,
"body":{"y":0.0,"z":0.0},
"breathing":0.0
}
pose=Thi.get_pose_dic(pose_dic)#Dic-> pose変換
print(pose)
result, out_image=Thi.inference(input_image,pose)
out_image.show()
#サンプル 4 inference_pos()を使用 パック形式 イメージは事前ロード
if test=="4":
input_image = Image.open(filename)
input_image.show()
img_number=Thi.load_img(input_image,user_id)
packed_pose=["happy", [0.5,0.0], "wink", [1.0,0.0], [0.0,0.0], [0.0,0.0], "ooo", [0.0,0.0], [0.0,0.0], 0.0, 0.0,0.0, 0.0]
result, out_image=Thi.inference_pos(packed_pose,img_number,user_id)
out_image.show()
    # Sample 5: uses inference_dic(); calls the server directly with the dict format; image preloaded
    if test=="5":
        input_image = Image.open(filename)
        img_number=Thi.load_img(input_image,user_id)
        pose_dic=pose_dic_org  # initial pose
        current_pose_list=[]
        for i in range(20):
            start_time=time.time()
            current_pose_dic=pose_dic
            current_pose_dic["eye"]["menue"]="wink"  # only the entries you want to move need to be changed in pose_dic
            current_pose_dic["eye"]["left"]=i*2/40
            start_time=time.time()
            result,out_image = Thi.inference_dic(current_pose_dic,img_number,user_id)
            image_show(out_image)
            print("Generation time=",(time.time()-start_time)*1000,"ms")
    ## Sample 6: inference_img(); pose = repository format (written out by hand)
    if test=="6":
        input_image = Image.open(filename)
        input_image.show()
        img_number=Thi.load_img(input_image,user_id)
        for i in range(100):
            current_pose3= [0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0, 0.0, 0.0, i/100, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,0.0, 0.0, 0.0, 0.0,
                            0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,0.0, 0.0, 0.78, 0.57, 0.63, 0.75, 0.49, 0.43,0.23, 0.65,1.0]
            result,out_image=Thi.inference_img(current_pose3,img_number,user_id)
            image_show(out_image)
    # Sample 7: inference_img(); pose converted from pack format to the repository format; image preloaded; the pose is swept continuously in pack format
    if test=="7":
        input_image = Image.open(filename)
        input_image.show()
        imge = np.array(input_image)
        imge = cv2.cvtColor(imge, cv2.COLOR_RGBA2BGRA)
        cv2.imshow("Loaded image",imge)
        cv2.waitKey()
        img_number=Thi.load_img(input_image,user_id)
        print("img_number=",img_number)
        for i in range(50):
            packed_current_pose=[
                "happy",[0.5,0.0],"wink", [i/50,0.0], [0.0,0.0], [0.0,0.0],"ooo", [0.0,0.0], [0.0,i*3/50],i*3/50, 0.0, 0.0, 0.0]
            start_time=time.time()
            current_pose=Thi.get_pose(packed_current_pose)  # packed_current_pose => current_pose
            result,out_image=Thi.inference_img(current_pose,img_number,user_id)
            image_show(out_image)
            print("Generation time=",(time.time()-start_time)*1000,"ms")
        for i in range(100):
            packed_current_pose=[
                "happy", [0.5,0.0], "wink",[1-i/50,0.0], [0.0,0.0], [0.0,0.0], "ooo", [0.0,0.0], [0.0,3-i*3/50], 3-i*3/50, 0.0, 0.0, 0.0,]
            start_time=time.time()
            current_pose2=Thi.get_pose(packed_current_pose)  # packed_current_pose => current_pose2
            result,out_image=Thi.inference_img(current_pose2,img_number,user_id)
            image_show(out_image)
            print("Generation time=",(time.time()-start_time)*1000,"ms")
        for i in range(100):
            packed_current_pose=[
                "happy", [0.5,0.0], "wink", [i/100,i/100], [0.0,0.0], [0.0,0.0], "ooo", [0.0,0.0], [0.0,-3+i*3/50], -3+i*3/50,0.0, 0.0,0.0,]
            start_time=time.time()
            current_pose2=Thi.get_pose(packed_current_pose)  # packed_current_pose => current_pose2
            result,out_image=Thi.inference_img(current_pose2,img_number,user_id)
            image_show(out_image)
            print("Generation time=",(time.time()-start_time)*1000,"ms")
        for i in range(100):
            packed_current_pose=[
                "happy", [0.5,0.0], "wink", [0.0,0.0], [0.0,0.0], [0.0,0.0], "ooo", [0.0,0.0], [0.0,3-i*3/100], 3-i*3/100, 0.0, 0.0, 0.0,]
            start_time=time.time()
            current_pose2=Thi.get_pose(packed_current_pose)  # packed_current_pose => current_pose2
            result,out_image=Thi.inference_img(current_pose2,img_number,user_id)
            image_show(out_image)
            print("Generation time=",(time.time()-start_time)*1000,"ms")
        image_show(out_image)
        cv2.waitKey(5000)
    # Sample 8: inference_dic(); calls the server directly with the dict format; image preloaded; only the selected dict entries are varied continuously
    if test=="8":
        div_count=30
        input_image = Image.open(filename)
        imge = np.array(input_image)
        imge = cv2.cvtColor(imge, cv2.COLOR_RGBA2BGRA)
        cv2.imshow("image",imge)
        cv2.waitKey()
        img_number=Thi.load_img(input_image,user_id)
        pose_dic=pose_dic_org  # initial pose
        current_pose_list=[]
        for i in range(int(div_count/2)):
            start_time=time.time()
            current_pose_dic=pose_dic
            current_pose_dic["eye"]["menue"]="wink"
            current_pose_dic["eye"]["left"]=i/(div_count/2)
            current_pose_dic["head"]["y"]=i*3/(div_count/2)
            current_pose_dic["neck"]=i*3/(div_count/2)
            current_pose_dic["body"]["y"]=i*5/(div_count/2)
            start_time=time.time()
            result,imge = Thi.inference_dic(current_pose_dic,img_number,user_id)
            image_show(imge)
            print("Generation time=",(time.time()-start_time)*1000,"ms")
        for i in range(div_count):
            start_time=time.time()
            current_pose_dic["eye"]["left"]=1-i/(div_count/2)
            current_pose_dic["head"]["y"]=3-i*3/(div_count/2)
            current_pose_dic["neck"]=3-i*3/(div_count/2)
            current_pose_dic["body"]["y"]=5-i*5/(div_count/2)
            start_time=time.time()
            result,imge = Thi.inference_dic(current_pose_dic,img_number,user_id)
            image_show(imge)
            print("Generation time=",(time.time()-start_time)*1000,"ms")
        for i in range(div_count):
            start_time=time.time()
            current_pose_dic["eye"]["left"]=i/div_count
            current_pose_dic["eye"]["right"]=i/div_count
            current_pose_dic["head"]["y"]=-3+i*3/(div_count/2)
            current_pose_dic["neck"]=-3+i*3/(div_count/2)
            current_pose_dic["body"]["z"]=i*3/div_count
            current_pose_dic["body"]["y"]=-5+i*5/(div_count/2)
            start_time=time.time()
            result,imge = Thi.inference_dic(current_pose_dic,img_number,user_id)
            image_show(imge)
            print("Generation time=",(time.time()-start_time)*1000,"ms")
        for i in range(div_count):
            start_time=time.time()
            current_pose_dic["eye"]["left"]=0.0
            current_pose_dic["eye"]["right"]=0.0
            current_pose_dic["head"]["y"]=3-i*3/(div_count/2)
            current_pose_dic["neck"]=3-i*3/(div_count/2)
            current_pose_dic["body"]["z"]=3-i*3/div_count
            current_pose_dic["body"]["y"]=5-i*5/(div_count/2)
            start_time=time.time()
            result,imge = Thi.inference_dic(current_pose_dic,img_number,user_id)
            image_show(imge)
            print("Generation time=",(time.time()-start_time)*1000,"ms")
        image_show(imge)
        cv2.waitKey(5000)

if __name__ == "__main__":
    main()