Battle Against the Snake Demon, Part 1

August 27, 2023, 00:42

Overview

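Based on chapters 3 and 4 of the exorcism book, I start actually writing code. This time I build a library that reads measurement data and draws graphs. To that end, I tried to write a library that reads a data file and produces np.ndarray objects whose elements are value objects. As it turned out, I got stuck at the very first step: building the value objects themselves. I have tried everything I could think of, so for now I am recording where things stand.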

What I want to do

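I want to read powder X-ray diffraction (XRD) data like the following. The first half holds the measurement conditions and similar metadata; the second half records pairs of angle (2θ) and scattering intensity, separated by spaces.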

xrd_data.txt
1  Sample sample_name
2  Comments condition 1
   ...
23 StopTime 2023/08/10 13:00:00
24 StopTemp 0
   (2theta) (diffraction intensity)
25 5 88.3333
26 5.04 85
27 5.08 76.25
28 5.12 78.75
   ...

I want to be able to handle this data as casually as this.

# Read the data file
xrd_data = (some suitable class).read_xrd_data("xrd_data.txt")

(matplotlib plotting setup)

# Plot the XRD pattern
ax.plot(
    xrd_data.two_theta,
    xrd_data.intensity
)

And, following the teachings of the exorcism book, I want things like the following to be rejected with an error.

theta_array = np.array([0.0, 0.1, 0.2]) # a variable meant for something completely unrelated

xrd_data.two_theta = theta_array # destructive reassignment

intensity_shift: float = 500 # offset for shifting the plot along the y axis

ax.plot(
    xrd_data.two_theta + intensity_shift, # oops, written one line too early
    xrd_data.intensity
)

What needs to be done

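Hold the data as np.ndarray[custom value object] instead of np.ndarray[float]. On top of that, forbid reassignment of the data and restrict the type of the right-hand operand in arithmetic. Fortunately, as long as the element type defines the arithmetic operators, np.ndarray applies the right-hand operand to every element of the array (it does not perform a linear-algebra style matrix product), so there is no need to subclass np.ndarray.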
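The element-wise behavior can be checked with a small sketch. Angle is a hypothetical, stripped-down value object invented for this illustration, not the class developed later in the post. The array is created with dtype=object so that NumPy stores the instances themselves.

```python
import numpy as np


class Angle:
    """Hypothetical minimal value object used only for this illustration."""

    def __init__(self, value: float):
        self._value = value

    @property
    def value(self) -> float:
        return self._value

    def __add__(self, other):
        # Only Angle + Angle is allowed
        if type(other) is not type(self):
            raise TypeError(
                f"cannot add {type(other).__name__} to {type(self).__name__}"
            )
        return type(self)(self.value + other.value)


# dtype=object keeps the Angle instances themselves as the elements
angles = np.array([Angle(0.0), Angle(0.1), Angle(0.2)], dtype=object)

# NumPy calls Angle.__add__ once per element
shifted = angles + Angle(1.0)
shifted_values = [a.value for a in shifted]

# Adding a bare float also reaches Angle.__add__, which rejects it
try:
    angles + 500.0
    float_was_rejected = False
except TypeError:
    float_was_rejected = True
```

Each element of shifted is again an Angle, and the accidental float offset fails loudly instead of silently shifting the angles.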

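Data loading is implemented separately for each instrument. Because the same kind of measurement may be taken on several instruments, the loading functions are free functions rather than class members.

In other words, the library and classes are designed as follows. The ValueObject class plays the role of an abstract class (interface), and each concrete value object class is implemented by inheriting from it. After loading, the data object owns one sequence of value objects per column (np.ndarray[value object type]), typed according to what the column holds. This time I build a class that bundles two quantities: the reflection angle (two theta) and the reflection intensity (diffraction intensity).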

xrd.py
 
class ValueObject:
	__value: float

	def __init__(self, value: float):
		...  # processing

	@property
	def value(self)->float:
		return self.__value

	def __add__(self, added_value):
		...  # processing

	def __sub__(self, subed_value):
		...  # processing

	def __mul__(self, muled_value):
		...  # processing

	# Python's division special method is named __truediv__
	def __truediv__(self, deved_value):
		...  # processing

	def __str__(self):
		...  # processing

	def __repr__(self):
		return str(self.__value)

	def __float__(self):
		return self.value

# Custom value objects
class Theta(ValueObject):
	pass

class DiffractionIntensity(ValueObject):
	pass

Theta_Array = NewType("Theta_Array", np.ndarray[Theta])
Diffracton_Intensity_Array = \
	NewType("Diffracton_Intensity_Array", np.ndarray[DiffractionIntensity])
 
class XRDPattern:
    __two_theta: Theta_Array
    __intensity: Diffracton_Intensity_Array

    def __init__(
        self, 
        two_theta: Union[list[float], np.ndarray[float]], 
        intensity: Union[list[float], np.ndarray[float]]
        ):
        ...  # initialization

    @property
    def two_theta(self):
        return self.__two_theta
    
    @property
    def intensity(self):
        return self.__intensity
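As a rough sketch of how a container like this rejects the destructive reassignment shown earlier: a property without a setter raises AttributeError on assignment, and clearing the writeable flag of the array also blocks element-wise overwrites. PatternSketch is a hypothetical stand-in for XRDPattern, and plain float arrays are used for brevity instead of value objects.

```python
import numpy as np


class PatternSketch:
    """Hypothetical stand-in for XRDPattern; plain floats are used for brevity."""

    def __init__(self, two_theta, intensity):
        # np.array copies the inputs, so later changes by the caller do not leak in
        self._two_theta = np.array(two_theta, dtype=float)
        self._intensity = np.array(intensity, dtype=float)
        # Also forbid in-place writes such as pattern.two_theta[0] = 1.0
        self._two_theta.flags.writeable = False
        self._intensity.flags.writeable = False

    @property
    def two_theta(self) -> np.ndarray:
        return self._two_theta

    @property
    def intensity(self) -> np.ndarray:
        return self._intensity


pattern = PatternSketch([5.0, 5.04, 5.08], [88.3333, 85.0, 76.25])

try:
    pattern.two_theta = np.array([0.0, 0.1, 0.2])  # no setter is defined
    reassignment_rejected = False
except AttributeError:
    reassignment_rejected = True

try:
    pattern.two_theta[0] = 0.0  # the array is read-only
    element_write_rejected = False
except ValueError:
    element_write_rejected = True
```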

xrd_reader.py

def read_XRD_data(
	file_path: str
	)->Optional[xrd.XRDPattern]:

	...  # processing

	return xrd_data
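The reading step itself could look roughly like the sketch below. It assumes, based only on the sample listing above, that the header is exactly 24 lines of "key value" text and that every following non-empty line is a space-separated pair. The function name read_xrd_columns and the HEADER_LINES constant are hypothetical, and it returns plain float arrays rather than an XRDPattern.

```python
from pathlib import Path

import numpy as np

HEADER_LINES = 24  # assumption taken from the sample listing


def read_xrd_columns(file_path):
    """Return (header dict, 2theta array, intensity array) from an XRD text file."""
    lines = Path(file_path).read_text().splitlines()

    # Header lines look like "Key value with spaces"
    header = {}
    for line in lines[:HEADER_LINES]:
        key, _, value = line.partition(" ")
        header[key] = value

    # The remaining non-empty lines are "2theta intensity" pairs
    rows = [line.split() for line in lines[HEADER_LINES:] if line.strip()]
    data = np.array(rows, dtype=float)
    return header, data[:, 0], data[:, 1]
```

Files from another instrument would get their own reader function with a different header length or separator, which fits the decision to keep readers outside the class.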

What I did

Frankly, it was a constant struggle. Many things behaved differently from what I expected, which made it hard going.

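Creating the immutator decorator

First, to keep the arguments of class methods immutable, I defined a decorator that passes deep-copied arguments to the function. Its behavior was already verified in the previous post. To keep the wrapper's docstring from being shown in place of the decorated function's, I use the wraps decorator from functools.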

from copy import deepcopy
from functools import wraps

# InvalidDecorator is not defined in the post; a plain Exception subclass is assumed here
class InvalidDecorator(Exception):
	"""Raised when self_mutator decorates something that is not a method."""

def immutator(func):
	"""
	This decorator passes a deep-copied argument list.
	It is guaranteed that none of the original arguments will be overwritten.
	"""
	@wraps(func)
	def wrapper(*args, **kwargs):
		args_copy = tuple(deepcopy(arg) for arg in args)
		kwargs_copy = {
			key: deepcopy(value) for key, value in kwargs.items()
		}
		return func(*args_copy, **kwargs_copy)
	return wrapper

def self_mutator(func):
	"""
	This decorator passes a deep-copied argument list, except for self.
	It is guaranteed that none of the original arguments other than self will be overwritten.
	"""

	@wraps(func)
	def wrapper(*args, **kwargs):

		# Check whether the first argument is self

		# A method always receives self as its first positional argument
		if len(args)==0:
			raise InvalidDecorator("This decorator is used in class method with a self argument.")
		first_arg = args[0]

		# Name of the method that this decorator wraps
		func_name = func.__name__

		# self must have an attribute with the same name as the wrapped method
		if not hasattr(first_arg, func_name):
			raise InvalidDecorator("This decorator must be used in class method.")

		# Pass self through as is and deep-copy the remaining arguments
		args_copy = (first_arg,) + tuple(deepcopy(arg) for arg in args[1:])
		kwargs_copy = {
			key: deepcopy(value) for key, value in kwargs.items()
		}
		return func(*args_copy, **kwargs_copy)
	return wrapper
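A short usage sketch of the idea, assuming deepcopy is what the decorator is meant to use. The normalize function is a hypothetical example of code that carelessly overwrites its argument in place.

```python
from copy import deepcopy
from functools import wraps


def immutator(func):
    """Call func with deep copies of every argument."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        args_copy = tuple(deepcopy(arg) for arg in args)
        kwargs_copy = {key: deepcopy(value) for key, value in kwargs.items()}
        return func(*args_copy, **kwargs_copy)
    return wrapper


@immutator
def normalize(values):
    """Hypothetical function that overwrites its argument in place."""
    peak = max(values)
    for i in range(len(values)):
        values[i] = values[i] / peak
    return values


raw = [50.0, 100.0, 25.0]
scaled = normalize(raw)  # works on a copy, so raw keeps its original contents
```

Because of wraps, normalize.__name__ and its docstring still describe the decorated function rather than wrapper.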

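What I want from the value object

The requirements for the value object built this time are as follows.

It has a single float member variable. This member is private.
Rewriting the value is not allowed; it may only be set in the __init__ special method. Only an instance method for reading the value is defined.
The addition and subtraction special methods are allowed only when the right-hand side has the same type as the object itself.
The multiplication and division special methods are allowed only when the right-hand side is a float or an int.
A cast to float is defined so that matplotlib can plot the values.
Comparison special methods are implemented for functions such as np.ndarray's max().
While I am at it, the __str__ special method is implemented as well.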
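The scaling, casting and comparison items of this list are never implemented later in the post, so here is a minimal sketch of one way they might look. Intensity is a hypothetical class for illustration only; functools.total_ordering is used to derive the remaining comparison operators from __eq__ and __lt__.

```python
from functools import total_ordering

import numpy as np


@total_ordering
class Intensity:
    """Hypothetical value object covering scaling, casting and comparison."""

    def __init__(self, value: float):
        self._value = value

    def __float__(self) -> float:
        return self._value

    def __str__(self) -> str:
        return str(self._value)

    def __mul__(self, factor):
        # Scaling is only allowed by a plain int or float
        if type(factor) not in (int, float):
            return NotImplemented
        return type(self)(self._value * factor)

    def __truediv__(self, divisor):
        if type(divisor) not in (int, float):
            return NotImplemented
        return type(self)(self._value / divisor)

    # total_ordering derives <=, > and >= from __eq__ and __lt__
    def __eq__(self, other):
        if type(other) is not type(self):
            return NotImplemented
        return self._value == other._value

    def __lt__(self, other):
        if type(other) is not type(self):
            return NotImplemented
        return self._value < other._value

    def __hash__(self):
        return hash(self._value)


intensities = np.array(
    [Intensity(88.3333), Intensity(85.0), Intensity(76.25)], dtype=object
)
strongest = intensities.max()   # relies on the comparison methods
halved = intensities / 2        # calls Intensity.__truediv__ per element
```

Returning NotImplemented for an unsupported operand lets Python raise the TypeError itself, so Intensity(1.0) * Intensity(2.0) is rejected without a hand-written message.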

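Implementing the value object (abstract class edition)

First I wrote the __init__ function as follows. I wanted something like a C++ class template, so I used a generic type. A __repr__ special method is also defined for debug output.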

from typing import Generic, TypeVar


Ty = TypeVar('Ty')

class ValueObject(Generic[Ty]):
	__value: Ty

	def __init__(self, value: Ty):
		if type(value)!=Ty:
			raise TypeError(f"An invalid object is subtracted.\nAllowd type is "\
				+str(Ty) +", but "+str(type(value))+" is used")
		self.__value = value

	@property
	def value(self)->Ty:
		return self.__value

	def __repr__(self):
		return str(self.__value)


class ValueObjectF(ValueObject[float]):
	pass

However, testing this does not go well. The error output looks odd too, but in any case it is not behaving as I expected.

from BFC_libs.common import ValueObjectF

value_object_instance = ValueObjectF(1.0)

>> TypeError: An invalid object is subtracted.
>> Allowd type is ~Ty, but  is used

As an experiment, printing Ty and type(value) inside the __init__ function gave the following.

	def __init__(self, value: Ty):
		if type(value)!=Ty:
			print("Type Error!")
			print("Required type is "+ str(Ty) + ", but " + str(type(value)) + " is used.")
(rest omitted)

>> Type Error!
>> Required type is ~Ty, but <class 'float'> is used.

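I had assumed that, as with a C++ class template, a class with Ty replaced by float would be generated, but apparently that is not the case. C++ generates a class with Ty rewritten to float (or whatever type) at compile time, whereas in Python the type parameter is only a type hint: at runtime Ty is still the TypeVar object itself (printed as ~Ty), so comparing type(value) with it never succeeds.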
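For reference, the type argument can still be recovered at runtime if that is really wanted. The sketch below is one possible approach, not something from the post: a subclass written as TypedValue[float] records that parameterized base in its __orig_bases__ attribute, and typing.get_args extracts float from it. TypedValue, FloatValue and _expected_type are hypothetical names.

```python
from typing import Generic, TypeVar, get_args

Ty = TypeVar("Ty")


class TypedValue(Generic[Ty]):
    """Hypothetical generic base that remembers its type argument at runtime."""

    _expected_type: type = object

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # TypedValue[float] is recorded in __orig_bases__ of the subclass
        for base in getattr(cls, "__orig_bases__", ()):
            args = get_args(base)
            if args and isinstance(args[0], type):
                cls._expected_type = args[0]

    def __init__(self, value: Ty):
        if type(value) is not self._expected_type:
            raise TypeError(
                f"Allowed type is {self._expected_type.__name__}, "
                f"but {type(value).__name__} was used"
            )
        self._value = value

    @property
    def value(self) -> Ty:
        return self._value


class FloatValue(TypedValue[float]):
    pass


# Ty itself never turns into float; it stays a TypeVar object
ty_is_float = Ty is float
```

With this, FloatValue(1.0) is accepted and FloatValue(1) raises TypeError, while ty_is_float stays False.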

Considering also that a value object accepting int might never actually be used, I made a value object that accepts only float.

class ValueObject:
	__value: float

	def __init__(self, value: float):
		if type(value)!=float:
			print("Type Error!")
			print("Allowed type is float, but " + str(type(value)) + " is used.")
			raise TypeError("An invalid object was substituted.\nAllowd type is float, but "+str(type(value))+" was used")
		self.__value = value

	@property
	def value(self)->float:
		return self.__value

This behaves as intended. It now rejects destructive assignment to the value and performs the type check.

value_object_instance = ValueObject(1.0) # OK

print(value_object_instance.value) # reading the value is OK

value_object_instance.value = 5.0 # AttributeError!

value_object_instance_int = ValueObject(1) # TypeError!

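However, the message of the raised error still does not show str(type(value)), which is inconvenient when debugging. The standard output is as intended, so the string conversion itself is not wrong.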

>> 1.0
>> Type Error!
>> Allowed type is float, but <class 'int'> is used.

>> TypeError: An invalid object was substituted.
>> Allowd type is float, but  was used

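After some experimenting, I found that, in my environment, any text enclosed in <> is dropped from the displayed message of a raised error. Nothing I read mentioned this. Python itself keeps these characters in the exception message, so the stripping presumably happens in whatever renders the traceback.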

raise ValueError("str1 str2 <bracket> str3 str4")

>>ValueError: str1 str2  str3 str4
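A quick way to check where the text goes missing is to catch the exception and inspect its message directly. The sketch below also shows type(value).__name__, which yields a name without angle brackets and so sidesteps the issue regardless of how the traceback is displayed. The message wording is only an example.

```python
value = 1

try:
    raise TypeError("Allowed type is float, but " + str(type(value)) + " was used")
except TypeError as error:
    message = str(error)

# The message held by the exception object still contains "<class 'int'>"
brackets_survive = "<class 'int'>" in message

# type(...).__name__ gives the bare class name, with no angle brackets
safe_message = f"Allowed type is float, but {type(value).__name__} was used"
```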

So in the end I wrote the exception handling in the __init__ special method as follows. The raised exception now reads as intended.

	def __init__(self, value: float):
		if type(value)!=float:
			error_report_raw = "An invalid object was substituted.\nAllowd type is float, but "+str(type(value))+" was used"
			error_report_blacket_replaced = error_report_raw.replace("<", "").replace(">", "")
			raise TypeError(error_report_blacket_replaced)
		self.__value = value

value_object_instance_int = ValueObject(1)

>> TypeError: An invalid object was substituted.
>> Allowd type is float, but class 'int' was used

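Note that Python is strongly typed and does not perform implicit casts like C++'s int to float, so the following kind of demon summoning through inherited child classes is impossible. Wonderful.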

class Theta(ValueObject):
    pass

class DiffractionIntensity(ValueObject):
    pass



theta_A = Theta(1.0)

theta_B = Theta(2.0)

theta_C = Theta(DiffractionIntensity(100.0)) # ValueObject implements the __float__ special method, but it is not applied implicitly

>> An invalid object was substituted.
>> Allowd type is float, but class '__main__.DiffractionIntensity' was used

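Next I define the addition and subtraction special methods. Both operands are immutable, so I use my own immutator decorator. Only addition and subtraction between objects of the same type are allowed here. As a result, even between types that inherit from this class, the operation is rejected if the types differ. However, the return value ends up being of the parent class, ValueObject.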

class ValueObject:
(omitted)
	@immutator
	def __add__(self, added_value):
		
		# error
		if type(self)!=type(added_value):
			error_report_raw = "An invalid object was substituted.\nAllowd type is "+str(type(self))+", but "+str(type(added_value))+" was used"
			error_report_blacket_replaced = error_report_raw.replace("<", "").replace(">", "")
			raise TypeError(error_report_blacket_replaced)
		
		# normal process
		return ValueObject(self.value+added_value.value)

Testing with the base class for now, it works without problems. So far so good.

value_object_instanceA = ValueObject(1.0)

value_object_instanceB = ValueObject(2.0)

value_object_instanceC = value_object_instanceA + value_object_instanceB # OK

print(f"A: {value_object_instanceA}, B: {value_object_instanceB}, C: {value_object_instanceC}")
>> A: 1.0, B: 2.0, C: 3.0

value_object_instanceD = value_object_instanceA + 1.0
>> TypeError: An invalid object was substituted.
>> Allowd type is class '(module name).ValueObject', but class 'float' was used

With an inherited class, however, it does not do what I want, although this is as expected: the return value is of the parent class.

class Theta(ValueObject):
    pass

theta_A = Theta(1.0)

theta_B = Theta(2.0)

theta_C = theta_A + theta_B # this by itself works

theta_D = theta_C + theta_A # but theta_C is not a Theta, so this raises TypeError
>> TypeError: An invalid object was substituted.
>> Allowd type is class '(module name).ValueObject', but class '__main__.Theta' was used
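One alternative worth noting here, as a sketch rather than the approach the post goes on to take: constructing the result with type(self)(...) instead of a hard-coded ValueObject(...) makes the base-class method return whatever subclass it was called on. The error messages are simplified for brevity.

```python
class ValueObject:
    def __init__(self, value: float):
        if type(value) is not float:
            raise TypeError(
                f"Allowed type is float, but {type(value).__name__} was used"
            )
        self.__value = value

    @property
    def value(self) -> float:
        return self.__value

    def __add__(self, added_value):
        if type(self) is not type(added_value):
            raise TypeError(
                f"Allowed type is {type(self).__name__}, "
                f"but {type(added_value).__name__} was used"
            )
        # type(self) is Theta when called on a Theta, so the subclass is preserved
        return type(self)(self.value + added_value.value)

    def __repr__(self):
        return f"{type(self).__name__}({self.value})"


class Theta(ValueObject):
    pass


class DiffractionIntensity(ValueObject):
    pass


theta_C = Theta(1.0) + Theta(2.0)
theta_D = theta_C + Theta(1.0)  # theta_C is a Theta, so this now works
```

Since every method is written out in the base class, nothing is attached dynamically, which should also leave editor tooling able to see the members.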

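After thinking it over for a while, I implemented a new decorator like the following. Decorating a class that should become a value object with it makes the addition call that class's own __init__ special method. Done this way, it behaves as intended.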

def value_object(cls: type):
	
	@immutator
	def addition (self: cls, added_value: cls):
		if type(self)!=type(added_value):
			error_report_raw = "An invalid object was substituted.\nAllowd type is "+str(type(self))+", but "+str(type(added_value))+" was used"
			error_report_blacket_replaced = error_report_raw.replace("<", "").replace(">", "")
			raise TypeError(error_report_blacket_replaced)
	
		return cls(self.value + added_value.value)

	cls.__add__ = addition

	return cls

#------------------------------------------------
@value_object
class Theta(ValueObject):
    pass

@value_object
class DiffractionIntensity(ValueObject):
    pass

theta_A = Theta(1.0)

theta_B = Theta(2.0)

theta_C = theta_A + theta_B

theta_D = theta_C + theta_A

print(f"A: {theta_A}, B: {theta_B}, C: {theta_C} D: {theta_D}")
>> A: 1.0, B: 2.0, C: 3.0 D: 4.0

diffraction_intensity_A = DiffractionIntensity(1200.0)

diffraction_intensity_B = diffraction_intensity_A + theta_A
>> TypeError: An invalid object was substituted.
>> Allowd type is class '__main__.DiffractionIntensity', but class '__main__.Theta' was used

...Wouldn't a decorator be better than inheriting from an abstract class??

Implementing the value object (decorator edition)

So I moved the whole implementation so far into the decorator.

def value_object(cls: type):
	# Function bodies
	def innitialization(self: cls, value: float):
		if type(value)!=float:
			error_report_raw = "An invalid object was substituted.\nAllowd type is float, but "+str(type(value))+" was used"
			error_report_blacket_replaced = error_report_raw.replace("<", "").replace(">", "")
			raise TypeError(error_report_blacket_replaced)
		self.__value = value
	
	@property
	def value(self: cls):
		return self.__value
	
	@immutator
	def addition (self: cls, added_value: cls)->cls:
		if type(self)!=type(added_value):
			error_report_raw = "An invalid object was substituted.\nAllowd type is "+str(type(self))+", but "+str(type(added_value))+" was used"
			error_report_blacket_replaced = error_report_raw.replace("<", "").replace(">", "")
			raise TypeError(error_report_blacket_replaced)
	
		return cls(self.value + added_value.value)
	
	@immutator
	def cast_to_float(self: cls)->float:
		return self.value
	
	@immutator
	def cast_to_str(self: cls)->str:
		return str(self.value)
	
	@immutator
	def print(self: cls)->str:
		return str(self.value)
	# Assign the function objects
	cls.__init__ = innitialization
	cls.value = value
	cls.__add__ = addition
	cls.__float__ = cast_to_float
	cls.__str__ = cast_to_str
	cls.__repr__ = print


	return cls

Testing it, it works as intended.

@value_object
class Theta:
    pass

@value_object
class DiffractionIntensity:
    pass

theta_A = Theta(1.0)

theta_B = Theta(2.0)

theta_C = theta_A + theta_B

theta_D = theta_C + theta_A

print(f"A: {theta_A}, B: {theta_B}, C: {theta_C} D: {theta_D}")
>> A: 1.0, B: 2.0, C: 3.0 D: 4.0

diffraction_intensity_A = DiffractionIntensity(1200.0)

diffraction_intensity_B = diffraction_intensity_A + theta_A
>> An invalid object was substituted.
>> Allowd type is class '__main__.DiffractionIntensity', but class '__main__.Theta' was used

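However, because the instance methods are added dynamically, member suggestions from VS Code's IntelliSense stopped working. Plainly inconvenient.

For some reason, though, IntelliSense does work if a class is defined inside the decorator and returned. This looks usable. I have come full circle back to something resembling an abstract class.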

def value_object(cls: type):
	# Something like an abstract class
	class new_cls(cls):
		@property
		def value(self)->float:
			pass
		
		pass
(omitted)

	new_cls.value = value

	return new_cls

So I added the function definitions.

def value_object(cls: type):
	# Something like an abstract class
	class new_cls(cls):
		@property
		def value(self)->float:
			pass
		
		def __init__(self, value: float):
			pass

		def __add__(self, added_value: cls)->cls:
			pass

		def __float__(self)->float:
			pass

		def __str__(self)->str:
			pass

		def __repr__(self)->str:
			pass
		pass
		
	def innitialization(self: cls, value: float):

(omitted)

	new_cls.__init__ = innitialization
	new_cls.value = value
	new_cls.__add__ = addition
	new_cls.__float__ = cast_to_float
	new_cls.__str__ = cast_to_str
	new_cls.__repr__ = print


	return new_cls

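This way the argument and return types of some of the functions are visible.

The displayed type of the variables, however, became new_cls. That is a problem.

I looked into how the @dataclass decorator does it, and it seems the approach above is slightly off: apparently the right way is to return a function object that returns the class, rather than the class itself? So I revised it accordingly.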

def value_object(cls: type):
	# Wrapper function
	def wrap(cls):
		# Abstract class
		class new_cls(cls):
			@property
			def value(self)->float:
				pass
			
			def __init__(self, value: float):
				pass

			def __add__(self, added_value: cls)->cls:
				pass

			def __float__(self)->float:
				pass

			def __str__(self)->str:
				pass

			def __repr__(self)->str:
				pass
			pass
		# Function bodies
		def innitialization(self: cls, value: float):
			if type(value)!=float:
				error_report_raw = "An invalid object was substituted.\nAllowd type is float, but "+str(type(value))+" was used"
				error_report_blacket_replaced = error_report_raw.replace("<", "").replace(">", "")
				raise TypeError(error_report_blacket_replaced)
			self.__value = value
		
		@property
		def value(self: cls):
			return self.__value
		
		@immutator
		def addition (self: cls, added_value: cls)->cls:
			if type(self)!=type(added_value):
				error_report_raw = "An invalid object was substituted.\nAllowd type is "+str(type(self))+", but "+str(type(added_value))+" was used"
				error_report_blacket_replaced = error_report_raw.replace("<", "").replace(">", "")
				raise TypeError(error_report_blacket_replaced)
		
			return cls(self.value + added_value.value)
		
		@immutator
		def cast_to_float(self: cls)->float:
			return self.value
		
		@immutator
		def cast_to_str(self: cls)->str:
			return str(self.value)
		
		@immutator
		def print(self: cls)->str:
			return str(self.value)
		
		# Assign the function objects
		new_cls.__init__ = innitialization
		new_cls.value = value
		new_cls.__add__ = addition
		new_cls.__float__ = cast_to_float
		new_cls.__str__ = cast_to_str
		new_cls.__repr__ = print


		return new_cls
	
	return wrap

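The type of an instance is now displayed as intended. However, no information is available for the instance methods any more. The arguments required for initialization are not shown, and the value method no longer appears among the suggestions.

Implementing the value object (brute force edition)

Implementing the instance methods seems better done in a decorator, but inheritance seems to have the edge for making use of VS Code's IntelliSense. So I try to solve this by giving the abstract class something like pure virtual functions, having classes inherit from it, and implementing something like function overriding in the decorator.

For now, the abstract class looks like this. It is completely hollow. Used on its own it would behave contrary to its appearance, so it raises an error when its __init__ method is called.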

class ValueObject:

	# Only the arguments and return values of the functions

	_value: float

	def __init__(self, value: float):
		raise TypeError("Use with @value_object decorator")
				
	@property
	def value(self)->float:
		pass
	
	def __add__(self, added_value):
		pass

	def __str__(self):
		pass

	def __repr__(self)->str:
		pass

	def __float__(self):
		pass

The decorator, on the other hand, is implemented like this. Using it on its own does no real harm, so it does not raise any particular error.

def value_object(cls: type):
	
	# wrap function
	def wrap(cls):
		# Function bodies
		def innitialization(self: cls, value: float):
			if type(value)!=float:
				error_report_raw = "An invalid object was substituted.\nAllowd type is float, but "+str(type(value))+" was used"
				error_report_blacket_replaced = error_report_raw.replace("<", "").replace(">", "")
				raise TypeError(error_report_blacket_replaced)
			self._value = value
		
		@property
		def value(self: cls):
			return self._value
		
		@immutator
		def addition (self: cls, added_value: cls)->cls:
			if type(self)!=type(added_value):
				error_report_raw = "An invalid object was substituted.\nAllowd type is "+str(type(self))+", but "+str(type(added_value))+" was used"
				error_report_blacket_replaced = error_report_raw.replace("<", "").replace(">", "")
				raise TypeError(error_report_blacket_replaced)
		
			return cls(self.value + added_value.value)
		
		@immutator
		def cast_to_float(self: cls)->float:
			return self.value
		
		@immutator
		def cast_to_str(self: cls)->str:
			return str(self.value)
		
		@immutator
		def print(self: cls)->str:
			return str(self.value)
		# Assign the function objects
		cls.__init__ = innitialization
		cls.value = value
		cls.__add__ = addition
		cls.__float__ = cast_to_float
		cls.__str__ = cast_to_str
		cls.__repr__ = print

		return cls
	return wrap

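Trying to test with the following code returned an unfamiliar error. From what I found, it occurs when attempting to overwrite an attribute that cannot be modified; in this case, rewriting the __init__ function of the float class is not permitted. Evidently the decorator is not working properly. I neglected to test it earlier, but the approach of returning a function object does not seem to work: Theta is actually a function rather than a class, so the call was interpreted as rewriting the __init__ method of the 1.0 passed as the argument. The problem seems to lie in how I handle the class decorator, but however much I search, nothing on decorating classes turns up...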

@value_object
class Theta(ValueObject):
    pass

theta_A = Theta(1.0)
>> 'float' object attribute '__init__' is read-only
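A likely explanation, sketched under the assumption that the goal is to mimic dataclasses.dataclass: that decorator only returns its inner wrap function when it is called with parentheses and therefore has not received a class yet. When it is applied bare, it calls wrap(cls) itself and returns the class. With the version above, a bare @value_object binds the name Theta to wrap, so Theta(1.0) runs wrap(1.0) and tries to assign __init__ on a float. The trimmed-down decorator below follows the dataclass pattern; the method bodies are simplified and the messages are placeholders.

```python
def value_object(cls=None):
    """Usable both as @value_object and as @value_object()."""

    def wrap(cls):
        def initialization(self, value: float):
            if type(value) is not float:
                raise TypeError(
                    f"Allowed type is float, but {type(value).__name__} was used"
                )
            self._value = value

        def addition(self, added_value):
            if type(self) is not type(added_value):
                raise TypeError(
                    f"Allowed type is {type(self).__name__}, "
                    f"but {type(added_value).__name__} was used"
                )
            return type(self)(self.value + added_value.value)

        cls.__init__ = initialization
        cls.value = property(lambda self: self._value)
        cls.__add__ = addition
        return cls

    # @value_object(): no class has been passed yet, so hand back the real decorator
    if cls is None:
        return wrap
    # @value_object: the class was passed directly, so decorate it right away
    return wrap(cls)


@value_object
class Theta:
    pass


@value_object()
class DiffractionIntensity:
    pass


theta_sum = Theta(1.0) + Theta(2.0)
```

Both spellings now leave Theta and DiffractionIntensity bound to classes. This only addresses the runtime error; methods attached dynamically like this remain invisible to static tooling.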

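Help

This time I set out to build a library that keeps the demons away, but I could not even produce something that behaves as intended. The decorator approach does reach a state where the code runs, but VS Code's IntelliSense becomes completely useless, which is hardly acceptable for a library. I will try again once I find a solution.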