![見出し画像](https://assets.st-note.com/production/uploads/images/166132136/rectangle_large_type_2_ba1c255863b37fb001ada1b73b29bf44.png?width=1200)
【python】Excelファイルを読み込んでjsonファイルに出力する【Pandas】
1.前提
Pandasがインストール済
openpyxlがインストール済
2.Excelファイルを準備
今回は総務省のサイトにある都道府県コード及び市区町村コードを使用します。
下記からもダウンロードできます。作業しやすように、1部加工しています。※シート名がsample
![](https://assets.st-note.com/img/1734438101-YtWQvBi2lMVyzEJF8boG3Tph.png?width=1200)
3.pythonの実行
#ライブラリのインポート
import json
import pandas as pd
#Excelファイルを読み込み
df = pd.read_excel("000925835.xlsx", sheet_name="sample")
#出力ファイル名
json_path = 'data.json'
#jsonファイル出力
with open(json_path, "w", encoding="utf-8") as f:
json.dump(df.to_dict(orient="records"), f)
jsonは Python標準ライブラリなので、インストールする必要はありません。
4.出力されたjsonファイル
[
{
"municipalityCode": 11002,
"prefectures": "\u5317\u6d77\u9053",
"cities": "\u672d\u5e4c\u5e02",
"prefecturesKana": "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73",
"citiesKana": "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c"
},
{
"municipalityCode": 11011,
"prefectures": "\u5317\u6d77\u9053",
"cities": "\u672d\u5e4c\u5e02\u4e2d\u592e\u533a",
"prefecturesKana": "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73",
"citiesKana": "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c\uff81\uff6d\uff73\uff75\uff73\uff78"
},
{
"municipalityCode": 11029,
"prefectures": "\u5317\u6d77\u9053",
"cities": "\u672d\u5e4c\u5e02\u5317\u533a",
"prefecturesKana": "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73",
"citiesKana": "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c\uff77\uff80\uff78"
},
{
"municipalityCode": 11037,
"prefectures": "\u5317\u6d77\u9053",
"cities": "\u672d\u5e4c\u5e02\u6771\u533a",
"prefecturesKana": "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73",
"citiesKana": "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c\uff8b\uff76\uff9e\uff7c\uff78"
}
]
5.非ASCII文字変換しない(日本語をそのまま出力する)
ensure_ascii=False を指定する
python
import json
import pandas as pd
# excel読み込み
df = pd.read_excel("000925835.xlsx", sheet_name="sample")
# JSON出力
json_path = 'data.json'
with open(json_path, "w", encoding="utf-8") as f:
json.dump(df.to_dict(orient="records"), f, indent=4,ensure_ascii=False)
ensure_ascii=Falseを追加
出力jsonファイル
[
{
"municipalityCode": 11002,
"prefectures": "北海道",
"cities": "札幌市",
"prefecturesKana": "ホッカイドウ",
"citiesKana": "サッポロシ"
},
{
"municipalityCode": 11011,
"prefectures": "北海道",
"cities": "札幌市中央区",
"prefecturesKana": "ホッカイドウ",
"citiesKana": "サッポロシチュウオウク"
},
{
"municipalityCode": 11029,
"prefectures": "北海道",
"cities": "札幌市北区",
"prefecturesKana": "ホッカイドウ",
"citiesKana": "サッポロシキタク"
},
{
"municipalityCode": 11037,
"prefectures": "北海道",
"cities": "札幌市東区",
"prefecturesKana": "ホッカイドウ",
"citiesKana": "サッポロシヒガシク"
}
(略)
]
6.ヘッダー行を指定する、インデント(空白や改行を)なくす
ヘッダーが先頭行ではない場合は、Excelを読む時にヘッダー行を指定する必要があります。
2行目をヘッダーとする場合は、 header=1 にする。※0からカウントするので2行目を指定する時は1になる。
indentを指定しない(デフォルトの設定)と、空白や改行を含めないため、jsonファイルサイズが少なくなります。
import json
import pandas as pd
# excel読み込み
df = pd.read_excel("000925835.xlsx", sheet_name="sample", header=1)
# JSON出力
json_path = 'data.json'
with open(json_path, "w", encoding="utf-8") as f:
json.dump(df.to_dict(orient="records"), f)
[{"011002": 11011, "\u5317\u6d77\u9053": "\u5317\u6d77\u9053", "\u672d\u5e4c\u5e02": "\u672d\u5e4c\u5e02\u4e2d\u592e\u533a", "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73": "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73", "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c": "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c\uff81\uff6d\uff73\uff75\uff73\uff78"}, {"011002": 11029, "\u5317\u6d77\u9053": "\u5317\u6d77\u9053", "\u672d\u5e4c\u5e02": "\u672d\u5e4c\u5e02\u5317\u533a", "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73": "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73", "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c": "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c\uff77\uff80\uff78"}, {"011002": 11037, "\u5317\u6d77\u9053": "\u5317\u6d77\u9053", "\u672d\u5e4c\u5e02": "\u672d\u5e4c\u5e02\u6771\u533a", "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73": "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73", "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c": "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c\uff8b\uff76\uff9e\uff7c\uff78"}, {"011002": 11045, "\u5317\u6d77\u9053": "\u5317\u6d77\u9053", "\u672d\u5e4c\u5e02": "\u672d\u5e4c\u5e02\u767d\u77f3\u533a", "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73": "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73", "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c": "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c\uff7c\uff9b\uff72\uff7c\uff78"}, {"011002": 11053, "\u5317\u6d77\u9053": "\u5317\u6d77\u9053", "\u672d\u5e4c\u5e02": "\u672d\u5e4c\u5e02\u8c4a\u5e73\u533a", "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73": "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73", "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c": "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c\uff84\uff96\uff8b\uff97\uff78"}, {"011002": 11061, "\u5317\u6d77\u9053": "\u5317\u6d77\u9053", "\u672d\u5e4c\u5e02": "\u672d\u5e4c\u5e02\u5357\u533a", "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73": "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73", "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c": "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c\uff90\uff85\uff90\uff78"}, {"011002": 11070, "\u5317\u6d77\u9053": "\u5317\u6d77\u9053", "\u672d\u5e4c\u5e02": "\u672d\u5e4c\u5e02\u897f\u533a", "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73": "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73", "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c": "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c\uff86\uff7c\uff78"}, {"011002": 11088, "\u5317\u6d77\u9053": "\u5317\u6d77\u9053", "\u672d\u5e4c\u5e02": "\u672d\u5e4c\u5e02\u539a\u5225\u533a", "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73": "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73", "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c": "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c\uff71\uff82\uff8d\uff9e\uff82\uff78"}, {"011002": 11096, "\u5317\u6d77\u9053": "\u5317\u6d77\u9053", "\u672d\u5e4c\u5e02": "\u672d\u5e4c\u5e02\u624b\u7a32\u533a", "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73": "\uff8e\uff6f\uff76\uff72\uff84\uff9e\uff73", "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c": "\uff7b\uff6f\uff8e\uff9f\uff9b\uff7c\uff83\uff72\uff88\uff78"}]