
[Part 1] Analyzing Be:First's Track Data and 2023 Global Trends with the Spotify API!
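I hadn't paid attention to hit songs for a long time, and I was out of touch with what's popular in Japan too. Then I came across a song I found myself humming without thinking. On top of that, I heard the band is made up of young artists who dream of making it overseas.
That made me curious: how closely does their signature song this year resemble the songs trending on Spotify worldwide in 2023? In other words, if its attributes and overall feel are similar to the songs people listen to most on Spotify, might their music reach the whole world too? With that thought, I went ahead and ran my own analysis.
The artist featured this time is:
Be: First (Mainstream)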
========================
This article is Part 1, and here I'd like to start with an analysis of the Spotify data as a whole. In Part 2, I'll actually use the Spotify API to fetch data for the tracks I want to examine and compare them against this year's trending songs.
========================
Data Source
First, download the 2023 (year-to-date) track data as a CSV file and save it in the project directory.
We'll look at the 2023 track data from a variety of angles.
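The article doesn't show how the CSV ends up in `df`, so here is a minimal sketch of that step, assuming pandas is used. The helper name `load_tracks` and the sample rows are hypothetical. The in-memory text stands in for the downloaded file so that the snippet runs on its own.

```python
import io
import pandas as pd

def load_tracks(source, encoding="utf-8"):
    """Read the track CSV from a file path or file-like object into a DataFrame."""
    return pd.read_csv(source, encoding=encoding)

# Tiny in-memory stand-in for the downloaded CSV (made-up rows, not real data)
sample_csv = io.StringIO(
    "track_name,artist(s)_name,released_year,streams\n"
    "Song A,Artist X,2022,1000\n"
    "Song B,Artist Y,2023,2500\n"
)
df_sample = load_tracks(sample_csv)
df_sample.info()  # prints column names, non-null counts, and dtypes
```

With the real file you would pass its path instead of the in-memory buffer. If pandas raises a decoding error, trying a different `encoding` value is a common first step.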
df.info()

df.info() gives an overview of the data.
The leftmost column is the index. Incidentally, Python counts from 0, while R, being a statistical language, counts from 1.
The next column lists the parameters, such as track attributes, and next to it is Non-Null Count. This shows how many non-null entries each parameter has, which tells you how many Null values (values that were never set) are present. The parameter "in_shazam_charts" appears to contain Nulls, so it needs to be dealt with. The rightmost column is the data type. We can see that each parameter is either an object or an int (integer type).
From this overview, the next step would be to work out how to carry out what is called "data cleaning". Data cleaning takes quite a few steps and techniques, though, so I'll leave it for another occasion.
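As a small taste of what that cleaning might involve, here is a hedged sketch on made-up rows. It counts missing values and converts text columns to numbers. The sample values, the comma-stripping step, and the fill-with-zero policy are all assumptions for illustration, not the author's actual cleaning steps.

```python
import pandas as pd

# Made-up rows mimicking the situation described above: a column with a missing
# value, and numeric columns that were read in as text (dtype object)
df_demo = pd.DataFrame({
    "track_name": ["Song A", "Song B", "Song C"],
    "in_shazam_charts": ["12", None, "1,021"],
    "streams": ["1000", "2500", "not available"],
})

# Count missing values per column
null_counts = df_demo.isna().sum()

# Strip thousands separators, then convert; unparseable values become NaN
df_demo["in_shazam_charts"] = pd.to_numeric(
    df_demo["in_shazam_charts"].str.replace(",", "", regex=False), errors="coerce"
)
df_demo["streams"] = pd.to_numeric(df_demo["streams"], errors="coerce")

# One simple policy (an assumption): treat a missing chart value as 0
df_demo["in_shazam_charts"] = df_demo["in_shazam_charts"].fillna(0)
```

Whether filling with 0, dropping the row, or leaving NaN is appropriate depends on what the column means, so treat the last line as one option among several.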
Simple Visualization and Analysis
Here we'll use the Python programming language to visualize the data and see what kind of music has been popular this year.
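import matplotlib.pyplot as plt
import seaborn as sns

# Count how many tracks in the dataset were released in each year
plt.figure(figsize=(10,8))
sns.countplot(data=df, x='released_year')
plt.title('Distribution of Tracks Produced Over Time')
plt.xticks(rotation=90)
plt.show()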
For example, the simple code above produces a bar chart like this one.

We can see that many tracks from the previous year (2022) are still being listened to this year.
months = ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"]
sns.boxplot(data=df, x="streams", y='released_month', order=months)
plt.title('Boxplot of Stream Volume by Released Month in 2023')
plt.ylabel('Release Month')
plt.show()

Next is a box plot of release month for the most-streamed popular tracks of 2023 (to date).
Two release months stand out, with more tracks than the other months.
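The box plot code orders the axis by month name. If the month column is stored as integers 1 to 12 (an assumption, since the `df.info()` output isn't reproduced here), a mapping step along these lines would be needed first. The column name `released_month_name` and the sample rows are hypothetical.

```python
import calendar
import pandas as pd

# Made-up rows; released_month is assumed to hold integers 1-12
df_demo = pd.DataFrame({"released_month": [1, 5, 5, 12], "streams": [100, 250, 300, 80]})

# calendar.month_name[1] is "January" ... [12] is "December" (in the default locale)
months = list(calendar.month_name)[1:]
df_demo["released_month_name"] = df_demo["released_month"].map(
    lambda m: calendar.month_name[int(m)]
)

# An ordered categorical keeps calendar order when grouping or plotting
df_demo["released_month_name"] = pd.Categorical(
    df_demo["released_month_name"], categories=months, ordered=True
)
median_streams = df_demo.groupby("released_month_name", observed=True)["streams"].median()
```

The median per month is the line drawn inside each box of a box plot, so `median_streams` is a quick numeric cross-check of the chart.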
# Count how many tracks each artist has in the dataset
artist_counts = df['artist(s)_name'].value_counts()
df_artist_counts = pd.DataFrame({'Artist': artist_counts.index, 'Count': artist_counts.values})
# Sort by track count and keep the ten artists with the most tracks
df_sorted_artists = df_artist_counts.sort_values(by='Count', ascending=False)
df_top_10 = df_sorted_artists.head(10)
sns.barplot(data=df_top_10, x='Count', y='Artist')
plt.title("Top 10 Artists by Number of Tracks in 2023")
plt.show()

And here are the top 10 artists with the most tracks among the most-streamed songs of 2023 (to date). First place goes to Taylor Swift, as expected. Truly impressive.
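The chart code above counts tracks per artist with `value_counts()`. Ranking by summed streams is a different measure. The sketch below contrasts the two on made-up rows, assuming the `streams` column has already been converted to numbers.

```python
import pandas as pd

# Made-up rows for illustration only
df_demo = pd.DataFrame({
    "artist(s)_name": ["Artist X", "Artist Y", "Artist X", "Artist Z", "Artist Y", "Artist Y"],
    "streams": [500, 100, 700, 2000, 150, 50],
})

# Ranking by number of tracks (what value_counts() measures)
track_counts = df_demo["artist(s)_name"].value_counts()

# Ranking by summed streams across each artist's tracks
total_streams = (
    df_demo.groupby("artist(s)_name")["streams"].sum().sort_values(ascending=False)
)
```

In this toy data the artist with the most tracks is not the one with the most streams, which shows that the two rankings can disagree.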
Track Attributes
From here on we'll look into track attributes. The attributes here, such as danceability, energy, and acousticness, express as numbers what kind of character each track has.
We'll run a correlation analysis on the attribute data and visualize it as a heatmap.
track_attributes = ['danceability_%',
                    'valence_%',
                    'energy_%',
                    'acousticness_%',
                    'instrumentalness_%',
                    'liveness_%',
                    'speechiness_%',
                    'bpm']
plt.figure(figsize=(10,8))
sns.heatmap(df[track_attributes].corr(),
            vmin=-1,
            vmax=1,
            annot=True,
            cmap='RdBu_r')
plt.xticks(rotation=45)
plt.show()

Looking at the heatmap above, the blue cell shows that acousticness and energy have a fairly strong negative correlation (with this colormap, blue marks negative values). Let's look a little closer.
plt.figure(figsize=(10,6))
sns.regplot(x='acousticness_%', y='energy_%', data=df_sorted_100)
plt.show()
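The snippet above uses `df_sorted_100`, which this article never defines. Here is a minimal sketch of one way it could be built, assuming it holds the 100 most-streamed tracks. The helper name `top_n_by_streams` and the sample rows are hypothetical. The last line computes the Pearson correlation that the regression plot visualizes.

```python
import pandas as pd

def top_n_by_streams(df, n=100):
    """Return the n rows with the highest 'streams' value, highest first."""
    streams = pd.to_numeric(df["streams"], errors="coerce")
    return df.assign(streams=streams).nlargest(n, "streams")

# Made-up rows; streams is stored as text to mimic an object-typed column
df_demo = pd.DataFrame({
    "streams": ["300", "100", "200", "400"],
    "acousticness_%": [10, 80, 40, 5],
    "energy_%": [85, 30, 60, 90],
})
df_top = top_n_by_streams(df_demo, n=3)

# Pearson correlation between the two attributes in the selected rows
r = df_top["acousticness_%"].corr(df_top["energy_%"])
```

With the real data, `df_sorted_100 = top_n_by_streams(df)` would be the corresponding call under this assumption.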

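When building a predictive model with machine learning, you need to check whether any pair of explanatory variables has a high correlation coefficient. When such pairs are present, the situation is called multicollinearity, and some of the explanatory variables may need to be removed. Explaining this properly would take a while, so I'll save it for another occasion and write it up in a note on machine learning.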
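The check described above can be sketched as a scan of the correlation matrix for strongly correlated pairs. The function name `high_correlation_pairs`, the 0.8 threshold, and the sample rows are assumptions for illustration. A pairwise scan like this is only a first screen, not a full multicollinearity diagnosis.

```python
import pandas as pd

def high_correlation_pairs(df, columns, threshold=0.8):
    """List (col_a, col_b, r) for column pairs whose |Pearson r| is at least threshold."""
    corr = df[columns].corr()
    pairs = []
    for i, a in enumerate(columns):
        for b in columns[i + 1:]:
            r = corr.loc[a, b]
            if abs(r) >= threshold:
                pairs.append((a, b, round(float(r), 3)))
    return pairs

# Made-up rows: energy and acousticness move in opposite directions, bpm does not
df_demo = pd.DataFrame({
    "energy_%": [20, 40, 60, 80, 90],
    "acousticness_%": [85, 60, 45, 20, 10],
    "bpm": [120, 95, 140, 100, 128],
})
flagged = high_correlation_pairs(df_demo, ["energy_%", "acousticness_%", "bpm"])
```

Here only the energy/acousticness pair is flagged. In a modeling workflow, one variable of a flagged pair would be a candidate for removal.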
plt.figure(figsize=(10,6))
sns.countplot(x='key',
              data=df,
              palette='viridis',
              order=df['key'].value_counts().index)
plt.title('Distribution of Songs by Key')
plt.xlabel('Key')
plt.ylabel('Count')
plt.show()

This bar chart shows which key each track is built on, and the overall tendency. C# clearly stands out above the rest. Which famous songs are in C#?
df_sorted_100[df_sorted_100['key'] == 'C#'].sort_values(by='streams',ascending=False)

Looking at the release years, we can see that quite old songs are also being listened to a lot this year. Notable examples include Blinding Lights by The Weeknd and Shape of You by Ed Sheeran.
# Analyzing the distribution of Beats per Minute
plt.figure(figsize=(10,6))
sns.histplot(df_sorted_100['bpm'], bins=50, kde=True)
plt.axvline(x=df_sorted_100['bpm'].mean(), color='red', linestyle='dashed', linewidth=2, label="Mean BPM")
plt.title('Distribution of Beats Per Minute (BPM)')
plt.xlabel('BPM')
plt.ylabel('Frequency')
plt.legend()
plt.show()

Next we'll look at BPM. BPM stands for Beats Per Minute and is the unit for musical tempo. The higher the number, the faster the track; the lower the number, the slower the track.
Incidentally, the red dashed line in the graph marks the average BPM of the top 100 most-streamed songs.
plt.figure(figsize=(8,6))
sns.countplot(x='mode',
              data=df,
              palette='viridis',
              order=df['mode'].value_counts().index)
plt.title('Distribution of Songs by Mode')
plt.xlabel('Mode')
plt.ylabel('Count')
plt.show()

Finally, let's look at the attribute called mode. Mode is either Major or Minor; put simply, Major keys sound bright and Minor keys sound dark. The bar chart above shows that there are more tracks in Major than in Minor.
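To put a number on the Major versus Minor comparison, the counts behind the bar chart can be turned into shares. This is a small sketch on made-up values, not figures from the dataset.

```python
import pandas as pd

# Made-up mode labels for five tracks
modes = pd.Series(["Major", "Minor", "Major", "Major", "Minor"], name="mode")

counts = modes.value_counts()                # tracks per mode
shares = modes.value_counts(normalize=True)  # fraction of tracks per mode
```

On the real data, `df['mode'].value_counts(normalize=True)` would give the corresponding proportions.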
In Part 2, I'd like to actually use the Spotify API to fetch data for the tracks I want to examine and analyze them against this year's trending songs.