1226

[Week 17 - Day 3] Recommendation system

Deep learning autoencoder practice

# Build a deep learning model that predicts ratings in the MovieLens dataset

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os

from keras.layers import Input, Embedding, Flatten, Dot, Dense
from keras.models import Model

ratings = pd.read_csv("https://grepp-reco-test.s3.ap-northeast-2.amazonaws.com/movielens/ratings.csv")

ratings.head()

users = len(ratings.userId.unique())
movies = len(ratings.movieId.unique())
print(users, movies)

ratings.describe()

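- Looking closely at movieId, there are only 9066 distinct IDs but the maximum value is 163949, so the IDs need to be remapped to consecutive integers before they are used as embedding indices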

movieId_to_seqId = {}
seqId_to_movieId = {}
for sId, id in enumerate(ratings["movieId"].unique()):
  seqId_to_movieId[sId] = id
  movieId_to_seqId[id] = sId
  
def return_movieId_to_seqId(row):
  return movieId_to_seqId[row.movieId]

ratings['new_movieId'] = ratings.apply(return_movieId_to_seqId, axis=1)

ratings.describe()

- Running the code above stores the remapped IDs in a new column called new_movieId
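
The same remapping can be written without an explicit loop. The following is a minimal sketch, assuming a small made-up DataFrame (`toy` and its values are hypothetical, not the real MovieLens data). It uses `pd.factorize`, which assigns consecutive integer codes in order of first appearance:

```python
import pandas as pd

# Hypothetical toy ratings with sparse movie IDs (not the real MovieLens data)
toy = pd.DataFrame({
    "userId": [1, 1, 2, 3],
    "movieId": [31, 163949, 31, 1029],
    "rating": [2.5, 4.0, 3.0, 5.0],
})

# factorize returns (codes, uniques); codes run from 0 to n_unique - 1
# in order of first appearance
codes, uniques = pd.factorize(toy["movieId"])
toy["new_movieId"] = codes

# The inverse mapping (sequential ID -> original movieId) is an index into uniques
seq_to_movie = dict(enumerate(uniques))

print(toy)
print(seq_to_movie)
```

Because the codes are dense, the embedding table only needs one row per distinct movie instead of one row per possible raw ID.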

from sklearn.model_selection import train_test_split
train, test = train_test_split(ratings, test_size=0.2, random_state=42)

""" Movie layer """
movie_input = Input(shape=[1], name="Movies")
# Embedding(input_dim, output_dim)
# - input_dim: vocabulary size (here, the number of movies)
# - output_dim: dimension of the embedding layer output
movie_embedding = Embedding(movies+1, 5, name='Embedded_Movies')(movie_input)
movie_final = Flatten(name='Flatten_Movies')(movie_embedding)

""" User layer """
user_input = Input(shape=[1], name="Users")
user_embedding = Embedding(users+1, 5, name="Embedded_Users")(user_input)
user_final = Flatten(name='Flatten_Users')(user_embedding)

""" Dot product layer """
mult = Dot(name="Dot_prodct", axes=1)([movie_final, user_final])
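
To make the computation concrete, here is a rough NumPy sketch of what the embedding, flatten, and dot layers do for one batch. The table sizes, the random initial values, and the index arrays are all assumptions chosen for illustration; in the Keras model the tables are learned during training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes; dim matches the embedding size of 5 used in the model
n_users, n_movies, dim = 4, 6, 5

# Each row is one embedding vector (randomly initialised here, learned in Keras)
user_table = rng.normal(size=(n_users, dim))
movie_table = rng.normal(size=(n_movies, dim))

# A batch of (movie, user) index pairs, like the two model inputs
movie_ids = np.array([0, 3, 5])
user_ids = np.array([1, 1, 2])

# Embedding lookup is row selection; Dot(axes=1) is a row-wise dot product
movie_vecs = movie_table[movie_ids]   # shape (3, 5)
user_vecs = user_table[user_ids]      # shape (3, 5)
pred = np.sum(movie_vecs * user_vecs, axis=1, keepdims=True)  # shape (3, 1)

print(pred.shape)
```

Each predicted rating is a single number: the dot product of one movie vector and one user vector.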

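from keras import backend as K

def root_mean_squared_error(y_true, y_pred):
  # Average the squared error over the whole batch before taking the root.
  # keras.losses.mean_squared_error only reduces the last axis, so its square root
  # on a (batch, 1) output would be the per-sample absolute error, not RMSE.
  return K.sqrt(K.mean(K.square(y_pred - y_true)))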

model = Model([movie_input, user_input], mult)
model.compile(optimizer='adam', loss=root_mean_squared_error)
model.summary()

from keras.utils import plot_model

plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)

from keras.models import load_model

history = model.fit([train.new_movieId, train.userId], train.rating, epochs=10, verbose=1)
model.save(r'recomender_model.h5')
plt.plot(history.history['loss'])
plt.xlabel('Epochs')
plt.ylabel('Training_Error')

model.evaluate([test.new_movieId, test.userId], test.rating)
predictions = model.predict([test.new_movieId.head(8), test.userId.head(8)])

for i in range(0, 8):
  print(predictions[i], test.rating.iloc[i])

- This lets us compare the trained model's predictions with the actual ratings
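
Beyond eyeballing a few rows, the comparison can be summarised with an error metric. A minimal sketch, assuming made-up prediction and rating arrays (the numbers below are illustrative, not results from the model above):

```python
import numpy as np

# Made-up predicted and actual ratings, for illustration only
y_pred = np.array([3.5, 4.0, 2.0, 5.0])
y_true = np.array([4.0, 4.0, 3.0, 3.0])

errors = y_pred - y_true

# RMSE squares the errors before averaging, so large misses weigh more
rmse = np.sqrt(np.mean(errors ** 2))

# MAE averages the absolute errors
mae = np.mean(np.abs(errors))

print(f"RMSE={rmse:.4f}, MAE={mae:.4f}")
```

RMSE is never smaller than MAE on the same data, and the gap widens when a few predictions are far off.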

[6th cohort] Programmers AI Dev Course Day 117 TIL
