[KERAS] Training With and Without a Validation Dataset

2021. 10. 29. 10:24 · Deep Learning

I've only ever used Keras in a hurried, surface-level way, so while working through an online course I'm writing this down before I forget it again.

 

The dataset is fashion_mnist (built into Keras).

 

1. Training without validation (test) data - the trained model can still be checked against the test data afterwards with its evaluate function

1-1) Load and normalize the data (train, test)

import numpy as np
from tensorflow.keras.datasets import fashion_mnist

(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# Normalize the pixel values into the 0~1 range
def get_preprocessed_data(images, labels):
    images = np.array(images / 255.0, dtype=np.float32)
    labels = np.array(labels, dtype=np.float32)
    return images, labels

train_images, train_labels = get_preprocessed_data(train_images, train_labels)
test_images, test_labels = get_preprocessed_data(test_images, test_labels)

print("train dataset shape:", train_images.shape, train_labels.shape)
print("test dataset shape:", test_images.shape, test_labels.shape)

> Result

train dataset shape: (60000, 28, 28) (60000,)
test dataset shape: (10000, 28, 28) (10000,)

1-2) Build the model

from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Sequential

model = Sequential([
    Flatten(input_shape=(28, 28)),
    Dense(100, activation='relu'),
    Dense(30, activation='relu'),
    Dense(10, activation='softmax')
])

model.summary()

> Result: the Flatten layer turns each 28x28 image into a 1-D array of 784 values; the parameter counts work out as inputs x units + biases

  - flatten -> dense : 784 (flatten layer) x 100 (dense layer) + 100 (bias) = 78,500

  - dense -> dense_1 : 100 x 30 + 30 = 3,030

  - dense_1 -> dense_2 : 30 x 10 + 10 = 310

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
flatten (Flatten)            (None, 784)               0         
_________________________________________________________________
dense (Dense)                (None, 100)               78500     
_________________________________________________________________
dense_1 (Dense)              (None, 30)                3030      
_________________________________________________________________
dense_2 (Dense)              (None, 10)                310       
=================================================================
Total params: 81,840
Trainable params: 81,840
Non-trainable params: 0
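
Those counts can be verified with plain arithmetic (each Dense layer has inputs x units weights plus one bias per unit):

print(784 * 100 + 100)       # dense:   78500
print(100 * 30 + 30)         # dense_1: 3030
print(30 * 10 + 10)          # dense_2: 310
print(78500 + 3030 + 310)    # total:   81840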

1-3) Compile the model (somewhere I once heard this described as the step that assembles the created nodes, but whether that's right...????)

  - It's a multi-class classification model, so the loss is categorical_crossentropy (the labels get one-hot encoded in the next step)

 

from tensorflow.keras.optimizers import Adam

# the loss and metric are passed as strings, so no extra imports are needed
model.compile(optimizer=Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])
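
As a side note (not what the course code does): if you'd rather skip the one-hot encoding step in 1-4), Keras also accepts integer labels directly when the loss is switched to the sparse variant:

# same model, but trained on integer labels - no to_categorical needed
model.compile(optimizer=Adam(learning_rate=0.001),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])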

1-4) One-hot encode the labels

from tensorflow.keras.utils import to_categorical

train_oh_labels = to_categorical(train_labels)
test_oh_labels = to_categorical(test_labels)

print(train_oh_labels.shape, test_oh_labels.shape)

> Result

(60000, 10) (10000, 10)
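
For a single label, to_categorical just turns a class index into a 10-element indicator vector:

print(to_categorical([9], num_classes=10))
# [[0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]]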

1-5) Train the model

history = model.fit(x=train_images, y=train_oh_labels, batch_size=32, epochs=20, verbose=1)

> Result

  - There are 60,000 samples and batch_size is 32, so one full pass over the data is split into 1875 (60000/32) training steps

  - epochs is 20, so the model trains for 20 passes over the data

  - However, these loss and accuracy values are measured on the training data itself, so they overlap with what the model was fitted on and make it hard to tell whether it is overfitting

Epoch 1/20
1875/1875 [==============================] - 7s 3ms/step - loss: 0.5217 - accuracy: 0.8173
Epoch 2/20
1875/1875 [==============================] - 6s 3ms/step - loss: 0.3848 - accuracy: 0.8622
Epoch 3/20
1875/1875 [==============================] - 6s 3ms/step - loss: 0.3449 - accuracy: 0.8744
...............................................................
...............................................................
...............................................................
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2013 - accuracy: 0.9238
Epoch 19/20
1875/1875 [==============================] - 5s 3ms/step - loss: 0.1993 - accuracy: 0.9246
Epoch 20/20
1875/1875 [==============================] - 5s 3ms/step - loss: 0.1924 - accuracy: 0.9275
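
In general the step count per epoch is ceil(samples / batch_size); here the division happens to be exact:

import math
print(math.ceil(60000 / 32))  # 1875 steps per epoch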

1-6) View the loss and accuracy history

print(history.history['loss'])
print(history.history['accuracy'])

> Result

[0.5217015743255615, 0.3847706913948059, 0.3449043929576874, 0.32317453622817993, 0.30555015802383423, 0.2914452850818634, 0.2787344753742218, 0.2689889371395111, 0.2602957487106323, 0.24981915950775146, 0.2438966929912567, 0.23561082780361176, 0.22739382088184357, 0.22206111252307892, 0.2169351726770401, 0.2127165049314499, 0.20825491845607758, 0.20133952796459198, 0.1992630511522293, 0.1923792064189911]
[0.8172666430473328, 0.8622000217437744, 0.8743500113487244, 0.8816999793052673, 0.8874333500862122, 0.8915833234786987, 0.8980833292007446, 0.8997166752815247, 0.9035999774932861, 0.9058499932289124, 0.9086833596229553, 0.9110833406448364, 0.9141833186149597, 0.9162166714668274, 0.9170833230018616, 0.9195833206176758, 0.9207833409309387, 0.9238499999046326, 0.9246166944503784, 0.9274666905403137]
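
Since history.history is just a dict of per-epoch lists, pulling out the best epoch, for example, is straightforward:

# index of the epoch with the lowest training loss (0-based)
best_epoch = int(np.argmin(history.history['loss']))
print('lowest train loss at epoch', best_epoch + 1, ':', history.history['loss'][best_epoch])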

1-7) Predict with the trained model (a batch of 2+ images & a single image)

# When passing 2 or more images
pred_proba_multi = model.predict(test_images)
# Convert the prediction at a given index from a probability vector to a class number
pred_multi = np.argmax(pred_proba_multi[0])

# When passing a single image, an extra batch dimension has to be added
pred_proba_single = model.predict(np.expand_dims(test_images[0], axis=0))
# Then drop the batch dimension again
pred_single = np.argmax(np.squeeze(pred_proba_single))
# pred_single = np.argmax(pred_proba_single[0]) should work just as well

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat','Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
print('target class value:', test_labels[0], 'predicted class value:', pred_single)

> 결과

target class value: 9.0 predicted class value: 9
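
The reshaping in the single-image case is purely about matching the (batch, height, width) layout the model expects; with the variables above, the shapes move like this:

print(test_images[0].shape)                          # (28, 28)    - a bare image
print(np.expand_dims(test_images[0], axis=0).shape)  # (1, 28, 28) - a batch of one
print(pred_proba_single.shape)                       # (1, 10)     - one row of class probabilities
print(np.squeeze(pred_proba_single).shape)           # (10,)       - flattened back down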

1-8) Evaluate the model on the test dataset

model.evaluate(test_images, test_oh_labels, batch_size=64)

> Result

157/157 [==============================] - 0s 2ms/step - loss: 0.3832 - accuracy: 0.8859
[0.38321688771247864, 0.8859000205993652]
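
evaluate returns the loss followed by the metrics in compile order, so the two numbers can also be unpacked directly:

test_loss, test_acc = model.evaluate(test_images, test_oh_labels, batch_size=64)
print('test loss:', test_loss, 'test accuracy:', test_acc)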

 

 

2. Checking accuracy and loss on held-out (validation) data during training

  - By default, once training starts (model.fit) the hyperparameters can't be changed mid-run, but they can be changed through callbacks (see the sketch below)
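
For example, ReduceLROnPlateau is a built-in callback that watches a metric and lowers the learning rate mid-training when it stalls; a minimal sketch (the factor/patience values here are only illustrative):

from tensorflow.keras.callbacks import ReduceLROnPlateau

# halve the learning rate whenever val_loss fails to improve for 2 epochs
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=2, verbose=1)

# passed to training via the callbacks argument, e.g.
# model.fit(..., callbacks=[reduce_lr])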

2-1) Load the data and split the training set (train data -> train data & val data)

  - sklearn is used for the split

  - In the code below, 15% of the data is set aside as validation data

import numpy as np 
import pandas as pd
from tensorflow.keras.datasets import fashion_mnist

(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

def get_preprocessed_data(images, labels):
    images = np.array(images/255.0, dtype=np.float32)
    labels = np.array(labels, dtype=np.float32)
    
    return images, labels
    
train_images, train_labels = get_preprocessed_data(train_images, train_labels)
test_images, test_labels = get_preprocessed_data(test_images, test_labels)


from sklearn.model_selection import train_test_split
from tensorflow.keras.utils import to_categorical

# Split the original training data into separate train and validation sets
tr_images, val_images, tr_labels, val_labels = train_test_split(train_images, train_labels, test_size=0.15, random_state=2021)
print('train/validation shape:', tr_images.shape, tr_labels.shape, val_images.shape, val_labels.shape)

# Apply OHE (one-hot encoding)
tr_oh_labels = to_categorical(tr_labels)
val_oh_labels = to_categorical(val_labels)

print('after OHE:', tr_oh_labels.shape, val_oh_labels.shape)

> Result (split into 51,000 train and 9,000 val samples)

train/validation shape: (51000, 28, 28) (51000,) (9000, 28, 28) (9000,)
after OHE: (51000, 10) (9000, 10)
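
One extra option worth knowing (not used in the course code): passing stratify=train_labels makes train_test_split keep the 10 classes equally represented in both splits:

tr_images, val_images, tr_labels, val_labels = train_test_split(
    train_images, train_labels, test_size=0.15,
    stratify=train_labels,  # preserve the class distribution in both splits
    random_state=2021)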

2-2) The model is built the same way, but at training time the validation data is passed in as a tuple together with its labels

from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam

INPUT_SIZE = 28
model = Sequential([
    Flatten(input_shape=(INPUT_SIZE, INPUT_SIZE)),
    Dense(100, activation='relu'),
    Dense(30, activation='relu'),
    Dense(10, activation='softmax')
])

model.compile(optimizer=Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])


# The validation data is passed in as a tuple of (images, labels)
history = model.fit(x=tr_images, y=tr_oh_labels, batch_size=128, validation_data=(val_images, val_oh_labels), 
                    epochs=20, verbose=1)

> Result: val_loss and val_accuracy can now be checked as well

Epoch 1/20
399/399 [==============================] - 3s 4ms/step - loss: 0.6180 - accuracy: 0.7885 - val_loss: 0.4737 - val_accuracy: 0.8350
Epoch 2/20
399/399 [==============================] - 1s 3ms/step - loss: 0.4172 - accuracy: 0.8536 - val_loss: 0.4027 - val_accuracy: 0.8574
Epoch 3/20
399/399 [==============================] - 1s 3ms/step - loss: 0.3766 - accuracy: 0.8664 - val_loss: 0.3831 - val_accuracy: 0.8631
...................................................................
...................................................................
Epoch 18/20
399/399 [==============================] - 1s 3ms/step - loss: 0.2224 - accuracy: 0.9169 - val_loss: 0.3041 - val_accuracy: 0.8912
Epoch 19/20
399/399 [==============================] - 1s 3ms/step - loss: 0.2119 - accuracy: 0.9205 - val_loss: 0.3205 - val_accuracy: 0.8850
Epoch 20/20
399/399 [==============================] - 1s 3ms/step - loss: 0.2094 - accuracy: 0.9217 - val_loss: 0.3190 - val_accuracy: 0.8901
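
Incidentally, Keras can do this split by itself: the validation_split argument to fit holds out the last fraction of the training data as validation, so sklearn isn't strictly needed (note that it does not shuffle before splitting):

# same idea without sklearn - Keras holds out the last 15% for validation
history = model.fit(x=train_images, y=to_categorical(train_labels),
                    batch_size=128, validation_split=0.15,
                    epochs=20, verbose=1)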

3. It can also be checked as a graph

import matplotlib.pyplot as plt
%matplotlib inline

plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label='val_accuracy')
plt.plot(history.history['loss'], label='loss')
plt.plot(history.history['val_loss'], label='val_loss')
plt.legend()

> Result

<matplotlib.legend.Legend at 0x7f00188b7e10>
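
Loss and accuracy sit on different scales, so splitting them into two subplots can make the curves easier to read:

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.plot(history.history['loss'], label='loss')
ax1.plot(history.history['val_loss'], label='val_loss')
ax1.set_title('loss')
ax1.legend()
ax2.plot(history.history['accuracy'], label='accuracy')
ax2.plot(history.history['val_accuracy'], label='val_accuracy')
ax2.set_title('accuracy')
ax2.legend()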