
[Cohort 6] Programmers AI Dev Course Day 61 TIL


[Week 11] CNN Monthly Project

Starting early, since training will clearly take a long time.

Computing convolution output size
- output height = floor((height + 2 * padding - filter_height) / stride) + 1
- output width = floor((width + 2 * padding - filter_width) / stride) + 1
height = 16
width = 16
filter_height = 4
filter_width = 4
stride = 2
padding = 1

output_height = (height + 2 * padding - filter_height) // stride + 1
output_width = (width + 2 * padding - filter_width) // stride + 1

print('output height:', output_height, 'output width:', output_width)
  • Knowing this calculation lets you feed the correct size into the next layer when stacking layers; a quick check against an actual layer is shown below.
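As a sanity check, the same numbers can be verified with a real nn.Conv2d layer (a minimal sketch; the single input/output channel is arbitrary, since channel count doesn't affect the spatial size):

import torch
import torch.nn as nn

# Same hyperparameters as the hand calculation above; channels are arbitrary.
conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=4, stride=2, padding=1)
x = torch.randn(1, 1, 16, 16)  # batch of one 16 x 16 input
print(conv(x).shape)  # torch.Size([1, 1, 8, 8]), matching (16 + 2*1 - 4) // 2 + 1 = 8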
 #   Layer            Details
 1   Input            image size: 3 x 64 x 64
 2   Convolution      # of kernels: 128, kernel size: 8 x 8, stride: 1, zero padding: 0
 3   Pooling          max pooling, kernel size: 2 x 2, stride: 2
 4   Convolution      # of kernels: 256, kernel size: 8 x 8, stride: 1, zero padding: 0
 5   Pooling          max pooling, kernel size: 2 x 2, stride: 2
 6   Convolution      # of kernels: 512, kernel size: 4 x 4, stride: 1, zero padding: 0
 7   Pooling          max pooling, kernel size: 2 x 2, stride: 2
 8   Fully Connected  # of neurons: 4096
 9   Activation       ReLU
10   Fully Connected  # of neurons: 6
11   Softmax          6 classes
- Code implementing the spec above
import torch
import torch.nn as nn
import torch.nn.functional as F

class CustomLeNet(nn.Module):
    def __init__(self):
        super(CustomLeNet, self).__init__()
        # Write the source code here.
        # input image size: 3 X 64 X 64
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=128, kernel_size=8, stride=1, padding=0)
        # 128 X 57 X 57
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        # 128 X 28 X 28
        self.conv2 = nn.Conv2d(in_channels=128, out_channels=256, kernel_size=8, stride=1, padding=0)
        # 256 X 21 X 21
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        # 256 X 10 X 10
        self.conv3 = nn.Conv2d(in_channels=256, out_channels=512, kernel_size=4, stride=1, padding=0)
        # 512 X 7 X 7
        self.pool3 = nn.MaxPool2d(kernel_size=2, stride=2)
        # 512 X 3 X 3

        self.fc1 = nn.Linear(512 * 3 * 3, 4096)
        self.fc2 = nn.Linear(4096, 6)

    def forward(self, x):
        # Write the source code here.
        x = self.pool1(self.conv1(x))
        x = self.pool2(self.conv2(x))
        x = self.pool3(self.conv3(x))
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
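A quick shape check for the model above (a minimal sketch with a random dummy input; not part of the assignment template):

model = CustomLeNet()
dummy = torch.randn(1, 3, 64, 64)  # one fake 3-channel 64 x 64 image
print(model(dummy).shape)  # torch.Size([1, 6]): one logit per class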
  • Final performance
    [ Validation epoch: 30 ]
    Accuracy: 76.13736425007338
    Average loss: 0.020857428354988848
    Model saved! (time elapsed: 757.4019074440002)
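Note that forward returns raw logits with no explicit softmax layer, so the Softmax row of the spec is presumably handled by the loss function; a common pairing, assumed here since the training loop isn't shown, is nn.CrossEntropyLoss:

criterion = nn.CrossEntropyLoss()  # applies log-softmax internally, covering the Softmax row
logits = CustomLeNet()(torch.randn(4, 3, 64, 64))    # hypothetical batch of 4 images
loss = criterion(logits, torch.randint(0, 6, (4,)))  # hypothetical random labels in [0, 6)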
 #   Layer            Details
 1   Input            image size: 3 x 64 x 64
 2   Convolution      # of kernels: 96, kernel size: 5 x 5, stride: 1, zero padding: 2
 3   Activation       ReLU
 4   Normalization    LRN (Local Response Normalization), size: 5
 5   Pooling          max pooling, kernel size: 3 x 3, stride: 2
 6   Convolution      # of kernels: 256, kernel size: 5 x 5, stride: 1, zero padding: 2
 7   Activation       ReLU
 8   Normalization    LRN (Local Response Normalization), size: 5
 9   Pooling          max pooling, kernel size: 3 x 3, stride: 2
10   Convolution      # of kernels: 384, kernel size: 3 x 3, stride: 1, zero padding: 1
11   Activation       ReLU
12   Convolution      # of kernels: 384, kernel size: 3 x 3, stride: 1, zero padding: 1
13   Activation       ReLU
14   Convolution      # of kernels: 256, kernel size: 3 x 3, stride: 1, zero padding: 1
15   Activation       ReLU
16   Pooling          max pooling, kernel size: 3 x 3, stride: 2
17   Fully Connected  # of neurons: 4096
18   Activation       ReLU
19   Dropout          probability: 0.5
20   Fully Connected  # of neurons: 6
21   Dropout          probability: 0.5
22   Softmax          6 classes
- Code implementing the spec above
  • A close reimplementation of AlexNet, the paper (NIPS 2012) that showed the world how strong CNN-based classification models are.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AlexNet(nn.Module):
    def __init__(self):
        super(AlexNet, self).__init__()
        # Write the source code here.
        # dimension: (3 x 64 x 64)

        self.conv1 = nn.Conv2d(in_channels=3, out_channels=96, kernel_size=5, stride=1, padding=2)
        # dimension: (96 x 64 x 64)
        # after pooling: (96 x 31 x 31)

        self.conv2 = nn.Conv2d(in_channels=96, out_channels=256, kernel_size=5, stride=1, padding=2)
        # dimension: (256 x 31 x 31)
        # after pooling: (256 x 15 x 15)

        self.conv3 = nn.Conv2d(in_channels=256, out_channels=384, kernel_size=3, stride=1, padding=1)
        # dimension: (384 x 15 x 15)
        self.conv4 = nn.Conv2d(in_channels=384, out_channels=384, kernel_size=3, stride=1, padding=1)
        # dimension: (384 x 15 x 15)
        self.conv5 = nn.Conv2d(in_channels=384, out_channels=256, kernel_size=3, stride=1, padding=1)
        # dimension: (256 x 15 x 15)
        # after pooling (spec step 16): (256 x 7 x 7)

        self.fc1 = nn.Linear(256 * 7 * 7, 4096)
        # dimension: (4096)
        self.fc2 = nn.Linear(4096, 6)
        # dimension: (6)
        self.lrn = nn.LocalResponseNorm(5)
        self.pool = nn.MaxPool2d(kernel_size=3, stride=2)
        self.dropout = nn.Dropout(0.5)

    def forward(self, x):
        # Write the source code here.
        x = F.relu(self.conv1(x))
        x = self.pool(self.lrn(x))
        x = F.relu(self.conv2(x))
        x = self.pool(self.lrn(x))
        x = F.relu(self.conv3(x))
        x = F.relu(self.conv4(x))
        x = self.pool(F.relu(self.conv5(x)))  # spec step 16: max pool after conv5 -> (256 x 7 x 7)
        x = torch.flatten(x, 1)  # flatten every dimension except the batch
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(self.fc2(x))
        return x
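The same shape check for this model (a minimal sketch; with the pooling after conv5, the flattened feature is 256 * 7 * 7):

model = AlexNet()
dummy = torch.randn(1, 3, 64, 64)  # one fake 3-channel 64 x 64 image
print(model(dummy).shape)  # torch.Size([1, 6])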
  • Having done all this, it still wasn't halfway through training after twelve hours, so it looks like the code needs reworking...
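For the training speed, one thing worth ruling out first (an assumption, since the training script isn't shown here) is that everything is running on the CPU; moving the model and each batch to a GPU is the standard fix:

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = AlexNet().to(device)
# inside the training loop, move each batch as well:
# images, labels = images.to(device), labels.to(device)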