ABOUT ME

-

Today
-
Yesterday
-
Total
-
  • Auto encoder
    Base Line/python 기초 코드 2022. 8. 23. 21:32

    DDPM Autoencoder 

    Diffusion Autoencoders: Toward a Meaningful and Decodable Representation

    cvpr2022

    논문을 보기전에 

    auto encoder랑 VAE를 정리하고 가면 좋을 것 같아서 auto encoder 구현에 대해 짧게 정리했다.

    import torch
    import torchvision
    import torch.nn.functional as F
    from torch import nn, optim
    from torchvision import transforms, datasets
    
    import matplotlib.pyplot as plt
    from mpl_toolkits.mplot3d import Axes3D
    from matplotlib import cm
    import numpy as np
    
    #주피터 노트북 사용시 에러방지
    import os
    os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
    
    
    # 하이퍼파라미터
    EPOCH = 10
    BATCH_SIZE = 64
    USE_CUDA = torch.cuda.is_available()
    DEVICE = torch.device("cuda" if USE_CUDA else "cpu")
    print("Using Device:", DEVICE)
    
    
    # Fashion MNIST 데이터셋
    trainset = datasets.FashionMNIST(
        root      = './.data/', 
        train     = True,
        download  = True,
        transform = transforms.ToTensor()
    )
    train_loader = torch.utils.data.DataLoader(
        dataset     = trainset,
        batch_size  = BATCH_SIZE,
        shuffle     = True,
        num_workers = 2
    )
    
    class Autoencoder(nn.Module):
        def __init__(self):
            super(Autoencoder, self).__init__()
    
            self.encoder = nn.Sequential(
                nn.Linear(28*28, 128),
                nn.ReLU(),
                nn.Linear(128, 64),
                nn.ReLU(),
                nn.Linear(64, 12),
                nn.ReLU(),
                nn.Linear(12, 3),   # 입력의 특징을 3차원으로 압축합니다
            )
            self.decoder = nn.Sequential(
                nn.Linear(3, 12),
                nn.ReLU(),
                nn.Linear(12, 64),
                nn.ReLU(),
                nn.Linear(64, 128),
                nn.ReLU(),
                nn.Linear(128, 28*28),
                nn.Sigmoid(),       # 픽셀당 0과 1 사이로 값을 출력합니다
            )
    
        def forward(self, x):
            encoded = self.encoder(x)
            decoded = self.decoder(encoded)
            return encoded, decoded
            
            
            
    autoencoder = Autoencoder().to(DEVICE)
    optimizer = torch.optim.Adam(autoencoder.parameters(), lr=0.005)
    criterion = nn.MSELoss()
    
    
    # 원본 이미지를 시각화 하기 (첫번째 열)
    view_data = trainset.data[:5].view(-1, 28*28)
    view_data = view_data.type(torch.FloatTensor)/255.
    
    def train(autoencoder, train_loader):
        autoencoder.train()
        for step, (x, label) in enumerate(train_loader):
            x = x.view(-1, 28*28).to(DEVICE)
            y = x.view(-1, 28*28).to(DEVICE)
            label = label.to(DEVICE)
    
            encoded, decoded = autoencoder(x)
    
            loss = criterion(decoded, y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            
            
      for epoch in range(1, EPOCH+1):
        train(autoencoder, train_loader)
    
        # 디코더에서 나온 이미지를 시각화 하기 (두번째 열)
        test_x = view_data.to(DEVICE)
        _, decoded_data = autoencoder(test_x)
    
        # 원본과 디코딩 결과 비교해보기
        f, a = plt.subplots(2, 5, figsize=(5, 2))
        print("[Epoch {}]".format(epoch))
        for i in range(5):
            img = np.reshape(view_data.data.numpy()[i],(28, 28))
            a[0][i].imshow(img, cmap='gray')
            a[0][i].set_xticks(()); a[0][i].set_yticks(())
    
        for i in range(5):
            img = np.reshape(decoded_data.to("cpu").data.numpy()[i], (28, 28))
            a[1][i].imshow(img, cmap='gray')
            a[1][i].set_xticks(()); a[1][i].set_yticks(())
        plt.show()
    
    
    CLASSES = {
        0: 'T-shirt/top',
        1: 'Trouser',
        2: 'Pullover',
        3: 'Dress',
        4: 'Coat',
        5: 'Sandal',
        6: 'Shirt',
        7: 'Sneaker',
        8: 'Bag',
        9: 'Ankle boot'
    }
    
    fig = plt.figure(figsize=(10,8))
    ax = Axes3D(fig)
    
    X = encoded_data.data[:, 0].numpy()
    Y = encoded_data.data[:, 1].numpy()
    Z = encoded_data.data[:, 2].numpy()
    
    labels = trainset.targets[:200].numpy()
    
    for x, y, z, s in zip(X, Y, Z, labels):
        name = CLASSES[s]
        color = cm.rainbow(int(255*s/9))
        ax.text(x, y, z, name, backgroundcolor=color)
    
    ax.set_xlim(X.min(), X.max())
    ax.set_ylim(Y.min(), Y.max())
    ax.set_zlim(Z.min(), Z.max())
    plt.show()

    reference:

    https://wikidocs.net/152474

     

    1) VAE(Variational Auto-Encoder)

    ## VAE란? VAE는 Input image X를 잘 설명하는 feature를 추출하여 Latent vector z에 담고, 이 Latent vector z를 통해 X와 ...

    wikidocs.net

     

    https://dacon.io/codeshare/4551

     

    [pytorch 기초 - day5] AutoEncoder 오토 인코더

     

    dacon.io

    https://gaussian37.github.io/dl-concept-vae/

     

    VAE(Variational AutoEncoder)

    gaussian37's blog

    gaussian37.github.io

     

    'Base Line > python 기초 코드' 카테고리의 다른 글

    sobel, canny, laplacian filter  (1) 2022.12.03
    Genetic Algorithm(feature selection) + LightGBM  (0) 2022.08.25
    sklearn 회귀관련 모델 정리  (0) 2022.08.16
    torch check list  (0) 2022.08.14
    pandas - 2  (0) 2022.07.12
Designed by Tistory.