Deep Learning Study

Special Lectures

1. Artificial Intelligence 101 and Models

2. Basic Elements & Single Neuron

3. Multiple Neurons

4. Backward Propagation

5. Multi-layer Neural Networks

6. Model Selection

7. Optimization Technology

  • Lecture Note
  • Papers

    • Stochastic Gradient Descent (SGD)
    • Momentum
    • Nesterov’s Accelerated Gradient (NAG)
      • Y. Bengio, N. Boulanger-Lewandowski and R. Pascanu, “Advances in optimizing recurrent networks,” 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, 2013, pp. 8624-8628. doi: 10.1109/ICASSP.2013.6639349 Link, arXiv Link
    • AdaGrad
      • John Duchi, Elad Hazan, Yoram Singer, “Adaptive Subgradient Methods for Online Learning and Stochastic Optimization,” Journal of Machine Learning Research 12 (2011) 2121-2159 Link
    • RMSprop
      • Tieleman, T. and Hinton, G., “Lecture 6.5 - RMSprop: Divide the gradient by a running average of its recent magnitude,” COURSERA: Neural Networks for Machine Learning, 2012 Link
    • Adam
      • Diederik P. Kingma, Jimmy Ba, “Adam: A Method for Stochastic Optimization,” Proceedings of the 3rd International Conference on Learning Representations (ICLR), 2015 arXiv Link
  • Laboratory

  • Blogs

8. Parameter Initialization

9. Normalization & Regularization

10. Hyperparameter Optimization

  • Lecture Note
  • Papers
    • Random Search
      • James Bergstra, Yoshua Bengio, “Random Search for Hyper-Parameter Optimization,” Journal of Machine Learning Research, Vol. 13, No. 1, pp. 281-305, January 2012 Link

11. Convolutional Neural Networks

11.1 Image Classification

11.2 Object Detection

11.3 Face Recognition

12. Recurrent Neural Networks

13. Autoencoder

14. Reinforcement Learning

Appendix I. Machine Learning

I-1. Eigenvector & Eigenvalue

I-2. Bayes' Theorem & Naive Bayes Classifier

I-3. Linear Regression

I-4. PCA (Principal Component Analysis)

I-5. Machine Learning Platform

I-6. TensorFlow