Transformer Neural Networks - EXPLAINED! (Attention is all you need)

2020-01-13 24,746 766,623

Please subscribe to keep me alive: https://www.youtube.com/c/CodeEmporium?sub_confirmation=1

BLOG: https://medium.com/@dataemporium

PLAYLISTS FROM MY CHANNEL
⭕ Reinforcement Learning: https://youtube.com/playlist?list=PLTl9hO2Oobd9kS--NgVz0EPNyEmygV1Ha&si=AuThDZJwG19cgTA8
⭕ Natural Language Processing: https://youtube.com/playlist?list=PLTl9hO2Oobd_bzXUpzKMKA3liq2kj6LfE&si=LsVy8RDPu8jeO-cc
⭕ Transformers from Scratch: https://youtube.com/playlist?list=PLTl9hO2Oobd_bzXUpzKMKA3liq2kj6LfE
⭕ ChatGPT Playlist: https://youtube.com/playlist?list=PLTl9hO2Oobd9coYT6XsTraTBo4pL1j4HJ
⭕ Convolutional Neural Networks: https://youtube.com/playlist?list=PLTl9hO2Oobd9U0XHz62Lw6EgIMkQpfz74
⭕ The Math You Should Know: https://youtube.com/playlist?list=PLTl9hO2Oobd-_5sGLnbgE8Poer1Xjzz4h
⭕ Probability Theory for Machine Learning: https://youtube.com/playlist?list=PLTl9hO2Oobd9bPcq0fj91Jgk_-h1H_W3V
⭕ Coding Machine Learning: https://youtube.com/playlist?list=PLTl9hO2Oobd82vcsOnvCNzxrZOlrz3RiD

MATH COURSES (7 day free trial)
📕 Mathematics for Machine Learning: https://imp.i384100.net/MathML
📕 Calculus: https://imp.i384100.net/Calculus
📕 Statistics for Data Science: https://imp.i384100.net/AdvancedStatistics
📕 Bayesian Statistics: https://imp.i384100.net/BayesianStatistics
📕 Linear Algebra: https://imp.i384100.net/LinearAlgebra
📕 Probability: https://imp.i384100.net/Probability

OTHER RELATED COURSES (7 day free trial)
📕 ⭐ Deep Learning Specialization: https://imp.i384100.net/Deep-Learning
📕 Python for Everybody: https://imp.i384100.net/python
📕 MLOps Course: https://imp.i384100.net/MLOps
📕 Natural Language Processing (NLP): https://imp.i384100.net/NLP
📕 Machine Learning in Production: https://imp.i384100.net/MLProduction
📕 Data Science Specialization: https://imp.i384100.net/DataScience
📕 TensorFlow: https://imp.i384100.net/Tensorflow

REFERENCES
[1] The main paper: https://arxiv.org/abs/1706.03762
[2] Tensor2Tensor has some code with a tutorial: https://www.tensorflow.org/tutorials/text/transformer
[3] Transformer very intuitively explained (amazing): http://jalammar.github.io/illustrated-transformer/
[4] Medium blog with an intuitive explanation: https://medium.com/inside-machine-learning/what-is-a-transformer-d07dd1fbec04
[5] Pretrained word embeddings: https://nlp.stanford.edu/projects/glove/
[6] Intuitive explanation of layer normalization: https://mlexplained.com/2018/11/30/an-overview-of-normalization-methods-in-deep-learning/
[7] Paper that gives even better results than transformers (Pervasive Attention): https://arxiv.org/abs/1808.03867
[8] BERT uses transformers to pretrain neural nets for common NLP tasks: https://ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html
[9] Stanford lecture on RNNs: http://cs231n.stanford.edu/slides/2018/cs231n_2018_lecture10.pdf
[10] Colah's blog: https://colah.github.io/posts/2015-08-Understanding-LSTMs/
[11] Wikipedia article, for the timeline of events: https://en.wikipedia.org/wiki/Transformer_(machine_learning_model)