I am currently studying for a master's degree in IT, focusing on AI, or more specifically neural networks. Last semester we were introduced to neural networks: we learned about weights and biases, how a cell applies its activation function (we covered the perceptron and the sigmoid), and we "built" a really simple NN in Python. (We didn't really write it ourselves, but copied it from a guide.)
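Just to show the level I'm at: I can write out a single sigmoid cell from scratch myself (this is my own toy sketch, not the code from the guide, and the weights are made up):

```python
import math

def sigmoid(z):
    # Squash the weighted sum into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs plus the bias, passed through the activation
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# Two inputs, two made-up weights, a made-up bias
print(neuron([1.0, 0.5], [0.4, -0.2], 0.1))  # prints a value around 0.599
```

So the mechanics at this level I do follow; it's everything stacked on top of this that I lose track of.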
This semester we delved into deep learning: how one can stack multiple layers of neural networks on top of each other and manage to learn incredible things. For our project I chose to build a chatbot, and though it works, I still don't really understand it thoroughly. I've watched videos and read articles, but I struggle to understand it completely.
What annoys me is that I don't understand the whys of things. Why is cell X (for example, a GRU) better to use here than cell Y? Why did the AdaDelta optimizer work better than Adam? (And when should I use which?) And so forth.
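To make this concrete: I can even transcribe the update rules and run them on a toy problem (here plain SGD vs. Adam minimizing f(x) = x², my own made-up setup with made-up hyperparameters), but I still couldn't tell you *when* one beats the other:

```python
import math

def grad(x):
    # Gradient of the toy objective f(x) = x^2
    return 2.0 * x

def sgd(x, lr=0.1, steps=50):
    # Plain gradient descent: step against the gradient at a fixed rate
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def adam(x, lr=0.1, steps=50, b1=0.9, b2=0.999, eps=1e-8):
    # Adam: per-step bias-corrected estimates of the gradient's mean and variance
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g        # first-moment (mean) estimate
        v = b2 * v + (1 - b2) * g * g    # second-moment (uncentered variance) estimate
        m_hat = m / (1 - b1 ** t)        # bias correction for the zero init
        v_hat = v / (1 - b2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

print(sgd(5.0), adam(5.0))  # both end up near the minimum at 0
```

I can follow every line mechanically; what I'm missing is the reasoning that tells me which behaviour matters for which kind of problem.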
I was wondering if you guys knew of a book (or books) or some other resources that could help me understand these sorts of things:
- Neural networks from the ground up (I would really appreciate it if it also taught all the mathematics instead of just assuming I know it)
- Optimizers: the differences between the major ones and when to use each
- The different types of cells (LSTM, GRU) and networks (fully connected, convolutional, recurrent, etc.) and when to use them¹
- (I'll edit this list if I come up with more)
¹ For example, I would like to be able to look at an image like this and understand what is happening: what are the different symbols, and how and why does it work? Regarding the different types of networks: I was watching this video and it made me wonder why he used two convolutional layers and two fully connected layers. What are the differences between each?
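The cells are a good example of where I get stuck. I can write out one common formulation of a single GRU step as a scalar toy version (one input, one hidden unit, made-up weights; real GRUs use matrices), but I couldn't tell you why the gates make it better than a plain RNN cell:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gru_step(x, h, p):
    # p holds made-up scalar weights; each gate is a tiny sigmoid "neuron".
    z = sigmoid(p["Wz"] * x + p["Uz"] * h + p["bz"])  # update gate: how much new info to take in
    r = sigmoid(p["Wr"] * x + p["Ur"] * h + p["br"])  # reset gate: how much past state to use
    h_cand = math.tanh(p["Wh"] * x + p["Uh"] * (r * h) + p["bh"])  # candidate new state
    return (1 - z) * h + z * h_cand  # blend the old state with the candidate

params = {"Wz": 0.5, "Uz": -0.3, "bz": 0.0,
          "Wr": 0.8, "Ur": 0.2, "br": 0.0,
          "Wh": 1.0, "Uh": 0.7, "bh": 0.0}

h = 0.0
for x in [1.0, 0.5, -0.25]:  # a tiny input sequence
    h = gru_step(x, h, params)
print(h)  # hidden state after the sequence, somewhere in (-1, 1)
```

I can trace what each gate computes, but not the design reasoning: why these particular gates, and why a GRU rather than an LSTM for a given task.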
I know this might be a lot to ask, but I've always been someone who wants to learn the why and the how. I love math, and when I was younger I just had to learn exactly why things were the way they were before moving on to the next topic. That helped me tremendously with calculus and physics. However, ever since I started with neural networks I've had to just accept things and move on. Now I want to be able to build a neural network / deep learning network and know exactly what I am doing: knowing what I have to change to get better performance, or what type of network I need to use. At the moment I just copy and paste examples from the internet, and I'm forced to test every single variable in the program, hoping that some of them give better performance.
Sorry for my English; I know it may not be easy to follow along. Anyway, any help would be much appreciated.
from Artificial Intelligence http://ift.tt/2rkMP03