Transcript:

This is a three, sloppily written and rendered at a very low resolution of 28 x 28 pixels, but your brain has no trouble recognizing it as a three. I want you to take a moment to appreciate how your brain can do this so effortlessly. I mean, this, this, and this are also recognizable as threes, even though the specific values of each pixel are very different from one image to the next. The particular light-sensitive cells in your eye that fire when you see this three are very different from the ones that fire when you see that three. But something in that smart visual cortex of yours resolves these as representing the same idea, the number three, while at the same time recognizing other images as their own distinct ideas.
But if I told you, hey, sit down and write me a program that takes in a grid of 28 x 28 pixels like this and outputs a single number between 0 and 10, telling you what it thinks the digit is, the task goes from comically trivial to dauntingly difficult. Unless you've been living under a rock, I hardly need to motivate the relevance and importance of machine learning and neural networks to the present and the future. But what I want to do here is show you what a neural network actually is, assuming no background, and help you visualize what it's doing, not in jargon but in the language of mathematics. My hope is that you come away feeling that the structure itself is motivated, and that you know what it means when you read or hear about a neural network "learning". This video is devoted only to the structure component of neural networks, and the next one will tackle learning.
What we're going to do is put together a neural network that can learn to recognize handwritten digits. This is a somewhat classic example for introducing the topic, and I'm happy to stick with the status quo here, because at the end of the two videos I want to point you to a couple of good resources where you can learn more, and where you can download the code that does this and run it on your own computer.
There are many, many variants of neural networks, and in recent years there has been a kind of boom in research into these variants. But in these two introductory videos, you and I are only going to look at the simplest plain form, with no complicated additions. This is a necessary prerequisite for understanding any of the more powerful modern variants, and trust me, it still has plenty of complexity for us to wrap our minds around. But even in this simplest form, the network can learn to recognize handwritten digits, which is a pretty cool thing for a computer to be able to do. At the same time, you'll see how it falls short of a couple of hopes we might have for it.
As the name suggests, neural networks are inspired by the brain, but let's break that down. What are the neurons, and in what sense are they linked together? Right now, when I say neuron, all I want you to think about is a thing that holds a number, specifically a number between 0 and 1.
It's really nothing more than that. For example, the network starts with a bunch of neurons corresponding to each of the 28 x 28 pixels of the input image, which is 784 neurons in total. Each one holds a number representing the grayscale value of the corresponding pixel, ranging from 0 for black pixels up to 1 for white pixels. This number inside the neuron is called its activation, and the image you might have in mind is that each neuron is lit up when its activation is a high number. So all of these 784 neurons make up the first layer of our network.
Now jumping over to the last layer, it has ten neurons, each representing one of the digits. The activation in these neurons, again some number between 0 and 1, represents how much the network thinks that a given image corresponds to a given digit. There are also a couple of layers in between called the hidden layers, which for the time being should just be a giant question mark for how on earth this process of recognizing digits is going to be handled. In this network I chose two hidden layers, each with 16 neurons, and admittedly that's kind of an arbitrary choice. To be honest, I chose two layers based on how I want to motivate the structure in just a moment. As for 16?
That was just a nice number to fit on the screen. In practice there is a lot of room to experiment with the specific structure here.
The way the network operates, activations in one layer determine the activations of the next layer. And of course, the heart of the network as an information-processing mechanism comes down to exactly how those activations from one layer bring about activations in the next layer. It's meant to be loosely analogous to how, in biological networks of neurons, some groups of neurons firing cause certain others to fire.
Now, the network I'm showing here has already been trained to recognize digits, and let me show you what I mean by that. It means that if you feed in an image, lighting up all 784 neurons of the input layer according to the brightness of each pixel in the image, that pattern of activations causes some very specific pattern in the next layer, which causes some pattern in the one after it, which finally gives some pattern in the output layer. And the brightest neuron of that output layer is the network's choice, so to speak, for what digit this image represents.
Before jumping into the math for how one layer influences the next, or how training works, let's just talk about why it's even reasonable to expect a layered structure like this to behave intelligently. What are we expecting here? What is the best hope for what those middle layers might be doing?
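To make the input and output conventions described so far concrete, here is a minimal Python sketch. It is only an illustration: the function names `flatten_image` and `predicted_digit` are made up for this example, the assumption that pixels are stored as integers from 0 to 255 is mine rather than the video's, and the output activations are invented numbers, not the result of a trained network.

```python
def flatten_image(image):
    """Turn a 28x28 grid of 0..255 grayscale values into 784 activations in [0, 1]."""
    if len(image) != 28 or any(len(row) != 28 for row in image):
        raise ValueError("expected a 28x28 grid of pixels")
    # Row by row, scale each pixel so that 0 is black and 1 is white.
    return [pixel / 255.0 for row in image for pixel in row]


def predicted_digit(output_activations):
    """Return the digit whose output neuron has the highest activation."""
    if len(output_activations) != 10:
        raise ValueError("expected one activation per digit 0-9")
    return max(range(10), key=lambda digit: output_activations[digit])


# A blank image with a single white pixel, just to exercise the code.
image = [[0] * 28 for _ in range(28)]
image[14][14] = 255
input_layer = flatten_image(image)

# Made-up output activations; a trained network would compute these.
example_outputs = [0.02, 0.01, 0.10, 0.93, 0.05, 0.20, 0.01, 0.03, 0.15, 0.08]
print(len(input_layer), predicted_digit(example_outputs))  # 784 3
```

Everything between these two steps, the hidden layers, is what the rest of the video is about.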
Well, when you or I recognize digits, we piece together various components. A 9 has a loop up top and a line on the right. An 8 also has a loop up top, but it's paired with another loop down low. A 4 basically breaks down into three specific lines, and so on. In a perfect world, we might hope that each neuron in the second-to-last layer corresponds to one of these subcomponents, so that anytime you feed in an image with, say, a loop up top, like a 9 or an 8, there is some specific neuron whose activation is going to be close to 1. And I don't mean this specific loop of pixels; the hope would be that any generally loopy pattern towards the top sets off this neuron. That way, going from the third layer to the last one just requires learning which combination of subcomponents corresponds to which digit.
Of course, that just kicks the problem down the road, because how would you recognize these subcomponents, or even learn what the right subcomponents should be? And I still haven't talked about how one layer influences the next, but run with me on this one for a moment. Recognizing a loop can also break down into subproblems. One reasonable way to do this would be to first recognize the various little edges that make it up. Similarly, a long line, like the kind you might see in the digits 1, 4, or 7, is really just a long edge, or maybe you think of it as a certain pattern of several smaller edges. So maybe our hope is that each neuron in the second layer of the network corresponds to one of the various relevant little edges. Maybe when an image like this one comes in, it lights up all of the neurons associated with around eight to ten specific little edges, which in turn lights up the neurons associated with the upper loop and a long vertical line, and those light up the neuron associated with a 9. Whether or not this is what the network actually does is another question.
I'll come back to it once we see how to train the network. But this is a hope that we might have, a sort of goal with a layered structure like this. Moreover, you can imagine how being able to detect edges and patterns like this would be really useful for other image recognition tasks. And even beyond image recognition, there are all sorts of intelligent things you might want to do that break down into layers of abstraction. Parsing speech, for example, involves taking raw audio and picking out distinct sounds, which combine to make certain syllables, which combine to form words, which combine to make up phrases and more abstract thoughts, and so on.
But getting back to how any of this actually works, picture yourself right now designing how exactly the activations in one layer might determine the activations in the next. The goal is to have some mechanism that could conceivably combine pixels into edges, or edges into patterns, or patterns into digits. To zoom in on one very specific example, let's say the hope is for one particular neuron in the second layer to pick up on whether or not the image has an edge in this region here. The question at hand is: what parameters should the network have? What dials and knobs should you be able to tweak so that it's expressive enough to potentially capture this pattern, or any other pixel pattern, or the pattern that several edges can make a loop, and other such things?
Well, what we'll do is assign a weight to each one of the connections between our neuron and the neurons from the first layer. These weights are just numbers. Then take all of those activations from the first layer and compute their weighted sum according to these weights. I find it helpful to think of these weights as being organized into a little grid of their own, and I'm going to use green pixels to indicate positive weights and red pixels to indicate negative weights, where the brightness of each pixel is a loose depiction of the weight's value. Now if we made the weights associated with almost all of the pixels zero, except for some positive weights in this region that
we care about, then taking the weighted sum of all the pixel values really just amounts to adding up the values of the pixels in the region we care about. And if you really wanted to pick up on whether there's an edge here, what you might do is have some negative weights associated with the surrounding pixels. Then the sum is largest when those middle pixels are bright but the surrounding pixels are darker.
When you compute a weighted sum like this, you might come out with any number, but for this network what we want is for activations to be some value between 0 and 1. So a common thing to do is to pump this weighted sum into some function that squishes the real number line into the range between 0 and 1. A common function that does this is called the sigmoid function, also known as a logistic curve. Basically, very negative inputs end up close to 0, very positive inputs end up close to 1, and it just steadily increases around the input 0. So the activation of the neuron here is basically a measure of how positive the relevant weighted sum is.
But maybe it's not that you want the neuron to light up whenever the weighted sum is bigger than 0. Maybe you only want it to be active when the sum is bigger than, say, 10. That is, you want some bias for it to be inactive. What we'll do then is just add some other number, like -10, to this weighted sum before plugging it into the sigmoid function. That additional number is called the bias. So the weights tell you what pixel pattern this neuron in the second layer is picking up on, and the bias tells you
how high the weighted sum needs to be before the neuron starts getting meaningfully active. And that is just one neuron. Every other neuron in this layer is going to be connected to all 784 pixel neurons from the first layer, and each one of those 784 connections has its own weight associated with it. Also, each neuron has some bias, some other number that you add on to the weighted sum before squishing it with the sigmoid. And that's a lot to think about! With this hidden layer of 16 neurons, that's a total of 784 x 16 weights, along with 16 biases. And all of that is just the connections from the first layer to the second. The connections between the other layers also have a bunch of weights and biases associated with them. All said and done, this network has about 13,000 weights and biases: 13,000 knobs and dials that can be tweaked and turned to make this network behave in different ways.
So when we talk about learning, what that refers to is getting the computer to find a valid setting for all of these many, many numbers so that it will actually solve the problem at hand. One thought experiment that is at once fun and kind of horrifying is to imagine sitting down and setting all of these weights and biases by hand, purposefully tweaking the numbers so that the second layer picks up on edges, the third layer picks up on patterns, and so on. I personally find this satisfying, rather than just treating the network as a total black box, because when the network doesn't perform the way you anticipate, if you've built up a little bit of a relationship with what those weights and biases actually mean, you have a starting place for experimenting with how to change the structure to improve it. Or when the network does work, but not for the reasons you might expect, digging into what the weights and biases are doing is a good way to challenge your assumptions and really expose the full space of possible solutions.
By the way, the actual function here is a little cumbersome to write down, don't you think?
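Before moving to the compact notation, the single-neuron computation described above (weighted sum, plus bias, through a sigmoid) can be sketched in a few lines of Python. This is a toy under stated assumptions, not the network from the video: it uses a hypothetical 3 x 3 patch instead of all 784 pixels, and the weight values of 8 and -8 are invented for illustration; only the bias of -10 echoes the number mentioned in the narration.

```python
import math


def sigmoid(x):
    """Squish any real number into the range (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # Equivalent form that avoids overflow for very negative x.
    e = math.exp(x)
    return e / (1.0 + e)


def neuron_activation(inputs, weights, bias):
    """Weighted sum of the previous layer's activations, plus bias, through sigmoid."""
    weighted_sum = sum(w * a for w, a in zip(weights, inputs))
    return sigmoid(weighted_sum + bias)


# Hypothetical 3x3 patch (flattened row by row) instead of all 784 pixels.
# Positive weights on the middle row, negative weights on the rows around it.
weights = [-8.0, -8.0, -8.0,
            8.0,  8.0,  8.0,
           -8.0, -8.0, -8.0]
bias = -10.0  # the neuron stays quiet unless the weighted sum exceeds about 10

horizontal_edge = [0, 0, 0, 1, 1, 1, 0, 0, 0]  # bright middle, dark surround
all_bright = [1] * 9
all_dark = [0] * 9

for name, patch in [("edge", horizontal_edge), ("bright", all_bright), ("dark", all_dark)]:
    print(name, round(neuron_activation(patch, weights, bias), 4))
```

With these made-up weights the neuron comes out very close to 1 for the edge patch and very close to 0 for the uniformly bright and uniformly dark patches, which is the behavior the negative surrounding weights and the bias are meant to produce.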
So let me show you a more notationally compact way that these connections are represented. This is how you'd see it if you choose to read up more about neural networks. Organize all of the activations from one layer into a column as a vector. Then organize all of the weights as a matrix, where each row of that matrix corresponds to the connections between one layer and a particular neuron in the next layer. What that means is that taking the weighted sum of the activations in the first layer according to these weights corresponds to one of the terms in the matrix-vector product of everything we have on the left here.
By the way, so much of machine learning just comes down to having a good grasp of linear algebra, so for any of you who want a nice visual understanding of matrices and what matrix-vector multiplication means, take a look at the series I did on linear algebra, especially chapter 3.
Back to our expression: instead of talking about adding the bias to each one of these values independently, we organize all those biases into a vector and add the entire vector to the previous matrix-vector product. Then, as a final step, we wrap a sigmoid around the outside here, and what that's supposed to represent is that you apply the sigmoid function to each component of the resulting vector inside. So once you write down this weight matrix and these vectors as their own symbols, you can communicate the full transition of activations from one layer to the next in an extremely tight and neat little expression. This makes the relevant code both a lot simpler and a lot faster, since many libraries heavily optimize matrix multiplication.
Remember how earlier I said these neurons are simply things that hold numbers?
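The layer-to-layer expression just described, sigmoid applied componentwise to the weight matrix times the activation vector plus the bias vector, can be sketched in plain Python as follows. This is an illustrative sketch, not the code the video refers to: the helper names are hypothetical, the weights are random rather than trained, and nested lists stand in for the optimized matrix libraries mentioned in the narration. It also counts the parameters of the 784-16-16-10 structure.

```python
import math
import random


def sigmoid(x):
    """Squish any real number into the range (0, 1), avoiding overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def layer_forward(weights, biases, activations):
    """Compute sigmoid(W a + b); each row of W feeds one neuron of the next layer."""
    return [
        sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
        for row, b in zip(weights, biases)
    ]


def random_network(sizes, seed=0):
    """Build untrained (random) weights and biases for the given layer sizes."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [rng.uniform(-1, 1) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers


def network_forward(layers, activations):
    """Apply the layer transition repeatedly, from input layer to output layer."""
    for weights, biases in layers:
        activations = layer_forward(weights, biases, activations)
    return activations


def count_parameters(layers):
    """Total number of weights and biases in the network."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


SIZES = [784, 16, 16, 10]  # input, two hidden layers, output
network = random_network(SIZES)
output = network_forward(network, [0.5] * 784)  # a uniformly gray dummy image
print(len(output), count_parameters(network))  # 10 13002
```

The count works out to 784 x 16 + 16 + 16 x 16 + 16 + 16 x 10 + 10 = 13,002, which is the "about 13,000" quoted in the narration.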
Well, of course, the specific numbers they hold depend on the image you feed in, so it's actually more accurate to think of each neuron as a function, one that takes in the outputs of all the neurons in the previous layer and spits out a number between 0 and 1. Really, the entire network is just a function, one that takes in 784 numbers as input and spits out 10 numbers as output. It's an absurdly complicated function, one that involves 13,000 parameters in the form of these weights and biases that pick up on certain patterns, and which involves iterating many matrix-vector products and the sigmoid squishing function. But it's just a function nonetheless, and in a way it's kind of reassuring that it looks complicated. I mean, if it were any simpler, what hope would we have that it could take on the challenge of recognizing digits?
And how does it take on that challenge? How does this network learn the appropriate weights and biases just by looking at data? That's what I'll show in the next video, and I'll also dig a little more into what this particular network is really doing.
Now is the point where I suppose I should say subscribe to stay notified about when new videos come out, but realistically most of you don't actually receive notifications from YouTube, do you? Maybe more honestly I should say subscribe so that the neural networks that underlie YouTube's recommendation algorithm are primed to believe that you want to see content from this channel get recommended to you. Anyway, stay posted for more.
Thank you very much to everyone supporting these videos on Patreon. I've been a little slow to progress in the probability series this summer, but I'm jumping back into it after this project, so patrons, you can look out for updates there.
To close things off here, I have with me Lisha Li, who did her PhD work on the theoretical side of deep learning and who currently works at an investment firm called Amplify Partners, who kindly provided some of the funding for this video. So, Lisha, one thing I think we should quickly bring up is this sigmoid function.
As I understand it, early networks used this to squish the relevant weighted sum into the interval between 0 and 1, kind of motivated by the biological analogy of neurons being either inactive or active.
Exactly.
But relatively few modern networks actually use sigmoid anymore. It's kind of old school, right?
Yeah, or rather, ReLU seems to be much easier to train.
And ReLU stands for rectified linear unit?
Yes, it's this kind of function where you just take the max of 0 and a, where a is the weighted sum you were explaining in the video. It was partially motivated, I think, by a biological analogy with how neurons would either be activated or not: if the input passes a certain threshold, the neuron is activated and passes the value along, and if it doesn't, the neuron stays inactive. Using sigmoids made networks very difficult to train, and ReLU turned out to be easier to train.
Thanks, Lisha.
For background, Amplify Partners is an early-stage investor that backs technical founders building the next generation of companies focused on applications of AI. If you or someone you know has ever thought about starting a company someday, or if you're working on an early-stage one right now, the Amplify folks would love to hear from you. They even set up a specific email for this video, 3blue1bro[email protected], so feel free to reach out to them through that.
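For reference, the two activation functions compared in the closing conversation can be written side by side. This is a small illustrative sketch; the sample inputs are arbitrary, and `a` stands for the weighted sum plus bias, as in the conversation.

```python
import math


def sigmoid(a):
    """Older choice: squishes any input into the range (0, 1)."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)  # avoids overflow for very negative a
    return e / (1.0 + e)


def relu(a):
    """Rectified linear unit: max(0, a). Inactive below zero, identity above."""
    return max(0.0, a)


for a in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(f"a={a:6.1f}  sigmoid={sigmoid(a):.4f}  relu={relu(a):.1f}")
```

Note the difference in shape: the sigmoid flattens out at both ends, while ReLU is zero for negative inputs and grows without bound for positive ones.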