Transcript:
Hi, I’m Carrie Anne, and welcome to Crash Course Computer Science! As we’ve touched on many times in this series, computers are incredible at storing, organizing, fetching, and processing huge volumes of data. That’s perfect for things like e-commerce sites with millions of items for sale, and for storing billions of health records for quick access by doctors. But what if we want to use computers not just to fetch and display data, but to actually make decisions about data? This is the essence of machine learning: algorithms that give computers the ability to learn from data, and then make predictions and decisions. Computer programs with this ability are extremely useful for answering questions like: Is this email spam? Does this person have an irregular heartbeat? What video should YouTube recommend next? While useful, we probably wouldn’t describe these programs as “intelligent” in the same way we think of human intelligence. So, although the terms are often used interchangeably, most computer scientists would say that machine learning is a set of techniques that sits inside the more ambitious goal of artificial intelligence, or AI for short.
INTRO
Machine learning and AI algorithms tend to be pretty sophisticated. So instead of delving into the mechanics of how they work, we’re going to focus on what the algorithms do conceptually. Let’s start with a simple example: deciding whether a butterfly is a Luna butterfly or an Emperor butterfly. This decision process is called classification, and an algorithm that performs it is called a classifier. Although there are techniques that can use raw data for training, such as pictures and sounds, many algorithms reduce the complexity of real-world objects and phenomena into what are called features. Features are values that usefully characterize the things we want to classify. In our butterfly example, we’re going to use two features: “wingspan” and “mass.” In order to train our machine learning classifier to make good predictions, we’ll need training data.
To get it, we’d send an entomologist into the forest to collect data for both Luna and Emperor butterflies. These experts can recognize different butterflies, so they not only record the feature values, but also label that data with the actual butterfly species. This is called labeled data. Because we only have two features, it is easy to visualize this data in a scatter plot. Here, I’ve plotted data for 100 Emperor butterflies in red and 100 Luna butterflies in blue. We can see that the species form two groups, but there is some overlap in the middle, so it is not entirely clear how best to separate the two. This is what machine learning algorithms do: find the optimal separation! I’m just going to eyeball it and say that anything less than 45 mm in wingspan is likely to be an Emperor butterfly. And we can add another division, which additionally says that the mass must be less than 0.75 for our guess to be an Emperor butterfly. These lines that cut up the decision space are called decision boundaries.
If we look closely at our data, we can see that 86 Emperor butterflies correctly end up inside the Emperor decision region, but 14 incorrectly end up in Luna butterfly territory. On the other hand, 82 Luna butterflies would be correct, with 18 falling on the wrong side. A table like this one, showing where a classifier gets things right and wrong, is called a confusion matrix, which probably should also have been the title of two of the films in the Matrix trilogy! Note that there is no way for us to draw lines that give us 100% accuracy. If we lower our wingspan decision boundary, we misclassify more Emperor butterflies as Lunas. If we raise it, we misclassify more Luna butterflies. The job of machine learning algorithms, at a high level, is to maximize correct classifications while minimizing errors. On our training data, we get 168 butterflies correct and 32 wrong, for an average classification accuracy of 84%.
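As a small sketch of the arithmetic just described, the counts from the confusion matrix can be turned into the 84% figure like this. The dictionary layout and the function name `accuracy` are illustrative choices, not something shown in the video; only the four counts come from the transcript.

```python
# Confusion matrix using the counts quoted in the transcript.
# Outer key: true species. Inner key: species predicted by the
# eyeballed decision boundaries.
confusion = {
    "Emperor": {"Emperor": 86, "Luna": 14},
    "Luna": {"Emperor": 18, "Luna": 82},
}


def accuracy(matrix):
    """Return the fraction of samples whose prediction matches the true label."""
    correct = sum(matrix[label][label] for label in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total


correct = sum(confusion[label][label] for label in confusion)
total = sum(sum(row.values()) for row in confusion.values())
print(f"{correct} correct, {total - correct} wrong, accuracy {accuracy(confusion):.0%}")
```

The diagonal entries (86 and 82) are the correct classifications; everything off the diagonal is an error.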
Now, using those decision boundaries, if we go out into the woods and run into an unknown butterfly, we can measure its features and plot them in our decision space. This is unlabeled data. Our decision boundaries offer a guess as to what species the butterfly is. In this case, we’d predict it’s a Luna butterfly. This simple approach, dividing the decision space into boxes, can be represented by what is called a decision tree, which looks like this as a picture, or it can be written in code with if-statements, like this. A machine learning algorithm that produces decision trees must choose which features to split on, and then, for each of those features, what values to use for the split.
Decision trees are just one basic example of a machine learning technique. There are hundreds of algorithms in the computer science literature today, and more are being published all the time. A few algorithms even use many decision trees working together to make a prediction. Computer scientists smugly call those forests, because they contain lots of trees. There are also non-tree-based approaches, such as Support Vector Machines, which at their core slice up the decision space using arbitrary lines. These lines do not have to be straight. They can be polynomials or some other mathematical function. As before, it’s the machine learning algorithm’s job to figure out the best lines to provide the most accurate decision boundaries.
So far, my examples have only had two features, and that is easy enough for a human to figure out. If we add a third feature, let’s say length of antennae, then our 2D lines become 3D planes, creating decision boundaries in three dimensions. These planes don’t have to be flat, either. Plus, a truly useful classifier would have to deal with many different butterfly species.
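Stepping back to the two-feature case for a moment, the hand-drawn decision tree can be sketched as nested if-statements. This is only an illustration of the eyeballed boundaries quoted in the transcript (wingspan below 45 mm, mass below 0.75), not the code shown in the video; the function name `classify` and the sample measurements are hypothetical, and the unit of mass is left unspecified because the transcript does not state it.

```python
def classify(wingspan_mm: float, mass: float) -> str:
    """Classify one butterfly using the two eyeballed decision boundaries.

    A butterfly is guessed to be an Emperor only when BOTH tests pass;
    everything else falls into Luna territory.
    """
    if wingspan_mm < 45:      # first split: wingspan boundary
        if mass < 0.75:       # second split: mass boundary
            return "Emperor"
        return "Luna"
    return "Luna"


# Hypothetical unlabeled measurements, made up for illustration only.
for wingspan_mm, mass in [(40.0, 0.5), (40.0, 0.9), (82.0, 0.55)]:
    print(wingspan_mm, mass, "->", classify(wingspan_mm, mass))
```

A learning algorithm that builds decision trees would pick the features and the threshold values itself, rather than having them typed in by hand as they are here.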
Now I think you’d agree that this is getting too complicated to figure out by hand. But even this is a very basic example: only three features and five butterfly species. We can still show it in this 3D scatter plot. Unfortunately, there is no good way to visualize four features at once, or twenty features, let alone hundreds or even thousands of features. But that is what many real-world machine learning problems face. Can you imagine trying to figure out the equation for a boundary running through a decision space with thousands of dimensions? Probably not, but computers with clever machine learning algorithms can, and they do, all day long, on computers at places like Google, Facebook, Microsoft, and Amazon.
Techniques like decision trees and Support Vector Machines are firmly rooted in the field of statistics, which has dealt with making confident decisions using data since long before computers existed. There is a very large class of widely used statistical machine learning techniques, but there are also some approaches with no origins in statistics. Most notable are artificial neural networks, which were inspired by the neurons in our brains! For a primer on biological neurons, check out our three-part overview here, but basically, neurons are cells that process and transmit messages using electrical and chemical signals. They take one or more inputs from other cells, process those signals, and then emit their own signal. They form huge interconnected networks that are capable of processing complex information. Just like your brain watching this video.
Artificial neurons are very similar. Each takes a series of inputs, combines them, and emits a signal. Instead of electrical or chemical signals, artificial neurons take numbers in and send numbers out. They are organized into layers that are linked by connections, forming a network of neurons, hence the name. Let’s return to the butterfly example to see how neural networks can be used for classification.
Our first layer, the input layer, provides data from a single butterfly that needs to be classified. Once again, we’ll use mass and wingspan. At the other end, we have the output layer, with two neurons: one for the Emperor butterfly and another for the Luna butterfly. The most excited neuron will be our classification decision. In the middle, we have a hidden layer, which transforms the inputs into outputs and does the hard work of classification.
To see how this is done, let’s zoom in on one neuron in the hidden layer. The first thing a neuron does is multiply each of its inputs by a specific weight, let’s say 2.8 for its first input and 0.1 for its second input. Then, it sums these weighted inputs together, which in this case gives a grand total of 9.74. The neuron then applies a bias to this result. In other words, it adds or subtracts a fixed value, say, minus six, for a new value of 3.74. These bias and input weights are initially set to random values when the neural network is created. Then, an algorithm goes in and starts tweaking all those values to train the neural network, using labeled data for training and testing. This happens over many iterations, gradually improving accuracy, a process very much like human learning.
Finally, neurons have an activation function, also called a transfer function, that gets applied to the output, performing a final mathematical modification to the result. For example, it might limit the value to a range from negative one to positive one, or set any negative value to 0. We’ll use a linear transfer function that passes the value through unchanged, so 3.74 stays as 3.74. So for our example neuron, given the inputs 0.55 and 82, the output is 3.74. This is just one neuron, but this process of weighting, summing, biasing, and applying an activation function is computed for all the neurons in a layer, and the values propagate forward through the network, one layer at a time.
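The single-neuron walkthrough above can be reproduced with a few lines of code. This is a minimal sketch, not the implementation of any particular library: the function names (`neuron`, `linear`, `relu`, `clamp`) are illustrative, while the weights 2.8 and 0.1, the bias of minus six, and the inputs 0.55 and 82 are the values quoted in the transcript.

```python
def linear(x):
    """Linear transfer function: passes the value through unchanged."""
    return x


def relu(x):
    """Alternative activation: sets any negative value to 0."""
    return max(0.0, x)


def clamp(x):
    """Alternative activation: limits the value to the range [-1, 1]."""
    return max(-1.0, min(1.0, x))


def neuron(inputs, weights, bias, activation=linear):
    """Weight each input, sum the results, add the bias, then apply the activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum + bias)


# 2.8 * 0.55 + 0.1 * 82 = 9.74, and 9.74 - 6 = 3.74
output = neuron(inputs=[0.55, 82], weights=[2.8, 0.1], bias=-6.0)
print(f"{output:.2f}")  # prints 3.74
```

Swapping in `clamp` as the activation would turn the same 3.74 into 1.0, which shows how the choice of activation function changes what the neuron passes forward to the next layer.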
In this example, the output neuron with the highest value is our decision: the Luna butterfly. Importantly, the hidden layer doesn’t have to be just one layer. It can be many layers deep. This is where the term deep learning comes from. Training these more complicated networks takes a lot more computation and data. Although neural networks were invented more than fifty years ago, deep neural networks have only become practical recently, thanks to powerful processors and, even more so, fast graphics processing units. So, thank you, gamers, for your fierce demand for smoother frame rates! A few years ago, Google and Facebook demonstrated deep neural networks that could find faces in pictures as well as humans can, and humans are really good at this! It was a huge milestone. Today, deep neural networks are driving cars, translating human speech, diagnosing medical conditions, and much more.
These algorithms are very sophisticated, but it’s less clear whether they should be described as “intelligent.” They can really only do one thing, like classify butterflies, find faces, or translate languages. This type of AI is called Weak AI or Narrow AI. It is only intelligent at specific tasks. However, that does not mean it isn’t useful; I mean, medical devices that can make diagnoses and cars that can drive themselves are amazing! But do we need those computers to compose music and look up delicious recipes in their free time? Probably not. Although that would be kind of cool.
Truly general-purpose AI, as smart and well-rounded as a human, is called Strong AI. No one has demonstrated anything close to human-level artificial intelligence yet. Some argue it’s impossible, but many people point to the explosion of digitized knowledge, like Wikipedia articles, web pages, and YouTube videos, as the perfect kindling for Strong AI. While you can only watch a maximum of 24 hours of YouTube per day, a computer can watch millions of hours. For example, IBM’s Watson consults and synthesizes information from 200 million pages of content.
That includes the full text of Wikipedia. Although not a Strong AI, Watson is pretty smart, and it crushed its human competition at Jeopardy back in 2011. AIs can not only take in large amounts of information, but they can also learn over time, often much faster than humans. In 2016, Google debuted AlphaGo, a Narrow AI that plays the very complex board game Go. One of the ways it got so good, and was able to beat the best human players, was by playing clones of itself millions upon millions of times. It learned what worked and what did not, and along the way, it discovered successful strategies all by itself. This is called reinforcement learning, and it’s a very powerful approach. In fact, it is very similar to how humans learn. People don’t magically gain the ability to walk. It takes thousands of hours of trial and error to figure it out. Computers are now on the cusp of learning by trial and error, and for many narrow problems, reinforcement learning is already widely used. What will be interesting to see is whether these kinds of learning techniques can be applied more broadly, to create human-like Strong AIs that learn much the way children learn, but at greatly accelerated rates. If that happens, there are some very big changes in store for humanity, a subject we will revisit at a later time. Thanks for watching. See you next week.