Random forest regression is another type of supervised machine learning technique, and it is very effective in situations where other machine learning algorithms prove computationally expensive. In this video I am going to introduce the random forest regression algorithm and explain it to you in detail. First I will explain what ensemble learning is, then I will cover the algorithmic approach behind the random forest regression algorithm, and later on I will explain the advantages and disadvantages of random forest regression. So watch this video till the end, as you are going to learn a lot of things from it.

[MUSIC]

Hello, folks, Newton here, and this is the AI University channel. If you are new here, then consider subscribing to this channel, and if you have already subscribed, then click on the bell icon to receive notifications about the hottest technologies of the 21st century.

Before we jump into random forest regression, let's first understand the concept of ensemble learning. So what is ensemble learning? Well, ensemble learning is a technique which combines predictions from multiple machine learning algorithms, or predictions from the same algorithm run multiple times, in order to make more accurate predictions. Predictions from a single individual model might not be that accurate, and hence we use ensemble learning techniques. A model which is composed of several machine learning models is called an ensemble learning model, so we can train several models, like decision trees, KNN, and support vector machines, on the same data to get combined predictions from them. Similarly, we can have several decision trees trained together to get a final combined prediction. Ensemble learning comes in two flavors.
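The averaging idea behind ensemble learning can be sketched in a few lines of Python. This is a minimal illustrative sketch, not a real library API: the two toy "models" below (a mean predictor and a 1-nearest-neighbour predictor, both hypothetical stand-ins for decision trees, KNN, SVM, and so on) each make a prediction, and the ensemble returns their average.

```python
def predict_mean(train_x, train_y, x):
    # Toy "model" 1: always predicts the mean of the training targets.
    return sum(train_y) / len(train_y)

def predict_nearest(train_x, train_y, x):
    # Toy "model" 2: 1-nearest-neighbour, predicts the y of the closest train_x.
    i = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_y[i]

def ensemble_predict(models, train_x, train_y, x):
    # The ensemble prediction is simply the average of the members' predictions.
    preds = [m(train_x, train_y, x) for m in models]
    return sum(preds) / len(preds)

train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [1.0, 2.0, 3.0, 4.0]
print(ensemble_predict([predict_mean, predict_nearest], train_x, train_y, 3.9))
# → 3.25  (mean model says 2.5, nearest-neighbour says 4.0, average is 3.25)
```

Neither toy model alone is a good regressor, but their average already moves the prediction in the right direction, which is exactly the intuition the video builds on.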
Number one is boosting and number two is bagging. In boosting, learners are trained sequentially: early learners fit simple models to the data, the data is then analyzed for errors, successive trees are fed random samples, and at every step the goal is to improve the accuracy over the prior tree. Bagging, on the other hand, creates several subsets of data from the training sample, chosen randomly with replacement, and each subset of data is used to train its own decision tree. The reason I highlighted these two techniques here is that random forest is a bagging technique, and you should know the difference between the two from an interview perspective.

Now, there are two drawbacks of decision trees due to which we may opt for the random forest technique. Number one: decision trees are computationally expensive. Number two: they are very sensitive to the data on which they are trained, resulting in deviations in predictions if the underlying data changes. Random forest caters to these problems. It is a supervised learning technique which makes use of ensemble learning to perform both regression and classification tasks. It constructs multiple decision trees at training time and combines the predictions from each decision tree to determine the final output, which can be a classification or a regression value. Please note that the trees in a random forest run in parallel.

So let's go ahead and see the approach the algorithm takes to build a random forest. Step number one: pick K data points at random from the training set. So here we are utilizing the whole data set and picking a subset of K data points from it. Step number two: build the decision tree associated with those K data points. So as part of this step, you are building a decision tree
with that subset of observations, the K data points. Step number three: choose the number N of trees you want to build, and repeat steps one and two. So as part of this step, you choose the number of trees you want to build, and then you repeat steps one and two to build lots and lots of regression decision trees. Step number four: for a new data point, make each one of your N trees predict the value of Y for that data point, and assign the new data point the average across all of the predicted Y values. As part of this step, you get multiple predictions from these decision trees. So basically, we take all these decision trees and get their predictions, that is, each one of them predicts the value of the target variable Y for the new data point for which we want a prediction, and the algorithm assigns the average of all the trees' predicted Y values to this new data point. So if the number of trees is set to a thousand, you will get an individual prediction of Y from each of the thousand trees, and then we take the average of the Y values from all thousand trees. This is the reason it is called a random forest: here we have a forest of multiple decision trees.

One more thing: since you are taking the average of many predictions here, the accuracy of the prediction improves, because the predictions of the less accurate individual decision trees get balanced out by the predictions of the better ones. These kinds of ensemble algorithms are more stable, because a change in the data set is less likely to impact the whole forest of decision trees. Please note that each decision tree draws a random sample from the original data set when generating its splits, adding a further element of randomness that helps prevent overfitting.

Now let's come to the advantages of random forest.
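The four steps above can be sketched in plain Python. This is a hedged toy sketch, not a production implementation: for simplicity each "tree" is a depth-1 regression stump (one split, each leaf predicting a mean), but the bagging loop (pick K points at random with replacement, fit a tree, repeat N times, average the predictions) is exactly the procedure described in the video. All function names here are made up for illustration.

```python
import random
from statistics import mean

def fit_stump(xs, ys):
    # Stand-in for a full decision tree: find the split threshold t that
    # minimizes squared error, with each leaf predicting its side's mean y.
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        ml, mr = mean(left), mean(right)
        err = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, ml, mr)
    if best is None:  # degenerate sample (all x identical): predict overall mean
        m = mean(ys)
        return lambda x, m=m: m
    _, t, ml, mr = best
    return lambda x, t=t, ml=ml, mr=mr: ml if x <= t else mr

def fit_forest(xs, ys, n_trees, k, seed=0):
    # Steps 1-3: for each of the N trees, pick K data points at random
    # with replacement (bagging) and build a tree on that subset.
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in range(k)]
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees

def forest_predict(trees, x):
    # Step 4: every tree predicts Y for the new point; the forest
    # returns the average of all those predictions.
    return mean(t(x) for t in trees)

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.9, 8.2]
forest = fit_forest(xs, ys, n_trees=50, k=len(xs))
prediction = forest_predict(forest, 6.5)
```

In practice you would reach for a library implementation such as scikit-learn's `RandomForestRegressor`, which grows full decision trees and also randomizes the features considered at each split, but the averaging logic is the same as in this sketch.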
Number one: it is an effective method for estimating missing data, and it maintains accuracy even when a large proportion of the data is missing. Number two: it runs efficiently on large data sets or databases. Now, the disadvantage associated with the random forest regression algorithm is that random forests have been observed to overfit on some data sets with noisy regression or classification tasks.

So, folks, this is it for this video. In the next video I will be covering the hands-on steps to build and train a random forest machine learning model using Python. So here is today's question: state true or false for the given statement, "Decision trees are computationally expensive." Please post your answers in the comments section given below so that I can get a chance to incorporate your feedback. You can also post your technical questions in the comments section, and I do answer them. If you are watching this video and you are not already a subscriber to our channel, consider clicking this little subscribe button. In case you have already subscribed, then click on the bell icon to receive notifications whenever I release a new video. So thanks for hanging out with me, you guys. I will be covering the next topic in the upcoming video. So keep on watching. Thank you.