Transcript:

Today I will show you how to implement a parameter optimization loop in KNIME Analytics Platform. The goal is to find the best set of parameters for your prediction model. Like all other loops, the parameter optimization loop follows the classic KNIME loop motif: a loop body located between a loop start node and a loop end node. For parameter optimization, we use the Parameter Optimization Loop Start node and the Parameter Optimization Loop End node. These two nodes surround a loop body in which we train a prediction model. The Parameter Optimization Loop Start node loops through a list of parameter sets; at each iteration, a new set of parameters is applied to the learner node. The Parameter Optimization Loop End node collects the result of each iteration and compares it with the previous ones. Let’s start with this prepared workflow, where we read the data set, split it into a training set and a test set using the Partitioning node, train a classification model with the Decision Tree Learner node, apply the model to the test set using the Decision Tree Predictor node, and use a Scorer node to calculate the accuracy and evaluate the performance of the model. Our goal is to find the set of parameters for the decision tree learning algorithm that leads to the highest accuracy on the test set. So let’s have a quick look at the configuration window of the Decision Tree Learner node to find out which settings can be optimized. As we can see here, one setting is the minimum number of records per node. Let’s start with an easy example and try to find the value of the setting minimum number of records per node that leads to the highest accuracy. So let’s start building the loop. As I said before, we need the Parameter Optimization Loop Start and Loop End nodes. We look for those nodes in the node repository and first drag and drop the start node into the workflow editor.
The node has only one output port, of type flow variable. Let’s open the configuration window to see what settings we have. We can add a parameter by clicking the button “Add new parameter”. Then we can define the parameter name as well as the start value, stop value, and step size. In our case, we want to test all values between 2 and 15, so we create a new parameter with the name “min number of records”, 2 as the start value, 15 as the stop value, and 1 as the step size. If you wish to optimize not just one but many parameters, you can add as many parameters as necessary through the “Add new parameter” button. Now we can choose between two search strategies, brute force or hill climbing. The brute-force strategy checks all possible parameter combinations and returns the best one, which means for our example that we will train a model for each value between 2 and 15. The hill-climbing strategy, on the other hand, needs less computational effort, as it doesn’t check all possible parameter value combinations. Instead, it starts with a random set and then proceeds with only the direct neighbor values, according to the given intervals and step sizes. The best value combination among all neighbors, according to the objective function, is the starting point for the next iteration. If no neighbor improves the accuracy, the loop terminates. If we now execute the node and have a look at the output, we see that we get one flow variable with our defined parameter name and the start value 2. We now have to override the original setting in the Decision Tree Learner node with the parameter value produced by the Parameter Optimization Loop Start node. To do that, we connect the flow variable output port of the Parameter Optimization Loop Start node with the flow variable input port of the Decision Tree Learner node. In this way, we have injected the flow variable from the Parameter Optimization Loop Start node into the Decision Tree Learner node.
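
The hill-climbing strategy described above can be illustrated outside the platform with a minimal Python sketch. It is not the node’s actual implementation: the function names `hill_climb` and `toy_accuracy` are hypothetical, and the objective is a made-up stand-in for the accuracy a trained model would report. The sketch starts at a random grid value, evaluates only the direct neighbors one step away, moves to the best neighbor if it improves the score, and stops otherwise.

```python
import random


def toy_accuracy(min_records):
    # Hypothetical stand-in for "train the model and score it":
    # a made-up curve with a single peak at min_records == 6.
    return 0.90 - 0.01 * abs(min_records - 6)


def hill_climb(objective, start, stop, step, seed=None):
    """Maximize `objective` over the grid start, start+step, ..., stop."""
    rng = random.Random(seed)
    grid = list(range(start, stop + 1, step))
    idx = rng.randrange(len(grid))  # random starting point
    evaluated = {grid[idx]: objective(grid[idx])}
    while True:
        # Direct neighbors: one step down and one step up, if inside the interval.
        neighbors = [i for i in (idx - 1, idx + 1) if 0 <= i < len(grid)]
        for i in neighbors:
            if grid[i] not in evaluated:
                evaluated[grid[i]] = objective(grid[i])
        if not neighbors:
            break
        best = max(neighbors, key=lambda i: evaluated[grid[i]])
        if evaluated[grid[best]] <= evaluated[grid[idx]]:
            break  # no neighbor improves the score, so the search terminates
        idx = best
    return grid[idx], evaluated[grid[idx]], len(evaluated)


best_value, best_score, n_evals = hill_climb(toy_accuracy, 2, 15, 1, seed=0)
print(best_value, n_evals)
```

On a single-peaked objective like this toy one, the search reaches the same value a brute-force scan would find while evaluating at most as many grid points. On an objective with several peaks it can stop at a local optimum, which is the price of the reduced computational effort.
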
Next we open the configuration window of the Decision Tree Learner node and go to the Flow Variables tab. Here we find a setting for the minimum number of records. We then overwrite its value with the value of the selected flow variable. In this case, we select the flow variable named “min number of records” for the setting “min number records”. Now we press OK to save our configuration. Next we drag and drop the Parameter Optimization Loop End node into the workflow editor. A look into the configuration window of this node shows that here we can set the flow variable for the objective function value and whether we want to maximize or minimize it. That is, we need to set here the flow variable containing the accuracy value. This we get as output of the Scorer node, so let’s connect the flow variable output port of the Scorer node with the flow variable input port of the Parameter Optimization Loop End node, open the configuration window again, and choose accuracy as the flow variable containing the objective function value. Next, we choose to maximize the function and press OK. If we now execute the Parameter Optimization Loop End node, the whole loop is executed. As we have chosen the brute-force strategy, the model is trained for each parameter value between 2 and 15. Once the loop has been executed, we can have a look at the resulting tables. The table at the first output port gives us only the parameter set that corresponds to the highest accuracy. In the second output table, we get the accuracy for all tested parameters.
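
The brute-force run and its two output tables can be mirrored in a short Python sketch. This is only an illustration under stated assumptions, not how the platform works internally: `parameter_grid`, `optimize`, and `toy_accuracy` are hypothetical names, and the objective is a made-up stand-in for training the decision tree and reading the accuracy from the scorer.

```python
def parameter_grid(start, stop, step):
    # Plays the role of the loop start: one parameter value per iteration.
    value = start
    while value <= stop:
        yield value
        value += step


def toy_accuracy(min_records):
    # Hypothetical stand-in for the loop body (train, predict, score).
    return 0.90 - 0.01 * abs(min_records - 6)


def optimize(objective, start, stop, step, maximize=True):
    # Plays the role of the loop end: collect every iteration's result,
    # then pick the best one according to the chosen direction.
    all_results = [(v, objective(v)) for v in parameter_grid(start, stop, step)]
    pick = max if maximize else min
    best = pick(all_results, key=lambda row: row[1])
    return best, all_results


best, all_results = optimize(toy_accuracy, 2, 15, 1)
print("best parameter set:", best)
for value, accuracy in all_results:
    print(value, round(accuracy, 2))
```

Here `best` corresponds to the first output table (the single best parameter set) and `all_results` to the second (the objective value for every tested parameter).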