Transcript:

Many of you know, I come from a bit of a life sciences background, so I ran into this idea called genetic algorithms and I’d like to share what they’re about with you. So let’s dive right into it. What is a genetic algorithm? It sounds like a pretty complicated thing, but it’s really not so bad and hopefully by the end of this presentation. I’ll have demystified it for you guys a little bit, and maybe you’ll even feel confident enough to try rolling your own. But the essence of it is that is an optimization strategy that mimics natural selection so basically, our algorithms will be generating solutions to problems and kind of apply this principle of survival of the fittest to them and so. How do we make one? We need a few things first, namely a problem to solve, depending on the kind of problem. We’ll have to tailor our algorithm a little bit. But the overall steps are usually about the same. We’re also going to need a Fitness function and what I mean by that is that we need a way to grade solutions, right. What makes a good solution? What makes a bad solution for our problem? And then finally, we need a way to represent these as DNA. This probably seems a bit vague to you, guys. And it’s probably unclear how this goes into code, so we’re going to walk through an example. I think most of us have at least heard of the traveling salesperson problem. Essentially, the gist is given a set of cities. What is the shortest path that connects all of them with the condition that your first city is also your last city, right, so it’s a round trip, and I guess it’s kind of appropriate that I’m doing this on our week. One of react, oh, and talking about algorithmic complexity, so you might imagine a naive solution to this would be to take all of our cities and actually generate every single possible permutation of routes between them, and that takes a really long time. It takes factorial time, which, as we know with inputs, even as small as twenty or thirty cities begins to take an unmanageable. Inc. Long amount of time. So we’re going to see if we could do a little bit better with our genetic algorithm and so first, we’re going to represent our routes as DNA with an ordered collection of cities in my implementation, I ended up using an array and basically the zero index of the first city, the one index of the second city and so on and so forth and then also again pretty straightforward. What our Fitness function is right, the shorter the path, the better. Basically, you just go through a route city to city sum up all the distances and then kind of rank them, so let’s go a bit more at the detail. Step one for our algorithm. We’ll have to generate a feed population, kind of like a generation zero and so say here we have cities one through five, and we want to find the shortest route between them here. I’ve just kind of randomly shuffled the cities three times totally random and then place them in our starter generation. Then I’m going to rank the fitness of each of these reps and so again, the shorter the distance, the higher, the fitness and then after that we’re going to randomly select two of these routes. I say randomly, and there’s definitely an element of randomness to it. But usually the selection algorithm actually is biased towards the fitter one, so the fitter you are the higher chance you have of actually getting picked and then so this algorithm is modeled after biology and so once we’ve our algorithm kind of acts as matchmaker through these two routes. Maybe they go on a few dates and such and they really hit it off and we hit step three mate. So here we have our two parent routes, Route 1 and Route 2 and here’s we’re going to make a child first. We take a random segment of random lengths from one of the parents. Transplant it directly into the child notice how the index is the same as it was in the parent and then from our second parent we’re going to fill in the rest of the genes, so we start at CD 3 the index after the end of the segment fill that in City 5 is already in here, so we’re going to skip over it. City one also is already in here city to City 4 and then damn, we already have our first child for our algorithm, and then after that we’re also going to apply a mutation step, so essentially, we’re going to iterate across this collection of cities and using a predetermined probability we’re going to decide if a city is going to mutate or not In this case, Let’s say City one has been selected to mutate and because for our traveling salesman problem. We have this constraint where Oliver out should have the same cities and city should only appear once the mutation step is just a simple swap, right so here we swap City 1 and City 2 right, and that’s about it to create a child. We place it in the next generation and rinse and repeat, and so we have a new generation that was the same size as the old one and hopefully if we’ve constructed our algorithm right over time we’re hoping that our algorithm will generate fitter and fitter generation. So let’s take a look at what this looks like in practice, So I’ve coded up a quick demo. Please excuse the awful front-end anyway. I’m just going to get this going and let’s orient ourselves so over here on the left. I am displaying the fittest individual of the current generation and then up here we are displaying the fittest individual that we have generated so far. Oh, and actually. I forgot to mention so here. We’re only displaying one, but this is one of 50 others, and so we kind of have like 49 unsung heroes here who are helping to contribute their genes to the next generation and here. This is just a really coarse graph of the fitness of these individuals, so we get a feel for how things change over time, right, so, yeah, the steps we just went over in the slide. That’s one generation, all right, so we just do this over and over and over again Until we arrive at the solution, we’re pleased with, so now take a look at these parameters down here. We have population size, which is basically just the size of each generation. We also have the crossover chance what I mean by this is, we don’t always cross over too children every now and then will actually just take a parent and clone it down directly kind of to have a stronger way of preserving the genes, and then we also have our mutation chance which kind of self-explanatory again. It’s a chance that cities will swap within a route and so let’s see how each of these kind of affect our algorithm, so I’m going to start with a pretty small population of 15 I’m going to get rid of mutation for now. It kind of gets in the way of the demo because it was all probability based. But, uh, anyway, we’ll see that if we try to run this. I should also do this stuff, but if we try to run this, it actually gets stuck here and I’ll try and run this over and over again but we’ll see our algorithm continuously get stuck on pretty poor solution, and basically, the problem here is that we have a low genetic diversity because our population of solutions is so small over time, they really start to just look more and more like each other, and they’re kind of unable to break out of that local optimal solution, so we can try to mitigate that by increasing our population size. And, oh, we actually have a pretty decent solver. Now this won’t always get the best solution, and sometimes it’ll take a while, but it’s a great deal better than having only 15 routes in our population, so diversity is probably a good thing who knew, and then from here, let’s take a look at. I’m actually going to skip over crossover for the sake of time, but let’s take a look at how mutation plays into this, so let’s go back to our small population of 15 right, we’re getting stuck before. And so if I introduce a little bit of mutation about 5% we can see that we actually don’t arrive at this problem anymore of getting stuck. We don’t exactly converge on a great solution very quickly either. But mutation really helps us to explore more of the solution space and kind of try to are continuing to try to find a good solution and then we’ll see how this affects a large population size as well, namely, it’ll make conversion to a solution pretty difficult just because we’re adding, we’re basically adding statistical noise with this mutation champ, and so this solution won’t converge quite as quickly, but it probably has a good chance of arriving at a good solution relatively quickly and you might even find the best solution given enough time, right and so back to the slide for a little bit what? I’ve kind of demonstrated just now is referred to as a local optimal problem. There’s trade-off between exploitation and exploration, and that’s not really a trade-off unique to this kind of algorithm either. It’s a trade-off that exists for really anything, including human beings or other living organisms that can learn or adapt to solve their problems. The the essence of it is do. I keep doing what I’m doing. Maybe with a few small changes and try to get really good at that or do. I try something new do. I try and yeah, explore my options. Maybe try a new paradigm and so earlier when we were getting stuck on our solution, we were essentially hitting local Optima in the global solution space and the hope is that by introducing decent crossover and mutation We can not get stuck. Maybe even climbed down or jump across these little local. Optima in our solution space. And so a final note. As you saw already. We don’t always hit the optimal solution This way, but yeah. This is a heuristic approach, but we do usually get a pretty good one pretty quickly, right, so remember? These are time complexities for or how long operations take at given time complexities and with enough cities. This kind of operation could take years and years with a brute force solution. But I’m going to show you right here. I took a screenshot of my algorithm trying to solve a 30 random city problem, And if you look right here in the best solution rated, it’s definitely not the best solution possible, but it’s really not bad. It’s actually probably quite close to what the optimum would be and rather than taking years and years to get to here. This took about 30 seconds on my several year-old neckla care and so hopefully. I’ve helped demonstrate what genetic algorithms are. And yeah, actually, so. I did skip over a bunch of implementation details and I’d be happy to like slack out code later. Something so you could kind of see how you might want to implement one of these in actual just plain Javascript, but yeah, that’s it. Thanks for listening [Applause].