Transcript:
Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Today we are going to look at a paper from three years ago, and not just any kind of paper, but my kind of paper, which lies at the intersection of machine learning, computer graphics, and physics simulations. This work zooms in on reproducing reference motions, but with a twist, and adds lots of amazing additional features.

So what does all this mean? You see, we are given this virtual character, a reference motion that we wish to teach it, and, additionally, a task that needs to be done. When the reference motion is specified, we place our AI into a physics simulation, where it tries to reproduce these motions. That is a good thing, because if it tried to learn to run all by itself, it would look something like this. And if we ask it to mimic the reference motion… oh yes, much better.

Now that we have built up confidence in this technique, let's think bigger and perform a backflip. Uh-oh. Well, that didn't quite work. Why is that? We just established that we can give it a reference motion and it can learn it by itself. Well, this chap failed to learn a backflip because it explored many motions during training, most of which resulted in failure, so it didn't find a good solution and settled for a mediocre one instead. A proposed technique by the name of Reference State Initialization, RSI in short, remedies this issue by letting the agent explore better during the training phase.

Got it, so we add this RSI, and now all is well, right? Let's see. Ouch. Not so much. It falls to the ground and tries to continue the motion from there. A+ for effort, little AI, but unfortunately, that's not what we are looking for. So what is the issue here? The issue is that the agent has hit the ground, and after that, it still tries to score some additional points by continuing to mimic the reference motion. Again, A+ for effort, but this should not give the agent additional scores. The remedy is to end the training episode as soon as the agent falls, a method called early termination. Let's try it. Now we add early termination and RSI together, and let's see if this will do the trick. And… yes! Finally, with these two additions, it can now perform that sweet, sweet backflip, rolls, and much, much more, with flying colors.
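To make the mimicking part a bit more concrete, here is a minimal sketch of what such an imitation reward can look like, in the spirit of this paper: at every time step, the agent is rewarded for matching the reference pose and velocity. The weights and error scales below are illustrative stand-ins, and the paper's full reward also includes further terms, such as end-effector and center-of-mass matching.

    import numpy as np

    def imitation_reward(q_sim, v_sim, q_ref, v_ref, w_pose=0.65, w_vel=0.1):
        # q_*: joint angles, v_*: joint velocities (NumPy arrays)
        pose_err = np.sum((q_ref - q_sim) ** 2)
        vel_err = np.sum((v_ref - v_sim) ** 2)
        # Exponentiated negative errors keep each term in (0, 1]:
        # a perfect match scores 1, large deviations approach 0
        r_pose = np.exp(-2.0 * pose_err)
        r_vel = np.exp(-0.1 * vel_err)
        return w_pose * r_pose + w_vel * r_vel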
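And here is a minimal sketch of how RSI and early termination change a single training episode. The environment and policy interfaces below (env.reset taking an initial state, env.step returning a fallen flag) are hypothetical stand-ins, not the paper's actual code. RSI starts each episode at a random frame of the reference motion, so the later, harder parts of the motion also get explored, and early termination cuts the episode short the moment the character falls, so no further reward can be collected from the ground.

    import numpy as np

    def run_episode(env, policy, reference, max_steps=300, use_rsi=True):
        # RSI: start from a random frame of the reference motion,
        # instead of always starting from frame 0
        t0 = np.random.randint(len(reference)) if use_rsi else 0
        state = env.reset(initial_state=reference[t0])
        total_reward = 0.0
        for t in range(t0, min(t0 + max_steps, len(reference))):
            action = policy(state)
            state, reward, fallen = env.step(action)
            # Early termination: a fall ends the episode immediately,
            # so mimicking the motion from the ground scores nothing
            if fallen:
                break
            total_reward += reward
        return total_reward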
So now the agent has the basics down and can even perform explosive, dynamic motions as well. So, it is time. Hold on to your papers, as now comes the coolest part: we can perform different kinds of retargeting as well. What is that? Well, one kind is retargeting the environment. This means that we can teach the AI a landing motion in an idealized case and then ask it to perform the same, but now off of a tall ledge. Or we can teach it to run and then drop it into a computer game level and see if it performs well. And it really does. Amazing! This part is very important, because in any reasonable industry use, these characters have to perform in a variety of environments that are different from the training environment.

Two is retargeting not the environment, but the body type. We can have different types of characters learn the same motions. This is pretty nice for the Atlas robot, which has a drastically different weight distribution, and you can also see that the technique is robust against perturbations. Yes, this means one of the favorite pastimes of a computer graphics researcher, which is throwing boxes at virtual characters and seeing how well they can take it. Might as well make use of the fact that in a simulated world, we make up all the rules. This one is doing really well… oh!

Note that the Atlas robot is indeed different from the previous model, and these motions can be retargeted to it. However, this is also a humanoid. Can we ask for non-humanoids as well, perhaps? Oh yes. This technique supports retargeting to T-Rexes, dragons, lions, you name it. It can even get used to the gravity of different virtual planets that we dream up. Bravo! So the value proposition of this paper is just completely out of this world: Reference State Initialization, early termination, retargeting to different body types and environments. Oh my! To have digital applications like computer games use this would already be amazing, and just imagine what we could do if we could deploy these to real-world robots.

And don't forget, these research works just keep on improving every year. The First Law of Papers says that research is a process: do not look at where we are, look at where we will be two more papers down the line. Now, fortunately, we can do that right now. Why is that? It is because this paper is from 2018, which means that follow-up papers already exist. What's more, we even discussed one that teaches these agents to not only reproduce these reference motions, but to do so with style. And style there meant that the agent is allowed to make creative deviations from the reference motion, thus developing its own way of doing it. An amazing improvement, and I wonder what researchers will come up with in the near future. If you have some ideas, let me know in the comments below. What a time to be alive! Thanks for watching and for your generous support, and I'll see you next time.