Transcript:
Hi, there! Serdar Yegulalp for InfoWorld at IDG. Today I'm going to demonstrate Python's cProfile module, used to generate runtime performance profiles of Python programs. Previously we talked about the timeit function and how that can be used to time the performance of small snippets of code, like individual functions. cProfile, also included with Python, profiles the behavior of applications in whole or in part. cProfile runs an existing Python program and generates an analysis of all the function calls made in the program: how many of each one, how long they take, and whether they are calls made to native Python functions or calls to other functions in the application. The results are then saved to a binary file, which you can then use to generate a human-readable analysis. For this example, I'll analyze the runtime behavior of a program I wrote for an earlier video in this series, a version of Conway's Game of Life. I wrote two versions of that program, a fast one and a slow one. I'll start by analyzing the slow one and then show how the analysis gained from that could be used to generate the fast one. Working with cProfile is a three-stage process. First, you take the application you want to analyze and set it up to run with cProfile. Then you take the profile generated by the app and produce a human-readable report from it. Finally, you use the report to figure out where your application's bottlenecks are and make any improvements. For the first step, you can use cProfile from the command line, but it's often more flexible to run it from within the program you want to analyze. In this example, I'm just wrapping the program's main function, its entry point, with cProfile. cProfile.run will run the function supplied to it and then generate its trace information based on everything that happens in that function. Since that function is our program's entry point, we're in effect tracing the whole program's behavior.
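To make the first step concrete, here is a minimal sketch of wrapping an entry point with cProfile. The main function below is a hypothetical stand-in, not the Game of Life program from the video, and the file name output.dat is an assumption. The sketch uses a cProfile.Profile object with runcall and dump_stats; at the top level of a script, cProfile.run("main()", "output.dat") should do the same job in a single call.

```python
import cProfile
import random


def main():
    # Hypothetical stand-in for the program's real entry point.
    cells = [random.random() for _ in range(50_000)]
    return sum(1 for c in cells if c > 0.5)


# Trace everything that happens inside main(), then save the raw
# binary profile data to a file (the file name is an assumption).
profiler = cProfile.Profile()
result = profiler.runcall(main)
profiler.dump_stats("output.dat")
```

The saved file is binary and not meant to be read directly; it is the input for the reporting step.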
The results of the trace are written to a file named output.dat. The next lines we see here use the pstats library, which goes hand in hand with cProfile. pstats is used to take the results of a cProfile run, saved here to output.dat, and generate human-readable reports from them. We're creating two reports with pstats, the first being the total amount of time each function takes and the second being the total number of calls made per function. Now, let's run the program for a few seconds and generate results. As you can see, we now have the report files in our program directory. The first, the output calls text file, lists which functions were called the most. Let's take a moment to look at the format of the output. The ncalls column is the total number of calls for the function in question. The next two, tottime and percall, indicate the total time spent inside the function itself, not counting the other functions it calls, and the average amount of time spent for each call. The next two, cumtime and percall, indicate the total time spent inside the function and all other calls made by that function, and the average time per call for that as well. So what we want to pay the most attention to in this report is the functions that are not only called most often, but that spend the most time internally. Those are most worth optimizing. If there are function calls we can minimize or optimize out entirely, that'll speed things up, especially if those calls are being made in tight loops. The topmost item in the list, shown in curly braces, is a Python built-in method, appending to a list. That shouldn't surprise us, since appending to a list is something a lot of Python programs do quite often. Below that, though, are calls to the random number generator functions in Python's standard library. We generate a lot of random numbers in this program, so we make a lot of those calls. However, keep in mind that total numbers of calls without context can be misleading.
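The reporting step described above can be sketched roughly as follows. This is an illustration, not the code shown in the video: the main function is a hypothetical workload, and the file names output.dat, output_time.txt, and output_calls.txt are assumptions based on how the files are named aloud. The block creates its own profile first so that it can run on its own.

```python
import cProfile
import pstats
import random


def main():
    # Hypothetical workload: many cheap calls to append() and random().
    values = []
    for _ in range(20_000):
        values.append(random.random())
    return len(values)


# Produce the binary profile first so this sketch is self-contained.
profiler = cProfile.Profile()
profiler.runcall(main)
profiler.dump_stats("output.dat")

# Write two text reports from the same profile: one sorted by time spent
# inside each function ("tottime"), one sorted by call count ("calls").
for sort_key, report_name in (("tottime", "output_time.txt"),
                              ("calls", "output_calls.txt")):
    with open(report_name, "w") as report:
        stats = pstats.Stats("output.dat", stream=report)
        stats.sort_stats(sort_key).print_stats()
```

Each report carries the ncalls, tottime, percall, cumtime, and percall columns; only the sort order differs between the two files.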
In this program, all the calls to the random number functions are performed at program startup and at no time after that, so optimizing random number generation won't speed up the main body of the program at all. I'll return to this issue later and show you how to avoid gathering misleading statistics. Now, let's look at the second report, the output time text file. It has the same multi-column format and the same values in each column, but here it's sorted by total time spent in the function. That's the most useful statistic to have for this particular program, because we can now zoom in on which functions are the slowest overall. Here, it's the generation and render functions. Their per-call times are agonizingly slow, so those would be the best places to start optimizing. I mentioned before how the total number of calls can be misleading. In this program, we have a lot of calls to the random number generators at startup, which means our total calls analysis is thrown off. What I'm going to do is run a number-of-calls analysis on a slightly different version of the program. Here, I'm taking the startup part of the program and running it before we start gathering our statistics. This is one of the ways cProfile is powerful: you can use it to gather statistics on the whole program or only on one specific aspect of it. Now, when I run the program and generate a report, the number-of-calls report is not cluttered with irrelevant details. We can see here that the total number of calls is actually pretty low, so it isn't the number of calls that is the issue; it's that we have a few calls that each take a long time. Now, let's have a look at the statistics generated by the fast version of the same program. The slow calls have been replaced with rewritten equivalents in Cython, which turns Python into C for accelerated performance. I did another video about this earlier, and I've linked that below.
The generation and render functions are still there in the report, but they have braces that indicate they're external calls rather than Python native, and the total time per call is now so low it barely even registers in the percall column. All of the other functions above them also have very small per-call times, so it's safe to assume this program is as optimized as it gets for now. If you liked this video, be sure to leave a comment below. Don't forget to subscribe to our YouTube channel, and for more tech tips, be sure to follow us on Facebook, YouTube, and InfoWorld.com.