Free Pascal Windows Profiler Usage
To profile an application with FPWProf, first run it and then select
it from the list on the left. Press Ctrl+R to refresh the list if
the application doesn't show up; you also need to refresh it again
if you exit the application and run it again. Then click the "Start"
button and let the profiler collect some samples. To stop collecting
samples, either press the "Stop" button or exit the application.
Once you do that, the profiler will read the debug information from
the executable and show a profile with the functions in the
executable sorted from the most sampled (where your program spends
the most time) to the least sampled (including functions that
weren't sampled at all), together with a timeline of the program's
overall CPU usage. If the "Only known" checkbox is checked, the
profile will only show functions known to the profiler through the
debug data. Otherwise it will also show samples that the profiler
could not attribute to a known function (e.g. code without any debug
info or code from other modules, like Windows DLLs). These are
grouped by the module they were found in (e.g. if your application
spends most of its time in the UI doing nothing, you may end up with
win32u.dll at the top).
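As a rough illustration of how such a profile is put together, here is
a minimal Python sketch. It is not FPWProf's actual code: the function
name build_profile, the (function, module) sample layout and the use of
the bare module name as the label for unattributed samples are all
assumptions made for the example.

```python
from collections import Counter


def build_profile(known_functions, samples, only_known=True):
    """Return (name, sample_count) pairs sorted from most to least sampled.

    samples is a list of (function_name_or_None, module_name) pairs; None
    means the address could not be matched to any known function.
    (Hypothetical data layout, chosen only for this illustration.)
    """
    # Start every known function at zero so unsampled ones are still listed.
    counts = Counter({name: 0 for name in known_functions})
    for function, module in samples:
        if function is not None:
            counts[function] += 1
        elif not only_known:
            # Unattributed samples are grouped under the module they hit.
            counts[module] += 1
    # Most sampled first; ties are broken by name to keep the order stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

With "Only known" unchecked (only_known=False), a program that mostly
idles in the UI would show its module entry, such as win32u.dll, above
its own functions.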
After the profile is finished and the results are listed, you
can right click in the results area and select Copy All
to copy them to the clipboard. This way you can save them in a
text editor for later comparisons.
The options you can set for profiling in the panel at the top
of the window are:
- Start/Stop buttons - Start/stop the sampling (Start
resets any profiled data)
- Capture Stack - Capture call stacks too, this can
be used to find the stack traces where a function appears (sorted
by sample count). This is very useful to find the exact path
that leads to a highly called function (e.g. if a function like
DistanceBetween is called 1000 times from a function like
ClosestPoint and just 50 times from a function like CheckActivator,
this will help distinguish between the two paths instead of
just showing 1050 calls)
- Closest - When creating the profile, if a function
without a known debug symbol is encountered, try to find the
closest known one in the callstack (needs Capture Stack)
- Start automatically for EXE - If checked, the profiler
will periodically refresh the process list and if a process with
the executable name entered in the box next to it appears, it
will be selected and the sampling will start (unless a shortcut
is set, see below)
- Max samples - If checked, stops the sampling automatically
once the specified number of samples has been collected
- Start/Stop shortcut - Specify one of the F1..F12 keys
to be used to start/stop the collection, can be useful for profiling
fullscreen applications like games (do not use the same key for
both). If "Start automatically for EXE" is checked,
the profiler will select the specified process as if it was clicked
but won't start the profiling until the Start shortcut is pressed.
- Threads - After the sampling is finished this can
be used to select which thread to profile
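To make the Capture Stack and Closest options more concrete, the
following Python sketch reuses the DistanceBetween example from the
list above. It only illustrates the idea and is not how FPWProf is
implemented: the Main function, the tuple layout of a stack (outermost
caller first) and all helper names are assumptions.

```python
from collections import Counter

# Hypothetical samples: each stack runs from the outermost caller to the
# function that was executing when the sample was taken.
SAMPLES = (
    [("Main", "ClosestPoint", "DistanceBetween")] * 1000
    + [("Main", "CheckActivator", "DistanceBetween")] * 50
)


def flat_counts(samples):
    """Without stacks: count only the function each sample landed in."""
    return Counter(stack[-1] for stack in samples)


def trace_counts(samples, function):
    """With stacks: count each distinct trace containing the function,
    sorted from the most to the least frequent trace."""
    return Counter(s for s in samples if function in s).most_common()


def closest_known(stack, known_functions):
    """Sketch of 'Closest': walk from the sampled frame towards the
    outermost caller and return the first function with a known symbol."""
    for name in reversed(stack):
        if name in known_functions:
            return name
    return None
```

Here flat_counts reports a single total of 1050 for DistanceBetween,
while trace_counts splits it into the 1000 samples that came through
ClosestPoint and the 50 that came through CheckActivator.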
Right clicking on any function in the results list will show
the following commands (most need Capture Stack):
- Show Call Traces (Anywhere) - Show the call traces
that contain the function starting from the collected samples.
The results are sorted by the number of times each trace is found
in the collected samples
- Show Call Traces (From Function) - Show the call traces
that contain the function after merging the traces so that they
start from the function itself (so, e.g. if a function is both
at the top of a trace in one sample and at the middle of another,
using this command will treat both as if they were the same trace
- this can be used to find the paths that lead up to that function
regardless of any calls the function makes later). Like above,
the results are sorted by the number of times each trace is found
in the collected samples (but ignoring any functions above the
selected function)
- Show Direct Callers/Callees - Show any known functions
that called the selected function as well as any known functions
that this function called itself, both sorted by the number of
times the callers/callees were found in the collected call traces
- Show in Timeline - Show where the function is found
during sampling in the timeline (the matching CPU usage bars
will be drawn in red)
- Copy Name - Copy the name of a known function to the
clipboard
- Copy All - Copy the entire results to the clipboard
- Find... - Search the results for the given string
The Show... commands will put the results in a secondary
list. Right clicking on any selected function inside those results
will also show the same Show... commands to drill further
down into the report.
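The difference between the Show... commands can be sketched in Python
as follows. This is an illustration under assumptions, not FPWProf's
implementation: stacks are tuples ordered from the outermost caller to
the sampled function, the helper names are made up, and for recursive
functions the sketch simply cuts at the innermost occurrence.

```python
from collections import Counter


def traces_from_function(samples, function):
    """Sketch of 'Show Call Traces (From Function)': drop everything the
    function called later so that only the path leading up to it remains,
    then count how often each path occurs."""
    merged = Counter()
    for stack in samples:
        if function in stack:
            # Index of the innermost occurrence of the function.
            last = len(stack) - 1 - stack[::-1].index(function)
            merged[stack[: last + 1]] += 1
    return merged.most_common()


def direct_callers_callees(samples, function):
    """Sketch of 'Show Direct Callers/Callees': count the frames found
    immediately before and immediately after the function."""
    callers, callees = Counter(), Counter()
    for stack in samples:
        for i, name in enumerate(stack):
            if name != function:
                continue
            if i > 0:
                callers[stack[i - 1]] += 1
            if i + 1 < len(stack):
                callees[stack[i + 1]] += 1
    return callers.most_common(), callees.most_common()
```

A sample that stops in ClosestPoint and another that continues into
DistanceBetween are counted as the same path by traces_from_function,
which matches the "regardless of any calls the function makes later"
behaviour described above.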
You can focus on only a part of the program's run time by left
clicking and dragging anywhere in the timeline to select a time
range. Once you do that, the report will be recreated with only
the samples inside the time range. You can also use the UFPWProfControl
unit from your program to place markers in the timeline.
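Conceptually, selecting a time range just rebuilds the report from a
subset of the samples. A minimal Python sketch of that idea is shown
here; the (timestamp, function) sample layout, timestamps in seconds
and the name profile_in_range are assumptions for illustration only.

```python
from collections import Counter


def profile_in_range(samples, start, end):
    """Recount samples whose timestamp lies within [start, end].

    samples is a list of (timestamp_seconds, function_name) pairs
    (hypothetical layout); the result is sorted from most to least
    sampled, like the main report.
    """
    selected = [function for t, function in samples if start <= t <= end]
    return Counter(selected).most_common()
```

Functions sampled only outside the selected range, such as startup or
shutdown code, drop out of the result entirely.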
DWARF vs STABS
The profiler supports two debug information formats: DWARF
version 2 and STABS. Free Pascal supports both of those
in addition to having beta support for DWARF version 3, however
the profiler only supports DWARF version 2. Of the two, STABS
is by far the simpler and faster to parse (at least for the
information that the profiler needs). However, it may not be as
precise as DWARF, and in addition it is not available in 64bit
applications (this is a limitation of the STABS format using 32bit
values for addresses - a proposal for extending this to 64bit
was made at some point around 2013 in the GDB mailing lists but
it went nowhere).
Depending on the version, Free Pascal may default to one of
these two, so it is better to explicitly specify what debug information
to use (Lazarus defaults to DWARF version 2): -gs will
generate STABS information while -gw or -gw2 will
generate DWARF version 2 information.
In general for 32bit applications STABS might be the better
choice, especially with older Free Pascal versions that may generate
invalid DWARF data. However if the results seem imprecise (you
get a lot of "somewhere in yourapplication.exe" samples)
you may want to use DWARF instead. For 64bit applications DWARF
is the only choice.
Troubleshooting
- If you get a DWARF scanning error - some older versions
(e.g. 2.2.x, perhaps others in the 2.x line) of Free Pascal may
generate invalid DWARF data. This causes the scanning to fail. To fix
this, either use a newer version of the compiler or use the -gs
parameter to generate STABS debug information instead.
- If you get no debug info with external symbols - FPWProf
uses the ExeInfo unit to read executable sections and the ReadDebugLink
function seems to not always work. Recompile the application
without external symbols and it should work fine.
- If you get a warning about some compilation units having
unsupported DWARF data - This means some modules (perhaps
external DLLs?) were compiled with DWARFv3 or above which isn't
supported by FPWProf. Make sure that all units compile with DWARFv2.
- If you get a lot of "Somewhere in yourapplication.exe"
entries - Similar to above, make sure everything is compiled
with DWARFv2 debug data. Another reason this may happen (especially
for other compilers) is that the executable does not use fixed
addresses - if so, rebuild it with fixed addresses for profiling.
Using with MinGW
FPWProf can also be used to profile applications written in
C with MinGW (either GCC or Clang should work). This will certainly
generate the "some compilation units have unsupported DWARF
data" warning since the system libraries are not compiled
with DWARFv2 and will be ignored. To ensure DWARFv2 data is used
compile your C application using the -gdwarf-2 parameter,
like:
$ gcc -O2 -gdwarf-2 myapp.c -omyapp
Note that I have done very little testing of this; it mainly
works because both Free Pascal and MinGW use the same debug info
format.