Polly 20.0.0git
#include <PerfMonitor.h>
Public Member Functions | |
PerfMonitor (const Scop &S, llvm::Module *M) | |
Create a new performance monitor. | |
void | initialize () |
Initialize the performance monitor. | |
void | insertRegionStart (llvm::Instruction *InsertBefore) |
Mark the beginning of a timing region. | |
void | insertRegionEnd (llvm::Instruction *InsertBefore) |
Mark the end of a timing region. | |
Private Member Functions | |
llvm::Function * | insertInitFunction (llvm::Function *FinalReporting) |
void | addToGlobalConstructors (llvm::Function *Fn) |
Add Function to list of global constructors. | |
void | addGlobalVariables () |
Add global variables to module. | |
void | addScopCounter () |
Add per-scop tracking to module. | |
llvm::Function * | getRDTSCP () |
Get a reference to the intrinsic "{ i64, i32 } @llvm.x86.rdtscp()". | |
llvm::Function * | getAtExit () |
Get a reference to "int atexit(void (*function)(void))" function. | |
llvm::Function * | insertFinalReporting () |
Create function "__polly_perf_final_reporting". | |
void | AppendScopReporting () |
Append Scop reporting data to "__polly_perf_final_reporting". | |
Private Attributes | |
llvm::Module * | M |
PollyIRBuilder | Builder |
const Scop & | S |
bool | Supported |
Indicates if performance profiling is supported on this architecture. | |
llvm::Value * | CyclesTotalStartPtr |
The cycle counter at the beginning of the program execution. | |
llvm::Value * | CyclesInCurrentScopPtr |
The total number of cycles spent in the current scop S. | |
llvm::Value * | TripCountForCurrentScopPtr |
The total number of times the current scop S is executed. | |
llvm::Value * | CyclesInScopsPtr |
The total number of cycles spent within scops. | |
llvm::Value * | CyclesInScopStartPtr |
The value of the cycle counter at the beginning of the last scop. | |
llvm::Value * | AlreadyInitializedPtr |
A global variable that tracks whether the performance monitor initialization has already run. | |
Definition at line 16 of file PerfMonitor.h.
PerfMonitor::PerfMonitor | ( | const Scop & | S, |
llvm::Module * | M | ||
) |
Create a new performance monitor.
S | The scop for which to generate fine-grained performance monitoring information. |
M | The module for which to generate the performance monitor. |
Definition at line 65 of file PerfMonitor.cpp.
void PerfMonitor::addGlobalVariables | ( | ) |
private |
Add global variables to module.
Insert into the module a set of global variables that are used to track performance (or obtain references to them if they already exist).
Definition at line 103 of file PerfMonitor.cpp.
References AlreadyInitializedPtr, Builder, CyclesInScopsPtr, CyclesInScopStartPtr, CyclesTotalStartPtr, M, and TryRegisterGlobal().
Referenced by initialize().
void PerfMonitor::addScopCounter | ( | ) |
private |
Add per-scop tracking to module.
Insert the global variable that is used to track the number of cycles spent in this scop.
Definition at line 94 of file PerfMonitor.cpp.
References Builder, CyclesInCurrentScopPtr, GetScopUniqueVarname(), M, TripCountForCurrentScopPtr, and TryRegisterGlobal().
Referenced by initialize().
void PerfMonitor::addToGlobalConstructors | ( | llvm::Function * | Fn | ) |
private |
Add Function to list of global constructors.
If no global constructors are available in the current module, insert a new list of global constructors containing Fn as the only global constructor. Otherwise, append Fn to the list of global constructors.
All functions listed as global constructors are executed before the main() function is called.
Fn | Function to add to global constructors |
Definition at line 36 of file PerfMonitor.cpp.
References polly::Array, Builder, M, polly::Value, and X().
Referenced by initialize().
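The create-or-append behaviour described for this function can be modelled without LLVM. The following is a minimal sketch, not Polly's implementation: `Ctor`, `ModuleModel`, `addToGlobalConstructorsModel`, `ctorA`, and `ctorB` are hypothetical names, and the module's constructor list is represented by an optional vector instead of an IR-level array.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for a constructor function.
using Ctor = void (*)();

// Hypothetical stand-in for a module: the constructor list is absent
// until the first constructor is registered.
struct ModuleModel {
  std::optional<std::vector<Ctor>> GlobalCtors;
};

// If no list exists, create one with Fn as its only entry;
// otherwise append Fn to the existing list.
inline void addToGlobalConstructorsModel(ModuleModel &M, Ctor Fn) {
  assert(Fn != nullptr && "constructor must not be null");
  if (!M.GlobalCtors) {
    M.GlobalCtors = std::vector<Ctor>{Fn};
    return;
  }
  M.GlobalCtors->push_back(Fn);
}

// Two placeholder constructors used only for illustration.
inline void ctorA() {}
inline void ctorB() {}
```

Registering `ctorA` and then `ctorB` on an empty `ModuleModel` leaves a two-element list in registration order.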
void PerfMonitor::AppendScopReporting | ( | ) |
private |
Append Scop reporting data to "__polly_perf_final_reporting".
This function appends the information for the current scop (S) to the final printing function.
Definition at line 168 of file PerfMonitor.cpp.
References assert, Builder, polly::RuntimeDebugBuilder::createCPUPrinter(), CyclesInCurrentScopPtr, FinalStartBB, ReturnFromFinal, Supported, TripCountForCurrentScopPtr, and polly::Value.
Referenced by initialize().
llvm::Function * PerfMonitor::getAtExit | ( | ) |
private |
Get a reference to "int atexit(void (*function)(void))" function.
This function allows registering function pointers that must be executed when the program terminates.
Definition at line 22 of file PerfMonitor.cpp.
References Builder, Function, and M.
Referenced by insertInitFunction().
llvm::Function * PerfMonitor::getRDTSCP | ( | ) |
private |
Get a reference to the intrinsic "{ i64, i32 } @llvm.x86.rdtscp()".
The rdtscp function returns the current value of the processor's time-stamp counter as well as the current CPU identifier. On modern x86 systems, the returned value is independent of the dynamic clock frequency and consistent across multiple cores. It can consequently be used to get accurate and low-overhead timing information. Although the counter wraps, it can be used reliably even for measuring long time intervals: on a 1 GHz processor, a 64-bit counter wraps only about every 584 years.
The RDTSCP instruction is "pseudo" serializing:
“The RDTSCP instruction waits until all previous instructions have been executed before reading the counter. However, subsequent instructions may begin execution before the read operation is performed.”
To ensure that no later instructions are scheduled before the RDTSCP instruction, it is often recommended to schedule a cpuid call after the RDTSCP instruction. We do not do this yet, trading some timing precision for reduced overhead.
Definition at line 61 of file PerfMonitor.cpp.
Referenced by insertFinalReporting(), insertInitFunction(), insertRegionEnd(), and insertRegionStart().
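The wrap-around interval of the counter can be checked with a short calculation: a 64-bit counter holds 2^64 values, so at 1 GHz it wraps after 2^64 / 10^9 seconds. The helper below is illustrative only. `yearsUntilWrap` is a hypothetical name that is not part of Polly, and the sketch assumes a 64-bit counter that increments once per cycle and a 365.25-day year.

```cpp
#include <cassert>
#include <cmath>

// Years until a 64-bit cycle counter wraps at the given tick frequency.
inline double yearsUntilWrap(double FrequencyHz) {
  assert(FrequencyHz > 0.0 && "frequency must be positive");
  const double Cycles = std::ldexp(1.0, 64);                   // 2^64
  const double SecondsPerYear = 365.25 * 24.0 * 60.0 * 60.0;   // Julian year
  return Cycles / FrequencyHz / SecondsPerYear;
}
```

Under these assumptions, `yearsUntilWrap(1e9)` evaluates to roughly 584 years, and a faster clock shortens the interval proportionally.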
void PerfMonitor::initialize | ( | ) |
Initialize the performance monitor.
Ensure that all global variables, functions, and callbacks needed to manage the performance monitor are initialized and registered.
Definition at line 200 of file PerfMonitor.cpp.
References addGlobalVariables(), addScopCounter(), addToGlobalConstructors(), AppendScopReporting(), FinalReporting, Function, insertFinalReporting(), and insertInitFunction().
Referenced by generateCode().
llvm::Function * PerfMonitor::insertFinalReporting | ( | ) |
private |
Create function "__polly_perf_final_reporting".
This function finalizes the performance measurements and prints the results to stdout. It is expected to be registered with 'atexit()'.
Definition at line 123 of file PerfMonitor.cpp.
References Builder, polly::RuntimeDebugBuilder::createCPUPrinter(), CyclesInScopsPtr, CyclesTotalStartPtr, FinalReportingFunctionName, FinalStartBB, Function, getRDTSCP(), M, ReturnFromFinal, Supported, and polly::Value.
Referenced by initialize().
llvm::Function * PerfMonitor::insertInitFunction | ( | llvm::Function * | FinalReporting | ) |
private |
Definition at line 216 of file PerfMonitor.cpp.
References AlreadyInitializedPtr, Builder, CyclesTotalStartPtr, FinalReporting, Function, getAtExit(), getRDTSCP(), InitFunctionName, M, Supported, and polly::Value.
Referenced by initialize().
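The references listed for this function (AlreadyInitializedPtr, CyclesTotalStartPtr, getAtExit(), getRDTSCP()) suggest a one-time setup guarded by a flag. The sketch below models that idea in plain C++ and is an assumption about the behaviour, not the generated IR: `InitStateModel`, `finalReportingModel`, and `initModel` are hypothetical names, and the cycle counter is passed in as a parameter so the example stays deterministic.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical plain-data model of the state touched during initialization.
struct InitStateModel {
  bool AlreadyInitialized = false;
  std::uint64_t CyclesTotalStart = 0;
  int AtExitRegistrations = 0;
};

// Placeholder for the final reporting function registered with atexit.
inline void finalReportingModel() {}

// Run the one-time setup; any later call returns immediately.
inline void initModel(InitStateModel &St, std::uint64_t Now) {
  if (St.AlreadyInitialized)
    return;
  St.AlreadyInitialized = true;
  St.CyclesTotalStart = Now; // counter value at program start
  const int Rc = std::atexit(finalReportingModel);
  assert(Rc == 0 && "atexit registration failed");
  (void)Rc;
  ++St.AtExitRegistrations;
}
```

Calling `initModel` twice records the start value and registers the handler only on the first call.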
void PerfMonitor::insertRegionEnd | ( | llvm::Instruction * | InsertBefore | ) |
Mark the end of a timing region.
InsertBefore | The instruction before which the timing region ends. |
Definition at line 277 of file PerfMonitor.cpp.
References Builder, CyclesInCurrentScopPtr, CyclesInScopsPtr, CyclesInScopStartPtr, Function, getRDTSCP(), Supported, TripCountForCurrentScopPtr, and polly::Value.
Referenced by generateCode().
void PerfMonitor::insertRegionStart | ( | llvm::Instruction * | InsertBefore | ) |
Mark the beginning of a timing region.
InsertBefore | The instruction before which the timing region starts. |
Definition at line 266 of file PerfMonitor.cpp.
References Builder, CyclesInScopStartPtr, Function, getRDTSCP(), Supported, and polly::Value.
Referenced by generateCode().
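How the start and end markers could update the counters can be sketched with ordinary variables. This is a minimal model under stated assumptions, not the code Polly emits: the field names mirror the private attributes of this class, while `PerfCountersModel`, `regionStartModel`, and `regionEndModel` are hypothetical, and the exact update arithmetic is inferred from the attribute descriptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical plain-data model of the counters named in this class.
struct PerfCountersModel {
  std::uint64_t CyclesInScopStart = 0;       // counter value at last region start
  std::uint64_t CyclesInScops = 0;           // cycles spent within all scops
  std::uint64_t CyclesInCurrentScop = 0;     // cycles spent in this scop
  std::uint64_t TripCountForCurrentScop = 0; // executions of this scop
};

// Region start: remember the current counter value.
inline void regionStartModel(PerfCountersModel &C, std::uint64_t Now) {
  C.CyclesInScopStart = Now;
}

// Region end: add the elapsed cycles to both totals and count the execution.
inline void regionEndModel(PerfCountersModel &C, std::uint64_t Now) {
  assert(Now >= C.CyclesInScopStart && "counter went backwards");
  const std::uint64_t Elapsed = Now - C.CyclesInScopStart;
  C.CyclesInScops += Elapsed;
  C.CyclesInCurrentScop += Elapsed;
  ++C.TripCountForCurrentScop;
}
```

Passing the counter value as `Now` instead of reading the hardware counter keeps the sketch portable; in the real class the value comes from the rdtscp intrinsic.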
llvm::Value * PerfMonitor::AlreadyInitializedPtr |
private |
A global variable that tracks whether the performance monitor initialization has already run.
Definition at line 68 of file PerfMonitor.h.
Referenced by addGlobalVariables(), and insertInitFunction().
PollyIRBuilder PerfMonitor::Builder |
private |
Definition at line 43 of file PerfMonitor.h.
Referenced by addGlobalVariables(), addScopCounter(), addToGlobalConstructors(), AppendScopReporting(), getAtExit(), insertFinalReporting(), insertInitFunction(), insertRegionEnd(), and insertRegionStart().
llvm::Value * PerfMonitor::CyclesInCurrentScopPtr |
private |
The total number of cycles spent in the current scop S.
Definition at line 55 of file PerfMonitor.h.
Referenced by addScopCounter(), AppendScopReporting(), and insertRegionEnd().
llvm::Value * PerfMonitor::CyclesInScopsPtr |
private |
The total number of cycles spent within scops.
Definition at line 61 of file PerfMonitor.h.
Referenced by addGlobalVariables(), insertFinalReporting(), and insertRegionEnd().
llvm::Value * PerfMonitor::CyclesInScopStartPtr |
private |
The value of the cycle counter at the beginning of the last scop.
Definition at line 64 of file PerfMonitor.h.
Referenced by addGlobalVariables(), insertRegionEnd(), and insertRegionStart().
llvm::Value * PerfMonitor::CyclesTotalStartPtr |
private |
The cycle counter at the beginning of the program execution.
Definition at line 52 of file PerfMonitor.h.
Referenced by addGlobalVariables(), insertFinalReporting(), and insertInitFunction().
llvm::Module * PerfMonitor::M |
private |
Definition at line 42 of file PerfMonitor.h.
Referenced by addGlobalVariables(), addScopCounter(), addToGlobalConstructors(), getAtExit(), getRDTSCP(), insertFinalReporting(), insertInitFunction(), and PerfMonitor().
const Scop & PerfMonitor::S |
private |
Definition at line 46 of file PerfMonitor.h.
bool PerfMonitor::Supported |
private |
Indicates if performance profiling is supported on this architecture.
Definition at line 49 of file PerfMonitor.h.
Referenced by AppendScopReporting(), insertFinalReporting(), insertInitFunction(), insertRegionEnd(), insertRegionStart(), and PerfMonitor().
llvm::Value * PerfMonitor::TripCountForCurrentScopPtr |
private |
The total number of times the current scop S is executed.
Definition at line 58 of file PerfMonitor.h.
Referenced by addScopCounter(), AppendScopReporting(), and insertRegionEnd().