Polly 20.0.0git
The ParallelLoopGenerator creates parallelized loops.
#include <LoopGenerators.h>
Public Member Functions
ParallelLoopGenerator (PollyIRBuilder &Builder, const DataLayout &DL)
    Create a parallel loop generator for the current function.
virtual ~ParallelLoopGenerator ()
Value * createParallelLoop (Value *LB, Value *UB, Value *Stride, SetVector< Value * > &Values, ValueMapT &VMap, BasicBlock::iterator *LoopBody)
    Create a parallel loop.
DominatorTree * getCalleeDominatorTree () const
    Returns the DominatorTree for the generated subfunction.
LoopInfo * getCalleeLoopInfo () const
    Returns the LoopInfo for the generated subfunction.
AllocaInst * storeValuesIntoStruct (SetVector< Value * > &Values)
    Create a struct for all Values and store them in there.
void extractValuesFromStruct (SetVector< Value * > Values, Type *Ty, Value *Struct, ValueMapT &VMap)
    Extract all values from the Struct and construct the mapping.
Function * createSubFnDefinition ()
    Create the definition of the parallel subfunction.
virtual void deployParallelExecution (Function *SubFn, Value *SubFnParam, Value *LB, Value *UB, Value *Stride)=0
    Create the runtime library calls for spawn and join of the worker threads.
virtual Function * prepareSubFnDefinition (Function *F) const =0
    Prepare the definition of the parallel subfunction.
virtual std::tuple< Value *, Function * > createSubFn (Value *Stride, AllocaInst *Struct, SetVector< Value * > UsedValues, ValueMapT &VMap)=0
    Create the parallel subfunction.
Protected Attributes
PollyIRBuilder & Builder
    The IR builder we use to create instructions.
std::unique_ptr< LoopInfo > SubFnLI
    The loop info for the generated subfunction.
std::unique_ptr< DominatorTree > SubFnDT
    The dominance tree for the generated subfunction.
Type * LongType
    The type of a "long" on this hardware used for backend calls.
Module * M
    The current module.
llvm::DebugLoc DLGenerated
    Debug location for generated code without direct link to any specific line.
The ParallelLoopGenerator creates parallelized loops.
To parallelize a loop, we perform the following steps:
o Generate a subfunction which will hold the loop body.
o Create a struct to hold all outer values needed in the loop body.
o Create calls to a runtime library to achieve the actual parallelism. These calls spawn and join threads, define how the work (here the iterations) is distributed between them, and make sure each thread has access to the struct holding all needed values.
At the moment we support only one parallel runtime, OpenMP.
If we parallelize the outer loop of the following loop nest,
    S0;
    for (int i = 0; i < N; i++)
      for (int j = 0; j < M; j++)
        S1(i, j);
    S2;
we will generate the following code (with different runtime function names):
    S0;
    auto *values = storeValuesIntoStruct();
    // Execute subfunction with multiple threads
    spawn_threads(subfunction, values);
    join_threads();
    S2;

    // This function is executed in parallel by different threads
    void subfunction(values) {
      while (auto *WorkItem = getWorkItem()) {
        int LB = WorkItem->begin();
        int UB = WorkItem->end();
        for (int i = LB; i < UB; i++)
          for (int j = 0; j < M; j++)
            S1(i, j);
      }
      cleanup_thread();
    }
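The scheme in the pseudocode above can be approximated in standard C++ without any OpenMP runtime. The following is a minimal sketch under stated assumptions, not the code Polly generates: `Values`, `WorkQueue`, `getWorkItem` and `runParallel` are hypothetical names, `std::thread` stands in for the runtime's spawn and join calls, the chunk size of 4 is arbitrary, and `S1(i, j)` is assumed to add `i * j` to a per-`i` slot.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for the struct filled by storeValuesIntoStruct().
struct Values {
  int M;                  // inner trip count
  std::vector<long> *Out; // one result slot per outer iteration
};

// Hypothetical work queue handing out half-open chunks [LB, UB).
struct WorkQueue {
  std::atomic<int> Next;
  int End, Chunk;
  WorkQueue(int LB, int UB, int Chunk) : Next(LB), End(UB), Chunk(Chunk) {}

  // Returns false once all iterations have been handed out.
  bool getWorkItem(int &LB, int &UB) {
    int Begin = Next.fetch_add(Chunk);
    if (Begin >= End)
      return false;
    LB = Begin;
    UB = std::min(Begin + Chunk, End);
    return true;
  }
};

// Executed by every worker thread; mirrors "subfunction" in the pseudocode.
void subfunction(Values *V, WorkQueue *Q) {
  int LB, UB;
  while (Q->getWorkItem(LB, UB))
    for (int i = LB; i < UB; i++)
      for (int j = 0; j < V->M; j++)
        (*V->Out)[i] += static_cast<long>(i) * j; // assumed body of S1(i, j)
}

// Runs the outer loop over [0, N) on NumThreads workers.
std::vector<long> runParallel(int N, int M, unsigned NumThreads) {
  assert(NumThreads > 0);
  std::vector<long> Out(N, 0);
  Values V{M, &Out};
  WorkQueue Q(0, N, /*Chunk=*/4);
  std::vector<std::thread> Workers;
  for (unsigned T = 0; T < NumThreads; T++)
    Workers.emplace_back(subfunction, &V, &Q); // plays the role of spawn_threads
  for (auto &W : Workers)
    W.join(); // plays the role of join_threads
  return Out;
}
```

Each outer iteration is claimed by exactly one thread, so the writes to distinct `Out` slots need no further synchronization in this sketch.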
Definition at line 128 of file LoopGenerators.h.
ParallelLoopGenerator::ParallelLoopGenerator (PollyIRBuilder &Builder, const DataLayout &DL) [inline]
Create a parallel loop generator for the current function.
Definition at line 131 of file LoopGenerators.h.
virtual ParallelLoopGenerator::~ParallelLoopGenerator () [inline, virtual]
Definition at line 138 of file LoopGenerators.h.
Value * ParallelLoopGenerator::createParallelLoop (Value *LB, Value *UB, Value *Stride, SetVector< Value * > &Values, ValueMapT &VMap, BasicBlock::iterator *LoopBody)
Create a parallel loop.
This function is the main function to automatically generate a parallel loop with all its components.
Parameters:
    LB: The lower bound for the loop we parallelize.
    UB: The upper bound for the loop we parallelize.
    Stride: The stride of the loop we parallelize.
    Values: A set of LLVM-IR Values that should be available in the new loop body.
    VMap: A map to allow outside access to the new versions of the values in Values.
    LoopBody: A pointer to an iterator that is set to point to the body of the created loop. It should be used to insert instructions that form the actual loop body.
Definition at line 176 of file LoopGenerators.cpp.
References Builder, createSubFn(), deployParallelExecution(), Function, LongType, storeValuesIntoStruct(), and polly::Value.
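The way createParallelLoop() drives the other members can be illustrated with a simplified model in plain C++. This is a sketch under stated assumptions rather than Polly's implementation: strings stand in for LLVM values and functions, `ParallelLoopModel` and `GompLikeModel` are hypothetical names, and the call order (store values, create the subfunction, deploy) is inferred from the References list, not confirmed by it. The real function also returns a `Value *`, which is not modeled here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified model of the generator's control flow.
class ParallelLoopModel {
public:
  virtual ~ParallelLoopModel() = default;

  // Plays the role of createParallelLoop(): runs all steps in one place.
  void createParallelLoop(long LB, long UB, long Stride,
                          const std::vector<std::string> &Values) {
    std::string Struct = storeValuesIntoStruct(Values);
    std::string SubFn = createSubFn(Stride, Struct);
    deployParallelExecution(SubFn, Struct, LB, UB, Stride);
  }

  // Records which steps ran, in order.
  const std::vector<std::string> &log() const { return Log; }

protected:
  // Shared by all backends, like the non-virtual helper in the class.
  std::string storeValuesIntoStruct(const std::vector<std::string> &Values) {
    Log.push_back("store " + std::to_string(Values.size()) + " values");
    return "ctx";
  }

  // Backend-specific steps, pure virtual as in the documented interface.
  virtual std::string createSubFn(long Stride, const std::string &Struct) = 0;
  virtual void deployParallelExecution(const std::string &SubFn,
                                       const std::string &Struct, long LB,
                                       long UB, long Stride) = 0;

  std::vector<std::string> Log;
};

// Hypothetical backend that only logs what a runtime-specific one would emit.
class GompLikeModel : public ParallelLoopModel {
  std::string createSubFn(long, const std::string &Struct) override {
    Log.push_back("create subfn reading " + Struct);
    return "subfn";
  }
  void deployParallelExecution(const std::string &SubFn, const std::string &,
                               long, long, long) override {
    Log.push_back("spawn " + SubFn);
    Log.push_back("call " + SubFn);
    Log.push_back("join");
  }
};
```

Keeping the driver in the base class and the runtime calls in subclasses matches the split between this class and its two documented implementations, ParallelLoopGeneratorGOMP and ParallelLoopGeneratorKMP.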
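virtual std::tuple< Value *, Function * > ParallelLoopGenerator::createSubFn (Value *Stride, AllocaInst *Struct, SetVector< Value * > UsedValues, ValueMapT &VMap) [pure virtual]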
Create the parallel subfunction.
Parameters:
    Stride: The induction variable increment.
    Struct: A struct holding all values in UsedValues.
    UsedValues: A set of LLVM-IR Values that should be available in the new loop body.
    VMap: A map to allow outside access to the new versions of the values in UsedValues.
    SubFn: The newly created subfunction. Per the signature, it is returned as the Function * element of the result tuple rather than through a parameter.
Implemented in polly::ParallelLoopGeneratorGOMP, and polly::ParallelLoopGeneratorKMP.
Referenced by createParallelLoop().
Function * ParallelLoopGenerator::createSubFnDefinition ()
Create the definition of the parallel subfunction.
Definition at line 199 of file LoopGenerators.cpp.
References Builder, Function, polly::PollySkipFnAttr, and prepareSubFnDefinition().
Referenced by polly::ParallelLoopGeneratorGOMP::createSubFn(), and polly::ParallelLoopGeneratorKMP::createSubFn().
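virtual void ParallelLoopGenerator::deployParallelExecution (Function *SubFn, Value *SubFnParam, Value *LB, Value *UB, Value *Stride) [pure virtual]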
Create the runtime library calls for spawn and join of the worker threads.
Additionally, places a call to the specified subfunction.
Parameters:
    SubFn: The subfunction which holds the loop body.
    SubFnParam: The parameter for the subfunction (basically the struct filled with the outside values).
    LB: The lower bound for the loop we parallelize.
    UB: The upper bound for the loop we parallelize.
    Stride: The stride of the loop we parallelize.
Implemented in polly::ParallelLoopGeneratorGOMP, and polly::ParallelLoopGeneratorKMP.
Referenced by createParallelLoop().
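What "spawn, call the subfunction, join" amounts to at run time can be sketched with `std::thread`. This is an illustrative analogue only: the real member emits calls into an OpenMP runtime at the IR level, whereas `deployParallel`, `SubFnTy`, `SumParam`, `sumSubFn` and `sumRange` below are hypothetical names. The sketch also assumes an exclusive upper bound, a positive stride, and a cyclic distribution of iterations across threads, none of which is stated by the documentation above.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical subfunction type: an opaque parameter plus the slice to run.
using SubFnTy = void (*)(void *Param, long LB, long UB, long Stride);

// Spawns NumThreads - 1 workers, runs one slice on the calling thread,
// then joins. Thread T handles LB + T*Stride, LB + (T+NumThreads)*Stride, ...
void deployParallel(SubFnTy SubFn, void *SubFnParam, long LB, long UB,
                    long Stride, unsigned NumThreads) {
  assert(Stride > 0 && NumThreads > 0);
  const long ThreadStride = static_cast<long>(NumThreads) * Stride;
  std::vector<std::thread> Workers;
  for (unsigned T = 1; T < NumThreads; T++)
    Workers.emplace_back(SubFn, SubFnParam, LB + static_cast<long>(T) * Stride,
                         UB, ThreadStride);
  // The spawning thread takes part too, by calling the subfunction directly.
  SubFn(SubFnParam, LB, UB, ThreadStride);
  for (auto &W : Workers)
    W.join();
}

// Hypothetical parameter struct shared by all threads.
struct SumParam {
  std::atomic<long> Sum{0};
};

// Example subfunction: adds up the induction variable values of its slice.
void sumSubFn(void *Param, long LB, long UB, long Stride) {
  auto *P = static_cast<SumParam *>(Param);
  long Local = 0;
  for (long I = LB; I < UB; I += Stride)
    Local += I;
  P->Sum += Local; // one atomic update per thread
}

// Sums LB, LB + Stride, ... below UB using NumThreads threads.
long sumRange(long LB, long UB, long Stride, unsigned NumThreads) {
  SumParam P;
  deployParallel(sumSubFn, &P, LB, UB, Stride, NumThreads);
  return P.Sum.load();
}
```

Passing the shared state as a single opaque pointer mirrors the SubFnParam argument, which is described as the struct filled with the outside values.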
void ParallelLoopGenerator::extractValuesFromStruct (SetVector< Value * > Values, Type *Ty, Value *Struct, ValueMapT &VMap)
Extract all values from the Struct and construct the mapping.
Parameters:
    Values: The values which were stored in the struct.
    Struct: The struct holding all the values in Values.
    VMap: A map to associate every element of Values with the new llvm value loaded from the Struct.
Definition at line 242 of file LoopGenerators.cpp.
References Builder, and polly::Value.
Referenced by polly::ParallelLoopGeneratorGOMP::createSubFn(), and polly::ParallelLoopGeneratorKMP::createSubFn().
DominatorTree * ParallelLoopGenerator::getCalleeDominatorTree () const [inline]
Returns the DominatorTree for the generated subfunction.
Definition at line 186 of file LoopGenerators.h.
References SubFnDT.
LoopInfo * ParallelLoopGenerator::getCalleeLoopInfo () const [inline]
Returns the LoopInfo for the generated subfunction.
Definition at line 189 of file LoopGenerators.h.
References SubFnLI.
virtual Function * ParallelLoopGenerator::prepareSubFnDefinition (Function *F) const [pure virtual]
Prepare the definition of the parallel subfunction.
Creates the argument list and names them (as well as the subfunction).
Parameters:
    F: A pointer to the (parallel) subfunction's parent function.
Implemented in polly::ParallelLoopGeneratorGOMP, and polly::ParallelLoopGeneratorKMP.
References Function.
Referenced by createSubFnDefinition().
AllocaInst * ParallelLoopGenerator::storeValuesIntoStruct (SetVector< Value * > &Values)
Create a struct for all Values and store them in there.
Parameters:
    Values: The values which should be stored in the struct.
Definition at line 216 of file LoopGenerators.cpp.
References Builder, and polly::Value.
Referenced by createParallelLoop().
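storeValuesIntoStruct() and extractValuesFromStruct() form a round trip: values are packed on the caller's side and reloaded inside the subfunction, with a map linking each original value to its reloaded copy. The following plain C++ sketch shows that idea only. `Val`, `storeValues` and `extractValues` are hypothetical stand-ins for the LLVM types and members, and the ".reloaded" name suffix is purely illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an llvm::Value: a named integer.
struct Val {
  std::string Name;
  long Data;
};

// Analogue of storeValuesIntoStruct(): one slot per value, in order.
std::vector<long> storeValues(const std::vector<Val *> &Values) {
  std::vector<long> Struct;
  for (const Val *V : Values)
    Struct.push_back(V->Data);
  return Struct;
}

// Analogue of extractValuesFromStruct(): reload slot I and record
// "original value -> new copy" in the returned map (the VMap).
std::map<const Val *, Val> extractValues(const std::vector<Val *> &Values,
                                         const std::vector<long> &Struct) {
  assert(Values.size() == Struct.size());
  std::map<const Val *, Val> VMap;
  for (std::size_t I = 0; I < Values.size(); I++)
    VMap[Values[I]] = Val{Values[I]->Name + ".reloaded", Struct[I]};
  return VMap;
}
```

Both functions must iterate the values in the same order, since a slot is identified only by its position. That is one reason an ordered, duplicate-free container such as SetVector fits this interface.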
PollyIRBuilder & ParallelLoopGenerator::Builder [protected]
The IR builder we use to create instructions.
Definition at line 163 of file LoopGenerators.h.
Referenced by polly::ParallelLoopGeneratorGOMP::createCallCleanupThread(), polly::ParallelLoopGeneratorKMP::createCallDispatchInit(), polly::ParallelLoopGeneratorKMP::createCallDispatchNext(), polly::ParallelLoopGeneratorGOMP::createCallGetWorkItem(), polly::ParallelLoopGeneratorKMP::createCallGlobalThreadNum(), polly::ParallelLoopGeneratorGOMP::createCallJoinThreads(), polly::ParallelLoopGeneratorKMP::createCallPushNumThreads(), polly::ParallelLoopGeneratorGOMP::createCallSpawnThreads(), polly::ParallelLoopGeneratorKMP::createCallSpawnThreads(), polly::ParallelLoopGeneratorKMP::createCallStaticFini(), polly::ParallelLoopGeneratorKMP::createCallStaticInit(), createParallelLoop(), polly::ParallelLoopGeneratorKMP::createSourceLocation(), polly::ParallelLoopGeneratorGOMP::createSubFn(), polly::ParallelLoopGeneratorKMP::createSubFn(), createSubFnDefinition(), polly::ParallelLoopGeneratorGOMP::deployParallelExecution(), polly::ParallelLoopGeneratorKMP::deployParallelExecution(), extractValuesFromStruct(), polly::ParallelLoopGeneratorGOMP::prepareSubFnDefinition(), polly::ParallelLoopGeneratorKMP::prepareSubFnDefinition(), and storeValuesIntoStruct().
llvm::DebugLoc ParallelLoopGenerator::DLGenerated [protected]
Debug location for generated code without direct link to any specific line.
We only set the DebugLoc where the IR Verifier requires us to. Elsewhere, an absent debug location in optimized code should be fine.
Definition at line 182 of file LoopGenerators.h.
Referenced by polly::ParallelLoopGeneratorGOMP::createCallCleanupThread(), polly::ParallelLoopGeneratorKMP::createCallDispatchInit(), polly::ParallelLoopGeneratorKMP::createCallDispatchNext(), polly::ParallelLoopGeneratorGOMP::createCallGetWorkItem(), polly::ParallelLoopGeneratorKMP::createCallGlobalThreadNum(), polly::ParallelLoopGeneratorGOMP::createCallJoinThreads(), polly::ParallelLoopGeneratorKMP::createCallPushNumThreads(), polly::ParallelLoopGeneratorGOMP::createCallSpawnThreads(), polly::ParallelLoopGeneratorKMP::createCallSpawnThreads(), polly::ParallelLoopGeneratorKMP::createCallStaticFini(), polly::ParallelLoopGeneratorKMP::createCallStaticInit(), and polly::ParallelLoopGeneratorGOMP::deployParallelExecution().
Type * ParallelLoopGenerator::LongType [protected]
The type of a "long" on this hardware used for backend calls.
Definition at line 172 of file LoopGenerators.h.
Referenced by polly::ParallelLoopGeneratorKMP::createCallDispatchInit(), polly::ParallelLoopGeneratorGOMP::createCallSpawnThreads(), polly::ParallelLoopGeneratorKMP::createCallStaticInit(), createParallelLoop(), polly::ParallelLoopGeneratorGOMP::createSubFn(), polly::ParallelLoopGeneratorKMP::createSubFn(), polly::ParallelLoopGeneratorKMP::is64BitArch(), and polly::ParallelLoopGeneratorKMP::prepareSubFnDefinition().
Module * ParallelLoopGenerator::M [protected]
The current module.
Definition at line 175 of file LoopGenerators.h.
Referenced by polly::ParallelLoopGeneratorGOMP::createCallCleanupThread(), polly::ParallelLoopGeneratorKMP::createCallDispatchInit(), polly::ParallelLoopGeneratorKMP::createCallDispatchNext(), polly::ParallelLoopGeneratorGOMP::createCallGetWorkItem(), polly::ParallelLoopGeneratorKMP::createCallGlobalThreadNum(), polly::ParallelLoopGeneratorGOMP::createCallJoinThreads(), polly::ParallelLoopGeneratorKMP::createCallPushNumThreads(), polly::ParallelLoopGeneratorGOMP::createCallSpawnThreads(), polly::ParallelLoopGeneratorKMP::createCallSpawnThreads(), polly::ParallelLoopGeneratorKMP::createCallStaticFini(), polly::ParallelLoopGeneratorKMP::createCallStaticInit(), polly::ParallelLoopGeneratorKMP::createSourceLocation(), polly::ParallelLoopGeneratorGOMP::prepareSubFnDefinition(), and polly::ParallelLoopGeneratorKMP::prepareSubFnDefinition().
std::unique_ptr< DominatorTree > ParallelLoopGenerator::SubFnDT [protected]
The dominance tree for the generated subfunction.
Definition at line 169 of file LoopGenerators.h.
Referenced by polly::ParallelLoopGeneratorGOMP::createSubFn(), polly::ParallelLoopGeneratorKMP::createSubFn(), and getCalleeDominatorTree().
std::unique_ptr< LoopInfo > ParallelLoopGenerator::SubFnLI [protected]
The loop info for the generated subfunction.
Definition at line 166 of file LoopGenerators.h.
Referenced by polly::ParallelLoopGeneratorGOMP::createSubFn(), polly::ParallelLoopGeneratorKMP::createSubFn(), and getCalleeLoopInfo().