59 namespace threadblock {
    66   typename WarpMmaSimt_,
    76   static const int kPartitionsK = Shape::kK / WarpMmaSimt::Shape::kK;
    79   using LayoutC = typename WarpMmaSimt::LayoutC;
    88     typename WarpMmaSimt::Shape,
    89     typename WarpMmaSimt::Policy,
   101     typename WarpMmaSimt::Shape,
   102     typename WarpMmaSimt::ThreadMma,
   104     typename WarpMmaSimt::Policy
   108     typename WarpMmaSimt::Shape,
   109     typename WarpMmaSimt::ThreadMma,
   112     typename WarpMmaSimt::Policy
   116     typename OutputTileThreadMap::CompactedThreadMap,
   121   using Padding = typename WarpTileIterator::Padding;
 Templates implementing loading of tiles from pitch-linear rank=2 tensors. 
Definition: aligned_buffer.h:35
Defines sensible defaults for epilogues for SimtOps. 
Definition: default_epilogue_simt.h:70
static int const kElementsPerAccess
Definition: default_epilogue_simt.h:75
Epilogue for threadblock scoped GEMMs using Tensor Ops. 
static const int kPartitionsK
Definition: default_epilogue_simt.h:76
Defines common types used for all GEMM-like operators. 
Functor performing conversion operations used by epilogues. 
cutlass::epilogue::threadblock::SharedLoadIterator< typename OutputTileThreadMap::CompactedThreadMap, ElementAccumulator > SharedLoadIterator
Definition: default_epilogue_simt.h:118
WarpMmaSimt_ WarpMmaSimt
Definition: default_epilogue_simt.h:73
Defines the optimal thread map for SIMT accumulator layouts. 
Definition: default_thread_map_simt.h:52
Statically sized array of elements that accommodates all CUTLASS-supported numeric types and is safe ...
OutputOp_ OutputOp
Definition: default_epilogue_simt.h:74
Functor performing linear combination operations used by epilogues. 
cutlass::epilogue::warp::FragmentIteratorSimt< typename WarpMmaSimt::Shape, typename WarpMmaSimt::ThreadMma, layout::RowMajor, typename WarpMmaSimt::Policy > AccumulatorFragmentIterator
Definition: default_epilogue_simt.h:105
typename WarpMmaSimt::LayoutC LayoutC
Definition: default_epilogue_simt.h:79
Fragment iterator for SIMT accumulator arrangements. 
Definition: fragment_iterator_simt.h:60
typename WarpMmaSimt::ElementC ElementAccumulator
Definition: default_epilogue_simt.h:80
Top-level include for all CUTLASS numeric types. 
This defines a "fragment" iterator for visiting the fragments of an accumulator tile that participate...
typename WarpTileIterator::Padding Padding
Hard-coded padding elements added. 
Definition: default_epilogue_simt.h:121
Shape_ Shape
Definition: default_epilogue_simt.h:72
Mapping function for row-major matrices. 
Definition: layout/matrix.h:50
Epilogue operator without splitk. 
Definition: epilogue.h:74
Epilogue for threadblock scoped GEMMs using Tensor Ops. 
Definition: epilogue/threadblock/predicated_tile_iterator.h:65
typename OutputOp::ElementOutput ElementOutput
Definition: default_epilogue_simt.h:78
cutlass::epilogue::warp::TileIteratorSimt< typename WarpMmaSimt::Shape, typename WarpMmaSimt::ThreadMma, ElementAccumulator, layout::RowMajor, typename WarpMmaSimt::Policy > WarpTileIterator
Definition: default_epilogue_simt.h:113
Definition: shared_load_iterator.h:61
cutlass::epilogue::threadblock::PredicatedTileIterator< OutputTileThreadMap, ElementOutput > OutputTileIterator
Definition: default_epilogue_simt.h:98
Functor performing reduction operations used by epilogues. 
Basic include for CUTLASS. 
Template for reading and writing tiles of accumulators to shared memory. 
Definition: tile_iterator_simt.h:55
typename cutlass::epilogue::threadblock::DefaultThreadMapSimt< Shape, typename WarpMmaSimt::Shape, typename WarpMmaSimt::Policy, kPartitionsK, ElementOutput, kElementsPerAccess >::Type OutputTileThreadMap
Definition: default_epilogue_simt.h:93
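The `kPartitionsK` constant at line 76 of the listing divides the threadblock tile's K extent by the warp-level tile's K extent. The sketch below reproduces only that arithmetic in isolation. `TileShape` and `PartitionsK` are hypothetical stand-ins introduced for illustration, not CUTLASS types, and the divisibility `static_assert` is an added assumption rather than something shown in the listing.

```cpp
// Hypothetical stand-in for a GEMM tile shape with compile-time extents.
// This is NOT cutlass::gemm::GemmShape; it only mimics the kK member
// that the listing reads through Shape::kK and WarpMmaSimt::Shape::kK.
template <int M, int N, int K>
struct TileShape {
  static constexpr int kM = M;
  static constexpr int kN = N;
  static constexpr int kK = K;
};

// Hypothetical helper mirroring the expression on line 76:
//   kPartitionsK = Shape::kK / WarpMmaSimt::Shape::kK
template <typename ThreadblockShape, typename WarpShape>
struct PartitionsK {
  // Assumption for this sketch: the threadblock K extent is a whole
  // multiple of the warp K extent, so the integer division is exact.
  static_assert(ThreadblockShape::kK % WarpShape::kK == 0,
                "threadblock kK must be a multiple of warp kK");
  static constexpr int value = ThreadblockShape::kK / WarpShape::kK;
};

// Equal K extents give a single partition along K.
using SinglePartition =
    PartitionsK<TileShape<128, 128, 8>, TileShape<32, 64, 8>>;

// A threadblock tile twice as deep as the warp tile gives two partitions.
using TwoPartitions =
    PartitionsK<TileShape<128, 128, 16>, TileShape<32, 64, 8>>;
```

In the listing, this value is passed on as a template argument to `DefaultThreadMapSimt` (line 93), so the output thread map is parameterized by the number of K partitions.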