namespace threadblock {
                      ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_,
    Shape::kM / WarpShape::kM,
    Shape::kN / WarpShape::kN,
    Shape::kK / WarpShape::kK
    !(Shape::kM % WarpShape::kM) &&
    !(Shape::kN % WarpShape::kN),
    "Threadblock-scoped GEMM should be divisible by warp-scoped GEMM size."
  static int const kThreads = WarpCount::kCount * kWarpSize;
 Describes the size of a matrix tile. 
Definition: matrix_shape.h:42
Definition: aligned_buffer.h:35
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::WarpShape WarpShape_ WarpShape
Definition: default_mma_core_sm50.h:84
Query the number of threads per warp. 
Definition: gemm/warp/mma.h:43
Definition: default_mma_core.h:90
Templates implementing how threads are mapped to a given tile. 
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::MmaPolicy MmaPolicy< WarpMma, MatrixShape< 0, 0 >, MatrixShape< 0, 0 >, WarpCount::kK > MmaPolicy
Policy used to define MmaPipelined. 
Definition: default_mma_core_sm50.h:190
Structure to compute the matrix product targeting CUDA cores and SIMT math instructions. 
Definition: mma_simt.h:74
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::ElementB ElementB_ ElementB
Definition: default_mma_core_sm50.h:88
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::OperatorClass arch::OpClassSimt OperatorClass
Definition: default_mma_core_sm50.h:92
Mapping function for column-major matrices. 
Definition: layout/matrix.h:142
Template defining a shape used by pitch-linear operators. 
Definition: pitch_linear.h:43
Statically sized array of elements that accommodates all CUTLASS-supported numeric types and is safe ...
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::ElementC ElementC_ ElementC
Definition: default_mma_core_sm50.h:90
Describes the arrangement and configuration of per-lane operations in warp-level matrix multiply...
Definition: mma_simt_policy.h:46
Defines a Shape template for matrix tiles. 
Defines the size of an element in bits. 
Definition: numeric_types.h:42
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::InstructionShape InstructionShape_ InstructionShape
Definition: default_mma_core_sm50.h:85
Defines basic properties needed by CTA-level GEMMs assuming expectations about data layout of the glo...
Top-level include for all CUTLASS numeric types. 
Shape of a matrix multiply-add operation. 
Definition: include/cutlass/gemm/gemm.h:57
Mapping function for row-major matrices. 
Definition: layout/matrix.h:50
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::ElementA ElementA_ ElementA
Definition: default_mma_core_sm50.h:86
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::LayoutC LayoutC_ LayoutC
Definition: default_mma_core_sm50.h:91
Templates implementing storing of tiles from pitch-linear rank=2 tensors. 
Defines layout functions used by TensorRef and derived classes. 
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::Shape Shape_ Shape
Definition: default_mma_core_sm50.h:83
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::WarpMma cutlass::gemm::warp::MmaSimt< WarpShape, ElementA, SmemLayoutA, ElementB, SmemLayoutB, ElementC, LayoutC, warp::MmaSimtPolicy< MatrixShape< 4, 8 >, layout::RowMajorInterleaved< 2 >, GemmShape< 128/sizeof_bits< ElementA >::value, 128/sizeof_bits< ElementB >::value, 1 > > > > WarpMma
Definition: default_mma_core_sm50.h:182
Templates implementing warp-level matrix multiply-accumulate operations. 
Basic include for CUTLASS. 
Definition: layout/matrix.h:237