==============================
User Guide for AMDGPU Back-end
==============================

Introduction
============

The AMDGPU back-end provides ISA code generation for AMD GPUs, from the R600
family up to the current Volcanic Islands (GCN Gen 3) family.


Conventions
===========

Address Spaces
--------------

The AMDGPU back-end uses the following address space mapping:

   ============= ============================================
   Address Space Memory Space
   ============= ============================================
   0             Private
   1             Global
   2             Constant
   3             Local
   4             Generic (Flat)
   5             Region
   ============= ============================================

The terminology in the table, aside from the region memory space, is from the
OpenCL standard.


Assembler
=========

The assembler is currently considered experimental.

For syntax examples, look in test/MC/AMDGPU.

Below are some of the currently supported features (modulo bugs).  These
all apply to the Southern Islands ISA.  Sea Islands and Volcanic Islands
are also supported, but may be missing some instructions and have more bugs.

DS Instructions
---------------
All DS instructions are supported.

FLAT Instructions
-----------------
These instructions are only present in the Sea Islands and Volcanic Islands
instruction sets.  All FLAT instructions are supported for these architectures.

MUBUF Instructions
------------------
All non-atomic MUBUF instructions are supported.

SMRD Instructions
-----------------
Only the s_load_dword* SMRD instructions are supported.

SOP1 Instructions
-----------------
All SOP1 instructions are supported.

SOP2 Instructions
-----------------
All SOP2 instructions are supported.

SOPC Instructions
-----------------
All SOPC instructions are supported.

SOPP Instructions
-----------------

Unless otherwise mentioned, all SOPP instructions that have one or more
operands accept integer operands only.  No verification is performed
on the operands, so it is up to the programmer to be familiar with the
range of acceptable values.

s_waitcnt
^^^^^^^^^

s_waitcnt accepts named arguments to specify which memory counter(s) to
wait for.

.. code-block:: nasm

   ; Wait for all counters to be 0
   s_waitcnt 0

   ; Equivalent to s_waitcnt 0.  Counter names can also be delimited by
   ; '&' or ','.
   s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)

   ; Wait for vmcnt counter to be 1.
   s_waitcnt vmcnt(1)

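The comment in the example above mentions that counter names may also be
delimited by '&' or ','.  As a minimal sketch, the two alternative spellings
would look like this, assuming each line is accepted as equivalent to the
space-separated form:

.. code-block:: nasm

   ; Counter names delimited by '&'
   s_waitcnt vmcnt(0) & expcnt(0) & lgkmcnt(0)

   ; Counter names delimited by ','
   s_waitcnt vmcnt(0), expcnt(0), lgkmcnt(0)
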
VOP1, VOP2, VOP3, VOPC Instructions
-----------------------------------

All 32-bit and 64-bit encodings should work.

The assembler will automatically detect which encoding size to use for
VOP1, VOP2, and VOPC instructions based on the operands.  If you want to force
a specific encoding size, you can add an _e32 (for 32-bit encoding) or
_e64 (for 64-bit encoding) suffix to the instruction.  Most, but not all,
instructions support an explicit suffix.  These are all valid assembly
strings:

.. code-block:: nasm

   v_mul_i32_i24 v1, v2, v3
   v_mul_i32_i24_e32 v1, v2, v3
   v_mul_i32_i24_e64 v1, v2, v3

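To illustrate operand-based detection, here is a sketch of an instruction
that cannot use the 32-bit encoding.  It assumes the usual VOP2 layout, in
which the second source operand of the 32-bit form must be a VGPR.  The
register choice is illustrative, and the exact selection made by the
assembler is an assumption rather than documented behavior:

.. code-block:: nasm

   ; The second source operand is an SGPR.  Assuming the 32-bit VOP2 form
   ; only takes a VGPR in that position, the assembler is expected to pick
   ; the 64-bit encoding here without an explicit _e64 suffix.
   v_mul_i32_i24 v1, v2, s3
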
Assembler Directives
--------------------

.hsa_code_object_version major, minor
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

*major* and *minor* are integers that specify the version of the HSA code
object that will be generated by the assembler.  This value will be stored
in an entry of the .note section.

.hsa_code_object_isa [major, minor, stepping, vendor, arch]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

*major*, *minor*, and *stepping* are all integers that describe the instruction
set architecture (ISA) version of the assembly program.

*vendor* and *arch* are quoted strings.  *vendor* should always be equal to
"AMD" and *arch* should always be equal to "AMDGPU".

If no arguments are specified, then the assembler will derive the ISA version,
*vendor*, and *arch* from the value of the -mcpu option that is passed to the
assembler.

ISA version, *vendor*, and *arch* will all be stored in a single entry of the
.note section.

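The two forms of *.hsa_code_object_isa* described above can be sketched as
follows.  They are alternatives, not meant to appear together in one file,
and the version numbers 7,0,0 are placeholders chosen for illustration:

.. code-block:: nasm

   ; Form 1: no arguments.  The ISA version, vendor, and arch are derived
   ; from the -mcpu option.
   .hsa_code_object_isa

   ; Form 2: all five arguments given explicitly.  The version 7,0,0 is
   ; illustrative only; use the values that match the target.
   .hsa_code_object_isa 7,0,0,"AMD","AMDGPU"
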
.amd_kernel_code_t
^^^^^^^^^^^^^^^^^^

This directive marks the beginning of a list of key / value pairs that are used
to specify the amd_kernel_code_t object that will be emitted by the assembler.
The list must be terminated by the *.end_amd_kernel_code_t* directive.  For
any amd_kernel_code_t values that are unspecified, a default value will be
used.  The default value for all keys is 0, with the following exceptions:

- *kernel_code_version_major* defaults to 1.
- *machine_kind* defaults to 1.
- *machine_version_major*, *machine_version_minor*, and
  *machine_version_stepping* are derived from the value of the -mcpu option
  that is passed to the assembler.
- *kernel_code_entry_byte_offset* defaults to 256.
- *wavefront_size* defaults to 6.
- *kernarg_segment_alignment*, *group_segment_alignment*, and
  *private_segment_alignment* default to 4.  Note that alignments are specified
  as a power of two, so a value of **n** means an alignment of 2^\ **n**.

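As a small worked example of the power-of-two rule, the fragment below
overrides one alignment key and leaves every other key at its default.  The
value 5 is chosen purely for illustration:

.. code-block:: nasm

   .amd_kernel_code_t
      ; Alignments are exponents: 5 requests an alignment of 2^5 = 32,
      ; whereas the default of 4 gives 2^4 = 16.
      kernarg_segment_alignment = 5
   .end_amd_kernel_code_t
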
The *.amd_kernel_code_t* directive must be placed immediately after the
function label and before any instructions.

For a full list of amd_kernel_code_t keys, see the examples in
test/CodeGen/AMDGPU/hsa.s.  For an explanation of the meanings of the different
keys, see the comments in lib/Target/AMDGPU/AmdKernelCodeT.h.

Here is an example of a minimal amd_kernel_code_t specification:

.. code-block:: nasm

   .hsa_code_object_version 1,0
   .hsa_code_object_isa

   .hsatext
   .globl  hello_world
   .p2align 8
   .amdgpu_hsa_kernel hello_world

   hello_world:

      .amd_kernel_code_t
         enable_sgpr_kernarg_segment_ptr = 1
         is_ptr64 = 1
         compute_pgm_rsrc1_vgprs = 0
         compute_pgm_rsrc1_sgprs = 0
         compute_pgm_rsrc2_user_sgpr = 2
         kernarg_segment_byte_size = 8
         wavefront_sgpr_count = 2
         workitem_vgpr_count = 3
      .end_amd_kernel_code_t

      s_load_dwordx2 s[0:1], s[0:1] 0x0
      v_mov_b32 v0, 3.14159
      s_waitcnt lgkmcnt(0)
      v_mov_b32 v1, s0
      v_mov_b32 v2, s1
      flat_store_dword v[1:2], v0
      s_endpgm
   .Lfunc_end0:
      .size   hello_world, .Lfunc_end0-hello_world