Demonstrations of kvm exit reasons, the Linux eBPF/bcc version.


Frequent virtual machine exits can cause performance problems. This tool helps
locate the most frequent exit reasons, so that the exits can be reduced or even
avoided, by displaying the detailed exit reasons and the count of each vm exit
for all vms running on one physical machine.


Features of this tool
=====================

- There is a patch, [KVM: x86: add full vm-exit reason debug entries]
  (https://patchwork.kernel.org/project/kvm/patch/[email protected]/),
  that tries to add more vm-exit reason debug entries. However, as the comments
  on that patch said, the code allocates lots of memory that may never be
  consumed, misses some arch-specific kvm causes, and cannot do kernel
  aggregation. bcc, as a user space tool, can implement all of these functions
  more easily and flexibly.
- The bcc Python logic provides kernel aggregation and custom output, such as
  collapsing all tids for one pid (i.e. one vm's qemu process id) with exit
  reasons sorted in descending order. For more information, see the
  "USAGE message" section below.
- The bpf in-kernel percpu_array and percpu_cache further improve performance.
  For more information, see the "Help to understand" section below.

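The collapsing described above can be pictured with a small user-space sketch.
This is not the tool's code: the record layout (pid, tid, exit_reason, count)
and the helper name collapse_by_pid are assumptions made for illustration, and
the sample counts are only illustrative.

```python
from collections import defaultdict

# Hypothetical sample records: (pid, tid, exit_reason, count).
records = [
    (1273795, 1273819, "MSR_WRITE", 7196),
    (1273795, 1273820, "MSR_WRITE", 5199),
    (1273795, 1273819, "HLT", 2802),
    (1273795, 1273820, "HLT", 2054),
    (1273795, 1273819, "IO_INSTRUCTION", 4),
]


def collapse_by_pid(records, pid):
    """Sum counts per exit reason over all tids of one pid, largest first."""
    totals = defaultdict(int)
    for rec_pid, _tid, reason, count in records:
        if rec_pid == pid:
            totals[reason] += count
    # Sort by count in descending order, as the -p output does.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


for reason, count in collapse_by_pid(records, 1273795):
    print("%-35s %d" % (reason, count))
```

Keeping the per-tid records and collapsing them only at display time means the
same data can serve both a per-pid view and a per-thread view.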

Limitations
===========

Hardware-assisted virtualization differs between architectures. Currently the
tool only supports VMX on Intel; AMD support is on the way.


Example output:
===============

# ./kvmexit.py
Display kvm exit reasons and statistics for all threads... Hit Ctrl-C to end.
PID      TID      KVM_EXIT_REASON                     COUNT
^C1273551  1273568  EXIT_REASON_HLT                     12
1273551  1273568  EXIT_REASON_MSR_WRITE               6
1274253  1274261  EXIT_REASON_EXTERNAL_INTERRUPT      1
1274253  1274261  EXIT_REASON_HLT                     12
1274253  1274261  EXIT_REASON_MSR_WRITE               4

# ./kvmexit.py 6
Display kvm exit reasons and statistics for all threads after sleeping 6 secs.
PID      TID      KVM_EXIT_REASON                     COUNT
1273903  1273922  EXIT_REASON_EXTERNAL_INTERRUPT      175
1273903  1273922  EXIT_REASON_CPUID                   10
1273903  1273922  EXIT_REASON_HLT                     6043
1273903  1273922  EXIT_REASON_IO_INSTRUCTION          24
1273903  1273922  EXIT_REASON_MSR_WRITE               15025
1273903  1273922  EXIT_REASON_PAUSE_INSTRUCTION       11
1273903  1273922  EXIT_REASON_EOI_INDUCED             12
1273903  1273922  EXIT_REASON_EPT_VIOLATION           6
1273903  1273922  EXIT_REASON_EPT_MISCONFIG           380
1273903  1273922  EXIT_REASON_PREEMPTION_TIMER        194
1273551  1273568  EXIT_REASON_EXTERNAL_INTERRUPT      18
1273551  1273568  EXIT_REASON_HLT                     989
1273551  1273568  EXIT_REASON_IO_INSTRUCTION          10
1273551  1273568  EXIT_REASON_MSR_WRITE               2205
1273551  1273568  EXIT_REASON_PAUSE_INSTRUCTION       1
1273551  1273568  EXIT_REASON_EOI_INDUCED             5
1273551  1273568  EXIT_REASON_EPT_MISCONFIG           61
1273551  1273568  EXIT_REASON_PREEMPTION_TIMER        14

# ./kvmexit.py -p 1273795 5
Display kvm exit reasons and statistics for PID 1273795 after sleeping 5 secs.
KVM_EXIT_REASON                     COUNT
MSR_WRITE                           13467
HLT                                 5060
PREEMPTION_TIMER                    345
EPT_MISCONFIG                       264
EXTERNAL_INTERRUPT                  169
EPT_VIOLATION                       18
PAUSE_INSTRUCTION                   6
IO_INSTRUCTION                      4
EOI_INDUCED                         2

# ./kvmexit.py -p 1273795 5 -a
Display kvm exit reasons and statistics for PID 1273795 and its all threads after sleeping 5 secs.
TID      KVM_EXIT_REASON                     COUNT
1273819  EXTERNAL_INTERRUPT                  64
1273819  HLT                                 2802
1273819  IO_INSTRUCTION                      4
1273819  MSR_WRITE                           7196
1273819  PAUSE_INSTRUCTION                   2
1273819  EOI_INDUCED                         2
1273819  EPT_VIOLATION                       6
1273819  EPT_MISCONFIG                       162
1273819  PREEMPTION_TIMER                    194
1273820  EXTERNAL_INTERRUPT                  78
1273820  HLT                                 2054
1273820  MSR_WRITE                           5199
1273820  EPT_VIOLATION                       2
1273820  EPT_MISCONFIG                       77
1273820  PREEMPTION_TIMER                    102

# ./kvmexit.py -p 1273795 -v 0
Display kvm exit reasons and statistics for PID 1273795 VCPU 0... Hit Ctrl-C to end.
KVM_EXIT_REASON                     COUNT
^CMSR_WRITE                           2076
HLT                                 795
PREEMPTION_TIMER                    86
EXTERNAL_INTERRUPT                  20
EPT_MISCONFIG                       10
PAUSE_INSTRUCTION                   2
IO_INSTRUCTION                      2
EPT_VIOLATION                       1
EOI_INDUCED                         1

# ./kvmexit.py -p 1273795 -v 0 4
Display kvm exit reasons and statistics for PID 1273795 VCPU 0 after sleeping 4 secs.
KVM_EXIT_REASON                     COUNT
MSR_WRITE                           4726
HLT                                 1827
PREEMPTION_TIMER                    78
EPT_MISCONFIG                       67
EXTERNAL_INTERRUPT                  28
IO_INSTRUCTION                      4
EOI_INDUCED                         2
PAUSE_INSTRUCTION                   2

# ./kvmexit.py -p 1273795 -v 4 4
Traceback (most recent call last):
  File "tools/kvmexit.py", line 306, in <module>
    raise Exception("There's no v%s for PID %d." % (tgt_vcpu, args.pid))
Exception: There's no vCPU 4 for PID 1273795.

# ./kvmexit.py -t 1273819 10
Display kvm exit reasons and statistics for TID 1273819 after sleeping 10 secs.
KVM_EXIT_REASON                     COUNT
MSR_WRITE                           13318
HLT                                 5274
EPT_MISCONFIG                       263
PREEMPTION_TIMER                    171
EXTERNAL_INTERRUPT                  109
IO_INSTRUCTION                      8
PAUSE_INSTRUCTION                   5
EOI_INDUCED                         4
EPT_VIOLATION                       2

# ./kvmexit.py -T '1273820,1273819'
Display kvm exit reasons and statistics for TIDS ['1273820', '1273819']... Hit Ctrl-C to end.
TIDS     KVM_EXIT_REASON                     COUNT
^C1273819  EXTERNAL_INTERRUPT                  300
1273819  HLT                                 13718
1273819  IO_INSTRUCTION                      26
1273819  MSR_WRITE                           37457
1273819  PAUSE_INSTRUCTION                   13
1273819  EOI_INDUCED                         13
1273819  EPT_VIOLATION                       53
1273819  EPT_MISCONFIG                       654
1273819  PREEMPTION_TIMER                    958
1273820  EXTERNAL_INTERRUPT                  212
1273820  HLT                                 9002
1273820  MSR_WRITE                           25495
1273820  PAUSE_INSTRUCTION                   2
1273820  EPT_VIOLATION                       64
1273820  EPT_MISCONFIG                       396
1273820  PREEMPTION_TIMER                    268


Help to understand
==================

We use a PERCPU_ARRAY (pcpuArrayA) and a percpu_hash (hashA) together to store
each kvm exit reason and its count. The reason is that when a vcpu exits and
re-enters, it tends to keep running on the same physical cpu (pcpu below) as in
the last cycle, which we call a 'cache hit'. So we use a PERCPU_ARRAY to record
the 'cache hit' case and speed things up; all other cases go to the
percpu_hash.

BTW, we originally used a common hash for this, with a u64(exit_reason)
key and a struct exit_info {tgid_pid, exit_reason} value. But due to
the big lock in bpf_hash, each update was quite expensive.

Now imagine that pid_tgidA (vcpu A) exits and is going to run on pcpuArrayA.
The BPF code flow is as follows:

               pid_tgidA keeps running on the same pcpu
                        //               \\
                       //                 \\
                      // Y               N \\
                     //                     \\
             a. cache_hit               b. cache_miss
(cacheA's pid_tgid matches pid_tgidA)        ||
                  |                          ||
                  |                          ||
   "increase percpu exit_ct and return"      ||
            [*Note*]                         ||
                             has pid_tgidA ever exited on pcpuArrayA?
                                           //   \\
                                          //     \\
                                         //       \\
                                        // Y     N \\
                                       //           \\
                          b.a load_last_hashA   b.b initialize_hashA_with_zero
                                          \       /
                                           \     /
                                            \   /
                                      "increase percpu exit_ct"
                                             ||
                                             ||
                          has another pid_tgid been running on pcpuArrayA?
                                        //          \\
                                       // Y        N \\
                                      //              \\
                       b.*.a save_theLastHit_hashB    do_nothing
                                           \\       //
                                            \\     //
                                             \\   //
                                       b.* save_to_pcpuArrayA


[*Note*] We do not update the table in "a." above, since the vcpu may hit the
same pcpu again on its next exit. Instead, we update the table only once this
pcpu is no longer hit by the same tgidpid (vcpu), which happens in "b.*.a" and
"b.*".

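The flow above can be modelled in plain Python to make the hit and miss paths
concrete. This is only a user-space simulation of the state of one pcpu, not
the BPF program: the class PcpuState, the function record_exit and the merge
step in totals are hypothetical names and assumptions, with a dict standing in
for the percpu_hash and a single slot standing in for the PERCPU_ARRAY entry.

```python
from collections import Counter


class PcpuState:
    """State of one physical cpu: a one-entry cache plus a fallback table."""

    def __init__(self):
        self.cache_owner = None        # pid_tgid held in the cache slot
        self.cache_counts = Counter()  # exit counts for cache_owner
        self.table = {}                # pid_tgid -> Counter (models the hash)


def record_exit(state, pid_tgid, reason):
    """Record one exit and report whether it was a cache 'hit' or 'miss'."""
    if state.cache_owner == pid_tgid:
        # a. cache_hit: only bump the cached counter, the table is untouched.
        state.cache_counts[reason] += 1
        return "hit"
    # b. cache_miss: load the last saved counts (b.a) or start from zero (b.b).
    counts = Counter(state.table.get(pid_tgid, {}))
    counts[reason] += 1
    # b.*.a: if another pid_tgid owned the slot, save its counts to the table.
    if state.cache_owner is not None:
        state.table[state.cache_owner] = state.cache_counts
    # b.*: the slot now belongs to pid_tgid.
    state.cache_owner = pid_tgid
    state.cache_counts = counts
    return "miss"


def totals(state):
    """Merge table and cache; the cache is newer for its current owner."""
    merged = {key: Counter(val) for key, val in state.table.items()}
    if state.cache_owner is not None:
        merged[state.cache_owner] = Counter(state.cache_counts)
    return merged


state = PcpuState()
events = [("A", "HLT"), ("A", "HLT"), ("B", "MSR_WRITE"), ("A", "HLT")]
results = [record_exit(state, who, why) for who, why in events]
print(results)
print(totals(state))
```

In this model the table entry for a pid_tgid goes stale while that pid_tgid
owns the cache slot, so a reader has to prefer the cached counts for the
current owner, which is what totals does.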

USAGE message:
==============

# ./kvmexit.py -h
usage: kvmexit.py [-h] [-p PID [-v VCPU | -a] ] [-t TID | -T 'TID1,TID2'] [duration]

Display kvm_exit_reason and its statistics at a timed interval

optional arguments:
  -h, --help            show this help message and exit
  -p PID, --pid PID     display process with this PID only, collpase all tids with exit reasons sorted in descending order
  -v VCPU, --v VCPU     display this VCPU only for this PID
  -a, --alltids         display all TIDS for this PID
  -t TID, --tid TID     display thread with this TID only with exit reasons sorted in descending order
  -T 'TID1,TID2', --tids 'TID1,TID2'
                        display threads for a union like {395490, 395491}
  duration              duration of display, after sleeping several seconds

examples:
    ./kvmexit                         # Display kvm_exit_reason and its statistics in real-time until Ctrl-C
    ./kvmexit 5                       # Display in real-time after sleeping 5s
    ./kvmexit -p 3195281              # Collpase all tids for pid 3195281 with exit reasons sorted in descending order
    ./kvmexit -p 3195281 20           # Collpase all tids for pid 3195281 with exit reasons sorted in descending order, and display after sleeping 20s
    ./kvmexit -p 3195281 -v 0         # Display only vcpu0 for pid 3195281, descending sort by default
    ./kvmexit -p 3195281 -a           # Display all tids for pid 3195281
    ./kvmexit -t 395490               # Display only for tid 395490 with exit reasons sorted in descending order
    ./kvmexit -t 395490 20            # Display only for tid 395490 with exit reasons sorted in descending order after sleeping 20s
    ./kvmexit -T '395490,395491'      # Display for a union like {395490, 395491}
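
The option grammar in the usage line above could be expressed with argparse
roughly as follows. This is a sketch of one possible parser, not the tool's
actual source: the function names build_parser and parse, the dest names, and
the error text are assumptions, and only the flags themselves come from the
help message.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Display kvm_exit_reason and its statistics "
                    "at a timed interval")
    parser.add_argument("duration", nargs="?", type=int, default=None,
                        help="duration of display, in seconds")
    parser.add_argument("-p", "--pid", type=int,
                        help="display process with this PID only")
    # -v and -a are alternatives of each other.
    per_pid = parser.add_mutually_exclusive_group()
    per_pid.add_argument("-v", "--v", dest="vcpu", type=int,
                         help="display this VCPU only for this PID")
    per_pid.add_argument("-a", "--alltids", action="store_true",
                         help="display all TIDS for this PID")
    # -t and -T are alternatives of each other.
    per_tid = parser.add_mutually_exclusive_group()
    per_tid.add_argument("-t", "--tid", type=int,
                         help="display thread with this TID only")
    per_tid.add_argument("-T", "--tids", type=lambda s: s.split(","),
                         help="display threads for a union of TIDs")
    return parser


def parse(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # The usage line nests -v and -a under -p, which argparse groups cannot
    # express directly, so this sketch checks it by hand.
    if (args.vcpu is not None or args.alltids) and args.pid is None:
        parser.error("-v and -a require -p PID")
    return args


args = parse(["-p", "1273795", "-v", "0", "4"])
print(args.pid, args.vcpu, args.duration)
```

The two mutually exclusive groups cover the "|" alternatives in the usage line;
the dependency of -v and -a on -p is handled with an explicit check.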