
# clpeak

[![Build Status](https://app.travis-ci.com/krrishnarraj/clpeak.svg?branch=master)](https://app.travis-ci.com/github/krrishnarraj/clpeak)
[![Snap Status](https://snapcraft.io/clpeak/badge.svg)](https://snapcraft.io/clpeak)

A synthetic benchmarking tool that measures the peak capabilities of OpenCL devices. It reports only the peak metrics achievable with vector operations and does not represent a real-world use case.

## Building

```console
# Fetch submodules (--remote checks out the latest upstream commit of each)
git submodule update --init --recursive --remote
# Configure and build out of source
mkdir build
cd build
cmake ..
cmake --build .
```
17
## Sample

```text
Platform: NVIDIA CUDA
  Device: Tesla V100-SXM2-16GB
    Driver version  : 390.77 (Linux x64)
    Compute units   : 80
    Clock frequency : 1530 MHz

    Global memory bandwidth (GBPS)
      float   : 767.48
      float2  : 810.81
      float4  : 843.06
      float8  : 726.12
      float16 : 735.98

    Single-precision compute (GFLOPS)
      float   : 15680.96
      float2  : 15674.50
      float4  : 15645.58
      float8  : 15583.27
      float16 : 15466.50

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 7859.49
      double2  : 7849.96
      double4  : 7832.96
      double8  : 7799.82
      double16 : 7740.88

    Integer compute (GIOPS)
      int   : 15653.47
      int2  : 15654.40
      int4  : 15655.21
      int8  : 15659.04
      int16 : 15608.65

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer         : 10.64
      enqueueReadBuffer          : 11.92
      enqueueMapBuffer(for read) : 9.97
        memcpy from mapped ptr   : 8.62
      enqueueUnmap(after write)  : 11.04
        memcpy to mapped ptr     : 9.16

    Kernel launch latency : 7.22 us
```
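
As a rough sanity check, the compute figures in the sample can be compared with a theoretical peak derived from the reported compute units and clock frequency. The sketch below is not part of clpeak. It assumes 64 FP32 and 32 FP64 lanes per compute unit (NVIDIA's published Volta layout, which clpeak does not report) and counts a fused multiply-add as two operations. The helper name `theoretical_gflops` is hypothetical.

```python
def theoretical_gflops(compute_units, lanes_per_cu, clock_mhz, flops_per_lane_per_cycle=2):
    """Peak GFLOPS = compute units x lanes x FLOPs per lane per cycle x clock in GHz."""
    return compute_units * lanes_per_cu * flops_per_lane_per_cycle * clock_mhz / 1000.0


# Values reported in the sample output.
compute_units = 80
clock_mhz = 1530

# Assumption: 64 FP32 and 32 FP64 lanes per compute unit (not reported by clpeak).
sp_peak = theoretical_gflops(compute_units, 64, clock_mhz)
dp_peak = theoretical_gflops(compute_units, 32, clock_mhz)

# Best measured rows from the sample (float and double).
sp_measured = 15680.96
dp_measured = 7859.49

print(f"SP: theoretical {sp_peak:.1f} GFLOPS, measured {sp_measured:.2f} ({sp_measured / sp_peak:.1%})")
print(f"DP: theoretical {dp_peak:.1f} GFLOPS, measured {dp_measured:.2f} ({dp_measured / dp_peak:.1%})")
```

Under these assumptions, a ratio near 100% suggests the benchmark kernel runs close to the device's compute limit. Small deviations in either direction can come from clock variation or timer granularity.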
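
To compare runs or pick the best vector width, the text report can be loaded into a nested dictionary. This is a minimal sketch that assumes the layout shown in the sample (a `Title (UNIT)` heading followed by `name : value` rows). `parse_report` is a hypothetical helper, not something clpeak provides, and `SAMPLE` is an excerpt of the report above.

```python
import re

# Excerpt of the sample report; a real run would be read from a file or pipe.
SAMPLE = """\
    Global memory bandwidth (GBPS)
      float   : 767.48
      float2  : 810.81
      float4  : 843.06

    Single-precision compute (GFLOPS)
      float   : 15680.96
      float2  : 15674.50
"""


def parse_report(text):
    """Group 'name : value' rows under the most recent 'Title (UNIT)' heading."""
    heading = re.compile(r"^\s*(.+?)\s+\((GBPS|GFLOPS|GIOPS)\)\s*$")
    row = re.compile(r"^\s*(.+?)\s*:\s*([0-9.]+)\s*$")
    results, section = {}, None
    for line in text.splitlines():
        m = heading.match(line)
        if m:
            section = m.group(1)
            results[section] = {}
            continue
        m = row.match(line)
        # Rows before the first heading (driver version, clock, ...) are skipped.
        if m and section is not None:
            results[section][m.group(1)] = float(m.group(2))
    return results


report = parse_report(SAMPLE)
bandwidth = report["Global memory bandwidth"]
best = max(bandwidth, key=bandwidth.get)
print(best, bandwidth[best])
```

Lines with trailing units, such as the kernel launch latency, are deliberately ignored here. Handling them would need an extra pattern.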