Continuous Integration
======================

GitLab CI
---------

GitLab provides a convenient framework for running commands in response to Git pushes.
We use it to test merge requests (MRs) before merging them (pre-merge testing),
as well as for post-merge testing of everything that hits ``main``
(this is necessary because we still allow commits to be pushed outside of MRs,
and even then the MR CI runs in the forked repository, which might have been
modified and thus is unreliable).

The CI runs a number of tests, from trivial build-testing to complex GPU rendering:

- Build testing for a number of configurations and platforms
- Sanity checks (``meson test``)
- Most drivers are also tested using several test suites, such as the
  `Vulkan/GL/GLES conformance test suite <https://github.com/KhronosGroup/VK-GL-CTS>`__,
  `Piglit <https://gitlab.freedesktop.org/mesa/piglit>`__, and others.
- Replay of application traces

A typical run takes between 20 and 30 minutes, although this can increase very quickly
if the GitLab runners are overwhelmed, which happens sometimes. When it does happen,
not much can be done besides waiting it out or cancelling the run.
You can do your part by only running the jobs you care about by using `our
tool <#running-specific-ci-jobs>`__.

Due to limited resources, we currently do not run the CI automatically
on every push; instead, we only run it automatically once the MR has
been assigned to ``Marge``, our merge bot.

If you're interested in the details, the main configuration file is ``.gitlab-ci.yml``,
and it references a number of other files in ``.gitlab-ci/``.

If the GitLab CI doesn't seem to be running on your fork (or MRs, as they run
in the context of your fork), you should check the "Settings" of your fork.
Under "CI / CD" → "General pipelines", make sure "Custom CI config path" is
empty (or set to the default ``.gitlab-ci.yml``), and that the
"Public pipelines" box is checked.

If you're having issues with the GitLab CI, your best bet is to ask
about it on ``#freedesktop`` on OFTC and tag `Daniel Stone
<https://gitlab.freedesktop.org/daniels>`__ (``daniels`` on IRC) or
`Emma Anholt <https://gitlab.freedesktop.org/anholt>`__ (``anholt`` on
IRC).

The three GitLab CI systems currently integrated are:


.. toctree::
   :maxdepth: 1

   bare-metal
   LAVA
   docker

Farm management
---------------

.. note::
   Never mix disabling/re-enabling a farm with any change that can affect a job
   that runs in another farm!

When a farm starts failing for any reason (power, network, out of space), it
needs to be disabled by pushing a separate MR containing:

.. code-block:: sh

   git mv .ci-farms{,-disabled}/$farm_name

After the farm is restored, it can be re-enabled by pushing a new merge request
containing:

.. code-block:: sh

   git mv .ci-farms{-disabled,}/$farm_name

.. warning::
   Pushing (``git push``) directly to ``main`` is forbidden; this change must
   be sent as a :ref:`Merge Request <merging>`.

Application traces replay
-------------------------

The CI replays application traces with various drivers in two different jobs. The first
job replays traces listed in ``src/<driver>/ci/traces-<driver>.yml`` files, and if any
of those traces fail, the pipeline fails as well. The second job replays traces listed in
``src/<driver>/ci/restricted-traces-<driver>.yml`` and it is allowed to fail. This second
job is only created when the pipeline is triggered by ``marge-bot`` or any other user that
has been granted access to these traces.

A traces YAML file also includes a ``download-url`` pointing to the MinIO
instance from which the traces are downloaded.
While the first job should always work with
publicly accessible traces, the second job could point to a URL with restricted access.

Restricted traces are those that have been made available to Mesa developers without a
license to redistribute at will, and thus should not be exposed to the public. Failing to
access that URL would not prevent the pipeline from passing, therefore forks made by
contributors without permissions to download non-redistributable traces can be merged
without friction.

As an aside, only maintainers of such non-redistributable traces are responsible for
ensuring that replays are successful, since other contributors would not be able to
download and test them by themselves.

Mesa contributors who believe they could have permission to access such
non-redistributable traces can request permission from Daniel Stone <[email protected]>.

gitlab.freedesktop.org accounts that are to be granted access to these traces will be
added to the OPA policy for the MinIO repository as per
https://gitlab.freedesktop.org/freedesktop/helm-gitlab-infra/-/commit/a3cd632743019f68ac8a829267deb262d9670958 .

So that the jobs are created in personal repositories, the name of the user's account needs
to be added to the ``rules`` attribute of the GitLab CI job that accesses the restricted
traces.

.. toctree::
   :maxdepth: 1

   local-traces

Intel CI
--------

The Intel CI is not yet integrated into the GitLab CI.
For now, special access must be manually given (file an issue in
`the Intel CI configuration repo <https://gitlab.freedesktop.org/Mesa_CI/mesa_jenkins>`__
if you think you or Mesa would benefit from you having access to the Intel CI).
If you are *not* an Intel employee, results can be seen on
`mesa-ci.01.org <https://mesa-ci.01.org>`__; if you are, you
can access a better interface on
`mesa-ci-results.jf.intel.com <http://mesa-ci-results.jf.intel.com>`__.

The Intel CI runs a much larger array of tests, on a number of generations
of Intel hardware and on multiple platforms (X11, Wayland, DRM & Android),
with the purpose of detecting regressions.
Tests include
`Crucible <https://gitlab.freedesktop.org/mesa/crucible>`__,
`VK-GL-CTS <https://github.com/KhronosGroup/VK-GL-CTS>`__,
`dEQP <https://android.googlesource.com/platform/external/deqp>`__,
`Piglit <https://gitlab.freedesktop.org/mesa/piglit>`__,
`Skia <https://skia.googlesource.com/skia>`__,
`VkRunner <https://github.com/Igalia/vkrunner>`__,
`WebGL <https://github.com/KhronosGroup/WebGL>`__,
and a few other tools.
A typical run takes between 30 minutes and an hour.

If you're having issues with the Intel CI, your best bet is to ask about
it on ``#dri-devel`` on OFTC and tag `Nico Cortes
<https://gitlab.freedesktop.org/ngcortes>`__ (``ngcortes`` on IRC).

.. _CI-job-user-expectations:

CI job user expectations
------------------------

To make sure that testing of one vendor's drivers doesn't block
unrelated work by other vendors, we require that a given driver's test
farm produces a spurious failure no more than once a week. If every
driver had CI and failed once a week, we would be seeing someone's
code getting blocked on a spurious failure daily, which is an
unacceptable cost to the project.

To ensure that, driver maintainers with CI enabled should watch the Flakes panel
of the `CI flakes dashboard
<https://ci-stats-grafana.freedesktop.org/d/Ae_TLIwVk/mesa-ci-quality-false-positives?orgId=1>`__,
particularly the "Flake jobs" pane, to inspect jobs in their driver where the
automatic retry of a failing job produced a success a second time.
Additionally, most CI reports test-level flakes to an IRC channel, and flakes
reported as NEW are not expected and could cause spurious failures in jobs.
Please track the NEW reports in jobs and add them as appropriate to the
``-flakes.txt`` file for your driver.

Additionally, the test farm needs to be able to provide a short enough
turnaround time that we can get our MRs through marge-bot without the pipeline
backing up. As a result, we require that the test farm be able to handle a
whole pipeline's worth of jobs in less than 15 minutes (to compare, the build
stage is about 10 minutes). Given boot times and intermittent network delays,
this generally means that the test runtime as reported by deqp-runner should be
kept to 10 minutes.

If a test farm is short of the hardware needed to provide these guarantees,
consider dropping tests to reduce runtime. dEQP job logs print the slowest
tests at the end of the run, and Piglit logs the runtime of tests in the
``results.json.bz2`` in the artifacts. Alternatively, you can add the following
to your job to run only some fraction (in this case, 1/10th) of the dEQP test
list:

.. code-block:: yaml

   variables:
     DEQP_FRACTION: 10

For Collabora's LAVA farm, the `device types
<https://lava.collabora.dev/scheduler/device_types>`__ page can tell you how
many boards of a specific tag are currently available by adding the "Idle" and
"Busy" columns. For bare-metal, a GitLab admin can look at the `runners
<https://gitlab.freedesktop.org/admin/runners>`__ page. A pipeline should
probably not create more jobs for a board type than there are boards, unless you
clearly have some short-runtime jobs.

If a HW CI farm goes offline (network dies and all CI pipelines end up
stalled) or its runners are consistently spuriously failing (disk
full?), and the maintainer is not immediately available to fix the
issue, please push through an MR disabling that farm's jobs according
to the `Farm Management <#farm-management>`__ instructions.

Personal runners
----------------

Mesa's CI is currently run primarily on packet.net's m1xlarge nodes
(2.2 GHz Sandy Bridge), with each job getting 8 cores allocated. You
can speed up your personal CI builds (and marge-bot merges) by using a
faster personal machine as a runner. You can find the gitlab-runner
package in Debian, or use GitLab's own builds.

To do so, follow `GitLab's instructions
<https://docs.gitlab.com/ee/ci/runners/runners_scope.html#create-a-project-runner-with-a-runner-authentication-token>`__
to register your personal GitLab runner in your Mesa fork. Then, tell
Mesa how many jobs it should serve (``concurrent=``) and how many
cores those jobs should use (``FDO_CI_CONCURRENT=``) by editing these
lines in ``/etc/gitlab-runner/config.toml``, for example:

.. code-block:: toml

   concurrent = 2

   [[runners]]
   environment = ["FDO_CI_CONCURRENT=16"]


Docker caching
--------------

The CI system uses Docker images extensively to cache
infrequently-updated build content like the CTS. The `freedesktop.org
CI templates
<https://gitlab.freedesktop.org/freedesktop/ci-templates/>`__ help us
manage the building of the images to reduce how frequently rebuilds
happen, and trim down the images (stripping out manpages, cleaning the
apt cache, and other such common pitfalls of building Docker images).

When running a container job, the templates will look for an existing
build of that image in the container registry under
``MESA_IMAGE_TAG``. If it's found, it will be reused, and if
not, the associated ``.gitlab-ci/containers/<jobname>.sh`` will be run
to build it. So, when developing any change to container build
scripts, you need to update the associated ``MESA_IMAGE_TAG`` to
a new unique string.
We recommend using the current date plus some
string related to your branch (so that if you rebase on someone else's
container update from the same day, you will get a Git conflict
instead of silently reusing their container).

When developing a given change to your Docker image, you would have to
bump the tag on each ``git commit --amend`` to your development
branch, which can get tedious. Instead, you can navigate to the
`container registry
<https://gitlab.freedesktop.org/mesa/mesa/container_registry>`__ for
your repository and delete the tag to force a rebuild. When your code
is eventually merged to main, a full image rebuild will occur again
(forks inherit images from the main repo, but MRs don't propagate
images from the fork into the main repo's registry).

Building locally using CI docker images
---------------------------------------

It can be frustrating to debug build failures in an environment you
don't personally have. If you're experiencing this with the CI
builds, you can use Docker to use their build environment locally. Go
to your job log, and at the top you'll see a line like::

    Pulling docker image registry.freedesktop.org/anholt/mesa/debian/android_build:2020-09-11

We'll use a volume mount to make our current Mesa tree be what the
Docker container uses, so they'll share everything (their build will
go in ``_build``, according to ``meson-build.sh``). We're going to be
using the image non-interactively, so we use ``run --rm $IMAGE
command`` instead of ``run -it $IMAGE bash`` (which you may also find
useful for debugging). Extract your build setup variables from
``.gitlab-ci.yml`` and run the CI meson build script:

.. code-block:: sh

   IMAGE=registry.freedesktop.org/anholt/mesa/debian/android_build:2020-09-11
   sudo docker pull $IMAGE
   # Mount the current Mesa tree at /mesa and run the CI build script in it
   sudo docker run --rm -v "$(pwd)":/mesa -w /mesa $IMAGE env \
       PKG_CONFIG_PATH=/usr/local/lib/aarch64-linux-android/pkgconfig/:/android-ndk-r21d/toolchains/llvm/prebuilt/linux-x86_64/sysroot/usr/lib/aarch64-linux-android/pkgconfig/ \
       GALLIUM_DRIVERS=freedreno \
       UNWIND=disabled \
       EXTRA_OPTION="-D android-stub=true -D llvm=disabled" \
       DRI_LOADERS="-D glx=disabled -D gbm=disabled -D egl=enabled -D platforms=android" \
       CROSS=aarch64-linux-android \
       ./.gitlab-ci/meson-build.sh

All you have left over from the build is its output, and a ``_build``
directory. You can hack on Mesa and iterate testing the build with:

.. code-block:: sh

   sudo docker run --rm -v "$(pwd)":/mesa $IMAGE meson compile -C /mesa/_build

Running specific CI jobs
------------------------

You can use ``bin/ci/ci_run_n_monitor.py`` to run specific CI jobs. It
will automatically take care of running all the jobs that yours depend on,
and cancel the rest to avoid wasting resources.

See ``bin/ci/ci_run_n_monitor.py --help`` for all the options.

The ``--target`` argument takes a regex that you can use to select the
job names you want to run, e.g. ``--target 'zink.*'`` will run all the
Zink jobs, leaving the other drivers' jobs free for others to use.

Note that in fork pipelines, GitLab only adds the jobs for the files that have
changed **since the last push**, so you might not get the jobs you expect.
You can work around that by adding a dummy change in a file core to what you're
working on, making a new push with that change, and then removing that change
before you create the MR.

Conformance Tests
-----------------

Some conformance tests require special treatment to be maintained on GitLab CI.
This section lists their documentation pages.

.. toctree::
   :maxdepth: 1

   skqp


Updating GitLab CI Linux Kernel
-------------------------------

GitLab CI usually runs a bleeding-edge kernel. The following documentation has
instructions on how to uprev the Linux kernel in the GitLab CI ecosystem.

.. toctree::
   :maxdepth: 1

   kernel


Reusing CI scripts for other projects
--------------------------------------

The CI scripts in ``.gitlab-ci/`` can be reused for other projects. To
facilitate reuse of the infrastructure, our scripts can be used as tools
to create containers and run tests on the available farms.

.. envvar:: EXTRA_LOCAL_PACKAGES

   Define extra Debian packages to be installed in the container.
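
As a rough illustration, a project reusing these scripts might set this
variable in its own ``.gitlab-ci.yml``. This is only a sketch: the package
names below are hypothetical, and the space-separated list format is an
assumption, so check how the container build scripts consume the variable
before relying on it.

.. code-block:: yaml

   variables:
     # Hypothetical package names; the space-separated format is an assumption.
     EXTRA_LOCAL_PACKAGES: "libfoo-dev libbar-dev"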