Dev - Nexus Repo / NEXUS-26938

Use a HEAD request instead of a conditional GET to determine whether remote content has changed, to avoid hitting the Docker Hub rate limit prematurely

    Details

    • Story Points:
      5
    • Sprint:
      NXRM Rocket Sprint 5, NXRM Rocket Sprint 6
    • Notability:
      n/a

      Description

      As an NXRM user, I would like to avoid hitting the Docker Hub rate limit as much as possible. When NXRM checks the remote for content existence or changes, it should therefore always use a HEAD request rather than a (conditional) GET request.

      History

      Since November 1, 2020, Docker Hub has rate-limited pulls: 100 pulls per 6 hours for anonymous users and 200 pulls per 6 hours for authenticated users on the free plan.

      An NXRM proxy repository acts as a cache. Proxy repositories keep track of:

      • cached assets downloaded from the remote (stored in the repository blob store)
      • cached requests that missed (not found) at the remote (stored in an internal, in-memory, per-repository not-found cache)

      These caches have expiries. When a cached item expires, the next inbound request for that asset will trigger an outbound request by NXRM to check the remote for new information.

      A Docker asset can expire in a number of ways. Examples:

      • an explicit Invalidate Cache on the proxy repository
      • an explicit Invalidate Cache on a group repository that contains the proxy repository as a member
      • an NXRM restart, which clears the in-memory not-found cache
      • the asset is a Docker tag pointing at a manifest and is present in storage, and the metadata max age has elapsed for that specific asset
      • the path is stored in the Negative Cache as "not found" and the Negative Cache TTL has expired for that specific asset
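
      The expiry rules above can be sketched roughly as follows. This is an illustrative model only, not NXRM's actual implementation: the `CacheEntry` record, its field names, and the `needs_remote_check` function are hypothetical, and the two TTL parameters stand in for the metadata max age and Negative Cache TTL settings mentioned in the list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheEntry:
    """Hypothetical record for one cached path (not NXRM's real data model)."""
    last_verified: float        # epoch seconds of the last remote check
    found: bool                 # True: asset in blob store; False: negative-cache entry
    invalidated: bool = False   # set by an explicit Invalidate Cache action


def needs_remote_check(entry: Optional[CacheEntry], now: float,
                       metadata_max_age: float,
                       negative_cache_ttl: float) -> bool:
    """Return True when the next inbound request should trigger an outbound check."""
    if entry is None:
        # Never seen, or an in-memory not-found entry lost on restart.
        return True
    if entry.invalidated:
        return True
    # Stored assets age out by metadata max age, misses by the negative-cache TTL.
    ttl = metadata_max_age if entry.found else negative_cache_ttl
    return now - entry.last_verified >= ttl
```

      Every `True` result here corresponds to one outbound request to the remote, which is why the choice of HTTP method for that request matters for rate limiting.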

      Problem

      NXRM sends conditional GET requests to check for updates to already cached content.

      Conditional GET requests count towards the Docker Hub rate limits, even if they return 304 Not Modified.

      Reference

      https://www.docker.com/pricing/resource-consumption-updates

      How is a pull request defined for purposes of rate limiting?
      A pull request is up to two GET requests to the registry URL path ‘/v2//manifests/’.

      This accounts for the fact that container pull requests for multi-arch images require a manifest list to be downloaded followed by the actual image manifest for the required architecture. HEAD requests are not counted.

      Note that all pull requests, including ones for images you already have, are counted by this method. This is the trade-off for not counting individual layers.
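
      As a rough illustration of the quoted rule, the sketch below classifies requests by whether they would be charged. The path pattern `/v2/<name>/manifests/<reference>` is an assumption based on the Docker Registry HTTP API, and both function names are hypothetical. Because one pull may cover up to two manifest GETs, the result is an upper bound on pulls rather than an exact count.

```python
import re

# Assumed manifest path shape: /v2/<repository name>/manifests/<tag or digest>
MANIFEST_PATH = re.compile(r"^/v2/.+/manifests/[^/]+$")


def counts_toward_limit(method: str, path: str) -> bool:
    """True for GET requests on a manifest path; HEAD and blob requests are not counted."""
    return method.upper() == "GET" and MANIFEST_PATH.match(path) is not None


def counted_requests(log) -> int:
    """Count the (method, path) pairs in a request log that the rule would charge."""
    return sum(1 for method, path in log if counts_toward_limit(method, path))
```

      A conditional GET is still a GET, so it is charged under this rule regardless of whether the response is 200 or 304.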

      Expected

      When NXRM checks the remote for Docker manifest existence or changes, it should always use a HEAD request rather than a (conditional) GET request.
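
      A minimal sketch of the expected behavior, under stated assumptions: the remote returns a `Docker-Content-Digest` header on manifest HEAD responses (as the Docker Registry HTTP API describes), and the cached digest was recorded when the manifest was first stored. The `refresh_manifest` function and the injected `send` transport are hypothetical names chosen so the example runs without network access; this is not NXRM's actual code.

```python
from typing import Callable, Mapping, Optional, Tuple

# A transport takes (method, url) and returns (status, headers, body).
Transport = Callable[[str, str], Tuple[int, Mapping[str, str], bytes]]


def refresh_manifest(url: str,
                     cached_digest: Optional[str],
                     cached_body: Optional[bytes],
                     send: Transport) -> Tuple[Optional[str], Optional[bytes]]:
    """Check the remote with HEAD and issue a GET only if the digest changed."""
    status, headers, _ = send("HEAD", url)
    if status == 404:
        return None, None  # manifest no longer exists at the remote
    if status != 200:
        raise RuntimeError(f"unexpected HEAD status {status}")

    remote_digest = headers.get("Docker-Content-Digest")
    if remote_digest is not None and remote_digest == cached_digest:
        # Unchanged: serve from cache without spending a counted GET.
        return cached_digest, cached_body

    # Changed, never cached, or no digest header: fetch the manifest.
    status, headers, body = send("GET", url)
    if status != 200:
        raise RuntimeError(f"unexpected GET status {status}")
    return headers.get("Docker-Content-Digest", remote_digest), body
```

      With this flow, an unchanged manifest costs one uncounted HEAD, and only a changed or missing manifest costs a counted GET. A real client would also need case-insensitive header lookup and consistent `Accept` headers on both requests; both are omitted here for brevity.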

              People

              Assignee:
              iudovika Igor Udovika
              Reporter:
              dsawa Dawid Sawa
              Last Updated By:
              Rich Seddon Rich Seddon
              Team:
              NXRM - Rocket Raccoon
              Votes:
              3
              Watchers:
              6

