  1. Dev - Nexus Repo
  2. NEXUS-19840

/service/metrics/healthcheck returns 500 status code if a single healthcheck fails

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 3.16.1
    • Fix Version/s: 3.29.0
    • Component/s: HA
    • Story Points: 2
    • Release Note: Yes
    • Notability: 4

      Description

      NXRM 3 implements server status checks using the Dropwizard Metrics library and exposes them at the /service/metrics/healthcheck endpoint.

      The purpose of this endpoint is to report server health status from the perspective of all implemented Server Status checks (the same list of statuses shown in the UI under Administration -> Support Tools -> Status).

      Problem

      If a single status check fails, the resource returns a 500 status code.

      Expected

      If this resource can render a JSON response (the code that processes status checks is active and the connector is on), it should NOT return a 500 status code, even if one or more status checks fail.

      If a JSON response can be rendered, a 200 status code is more appropriate for the NXRM 3 use case of this endpoint.

      All exceptions thrown by a single status check must be caught, and the JSON response is expected to include an indication of the caught error for that status check. A single failing status check should therefore never break the entire endpoint.
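
      The expected behavior can be sketched in plain Java as follows. This is a minimal illustration only, not the NXRM or Dropwizard implementation: the class HealthcheckSketch, its CheckResult and Response types, and the runAll and render methods are all hypothetical names, and the JSON shape (one object per check with "healthy" and "message" fields) is an assumption made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;

// Hypothetical sketch: every name here is illustrative, not an NXRM or Dropwizard API.
final class HealthcheckSketch {

    /** Outcome of one status check. */
    static final class CheckResult {
        final boolean healthy;
        final String message;

        CheckResult(boolean healthy, String message) {
            this.healthy = healthy;
            this.message = message;
        }
    }

    /** What the endpoint would send back: status code plus JSON body. */
    static final class Response {
        final int status;
        final String body;

        Response(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    /** Runs every check; an exception from one check becomes an unhealthy result for that check only. */
    static Map<String, CheckResult> runAll(Map<String, Callable<CheckResult>> checks) {
        Map<String, CheckResult> results = new LinkedHashMap<>();
        for (Map.Entry<String, Callable<CheckResult>> entry : checks.entrySet()) {
            CheckResult result;
            try {
                result = entry.getValue().call();
            } catch (Exception e) {
                // Record the caught error per status check instead of failing the whole request.
                result = new CheckResult(false, e.getClass().getSimpleName() + ": " + e.getMessage());
            }
            results.put(entry.getKey(), result);
        }
        return results;
    }

    /** Renders the results as JSON and answers 200 whenever rendering succeeds. */
    static Response render(Map<String, CheckResult> results) {
        StringBuilder json = new StringBuilder("{");
        String separator = "";
        for (Map.Entry<String, CheckResult> entry : results.entrySet()) {
            json.append(separator)
                .append('"').append(escape(entry.getKey())).append("\":{")
                .append("\"healthy\":").append(entry.getValue().healthy)
                .append(",\"message\":\"").append(escape(entry.getValue().message)).append("\"}");
            separator = ",";
        }
        json.append('}');
        return new Response(200, json.toString());
    }

    /** Minimal escaping of backslashes and double quotes for the sketch. */
    private static String escape(String s) {
        return s == null ? "" : s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

      With this shape, a caller reads each check's "healthy" flag in the body to find failures, and the HTTP status only says whether the endpoint itself was able to answer.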

      Additional Info

      Do not rely on this endpoint for node and cluster health, for example in a load balancer or for triggering automatic node host restarts, as may be the case in a Kubernetes deployment. That is not the intended use case of this endpoint.

              People

              Assignee:
              mpiggott Matthew Piggott
              Reporter:
              jkruger John Kruger
              Last Updated By:
              Peter Lynch Peter Lynch
              Team:
              NXRM - Groot
              Votes:
              1
              Watchers:
              6

