NEXUS-17233 (Dev - Nexus Repo)

Restarting while backup is in progress leaves NXRM as read-only

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.14.0
    • Component/s: Backup, Database
    • Labels:
    • Story Points: 5

      Description

      If you restart a node while it's taking a backup then the node will preserve the read-only state, despite the backup task having been aborted. The following log entry during restart is where it restores the frozen state:

      2018-05-28 02:17:08,146+0100 INFO  [FelixStartLevel] 127.0.0.1 *SYSTEM org.sonatype.nexus.orient.internal.freeze.DatabaseFreezeServiceImpl - Restoring database frozen state on startup
      

      The first thing to determine is whether this is acceptable behaviour. While it leaves NXRM in a degraded state (only returning previously cached content, disallowing write operations), it could be considered the safe option given that NXRM was shut down during a backup. It is also not difficult to return NXRM to a writable state, using the UI or REST. If we decide this is "working-as-designed" then this ticket will verify that manual intervention can quickly return NXRM to normal service. We should also verify that the behaviour is the same whether NXRM is clustered or non-clustered.

      However, if it's determined that this is not acceptable behaviour then this ticket will look at how to avoid leaving the freeze state in place when the backup task was aborted during shutdown. This may involve making sure we update the freeze state regardless of how the backup task ends. Note that if the process is forcibly killed (e.g. power cut) then we won't get any chance to update it, but in that case it might be best to start as read-only as there may be data integrity issues. Another option would be to try to detect when the backup was aborted and skip restoring the freeze state, but that sounds more fragile. Other suggestions are welcome.
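
To illustrate the "update the freeze state regardless of how the backup task ends" option, here is a minimal Java sketch using try/finally. The class BackupFreezeSketch, the FreezeState type and the runBackup method are hypothetical names invented for this illustration; they are not the NXRM DatabaseFreezeService API, and the real fix may look quite different.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: none of these types are part of the NXRM codebase.
public class BackupFreezeSketch {

  /** Hypothetical stand-in for whatever persists the frozen flag and its reason. */
  static class FreezeState {
    // null means "not frozen"; otherwise holds the reason for the freeze
    private final AtomicReference<String> reason = new AtomicReference<>(null);

    void freeze(String why) { reason.set(why); }
    void release() { reason.set(null); }
    boolean isFrozen() { return reason.get() != null; }
    String reason() { return reason.get(); }
  }

  /** Freezes, runs the backup, and always releases the freeze afterwards. */
  static void runBackup(FreezeState state, Runnable backup) {
    state.freeze("backup task");
    try {
      backup.run();
    } finally {
      // Runs on normal completion and when the backup throws (e.g. aborted
      // during an orderly shutdown). It does NOT run if the JVM is killed,
      // so in that case the node would still come back up read-only.
      state.release();
    }
  }

  public static void main(String[] args) {
    FreezeState state = new FreezeState();
    try {
      runBackup(state, () -> {
        throw new IllegalStateException("backup aborted by shutdown");
      });
    } catch (IllegalStateException e) {
      // expected in this demo: the simulated backup was aborted
    }
    System.out.println("frozen after aborted backup: " + state.isFrozen());
  }
}
```

With this shape an aborted backup leaves the flag cleared, while a forced kill leaves it set, which matches the "start as read-only after a hard kill" behaviour discussed above. Storing a reason alongside the flag is one possible way to support the extra logging mentioned in this ticket.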

      While recreating these scenarios (restarting while non-clustered backup is in progress and restarting while clustered backup is in progress) also consider whether extra logging would be useful.

      Summary:

      • Confirm restarting while non-clustered backup is in progress leaves NXRM as read-only
      • Confirm restarting while clustered backup is in progress leaves NXRM as read-only
      • Get input from team / PO about desired behaviour
      • Implement any change in behaviour (if necessary)
      • Consider additional logging / recording reason for freeze

            People

            Assignee:
            mbucher Michael Bucher
            Reporter:
            mcculls Stuart McCulloch
            Last Updated By:
            Peter Lynch
            Team:
            NXRM - Morpheus
            Votes:
            0
            Watchers:
            5

