Details
- Bug
- Resolution: Unresolved
- Major
- None
- 3.34.1
- NXRM Immortals Sprint 42, NXRM Immortals Sprint 43, NXRM Immortals Sprint 44, NXRM Immortals Sprint 45
- 2
- 1
Description
Problem
The Admin - Change repository blob store task may be stopped before it has finished moving all blobs from source blobstore A to target blobstore B. Reasons for stopping include:
- unexpected task error
- Nexus Repo instance shutdown
The very first thing this task does when it starts running is change the repository blobstore from A to B. So when the task stops, the repository is configured to believe all of its blobs are in blobstore B, when in fact some are still left in blobstore A.
In this partially moved state, builds and API requests against the repository can fail, because an inbound request causes the repository to look only inside blobstore B for the matching blob. The left-behind blobs are essentially lost.
Furthermore, blob references in the database that cannot be matched to actual blobs in a blobstore may eventually be pruned from the database by other normal operations and tasks intended to bring the database into a healthy state (e.g. the reconcile task). For this reason, the partially moved state is dangerous to customer data.
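The failure mode described above can be illustrated with a toy model. This is a minimal sketch and not Nexus Repository code: the dictionaries, `change_blob_store`, `fetch`, `TaskStopped`, and the repository name `example-repo` are hypothetical names chosen for illustration. It assumes only what the description states: the repository is repointed to the target first, and lookups consult only the configured blobstore.

```python
class TaskStopped(Exception):
    """Simulates a task error or instance shutdown part-way through a move."""


def change_blob_store(repo, stores, target, stop_after=None):
    """Toy model: repoint the repository first, then move blobs one by one."""
    source = repo["blob_store"]
    repo["blob_store"] = target  # the repository now points at the target
    for moved, blob_id in enumerate(sorted(stores[source])):
        if stop_after is not None and moved >= stop_after:
            raise TaskStopped(f"stopped after {moved} blobs")
        stores[target][blob_id] = stores[source].pop(blob_id)


def fetch(repo, stores, blob_id):
    """Requests only consult the blobstore the repository is configured with."""
    return stores[repo["blob_store"]].get(blob_id)


# Hypothetical data: three blobs in blobstore A, none in blobstore B.
stores = {"A": {"b1": b"one", "b2": b"two", "b3": b"three"}, "B": {}}
repo = {"name": "example-repo", "blob_store": "A"}

try:
    change_blob_store(repo, stores, "B", stop_after=1)  # stop after one blob
except TaskStopped:
    pass

found_moved = fetch(repo, stores, "b1")      # blob that reached B
found_stranded = fetch(repo, stores, "b2")   # blob still sitting in A
```

In this model the repository is already configured for B when the move stops, so `b1` resolves while `b2` and `b3` remain in A and are invisible to requests.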
Existing recovery procedures known to be used in this situation are fragile. For example:
Recovery Option 1: Use Admin - Change repository blob store task in the opposite direction
One could try to run the task in the opposite direction to move the blobs now in B back to A. This risks encountering the same problem should the task stop again. Depending on why the task stopped in the first place (e.g. a task error), or why the task was used in the first place (e.g. the source blobstore was out of disk space), this may be impractical.
Recovery Option 2: Combination of manually executed database queries, move blobs out-of-band, and running rebuild tasks
This option is a one-off solution and as such carries its own risks. It can involve many individual risky steps.
Alternate Options
Use Case: When ALL repositories that are using a Source Blobstore are to be moved to a different Blobstore
Use Blobstore Groups as described at these links:
If choosing this option, please be aware of some known issues:
- NEXUS-26683 - "Remove a member from a blob store group" task can't remove a blob store if one bytes file is missing
- NEXUS-33897 - Remove member from blob store group sometimes does not move all content
Expected
There should be a more resilient process for changing a repository blobstore and moving its blobs. Stopping of the task cannot be avoided in all cases, so the process should expect and plan for this error state. The steps to recover from it need to be simple and effective, avoid data loss risks, and allow for fast recovery.
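As an illustration only, one possible shape for such a process is sketched below. It is not a description of how Nexus Repository works or will work. Everything in it is an assumption made for the example: the `moving_to` field, `move_blobs`, `fetch_with_fallback`, `MoveInterrupted`, the repository name `example-repo`, and the simplification that blobstore A holds blobs for a single repository. The idea shown is to record the intended target, move blobs copy-then-delete, repoint the repository only once the source is drained, and let reads fall back to the target while a move is pending, so that a stopped task can simply be rerun.

```python
class MoveInterrupted(Exception):
    """Simulates a task error or instance shutdown part-way through a move."""


def move_blobs(repo, stores, target, stop_after=None):
    """Hypothetical resumable move: record intent, move blobs, repoint last."""
    source = repo["blob_store"]
    if source == target:
        return  # a previous run already completed
    repo["moving_to"] = target  # recorded intent, so reads can fall back
    for moved, blob_id in enumerate(sorted(stores[source])):
        if stop_after is not None and moved >= stop_after:
            raise MoveInterrupted(f"stopped after {moved} blobs")
        stores[target][blob_id] = stores[source][blob_id]  # copy first
        del stores[source][blob_id]  # delete only after the copy exists
    repo["blob_store"] = target  # repoint only once the source is drained
    repo["moving_to"] = None


def fetch_with_fallback(repo, stores, blob_id):
    """Look in the configured blobstore, then in the pending target, if any."""
    for name in (repo["blob_store"], repo.get("moving_to")):
        if name is not None and blob_id in stores[name]:
            return stores[name][blob_id]
    return None


# Hypothetical data: three blobs in blobstore A, none in blobstore B.
stores = {"A": {"b1": b"one", "b2": b"two", "b3": b"three"}, "B": {}}
repo = {"name": "example-repo", "blob_store": "A", "moving_to": None}

try:
    move_blobs(repo, stores, "B", stop_after=1)  # first run is interrupted
except MoveInterrupted:
    pass

# Every blob is still readable while the move is incomplete.
readable_midway = all(
    fetch_with_fallback(repo, stores, b) is not None for b in ("b1", "b2", "b3")
)

move_blobs(repo, stores, "B")  # rerunning the same task resumes the move
```

In this sketch, recovery from a stopped task is a rerun of the same task, and no blob becomes unreachable in between.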
Issue Links
- causes
  - NEXUS-36239 block using the Change Repository blob store task (Closed)
- is related to
  - NEXUS-26974 Change Repo Blob Store repository.move task logged counts do not make sense (New)