NEXUS-18736

Blobstore missing contents during HA cluster startup can cause nodes to go into error state.

    Details

      Description

      In a three-node HA cluster running Nexus Repo 3.13.0, all nodes were brought down. While the cluster was not running, the contents of one of the blob stores were accidentally wiped out.

      On the first node, the following was observed:

      2018-12-20 16:40:12,076-0600 INFO [qtp135203356-428] admin org.sonatype.nexus.blobstore.file.internal.BlobStoreMetricsStoreImpl - Blob store metrics file /cme/nexus_blobs/blobstores/npmjms/B94DEBDE-0CA96E7B-F9521710-CC9C6A23-690221A2-metrics.properties not found - initializing at zero.
      2018-12-20 16:40:36,651-0600 INFO [qtp135203356-428] admin org.sonatype.nexus.blobstore.file.internal.BlobStoreMetricsStoreImpl - Blob store metrics file /cme/nexus_blobs/blobstores/blobstore_npmjms/B94DEBDE-0CA96E7B-F9521710-CC9C6A23-690221A2-metrics.properties not found - initializing at zero.
      2018-12-20 16:41:09,403-0600 INFO [qtp135203356-428] admin org.sonatype.nexus.blobstore.file.internal.BlobStoreMetricsStoreImpl - Blob store metrics file /cme/nexus_blobs/blobstores/blobstore_npm/B94DEBDE-0CA96E7B-F9521710-CC9C6A23-690221A2-metrics.properties not found - initializing at zero.

      But the other two nodes hit NullPointerExceptions in an event handler:

      2018-12-20 16:40:27,417-0600 WARN [OrientDB DistributedWorker node=142AB0E3-E31A2E42-419FB5CC-F0ED04FC-58CC8165 db=config id=0] *SYSTEM org.sonatype.nexus.repository.internal.blobstore.BlobStoreManagerImpl - delete blob store from remote event failed: npmjms
      org.sonatype.nexus.blobstore.api.BlobStoreException: BlobId: null, java.lang.NullPointerException
      at org.sonatype.nexus.blobstore.file.FileBlobStore.remove(FileBlobStore.java:761)
      at org.sonatype.nexus.common.stateguard.MethodInvocationAction.run(MethodInvocationAction.java:39)
      at org.sonatype.nexus.common.stateguard.StateGuard$GuardImpl.run(StateGuard.java:270)
      at org.sonatype.nexus.common.stateguard.GuardedInterceptor.invoke(GuardedInterceptor.java:53)
      at org.sonatype.nexus.repository.internal.blobstore.BlobStoreManagerImpl.lambda$3(BlobStoreManagerImpl.java:266)
      at org.sonatype.nexus.repository.internal.blobstore.BlobStoreManagerImpl.handleRemoteOnly(BlobStoreManagerImpl.java:282)
      at org.sonatype.nexus.repository.internal.blobstore.BlobStoreManagerImpl.on(BlobStoreManagerImpl.java:259)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:498)
      at com.google.common.eventbus.Subscriber.invokeSubscriberMethod(Subscriber.java:87)
      at com.google.common.eventbus.Subscriber$SynchronizedSubscriber.invokeSubscriberMethod(Subscriber.java:144)
      at com.google.common.eventbus.Subscriber$1.run(Subscriber.java:72)
      at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:398)
      at com.google.common.eventbus.Subscriber.dispatchEvent(Subscriber.java:67)
      at com.google.common.eventbus.Dispatcher$ImmediateDispatcher.dispatch(Dispatcher.java:186)
      at com.google.common.eventbus.EventBus.post(EventBus.java:212)
      at org.sonatype.nexus.internal.event.EventManagerImpl.post(EventManagerImpl.java:127)
      at org.sonatype.nexus.orient.entity.EntityHook.postEvents(EntityHook.java:272)
      at org.sonatype.nexus.orient.entity.EntityHook.lambda$2(EntityHook.java:252)
      at org.sonatype.nexus.common.event.EventHelper.asReplicating(EventHelper.java:61)
      at org.sonatype.nexus.orient.entity.EntityHook.flushEvents(EntityHook.java:252)
      at org.sonatype.nexus.orient.entity.EntityHook.onAfterTxCommit(EntityHook.java:169)
      at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2949)
      at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2870)
      at com.orientechnologies.orient.server.distributed.impl.task.OTxTask.execute(OTxTask.java:132)
      at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin$1.call(ODistributedAbstractPlugin.java:609)
      at com.orientechnologies.orient.core.db.OScenarioThreadLocal.executeAsDistributed(OScenarioThreadLocal.java:70)
      at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin.executeOnLocalNode(ODistributedAbstractPlugin.java:605)
      at com.sonatype.nexus.hazelcast.internal.orient.SharedHazelcastPlugin.lambda$0(SharedHazelcastPlugin.java:86)
      at org.sonatype.nexus.orient.entity.EntityHook.asRemote(EntityHook.java:90)
      at com.sonatype.nexus.hazelcast.internal.orient.SharedHazelcastPlugin.executeOnLocalNode(SharedHazelcastPlugin.java:86)
      at com.orientechnologies.orient.server.distributed.impl.ODistributedWorker.onMessage(ODistributedWorker.java:359)
      at com.orientechnologies.orient.server.distributed.impl.ODistributedWorker.run(ODistributedWorker.java:127)
      Caused by: java.lang.NullPointerException: null
      at java.util.Arrays.stream(Arrays.java:5004)
      at java.util.stream.Stream.of(Stream.java:1000)
      at org.sonatype.nexus.blobstore.file.FileBlobStore.remove(FileBlobStore.java:748)
      ... 34 common frames omitted
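
The "Caused by" frames above (Stream.of calling Arrays.stream) are what the JDK produces when a null array is passed to Stream.of. One common source of a null array is File.listFiles(), which returns null when the path is not an existing directory. Whether FileBlobStore.remove actually lists a directory this way is an assumption; the trace only shows that a null array reached Stream.of. The class and method names below (MissingDirListing, listUnsafe, listSafe) are hypothetical and only illustrate the failure shape and one possible guard.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MissingDirListing {

  // Unguarded: if dir does not exist, listFiles() returns null and
  // Stream.of(null array) throws NullPointerException inside Arrays.stream.
  static List<String> listUnsafe(File dir) {
    return Stream.of(dir.listFiles())
        .map(File::getName)
        .collect(Collectors.toList());
  }

  // Guarded: treat a missing directory as "nothing to list".
  static List<String> listSafe(File dir) {
    File[] files = dir.listFiles();
    if (files == null) {
      return Collections.emptyList();
    }
    return Arrays.stream(files)
        .map(File::getName)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Hypothetical path standing in for a wiped blob store directory.
    File missing = new File(System.getProperty("java.io.tmpdir"),
        "wiped-blobstore-" + System.nanoTime());

    try {
      listUnsafe(missing);
    } catch (NullPointerException e) {
      System.out.println("unguarded listing threw NullPointerException");
    }
    System.out.println("guarded listing size: " + listSafe(missing).size());
  }
}
```

If the real cause is along these lines, a null check of this kind would let the remote delete event complete on a node whose blob store directory no longer exists, rather than failing with a BlobStoreException.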

      After this, the blob store in question was stuck in an error state:

      2018-12-20 23:33:44,127-0600 ERROR [pool-22-thread-9] admin org.sonatype.nexus.extdirect.internal.ExtDirectServlet - Failed to invoke action method: coreui_Blobstore.read, java-method: org.sonatype.nexus.coreui.BlobStoreComponent.read
      org.sonatype.nexus.common.stateguard.InvalidStateException: Invalid state: STOPPED; allowed: [STARTED]
      at org.sonatype.nexus.common.stateguard.StateGuard._ensure(StateGuard.java:115)
      at org.sonatype.nexus.common.stateguard.StateGuard.access$1(StateGuard.java:108)
      at org.sonatype.nexus.common.stateguard.StateGuard$GuardImpl.run(StateGuard.java:269)
      at org.sonatype.nexus.common.stateguard.GuardedInterceptor.invoke(GuardedInterceptor.java:53)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:498)
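
The InvalidStateException above comes from a state guard: the guarded method only runs while the component is in an allowed state. The following is a simplified sketch of that pattern, not the Nexus StateGuard implementation. GuardSketch, GuardedStore, and the use of IllegalStateException are all hypothetical stand-ins. It shows only why every read fails once a store is left in STOPPED; the logs here do not show which transition left the real blob store in that state.

```java
import java.util.EnumSet;
import java.util.Set;

public class GuardSketch {

  enum State { NEW, STARTED, STOPPED }

  // Hypothetical stand-in for a guarded blob store component.
  static class GuardedStore {
    private State state = State.NEW;

    void start() { state = State.STARTED; }

    void stop() { state = State.STOPPED; }

    // Reject the call unless the current state is one of the allowed states.
    private void ensure(Set<State> allowed) {
      if (!allowed.contains(state)) {
        throw new IllegalStateException(
            "Invalid state: " + state + "; allowed: " + allowed);
      }
    }

    // Guarded operation: only permitted while STARTED.
    String read() {
      ensure(EnumSet.of(State.STARTED));
      return "metrics";
    }
  }

  public static void main(String[] args) {
    GuardedStore store = new GuardedStore();
    store.start();
    System.out.println(store.read());

    // Once the store is stopped and never restarted, every read is rejected.
    store.stop();
    try {
      store.read();
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```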

      This should not have happened. Reinitializing a blob store in an HA cluster should just work.

            People

            Assignee: Michael Bucher (mbucher)
            Reporter: Rich Seddon (rseddon)
            Last Updated By: Peter Lynch
            Team: NXRM - Morpheus
            Votes: 0
            Watchers: 3
