Details
- Bug
- Resolution: Fixed
- Major
- 3.13.0
- None
Description
NEXUS-9605 introduced a feature that detects tasks without a valid last run state and corrects them, using the time NXRM was last shut down to estimate values such as the run duration. Unfortunately, if NXRM is in read-only mode or lacks quorum (in HA), the attempt to persist the updated last run state fails and aborts the startup process.
When NXRM is read-only:
2018-07-13 15:07:18,076+0100 WARN [FelixStartLevel] *SYSTEM org.sonatype.nexus.quartz.internal.QuartzSchedulerSPI - Updating lastRunState to interrupted for nexus.01e53b8b-869a-4866-859c-46a83c874919 taskConfig: {multinode=true, .name=Test, lastRunState.runStarted=1531490752646, .id=230c933c-df4e-4e16-a0e1-f7c3d5f158d5, .typeName=Admin - Execute script, language=groovy, source=while (true) { println 'ping' sleep(3000) }, .visible=true, .typeId=script, lastRunState.endState=INTERRUPTED, .updated=2018-07-12T21:44:24.353+01:00, .enabled=true, .message=Execute script, lastRunState.runDuration=0, .created=2018-07-12T20:54:55.533+01:00}
2018-07-13 15:07:18,098+0100 WARN [FelixStartLevel] *SYSTEM org.sonatype.nexus.quartz.internal.orient.JobStoreImpl - Execution failed
com.orientechnologies.common.concur.lock.OModificationOperationProhibitedException: Modification requests are prohibited DB name="config"
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at com.orientechnologies.orient.core.storage.impl.local.paginated.atomicoperations.OAtomicOperationsManager.throwFreezeExceptionIfNeeded(OAtomicOperationsManager.java:358)
    at com.orientechnologies.orient.core.storage.impl.local.paginated.atomicoperations.OAtomicOperationsManager.startAtomicOperation(OAtomicOperationsManager.java:197)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.startStorageTx(OAbstractPaginatedStorage.java:3910)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.commit(OAbstractPaginatedStorage.java:1799)
    at com.orientechnologies.orient.core.tx.OTransactionOptimistic.doCommit(OTransactionOptimistic.java:541)
    at com.orientechnologies.orient.core.tx.OTransactionOptimistic.commit(OTransactionOptimistic.java:99)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2908)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2870)
    at org.sonatype.nexus.orient.transaction.OrientTransaction.commit(OrientTransaction.java:83)
    at org.sonatype.nexus.transaction.TransactionalWrapper.proceedWithTransaction(TransactionalWrapper.java:67)
    at org.sonatype.nexus.transaction.Operations.transactional(Operations.java:200)
    at org.sonatype.nexus.transaction.Operations.call(Operations.java:146)
    at org.sonatype.nexus.orient.transaction.OrientOperations.call(OrientOperations.java:56)
    at org.sonatype.nexus.quartz.internal.orient.JobStoreImpl.execute(JobStoreImpl.java:202)
    at org.sonatype.nexus.quartz.internal.orient.JobStoreImpl.storeJob(JobStoreImpl.java:339)
    at org.quartz.core.QuartzScheduler.addJob(QuartzScheduler.java:938)
    at org.quartz.impl.StdScheduler.addJob(StdScheduler.java:273)
    at org.sonatype.nexus.quartz.internal.QuartzSchedulerSPI.updateLastRunStateInfo(QuartzSchedulerSPI.java:236)
    at org.sonatype.nexus.quartz.internal.QuartzSchedulerSPI.doStart(QuartzSchedulerSPI.java:201)
    at org.sonatype.nexus.common.stateguard.StateGuardLifecycleSupport.start(StateGuardLifecycleSupport.java:67)
    at org.sonatype.nexus.quartz.internal.QuartzSchedulerSPI$$EnhancerByGuice$$8d457f36.CGLIB$start$25(<generated>)
    at org.sonatype.nexus.quartz.internal.QuartzSchedulerSPI$$EnhancerByGuice$$8d457f36$$FastClassByGuice$$dd138cd.invoke(<generated>)
    at com.google.inject.internal.cglib.proxy.$MethodProxy.invokeSuper(MethodProxy.java:228)
    at com.google.inject.internal.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:76)
    at org.sonatype.nexus.common.stateguard.MethodInvocationAction.run(MethodInvocationAction.java:39)
    at org.sonatype.nexus.common.stateguard.StateGuard$TransitionImpl.run(StateGuard.java:191)
    at org.sonatype.nexus.common.stateguard.TransitionsInterceptor.invoke(TransitionsInterceptor.java:56)
    at com.google.inject.internal.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:77)
    at com.google.inject.internal.InterceptorStackCallback.intercept(InterceptorStackCallback.java:55)
    at org.sonatype.nexus.quartz.internal.QuartzSchedulerSPI$$EnhancerByGuice$$8d457f36.start(<generated>)
    at org.sonatype.nexus.extender.NexusLifecycleManager.startComponent(NexusLifecycleManager.java:155)
    at org.sonatype.nexus.extender.NexusLifecycleManager.to(NexusLifecycleManager.java:95)
    at org.sonatype.nexus.extender.NexusContextListener.frameworkEvent(NexusContextListener.java:191)
    at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1429)
    at org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:308)
    at java.lang.Thread.run(Thread.java:748)
When quorum is missing:
2018-07-12 21:09:33,230+0100 WARN [FelixStartLevel] *SYSTEM org.sonatype.nexus.quartz.internal.QuartzSchedulerSPI - Updating lastRunState to interrupted for nexus.01e53b8b-869a-4866-859c-46a83c874919 taskConfig: {multinode=true, .name=Test, lastRunState.runStarted=1531425302119, .id=230c933c-df4e-4e16-a0e1-f7c3d5f158d5, .typeName=Admin - Execute script, language=groovy, source=while (true) { println 'ping' sleep(3000) }, .visible=true, .typeId=script, lastRunState.endState=INTERRUPTED, .updated=2018-07-12T20:54:55.533+01:00, .enabled=true, lastRunState.runDuration=100881, .created=2018-07-12T20:54:55.533+01:00}
2018-07-12 21:09:33,302+0100 ERROR [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.db.OPartitionedDatabasePool$DatabaseDocumentTxPooled - Error on transaction commit `165EE0AD`
com.orientechnologies.orient.server.distributed.ODistributedException: Quorum (2) cannot be reached on server 'E776F470-EFCEF342-6732907B-3327B80D-4B260AAF' database 'config' because it is major than available nodes (1)
    at com.orientechnologies.orient.server.distributed.impl.ODistributedDatabaseImpl.calculateQuorum(ODistributedDatabaseImpl.java:1055)
    at com.orientechnologies.orient.server.distributed.impl.ODistributedDatabaseImpl.send2Nodes(ODistributedDatabaseImpl.java:430)
    at com.orientechnologies.orient.server.distributed.impl.ODistributedAbstractPlugin.sendRequest(ODistributedAbstractPlugin.java:589)
    at com.orientechnologies.orient.server.distributed.impl.ODistributedTransactionManager.commit(ODistributedTransactionManager.java:162)
    at com.orientechnologies.orient.server.distributed.impl.ODistributedStorage.commit(ODistributedStorage.java:1426)
    at com.orientechnologies.orient.core.tx.OTransactionOptimistic.doCommit(OTransactionOptimistic.java:541)
    at com.orientechnologies.orient.core.tx.OTransactionOptimistic.commit(OTransactionOptimistic.java:99)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2908)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2870)
    at org.sonatype.nexus.orient.transaction.OrientTransaction.commit(OrientTransaction.java:83)
    at org.sonatype.nexus.transaction.TransactionalWrapper.proceedWithTransaction(TransactionalWrapper.java:67)
    at org.sonatype.nexus.transaction.Operations.transactional(Operations.java:200)
    at org.sonatype.nexus.transaction.Operations.call(Operations.java:146)
    at org.sonatype.nexus.orient.transaction.OrientOperations.call(OrientOperations.java:56)
    at org.sonatype.nexus.quartz.internal.orient.JobStoreImpl.execute(JobStoreImpl.java:202)
    at org.sonatype.nexus.quartz.internal.orient.JobStoreImpl.storeJob(JobStoreImpl.java:339)
    at org.quartz.core.QuartzScheduler.addJob(QuartzScheduler.java:938)
    at org.quartz.impl.StdScheduler.addJob(StdScheduler.java:273)
    at org.sonatype.nexus.quartz.internal.QuartzSchedulerSPI.updateLastRunStateInfo(QuartzSchedulerSPI.java:232)
    at org.sonatype.nexus.quartz.internal.QuartzSchedulerSPI.doStart(QuartzSchedulerSPI.java:200)
    at org.sonatype.nexus.common.stateguard.StateGuardLifecycleSupport.start(StateGuardLifecycleSupport.java:67)
    at org.sonatype.nexus.quartz.internal.QuartzSchedulerSPI$$EnhancerByGuice$$8d0b0af4.CGLIB$start$25(<generated>)
    at org.sonatype.nexus.quartz.internal.QuartzSchedulerSPI$$EnhancerByGuice$$8d0b0af4$$FastClassByGuice$$974bccb.invoke(<generated>)
    at com.google.inject.internal.cglib.proxy.$MethodProxy.invokeSuper(MethodProxy.java:228)
    at com.google.inject.internal.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:76)
    at org.sonatype.nexus.common.stateguard.MethodInvocationAction.run(MethodInvocationAction.java:39)
    at org.sonatype.nexus.common.stateguard.StateGuard$TransitionImpl.run(StateGuard.java:191)
    at org.sonatype.nexus.common.stateguard.TransitionsInterceptor.invoke(TransitionsInterceptor.java:56)
    at com.google.inject.internal.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:77)
    at com.google.inject.internal.InterceptorStackCallback.intercept(InterceptorStackCallback.java:55)
    at org.sonatype.nexus.quartz.internal.QuartzSchedulerSPI$$EnhancerByGuice$$8d0b0af4.start(<generated>)
    at org.sonatype.nexus.extender.NexusLifecycleManager.startComponent(NexusLifecycleManager.java:155)
    at org.sonatype.nexus.extender.NexusLifecycleManager.to(NexusLifecycleManager.java:95)
    at org.sonatype.nexus.extender.NexusContextListener.frameworkEvent(NexusContextListener.java:191)
    at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1429)
    at org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:308)
    at java.lang.Thread.run(Thread.java:748)
Expected:
If the new last run state can't be persisted, NXRM should log a warning and continue starting up, so the admin can resolve the problem via the UI (either by unfreezing the instance or by resetting the quorum).
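The expected behaviour could be sketched roughly as follows. This is only an illustration, not the actual fix: `LastRunStateRecovery`, `JobPersister` and `updateLastRunStates` are hypothetical names standing in for the `updateLastRunStateInfo` / `storeJob` path seen in the traces above, and the exception thrown in `main` merely imitates the read-only failure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

public class LastRunStateRecovery {
  private static final Logger log = Logger.getLogger(LastRunStateRecovery.class.getName());

  /** Hypothetical stand-in for the job store write; not a real NXRM or Quartz type. */
  public interface JobPersister {
    void persist(String jobName) throws Exception;
  }

  /**
   * Attempts to persist the corrected last run state for each job.
   * A failed write is logged as a warning and skipped, so the caller
   * (startup) is never aborted by a read-only or quorum-less database.
   * Returns the names of the jobs whose state could not be saved.
   */
  public static List<String> updateLastRunStates(List<String> jobNames, JobPersister persister) {
    List<String> failed = new ArrayList<>();
    for (String name : jobNames) {
      try {
        persister.persist(name);
      } catch (Exception e) {
        // Warn and carry on instead of propagating the failure to the lifecycle manager.
        log.log(Level.WARNING, "Unable to persist last run state for " + name + "; continuing startup", e);
        failed.add(name);
      }
    }
    return failed;
  }

  public static void main(String[] args) {
    // Simulates a frozen (read-only) database: every write is rejected.
    JobPersister readOnly = name -> {
      throw new IllegalStateException("Modification requests are prohibited");
    };
    List<String> failed = updateLastRunStates(Arrays.asList("nexus.task-a", "nexus.task-b"), readOnly);
    System.out.println("startup continued; unsaved jobs: " + failed);
  }
}
```

Returning the list of unsaved jobs is a design choice made for this sketch only; it would let the caller retry once the instance is unfrozen or quorum is restored.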
Issue Links
- relates
- NEXUS-18983 If NXRM is read-only or lacks quorum, then run now triggers make startup fail.
- Closed