Uploaded image for project: 'Dev - Nexus Repo'
  1. Dev - Nexus Repo
  2. NEXUS-15702

possible deadlock when OrientDB transaction rollback fails in OAtomicOperationsManager.endAtomicOperation() if asset download counts are enabled

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.7.1, 3.9.0, 3.12.0
    • Fix Version/s: 3.13.0
    • Component/s: Database
    • Labels:

      Description

      If asset download counts feature is enabled, there is a rare possibility that low level OrientDB code will deadlock when a transaction rollback fails. This was noticed in an exceptional case where a customer DB had some corruption.

      the external tracking ticket is https://www.prjhub.com/#/issues/9735

      First an exception on doing write

      com.orientechnologies.orient.core.exception.OPaginatedClusterException: Error during record creation
      	DB name="component"
      	Component Name="assetdownloadcount"
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.createSinglePageRecord(OPaginatedCluster.java:634)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.createDataRecord(OPaginatedCluster.java:511)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.createRecord(OPaginatedCluster.java:476)
      	at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.doCreateRecord(OAbstractPaginatedStorage.java:3841)
      	at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.commitEntry(OAbstractPaginatedStorage.java:4472)
      	at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.commit(OAbstractPaginatedStorage.java:1802)
      	at com.orientechnologies.orient.core.tx.OTransactionOptimistic.doCommit(OTransactionOptimistic.java:541)
      	at com.orientechnologies.orient.core.tx.OTransactionOptimistic.commit(OTransactionOptimistic.java:99)
      	at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2907)
      	at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2876)
      	at org.sonatype.nexus.orient.transaction.OrientTransaction.commit(OrientTransaction.java:83)
      	at org.sonatype.nexus.transaction.TransactionalWrapper.proceedWithTransaction(TransactionalWrapper.java:67)
      	at org.sonatype.nexus.transaction.Operations.transactional(Operations.java:200)
      	at org.sonatype.nexus.transaction.Operations.run(Operations.java:155)
      	at org.sonatype.nexus.orient.transaction.OrientOperations.run(OrientOperations.java:63)
      	at org.sonatype.nexus.repository.assetdownloadcount.internal.AssetDownloadCountStoreImpl.setMonthlyVulnerableCount(AssetDownloadCountStoreImpl.java:174)
      	at org.sonatype.nexus.repository.assetdownloadcount.internal.AssetDownloadCountStoreImpl$$EnhancerByGuice$$46e3783b.CGLIB$setMonthlyVulnerableCount$7(<generated>)
      	at org.sonatype.nexus.repository.assetdownloadcount.internal.AssetDownloadCountStoreImpl$$EnhancerByGuice$$46e3783b$$FastClassByGuice$$1be47c0c.invoke(<generated>)
      	at com.google.inject.internal.cglib.proxy.$MethodProxy.invokeSuper(MethodProxy.java:228)
      	at com.google.inject.internal.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:76)
      	at org.sonatype.nexus.common.stateguard.MethodInvocationAction.run(MethodInvocationAction.java:39)
      	at org.sonatype.nexus.common.stateguard.StateGuard$GuardImpl.run(StateGuard.java:270)
      	at org.sonatype.nexus.common.stateguard.GuardedInterceptor.invoke(GuardedInterceptor.java:53)
      	at com.google.inject.internal.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:77)
      	at com.google.inject.internal.InterceptorStackCallback.intercept(InterceptorStackCallback.java:55)
      	at org.sonatype.nexus.repository.assetdownloadcount.internal.AssetDownloadCountStoreImpl$$EnhancerByGuice$$46e3783b.setMonthlyVulnerableCount(<generated>)
      	at com.sonatype.nexus.plugins.healthcheck.task.HealthCheckTask.putDownloadCountsInDb(HealthCheckTask.java:441)
      	at com.sonatype.nexus.plugins.healthcheck.task.HealthCheckTask.doDownload(HealthCheckTask.java:420)
      	at com.sonatype.nexus.plugins.healthcheck.task.HealthCheckTask.doRunSafe(HealthCheckTask.java:242)
      	at com.sonatype.nexus.plugins.healthcheck.task.HealthCheckTask.execute(HealthCheckTask.java:132)
      	at com.sonatype.nexus.plugins.healthcheck.task.HealthCheckTask.execute(HealthCheckTask.java:1)
      	at org.sonatype.nexus.scheduling.TaskSupport.call(TaskSupport.java:94)
      	at org.sonatype.nexus.quartz.internal.task.QuartzTaskJob.doExecute(QuartzTaskJob.java:145)
      	at org.sonatype.nexus.quartz.internal.task.QuartzTaskJob.execute(QuartzTaskJob.java:108)
      	at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
      	at org.sonatype.nexus.thread.internal.MDCAwareRunnable.run(MDCAwareRunnable.java:40)
      	at org.apache.shiro.subject.support.SubjectRunnable.doRun(SubjectRunnable.java:120)
      	at org.apache.shiro.subject.support.SubjectRunnable.run(SubjectRunnable.java:108)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      Caused by: java.lang.IllegalArgumentException: null
      	at java.nio.Buffer.position(Buffer.java:244)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.OWALPageChangesPortion.readData(OWALPageChangesPortion.java:226)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.OWALPageChangesPortion.getIntValue(OWALPageChangesPortion.java:85)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.base.ODurablePage.getIntValue(ODurablePage.java:116)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.OClusterPage.appendRecord(OClusterPage.java:103)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.addEntry(OPaginatedCluster.java:1928)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.createSinglePageRecord(OPaginatedCluster.java:625)
      	... 42 common frames omitted
      

      And then a rollback is attempted, which also fails

      2018-01-11 15:52:24,610-0500 ERROR [quartz-7-thread-1] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=component}} Exception `52744947` in storage `component`
      com.orientechnologies.orient.core.exception.OPaginatedClusterException: Error during record deletion
      	DB name="component"
      	Component Name="assetdownloadcount"
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.deleteRecord(OPaginatedCluster.java:891)
      	at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.doCreateRecord(OAbstractPaginatedStorage.java:3858)
      	at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.commitEntry(OAbstractPaginatedStorage.java:4472)
      	at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.commit(OAbstractPaginatedStorage.java:1802)
      	at com.orientechnologies.orient.core.tx.OTransactionOptimistic.doCommit(OTransactionOptimistic.java:541)
      	at com.orientechnologies.orient.core.tx.OTransactionOptimistic.commit(OTransactionOptimistic.java:99)
      	at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2907)
      	at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2876)
      	at org.sonatype.nexus.orient.transaction.OrientTransaction.commit(OrientTransaction.java:83)
      	at org.sonatype.nexus.transaction.TransactionalWrapper.proceedWithTransaction(TransactionalWrapper.java:67)
      	at org.sonatype.nexus.transaction.Operations.transactional(Operations.java:200)
      	at org.sonatype.nexus.transaction.Operations.run(Operations.java:155)
      	at org.sonatype.nexus.orient.transaction.OrientOperations.run(OrientOperations.java:63)
      	at org.sonatype.nexus.repository.assetdownloadcount.internal.AssetDownloadCountStoreImpl.setMonthlyVulnerableCount(AssetDownloadCountStoreImpl.java:174)
      	at org.sonatype.nexus.repository.assetdownloadcount.internal.AssetDownloadCountStoreImpl$$EnhancerByGuice$$46e3783b.CGLIB$setMonthlyVulnerableCount$7(<generated>)
      	at org.sonatype.nexus.repository.assetdownloadcount.internal.AssetDownloadCountStoreImpl$$EnhancerByGuice$$46e3783b$$FastClassByGuice$$1be47c0c.invoke(<generated>)
      	at com.google.inject.internal.cglib.proxy.$MethodProxy.invokeSuper(MethodProxy.java:228)
      	at com.google.inject.internal.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:76)
      	at org.sonatype.nexus.common.stateguard.MethodInvocationAction.run(MethodInvocationAction.java:39)
      	at org.sonatype.nexus.common.stateguard.StateGuard$GuardImpl.run(StateGuard.java:270)
      	at org.sonatype.nexus.common.stateguard.GuardedInterceptor.invoke(GuardedInterceptor.java:53)
      	at com.google.inject.internal.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:77)
      	at com.google.inject.internal.InterceptorStackCallback.intercept(InterceptorStackCallback.java:55)
      	at org.sonatype.nexus.repository.assetdownloadcount.internal.AssetDownloadCountStoreImpl$$EnhancerByGuice$$46e3783b.setMonthlyVulnerableCount(<generated>)
      	at com.sonatype.nexus.plugins.healthcheck.task.HealthCheckTask.putDownloadCountsInDb(HealthCheckTask.java:441)
      	at com.sonatype.nexus.plugins.healthcheck.task.HealthCheckTask.doDownload(HealthCheckTask.java:420)
      	at com.sonatype.nexus.plugins.healthcheck.task.HealthCheckTask.doRunSafe(HealthCheckTask.java:242)
      	at com.sonatype.nexus.plugins.healthcheck.task.HealthCheckTask.execute(HealthCheckTask.java:132)
      	at com.sonatype.nexus.plugins.healthcheck.task.HealthCheckTask.execute(HealthCheckTask.java:1)
      	at org.sonatype.nexus.scheduling.TaskSupport.call(TaskSupport.java:94)
      	at org.sonatype.nexus.quartz.internal.task.QuartzTaskJob.doExecute(QuartzTaskJob.java:145)
      	at org.sonatype.nexus.quartz.internal.task.QuartzTaskJob.execute(QuartzTaskJob.java:108)
      	at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
      	at org.sonatype.nexus.thread.internal.MDCAwareRunnable.run(MDCAwareRunnable.java:40)
      	at org.apache.shiro.subject.support.SubjectRunnable.doRun(SubjectRunnable.java:120)
      	at org.apache.shiro.subject.support.SubjectRunnable.run(SubjectRunnable.java:108)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      Caused by: com.orientechnologies.orient.core.storage.impl.local.paginated.atomicoperations.ONestedRollbackException: Atomic operation was rolled back by internal component, exception which caused this rollback is :
      com.orientechnologies.orient.core.exception.OPaginatedClusterException: Error during record creation
      	DB name="component"
      	Component Name="assetdownloadcount"
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.createSinglePageRecord(OPaginatedCluster.java:634)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.createDataRecord(OPaginatedCluster.java:511)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.createRecord(OPaginatedCluster.java:476)
      	at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.doCreateRecord(OAbstractPaginatedStorage.java:3841)
      	at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.commitEntry(OAbstractPaginatedStorage.java:4472)
      	at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.commit(OAbstractPaginatedStorage.java:1802)
      	at com.orientechnologies.orient.core.tx.OTransactionOptimistic.doCommit(OTransactionOptimistic.java:541)
      	at com.orientechnologies.orient.core.tx.OTransactionOptimistic.commit(OTransactionOptimistic.java:99)
      	at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2907)
      	at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2876)
      	at org.sonatype.nexus.orient.transaction.OrientTransaction.commit(OrientTransaction.java:83)
      	at org.sonatype.nexus.transaction.TransactionalWrapper.proceedWithTransaction(TransactionalWrapper.java:67)
      	at org.sonatype.nexus.transaction.Operations.transactional(Operations.java:200)
      	at org.sonatype.nexus.transaction.Operations.run(Operations.java:155)
      	at org.sonatype.nexus.orient.transaction.OrientOperations.run(OrientOperations.java:63)
      	at org.sonatype.nexus.repository.assetdownloadcount.internal.AssetDownloadCountStoreImpl.setMonthlyVulnerableCount(AssetDownloadCountStoreImpl.java:174)
      	at org.sonatype.nexus.repository.assetdownloadcount.internal.AssetDownloadCountStoreImpl$$EnhancerByGuice$$46e3783b.CGLIB$setMonthlyVulnerableCount$7(<generated>)
      	at org.sonatype.nexus.repository.assetdownloadcount.internal.AssetDownloadCountStoreImpl$$EnhancerByGuice$$46e3783b$$FastClassByGuice$$1be47c0c.invoke(<generated>)
      	at com.google.inject.internal.cglib.proxy.$MethodProxy.invokeSuper(MethodProxy.java:228)
      	at com.google.inject.internal.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:76)
      	at org.sonatype.nexus.common.stateguard.MethodInvocationAction.run(MethodInvocationAction.java:39)
      	at org.sonatype.nexus.common.stateguard.StateGuard$GuardImpl.run(StateGuard.java:270)
      	at org.sonatype.nexus.common.stateguard.GuardedInterceptor.invoke(GuardedInterceptor.java:53)
      	at com.google.inject.internal.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:77)
      	at com.google.inject.internal.InterceptorStackCallback.intercept(InterceptorStackCallback.java:55)
      	at org.sonatype.nexus.repository.assetdownloadcount.internal.AssetDownloadCountStoreImpl$$EnhancerByGuice$$46e3783b.setMonthlyVulnerableCount(<generated>)
      	at com.sonatype.nexus.plugins.healthcheck.task.HealthCheckTask.putDownloadCountsInDb(HealthCheckTask.java:441)
      	at com.sonatype.nexus.plugins.healthcheck.task.HealthCheckTask.doDownload(HealthCheckTask.java:420)
      	at com.sonatype.nexus.plugins.healthcheck.task.HealthCheckTask.doRunSafe(HealthCheckTask.java:242)
      	at com.sonatype.nexus.plugins.healthcheck.task.HealthCheckTask.execute(HealthCheckTask.java:132)
      	at com.sonatype.nexus.plugins.healthcheck.task.HealthCheckTask.execute(HealthCheckTask.java:1)
      	at org.sonatype.nexus.scheduling.TaskSupport.call(TaskSupport.java:94)
      	at org.sonatype.nexus.quartz.internal.task.QuartzTaskJob.doExecute(QuartzTaskJob.java:145)
      	at org.sonatype.nexus.quartz.internal.task.QuartzTaskJob.execute(QuartzTaskJob.java:108)
      	at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
      	at org.sonatype.nexus.thread.internal.MDCAwareRunnable.run(MDCAwareRunnable.java:40)
      	at org.apache.shiro.subject.support.SubjectRunnable.doRun(SubjectRunnable.java:120)
      	at org.apache.shiro.subject.support.SubjectRunnable.run(SubjectRunnable.java:108)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      Caused by: java.lang.IllegalArgumentException
      	at java.nio.Buffer.position(Buffer.java:244)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.OWALPageChangesPortion.readData(OWALPageChangesPortion.java:226)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.OWALPageChangesPortion.getIntValue(OWALPageChangesPortion.java:85)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.base.ODurablePage.getIntValue(ODurablePage.java:116)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.OClusterPage.appendRecord(OClusterPage.java:103)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.addEntry(OPaginatedCluster.java:1928)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.createSinglePageRecord(OPaginatedCluster.java:625)
      	... 42 more
      
      
      	DB name="component"
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.atomicoperations.OAtomicOperationsManager.endAtomicOperation(OAtomicOperationsManager.java:457)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.atomicoperations.OAtomicOperationsManager.endAtomicOperation(OAtomicOperationsManager.java:412)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.base.ODurableComponent.endAtomicOperation(ODurableComponent.java:116)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.OClusterPositionMap.remove(OClusterPositionMap.java:444)
      	at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.deleteRecord(OPaginatedCluster.java:881)
      	... 40 common frames omitted
      

      It seems in OAtomicOperationsManager.startAtomicOperation() we are getting an exclusive lock, then all the problems above happens, and during OAtomicOperationsManager.endAtomicOperation() we are hitting the case on line 446

      if (operation.isRollback() && !rollback)

      which throws the exception above. And completely skips the code down at line 484 where the lock is actually released. Thus any thread in the future that tries to read these records is going to get blocked.

        Attachments

          Activity

            People

            Assignee:
            dbradicich Damian Bradicich
            Reporter:
            dbradicich Damian Bradicich
            Last Updated By:
            Peter Lynch
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

              Dates

              Created:
              Updated:
              Resolved:
              Date of First Response:

                tigCommentSecurity.panel-title