-
Type:
Bug
-
Status: Closed
-
Priority:
Major
-
Resolution: Fixed
-
Affects Version/s: 3.21.1
-
Fix Version/s: 3.22.0
-
Component/s: Scheduled Tasks
-
Labels:
-
Notability:2
See the attached 6 thread dumps. Customer was running 3.21.1 with prometheus monitoring installed. What happens is prometheus is scraping mbeans for data ( thread ids pool-1-thread-1 through pool-1-thread-5 ), one of these mbeans being JobStoreimpl. You'll notice consistently that the mbean scraping is blocking other application threads needing read only access to job details and triggers.
JobStoreimpl read operations seem to have a single read locks making concurrent access by multiple threads a bottleneck.
"pool-1-thread-3" #160 daemon prio=5 os_prio=0 tid=0x00007fce58007800 nid=0x1435 runnable [0x00007fce689c2000] java.lang.Thread.State: RUNNABLE at java.util.HashMap.hash(HashMap.java:338) at java.util.LinkedHashMap.get(LinkedHashMap.java:440) at com.orientechnologies.orient.core.tx.OTransactionRealAbstract.getRecordEntry(OTransactionRealAbstract.java:161) at com.orientechnologies.orient.core.tx.OTransactionRealAbstract.getRecord(OTransactionRealAbstract.java:165) at com.orientechnologies.orient.core.tx.OTransactionOptimistic.loadRecord(OTransactionOptimistic.java:169) at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.load(ODatabaseDocumentTx.java:1757) at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.load(ODatabaseDocumentTx.java:103) at com.orientechnologies.orient.core.iterator.OIdentifiableIterator.readCurrentRecord(OIdentifiableIterator.java:300) at com.orientechnologies.orient.core.iterator.ORecordIteratorClusters.hasNext(ORecordIteratorClusters.java:167) at com.google.common.collect.TransformedIterator.hasNext(TransformedIterator.java:42) at com.google.common.collect.Iterators$5.computeNext(Iterators.java:637) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136) at com.google.common.collect.Iterators$5.computeNext(Iterators.java:637) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136) at org.sonatype.nexus.quartz.internal.orient.JobStoreImpl.getTriggerKeys(JobStoreImpl.java:709) at org.sonatype.nexus.quartz.internal.orient.JobStoreImpl.lambda$25(JobStoreImpl.java:695) at org.sonatype.nexus.quartz.internal.orient.JobStoreImpl$$Lambda$547/2016992059.execute(Unknown Source) at org.sonatype.nexus.quartz.internal.orient.JobStoreImpl$$Lambda$394/1964436330.apply(Unknown Source) at org.sonatype.nexus.orient.transaction.OrientOperations.lambda$1(OrientOperations.java:55) at org.sonatype.nexus.orient.transaction.OrientOperations$$Lambda$297/1991948303.call(Unknown Source) at org.sonatype.nexus.transaction.OperationPoint.proceed(OperationPoint.java:64) at org.sonatype.nexus.transaction.TransactionalWrapper.proceedWithTransaction(TransactionalWrapper.java:57) at org.sonatype.nexus.transaction.Operations.proceedWithTransaction(Operations.java:232) at org.sonatype.nexus.transaction.Operations.transactional(Operations.java:223) at org.sonatype.nexus.transaction.Operations.call(Operations.java:166) at org.sonatype.nexus.orient.transaction.OrientOperations.call(OrientOperations.java:55) at org.sonatype.nexus.quartz.internal.orient.JobStoreImpl.execute(JobStoreImpl.java:202) - locked <0x00000004d5e123f8> (a java.lang.Object) at org.sonatype.nexus.quartz.internal.orient.JobStoreImpl.getTriggerKeys(JobStoreImpl.java:695) at org.quartz.core.QuartzScheduler.getTriggerKeys(QuartzScheduler.java:1475) at org.quartz.core.QuartzSchedulerMBeanImpl.getAllTriggers(QuartzSchedulerMBeanImpl.java:124) at sun.reflect.GeneratedMethodAccessor531.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71) at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83) at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206) at com.sun.jmx.mbeanserver.MBeanSupport.getAttributes(MBeanSupport.java:213) at javax.management.StandardMBean.getAttributes(StandardMBean.java:390) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttributes(DefaultMBeanServerInterceptor.java:709) at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttributes(JmxMBeanServer.java:705) at io.prometheus.jmx.shaded.io.prometheus.jmx.JmxScraper.scrapeBean(JmxScraper.java:151) at io.prometheus.jmx.shaded.io.prometheus.jmx.JmxScraper.doScrape(JmxScraper.java:117) at io.prometheus.jmx.shaded.io.prometheus.jmx.JmxCollector.collect(JmxCollector.java:456) at io.prometheus.jmx.shaded.io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.findNextElement(CollectorRegistry.java:183) at io.prometheus.jmx.shaded.io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:216) at io.prometheus.jmx.shaded.io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:137) at io.prometheus.jmx.shaded.io.prometheus.client.exporter.common.TextFormat.write004(TextFormat.java:22) at io.prometheus.jmx.shaded.io.prometheus.client.exporter.HTTPServer$HTTPMetricHandler.handle(HTTPServer.java:59) at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79) at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83) at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82) at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675) at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79) at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)
That single thread is blocking 12 other threads trying to access JobStoreImpl at these lines:
- org.sonatype.nexus.quartz.internal.orient.JobStoreImpl.retrieveTrigger(JobStoreImpl.java:677)
- org.sonatype.nexus.quartz.internal.orient.JobStoreImpl.acquireNextTriggers(JobStoreImpl.java:920)
- org.sonatype.nexus.quartz.internal.orient.JobStoreImpl.getTriggerGroupNames(JobStoreImpl.java:728)
- org.sonatype.nexus.quartz.internal.orient.JobStoreImpl.getJobKeys(JobStoreImpl.java:424)
Expected
Read operations need to be highly concurrent. JobStoreImpl read only operations need to be designed expecting highly concurrent access by multiple threads.