View Issue Details

IDProjectCategoryView StatusLast Update
0002721SymmetricDSBugpublic2016-08-16 18:57
Reporteriberras Assigned Toelong  
Prioritynormal 
Status closedResolutionfixed 
Product Version3.8.0 
Target Version3.8.1Fixed in Version3.8.1 
Summary0002721: Null Pointer on Initial load - sync triggers failed to create trigger
DescriptionThere is a null pointer on the server. I cannot reproduce the bug. It always occurs when an Initial Load is queued. But it doesn't happen all the time. Maybe its because of the trigger failure creation. This didn't happen with the same configuration in 3.7.37.

The trigger failure is the following:
2016-08-11 18:32:03,036 ERROR [serverdevelopment] [TriggerRouterService] [serverdevelopment-sync-triggers-1] Failed to create triggers for sym_load_filter
java.lang.IllegalStateException: Timed out after 5001 ms trying to get the next val for trigger_hist
        at org.jumpmind.symmetric.service.impl.SequenceService.nextValFromDatabase(SequenceService.java:137)
        at org.jumpmind.symmetric.service.impl.SequenceService$1.execute(SequenceService.java:118)
        at org.jumpmind.symmetric.service.impl.SequenceService$1.execute(SequenceService.java:116)
        at org.jumpmind.symmetric.service.impl.SequenceService$DoTransaction.execute(SequenceService.java:283)
        at org.jumpmind.symmetric.service.impl.SequenceService.nextValFromDatabase(SequenceService.java:116)
        at org.jumpmind.symmetric.service.impl.SequenceService.nextVal(SequenceService.java:91)
        at org.jumpmind.symmetric.service.impl.TriggerRouterService.insert(TriggerRouterService.java:956)
        at org.jumpmind.symmetric.service.impl.TriggerRouterService.rebuildTriggerIfNecessary(TriggerRouterService.java:1753)
        at org.jumpmind.symmetric.service.impl.TriggerRouterService.updateOrCreateDatabaseTriggers(TriggerRouterService.java:1636)
        at org.jumpmind.symmetric.service.impl.TriggerRouterService.updateOrCreateDatabaseTrigger(TriggerRouterService.java:1518)
        at org.jumpmind.symmetric.service.impl.TriggerRouterService$2.run(TriggerRouterService.java:1504)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

But it creates all the other triggers of the database correctly so its not that the user db doest have permissions. It also happens with this trigger:

java.lang.IllegalStateException: Timed out after 5000 ms trying to get the next val for trigger_hist.

And i dont know if i should do another issue, but the initial load also is not taking into account the property to not skip the duplicated batches. I have this deactivated and sometimes it tries to delete batches that are duplicated.

Because of this errors i had to go back to 3.7.37.
Steps To ReproduceStart an initial load.

Additional InformationNull Pointer:

2016-08-12 09:06:31,607 INFO [serverdevelopment] [DataGapDetector] [serverdevelopment-job-1] Full gap analysis is done after 44 ms
2016-08-12 09:06:31,614 INFO [serverdevelopment] [RouterService] [serverdevelopment-job-1] The 'config' channel is NOT in common batch mode
2016-08-12 09:06:31,942 INFO [serverdevelopment] [ConfigurationChangedDataRouter] [serverdevelopment-job-1] About to refresh the cache of node security because new configuration came through the data router
2016-08-12 09:06:31,943 INFO [serverdevelopment] [RouterService] [serverdevelopment-job-1] Routed 2 data events in 389 ms
2016-08-12 09:06:40,827 INFO [serverdevelopment] [DataLoaderService] [qtp615281750-20] 62 data and 23 batches loaded during push request from cabana:008:008.
2016-08-12 09:06:42,058 INFO [serverdevelopment] [DataService] [serverdevelopment-job-2] Queueing up an initial008
2016-08-12 09:06:42,685 ERROR [serverdevelopment] [RouterService] [serverdevelopment-job-2]
java.lang.NullPointerException
        at org.jumpmind.symmetric.service.impl.DataService.insertSqlEvent(DataService.java:1024)
        at org.jumpmind.symmetric.service.impl.DataService.insertSqlEvent(DataService.java:1017)
        at SimpleClassCompiler0.beforeReload(SimpleClassCompiler0.java:1)
        at org.jumpmind.symmetric.service.impl.DataService.callReloadListeners(DataService.java:526)
        at org.jumpmind.symmetric.service.impl.DataService.insertReloadEvents(DataService.java:411)
        at org.jumpmind.symmetric.service.impl.DataService.insertReloadEvents(DataService.java:344)
        at org.jumpmind.symmetric.service.impl.RouterService.insertInitialLoadEvents(RouterService.java:262)
        at org.jumpmind.symmetric.service.impl.RouterService.routeData(RouterService.java:198)
        at org.jumpmind.symmetric.job.RouterJob.doJob(RouterJob.java:40)
        at org.jumpmind.symmetric.job.AbstractJob.invoke(AbstractJob.java:180)
        at org.jumpmind.symmetric.job.AbstractJob.run(AbstractJob.java:224)
        at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2016-08-12 09:06:42,753 INFO [serverdevelopment] [ConfigurationChangedDataRouter] [serverdevelopment-job-2] About to refresh the cache of node security because new configuration came through the data router
2016-08-12 09:06:42,753 INFO [serverdevelopment] [RouterService] [serverdevelopment-job-2] The 'transaction' channel is NOT in common batch mode
2016-08-12 09:06:42,753 INFO [serverdevelopment] [RouterService] [serverdevelopment-job-2] The 'transaction' channel for the 'server' node group has only default routers assigned to it. Change data won't be selected during routing
2016-08-12 09:06:43,085 INFO [serverdevelopment] [RouterService] [serverdevelopment-job-2] Routed 33 data events in 400 ms

Repeated batches:

ERROR DataLoaderService - Failed to load batch 000-640 (Previously reported 1 time(s), will report 10) Could not find the target table DATOS_CABANA
INFO PushService - Push data sent to server:000:000
INFO RouterService - Routed 1 data events in 3 ms
WARN StagedResource - We had to delete /var/folders/vz/653d1c_s5jz8s97_39msjrxh0000gn/T/cabana-002/incoming/000/0000000667.ready because it already existed
WARN StagedResource - We had to delete /var/folders/vz/653d1c_s5jz8s97_39msjrxh0000gn/T/cabana-002/incoming/000/0000000668.ready because it already existed
WARN StagedResource - We had to delete /var/folders/vz/653d1c_s5jz8s97_39msjrxh0000gn/T/cabana-002/incoming/000/0000000669.ready because it already existed
WARN StagedResource - We had to delete /var/folders/vz/653d1c_s5jz8s97_39msjrxh0000gn/T/cabana-002/incoming/000/0000000670.ready because it already existed
WARN StagedResource - We had to delete /var/folders/vz/653d1c_s5jz8s97_39msjrxh0000gn/T/cabana-002/incoming/000/0000000671.ready because it already existed
WARN StagedResource - We had to delete /var/folders/vz/653d1c_s5jz8s97_39msjrxh0000gn/T/cabana-002/incoming/000/0000000672.ready because it already existed
WARN StagedResource - We had to delete /var/folders/vz/653d1c_s5jz8s97_39msjrxh0000gn/T/cabana-002/incoming/000/0000000673.ready because it already existed
WARN StagedResource - We had to delete /var/folders/vz/653d1c_s5jz8s97_39msjrxh0000gn/T/cabana-002/incoming/000/0000000674.ready because it already existed
WARN StagedResource - We had to delete /var/folders/vz/653d1c_s5jz8s97_39msjrxh0000gn/T/cabana-002/incoming/000/0000000675.ready because it already existed
WARN StagedResource - We had to delete /var/folders/vz/653d1c_s5jz8s97_39msjrxh0000gn/T/cabana-002/incoming/000/0000000676.ready because it already existed
WARN StagedResource - We had to delete /var/folders/vz/653d1c_s5jz8s97_39msjrxh0000gn/T/cabana-002/incoming/000/0000000677.ready because it already existed
WARN StagedResource - We had to delete /var/folders/vz/653d1c_s5jz8s97_39msjrxh0000gn/T/cabana-002/incoming/000/0000000678.ready because it already existed
INFO PullService - There was a failure while pulling data from server:000:000 on channel thread default. 0 rows and 1 batches were processed
INFO PushService - Pushed data to server:000:000. 2 data and 1 batches were processed
WARN StagedResource - We had to delete /var/folders/vz/653d1c_s5jz8s97_39msjrxh0000gn/T/cabana-002/incoming/000/0000000658.ready because it already existed
INFO IncomingBatchService - Retrying batch 000-640
TagsNo tags attached.

Activities

elong

2016-08-12 13:03

developer   ~0000827

Do you have cluster.lock.enabled=true? Do you need that enabled? It only needs to be enabled when you have more than one SymmetricDS server pointed to the same database. If you don't need clustering, can you turn it off by setting cluster.lock.enabled=false? It is much more efficient and will avoid that first timed out error. Still looking at the rest of your post to understand it.

iberras

2016-08-12 13:08

reporter   ~0000828

I do not have cluster.lock.enabled enabled. I will have several engines running, but in the issue there is only one engine configured. If in 3.8.0 comes enabled by default then yes its enabled, but i haven't touched the cluster configuration.

elong

2016-08-12 13:08

developer   ~0000829

It looks like you have a reload listener that calls dataService.insertSqlEvent(), and that's where the null pointer happens. Are you maybe passing a null for the TriggerHistory argument?

elong

2016-08-12 13:33

developer   ~0000830

Sorry, ignore my last comment. I think I see the problem now.

The sync triggers process got multiple threads for 3.8 (to speed it up), but the access to get the next trigger_hist_id sequence is not synchronized, which is a race condition that leads to a time out and then the "failed to create trigger" error. With trigger history missing, the initial load is failing to run.

Short term workaround: set the parameter sync.triggers.thread.count.per.server=1 in your engine properties file or in the sym_parameter table.

Long term workaround: I will fix this issue in the code by synchronizing threads for 3.8.1. That version is scheduled for release on Monday, 2016-08-15.

Thanks for this report!

iberras

2016-08-12 13:41

reporter   ~0000831

You are very welcome! Looking forward for the 3.8.1.

The Initial loads are very very fast with what the team did so I hope you can fix this!

Related Changesets

SymmetricDS: 3.8 f9d3de26

2016-08-12 09:42:39

admin

Details Diff
0002721: Null Pointer on Initial load,
synchronize the sequence service
Affected Issues
0002721
mod - symmetric-core/src/main/java/org/jumpmind/symmetric/service/impl/SequenceService.java Diff File

Issue History

Date Modified Username Field Change
2016-08-12 12:10 iberras New Issue
2016-08-12 13:03 elong Note Added: 0000827
2016-08-12 13:08 iberras Note Added: 0000828
2016-08-12 13:08 elong Note Added: 0000829
2016-08-12 13:33 elong Note Added: 0000830
2016-08-12 13:34 elong Assigned To => elong
2016-08-12 13:34 elong Status new => assigned
2016-08-12 13:34 elong Target Version => 3.8.1
2016-08-12 13:41 iberras Note Added: 0000831
2016-08-12 14:00 admin Changeset attached => SymmetricDS 3.8 f9d3de26
2016-08-12 14:30 elong Status assigned => resolved
2016-08-12 14:30 elong Resolution open => fixed
2016-08-12 14:30 elong Fixed in Version => 3.8.1
2016-08-12 14:30 elong Summary Null Pointer on Initial load => Null Pointer on Initial load - sync triggers failed to create trigger
2016-08-16 18:57 elong Status resolved => closed