View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0003913 | SymmetricDS | Bug | public | 2019-04-11 17:01 | 2019-04-24 18:25 |
Reporter | mmichalek | Assigned To | mmichalek | ||
Priority | normal | ||||
Status | closed | Resolution | fixed | ||
Product Version | 3.9.0 | ||||
Target Version | 3.9.19 | Fixed in Version | 3.9.19 | ||
Summary | 0003913: Exception during keep alive can cause concurrent loading of batches | ||||
Description | 1) The source sends batches to the targets. 2) The target starts loading the data. 3) The source quickly times out after 10 seconds. 4) The source tries to send the batches again. First, it is rejected. 5) Some minor IO error happens on the target. (Probably cannot send the keep alive because the client has timed out the HTTP connection) This causes the server to release the client's lock. But the batches sent in step 0000001 are still loading. 6) The client sends the batches again because of the short timeout. This time it gets in because the lock has been broken. 7) The same batches start loading again, leading to inconsistent results on the target. Example logging where this was reprocued in dev: 2019-04-11 11:31:44,658 INFO [corp-000] [SqlRunner] [sql-runner-402] [store-001] Executing: insert into "DEMO-STORE-001"."PUBLIC"."SALE_TRANSACTION" ("TRAN_ID", "STORE_ID", "WORKSTATION", "DAY", "SEQ") values (3,'301','','',0) 2019-04-11 11:31:46,117 INFO [store-001] [RouterService] [store-001-job-16] Routed 1 data events in 4 ms 2019-04-11 11:31:50,790 INFO [store-001] [PushService] [store-001-push-default-3] Push data sent to corp:000:000 2019-04-11 11:31:50,802 INFO [corp-000] [SymmetricServlet] [qtp524223214-207] Error while processing PUT request for node: 001 at 127.0.0.1 with path: /corp-000/push The message is: Fake exception testing dell case. 2019-04-11 11:31:50,802 WARN [store-001] [PushService] [store-001-push-default-3] Expected but did not receive an ack for batch 000-22. This could be because the batch is corrupt - removing the batch from staging. (org.jumpmind.symmetric.service.impl.AbstractService.readAcks(AbstractService.java:341)) 2019-04-11 11:31:55,819 INFO [store-001] [PushService] [store-001-push-default-4] Push data sent to corp:000:000 2019-04-11 11:31:55,819 WARN [corp-000] [ThroughputStatisticManager] [corp-000-dataloader-8] Starting a new process even though the previous 'Service Database Push Load' process had not finished (org.jumpmind.symmetric.statistic.StatisticManager.newProcessInfo(StatisticManager.java:117)) 2019-04-11 11:31:55,820 INFO [corp-000] [ThroughputStatisticManager] [corp-000-dataloader-8] Details from the previous process: processType=Service Database Push Load,sourceNodeId=001,targetNodeId=000,queue=default,status=Loading,startTime=Thu Apr 11 11:31:50 EDT 2019 2019-04-11 11:31:55,821 INFO [corp-000] [IncomingBatchService] [corp-000-dataloader-8] Retrying batch 001-22 2019-04-11 11:31:55,830 INFO [corp-000] [SymmetricServlet] [qtp524223214-30] Error while processing PUT request for node: 001 at 127.0.0.1 with path: /corp-000/push The message is: Fake exception testing dell case. 2019-04-11 11:31:55,831 WARN [store-001] [PushService] [store-001-push-default-4] Expected but did not receive an ack for batch 000-22. This could be because the batch is corrupt - removing the batch from staging. (org.jumpmind.symmetric.service.impl.AbstractService.readAcks(AbstractService.java:341)) 2019-04-11 11:32:00,810 DEBUG [corp-000] [DefaultDatabaseWriter] [corp-000-dataloader-7] Preparing dml: insert into "DEMO-CORP"."PUBLIC"."SALE_TRANSACTION" ("TRAN_ID", "STORE_ID", "WORKSTATION", "DAY", "SEQ") values (?,?,?,?,?) 2019-04-11 11:32:00,811 DEBUG [corp-000] [DefaultDatabaseWriter] [corp-000-dataloader-7] Submitting data [3, '301', '', '', 0] with types [INTEGER, VARCHAR, VARCHAR, VARCHAR, INTEGER] 2019-04-11 11:32:00,838 INFO [store-001] [PushService] [store-001-push-default-5] Push data sent to corp:000:000 2019-04-11 11:32:00,839 WARN [corp-000] [ThroughputStatisticManager] [corp-000-dataloader-9] Starting a new process even though the previous 'Service Database Push Load' process had not finished (org.jumpmind.symmetric.statistic.StatisticManager.newProcessInfo(StatisticManager.java:117)) 2019-04-11 11:32:00,839 INFO [corp-000] [ThroughputStatisticManager] [corp-000-dataloader-9] Details from the previous process: processType=Service Database Push Load,sourceNodeId=001,targetNodeId=000,queue=default,status=Loading,startTime=Thu Apr 11 11:31:55 EDT 2019 2019-04-11 11:32:00,841 INFO [corp-000] [IncomingBatchService] [corp-000-dataloader-9] Skipping batch 001-22 2019-04-11 11:32:00,841 INFO [corp-000] [DataLoaderService] [qtp524223214-17] 1 data and 1 batches loaded during push request from store:001:001. 2019-04-11 11:32:00,844 INFO [store-001] [PushService] [store-001-push-default-5] Pushed data to node corp:000:000. 1 data and 1 batches were processed. (sale_transaction) 2019-04-11 11:32:05,837 DEBUG [corp-000] [DefaultDatabaseWriter] [corp-000-dataloader-8] Preparing dml: insert into "DEMO-CORP"."PUBLIC"."SALE_TRANSACTION" ("TRAN_ID", "STORE_ID", "WORKSTATION", "DAY", "SEQ") values (?,?,?,?,?) 2019-04-11 11:32:05,837 DEBUG [corp-000] [DefaultDatabaseWriter] [corp-000-dataloader-8] Submitting data [3, '301', '', '', 0] with types [INTEGER, VARCHAR, VARCHAR, VARCHAR, INTEGER] 2019-04-11 11:32:05,842 DEBUG [corp-000] [DefaultDatabaseWriter] [corp-000-dataloader-8] Preparing dml: update "DEMO-CORP"."PUBLIC"."SALE_TRANSACTION" set "STORE_ID" = ?, "WORKSTATION" = ?, "DAY" = ?, "SEQ" = ? where "TRAN_ID" = ? 2019-04-11 11:32:05,842 DEBUG [corp-000] [DefaultDatabaseWriter] [corp-000-dataloader-8] Submitting data ['301', '', '', 0, 3] with types [VARCHAR, VARCHAR, VARCHAR, INTEGER, INTEGER] | ||||
Steps To Reproduce | 1. Install 3.9.18 Demo 2. Change http timeout to 1 second 3. Force an exception to be thrown during the send of keep alive from the target DataLoaderService.java: while (!executor.awaitTermination(keepAliveMillis, TimeUnit.MILLISECONDS)) { outWriter.write("1=1&"); outWriter.flush(); if (true) { // force exception to be thrown here. throw new IOException("Fake exception testing dell case."); } } | ||||
Tags | No tags attached. | ||||
SymmetricDS: 3.9 6d82ba32 2019-04-11 13:12:54 Details Diff |
0003913: Exception during keep alive can cause concurrent loading of batches |
Affected Issues 0003913 |
|
mod - symmetric-core/src/main/java/org/jumpmind/symmetric/service/impl/DataLoaderService.java | Diff File | ||
SymmetricDS: 3.10 b7d38cb0 2019-04-11 13:12:54 Details Diff |
0003913: Exception during keep alive can cause concurrent loading of batches |
Affected Issues 0003913 |
|
mod - symmetric-core/src/main/java/org/jumpmind/symmetric/service/impl/DataLoaderService.java | Diff File |
Date Modified | Username | Field | Change |
---|---|---|---|
2019-04-11 17:01 | mmichalek | New Issue | |
2019-04-11 17:01 | mmichalek | Status | new => assigned |
2019-04-11 17:01 | mmichalek | Assigned To | => mmichalek |
2019-04-11 17:16 | mmichalek | Status | assigned => resolved |
2019-04-11 17:16 | mmichalek | Resolution | open => fixed |
2019-04-11 17:16 | mmichalek | Fixed in Version | => 3.9.19 |
2019-04-11 18:00 | mmichalek | Changeset attached | => SymmetricDS 3.9 6d82ba32 |
2019-04-11 18:00 | mmichalek | Changeset attached | => SymmetricDS 3.10 b7d38cb0 |
2019-04-23 15:14 | elong | Status | resolved => closed |
2019-04-24 18:25 | elong | Relationship added | related to 0003575 |