View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0005314 | SymmetricDS | Bug | public | 2022-05-30 11:04 | 2023-06-05 13:00 |
Reporter | psergey | Assigned To | emiller | ||
Priority | normal | ||||
Status | closed | Resolution | fixed | ||
Product Version | 3.13.4 | ||||
Target Version | 3.14.8 | Fixed in Version | 3.14.8 | ||
Summary | 0005314: Infinite synchronization loop when 3 nodes are connected to each other and sync_on_incoming_batch = 1 | ||||
Description | An infinite synchronization loop can occur when 3 nodes are connected to each other and sync_on_incoming_batch = 1. After a single row is updated in the database, the same data is added to an outgoing batch on each of the 3 nodes over and over (every few seconds). I found a couple of ways to trigger the loop. A short description follows; the full description, with engine files and MSSQL 2017 scripts, is in the 'Steps To Reproduce' section below.

Common steps:
1. Configure three nodes (corp004, corp005 and corp006).
2. Create the test database Sym_ReplTest_3NodesEachToEach, create the replicated test table and insert a row into the table.
3. Create bidirectional replication between the three nodes (all 3 nodes are connected to each other) with sync_on_incoming_batch = 1.
4. Set the conflict resolution priority (set resolve_type in sym_conflict to 'FALLBACK' or 'IGNORE' to prioritize the nodes in the order corp004 - corp005 - corp006).

First infinite loop scenario:
A. Update the test row on all three nodes simultaneously.
Result: After a single row update (performed simultaneously on all three nodes), the same data is added to an outgoing batch on each of the 3 nodes over and over (every few seconds).

Second infinite loop scenario:
A. Block the SymmetricDS outgoing link from node4 to node5 (using a firewall).
B. Update the test row on node4, wait a few seconds and update the test row on node4 again (one update is not enough).
C. Unblock the SymmetricDS outgoing link from node4 to node5.
Result: After the outgoing link is unblocked, the same data is added to an outgoing batch on each of the 3 nodes over and over (every few seconds). | ||||
Steps To Reproduce | Full reproduction steps.

Common steps:

1. Configure three nodes (corp004, corp005 and corp006).

node corp004, engine/corp004-004.properties:
engine.name=corp004-004
db.driver=net.sourceforge.jtds.jdbc.Driver
db.url=jdbc:jtds:sqlserver://localhost:1433/Sym_ReplTest_3NodesEachToEach;useCursors=true;bufferMaxMemory=10240;lobBuffer=5242880
db.user=SymmetricDsUser
db.password=Password
db.validation.query=select 1
registration.url=http://192.168.0.187:31415/sync/corp005-005
sync.url=http://192.168.0.188:31415/sync/corp004-004
group.id=corp004
external.id=004
job.routing.period.time.ms=500
job.push.period.time.ms=1000
job.pull.period.time.ms=1000
auto.registration=true
initial.load.create.first=true
log.conflict.resolution=true
job.purge.incoming.cron=0 0 */4 * * *
job.purge.outgoing.cron=0 0 */4 * * *
purge.retention.minutes=120

node corp005, engine/corp005-005.properties:
engine.name=corp005-005
db.driver=net.sourceforge.jtds.jdbc.Driver
db.url=jdbc:jtds:sqlserver://localhost:1433/Sym_ReplTest_3NodesEachToEach;useCursors=true;bufferMaxMemory=10240;lobBuffer=5242880
db.user=SymmetricDsUser
db.password=Password
db.validation.query=select 1
registration.url=
sync.url=http://192.168.0.187:31415/sync/corp005-005
group.id=corp005
external.id=005
job.routing.period.time.ms=500
job.push.period.time.ms=1000
job.pull.period.time.ms=1000
auto.registration=true
initial.load.create.first=true
log.conflict.resolution=true
job.purge.incoming.cron=0 0 */4 * * *
job.purge.outgoing.cron=0 0 */4 * * *
purge.retention.minutes=120

node corp006, engine/corp006-006.properties:
engine.name=corp006-006
db.driver=net.sourceforge.jtds.jdbc.Driver
db.url=jdbc:jtds:sqlserver://localhost:1433/Sym_ReplTest_3NodesEachToEach;useCursors=true;bufferMaxMemory=10240;lobBuffer=5242880
db.user=SymmetricDsUser
db.password=Password
db.validation.query=select 1
registration.url=http://192.168.0.187:31415/sync/corp005-005
sync.url=http://192.168.0.189:31415/sync/corp004-004
group.id=corp006
external.id=006
job.routing.period.time.ms=500
job.push.period.time.ms=1000
job.pull.period.time.ms=1000
auto.registration=true
initial.load.create.first=true
log.conflict.resolution=true
job.purge.incoming.cron=0 0 */4 * * *
job.purge.outgoing.cron=0 0 */4 * * *
purge.retention.minutes=120

-----

2. Create the test database Sym_ReplTest_3NodesEachToEach, create the replicated test table and insert a row into the table.

Script for creating the table:
create table dbo.RSTable1 (
    id int not null,
    constraint PK__RSTable1 primary key (id),
    test_id uniqueidentifier not null,
    rectime datetime not null default getutcdate(),
    col1 int not null default 0)
go
insert dbo.RSTable1 (id, test_id) select 1, newid();
go

-----

3. Create bidirectional replication between the three nodes (all 3 nodes are connected to each other) with sync_on_incoming_batch = 1.

Script for creating the replication:
use Sym_ReplTest_3NodesEachToEach;
GO
-- Clear and load SymmetricDS Configuration
delete from sym_conflict;
delete from sym_trigger_router;
delete from sym_trigger;
delete from sym_router;
delete from sym_channel where channel_id in ('forward', 'backward', 'peer');
delete from sym_node_group_link;
delete from sym_node_group;
delete from sym_node_host;
delete from sym_node_identity;
delete from sym_node_security;
delete from sym_node;
-- Channels
-- Channel "peer" for all tables
insert into sym_channel (channel_id, processing_order, max_batch_size, enabled, description)
values('peer', 1, 100000, 1, 'transactional data from one peer to another');
-- Node Groups
insert into sym_node_group (node_group_id) values ('corp004');
insert into sym_node_group (node_group_id) values ('corp005');
insert into sym_node_group (node_group_id) values ('corp006');
-- Node Group Links
-- Sends changes to other peer
insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action)
values ('corp004', 'corp005', 'P');
insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action)
values ('corp004', 'corp006', 'P');
insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action)
values ('corp005', 'corp004', 'P');
insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action)
values ('corp005', 'corp006', 'P');
insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action)
values ('corp006', 'corp004', 'P');
insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action)
values ('corp006', 'corp005', 'P');
-- Triggers
-- Triggers for tables on "peer" channel
insert into sym_trigger (trigger_id,source_table_name,channel_id,last_update_time,create_time)
values('RSTable1','RSTable1','peer',current_timestamp,current_timestamp);
-- Triggers sync_on_incoming_batch
update sym_trigger set sync_on_incoming_batch = 1
-- Routers
-- Router sends all data from one peer to another
insert into sym_router (router_id,source_node_group_id,target_node_group_id,router_type,create_time,last_update_time)
values('corp004_2_corp005', 'corp004', 'corp005', 'default',current_timestamp, current_timestamp);
insert into sym_router (router_id,source_node_group_id,target_node_group_id,router_type,create_time,last_update_time)
values('corp004_2_corp006', 'corp004', 'corp006', 'default',current_timestamp, current_timestamp);
insert into sym_router (router_id,source_node_group_id,target_node_group_id,router_type,create_time,last_update_time)
values('corp005_2_corp004', 'corp005', 'corp004', 'default',current_timestamp, current_timestamp);
insert into sym_router (router_id,source_node_group_id,target_node_group_id,router_type,create_time,last_update_time)
values('corp005_2_corp006', 'corp005', 'corp006', 'default',current_timestamp, current_timestamp);
insert into sym_router (router_id,source_node_group_id,target_node_group_id,router_type,create_time,last_update_time)
values('corp006_2_corp004', 'corp006', 'corp004', 'default',current_timestamp,
current_timestamp);
insert into sym_router (router_id,source_node_group_id,target_node_group_id,router_type,create_time,last_update_time)
values('corp006_2_corp005', 'corp006', 'corp005', 'default',current_timestamp, current_timestamp);
-- Trigger Routers
-- Send all items to all stores
insert into sym_trigger_router (trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('RSTable1','corp004_2_corp005', 100, current_timestamp, current_timestamp);
insert into sym_trigger_router (trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('RSTable1','corp004_2_corp006', 100, current_timestamp, current_timestamp);
insert into sym_trigger_router (trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('RSTable1','corp005_2_corp004', 100, current_timestamp, current_timestamp);
insert into sym_trigger_router (trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('RSTable1','corp005_2_corp006', 100, current_timestamp, current_timestamp);
insert into sym_trigger_router (trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('RSTable1','corp006_2_corp004', 100, current_timestamp, current_timestamp);
insert into sym_trigger_router (trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('RSTable1','corp006_2_corp005', 100, current_timestamp, current_timestamp);

-----

4. Set the conflict resolution priority (set resolve_type in sym_conflict to 'FALLBACK' or 'IGNORE' to prioritize the nodes in the order corp004 - corp005 - corp006).

-- Conflicts
delete from sym_conflict;
insert into sym_conflict (conflict_id, target_channel_id, source_node_group_id, target_node_group_id, detect_type, resolve_type, ping_back, resolve_changes_only, resolve_row_only, create_time, last_update_time)
values ('conflict_corp004_2_corp005', 'peer', 'corp004', 'corp005', 'USE_CHANGED_DATA', 'FALLBACK', 'OFF', 0, 1, current_timestamp, current_timestamp);
insert into sym_conflict (conflict_id, target_channel_id, source_node_group_id, target_node_group_id, detect_type, resolve_type, ping_back, resolve_changes_only, resolve_row_only, create_time, last_update_time)
values ('conflict_corp004_2_corp006', 'peer', 'corp004', 'corp006', 'USE_CHANGED_DATA', 'FALLBACK', 'OFF', 0, 1, current_timestamp, current_timestamp);
insert into sym_conflict (conflict_id, target_channel_id, source_node_group_id, target_node_group_id, detect_type, resolve_type, ping_back, resolve_changes_only, resolve_row_only, create_time, last_update_time)
values ('conflict_corp005_2_corp004', 'peer', 'corp005', 'corp004', 'USE_CHANGED_DATA', 'IGNORE', 'OFF', 0, 1, current_timestamp, current_timestamp);
insert into sym_conflict (conflict_id, target_channel_id, source_node_group_id, target_node_group_id, detect_type, resolve_type, ping_back, resolve_changes_only, resolve_row_only, create_time, last_update_time)
values ('conflict_corp005_2_corp006', 'peer', 'corp005', 'corp006', 'USE_CHANGED_DATA', 'FALLBACK', 'OFF', 0, 1, current_timestamp, current_timestamp);
insert into sym_conflict (conflict_id, target_channel_id, source_node_group_id, target_node_group_id, detect_type, resolve_type, ping_back, resolve_changes_only, resolve_row_only, create_time, last_update_time)
values ('conflict_corp006_2_corp004', 'peer', 'corp006', 'corp004', 'USE_CHANGED_DATA', 'IGNORE', 'OFF', 0, 1, current_timestamp, current_timestamp);
insert into sym_conflict (conflict_id, target_channel_id, source_node_group_id, target_node_group_id, detect_type, resolve_type, ping_back,
resolve_changes_only, resolve_row_only, create_time, last_update_time)
values ('conflict_corp006_2_corp005', 'peer', 'corp006', 'corp005', 'USE_CHANGED_DATA', 'IGNORE', 'OFF', 0, 1, current_timestamp, current_timestamp);

-----

First infinite loop scenario:

A. Update the test row on all three nodes simultaneously.

SQL request:
use Sym_ReplTest_3NodesEachToEach;
go
set nocount on;
go
declare @col1 int = 0, -- server id
    @test_id uniqueidentifier,
    @maxid int;
if @@SERVERNAME = 'corp004' set @col1=4;
if @@SERVERNAME = 'corp005' set @col1=5;
if @@SERVERNAME = 'corp006' set @col1=6;
set @test_id = newid();
update dbo.RSTable1 set test_id=@test_id, rectime=default, col1=@col1 where id=1;
go

Result: After a single row update (performed simultaneously on all three nodes), the same data is added to an outgoing batch on each of the 3 nodes over and over (every few seconds).

-----

Second infinite loop scenario:

A. Block the SymmetricDS outgoing link from node4 to node5 (use a firewall to block connections from IP: 192.168.0.188 to IP: 192.168.0.187, port: 31415).

B. Update the test row on node4, wait a few seconds and update the test row on node4 again (one update is not enough).

SQL request:
use Sym_ReplTest_3NodesEachToEach;
go
set nocount on;
go
declare @col1 int = 4, -- server id
    @test_id uniqueidentifier,
    @maxid int;
set @test_id = newid();
update dbo.RSTable1 set test_id=@test_id, rectime=default, col1=@col1 where id=1;
go

C. Unblock the SymmetricDS outgoing link from node4 to node5.

Result: After the outgoing link is unblocked, the same data is added to an outgoing batch on each of the 3 nodes over and over (every few seconds). | ||||
Tags | data sync | ||||
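One way to confirm the loop on a running system is to poll the outgoing batch table on each node and check whether batches for the replicated channel keep appearing after the last manual update. The query below is only a sketch: it assumes the default `sym_` table prefix and the standard runtime table `sym_outgoing_batch`; the channel name 'peer' is the one used in the scripts of this ticket.

```sql
-- Run repeatedly on each node after the last manual update.
-- Assumes the default "sym_" prefix and the runtime table sym_outgoing_batch.
select node_id,
       status,
       count(*)         as batch_count,
       max(batch_id)    as newest_batch_id,
       max(create_time) as newest_batch_time
from sym_outgoing_batch
where channel_id = 'peer'
group by node_id, status
order by node_id, status;
```

If batch_count and newest_batch_time keep growing between runs although no rows were changed by hand, the nodes are most likely re-sending the same change to each other.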
|
This is documented in the Sync on Incoming section of the user guide (https://www.symmetricds.org/doc/3.13/html/user-guide.html#_table_triggers) and appears to be working as intended. For what you are trying to achieve, Master to Master might be the correct configuration. It is achieved by having only one node group instead of the 3 groups that produce the infinite loop. |
|
The same infinite synchronization loop occurs when the 3 nodes are in one group. The steps to reproduce are the same for both scenarios from the "Description" section. Engine configuration files and replication scripts are below. If you think the "one group" case requires a separate ticket, comment on or close this ticket and I will create a new one.

It is impossible to prioritize nodes within a single group; there is only one row, with resolution type 'FALLBACK', in the sym_conflict table:
insert into sym_conflict (conflict_id, target_channel_id, source_node_group_id, target_node_group_id, detect_type, resolve_type, ping_back, resolve_changes_only, resolve_row_only, create_time, last_update_time)
values ('conflict_corp_2_corp', 'peer', 'corp', 'corp', 'USE_CHANGED_DATA', 'FALLBACK', 'OFF', 0, 1, current_timestamp, current_timestamp);

I cannot choose another automatic resolution type if I need to be sure that every node ends up with the same data after a conflict. When a row is updated simultaneously on several nodes, resolution type 'IGNORE' leads to inconsistency by design, and resolution type 'NEWER_WINS' also leads to inconsistency (see the comment on ticket 0005312: https://www.symmetricds.org/issues/view.php?id=5312#c2083; I have checked the case from that ticket with the nodes in one group).

-----

Changed lines in the engine files for the "3 nodes in one group" configuration (compared to the engine files in the "Steps to reproduce" section):

node corp004, engine/corp-004.properties:
engine.name=corp-004
registration.url=http://192.168.0.187:31415/sync/corp-005
sync.url=http://192.168.0.188:31415/sync/corp-004
group.id=corp

node corp005, engine/corp-005.properties:
engine.name=corp-005
sync.url=http://192.168.0.187:31415/sync/corp005-005
group.id=corp

node corp006, engine/corp-006.properties:
engine.name=corp-006
registration.url=http://192.168.0.187:31415/sync/corp-005
sync.url=http://192.168.0.189:31415/sync/corp-004
group.id=corp

-----

Script for creating the replication ("3 nodes in one group"):
use Sym_ReplTest_3NodesEachToEach;
GO
-- Clear and load SymmetricDS Configuration
delete from sym_conflict;
delete from sym_trigger_router;
delete from sym_trigger;
delete from sym_router;
delete from sym_channel where channel_id in ('forward', 'backward', 'peer');
delete from sym_node_group_link;
delete from sym_node_group;
delete from sym_node_host;
delete from sym_node_identity;
delete from sym_node_security;
delete from sym_node;
-- Channels
-- Channel "peer" for all tables
insert into sym_channel (channel_id, processing_order, max_batch_size, enabled, description)
values('peer', 1, 100000, 1, 'transactional data from one peer to another');
-- Node Groups
insert into sym_node_group (node_group_id) values ('corp');
-- Node Group Links
-- Sends changes to other peer
insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action)
values ('corp', 'corp', 'P');
-- Triggers
-- Triggers for tables on "peer" channel
insert into sym_trigger (trigger_id,source_table_name,channel_id,last_update_time,create_time)
values('RSTable1','RSTable1','peer',current_timestamp,current_timestamp);
-- Triggers sync_on_incoming_batch
update sym_trigger set sync_on_incoming_batch = 1
-- Routers
-- Router sends all data from one peer to another
insert into sym_router
(router_id,source_node_group_id,target_node_group_id,router_type,create_time,last_update_time)
values('corp_2_corp', 'corp', 'corp', 'default',current_timestamp, current_timestamp);
-- Trigger Routers
-- Send all items to all stores
insert into sym_trigger_router (trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('RSTable1','corp_2_corp', 100, current_timestamp, current_timestamp);
-- Conflicts
delete from sym_conflict;
insert into sym_conflict (conflict_id, target_channel_id, source_node_group_id, target_node_group_id, detect_type, resolve_type, ping_back, resolve_changes_only, resolve_row_only, create_time, last_update_time)
values ('conflict_corp_2_corp', 'peer', 'corp', 'corp', 'USE_CHANGED_DATA', 'FALLBACK', 'OFF', 0, 1, current_timestamp, current_timestamp); |
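To see which changes the triggers record while incoming batches are being loaded, the capture table can be inspected on any node. This is a minimal sketch, assuming the default `sym_` table prefix and that `sym_data.source_node_id` is filled in only for rows captured during a data load, while locally made changes leave it null.

```sql
-- Most recent captured changes for the test table (T-SQL syntax).
-- Assumes the default "sym_" prefix and the capture table sym_data.
select top (20)
       data_id,
       event_type,      -- I, U or D
       source_node_id,  -- expected: null for local changes, sending node id for loaded changes
       create_time
from sym_data
where table_name = 'RSTable1'
order by data_id desc;
```

A steady stream of new 'U' rows with a non-null source_node_id, long after the last manual update, would be consistent with the loop described in this ticket.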
|
If you set up a single node group, you do not need sync on incoming turned on, and you will achieve the same replication outcome: all data goes everywhere without looping. Sync on incoming should be used with caution, as it creates loops without proper setup. |
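Following this suggestion, sync on incoming could be switched off for the test trigger in the single-group configuration. The statements below are a sketch, assuming the trigger id 'RSTable1' from the scripts in this ticket; setting last_update_time is meant to let SymmetricDS notice that the trigger definition changed.

```sql
-- Turn off capture of changes that arrive through incoming batches.
update sym_trigger
set sync_on_incoming_batch = 0,
    last_update_time = current_timestamp
where trigger_id = 'RSTable1';

-- Check the resulting setting.
select trigger_id, channel_id, sync_on_incoming_batch
from sym_trigger
where trigger_id = 'RSTable1';
```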
SymmetricDS: 3.14 384691a8 2023-06-05 12:21:37 evan-miller-jumpmind |
0005314: Prevented infinite sync loop when sync_on_incoming_batch=1 in a master-to-master setup |
Affected Issues 0005314 |
|
mod - symmetric-core/src/main/java/org/jumpmind/symmetric/service/impl/TriggerRouterService.java |
Date Modified | Username | Field | Change |
---|---|---|---|
2022-05-30 11:04 | psergey | New Issue | |
2022-05-30 11:04 | psergey | Tag Attached: data sync | |
2022-05-30 11:04 | psergey | Tag Attached: looping | |
2022-06-01 17:14 | jvanmeter | Note Added: 0002084 | |
2022-06-01 17:16 | jvanmeter | Status | new => closed |
2022-06-01 17:16 | jvanmeter | Resolution | open => no change required |
2022-06-03 12:19 | psergey | Status | closed => feedback |
2022-06-03 12:19 | psergey | Resolution | no change required => reopened |
2022-06-03 12:19 | psergey | Note Added: 0002085 | |
2022-06-15 17:33 | josh-a-hicks | Note Added: 0002087 | |
2022-08-09 17:59 | elong | Tag Detached: looping | |
2023-06-05 12:17 | emiller | Assigned To | => emiller |
2023-06-05 12:17 | emiller | Status | feedback => assigned |
2023-06-05 12:22 | emiller | Status | assigned => resolved |
2023-06-05 12:22 | emiller | Resolution | reopened => fixed |
2023-06-05 12:22 | emiller | Fixed in Version | => 3.14.8 |
2023-06-05 12:23 | emiller | Target Version | => 3.14.8 |
2023-06-05 13:00 | | Changeset attached | => SymmetricDS 3.14 384691a8 |
2023-07-19 12:58 | admin | Status | resolved => closed |