View Issue Details

ID: 0005314
Project: SymmetricDS
Category: Bug
View Status: public
Last Update: 2023-06-05 13:00
Reporter: psergey
Assigned To: emiller
Priority: normal
Status: closed
Resolution: fixed
Product Version: 3.13.4
Target Version: 3.14.8
Fixed in Version: 3.14.8
Summary: 0005314: Infinite synchronization loop when 3 nodes are connected to each other and sync_on_incoming_batch = 1
Description:
It is possible to fall into an infinite synchronization loop when 3 nodes are connected to each other and sync_on_incoming_batch = 1.
After a single row is updated in the database, the same data is added to the outgoing batches on each of the 3 nodes again and again (every few seconds).

I found a couple of ways to trigger the infinite loop.

Short reproduction steps follow (the full steps, with engine files and MSSQL 2017 scripts, are in the 'Steps To Reproduce' section below).

Common steps:
1. Configure three nodes (corp004, corp005 and corp006).
2. Create the test database Sym_ReplTest_3NodesEachToEach, create a replicated test table and insert one row into it.
3. Create bidirectional three-node replication (all 3 nodes are connected to each other) with sync_on_incoming_batch = 1.
4. Set the conflict resolution priority on the channel (set resolve_type to 'FALLBACK' or 'IGNORE' so that the nodes are prioritized in the order corp004 - corp005 - corp006).

First infinite loop scenario:
A. Update the test row on all three nodes simultaneously.

Result:
After this single row update (performed simultaneously on all three nodes), the same data is added to the outgoing batches on each of the 3 nodes again and again (every few seconds).

Second infinite loop scenario:
A. Block the SymmetricDS outgoing link from node4 to node5 (using a firewall).
B. Update the test row on node4, wait a few seconds and update the test row on node4 again (one update is not enough).
C. Unblock the SymmetricDS outgoing link from node4 to node5.

Result:
After the outgoing link is unblocked, the same data is added to the outgoing batches on each of the 3 nodes again and again (every few seconds).
Steps To Reproduce:
Full reproduction steps.

Common steps:

1. Configure three nodes (corp004, corp005 and corp006).

node corp004, engine/corp004-004.properties:

engine.name=corp004-004
db.driver=net.sourceforge.jtds.jdbc.Driver
db.url=jdbc:jtds:sqlserver://localhost:1433/Sym_ReplTest_3NodesEachToEach;useCursors=true;bufferMaxMemory=10240;lobBuffer=5242880
db.user=SymmetricDsUser
db.password=Password
db.validation.query=select 1
registration.url=http://192.168.0.187:31415/sync/corp005-005
sync.url=http://192.168.0.188:31415/sync/corp004-004
group.id=corp004
external.id=004
job.routing.period.time.ms=500
job.push.period.time.ms=1000
job.pull.period.time.ms=1000
auto.registration=true
initial.load.create.first=true
log.conflict.resolution=true
job.purge.incoming.cron=0 0 */4 * * *
job.purge.outgoing.cron=0 0 */4 * * *
purge.retention.minutes=120

node corp005, engine/corp005-005.properties:

engine.name=corp005-005
db.driver=net.sourceforge.jtds.jdbc.Driver
db.url=jdbc:jtds:sqlserver://localhost:1433/Sym_ReplTest_3NodesEachToEach;useCursors=true;bufferMaxMemory=10240;lobBuffer=5242880
db.user=SymmetricDsUser
db.password=Password
db.validation.query=select 1
registration.url=
sync.url=http://192.168.0.187:31415/sync/corp005-005
group.id=corp005
external.id=005
job.routing.period.time.ms=500
job.push.period.time.ms=1000
job.pull.period.time.ms=1000
auto.registration=true
initial.load.create.first=true
log.conflict.resolution=true
job.purge.incoming.cron=0 0 */4 * * *
job.purge.outgoing.cron=0 0 */4 * * *
purge.retention.minutes=120

node corp006, engine/corp006-006.properties:

engine.name=corp006-006
db.driver=net.sourceforge.jtds.jdbc.Driver
db.url=jdbc:jtds:sqlserver://localhost:1433/Sym_ReplTest_3NodesEachToEach;useCursors=true;bufferMaxMemory=10240;lobBuffer=5242880
db.user=SymmetricDsUser
db.password=Password
db.validation.query=select 1
registration.url=http://192.168.0.187:31415/sync/corp005-005
sync.url=http://192.168.0.189:31415/sync/corp006-006
group.id=corp006
external.id=006
job.routing.period.time.ms=500
job.push.period.time.ms=1000
job.pull.period.time.ms=1000
auto.registration=true
initial.load.create.first=true
log.conflict.resolution=true
job.purge.incoming.cron=0 0 */4 * * *
job.purge.outgoing.cron=0 0 */4 * * *
purge.retention.minutes=120

-----

2. Create the test database Sym_ReplTest_3NodesEachToEach, create a replicated test table and insert one row into it.

Script for creating table:

create table dbo.RSTable1 (
 id int not null,
 constraint PK__RSTable1 primary key (id),
 test_id uniqueidentifier not null,
 rectime datetime not null default getutcdate(),
 col1 int not null default 0)
go
insert dbo.RSTable1 (id, test_id) select 1, newid();
go

-----

3. Create bidirectional three-node replication (all 3 nodes are connected to each other) with sync_on_incoming_batch = 1.

Script for creating replication:

use Sym_ReplTest_3NodesEachToEach;
GO

-- Clear and load SymmetricDS Configuration

delete from sym_conflict;
delete from sym_trigger_router;
delete from sym_trigger;
delete from sym_router;
delete from sym_channel where channel_id in ('forward', 'backward', 'peer');
delete from sym_node_group_link;
delete from sym_node_group;
delete from sym_node_host;
delete from sym_node_identity;
delete from sym_node_security;
delete from sym_node;

-- Channels
-- Channel "peer" for all tables
insert into sym_channel
(channel_id, processing_order, max_batch_size, enabled, description)
values('peer', 1, 100000, 1, 'transactional data from one peer to another');

-- Node Groups
insert into sym_node_group (node_group_id) values ('corp004');
insert into sym_node_group (node_group_id) values ('corp005');
insert into sym_node_group (node_group_id) values ('corp006');

-- Node Group Links
-- Sends changes to other peer
insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action) values ('corp004', 'corp005', 'P');
insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action) values ('corp004', 'corp006', 'P');
insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action) values ('corp005', 'corp004', 'P');
insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action) values ('corp005', 'corp006', 'P');
insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action) values ('corp006', 'corp004', 'P');
insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action) values ('corp006', 'corp005', 'P');

-- Triggers
-- Triggers for tables on "peer" channel
insert into sym_trigger
(trigger_id,source_table_name,channel_id,last_update_time,create_time)
values('RSTable1','RSTable1','peer',current_timestamp,current_timestamp);

-- Triggers sync_on_incoming_batch
update sym_trigger set sync_on_incoming_batch = 1;

-- Routers
-- Router sends all data from one peer to another
insert into sym_router
(router_id,source_node_group_id,target_node_group_id,router_type,create_time,last_update_time)
values('corp004_2_corp005', 'corp004', 'corp005', 'default',current_timestamp, current_timestamp);

insert into sym_router
(router_id,source_node_group_id,target_node_group_id,router_type,create_time,last_update_time)
values('corp004_2_corp006', 'corp004', 'corp006', 'default',current_timestamp, current_timestamp);

insert into sym_router
(router_id,source_node_group_id,target_node_group_id,router_type,create_time,last_update_time)
values('corp005_2_corp004', 'corp005', 'corp004', 'default',current_timestamp, current_timestamp);

insert into sym_router
(router_id,source_node_group_id,target_node_group_id,router_type,create_time,last_update_time)
values('corp005_2_corp006', 'corp005', 'corp006', 'default',current_timestamp, current_timestamp);

insert into sym_router
(router_id,source_node_group_id,target_node_group_id,router_type,create_time,last_update_time)
values('corp006_2_corp004', 'corp006', 'corp004', 'default',current_timestamp, current_timestamp);

insert into sym_router
(router_id,source_node_group_id,target_node_group_id,router_type,create_time,last_update_time)
values('corp006_2_corp005', 'corp006', 'corp005', 'default',current_timestamp, current_timestamp);

-- Trigger Routers
-- Send all items to all stores
insert into sym_trigger_router
(trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('RSTable1','corp004_2_corp005', 100, current_timestamp, current_timestamp);

insert into sym_trigger_router
(trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('RSTable1','corp004_2_corp006', 100, current_timestamp, current_timestamp);

insert into sym_trigger_router
(trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('RSTable1','corp005_2_corp004', 100, current_timestamp, current_timestamp);

insert into sym_trigger_router
(trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('RSTable1','corp005_2_corp006', 100, current_timestamp, current_timestamp);

insert into sym_trigger_router
(trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('RSTable1','corp006_2_corp004', 100, current_timestamp, current_timestamp);

insert into sym_trigger_router
(trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('RSTable1','corp006_2_corp005', 100, current_timestamp, current_timestamp);

-----

4. Set the conflict resolution priority on the channel (set resolve_type to 'FALLBACK' or 'IGNORE' so that the nodes are prioritized in the order corp004 - corp005 - corp006).

-- Conflicts
delete from sym_conflict;

insert into sym_conflict (conflict_id, target_channel_id, source_node_group_id, target_node_group_id, detect_type, resolve_type, ping_back, resolve_changes_only, resolve_row_only, create_time, last_update_time)
values ('conflict_corp004_2_corp005', 'peer', 'corp004', 'corp005', 'USE_CHANGED_DATA', 'FALLBACK', 'OFF', 0, 1, current_timestamp, current_timestamp);

insert into sym_conflict (conflict_id, target_channel_id, source_node_group_id, target_node_group_id, detect_type, resolve_type, ping_back, resolve_changes_only, resolve_row_only, create_time, last_update_time)
values ('conflict_corp004_2_corp006', 'peer', 'corp004', 'corp006', 'USE_CHANGED_DATA', 'FALLBACK', 'OFF', 0, 1, current_timestamp, current_timestamp);

insert into sym_conflict (conflict_id, target_channel_id, source_node_group_id, target_node_group_id, detect_type, resolve_type, ping_back, resolve_changes_only, resolve_row_only, create_time, last_update_time)
values ('conflict_corp005_2_corp004', 'peer', 'corp005', 'corp004', 'USE_CHANGED_DATA', 'IGNORE', 'OFF', 0, 1, current_timestamp, current_timestamp);

insert into sym_conflict (conflict_id, target_channel_id, source_node_group_id, target_node_group_id, detect_type, resolve_type, ping_back, resolve_changes_only, resolve_row_only, create_time, last_update_time)
values ('conflict_corp005_2_corp006', 'peer', 'corp005', 'corp006', 'USE_CHANGED_DATA', 'FALLBACK', 'OFF', 0, 1, current_timestamp, current_timestamp);

insert into sym_conflict (conflict_id, target_channel_id, source_node_group_id, target_node_group_id, detect_type, resolve_type, ping_back, resolve_changes_only, resolve_row_only, create_time, last_update_time)
values ('conflict_corp006_2_corp004', 'peer', 'corp006', 'corp004', 'USE_CHANGED_DATA', 'IGNORE', 'OFF', 0, 1, current_timestamp, current_timestamp);

insert into sym_conflict (conflict_id, target_channel_id, source_node_group_id, target_node_group_id, detect_type, resolve_type, ping_back, resolve_changes_only, resolve_row_only, create_time, last_update_time)
values ('conflict_corp006_2_corp005', 'peer', 'corp006', 'corp005', 'USE_CHANGED_DATA', 'IGNORE', 'OFF', 0, 1, current_timestamp, current_timestamp);

-----

First infinite loop scenario:
A. Update the test row on all three nodes simultaneously.

SQL request:

use Sym_ReplTest_3NodesEachToEach;
go
set nocount on;
go
declare
 @col1 int = 0, -- server id
 @test_id uniqueidentifier, @maxid int;
if @@SERVERNAME = 'corp004' set @col1=4;
if @@SERVERNAME = 'corp005' set @col1=5;
if @@SERVERNAME = 'corp006' set @col1=6;
set @test_id = newid();
update dbo.RSTable1 set test_id=@test_id, rectime=default, col1=@col1 where id=1;
go

Result:
After this single row update (performed simultaneously on all three nodes), the same data is added to the outgoing batches on each of the 3 nodes again and again (every few seconds).
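
One way to observe this from the database side is to look at what the SymmetricDS triggers capture for the test table. The query below is only a sketch: it assumes the default sym_ table prefix and the standard sym_data columns (data_id, event_type, source_node_id, create_time). Run it on any of the three nodes several times in a row. Rows with a non-null source_node_id were captured while an incoming batch was being loaded, so a steadily growing list of such rows for the single test row would match the result described above.

```sql
use Sym_ReplTest_3NodesEachToEach;
go
-- Sketch: list the most recent changes captured for the test table.
-- Assumption: default "sym_" prefix and standard sym_data columns.
-- source_node_id is filled in when the change was applied by the
-- SymmetricDS data loader rather than by a local user statement.
select top 20 data_id, event_type, source_node_id, create_time
from sym_data
where table_name = 'RSTable1'
order by data_id desc;
go
```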

-----

Second infinite loop scenario:
A. Block the SymmetricDS outgoing link from node4 to node5 (use a firewall to block connections from IP 192.168.0.188 to IP 192.168.0.187, port 31415).

B. Update the test row on node4, wait a few seconds and update the test row on node4 again (one update is not enough).

SQL request:

use Sym_ReplTest_3NodesEachToEach;
go
set nocount on;
go
declare
 @col1 int = 4, -- server id
 @test_id uniqueidentifier, @maxid int;
set @test_id = newid();
update dbo.RSTable1 set test_id=@test_id, rectime=default, col1=@col1 where id=1;
go

C. Unblock the SymmetricDS outgoing link from node4 to node5.

Result:
After the outgoing link is unblocked, the same data is added to the outgoing batches on each of the 3 nodes again and again (every few seconds).
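
To confirm that batches keep being created after the link is unblocked, the outgoing batch table can be polled on each node. This is a sketch that assumes the standard sym_outgoing_batch columns (node_id, channel_id, status, create_time); the one-minute window is an arbitrary value chosen for illustration.

```sql
use Sym_ReplTest_3NodesEachToEach;
go
-- Sketch: count 'peer' channel batches created during the last minute,
-- grouped by target node and batch status.
select node_id, status, count(*) as batch_count, max(create_time) as newest_batch
from sym_outgoing_batch
where channel_id = 'peer'
  and create_time >= dateadd(minute, -1, current_timestamp)
group by node_id, status
order by node_id, status;
go
```

If the query still returns rows on every node long after the last manual update, the nodes are most likely still exchanging the same row.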
Tags: data sync

Activities

jvanmeter

2022-06-01 17:14

developer   ~0002084

This is documented in the Sync on Incoming section of the documentation found here:

https://www.symmetricds.org/doc/3.13/html/user-guide.html#_table_triggers

This appears to be working as intended. For what you are trying to achieve, Master to Master might be the correct configuration setup. This is achieved by having only one group, compared to the 3 that get you the infinite loop.

psergey

2022-06-03 12:19

reporter   ~0002085

The same infinite synchronization loop occurs when the 3 nodes are in one group.
The steps to reproduce are the same for both scenarios from the "Description" section. The engine configuration files and replication scripts are below.

If you think the "one group" case requires a separate ticket, comment on or close this ticket and I will create a new one.

It is impossible to prioritize nodes within a single group; there is only one row in the sym_conflict table, with resolution type 'FALLBACK':
insert into sym_conflict (conflict_id, target_channel_id, source_node_group_id, target_node_group_id, detect_type, resolve_type, ping_back, resolve_changes_only, resolve_row_only, create_time, last_update_time)
values ('conflict_corp_2_corp', 'peer', 'corp', 'corp', 'USE_CHANGED_DATA', 'FALLBACK', 'OFF', 0, 1, current_timestamp, current_timestamp);

I cannot choose another automatic resolution type if I need to be sure that every node has the same data after a conflict.
When a row is updated simultaneously on several nodes, resolution type 'IGNORE' leads to inconsistency by design, and resolution type 'NEWER_WINS' also leads to inconsistency (see the comment on ticket 0005312: https://www.symmetricds.org/issues/view.php?id=5312#c2083; I have checked the case from that ticket with the nodes in one group).

-----

Changed lines in the engine files for the "3 nodes in one group" configuration (compared to the engine files in the "Steps To Reproduce" section):

node corp004, engine/corp-004.properties:
engine.name=corp-004
registration.url=http://192.168.0.187:31415/sync/corp-005
sync.url=http://192.168.0.188:31415/sync/corp-004
group.id=corp

node corp005, engine/corp-005.properties:
engine.name=corp-005
sync.url=http://192.168.0.187:31415/sync/corp-005
group.id=corp

node corp006, engine/corp-006.properties:
engine.name=corp-006
registration.url=http://192.168.0.187:31415/sync/corp-005
sync.url=http://192.168.0.189:31415/sync/corp-006
group.id=corp

-----

Script for creating replication ("3 nodes in one group"):

use Sym_ReplTest_3NodesEachToEach;
GO

-- Clear and load SymmetricDS Configuration

delete from sym_conflict;

delete from sym_trigger_router;
delete from sym_trigger;
delete from sym_router;
delete from sym_channel where channel_id in ('forward', 'backward', 'peer');
delete from sym_node_group_link;
delete from sym_node_group;
delete from sym_node_host;
delete from sym_node_identity;
delete from sym_node_security;
delete from sym_node;

-- Channels
-- Channel "peer" for all tables
insert into sym_channel
(channel_id, processing_order, max_batch_size, enabled, description)
values('peer', 1, 100000, 1, 'transactional data from one peer to another');

-- Node Groups
insert into sym_node_group (node_group_id) values ('corp');

-- Node Group Links
-- Sends changes to other peer
insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action) values ('corp', 'corp', 'P');

-- Triggers
-- Triggers for tables on "peer" channel
insert into sym_trigger
(trigger_id,source_table_name,channel_id,last_update_time,create_time)
values('RSTable1','RSTable1','peer',current_timestamp,current_timestamp);

-- Triggers sync_on_incoming_batch
update sym_trigger set sync_on_incoming_batch = 1;

-- Routers
-- Router sends all data from one peer to another
insert into sym_router
(router_id,source_node_group_id,target_node_group_id,router_type,create_time,last_update_time)
values('corp_2_corp', 'corp', 'corp', 'default',current_timestamp, current_timestamp);

-- Trigger Routers
-- Send all items to all stores
insert into sym_trigger_router
(trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('RSTable1','corp_2_corp', 100, current_timestamp, current_timestamp);

-- Conflicts
delete from sym_conflict;
insert into sym_conflict (conflict_id, target_channel_id, source_node_group_id, target_node_group_id, detect_type, resolve_type, ping_back, resolve_changes_only, resolve_row_only, create_time, last_update_time)
values ('conflict_corp_2_corp', 'peer', 'corp', 'corp', 'USE_CHANGED_DATA', 'FALLBACK', 'OFF', 0, 1, current_timestamp, current_timestamp);

josh-a-hicks

2022-06-15 17:33

developer   ~0002087

If you set up a single node group, you will not need sync on incoming turned on, and you will achieve the same replication outcome. All data will go everywhere without the looping. Sync on incoming should be used with caution, as it will create looping without proper setup.
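
For the single-group configuration posted above, this suggestion amounts to switching the flag off again on the trigger. A minimal sketch, assuming the RSTable1 trigger from the reporter's script; updating last_update_time is included on the assumption that SymmetricDS relies on it to notice the changed trigger definition.

```sql
use Sym_ReplTest_3NodesEachToEach;
go
-- Sketch: stop capturing changes that arrive through incoming batches
-- for the test trigger. Assumes trigger_id 'RSTable1' as in the script above.
update sym_trigger
set sync_on_incoming_batch = 0,
    last_update_time = current_timestamp
where trigger_id = 'RSTable1';
go
```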

Related Changesets

SymmetricDS: 3.14 384691a8

2023-06-05 12:21:37

evan-miller-jumpmind

0005314: Prevented infinite sync loop when sync_on_incoming_batch=1 in a master-to-master setup
Affected Issues: 0005314
mod - symmetric-core/src/main/java/org/jumpmind/symmetric/service/impl/TriggerRouterService.java

Issue History

Date Modified Username Field Change
2022-05-30 11:04 psergey New Issue
2022-05-30 11:04 psergey Tag Attached: data sync
2022-05-30 11:04 psergey Tag Attached: looping
2022-06-01 17:14 jvanmeter Note Added: 0002084
2022-06-01 17:16 jvanmeter Status new => closed
2022-06-01 17:16 jvanmeter Resolution open => no change required
2022-06-03 12:19 psergey Status closed => feedback
2022-06-03 12:19 psergey Resolution no change required => reopened
2022-06-03 12:19 psergey Note Added: 0002085
2022-06-15 17:33 josh-a-hicks Note Added: 0002087
2022-08-09 17:59 elong Tag Detached: looping
2023-06-05 12:17 emiller Assigned To => emiller
2023-06-05 12:17 emiller Status feedback => assigned
2023-06-05 12:22 emiller Status assigned => resolved
2023-06-05 12:22 emiller Resolution reopened => fixed
2023-06-05 12:22 emiller Fixed in Version => 3.14.8
2023-06-05 12:23 emiller Target Version => 3.14.8
2023-06-05 13:00 Changeset attached => SymmetricDS 3.14 384691a8
2023-07-19 12:58 admin Status resolved => closed