View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0006616 | SymmetricDS | Improvement | public | 2024-10-11 12:54 | 2024-10-30 22:00 |
Reporter | pbelov | Assigned To | pbelov | ||
Priority | normal | ||||
Status | resolved | Resolution | fixed | ||
Product Version | 3.15.0 | ||||
Target Version | 3.16.0 | Fixed in Version | 3.16.0 | ||
Summary | 0006616: Save column references as numeric values for faster look-up in AbstractDatabaseWriter.getRowData() | ||||
Description | Background: AbstractDatabaseWriter.getRowData() prepares an array of string values for import into the target table. It does so by looking up source columns by name and copying values from the source array to the target array. Currently getRowData() searches for the source columns (to copy data from) for every single data row, which adds about 1 second of overhead per 500K rows. The problem was originally discovered while working on bulk load, but it affects all load batches. Proposed solution: save column references as numeric values (source column 1 ==> target column 1) and reuse these numeric references to copy values from source to target, skipping the column name look-ups for every row of data after the first. | ||||
Steps To Reproduce | Unit test AbstractDatabaseWriterTest illustrates this issue. Specifically, testGetRowData_LotsOfRandomAndFewSkippedColumns() can target either the current or the new implementation to capture run times. Current way: rowData = abstractDatabaseWriter.getRowDataOld(csvData, CsvData.ROW_DATA); New way: rowData = abstractDatabaseWriter.getRowDataNew(csvData, CsvData.ROW_DATA); | ||||
Additional Information | Given: S = number of columns in the source table, T = number of columns in the target table, N = number of rows in the data load batch. Current algorithm cost: O(S * T * N). Proposed algorithm cost: O(S * T) + O(N). The O(S * T) term is paid once, when the column references are resolved for the first row, so for large N the proposed cost grows linearly in N. | ||||
Tags | initial/partial load, performance | ||||
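
The proposal in the Description can be illustrated with a minimal Java sketch. This is not the code from TableColumnSourceReferences or AbstractDatabaseWriter; the class name ColumnIndexCacheSketch, its method names, the case-insensitive name matching, and the use of -1 for a target column with no matching source column are all assumptions made for illustration. It contrasts a per-row name search with a mapping of numeric column references that is resolved once and then reused for every row.

```java
import java.util.Arrays;

// Hypothetical illustration only; names and matching rules are assumptions,
// not the SymmetricDS implementation.
public class ColumnIndexCacheSketch {

    // Per-row name lookup: for each target column, scan the source column
    // names. Doing this for every row costs roughly S * T comparisons per row.
    static String[] copyByName(String[] sourceNames, String[] targetNames, String[] sourceValues) {
        String[] targetValues = new String[targetNames.length];
        for (int t = 0; t < targetNames.length; t++) {
            for (int s = 0; s < sourceNames.length; s++) {
                if (targetNames[t].equalsIgnoreCase(sourceNames[s])) {
                    targetValues[t] = sourceValues[s];
                    break;
                }
            }
        }
        return targetValues;
    }

    // One-time resolution: element t holds the index of the source column
    // that feeds target column t, or -1 when no source column matches.
    static int[] buildSourceIndexes(String[] sourceNames, String[] targetNames) {
        int[] sourceIndexes = new int[targetNames.length];
        Arrays.fill(sourceIndexes, -1);
        for (int t = 0; t < targetNames.length; t++) {
            for (int s = 0; s < sourceNames.length; s++) {
                if (targetNames[t].equalsIgnoreCase(sourceNames[s])) {
                    sourceIndexes[t] = s;
                    break;
                }
            }
        }
        return sourceIndexes;
    }

    // Per-row copy using the saved numeric references; no name comparisons.
    static String[] copyByIndex(int[] sourceIndexes, String[] sourceValues) {
        String[] targetValues = new String[sourceIndexes.length];
        for (int t = 0; t < sourceIndexes.length; t++) {
            int s = sourceIndexes[t];
            if (s >= 0) {
                targetValues[t] = sourceValues[s];
            }
        }
        return targetValues;
    }

    public static void main(String[] args) {
        String[] sourceNames = {"id", "name", "extra"};
        String[] targetNames = {"NAME", "id", "missing"};
        String[][] rows = {{"1", "alice", "x"}, {"2", "bob", "y"}};

        // Resolve the references once, then reuse them for every row.
        int[] sourceIndexes = buildSourceIndexes(sourceNames, targetNames);
        for (String[] row : rows) {
            System.out.println(Arrays.toString(copyByIndex(sourceIndexes, row)));
        }
    }
}
```

In this sketch the name search runs once per batch instead of once per row, which is the source of the saving described under Additional Information. A real implementation would also need to decide when the saved references become stale, for example when the source or target table definition changes between batches.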
SymmetricDS: 3.16 649524e5 2024-10-30 21:58:42 Committer: GitHub |
6616: Save column references as numeric values for faster look-ups in AbstractDatabaseWriter.getRowData (#205) * New TableColumnSourceReferences class and unit test to store column lookups in AbstractDatabaseWriter.getRowData() * AbstractDatabaseWriterTest unit test. |
Affected Issues 0006616 |
|
add - symmetric-db/src/main/java/org/jumpmind/db/model/TableColumnSourceReferences.java |
mod - symmetric-io/src/main/java/org/jumpmind/symmetric/io/data/writer/AbstractDatabaseWriter.java |
add - symmetric-io/src/test/java/org/jumpmind/symmetric/io/data/writer/AbstractDatabaseWriterTest.java |
Date Modified | Username | Field | Change |
---|---|---|---|
2024-10-11 12:54 | pbelov | New Issue | |
2024-10-11 12:54 | pbelov | Status | new => assigned |
2024-10-11 12:54 | pbelov | Assigned To | => pbelov |
2024-10-11 12:54 | pbelov | Tag Attached: initial/partial load | |
2024-10-11 12:54 | pbelov | Tag Attached: performance | |
2024-10-11 14:26 | elong | Description Updated | |
2024-10-11 14:27 | elong | Description Updated | |
2024-10-11 16:06 | pbelov | Additional Information Updated | View Revisions |
2024-10-11 16:06 | pbelov | Note Added: 0002500 | |
2024-10-29 17:31 | pbelov | Status | assigned => resolved |
2024-10-29 17:31 | pbelov | Fixed in Version | => 3.16.0 |
2024-10-29 17:32 | pbelov | Tag Attached: load only | |
2024-10-29 17:32 | pbelov | Tag Detached: load only | |
2024-10-29 17:33 | pbelov | Note Edited: 0002500 | |
2024-10-30 22:00 | pbelov | Changeset attached | => SymmetricDS 3.16 649524e5 |