Bulk import times scale with the number of tablets in a table. #5201

keith-turner · 2024-12-19T20:19:54Z

Describe the bug

When bulk importing into N tablets, the bulk import v2 code scans all tablets in the metadata table between the minimum and maximum tablet being imported into. For example, when importing into 10 tablets of a table with 100K tablets, it is possible that the bulk import scans all 100K tablets. Whether it does depends on where the minimum and maximum of the 10 tablets fall within the 100K.
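To make the cost concrete, here is a minimal sketch of the arithmetic, not the actual bulk import code. It assumes tablets can be modeled by their position in the table's sorted order, and the class and method names (`BulkScanRangeSketch`, `tabletsScanned`) are hypothetical.

```java
import java.util.Collections;
import java.util.List;

// Illustrative model only: tablets are identified by their index in sorted
// order. This is NOT Accumulo code, and the names are made up for this sketch.
public class BulkScanRangeSketch {

  // Number of tablets read by one contiguous scan from the lowest to the
  // highest tablet being imported into (inclusive).
  public static long tabletsScanned(List<Integer> targetTabletIndexes) {
    if (targetTabletIndexes.isEmpty()) {
      return 0;
    }
    int min = Collections.min(targetTabletIndexes);
    int max = Collections.max(targetTabletIndexes);
    return (long) max - min + 1;
  }

  public static void main(String[] args) {
    // Three target tablets near both ends of a 100K tablet table.
    List<Integer> spread = List.of(5, 50_000, 99_990);
    // Three adjacent target tablets.
    List<Integer> adjacent = List.of(40, 41, 42);

    System.out.println("spread targets scan " + tabletsScanned(spread) + " tablets");
    System.out.println("adjacent targets scan " + tabletsScanned(adjacent) + " tablets");
  }
}
```

In this model the scan cost is set by the distance between the first and last target tablet, so the same number of targets can cost anywhere from N tablets to nearly the whole table.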

Expected behavior

Ideally, the amount of scanning done would be directly related to the number of tablets being bulk imported into and not the number of tablets in the table. This would be a large change to the way the code works. A good first step would be to add some logging to the current code that captures how much time this behavior is wasting. Further decisions about improving the code could then be based on that data.
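One possible shape for that logging is sketched below. It assumes the scan loop can report, per tablet, whether the tablet receives any files and how long reading it took. The class `BulkScanStatsSketch` and its methods are hypothetical and not part of the existing code.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

// Hypothetical accumulator for the suggested logging. It does not use any
// Accumulo API. The caller is assumed to invoke recordTablet once per
// metadata tablet visited during the bulk import scan.
public class BulkScanStatsSketch {
  private long tabletsScanned;
  private long tabletsImportedInto;
  private long scanNanos;

  // receivesFiles: true if the import loads at least one file into this tablet.
  // nanosSpent: time spent reading this tablet's metadata entry.
  public void recordTablet(boolean receivesFiles, long nanosSpent) {
    tabletsScanned++;
    if (receivesFiles) {
      tabletsImportedInto++;
    }
    scanNanos += nanosSpent;
  }

  // Tablets that were read but received no files, i.e. the wasted work.
  public long skippedTablets() {
    return tabletsScanned - tabletsImportedInto;
  }

  // One-line summary suitable for a log message at the end of the scan.
  public String summary() {
    return String.format(Locale.ROOT,
        "scanned=%d importedInto=%d skipped=%d scanMillis=%d",
        tabletsScanned, tabletsImportedInto, skippedTablets(),
        TimeUnit.NANOSECONDS.toMillis(scanNanos));
  }

  public static void main(String[] args) {
    BulkScanStatsSketch stats = new BulkScanStatsSketch();
    long start = System.nanoTime();
    // Stand-in for the real scan: pretend every 1000th tablet receives files.
    for (int i = 0; i < 10_000; i++) {
      long now = System.nanoTime();
      stats.recordTablet(i % 1000 == 0, now - start);
      start = now;
    }
    System.out.println(stats.summary());
  }
}
```

Comparing `skipped` with `importedInto` across real imports would show how often the scan range is much larger than the set of tablets that receive files.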

keith-turner · 2024-12-19T20:20:20Z

This applies to the bulk v2 code; not sure if it applies to the bulk v1 code.

keith-turner added the bug label (This issue has been verified to be a bug.) on Dec 19, 2024
