Avoid having tablet metadata in memory for queued compaction jobs #5188
Labels
enhancement
Is your feature request related to a problem? Please describe.
Currently, when a compaction job starts executing, it goes through the following process.
When many tablets have many files (which can happen when compactions have not run for some period of time), keeping many tablet metadata objects in memory can put significant memory pressure on the manager. This can lead to cascading failures: when the rest of the system is unhealthy, the manager becomes unhealthy as well, leaving it unable to work through a temporary problem.
Describe the solution you'd like
It's probably possible to stop keeping the tablet metadata in memory while a compaction job is queued. Memory usage would then scale with the number of files in compaction jobs instead of the number of files in tablets, which should be much smaller when tablets have accumulated many files. The following change would also make compaction reservation more efficient, since in some cases it would avoid reading tablet metadata and then submitting a conditional mutation.
This process would use less memory overall and would streamline compaction reservation, likely decreasing the time it takes to atomically reserve a set of files for compaction for a tablet. This change could also simplify the code.
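To illustrate the memory argument only, here is a minimal sketch of a queue entry that keeps just the tablet identifier and the files selected for the job, rather than the tablet's full metadata. The names `QueuedJob`, `JobQueue`, and `retainedFileCount`, and the use of plain strings for tablets and files, are hypothetical and chosen for illustration; they are not the manager's actual classes, and the sketch does not cover reservation.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;
import java.util.Set;

// Hypothetical queue entry: holds only what the job needs, not the tablet metadata.
final class QueuedJob {
    final String tabletId;       // hypothetical string id for the tablet
    final Set<String> jobFiles;  // only the files this job will compact

    QueuedJob(String tabletId, Set<String> jobFiles) {
        this.tabletId = tabletId;
        // Immutable copy, so the entry keeps no reference to the caller's collections.
        this.jobFiles = Set.copyOf(jobFiles);
    }
}

// Hypothetical in-memory queue of compaction jobs.
final class JobQueue {
    private final Queue<QueuedJob> jobs = new ArrayDeque<>();

    // allTabletFiles is used only to validate the selection; it is not retained.
    void add(String tabletId, List<String> allTabletFiles, Set<String> selected) {
        if (!allTabletFiles.containsAll(selected)) {
            throw new IllegalArgumentException("selected files must belong to the tablet");
        }
        jobs.add(new QueuedJob(tabletId, selected));
    }

    // Removes and returns the next queued job, or null if the queue is empty.
    QueuedJob poll() {
        return jobs.poll();
    }

    // Number of file references held by queued jobs; this is what memory scales with.
    int retainedFileCount() {
        int count = 0;
        for (QueuedJob job : jobs) {
            count += job.jobFiles.size();
        }
        return count;
    }
}
```

In this sketch, a tablet with four files and a job covering two of them contributes two file references to the queue, regardless of how many files the tablet has.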