-
Notifications
You must be signed in to change notification settings - Fork 447
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Mergeability column to support automatic merges #5187
base: main
Are you sure you want to change the base?
Conversation
This adds a new Mergeability column for marking if/when tablets are eligible to me merged by the system based on a threshold. The column stores a duration that is relative to the time the Manager uses, Steady time. There are 3 possible states for the value: 1) -1 : This means a tablet will never automatically merge 2) 0 : Tablet is eligible now to automatically merge 3) positive duration: eligible to merge after the given duration relative to the current system Steady time. Ie. the tablet can be merged if SteadyTime is later than the given duration. This change only adds the new column itself and populates it. The default is to never merge automatically for all cases except for when the system automatically splits tablets. In that case the newly split tablets are marked as being eligible to merge immediately. Future updates will add API enhancements to allow setting/viewing the mergeability setting as well as to enable automatic merging by the system that is based on this new column value. When automatic merging is enabled, if a user wants to make a tablet elgible to be merged in the future they would do so by adding a period of time to the current SteadyTime. For example, to make a tablet eligible to be merged 3 days from now the user would read the current SteadyTime value (represented as a duration), add 3 days to that and then store that new duration in the column. When the current steady time passes that duration the tablet would be eligible to be merged. See apache#5014 for more details
70dfb91
to
346f936
Compare
} else if (isNever()) { | ||
return "TabletMergeability=NEVER"; | ||
} | ||
return "TabletMergeability=AFTER:" + delay.toNanos(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to convert this to ms
or s
in the toString
method for ease of understanding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, not a bad idea, nanoseconds would be pretty tough to understand in a log.
core/src/main/java/org/apache/accumulo/core/client/admin/TabletMergeability.java
Outdated
Show resolved
Hide resolved
@@ -392,6 +393,8 @@ interface TabletUpdates<T> { | |||
|
|||
T putCloned(); | |||
|
|||
T putTabletMergeability(TabletMergeability tabletMergeability); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eventually we will need a steady time to evaluate if something should merge. Seems like the steady time should always be set when the mergability is set. Although its only needed when the duration is > 0. The steady time could be encoded in the same columns value.
T putTabletMergeability(TabletMergeability tabletMergeability); | |
T putTabletMergeability(TabletMergeability tabletMergeability, SteadyTime steadTime); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Steady time could be added in later PRs. Just wondering how it will be persisted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My original plan was to persist everything as one value by adding the current manager time to the value. So basically to compute the mergability duration you would read the current SteadyTime then add the offset to it and store it and then later you could compare to the current time by taking a diff.
However, I was thinking more about it and I think that storing it as two separate values makes sense because it allows you to do better debugging (logigng) plus you can see extra information such as when the value was created. It also allows doing other things like update/replacing the original SteadyTime, etc and other metric calculations if we keep it separate so I'll change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My original plan was to persist everything as one value by adding the current manager time to the value.
Ok I had thought this would store the exact duration specified by the user. Summing the steady time and duration from the user would work. There may be a slight advantage for keeping them separate for debugging purposes as mentioned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@keith-turner - The latest update adds a new class just to wrap the metadata and make it easier to validate and store the delay + the steady time as a separate value. The intent is that the the client will still only use and pass TabletMergeability because only the Manager will be touching TabletMergeabilityMetadata
as it just exists in Ample. A user would just set the TabletMergeability
and if a delay is set then the Manager will set the SteadyTime value when inserting TabletMergeabilityMetadata
.
server/manager/src/main/java/org/apache/accumulo/manager/tableOps/merge/MergeTablets.java
Outdated
Show resolved
Hide resolved
This update introduces TabletMergeabilityMetadata so that the steady time of the system can be stored with the delay value separately which can come in handy for debugging or if it needs to be modified. SteadyTime is only stored if the delay is > 0
Note: The following description as been updated now that SteadyTime is stored separately in the mergeability metadata
This adds a new Mergeability column for marking if/when tablets are eligible to me merged by the system based on a threshold. The column stores two values, a duration which is a delay for when a tablet can be merged that is relative to the time the Manager uses, Steady time. It also stores the current Steady time value when inserted so that later we can add the delay plus the original time when inserted to see if enough time has passed. The Steady Time value will only be stored if the delay is positive, otherwise it will be null as it won't be used.
There are 3 possible states for the delay value:
to the current system Steady time. Ie. the tablet can be merged if the current manager time is later than the delay value + the steady time value when inserted
This change only adds the new column itself and populates it. The default is to never merge automatically for all cases except for when the system automatically splits tablets. In that case the newly split tablets are marked as being eligible to merge immediately.
Future updates will add API enhancements to allow setting/viewing the mergeability setting as well as to enable automatic merging by the system that is based on this new column value.
When automatic merging is enabled, if a user wants to make a tablet eligible to be merged in the future they would do so by setting a delay that is positive. For example, to make a tablet eligible to be merged 3 days from now the user set a duration of 3 days in the API (future PR) and when the system inserts the value into metadata it will also include the current SteadyTime on creation. Later when the the current steady time passes that set delay + original stored steady time value the tablet would be eligible to be merged.
See #5014 for more details