Provide ability to replace root volume and reboot instances instead of full termination #8735

@vaietc

Description

What problem are you trying to solve?

tl;dr: We'd like drift correction on EC2 instances to optionally use root volume replacement plus a reboot, instead of always terminating and replacing the instance.

Use case 1: We have a set of stateful applications that run on EC2 instances with instance store volumes for better performance (relative to something like EBS volumes). We'd like to use Karpenter to manage the underlying capacity and resolve drift (respecting the application's PDB for replica loss), but preferably without replacing the full instance (i.e., terminating it and launching a new one). Replacing the full instance wipes the instance store, which leads to a very long startup time for stateful systems that need to load huge amounts of data from a snapshot. Instead, we'd like a configuration option in the EC2NodeClass that lets us use root volume replacement (for the OS volume) and perform a reboot when, for example, the AMI changes.
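To make the requested behavior concrete, here is a minimal sketch of the decision we have in mind. It is not Karpenter code: `NodeClassPolicy`, `allow_root_volume_replacement`, `DriftAction`, and `choose_drift_action` are all hypothetical names, and the rule that only AMI drift can be fixed in place is our assumption. On the AWS side, we would expect the in-place path to map to the EC2 `CreateReplaceRootVolumeTask` API.

```python
from dataclasses import dataclass
from enum import Enum


class DriftAction(Enum):
    # Keep the instance (and its instance store), swap the OS volume, reboot.
    REPLACE_ROOT_VOLUME = "replace-root-volume"
    # Current behavior: terminate the instance and launch a new one.
    REPLACE_INSTANCE = "replace-instance"


@dataclass(frozen=True)
class NodeClassPolicy:
    # Hypothetical opt-in flag; no such field exists in EC2NodeClass today.
    allow_root_volume_replacement: bool = False


# Assumption: only AMI drift can be corrected without a new instance.
IN_PLACE_FIELDS = frozenset({"ami"})


def choose_drift_action(policy: NodeClassPolicy, drifted_fields: set) -> DriftAction:
    """Pick a remediation for a node already known to be drifted.

    drifted_fields holds the names of settings that differ from the
    desired state, e.g. {"ami"} or {"ami", "subnet"}.
    """
    fixable_in_place = bool(drifted_fields) and drifted_fields <= IN_PLACE_FIELDS
    if policy.allow_root_volume_replacement and fixable_in_place:
        return DriftAction.REPLACE_ROOT_VOLUME
    # Anything else falls back to full instance replacement.
    return DriftAction.REPLACE_INSTANCE
```

With the flag left at its default, every drifted node is still replaced as it is today, so the option would be strictly opt-in. PDB handling is out of scope for this sketch; we'd expect it to apply to the reboot the same way it applies to termination.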

Use case 2: Outside of stateful applications, this would also benefit users who rely on on-demand capacity that is hard to acquire (e.g., rarer capacity types) and would prefer to reboot those instances instead of immediately terminating them.

How important is this feature to you?

We're working on a plan to migrate a number of our stateful systems running on EC2 to Kubernetes, and we'd like to have a proof of concept in the next 3 months or so. We'd be happy to contribute this feature if the proposal sounds appealing. We'd also like to hear about other options for managing capacity for large stateful systems like databases.

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments; they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Metadata

Labels: feature (New feature or request), needs-triage (Issues that need to be triaged)
