)]}'
{"/COMMIT_MSG":[{"author":{"_account_id":1000002,"name":"Serge Bazanski","display_name":"Serge","email":"serge@monogon.tech","username":"serge","avatars":[{"url":"https://www.gravatar.com/avatar/52c41428b6369f2c02b9717425216f7d.jpg?d\u003didenticon\u0026r\u003dpg\u0026s\u003d32","height":32},{"url":"https://www.gravatar.com/avatar/52c41428b6369f2c02b9717425216f7d.jpg?d\u003didenticon\u0026r\u003dpg\u0026s\u003d56","height":56},{"url":"https://www.gravatar.com/avatar/52c41428b6369f2c02b9717425216f7d.jpg?d\u003didenticon\u0026r\u003dpg\u0026s\u003d100","height":100},{"url":"https://www.gravatar.com/avatar/52c41428b6369f2c02b9717425216f7d.jpg?d\u003didenticon\u0026r\u003dpg\u0026s\u003d120","height":120}]},"change_message_id":"cf476af55c952ed43bf5f79bb688465ab8618ee5","unresolved":true,"context_lines":[{"line_number":13,"context_line":"counted as a voting member by etcd."},{"line_number":14,"context_line":""},{"line_number":15,"context_line":"If the membership is removed, but then removing the role fails, the"},{"line_number":16,"context_line":"cluster ends up in an inconsistent state. If the affected node was the"},{"line_number":17,"context_line":"curator or etcd leader, that will almost certainly happen. In this case,"},{"line_number":18,"context_line":"the request can just be retried until it succeeds, and then the cluster"},{"line_number":19,"context_line":"state is consistent again between etcd membership and roles."},{"line_number":20,"context_line":""}],"source_content_type":"text/x-gerrit-commit-message","patch_set":1,"id":"11a97334_2eb91705","line":17,"range":{"start_line":16,"start_character":41,"end_line":17,"end_character":58},"updated":"2024-09-19 12:27:22.000000000","message":"I think we should immediately fail the request if it is being served by the curator/raft leader that the user is attempting to remove (and return e.g. FAILED_PRECONDITION). 
I don\u0027t think there\u0027s much sense in trying something that will obviously fail or is guaranteed to leave the cluster in an inconsistent state.\n\nThen we should have an RPC that allows the user to move both the raft leader (etcd maintenance.MoveLeader) and the Curator leader (clientv3 Election.Resign). Then metroctl/client code has all the tools necessary to at least _attempt_ a successful leader removal.\n\n(this can/should be a follow-up CL)","commit_id":"74604bb01ca92dbc52e2ece5b9b4e0fdfe3ef8e0"},{"author":{"_account_id":1000038,"name":"Jan Schär","display_name":"Jan","email":"jan@monogon.tech","username":"jan","avatars":[{"url":"https://www.gravatar.com/avatar/fd0e7f48847aa0e46c8f361df2d6c26b.jpg?d\u003didenticon\u0026r\u003dpg\u0026s\u003d32","height":32},{"url":"https://www.gravatar.com/avatar/fd0e7f48847aa0e46c8f361df2d6c26b.jpg?d\u003didenticon\u0026r\u003dpg\u0026s\u003d56","height":56},{"url":"https://www.gravatar.com/avatar/fd0e7f48847aa0e46c8f361df2d6c26b.jpg?d\u003didenticon\u0026r\u003dpg\u0026s\u003d100","height":100},{"url":"https://www.gravatar.com/avatar/fd0e7f48847aa0e46c8f361df2d6c26b.jpg?d\u003didenticon\u0026r\u003dpg\u0026s\u003d120","height":120}]},"change_message_id":"838107b959dddf194f27db3b05b105e3c1f28047","unresolved":true,"context_lines":[{"line_number":13,"context_line":"counted as a voting member by etcd."},{"line_number":14,"context_line":""},{"line_number":15,"context_line":"If the membership is removed, but then removing the role fails, the"},{"line_number":16,"context_line":"cluster ends up in an inconsistent state. If the affected node was the"},{"line_number":17,"context_line":"curator or etcd leader, that will almost certainly happen. 
In this case,"},{"line_number":18,"context_line":"the request can just be retried until it succeeds, and then the cluster"},{"line_number":19,"context_line":"state is consistent again between etcd membership and roles."},{"line_number":20,"context_line":""}],"source_content_type":"text/x-gerrit-commit-message","patch_set":1,"id":"742a2a11_8ae1b83c","line":17,"range":{"start_line":16,"start_character":41,"end_line":17,"end_character":58},"in_reply_to":"11a97334_2eb91705","updated":"2024-09-19 16:40:33.000000000","message":"The inconsistent state is not really a problem. It will get resolved by the background process, and we need to handle it anyway as it could also be caused by a machine failure during the operation. The main benefit of moving leadership before the operation is to avoid the etcd/curator downtime until the election timeout.\n\nI don\u0027t think we need to expose an RPC for moving leadership; this is an implementation detail that clients should not need to care about. Instead, I would have UpdateNodeRoles automatically move leadership and return Unavailable when trying to remove the ConsensusMember role from itself. Clients can then just retry the update RPC until it succeeds (same as it already works now).\n\nThe challenge with moving the leadership is that maintenance.MoveLeader can only be called on the etcd leader node, and etcd leadership is currently independent from curator leadership. 
We could solve that by making the curator leadership follow etcd leadership, which might have performance benefits as well.","commit_id":"74604bb01ca92dbc52e2ece5b9b4e0fdfe3ef8e0"},{"author":{"_account_id":1000002,"name":"Serge Bazanski","display_name":"Serge","email":"serge@monogon.tech","username":"serge","avatars":[{"url":"https://www.gravatar.com/avatar/52c41428b6369f2c02b9717425216f7d.jpg?d\u003didenticon\u0026r\u003dpg\u0026s\u003d32","height":32},{"url":"https://www.gravatar.com/avatar/52c41428b6369f2c02b9717425216f7d.jpg?d\u003didenticon\u0026r\u003dpg\u0026s\u003d56","height":56},{"url":"https://www.gravatar.com/avatar/52c41428b6369f2c02b9717425216f7d.jpg?d\u003didenticon\u0026r\u003dpg\u0026s\u003d100","height":100},{"url":"https://www.gravatar.com/avatar/52c41428b6369f2c02b9717425216f7d.jpg?d\u003didenticon\u0026r\u003dpg\u0026s\u003d120","height":120}]},"change_message_id":"21a5529d6138347cfb85298f3781115198a458fb","unresolved":false,"context_lines":[{"line_number":13,"context_line":"counted as a voting member by etcd."},{"line_number":14,"context_line":""},{"line_number":15,"context_line":"If the membership is removed, but then removing the role fails, the"},{"line_number":16,"context_line":"cluster ends up in an inconsistent state. If the affected node was the"},{"line_number":17,"context_line":"curator or etcd leader, that will almost certainly happen. In this case,"},{"line_number":18,"context_line":"the request can just be retried until it succeeds, and then the cluster"},{"line_number":19,"context_line":"state is consistent again between etcd membership and roles."},{"line_number":20,"context_line":""}],"source_content_type":"text/x-gerrit-commit-message","patch_set":1,"id":"fda3fd26_68120a15","line":17,"range":{"start_line":16,"start_character":41,"end_line":17,"end_character":58},"in_reply_to":"742a2a11_8ae1b83c","updated":"2024-09-24 11:16:26.000000000","message":"Acknowledged","commit_id":"74604bb01ca92dbc52e2ece5b9b4e0fdfe3ef8e0"}]}
