
SQL Server AlwaysOn Availability Group Failover Time

KruizerG 2020. 4. 9. 10:27

[Environment]

- OS: Windows Server 2016

- SQL Server: SQL Server 2017

[Summary]

  • In an AG environment, disabling NetBIOS on the listener's IP address resource was observed to shorten failover time.
  • The NetBIOS state of the VIP can be checked under the registry path below.

HKEY_LOCAL_MACHINE\0.Cluster\Resources\c0e0f13a-4dd9-45f3-8699-5a88f5eeee01\Parameters

[Test details]

1. Check the AG listener IP

2. Failover time with NetBIOS disabled on the AG listener (about 5 seconds)

Get-ClusterResource "tesetAG_192.168.1.30" | Set-ClusterParameter EnableNetBIOS 0

  • In this test, the change took effect without taking the resource offline and bringing it back online.

 

/* Check the cluster log */

  • Failover took about 5 seconds (09:46:43.015 to 09:46:47.367, roughly 4.4 seconds)

000013ec.0000192c::2020/04/07-09:46:43.015 INFO  [RCM] rcm::RcmApi::MoveGroup: (tesetAG, 2, 0, MoveType::Manual )

000013ec.00000d64::2020/04/07-09:46:43.021 INFO  [GUM] Node 2: Executing locally gumId: 27987, updates: 1, first action: /dm/update

000013ec.00000d64::2020/04/07-09:46:43.024 INFO  [GUM] Node 2: Executing locally gumId: 27988, updates: 1, first action: /dm/update

000013ec.0000192c::2020/04/07-09:46:43.469 INFO  [GUM] Node 2: Executing locally gumId: 27989, updates: 1, first action: /rcm/gum/GroupMoveOperation

........(omitted)

000013ec.00000dd8::2020/04/07-09:46:47.367 INFO  [RCM] Res tesetAG: OnlinePending -> Online( StateUnknown )

000013ec.00000dd8::2020/04/07-09:46:47.367 INFO  [RCM] TransitionToState(tesetAG) OnlinePending-->Online.

000013ec.00000dd8::2020/04/07-09:46:47.367 INFO  [RCM] rcm::RcmGroup::UpdateStateIfChanged: (tesetAG, Pending --> Online)

000013ec.00000dd8::2020/04/07-09:46:47.367 INFO  [RCM] rcm::RcmGroup::ProcessStateChange: [RCM] Group move for 'tesetAG' has completed.

000013ec.00000dd8::2020/04/07-09:46:47.367 INFO  [RCM] rcm::RcmGroup::ProcessStateChange: [RCM] DrainMgr: Not Sending message to Src node for group tesetAG after successful online

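3. Failover time with NetBIOS enabled on the AG listener (about 10 seconds)

/* Check the cluster log */

  • Failover took about 10 seconds (09:47:06.479 to 09:47:17.169, roughly 10.7 seconds)

-- Enable NetBIOS on the AG listener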

00000cc0.000019a4::2020/04/07-09:46:56.358 INFO  [RES] IP Address <tesetAG_192.168.1.30>: IpaValidatePrivateResProperties: attempting to change property EnableNetBIOS from 0 to 1.

 

-- Failover test

00000a18.000024c0::2020/04/07-09:47:06.479 INFO  [RCM] rcm::RcmApi::MoveGroup: (tesetAG, 1, 0, MoveType::Manual )

00000a18.000026fc::2020/04/07-09:47:06.480 INFO  [RCM-plcmt] Group tesetAG allowed to move to node 1 by filter NodeDownFilter

00000a18.000026fc::2020/04/07-09:47:06.480 INFO  [RCM-plcmt] Group tesetAG allowed to move to node 1 by filter NodeShuttingDownFilter

00000a18.000026fc::2020/04/07-09:47:06.480 INFO  [RCM-plcmt] Group tesetAG allowed to move to node 1 by filter CurrentNodeFilter

00000a18.000026fc::2020/04/07-09:47:06.480 INFO  [RCM-plcmt] Group tesetAG allowed to move to node 1 by filter PausedNodeFilter

00000a18.000026fc::2020/04/07-09:47:06.481 INFO  [RCM] Group tesetAG: done going through resources, returning true

00000a18.000026fc::2020/04/07-09:47:06.481 INFO  [RCM-plcmt] Group tesetAG allowed to move to node 1 by filter PossibleOwnerFilter

00000a18.000026fc::2020/04/07-09:47:06.481 INFO  [RCM-plcmt] Group tesetAG allowed to move to node 1 by filter QueuingPerviouslyRejectedFilter

00000a18.000026fc::2020/04/07-09:47:06.481 INFO  [RCM-plcmt] Group tesetAG allowed to move to node 1 by filter PreferredOwnerWaitFilter

00000a18.000026fc::2020/04/07-09:47:06.481 INFO  [RCM-plcmt] Group tesetAG allowed to move to node 1 by filter AntiAffinityFilter

00000a18.000028a0::2020/04/07-09:47:06.481 INFO  [RCM] applying timely STM connectivity snapshot for group tesetAG

00000a18.000026fc::2020/04/07-09:47:06.481 INFO  [RCM-plcmt] Group tesetAG allowed to move to node 1 by filter StmFilter

00000a18.000026fc::2020/04/07-09:47:06.481 INFO  [RCM-plcmt] Group tesetAG allowed to move to node 1 by filter DependentCsvSiteFilter

00000a18.000026fc::2020/04/07-09:47:06.481 INFO  [RCM-plcmt] Group tesetAG allowed to move to node 1 by filter CPUReservationFilter

00000a18.000026fc::2020/04/07-09:47:06.486 INFO  [GUM] Node 1: Executing locally gumId: 28014, updates: 1, first action: /dm/update

00000a18.000026fc::2020/04/07-09:47:06.486 INFO  [DM] Starting replica transaction, paxos: 29:29:33078, smartPtr: HDL( 448ffec00 ), internalPtr: HDL( 1251915d630 )

00000a18.000026fc::2020/04/07-09:47:06.487 INFO  [DM] Finished replica transaction, paxos: 29:29:33078, smartPtr: HDL( 448ffec00 ), internalPtr: HDL( 1251915d630 ), status: 0

00000a18.000026fc::2020/04/07-09:47:06.489 INFO  [GUM] Node 1: Executing locally gumId: 28015, updates: 1, first action: /dm/update

00000a18.000026fc::2020/04/07-09:47:06.489 INFO  [DM] Starting replica transaction, paxos: 29:29:33079, smartPtr: HDL( 448ffec00 ), internalPtr: HDL( 1251915d0e0 )

00000a18.000026fc::2020/04/07-09:47:06.490 INFO  [DM] Finished replica transaction, paxos: 29:29:33079, smartPtr: HDL( 448ffec00 ), internalPtr: HDL( 1251915d0e0 ), status: 0

00000a18.000026fc::2020/04/07-09:47:07.349 INFO  [GUM] Node 1: Executing locally gumId: 28016, updates: 1, first action: /rcm/gum/GroupMoveOperation

00000a18.000026fc::2020/04/07-09:47:07.349 INFO  [RCM] rcm::RcmGum::GroupMoveOperation(1)

00000a18.000026fc::2020/04/07-09:47:07.349 INFO  [RCM] move of group tesetAG from testsql02(2) to testsql01(1) of type MoveType::Manual is about to succeed, failoverCount=1, lastFailoverTime=2020/04/06-10:51:02.467 targeted=true

......(omitted)

00000a18.000026fc::2020/04/07-09:47:17.169 INFO  [RCM] Res tesetAG: OnlinePending -> Online( StateUnknown )

00000a18.000026fc::2020/04/07-09:47:17.169 INFO  [RCM] TransitionToState(tesetAG) OnlinePending-->Online.

00000a18.000026fc::2020/04/07-09:47:17.169 INFO  [RCM] rcm::RcmGroup::UpdateStateIfChanged: (tesetAG, Pending --> Online)

00000a18.000026fc::2020/04/07-09:47:17.169 INFO  [RCM] rcm::RcmGroup::ProcessStateChange: [RCM] Group move for 'tesetAG' has completed.

00000a18.000026fc::2020/04/07-09:47:17.169 INFO  [RCM] rcm::RcmGroup::ProcessStateChange: [RCM] DrainMgr: Not Sending message to Src node for group tesetAG after successful online
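
The two failover times can be recomputed from the log timestamps rather than read off by eye. The following is a minimal Python sketch, not part of the original test. It assumes the failover starts at the `rcm::RcmApi::MoveGroup` entry and ends at the "Group move ... has completed" entry. The helper names `parse_timestamp` and `failover_seconds` are hypothetical; the log lines are copied from the excerpts in this post.

```python
import re
from datetime import datetime

# Timestamp format used by the cluster log lines quoted in this post,
# e.g. "2020/04/07-09:46:43.015".
TS_PATTERN = re.compile(r"(\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})")
TS_FORMAT = "%Y/%m/%d-%H:%M:%S.%f"


def parse_timestamp(line):
    """Return the first timestamp embedded in one cluster log line."""
    match = TS_PATTERN.search(line)
    if match is None:
        raise ValueError("no timestamp found in line: " + line)
    return datetime.strptime(match.group(1), TS_FORMAT)


def failover_seconds(lines):
    """Seconds between the MoveGroup request and the 'has completed' entry.

    Assumption: these two entries mark the start and end of the failover.
    """
    start = next(l for l in lines if "rcm::RcmApi::MoveGroup" in l)
    end = next(l for l in lines if "has completed" in l)
    return (parse_timestamp(end) - parse_timestamp(start)).total_seconds()


# First and last relevant lines of each test, copied from the logs above.
netbios_disabled = [
    "000013ec.0000192c::2020/04/07-09:46:43.015 INFO  [RCM] rcm::RcmApi::MoveGroup: (tesetAG, 2, 0, MoveType::Manual )",
    "000013ec.00000dd8::2020/04/07-09:46:47.367 INFO  [RCM] rcm::RcmGroup::ProcessStateChange: [RCM] Group move for 'tesetAG' has completed.",
]
netbios_enabled = [
    "00000a18.000024c0::2020/04/07-09:47:06.479 INFO  [RCM] rcm::RcmApi::MoveGroup: (tesetAG, 1, 0, MoveType::Manual )",
    "00000a18.000026fc::2020/04/07-09:47:17.169 INFO  [RCM] rcm::RcmGroup::ProcessStateChange: [RCM] Group move for 'tesetAG' has completed.",
]

disabled_s = failover_seconds(netbios_disabled)
enabled_s = failover_seconds(netbios_enabled)
print(f"NetBIOS disabled: {disabled_s:.3f} s")
print(f"NetBIOS enabled:  {enabled_s:.3f} s")
```

With these lines the sketch reports about 4.35 seconds with NetBIOS disabled and about 10.69 seconds with it enabled, which matches the rough 5-second and 10-second figures above. Each figure comes from a single manual failover, so the difference is indicative only.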

 

[References]

https://techcommunity.microsoft.com/t5/failover-clustering/speeding-up-failover-tips-n-tricks/ba-p/372086