Wednesday, 5 August 2015

FAST-START FAILOVER (FSF)


The following conditions must be met before you can use the broker:

The primary and standby databases must be on the same version.
You must use an SPFILE so that the broker can persistently reconcile values between broker properties and the related initialization parameters.
The DG_BROKER_START parameter must be set to TRUE.
For RAC, the broker configuration files (DG_BROKER_CONFIG_FILEn) should be placed on shared storage.
Oracle Net Services network files must be set up on the primary database and on the standby database.
To enable DGMGRL to restart instances during the course of broker operations, a service with a specific name must be statically registered with the local listener of each instance.
The primary database must be opened in ARCHIVELOG mode.
The COMPATIBLE initialization parameter must be set to the same value on all systems.
Flashback Database should be enabled for fast-start failover.
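
Some of these prerequisites can be checked from SQL*Plus. The sketch below is only an illustration: prereq_check_sql is a made-up helper name, and it just prints the statements so that you can review them before running anything. It assumes the LOG_MODE and FLASHBACK_ON columns of V$DATABASE and the SQL*Plus SHOW PARAMETER command.

```shell
#!/bin/sh
# prereq_check_sql: hypothetical helper that prints SQL*Plus statements for
# checking some of the broker / fast-start failover prerequisites.
# It only prints text; nothing is executed against a database here.
prereq_check_sql() {
    cat <<'EOF'
-- ARCHIVELOG mode and Flashback Database status
SELECT log_mode, flashback_on FROM v$database;
-- The broker must be started and an SPFILE must be in use (non-empty value)
SHOW PARAMETER dg_broker_start
SHOW PARAMETER spfile
-- Must be the same on every database in the configuration
SHOW PARAMETER compatible
EOF
}
```

To use it, pipe the output into SQL*Plus on each database, for example: prereq_check_sql | sqlplus -s "/ as sysdba"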


Enabling Fast-Start Failover

If you have more than one standby database, you must select the one that will be the target of fast-start failover.


FAST START FAILOVER CONFIGURATION

DGMGRL> show fast_start failover

Fast-Start Failover: DISABLED

  Threshold:        180 seconds
  Target:           (none)
  Observer:         (none)
  Lag Limit:        30 seconds
  Shutdown Primary: TRUE
  Auto-reinstate:   TRUE

Configurable Failover Conditions
  Health Conditions:
    Corrupted Controlfile          YES
    Corrupted Dictionary           YES
    Inaccessible Logfile            NO
    Stuck Archiver                  NO
    Datafile Offline               YES

  Oracle Error Conditions:
    (none)

ASSIGNING FAILOVER TARGET

DGMGRL> EDIT DATABASE 'BHUVAN_A' SET PROPERTY FastStartFailoverTarget = 'BHUVAN_B';
Property "faststartfailovertarget" updated

DGMGRL> EDIT DATABASE 'BHUVAN_B' SET PROPERTY FastStartFailoverTarget = 'BHUVAN_A';
Property "faststartfailovertarget" updated

DGMGRL> show fast_start failover

Note: When the configuration has only one standby database, there is no need to specify the target.

SETTING PROTECTION MODE


DGMGRL> SHOW DATABASE 'BHUVAN_A' 'LogXptStatus';
LOG TRANSPORT STATUS
PRIMARY_INSTANCE_NAME STANDBY_DATABASE_NAME               STATUS
               FE1_1                BHUVAN_B
               FE1_2                BHUVAN_B

To display the protection mode:

DGMGRL> show configuration

Configuration - DG_BHUVAN

  Protection Mode: MaxAvailability
  Databases:
    BHUVAN_A - Primary database
    BHUVAN_B - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

Enable maximum availability mode or maximum performance mode.

DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MaxAvailability;

Note: 1) If you cannot tolerate any loss of data, then ensure that the configuration protection mode is set to maximum availability. To do this, the LogXptMode database property for both the primary and target standby database must be set to SYNC.
DGMGRL> EDIT DATABASE 'BHUVAN_A' SET PROPERTY LogXptMode=SYNC;
DGMGRL> EDIT DATABASE 'BHUVAN_B' SET PROPERTY LogXptMode=SYNC;

     2) If you can tolerate some data loss, you can use maximum performance mode and set FastStartFailoverLagLimit. This property specifies the amount of data, in seconds, that the target standby database can lag behind the primary database in terms of redo applied. If the configuration is currently in maximum availability mode, lower the protection mode before switching LogXptMode to ASYNC.

DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MaxPerformance;
DGMGRL> EDIT DATABASE 'BHUVAN_A' SET PROPERTY LogXptMode=ASYNC;
DGMGRL> EDIT DATABASE 'BHUVAN_B' SET PROPERTY LogXptMode=ASYNC;
DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverLagLimit=45;
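
The two choices can be wrapped in a small script. This is a rough sketch, not a tested tool: fsfo_mode_commands is a hypothetical helper that only prints DGMGRL commands for the chosen mode. The ordering is my assumption: transport is set to SYNC before the mode is raised, and the mode is lowered before switching to ASYNC, in case the configuration is currently in MaxAvailability. The default lag of 30 seconds matches the Lag Limit shown earlier.

```shell
#!/bin/sh
# fsfo_mode_commands MODE PRIMARY STANDBY [LAG_SECONDS]
# Hypothetical helper: prints the DGMGRL commands for one protection mode.
# It only prints text; pipe the output into dgmgrl to apply it.
fsfo_mode_commands() {
    mode=$1 primary=$2 standby=$3 lag=${4:-30}
    case $mode in
        MaxAvailability)
            # Redo transport is switched to SYNC before the mode is raised.
            printf "EDIT DATABASE '%s' SET PROPERTY LogXptMode=SYNC;\n" "$primary" "$standby"
            echo "EDIT CONFIGURATION SET PROTECTION MODE AS MaxAvailability;"
            ;;
        MaxPerformance)
            # Assumption: lower the mode first in case it is currently MaxAvailability.
            echo "EDIT CONFIGURATION SET PROTECTION MODE AS MaxPerformance;"
            printf "EDIT DATABASE '%s' SET PROPERTY LogXptMode=ASYNC;\n" "$primary" "$standby"
            echo "EDIT CONFIGURATION SET PROPERTY FastStartFailoverLagLimit=$lag;"
            ;;
        *)
            echo "unsupported protection mode: $mode" >&2
            return 1
            ;;
    esac
}
```

For example, fsfo_mode_commands MaxPerformance BHUVAN_A BHUVAN_B 45 prints the four commands for maximum performance mode with a 45 second lag limit.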


FAST START FAILOVER CONFIGURATION PROPERTIES

Fast-start failover will occur if both the observer and the target standby database lose connection to the primary database for the period of time specified by the FastStartFailoverThreshold configuration property.

DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold = '60';
Property "faststartfailoverthreshold" updated

Note: 1) Setting the threshold value for a RAC system [ID 1319917.1]
Check the CSS misscount value in the cluster environment:

$ crsctl get css misscount
CRS-4678: Successful get misscount 30 for Cluster Synchronization Services.

Add 30 seconds to the misscount value. My misscount is 30 seconds, so I am setting the threshold to 60.
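
The arithmetic can be scripted. A minimal sketch, assuming the crsctl output has the same wording as the CRS-4678 line above; parse_misscount and fsfo_threshold are names I made up for illustration, and the 30 second margin is the one used in this post.

```shell
#!/bin/sh
# parse_misscount: reads `crsctl get css misscount` output on stdin and
# prints the misscount value in seconds.
parse_misscount() {
    sed -n 's/.*misscount \([0-9][0-9]*\).*/\1/p'
}

# fsfo_threshold MISSCOUNT [MARGIN]: misscount plus a safety margin
# (30 seconds by default, the margin used in this post).
fsfo_threshold() {
    echo $(( $1 + ${2:-30} ))
}

# Example use on a cluster node (not run here):
#   mc=$(crsctl get css misscount | parse_misscount)
#   echo "EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold = '$(fsfo_threshold "$mc")';"
```

With a misscount of 30 the example prints the same EDIT CONFIGURATION command used above, with a threshold of 60.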

2) Setting FastStartFailoverPmyShutdown
If the FastStartFailoverPmyShutdown configuration property is set to TRUE, the primary database will shut down after FastStartFailoverThreshold seconds have elapsed if redo generation has been stalled and the primary database is unable to re-establish connectivity with either the observer or the target standby database.

DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverPmyShutdown = 'TRUE';
Property "faststartfailoverpmyshutdown" updated

DGMGRL> show fast_start failover;

Fast-Start Failover: DISABLED

  Threshold:        60 seconds
  Target:           (none)
  Observer:         (none)
  Lag Limit:        30 seconds
  Shutdown Primary: TRUE
  Auto-reinstate:   TRUE

Configurable Failover Conditions
  Health Conditions:
    Corrupted Controlfile          YES
    Corrupted Dictionary           YES
    Inaccessible Logfile            NO
    Stuck Archiver                  NO
    Datafile Offline               YES

  Oracle Error Conditions:
    (none)

3) Setting FastStartFailoverAutoReinstate
This configuration property causes the former primary database to be automatically reinstated if a fast-start failover was initiated because the primary database was either isolated or had crashed.

DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverAutoReinstate = 'TRUE';
Property "faststartfailoverautoreinstate" updated

4) Setting ObserverConnectIdentifier
This database property specifies how the observer should connect to and monitor the primary and standby databases. Set this property for the primary and target standby database if you want the observer to use a different connect identifier than the one used to ship redo data (that is, the connect identifier specified by the DGConnectIdentifier property).

DGMGRL> EDIT DATABASE 'DB_NAME' SET PROPERTY ObserverConnectIdentifier = ' ';


5) Enable additional fast-start failover conditions

Fast-start failover is done when both the observer and the standby cannot reach the primary after the configured time threshold (FastStartFailoverThreshold) has passed.
You can optionally indicate the database health conditions that should cause fast-start failover to occur.

The following conditions cause fast-start failover. As the SHOW FAST_START FAILOVER output shows, the first three health conditions are enabled by default, while Inaccessible Logfile and Stuck Archiver (4 and 5) stay disabled until you enable them.
1) A datafile is offline because of a write error (Datafile Offline)
2) Dictionary corruption of a critical database object (Corrupted Dictionary)
3) Control file damaged because of a disk error (Corrupted Controlfile)
4) LGWR is unable to write to any member of the log group because of an I/O error (Inaccessible Logfile)
5) The archiver is unable to archive a redo log because the device is full or unavailable (Stuck Archiver)
6) Primary to observer and primary to standby network failure
7) An instance crash occurs (single instance)
8) All instances of a RAC database crash
9) Shutdown abort of the primary
10) You can also specify an Oracle error number that should trigger fast-start failover. To start fast-start failover when error ORA-xxxxx is detected on the primary database, use the following command:

DGMGRL> ENABLE FAST_START FAILOVER CONDITION xxxxx;


DGMGRL> enable fast_start failover condition "Inaccessible Logfile";
Succeeded.

DGMGRL> enable fast_start failover condition "Stuck Archiver";
Succeeded.
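
If you enable conditions from a script, it helps to validate the name first. The sketch below is an illustration only: fsfo_condition_cmd is a hypothetical helper that accepts one of the five health condition names shown in the SHOW FAST_START FAILOVER output, or a plain ORA error number, and prints the matching command. The number 27102 in the usage note is just a sample value.

```shell
#!/bin/sh
# fsfo_condition_cmd CONDITION
# Hypothetical helper: prints the DGMGRL command that enables one
# fast-start failover condition. CONDITION is either a health condition
# name or an ORA error number without the "ORA-" prefix.
fsfo_condition_cmd() {
    case $1 in
        "Corrupted Controlfile"|"Corrupted Dictionary"|"Inaccessible Logfile"|"Stuck Archiver"|"Datafile Offline")
            printf 'ENABLE FAST_START FAILOVER CONDITION "%s";\n' "$1"
            ;;
        ''|*[!0-9]*)
            # Empty, or contains something other than digits: reject it.
            echo "unknown condition: $1" >&2
            return 1
            ;;
        *)
            printf 'ENABLE FAST_START FAILOVER CONDITION %s;\n' "$1"
            ;;
    esac
}
```

For example, fsfo_condition_cmd "Stuck Archiver" prints the command used above, and fsfo_condition_cmd 27102 prints the form for an Oracle error number.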

ENABLE FAST-START FAILOVER

DGMGRL> enable fast_start failover;
Enabled.

DGMGRL> show fast_start failover;

Fast-Start Failover: ENABLED

  Threshold:        60 seconds
  Target:           BHUVAN_B
  Observer:         (none)
  Lag Limit:        30 seconds (not in use)
  Shutdown Primary: TRUE
  Auto-reinstate:   TRUE

Configurable Failover Conditions
  Health Conditions:
    Corrupted Controlfile          YES
    Corrupted Dictionary           YES
    Inaccessible Logfile            NO
    Stuck Archiver                  NO
    Datafile Offline               YES

  Oracle Error Conditions:
    (none)

Start the Observer

DGMGRL must be installed on the observer computer (not on the database server). Use either of the following:
1) A complete Oracle Client Administrator installation
2) A full database installation

Prerequisites:
1. The configuration must be in maximum availability or maximum performance mode.
2. LogXptMode must be SYNC in maximum availability mode and ASYNC in maximum performance mode (11g).
3. Flashback Database must be enabled on the primary and standby.
4. tnsnames.ora must be configured on the observer.
5. A static service name must exist so that the observer can automatically restart databases.

You can start the observer before or after you enable fast-start failover. If fast-start failover is already enabled, the observer immediately begins monitoring the status and connections to the primary and target standby databases. If fast-start failover is not already enabled, the observer waits until fast-start failover gets enabled and then begins monitoring.

#!/bin/ksh
# startobserver: start the fast-start failover observer, logging to 11g_observer.log
dgmgrl -logfile 11g_observer.log << eof
connect sys/oracle@bhuvan
START OBSERVER;
eof

You can check on the Unix side whether the process is running:
ps -ef | grep filename
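
The same check can be made scriptable. A minimal sketch: observer_running is a made-up helper that reads ps output on standard input and succeeds if any line mentions dgmgrl. Note the limitation: it cannot tell an observer from any other DGMGRL session, so treat a match as a hint only.

```shell
#!/bin/sh
# observer_running: reads `ps -ef` output on stdin and returns success if
# at least one line mentions dgmgrl. The [d] bracket keeps a grep process
# that shows up in the ps listing from matching its own pattern.
observer_running() {
    grep -q '[d]gmgrl'
}

# Example use (not run here):
#   if ps -ef | observer_running; then
#       echo "dgmgrl process found"
#   else
#       echo "no dgmgrl process; the observer may be down"
#   fi
```

A monitoring job could call this every few minutes and rerun the start script when the check fails.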

Tips
1) Error “ORA-16820”
Error message: Fast-start failover observer is no longer observing this database
Solution: Check why the observer cannot contact this database. If the problem cannot be corrected, stop the current observer by connecting to the Data Guard configuration and issuing the DGMGRL "STOP OBSERVER" command. Then restart the observer.

2) To get more information about the configuration
DGMGRL> show configuration verbose;
DGMGRL> show database verbose 'DB_NAME';

3) OBSERVER CONFIGURATION
DGMGRL> START OBSERVER FILE=/oracle/observer/obs.dat;
If FILE is not specified, the current working directory is searched for a file named fsfo.dat.

4) To view FAST START FAIL OVER INFORMATION (PRIMARY & STANDBY)

SQL> SELECT FS_FAILOVER_STATUS,FS_FAILOVER_CURRENT_TARGET, db_unique_name, FS_FAILOVER_THRESHOLD, FS_FAILOVER_OBSERVER_PRESENT,FS_FAILOVER_OBSERVER_HOST
FROM V$DATABASE;

5) To view the reason for the FAST START FAILOVER

SQL> SELECT LAST_FAILOVER_TIME, LAST_FAILOVER_REASON
FROM V$FS_FAILOVER_STATS;
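
The V$DATABASE query in tip 4 can be turned into a simple readiness check. This is a sketch under assumptions: fsfo_ready is a hypothetical helper, and it treats fast-start failover as possible only when FS_FAILOVER_STATUS is SYNCHRONIZED (maximum availability) or TARGET UNDER LAG LIMIT (maximum performance) and FS_FAILOVER_OBSERVER_PRESENT is YES. Verify those status values against the documentation for your release.

```shell
#!/bin/sh
# fsfo_ready STATUS OBSERVER_PRESENT
# Hypothetical helper: succeeds when the two V$DATABASE values suggest that
# fast-start failover could happen right now.
# Assumption: SYNCHRONIZED and TARGET UNDER LAG LIMIT are the "ready" states.
fsfo_ready() {
    status=$1 observer=$2
    [ "$observer" = "YES" ] || return 1
    case $status in
        SYNCHRONIZED|"TARGET UNDER LAG LIMIT") return 0 ;;
        *) return 1 ;;
    esac
}

# Example use with values fetched from V$DATABASE (not run here):
#   fsfo_ready "SYNCHRONIZED" "YES" && echo "fast-start failover is possible"
```

Run the check with the values from both the primary and the target standby, since the query in tip 4 applies to both.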
