Failover Automático em Caso de Falha do Primary

Failover Automático

Neste penúltimo da série sobre MAA com Oracle RAC 11GR2 vou falar um pouco sobre como ocorre o failover automático em caso de falha do primary. Vou demonstrar que basicamente você não faz nada, todo o trabalho “sujo” será pelo próprio Oracle.

NONO ARTIGO

Este artigo irá mostrar como o MAA resolve de forma automática todos os pontos de um failover. Claro que para isso você tem que ter tudo devidamente configurado e operacional. Você vai precisar de um Observer configurado e o Fast-Start Failover habilitado (como demonstrado aqui) bem como um Broker operacional (veja aqui). Se você leu os artigos anteriores você já tem tudo isso configurado e não irá se preocupar com mais nada.

AMBIENTE

Até o momento você tem um banco de dados primary (maa) e um banco de dados standby (maastb) sincronizados em Maximum Availability (com Real-Time no envio de redo). Além disso o Fast-Start Failover está habilitado.

Já aviso que o artigo pode ser extenso devido aos logs, tentarei suprimir as informações que não são necessárias. Mas mesmo assim recomendo a leitura para compreender tudo o que ocorre.

FAILOVER AUTOMÁTICO

Neste artigo você vai ver que os comandos executados serão exclusivamente para monitoramento das atividades, tudo o que você precisava fazer já foi feito. O que acontecerá abaixo é uma simulação do que você veria em um ambiente real, você não vai querer que ocorra isso no seu ambiente de produção, não é? Mas se você chegou até aqui já tem configurado um ambiente com MAA e Oracle RAC e isso já denota a importância do ambiente e que deseja estar protegido das adversidades.

Como um failover automático não acontece “naturalmente” na vida real, vamos simular com um abort em todas as instâncias do primary:

[oracle@rac11pri01 ~]$ srvctl stop database -d maa -o abort

Note que o log do Observer já detectou isso:

[W000 07/28 21:57:00.35] Observer started.

23:43:15.41 Monday, July 28, 2014

Initiating Fast-Start Failover to database "maastb"...

Performing failover NOW, please wait...

Failover succeeded, new primary is "maastb"

23:45:11.19 Monday, July 28, 2014

Como visto acima, em 3 minutos ocorreu o failover completo do ambiente, o banco maastb passou a responder como novo primary. Claro que esse tempo dependerá do tamanho de sua base de dados mas vamos ver como tudo isso aconteceu “de verdade”.

No log do Broker (da instância maastb01) temos informações mais interessantes, veja abaixo:

07/29/2014 02:23:36
Fast-Start Failover cannot proceed because: "standby has connectivity to the primary"
07/29/2014 02:23:38
FAILOVER TO maastb
07/29/2014 02:23:39
Beginning failover to database maastb
Notifying Oracle Clusterware to teardown database for FAILOVER
07/29/2014 02:24:20
DMON: Old primary "maa" needs reinstatement
07/29/2014 02:25:07
Protection mode set to MAXIMUM AVAILABILITY
07/29/2014 02:25:27
Deferring associated archivelog destinations of sites permanently disabled due to Failover
Notifying Oracle Clusterware to buildup primary database after FAILOVER
Data Guard notifying Oracle Clusterware to start services and other instances change
07/29/2014 02:25:30
Posting DB_DOWN alert ...
... with reason Data Guard Fast-Start Failover - Primary Disconnected
07/29/2014 02:25:32
Command FAILOVER TO maastb completed
07/29/2014 02:25:35
Data Guard Broker Status Summary:
  Type                        Name                             Severity  Status
  Configuration               dgrac                             Warning  ORA-16608
  Primary Database            maastb                            Warning  ORA-16817
  Physical Standby Database   maa                                 Error  ORA-16661
07/29/2014 02:26:34
Data Guard Broker Status Summary:
  Type                        Name                             Severity  Status
  Configuration               dgrac                             Warning  ORA-16608
  Primary Database            maastb                            Warning  ORA-16817
  Physical Standby Database   maa                                 Error  ORA-16661

Deixei em negrito algumas informações, mas observe que que o antigo primary precisará sofrer o reinstate assim que foi iniciado (já que ocorreu um failover) e que existem mensagens de erro informando que não há nenhum destino para manter a disponibilidade desejada.

Mas existem locais que podem nos dar mais detalhes do que ocorreu (dos passos) e isso está no alertlog da antiga standby. No log abaixo eu suprimi os erros de TNS entre as bases, mas deixei em negrito pontos importantes.

[root@rac11stb01 ~]# tail -f /u01/app/oracle/diag/rdbms/maastb/maastb1/trace/alert_maastb1.log

Tue Jul 29 02:22:36 2014

RFS[15]: Possible network disconnect with primary database

Tue Jul 29 02:22:36 2014

RFS[13]: Possible network disconnect with primary database

Tue Jul 29 02:22:36 2014

RFS[18]: Possible network disconnect with primary database

Tue Jul 29 02:22:37 2014

RFS[19]: Possible network disconnect with primary database

Tue Jul 29 02:22:37 2014

RFS[17]: Possible network disconnect with primary database

Tue Jul 29 02:23:37 2014

Attempting Fast-Start Failover because the threshold of 60 seconds has elapsed.

Tue Jul 29 02:23:39 2014

Data Guard Broker: Beginning failover

Tue Jul 29 02:23:44 2014

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL

Tue Jul 29 02:23:45 2014

MRP0: Background Media Recovery cancelled with status 16037

Errors in file /u01/app/oracle/diag/rdbms/maastb/maastb1/trace/maastb1_pr00_3786.trc:

ORA-16037: user requested cancel of managed recovery operation

Tue Jul 29 02:23:45 2014

Managed Standby Recovery not using Real Time Apply

Recovery interrupted!

Recovered data files to a consistent state at change 19932270

Tue Jul 29 02:23:49 2014

MRP0: Background Media Recovery process shutdown (maastb1)

Managed Standby Recovery Canceled (maastb1)

Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE

Attempt to do a Terminal Recovery (maastb1)

Media Recovery Start: Managed Standby Recovery (maastb1)

 started logmerger process

Tue Jul 29 02:23:55 2014

Managed Standby Recovery not using Real Time Apply

Tue Jul 29 02:23:57 2014

Parallel Media Recovery started with 2 slaves

Tue Jul 29 02:24:00 2014

Begin: Standby Redo Logfile archival

End: Standby Redo Logfile archival

Terminal Recovery timestamp is '07/29/2014 02:24:00'

Terminal Recovery: applying standby redo logs.

Terminal Recovery: thread 1 seq# 1181 redo required

Terminal Recovery:

Recovery of Online Redo Log: Thread 1 Group 6 Seq 1181 Reading mem 0

  Mem# 0: +DG01/maastb/onlinelog/group_6.267.844716079

  Mem# 1: +FRA/maastb/onlinelog/group_6.604.844716081

Identified End-Of-Redo (failover) for thread 1 sequence 1181 at SCN 0xffff.ffffffff

Terminal Recovery: thread 2 seq# 1395 redo required

Terminal Recovery:

Recovery of Online Redo Log: Thread 2 Group 9 Seq 1395 Reading mem 0

  Mem# 0: +DG01/maastb/onlinelog/group_9.260.844716095

  Mem# 1: +FRA/maastb/onlinelog/group_9.591.844716097

Identified End-Of-Redo (failover) for thread 2 sequence 1395 at SCN 0xffff.ffffffff

Incomplete Recovery applied until change 19932272 time 07/29/2014 05:42:13

Media Recovery Complete (maastb1)

Terminal Recovery: successful completion

Tue Jul 29 02:24:02 2014

ARC0: Archiving not possible: error count exceeded

Tue Jul 29 02:24:03 2014

ARC3: Archiving not possible: error count exceeded

Forcing ARSCN to IRSCN for TR 0:19932272

Attempt to set limbo arscn 0:19932272 irscn 0:19932272

Resetting standby activation ID 728027883 (0x2b64d2eb)

ARCH: Archival stopped, error occurred. Will continue retrying

ORACLE Instance maastb1 - Archival Error

ORA-16014: log 6 sequence# 1181 not archived, no available destinations

ORA-00312: online log 6 thread 1: '+DG01/maastb/onlinelog/group_6.267.844716079'

ORA-00312: online log 6 thread 1: '+FRA/maastb/onlinelog/group_6.604.844716081'

Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE

ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WAIT WITH SESSION SHUTDOWN

ALTER DATABASE SWITCHOVER TO PRIMARY (maastb1)

Maximum wait for role transition is 15 minutes.

Backup controlfile written to trace file /u01/app/oracle/diag/rdbms/maastb/maastb1/trace/maastb1_rsm0_3741.trc

Tue Jul 29 02:24:08 2014

Standby terminal recovery start SCN: 19932270

RESETLOGS after incomplete recovery UNTIL CHANGE 19932272

Online log +DG01/maastb/onlinelog/group_1.257.844716051: Thread 1 Group 1 was previously cleared

Online log +FRA/maastb/onlinelog/group_1.568.844716053: Thread 1 Group 1 was previously cleared

Online log +DG01/maastb/onlinelog/group_2.268.844716057: Thread 1 Group 2 was previously cleared

Online log +FRA/maastb/onlinelog/group_2.564.844716059: Thread 1 Group 2 was previously cleared

Online log +DG01/maastb/onlinelog/group_3.269.844716063: Thread 2 Group 3 was previously cleared

Online log +FRA/maastb/onlinelog/group_3.562.844716065: Thread 2 Group 3 was previously cleared

Online log +DG01/maastb/onlinelog/group_4.270.844716067: Thread 2 Group 4 was previously cleared

Online log +FRA/maastb/onlinelog/group_4.559.844716071: Thread 2 Group 4 was previously cleared

Standby became primary SCN: 19932269

Tue Jul 29 02:24:12 2014

Setting recovery target incarnation to 4

Switchover: Complete - Database mounted as primary

Completed: ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WAIT WITH SESSION SHUTDOWN

Tue Jul 29 02:24:20 2014

ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY

Completed: ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY

ALTER DATABASE OPEN

Data Guard Broker initializing...

This instance was first to open

Picked broadcast on commit scheme to generate SCNs

Tue Jul 29 02:24:21 2014

Assigning activation ID 732752315 (0x2bace9bb)

LGWR: Primary database is in MAXIMUM AVAILABILITY mode

Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED

LGWR: Destination LOG_ARCHIVE_DEST_1 is not serviced by LGWR

Thread 1 advanced to log sequence 2 (thread open)

Tue Jul 29 02:24:22 2014

ARC3: Becoming the 'no SRL' ARCH

Tue Jul 29 02:24:22 2014

ARC0: Becoming the 'no SRL' ARCH

Thread 1 opened at log sequence 2

  Current log# 2 seq# 2 mem# 0: +DG01/maastb/onlinelog/group_2.268.844716057

  Current log# 2 seq# 2 mem# 1: +FRA/maastb/onlinelog/group_2.564.844716059

Successful open of redo thread 1

MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set

Tue Jul 29 02:24:23 2014

SMON: enabling cache recovery

Tue Jul 29 02:24:23 2014

...

...

...

TNS-12564: TNS:connection refused

ns secondary err code: 0

nt main err code: 0

nt secondary err code: 0

nt OS err code: 0

Error 12514 received logging on to the standby

PING[ARC2]: Heartbeat failed to connect to standby 'maa'. Error is 12514.

Instance recovery: looking for dead threads

Instance recovery: lock domain invalid but no dead threads

Tue Jul 29 02:24:30 2014

FSFP started with pid=40, OS id=9787

Tue Jul 29 02:24:34 2014

minact-scn: Inst 1 is a slave inc#:4 mmon proc-id:3596 status:0x2

minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0000.00000000 gcalc-scn:0x0000.00000000

Tue Jul 29 02:24:37 2014

[3741] Successfully onlined Undo Tablespace 2.

Undo initialization finished serial:0 start:8127994 end:8132374 diff:4380 (43 seconds)

Verifying file header compatibility for 11g tablespace encryption..

Verifying 11g file header compatibility for tablespace encryption completed

Tue Jul 29 02:24:37 2014

SMON: enabling tx recovery

Database Characterset is WE8MSWIN1252

Tue Jul 29 02:24:48 2014

No Resource Manager plan active

Starting background process GTX0

Tue Jul 29 02:24:57 2014

GTX0 started with pid=41, OS id=9799

Starting background process RCBG

Tue Jul 29 02:24:57 2014

RCBG started with pid=45, OS id=9802

replication_dependency_tracking turned off (no async multimaster replication found)

Tue Jul 29 02:24:59 2014

Starting background process QMNC

Tue Jul 29 02:24:59 2014

QMNC started with pid=46, OS id=9804

Completed: ALTER DATABASE OPEN

ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='maastb1';

ALTER SYSTEM SET log_archive_format='arch_%t_%s_%r.arc' SCOPE=SPFILE SID='maastb1';

Tue Jul 29 02:25:18 2014

...

...

...

TNS-12564: TNS:connection refused

ns secondary err code: 0

nt main err code: 0

nt secondary err code: 0

nt OS err code: 0

Error 12514 received logging on to the standby

PING[ARC2]: Heartbeat failed to connect to standby 'maa'. Error is 12514.

Tue Jul 29 02:25:20 2014

ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*';

Tue Jul 29 02:25:20 2014

Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter

Tue Jul 29 02:25:20 2014

Starting background process VKRM

Tue Jul 29 02:25:20 2014

VKRM started with pid=49, OS id=9823

Tue Jul 29 02:25:22 2014

Starting background process CJQ0

Tue Jul 29 02:25:22 2014

CJQ0 started with pid=58, OS id=9838

ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*';

ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*';

ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*';

ALTER SYSTEM SET db_file_name_convert='+DATA/maa','+DG01/maastb','+FRA/maa','+FRA/maastb' SCOPE=SPFILE;

ALTER SYSTEM SET log_file_name_convert='+DATA/maa','+DG01/maastb','+FRA/maa','+FRA/maastb' SCOPE=SPFILE;

Setting Resource Manager plan SCHEDULER[0x3189]:DEFAULT_MAINTENANCE_PLAN via scheduler window

Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter

Tue Jul 29 02:25:27 2014

Starting background process SMCO

Tue Jul 29 02:25:27 2014

SMCO started with pid=56, OS id=9845

Tue Jul 29 02:25:28 2014

ALTER SYSTEM SET log_archive_dest_state_2='RESET' SCOPE=BOTH;

Tue Jul 29 02:25:28 2014

Active Session History (ASH) performed an emergency flush. This may mean that ASH is undersized. If emergency flushes are a recurring issue, you may consider increasing ASH size by setting the value of _ASH_SIZE to a sufficiently large value. Currently, ASH size is 4194304 bytes. Both ASH size and the total number of emergency flushes since instance startup can be monitored by running the following query:

 select total_size,awr_flush_emergency_count from v$ash_info;

Failover succeeded. Primary database is now maastb.

Observe que no começo do alertlog temos informações que o primary pode estar indisponível e após 60 segundos (que defini quando configurei o Fast-Start Faiolver) o failover automático iniciou. Depois disso tivemos o cancelamento da aplicação dos redo’s por parte do standby. Um pouco mais abaixo tivemos o switchover da base para primary e a informação de qual o SCN utilizado como ponto de virada (RESETLOGS after incomplete recovery UNTIL CHANGE 19932272). Após isso tivemos a abertura da base em modo OPEN e que o destino archive_dest 2 está dessincronizado (isso é normal já que o antigo primary está fora).

No Broker temos a configuração demonstrada abaixo, observe que o banco maastb já aparece listado como primary e o banco maa está com erro. Além disso o Fast-Start Failover está ativo e também que o antigo primary deve sofrer resinstate ao ser iniciado.

[oracle@rac11stb01 ~]$ dgmgrl

DGMGRL for Linux: Version 11.2.0.3.0 - 64bit Production

Copyright (c) 2000, 2009, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.

DGMGRL> connect sys@maastb>

Password:

Connected.

DGMGRL> SHOW CONFIGURATION VERBOSE;

Configuration - dgrac

  Protection Mode: MaxAvailability

  Databases:

maastb - Primary database

  Warning: ORA-16817: unsynchronized fast-start failover configuration

maa    - (*) Physical standby database (disabled)

  ORA-16661: the standby database needs to be reinstated

  (*) Fast-Start Failover target

  Properties:

FastStartFailoverThreshold      = '60'

OperationTimeout                = '30'

FastStartFailoverLagLimit       = '30'

CommunicationTimeout            = '180'

FastStartFailoverAutoReinstate  = 'TRUE'

FastStartFailoverPmyShutdown    = 'TRUE'

BystandersFollowRoleChange      = 'ALL'

Fast-Start Failover: ENABLED

  Threshold:        60 seconds

  Target:           maa

  Observer:         sbdobs

  Lag Limit:        30 seconds (not in use)

  Shutdown Primary: TRUE

  Auto-reinstate:   TRUE

Configuration Status:

WARNING

DGMGRL>

DGMGRL> SHOW DATABASE maa;

Database - maa

  Role:            PHYSICAL STANDBY

  Intended State:  APPLY-ON

  Transport Lag:   (unknown)

  Apply Lag:       (unknown)

  Real Time Query: OFF

  Instance(s):

maa1

maa2

Database Status:

ORA-16661: the standby database needs to be reinstated

DGMGRL> SHOW DATABASE maastb;

Database - maastb

  Role:            PRIMARY

  Intended State:  TRANSPORT-ON

  Instance(s):

maastb1

maastb2

  Database Warning(s):

ORA-16817: unsynchronized fast-start failover configuration

Database Status:

WARNING

DGMGRL>

Como já aconteceu em artigos anteriores, o CRS precisa de um ajuste manual para que o novo primary sempre inicie em modo OPEN:

[oracle@rac11stb01 ~]$ srvctl config database -d maastb -v

Database unique name: maastb

Database name:

Oracle home: /u01/app/oracle/product/11.2.0.3/db_1

Oracle user: oracle

Spfile:

Domain:

Start options: mount

Stop options: immediate

Database role: PRIMARY

Management policy: AUTOMATIC

Server pools: maastb

Database instances: maastb1,maastb2

Disk Groups: DG01,FRA

Mount point paths:

Services:

Type: RAC

Database is administrator managed

[oracle@rac11stb01 ~]$ srvctl modify database -d maastb -s OPEN

[oracle@rac11stb01 ~]$ srvctl config database -d maastb -v

Database unique name: maastb

Database name:

Oracle home: /u01/app/oracle/product/11.2.0.3/db_1

Oracle user: oracle

Spfile:

Domain:

Start options: open

Stop options: immediate

Database role: PRIMARY

Management policy: AUTOMATIC

Server pools: maastb

Database instances: maastb1,maastb2

Disk Groups: DG01,FRA

Mount point paths:

Services:

Type: RAC

Database is administrator managed

[oracle@rac11stb01 ~]$

RELIGANDO O BANCO

Você deve ter notado acima que o antigo primary ainda não foi religado, se você o iniciar agora ele irá sofrer o reinstate e voltará como novo standby. De qualquer forma recomendo que antes de ligar o antigo primary verificar o motivo da queda, não adianta ter um ambiente standby que não é confiável. Muitas vezes é impossível religar o antigo primary e neste caso você precisa remover ele do Broker e adicionar um novo standby.

Supondo que você consiga religar o antigo primary e deixar ele estável verá algo parecido com o alertlog abaixo. É um log com muita informação: o banco tentará abrir em modo OPEN e o DataGuard irá informar que ele não é mais o novo primary. O banco sofrerá flashback para o SCN detectado como ponto de recover e será convertido em physical standby. Por fim um restart do banco e definição deste como standby juntamente com a sincronização dos archives com o primary. Tudo isso está em negrito no log abaixo.

[root@rac11pri01 ~]# tail -f /u01/app/oracle/diag/rdbms/maa/maa1/trace/alert_maa

  Current log# 1 seq# 1181 mem# 1: +FRA/maa/onlinelog/group_1.286.843488555

Tue Jul 29 04:30:00 2014

Archived Log entry 5432 added for thread 1 sequence 1180 ID 0x2b64d2eb dest 1:

Tue Jul 29 05:42:12 2014

Shutting down instance (abort)

License high water mark = 7

USER (ospid: 3606): terminating the instance

Instance terminated by USER, pid = 3606

Tue Jul 29 05:42:17 2014

Instance shutdown complete

Tue Jul 29 06:16:58 2014

Starting ORACLE instance (normal)

****************** Large Pages Information *****************

Total Shared Global Region in Large Pages = 0 KB (0%)

Large Pages used by this instance: 0 (0 KB)

Large Pages unused system wide = 0 (0 KB) (alloc incr 4096 KB)

Large Pages configured system wide = 0 (0 KB)

Large Page size = 2048 KB

RECOMMENDATION:

  Total Shared Global Region size is 1026 MB. For optimal performance,

  prior to the next instance restart increase the number

  of unused Large Pages by atleast 513 2048 KB Large Pages (1026 MB)

  system wide to get 100% of the Shared

  Global Region allocated with Large pages

***********************************************************

LICENSE_MAX_SESSION = 0

LICENSE_SESSIONS_WARNING = 0

Private Interface 'eth2:1' configured from GPnP for use as a private interconnect.

  [name='eth2:1', type=1, ip=169.254.187.5, mac=00-0c-29-8c-91-79, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]

Public Interface 'eth0' configured from GPnP for use as a public interface.

  [name='eth0', type=1, ip=10.17.42.10, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1]

Public Interface 'eth0:2' configured from GPnP for use as a public interface.

  [name='eth0:2', type=1, ip=10.17.42.22, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1]

Public Interface 'eth0:4' configured from GPnP for use as a public interface.

  [name='eth0:4', type=1, ip=10.17.42.17, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1]

Public Interface 'eth0:5' configured from GPnP for use as a public interface.

  [name='eth0:5', type=1, ip=10.17.42.14, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1]

Picked latch-free SCN scheme 3

Autotune of undo retention is turned on.

LICENSE_MAX_USERS = 0

SYS auditing is disabled

Starting up:

Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

With the Partitioning and Real Application Clusters options.

ORACLE_HOME = /u01/app/oracle/product/11.2.0.3/db_1

System name:    Linux

Node name:      rac11pri01.tjsc.jus.br

Release:        2.6.39-400.17.1.el6uek.x86_64

Version:        #1 SMP Fri Feb 22 18:16:18 PST 2013

Machine:        x86_64

VM name:        VMWare Version: 6

Using parameter settings in server-side pfile /u01/app/oracle/product/11.2.0.3/db_1/dbs/initmaa1.ora

System parameters with non-default values:

  processes                = 150

  spfile                   = "+DATA/maa/spfilemaa.ora"

  sga_target               = 1G

  control_files            = "+DATA/maa/controlfile/current.273.843488553"

  control_files            = "+FRA/maa/controlfile/current.256.843488553"

  db_file_name_convert     = "+DG01/maastb"

  db_file_name_convert     = "+DATA/maa"

  db_file_name_convert     = "+FRA/maastb"

  db_file_name_convert     = "+FRA/maa"

  log_file_name_convert    = "+DG01/maastb"

  log_file_name_convert    = "+DATA/maa"

  log_file_name_convert    = "+FRA/maastb"

  log_file_name_convert    = "+FRA/maa"

  db_block_size            = 8192

  compatible               = "11.2.0.0.0"

  log_archive_dest_1       = "LOCATION=USE_DB_RECOVERY_FILE_DEST VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=maa"

  log_archive_dest_2       = "SERVICE=maastb SYNC AFFIRM LGWR VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=maastb"

  log_archive_dest_state_2 = "ENABLE"

  log_archive_min_succeed_dest= 1

  fal_server               = "maastb"

  log_archive_trace        = 0

  log_archive_config       = "DG_CONFIG=(maa,maastb)"

  log_archive_format       = "arch_%t_%s_%r.arc"

  log_archive_max_processes= 4

  archive_lag_target       = 0

  cluster_database         = TRUE

  db_create_file_dest      = "+DATA"

  db_recovery_file_dest    = "+FRA"

  db_recovery_file_dest_size= 10G

  standby_file_management  = "AUTO"

  thread                   = 1

  undo_tablespace          = "UNDOTBS1"

  instance_number          = 1

  remote_login_passwordfile= "EXCLUSIVE"

  db_domain                = ""

  remote_listener          = "rac11pri-scan.tjsc.jus.br:1521"

  audit_file_dest          = "/u01/app/oracle/admin/maa/adump"

  audit_trail              = "DB"

  db_name                  = "maa"

  db_unique_name           = "maa"

  open_cursors             = 300

  pga_aggregate_target     = 256M

  dg_broker_start          = TRUE

  dg_broker_config_file1   = "+DATA/maa/dr1maa.dat"

  dg_broker_config_file2   = "+FRA/maa/dr2maa.dat"

  diagnostic_dest          = "/u01/app/oracle"

Cluster communication is configured to use the following interface(s) for this instance

  169.254.187.5

cluster interconnect IPC version:Oracle UDP/IP (generic)

IPC Vendor 1 proto 2

Tue Jul 29 06:17:06 2014

PMON started with pid=2, OS id=7421

Tue Jul 29 06:17:06 2014

PSP0 started with pid=3, OS id=7423

Tue Jul 29 06:17:08 2014

VKTM started with pid=4, OS id=7425 at elevated priority

VKTM running at (1)millisec precision with DBRM quantum (100)ms

Tue Jul 29 06:17:08 2014

GEN0 started with pid=5, OS id=7430

Tue Jul 29 06:17:08 2014

DIAG started with pid=6, OS id=7432

Tue Jul 29 06:17:08 2014

DBRM started with pid=7, OS id=7434

Tue Jul 29 06:17:08 2014

PING started with pid=8, OS id=7436

Tue Jul 29 06:17:08 2014

ACMS started with pid=9, OS id=7438

Tue Jul 29 06:17:08 2014

DIA0 started with pid=10, OS id=7440

Tue Jul 29 06:17:08 2014

LMON started with pid=11, OS id=7442

Tue Jul 29 06:17:08 2014

LMD0 started with pid=12, OS id=7444

* Load Monitor used for high load check

* New Low - High Load Threshold Range = [1920 - 2560]

Tue Jul 29 06:17:08 2014

LMS0 started with pid=13, OS id=7446 at elevated priority

Tue Jul 29 06:17:08 2014

RMS0 started with pid=14, OS id=7450

Tue Jul 29 06:17:08 2014

LMHB started with pid=15, OS id=7452

Tue Jul 29 06:17:08 2014

MMAN started with pid=16, OS id=7454

Tue Jul 29 06:17:08 2014

DBW0 started with pid=17, OS id=7456

Tue Jul 29 06:17:09 2014

LGWR started with pid=18, OS id=7458

Tue Jul 29 06:17:09 2014

CKPT started with pid=19, OS id=7460

Tue Jul 29 06:17:09 2014

SMON started with pid=20, OS id=7462

Tue Jul 29 06:17:09 2014

RECO started with pid=21, OS id=7464

Tue Jul 29 06:17:09 2014

RBAL started with pid=22, OS id=7466

Tue Jul 29 06:17:09 2014

ASMB started with pid=23, OS id=7468

Tue Jul 29 06:17:09 2014

MMON started with pid=24, OS id=7470

Tue Jul 29 06:17:09 2014

MMNL started with pid=25, OS id=7474

NOTE: initiating MARK startup

Starting background process MARK

Tue Jul 29 06:17:09 2014

MARK started with pid=26, OS id=7476

lmon registered with NM - instance number 1 (internal mem no 0)

NOTE: MARK has subscribed

Reconfiguration started (old inc 0, new inc 2)

List of instances:

 1 (myinst: 1)

 Global Resource Directory frozen

* allocate domain 0, invalid = TRUE

 Communication channels reestablished

 Master broadcasted resource hash value bitmaps

 Non-local Process blocks cleaned out

 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived

 Set master node info

 Submitted all remote-enqueue requests

 Dwn-cvts replayed, VALBLKs dubious

 All grantable enqueues granted

 Post SMON to start 1st pass IR

 Submitted all GCS remote-cache requests

 Post SMON to start 1st pass IR

 Fix write in gcs resources

Reconfiguration complete

Tue Jul 29 06:17:11 2014

LCK0 started with pid=28, OS id=7482

Tue Jul 29 06:17:11 2014

Starting background process RSMN

Tue Jul 29 06:17:11 2014

RSMN started with pid=29, OS id=7484

Error: can not register my instance state - 4

ORACLE_BASE not set in environment. It is recommended

that ORACLE_BASE be set in the environment

Reusing ORACLE_BASE from an earlier startup = /u01/app/oracle

Tue Jul 29 06:17:12 2014

DMON started with pid=30, OS id=7486

Reconfiguration started (old inc 2, new inc 4)

List of instances:

 1 2 (myinst: 1)

 Global Resource Directory frozen

 Communication channels reestablished

 Master broadcasted resource hash value bitmaps

 Non-local Process blocks cleaned out

 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived

 Set master node info

 Submitted all remote-enqueue requests

 Dwn-cvts replayed, VALBLKs dubious

 All grantable enqueues granted

 Post SMON to start 1st pass IR

 Submitted all GCS remote-cache requests

 Post SMON to start 1st pass IR

 Fix write in gcs resources

Reconfiguration complete

Tue Jul 29 06:17:13 2014

ALTER SYSTEM SET local_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=10.17.42.14)(PORT=1521))))' SCOPE=MEMORY SID='maa1';

ALTER DATABASE MOUNT /* db agent *//* {1:18282:220} */

This instance was first to mount

NOTE: Loaded library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so

NOTE: Loaded library: System

SUCCESS: diskgroup DATA was mounted

NOTE: dependency between database maa and diskgroup resource ora.DATA.dg is established

SUCCESS: diskgroup FRA was mounted

NOTE: dependency between database maa and diskgroup resource ora.FRA.dg is established

Tue Jul 29 06:17:18 2014

NSS2 started with pid=33, OS id=7506

Successful mount of redo thread 1, with mount id 732690857

Allocated 3981120 bytes in shared pool for flashback generation buffer

Starting background process RVWR

Tue Jul 29 06:17:18 2014

RVWR started with pid=34, OS id=7508

Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)

Lost write protection disabled

Completed: ALTER DATABASE MOUNT /* db agent *//* {1:18282:220} */

ALTER DATABASE OPEN /* db agent *//* {1:18282:220} */

Data Guard Broker initializing...

Data Guard Broker initialization complete

Data Guard: verifying database primary role...

Tue Jul 29 06:17:25 2014

Starting Data Guard Broker (DMON)

Tue Jul 29 06:17:25 2014

INSV started with pid=36, OS id=7514

Tue Jul 29 06:17:29 2014

NSV1 started with pid=37, OS id=7516

Tue Jul 29 06:17:37 2014

Data Guard: version check completed

Data Guard determines a failover has occurred - this is no longer a primary database

ORA-16649 signalled during: ALTER DATABASE OPEN /* db agent *//* {1:18282:220} */...

Tue Jul 29 06:17:52 2014

Reconfiguration started (old inc 4, new inc 6)

List of instances:

 1 (myinst: 1)

 Global Resource Directory frozen

 * dead instance detected - domain 0 invalid = TRUE

 Communication channels reestablished

 Master broadcasted resource hash value bitmaps

 Non-local Process blocks cleaned out

Tue Jul 29 06:17:52 2014

 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived

 Set master node info

 Submitted all remote-enqueue requests

 Dwn-cvts replayed, VALBLKs dubious

 All grantable enqueues granted

 Post SMON to start 1st pass IR

 Submitted all GCS remote-cache requests

 Post SMON to start 1st pass IR

 Fix write in gcs resources

Reconfiguration complete

Tue Jul 29 06:17:59 2014

NSV1 started with pid=39, OS id=7566

Tue Jul 29 06:18:07 2014

RSM0 started with pid=40, OS id=7570

Tue Jul 29 06:18:11 2014

Using STANDBY_ARCHIVE_DEST parameter default value as USE_DB_RECOVERY_FILE_DEST

RFS[1]: Assigned to RFS process 7573

RFS[1]: Database mount ID mismatch [0x2bace9bb:0x2babf9a9] (732752315:732690857)

FLASHBACK DATABASE TO SCN 19932269

Tue Jul 29 06:18:13 2014

RFS[2]: Assigned to RFS process 7577

RFS[2]: Database mount ID mismatch [0x2bace9bb:0x2babf9a9] (732752315:732690857)

Flashback Restore Start

Flashback Restore Complete

Flashback Media Recovery Start

 started logmerger process

Tue Jul 29 06:18:18 2014

Parallel Media Recovery started with 2 slaves

Tue Jul 29 06:18:18 2014

Recovery of Online Redo Log: Thread 1 Group 1 Seq 1181 Reading mem 0

  Mem# 0: +DATA/maa/onlinelog/group_1.272.843488553

  Mem# 1: +FRA/maa/onlinelog/group_1.286.843488555

Recovery of Online Redo Log: Thread 2 Group 3 Seq 1394 Reading mem 0

  Mem# 0: +DATA/maa/onlinelog/group_3.257.843489101

  Mem# 1: +FRA/maa/onlinelog/group_3.284.843489101

Recovery of Online Redo Log: Thread 2 Group 4 Seq 1395 Reading mem 0

  Mem# 0: +DATA/maa/onlinelog/group_4.262.843489103

  Mem# 1: +FRA/maa/onlinelog/group_4.283.843489103

Incomplete Recovery applied until change 19932270 time 07/29/2014 05:42:12

Flashback Media Recovery Complete

Completed: FLASHBACK DATABASE TO SCN 19932269

alter database convert to physical standby

ALTER DATABASE CONVERT TO PHYSICAL STANDBY (maa1)

Flush standby redo logfile failed:1649

Clearing standby activation ID 728027883 (0x2b64d2eb)

The primary database controlfile was created using the

'MAXLOGFILES 192' clause.

There is space for up to 188 standby redo logfiles

Use the following SQL commands on the standby database to create

standby redo logfiles that match the primary database:

ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 52428800;

ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 52428800;

ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 52428800;

ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 52428800;

ALTER DATABASE ADD STANDBY LOGFILE 'srl5.f' SIZE 52428800;

Waiting for all non-current ORLs to be archived...

All non-current ORLs have been archived.

Clearing online redo logfile 1 +DATA/maa/onlinelog/group_1.272.843488553

Clearing online log 1 of thread 1 sequence number 1181

Clearing online redo logfile 1 complete

Clearing online redo logfile 2 +DATA/maa/onlinelog/group_2.271.843488555

Clearing online log 2 of thread 1 sequence number 1180

Clearing online redo logfile 2 complete

Clearing online redo logfile 3 +DATA/maa/onlinelog/group_3.257.843489101

Clearing online log 3 of thread 2 sequence number 1394

Clearing online redo logfile 3 complete

Clearing online redo logfile 4 +DATA/maa/onlinelog/group_4.262.843489103

Clearing online log 4 of thread 2 sequence number 1395

Clearing online redo logfile 4 complete

Completed: alter database convert to physical standby

Tue Jul 29 06:18:29 2014

Shutting down instance (immediate)

Shutting down instance: further logons disabled

Stopping background process MMNL

Stopping background process MMON

License high water mark = 6

alter database CLOSE NORMAL

ORA-1109 signalled during: alter database CLOSE NORMAL...

alter database DISMOUNT

Shutting down archive processes

Archiving is disabled

Tue Jul 29 06:18:33 2014

NOTE: Deferred communication with ASM instance

NOTE: deferred map free for map id 5

Tue Jul 29 06:18:33 2014

NOTE: Deferred communication with ASM instance

NOTE: deferred map free for map id 150

Tue Jul 29 06:18:34 2014

NOTE: Deferred communication with ASM instance

NOTE: deferred map free for map id 2

Completed: alter database DISMOUNT

ARCH: Archival disabled due to shutdown: 1089

Shutting down archive processes

Archiving is disabled

Shutting down Data Guard Broker processes

Tue Jul 29 06:18:35 2014

NOTE: Deferred communication with ASM instance

Tue Jul 29 06:18:36 2014

Completed: Data Guard Broker shutdown

NOTE: force a map free for map id 2

Tue Jul 29 06:18:38 2014

Stopping background process VKTM

ARCH: Archival disabled due to shutdown: 1089

Shutting down archive processes

Archiving is disabled

Tue Jul 29 06:18:38 2014

NOTE: Shutting down MARK background process

NOTE: force a map free for map id 192

NOTE: force a map free for map id 190

NOTE: force a map free for map id 188

NOTE: force a map free for map id 186

NOTE: force a map free for map id 184

NOTE: force a map free for map id 182

NOTE: force a map free for map id 143

NOTE: force a map free for map id 191

NOTE: force a map free for map id 189

NOTE: force a map free for map id 187

NOTE: force a map free for map id 185

NOTE: force a map free for map id 183

NOTE: force a map free for map id 181

NOTE: force a map free for map id 142

Tue Jul 29 06:18:41 2014

freeing rdom 0

Tue Jul 29 06:18:43 2014

Instance shutdown complete

Tue Jul 29 06:18:44 2014

Starting ORACLE instance (normal)

****************** Large Pages Information *****************

Total Shared Global Region in Large Pages = 0 KB (0%)

Large Pages used by this instance: 0 (0 KB)

Large Pages unused system wide = 0 (0 KB) (alloc incr 4096 KB)

Large Pages configured system wide = 0 (0 KB)

Large Page size = 2048 KB

RECOMMENDATION:

  Total Shared Global Region size is 1026 MB. For optimal performance,

  prior to the next instance restart increase the number

  of unused Large Pages by atleast 513 2048 KB Large Pages (1026 MB)

  system wide to get 100% of the Shared

  Global Region allocated with Large pages

***********************************************************

LICENSE_MAX_SESSION = 0

LICENSE_SESSIONS_WARNING = 0

Private Interface 'eth2:1' configured from GPnP for use as a private interconnect.

  [name='eth2:1', type=1, ip=169.254.187.5, mac=00-0c-29-8c-91-79, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]

Public Interface 'eth0' configured from GPnP for use as a public interface.

  [name='eth0', type=1, ip=10.17.42.10, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1]

Public Interface 'eth0:2' configured from GPnP for use as a public interface.

  [name='eth0:2', type=1, ip=10.17.42.22, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1]

Public Interface 'eth0:4' configured from GPnP for use as a public interface.

  [name='eth0:4', type=1, ip=10.17.42.17, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1]

Public Interface 'eth0:5' configured from GPnP for use as a public interface.

  [name='eth0:5', type=1, ip=10.17.42.14, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1]

Picked latch-free SCN scheme 3

Autotune of undo retention is turned on.

LICENSE_MAX_USERS = 0

SYS auditing is disabled

Starting up:

Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

With the Partitioning and Real Application Clusters options.

ORACLE_HOME = /u01/app/oracle/product/11.2.0.3/db_1

System name:    Linux

Node name:      rac11pri01.tjsc.jus.br

Release:        2.6.39-400.17.1.el6uek.x86_64

Version:        #1 SMP Fri Feb 22 18:16:18 PST 2013

Machine:        x86_64

VM name:        VMWare Version: 6

Using parameter settings in server-side pfile /u01/app/oracle/product/11.2.0.3/db_1/dbs/initmaa1.ora

System parameters with non-default values:

  processes                = 150

  spfile                   = "+DATA/maa/spfilemaa.ora"

  sga_target               = 1G

  control_files            = "+DATA/maa/controlfile/current.273.843488553"

  control_files            = "+FRA/maa/controlfile/current.256.843488553"

  db_file_name_convert     = "+DG01/maastb"

  db_file_name_convert     = "+DATA/maa"

  db_file_name_convert     = "+FRA/maastb"

  db_file_name_convert     = "+FRA/maa"

  log_file_name_convert    = "+DG01/maastb"

  log_file_name_convert    = "+DATA/maa"

  log_file_name_convert    = "+FRA/maastb"

  log_file_name_convert    = "+FRA/maa"

  db_block_size            = 8192

  compatible               = "11.2.0.0.0"

  log_archive_dest_1       = "LOCATION=USE_DB_RECOVERY_FILE_DEST VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=maa"

  log_archive_dest_2       = "SERVICE=maastb SYNC AFFIRM LGWR VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=maastb"

  log_archive_dest_state_2 = "ENABLE"

  log_archive_min_succeed_dest= 1

  fal_server               = "maastb"

  log_archive_trace        = 0

  log_archive_config       = "DG_CONFIG=(maa,maastb)"

  log_archive_format       = "arch_%t_%s_%r.arc"

  log_archive_max_processes= 4

  archive_lag_target       = 0

  cluster_database         = TRUE

  db_create_file_dest      = "+DATA"

  db_recovery_file_dest    = "+FRA"

  db_recovery_file_dest_size= 10G

  standby_file_management  = "AUTO"

  thread                   = 1

  undo_tablespace          = "UNDOTBS1"

  instance_number          = 1

  remote_login_passwordfile= "EXCLUSIVE"

  db_domain                = ""

  remote_listener          = "rac11pri-scan.tjsc.jus.br:1521"

  audit_file_dest          = "/u01/app/oracle/admin/maa/adump"

  audit_trail              = "DB"

  db_name                  = "maa"

  db_unique_name           = "maa"

  open_cursors             = 300

  pga_aggregate_target     = 256M

  dg_broker_start          = TRUE

  dg_broker_config_file1   = "+DATA/maa/dr1maa.dat"

  dg_broker_config_file2   = "+FRA/maa/dr2maa.dat"

  diagnostic_dest          = "/u01/app/oracle"

Cluster communication is configured to use the following interface(s) for this instance

  169.254.187.5

cluster interconnect IPC version:Oracle UDP/IP (generic)

IPC Vendor 1 proto 2

Tue Jul 29 06:18:47 2014

PMON started with pid=2, OS id=7618

Tue Jul 29 06:18:47 2014

PSP0 started with pid=3, OS id=7620

Tue Jul 29 06:18:48 2014

VKTM started with pid=4, OS id=7622 at elevated priority

VKTM running at (1)millisec precision with DBRM quantum (100)ms

Tue Jul 29 06:18:48 2014

GEN0 started with pid=5, OS id=7626

Tue Jul 29 06:18:48 2014

DIAG started with pid=6, OS id=7628

Tue Jul 29 06:18:48 2014

DBRM started with pid=7, OS id=7630

Tue Jul 29 06:18:48 2014

PING started with pid=8, OS id=7632

Tue Jul 29 06:18:48 2014

ACMS started with pid=9, OS id=7634

Tue Jul 29 06:18:49 2014

DIA0 started with pid=10, OS id=7636

Tue Jul 29 06:18:49 2014

LMON started with pid=11, OS id=7638

Tue Jul 29 06:18:49 2014

LMD0 started with pid=12, OS id=7640

Tue Jul 29 06:18:49 2014

LMS0 started with pid=13, OS id=7642 at elevated priority

Tue Jul 29 06:18:49 2014

RMS0 started with pid=14, OS id=7646

Tue Jul 29 06:18:49 2014

LMHB started with pid=15, OS id=7648

Tue Jul 29 06:18:49 2014

MMAN started with pid=16, OS id=7650

* Load Monitor used for high load check

* New Low - High Load Threshold Range = [1920 - 2560]

Tue Jul 29 06:18:49 2014

DBW0 started with pid=17, OS id=7652

Tue Jul 29 06:18:49 2014

LGWR started with pid=18, OS id=7654

Tue Jul 29 06:18:50 2014

CKPT started with pid=19, OS id=7656

Tue Jul 29 06:18:50 2014

SMON started with pid=20, OS id=7658

Tue Jul 29 06:18:50 2014

RECO started with pid=21, OS id=7660

Tue Jul 29 06:18:50 2014

RBAL started with pid=22, OS id=7662

Tue Jul 29 06:18:50 2014

ASMB started with pid=23, OS id=7664

Tue Jul 29 06:18:50 2014

MMON started with pid=24, OS id=7666

Tue Jul 29 06:18:50 2014

MMNL started with pid=25, OS id=7670

lmon registered with NM - instance number 1 (internal mem no 0)

NOTE: initiating MARK startup

Starting background process MARK

Tue Jul 29 06:18:50 2014

MARK started with pid=26, OS id=7672

NOTE: MARK has subscribed

Reconfiguration started (old inc 0, new inc 2)

List of instances:

 1 (myinst: 1)

 Global Resource Directory frozen

* allocate domain 0, invalid = TRUE

 Communication channels reestablished

 Master broadcasted resource hash value bitmaps

 Non-local Process blocks cleaned out

 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived

 Set master node info

 Submitted all remote-enqueue requests

 Dwn-cvts replayed, VALBLKs dubious

 All grantable enqueues granted

 Post SMON to start 1st pass IR

 Submitted all GCS remote-cache requests

 Post SMON to start 1st pass IR

 Fix write in gcs resources

Reconfiguration complete

Tue Jul 29 06:18:50 2014

LCK0 started with pid=28, OS id=7678

Starting background process RSMN

Tue Jul 29 06:18:51 2014

RSMN started with pid=29, OS id=7680

ORACLE_BASE from environment = /u01/app/oracle

Tue Jul 29 06:18:51 2014

DMON started with pid=30, OS id=7682

Tue Jul 29 06:18:51 2014

alter database  mount

This instance was first to mount

Tue Jul 29 06:18:52 2014

ALTER SYSTEM SET local_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=10.17.42.14)(PORT=1521))))' SCOPE=MEMORY SID='maa1';

NOTE: Loaded library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so

NOTE: Loaded library: System

SUCCESS: diskgroup DATA was mounted

SUCCESS: diskgroup FRA was mounted

NOTE: dependency between database maa and diskgroup resource ora.DATA.dg is established

NOTE: dependency between database maa and diskgroup resource ora.FRA.dg is established

Tue Jul 29 06:18:58 2014

NSS2 started with pid=33, OS id=7720

ARCH: STARTING ARCH PROCESSES

Tue Jul 29 06:18:58 2014

ARC0 started with pid=34, OS id=7722

ARC0: Archival started

ARCH: STARTING ARCH PROCESSES COMPLETE

ARC0: STARTING ARCH PROCESSES

Tue Jul 29 06:18:59 2014

ARC1 started with pid=35, OS id=7724

Successful mount of redo thread 1, with mount id 732706059

Tue Jul 29 06:18:59 2014

ARC2 started with pid=36, OS id=7726

Allocated 3981120 bytes in shared pool for flashback generation buffer

Tue Jul 29 06:18:59 2014

ARC3 started with pid=37, OS id=7728

Starting background process RVWR

ARC1: Archival started

ARC2: Archival started

ARC1: Becoming the 'no FAL' ARCH

ARC1: Becoming the 'no SRL' ARCH

ARC2: Becoming the heartbeat ARCH

Tue Jul 29 06:18:59 2014

RVWR started with pid=38, OS id=7730

Physical Standby Database mounted.

Lost write protection disabled

ARC2: Becoming the active heartbeat ARCH

ARC2: Becoming the active heartbeat ARCH

ARC3: Archival started

ARC0: STARTING ARCH PROCESSES COMPLETE

Completed: alter database  mount

Tue Jul 29 06:19:04 2014

Starting Data Guard Broker (DMON)

Tue Jul 29 06:19:05 2014

INSV started with pid=40, OS id=7747

Tue Jul 29 06:19:11 2014

NSV1 started with pid=41, OS id=7752

Tue Jul 29 06:19:16 2014

RSM0 started with pid=43, OS id=7758

Tue Jul 29 06:19:21 2014

Using STANDBY_ARCHIVE_DEST parameter default value as USE_DB_RECOVERY_FILE_DEST

Primary database is in MAXIMUM AVAILABILITY mode

Changing standby controlfile to RESYNCHRONIZATION level

ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='maa1';

Standby controlfile consistent with primary

ALTER SYSTEM SET log_archive_format='arch_%t_%s_%r.arc' SCOPE=SPFILE SID='maa1';

ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*';

ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*';

RFS[1]: Assigned to RFS process 7760

RFS[1]: Selected log 5 for thread 1 sequence 4 dbid 722024964 branch 854159044

ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*';

ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*';

ALTER SYSTEM SET db_file_name_convert='+DG01/maastb','+DATA/maa','+FRA/maastb','+FRA/maa' SCOPE=SPFILE;

ALTER SYSTEM SET log_file_name_convert='+DG01/maastb','+DATA/maa','+FRA/maastb','+FRA/maa' SCOPE=SPFILE;

ALTER SYSTEM SET fal_server='maastb' SCOPE=BOTH;

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE

Attempt to start background Managed Standby Recovery process (maa1)

Tue Jul 29 06:19:23 2014

Primary database is in MAXIMUM AVAILABILITY mode

Standby controlfile consistent with primary

Tue Jul 29 06:19:23 2014

MRP0 started with pid=46, OS id=7764

MRP0: Background Managed Standby Recovery process started (maa1)

Standby controlfile consistent with primary

RFS[2]: Assigned to RFS process 7762

RFS[2]: Selected log 8 for thread 2 sequence 4 dbid 722024964 branch 854159044

Tue Jul 29 06:19:25 2014

RFS[3]: Assigned to RFS process 7766

RFS[3]: Selected log 6 for thread 1 sequence 3 dbid 722024964 branch 854159044

Tue Jul 29 06:19:25 2014

RFS[4]: Assigned to RFS process 7768

RFS[4]: Selected log 9 for thread 2 sequence 3 dbid 722024964 branch 854159044

RFS[3]: New Archival REDO Branch(resetlogs_id): 854159044  Prior: 847284763

RFS[3]: Archival Activation ID: 0x2bace9bb Current: 0x0

RFS[3]: Effect of primary database OPEN RESETLOGS

RFS[3]: Managed Standby Recovery process is active

RFS[3]: Incarnation entry added for Branch(resetlogs_id): 854159044 (maa1)

Tue Jul 29 06:19:25 2014

Setting recovery target incarnation to 4

Tue Jul 29 06:19:26 2014

Archived Log entry 5433 added for thread 2 sequence 3 ID 0x2bace9bb dest 1:

Tue Jul 29 06:19:26 2014

Archived Log entry 5434 added for thread 1 sequence 3 ID 0x2bace9bb dest 1:

RFS[4]: Opened log for thread 2 sequence 1395 dbid 722024964 branch 847284763

RFS[3]: Opened log for thread 1 sequence 1181 dbid 722024964 branch 847284763

Archived Log entry 5435 added for thread 2 sequence 1395 rlc 847284763 ID 0x2b64d2eb dest 2:

 started logmerger process

Archived Log entry 5436 added for thread 1 sequence 1181 rlc 847284763 ID 0x2b64d2eb dest 2:

Managed Standby Recovery starting Real Time Apply

RFS[4]: Opened log for thread 2 sequence 2 dbid 722024964 branch 854159044

Tue Jul 29 06:19:29 2014

RFS[5]: Assigned to RFS process 7786

RFS[5]: Opened log for thread 2 sequence 1 dbid 722024964 branch 854159044

RFS[3]: Opened log for thread 1 sequence 1 dbid 722024964 branch 854159044

Archived Log entry 5437 added for thread 2 sequence 2 rlc 854159044 ID 0x2bace9bb dest 2:

Archived Log entry 5438 added for thread 2 sequence 1 rlc 854159044 ID 0x2bace9bb dest 2:

Parallel Media Recovery started with 2 slaves

Tue Jul 29 06:19:30 2014

Media Recovery start incarnation depth : 1, target inc# : 4, irscn : 19932272

Archived Log entry 5439 added for thread 1 sequence 1 rlc 854159044 ID 0x2bace9bb dest 2:

Waiting for all non-current ORLs to be archived...

All non-current ORLs have been archived.

Tue Jul 29 06:19:30 2014

RFS[6]: Assigned to RFS process 7788

RFS[6]: Opened log for thread 1 sequence 2 dbid 722024964 branch 854159044

Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_2_seq_1395.2326.854173167

Identified End-Of-Redo (failover) for thread 2 sequence 1395 at SCN 0x0.1302470

Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_1_seq_1181.2327.854173167

Identified End-Of-Redo (failover) for thread 1 sequence 1181 at SCN 0x0.1302470

Archived Log entry 5440 added for thread 1 sequence 2 rlc 854159044 ID 0x2bace9bb dest 2:

Resetting standby activation ID 728027883 (0x2b64d2eb)

Media Recovery End-Of-Redo indicator encountered

Media Recovery Continuing

Tue Jul 29 06:19:31 2014

Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE

Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_2_seq_1.2329.854173169

Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_1_seq_1.2330.854173169

Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_1_seq_2.2331.854173171

Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_2_seq_2.2328.854173169

Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_2_seq_3.2325.854173167

Tue Jul 29 06:19:34 2014

Archived Log entry 5441 added for thread 2 sequence 4 ID 0x2bace9bb dest 1:

Tue Jul 29 06:19:34 2014

Primary database is in MAXIMUM AVAILABILITY mode

Standby controlfile consistent with primary

Standby controlfile consistent with primary

RFS[7]: Assigned to RFS process 7806

RFS[7]: Selected log 8 for thread 2 sequence 5 dbid 722024964 branch 854159044

Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_1_seq_3.2324.854173167

Media Recovery Waiting for thread 1 sequence 4 (in transit)

Recovery of Online Redo Log: Thread 1 Group 5 Seq 4 Reading mem 0

  Mem# 0: +DATA/maa/onlinelog/group_5.263.843615365

  Mem# 1: +FRA/maa/onlinelog/group_5.289.843615367

Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_2_seq_4.2332.854173175

Media Recovery Waiting for thread 2 sequence 5 (in transit)

Recovery of Online Redo Log: Thread 2 Group 8 Seq 5 Reading mem 0

  Mem# 0: +DATA/maa/onlinelog/group_8.259.843615383

  Mem# 1: +FRA/maa/onlinelog/group_8.703.843615385

Tue Jul 29 06:19:37 2014

Archived Log entry 5442 added for thread 1 sequence 4 ID 0x2bace9bb dest 1:

Tue Jul 29 06:19:38 2014

Primary database is in MAXIMUM AVAILABILITY mode

Standby controlfile consistent with primary

Media Recovery Waiting for thread 1 sequence 5 (in transit)

Standby controlfile consistent with primary

RFS[8]: Assigned to RFS process 7809

RFS[8]: Selected log 5 for thread 1 sequence 5 dbid 722024964 branch 854159044

Recovery of Online Redo Log: Thread 1 Group 5 Seq 5 Reading mem 0

  Mem# 0: +DATA/maa/onlinelog/group_5.263.843615365

  Mem# 1: +FRA/maa/onlinelog/group_5.289.843615367

Tue Jul 29 06:19:46 2014

Reconfiguration started (old inc 2, new inc 4)

List of instances:

 1 2 (myinst: 1)

 Global Resource Directory frozen

 Communication channels reestablished

 Master broadcasted resource hash value bitmaps

 Non-local Process blocks cleaned out

Tue Jul 29 06:19:46 2014

 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived

 Set master node info

 Submitted all remote-enqueue requests

 Dwn-cvts replayed, VALBLKs dubious

 All grantable enqueues granted

 Post SMON to start 1st pass IR

 Submitted all GCS remote-cache requests

 Post SMON to start 1st pass IR

 Fix write in gcs resources

Reconfiguration complete

No log do Broker irá ver um resumo do está no alertlog. Detecção que o antigo primary voltou e precisa de reinstate, após isso o banco maa passa a ser o destino preferencial.

07/29/2014 02:58:18

REINSTATE DATABASE maa

Database maa can be reinstated

07/29/2014 02:58:50

Command REINSTATE DATABASE maa completed with warning ORA-16570

Data Guard Broker Status Summary:

  Type                        Name                             Severity  Status

  Configuration               dgrac                             Warning  ORA-16608

  Primary Database            maastb                            Warning  ORA-16817

  Physical Standby Database   maa                                 Error  ORA-16661

07/29/2014 02:59:31

REINSTATE DATABASE maa

Database maa can be reinstated

07/29/2014 02:59:53

Successfully completed reinstatement of database 0x01001000; removing ReinstateContextArray property

07/29/2014 03:00:03

Command REINSTATE DATABASE maa completed

Data Guard Broker Status Summary:

  Type                        Name                             Severity  Status

  Configuration               dgrac                             Warning  ORA-16608

  Primary Database            maastb                            Warning  ORA-16817

  Physical Standby Database   maa                               Warning  ORA-16817

EDIT DATABASE maa SET PROPERTY ActualApplyInstance = maa1

07/29/2014 03:00:04

Apply Instance for database maa is maa1

Command EDIT DATABASE maa SET PROPERTY ActualApplyInstance = maa1 completed

Caso estivesse acompanhando o log do Observer veria as mesmas informações (de forma mais resumida ainda). Conforme o log abaixo demonstra o banco maa não chega a abrir durante o processo devido ao reinstate.

Failover succeeded, new primary is "maastb"

23:45:11.19  Monday, July 28, 2014

00:17:56.55  Tuesday, July 29, 2014

Initiating reinstatement for database "maa"...

Reinstating database "maa", please wait...

Operation requires shutdown of instance "maa1" on database "maa"

Shutting down instance "maa1"...

ORA-01109: database not open

Database dismounted.

ORACLE instance shut down.

Operation requires startup of instance "maa1" on database "maa"

Starting instance "maa1"...

ORACLE instance started.

Database mounted.

Continuing to reinstate database "maa" ...

Reinstatement of database "maa" succeeded

00:19:41.47  Tuesday, July 29, 2014

Como sempre, o ajuste necessário do CRS para não abrir o antigo primary em modo OPEN em caso de restart do server:

[oracle@rac11pri01 ~]$ srvctl config database -d maa -v

Database unique name: maa

Database name: maa

Oracle home: /u01/app/oracle/product/11.2.0.3/db_1

Oracle user: oracle

Spfile: +DATA/maa/spfilemaa.ora

Domain:

Start options: open

Stop options: immediate

Database role: PHYSICAL_STANDBY

Management policy: AUTOMATIC

Server pools: maa

Database instances: maa1,maa2

Disk Groups: DATA,FRA

Mount point paths:

Services:

Type: RAC

Database is administrator managed

[oracle@rac11pri01 ~]$ srvctl modify database -d maa -s MOUNT

[oracle@rac11pri01 ~]$ srvctl config database -d maa -v

Database unique name: maa

Database name: maa

Oracle home: /u01/app/oracle/product/11.2.0.3/db_1

Oracle user: oracle

Spfile: +DATA/maa/spfilemaa.ora

Domain:

Start options: mount

Stop options: immediate

Database role: PHYSICAL_STANDBY

Management policy: AUTOMATIC

Server pools: maa

Database instances: maa1,maa2

Disk Groups: DATA,FRA

Mount point paths:

Services:

Type: RAC

Database is administrator managed

[oracle@rac11pri01 ~]$

AMBIENTE FINAL

Chegamos aqui com o ambiente com dois bancos operando em Maximum Availability. O banco primary é o maastb e o standby é o maa. A troca de papeis ocorreu de forma automática pois o Fast-Start Failover está ativo com o Observer monitorando tudo.

Claro que seu ambiente não é só isso, você precisará que suas aplicações e TNS estejam preparadas para chavear as conexões para o novo primary. Não é o escopo aqui, mas é um ponto importante a ser verificado.

PRÓXIMO ARTIGO

No próximo artigo falarei do que temos de mudança para o DataGuard do Oracle 12c. Além disso um resumo dos dez artigos para que você possa ter um guia do que pode ser feito para configurar MAA com oracle RAC.

ORACLE E MAA (Maximum Availability Architecture) – Parte IX – Failover Automático

Failover Automático

Fernando Simon

Prestigie o autor e deixe o seu comentário: Cancelar resposta