阅读 106

DBCA建库问题处理之-无法创建RAC实例

在安装完RAC11.2.0.4,并打上PSU后,在通过DBCA创建数据库实例是出现诡异现象:  
downloadattachmentprocessor?parent=DOCUMENT&sourceId=1073926.1&attachid=1073926.1:SNAP1&clickstream=yes
弹出来的框为单实例创建,二并非RAC实例。反复试了几次都是同样的问题,检查安装无问题。  
在metalink上查看了下,问题原因很多,同事通过sqlplus连接空实例测试出现如下报错:  
[oracle@zderp2vprd01 cfgtoollogs]$ sqlplus / as sysdba  


SQL*Plus: Release 11.2.0.4.0 Production on Thu Nov 17 09:59:10 2016  


Copyright (c) 1982, 2013, Oracle.  All rights reserved.  


ERROR:  
ORA-12547: TNS:lost contact  




Enter user-name: ^C  

上述报错通常出现在打完PSU后,看到这个错误,怀疑是一个经常遇到的老问题了。目录权限问题,检查如下:  
[oracle@zderp2vprd01 11.2.0]$ ls -lrt  
total 8  
drwxr-xr-x 73 root oinstall 4096 Nov 16 21:29 grid_1  
drwxrwxr-x 78 grid oinstall 4096 Nov 17 09:40 db_1  
[oracle@zderp2vprd01 11.2.0]$ chown oracle:oinstall db_1   
[root@zderp2vprd01 11.2.0]# ls -lrt  
total 8  
 drwxr-xr-x 73 root oinstall 4096 Nov 16 21:29 grid_1  
 drwxrwxr-x 78 grid oinstall 4096 Nov 17 09:40 db_1  
 [root@zderp2vprd01 11.2.0]# chown oracle:oinstall db_1  

修改后终于可以正常创建RAC实例了。  

此处将metalink上的信息也贴出来,供大家参考学习。  
Oracle Database - Enterprise Edition - Version 10.2.0.1 to 11.2.0.4 [Release 10.2 to 11.2]  
Information in this document applies to any platform.  

SYMPTOMS

After CRS or Grid Infrastructure installation and RDBMS installation, using dbca to create a RAC database,  it gives a welcome screen without the 'Oracle Real Applications Clusters' and 'Oracle single instance' database options. (same for netca).

downloadattachmentprocessor?parent=DOCUMENT&sourceId=1073926.1&attachid=1073926.1:SNAP1&clickstream=yes

While it supposes to be:

downloadattachmentprocessor?parent=DOCUMENT&sourceId=1073926.1&attachid=1073926.1:SNAP2&clickstream=yes

This can happen for the following cases.


Case I

With 11.2 Grid Infrastructure, using dbca to create a RAC database from a 10.x or 11.1.x ORACLE_HOME.

$ORACLE_HOME/cfgtoologs/dbca/trace.log shows:

[main] [10:36:4:10] [OCR.<init>:259] Primary Group of Current user: oracle  
[main] [10:36:4:11] [HASContext.getInstance:200] Module init : 24  
[main] [10:36:4:11] [HASContext.getInstance:223] Local Module init : 24  
[main] [10:36:4:13] [ClusterLock.<init>:56] ClusterLock Instance Allocation Failed: oracle.ops.mgmt.has.HASContextException: OCR Error(Native: prsr_initCLSS:[21])  
[main] [10:36:4:20] [Cluster.isCluster:206] oracle.ops.mgmt.cluster.Cluster.isCluster(Cluster.java:260)  
oracle.ops.mgmt.cluster.Cluster.isCluster(Cluster.java:197)  
oracle.sysman.assistants.util.ClusterUtils.<init>(ClusterUtils.java:185)  
oracle.sysman.assistants.util.ClusterUtils.getInstance(ClusterUtils.java:236)  
oracle.sysman.assistants.dbca.backend.Host.checkOPS(Host.java:2070)  
oracle.sysman.assistants.dbca.backend.Host.startOperation(Host.java:2303)  
oracle.sysman.assistants.dbca.Dbca.execute(Dbca.java:115)  
oracle.sysman.assistants.dbca.Dbca.main(Dbca.java:180)  
[main] [10:36:4:20] [Host.checkOPS:2073] cluster existence:false  
[main] [10:36:4:20] [Host.checkOPS:2111] Cluster installed=false  
[Finalizer] [10:36:4:25] [ClusterLock.finalize:88] ClusterLock: finalized called for oracle.ops.mgmt.has.ClusterLock@5c7734

Case II

RDBMS is at the same release as CRS or Grid Infrastructure, RDBMS is installed with Real Application Cluster option

Case III

RDBMS is at the same release as CRS or Grid Infrastructure, RDBMS is installed on each individual node without selecting Real Application Cluster option.

Case IV,  V,  VI & VII

Grid Infrastructure and RDBMS are on the same release, they are installed as different user: grid and oracle. Grid Infrastructure is running on all nodes, RDBMS is installed with RAC option.

CAUSE

Case I

  This is caused by the cluster nodes not being pinned in 11.2 Grid Infrastructure.

See Oracle Grid Infrastructure Installation Guide 11g Release 2 (11.2) for Linux Chapter 5.3.3 Pinning Cluster Nodes for Oracle Database Release 10.x or 11.x for more details.

Case II

This is caused by the problem inventory, eg: oraInventory/ContentsXML/inventory.xml does not list RAC nodes for the corresponding RDBMS ORACLE_HOME. For example:

/HOME>  
<HOME NAME="OraDb11g_home" LOC="/opt/oracle/product/11.2.0/racdb" TYPE="O" IDX="3">  
</HOME>


While it should be similar to:

/HOME>  
<HOME NAME="OraDb11g_home" LOC="/opt/oracle/product/11.2.0/racdb" TYPE="O" IDX="3">  
   <NODE_LIST>  
     <NODE NAME="racnode1"/>  
     <NODE NAME="racnode2"/>  
   </NODE_LIST>  
</HOME>

 

Case III

If the RDBMS software is not installed with RAC option, then dbca will not show RAC option either. To confirm if RDBMS software is installed with RAC option, please refer Note 284785.1 How to Check Whether Oracle Binary/Instance is RAC Enabled and Relink Oracle Binary in RAC.

Case IV

The problem is caused by RDBMS ORACLE_HOME wrongly owned by grid user. It should be owned by oracle user.
dbca trace shows:

[main] [ 2012-08-09 03:46:46.745 GST ] [OracleHome.isDatabaseOptionOn:1133]  Checking Database Option: Real Application Clusters  
[main] [ 2012-08-09 03:46:46.745 GST ] [OracleHome.isDatabaseOptionOn:1190]  Database Option Real Application Clusters is false

Despite RDBMS ORACLE_HOME has RAC option installed correctly.

Further strace against dbca reveals there are ORA-600 reported before dbca display the option page:

21999 02:45:37.053231 open("/u01/app/oracle/product/11.2.0/dbhome_1/rdbms/log/dbua4535717_ora_21999.trc", O_WRONLY|O_CREAT|O_TRUNC, 0660) = 3  
...  
21999 02:45:37.058573 write(3, "ORA-00600: internal error code, "..., 139) = 139  
21999 02:45:37.214275 --- SIGSEGV (Segmentation fault) @ 0 (0) ---  

checking /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/log/dbua4535717_ora_21999.trc, shows:  
  ORA-00600: internal error code, arguments: [spstp: ORACLE_HOME uid does not match euid], [1100], [1101], [], [], [], [], [], [], [], [], []  

While:  
  # id grid
uid=1100(grid) gid=1000(oinstall) groups=1000(oinstall),1200(asmadmin),1201(asmdba),1202(asmoper),1300(dba)
# id oracle
uid=1101(oracle) gid=1000(oinstall) groups=1000(oinstall),1201(asmdba),1300(dba),1301(oper)

$ ls -l /u01/app/oracle/product/11.2.0
  drwxrwxr-x 75 grid oinstall 4096 Aug  8 14:21 dbhome_1  << this is wrongly owned by grid user

Case V

The problem is caused by an prior abnormal termination of dbca which left a test instance DBUA0 running on 1 of the cluster nodes, this prevents a new test instance to be created before the RAC option is displayed. dbca trace shows:

[main] [ 2011-03-11 21:21:08.473 GMT ] [OracleHome.initOptions:1226] Initializing Database Options with  for dummy sid=DBUA0 using initfile=/data/oracle/product/11.2.0.2/dbs/initDBUA0.ora using pwdfile=/data/oracle/product/11.2.0.2/dbs/orapwDBUA0  
[main] [ 2011-03-11 21:21:08.899 GMT ] [OracleHome.initOptions:1240]  executing: startup nomount pfile='/data/oracle/product/11.2.0.2/dbs/initDBUA0.ora'  
[main] [ 2011-03-11 21:21:32.871 GMT ] [SQLEngine.done:2167]  Done called  
[main] [ 2011-03-11 21:21:32.872 GMT ] [OracleHome.initOptions:1247]  ORA-304: requested INSTANCE_NUMBER is busy <<<<  
oracle.sysman.assistants.util.sqlEngine.SQLFatalErrorException: ORA-304: requested INSTANCE_NUMBER is busy   
        at   
oracle.sysman.assistants.util.sqlEngine.SQLEngine.executeImpl(SQLEngine.java:1  
...  
[main] [ 2011-03-11 21:21:33.499 GMT ] [OracleHome.isDatabaseOptionOn:1129] Checking Database Option: Real Application Clusters  
[main] [ 2011-03-11 21:21:33.500 GMT ] [OracleHome.isDatabaseOptionOn:1186] Database Option Real Application Clusters is false <<<<

Case VI

The problem is caused by $ORACLE_HOME/rdbms/audit directory missing or not writable.

$ORACLE_BASE/cfgtoollogs/dbca/trace.log_<ORACLE_HOMENAME>_<date> shows:

[main] [ 2014-06-27 00:15:15.835 EST ] [OracleHome.initOptions:1236]  Initializing Database Options with  for dummy sid=DBUA1500252 using initfile=/u01/app/oradb1/product/11.2.0/dbhome_1/dbs/initDBUA1500252.ora using pwdfile=/u01/app/oradb1/product/11.2.0/dbhome_1/dbs/orapwDBUA1500252
...
[main] [ 2014-06-27 00:15:38.793 EST ] [OracleHome.initOptions:1285]  ORA-09925: Unable to create audit trail file

 

Case VII

The problem is caused by insufficient semaphores configed which leads to test instance startup failure.
dbca trace.log shows:

[main] [ 2015-08-06 09:28:58.544 EDT ] [OracleHome.initOptions:1253]  executing: startup nomount pfile='/oracle/product/11.2.0.4.5/db11g/dbs/initDBUA2857847.ora'  
[main] [ 2015-08-06 09:29:00.158 EDT ] [SQLEngine.done:2189]  Done called  
[main] [ 2015-08-06 09:29:00.159 EDT ] [OracleHome.initOptions:1262]  ORA-27154: post/wait create failed  
ORA-27300: OS system dependent operation:semget failed with status: 28  
ORA-27301: OS failure message: No space left on device  
ORA-27302: failure occurred at: sskgpbitsper

 

 

Other Possible causes
1. RDBMS software corrupted, oracle binary shows different size for different nodes

2. Grid_home is not relinked properly

 

SOLUTION

Case I

1.  To fix the issue, pin all nodes within the cluster using:

as root user:  
crsctl pin css -n<node1> <node2>

2. To list all pinned nodes:

olsnodes -t -n

After this, dbca shows Oracle Real Application Clusters database option correctly.

For 10.x and 11.1.x dbca to work with 11.2 listener, patch 8288940 needs to be applied. Please refer to Note 948456.1 for other potential issues in such mixed environments.

Case II

   Fix the inventory.xml via detaching the problem ORACLE_HOME, then re-attaching with correct information.

1. To detach ORACLE_HOME, on each cluster node, issue:

$ORACLE_HOME/oui/bin/runInstaller -detachHome -silent -local ORACLE_HOME=<$ORACLE_HOME path>  

eg:  
$ORACLE_HOME/oui/bin/runInstaller -detachHome -silent -local ORACLE_HOME=/opt/oracle/product/11.2.0/racdb

2. To attach ORACLE_HOME, on each cluster node, issue:

$ORACLE_HOME/oui/bin/runInstaller -silent -local -ignoreSysPrereqs -attachHome ORACLE_HOME=<$ORACLE_HOME path> ORACLE_HOME_NAME=<$ORACLE_HOME_NAME> CLUSTER_NODES=<local-node>,<remote-node> LOCAL_NODE=<local-node>

For example, on node 1, issue:

$ORACLE_HOME/oui/bin/runInstaller -silent -attachHome -local ORACLE_HOME="/opt/oracle/product/11.2.0/racdb" ORACLE_HOME_NAME="OraDb11g_home" "CLUSTER_NODES={racnode1,racnode2}" LOCAL_NODE='racnode1'

 on node 2, issue:

$ORACLE_HOME/oui/bin/runInstaller -silent -attachHome -local ORACLE_HOME="/opt/oracle/product/11.2.0/racdb" ORACLE_HOME_NAME="OraDb11g_home" "CLUSTER_NODES={racnode1,racnode2}" LOCAL_NODE='racnode2'

Case III

1. Relink Oracle binary to include RAC option.

cd $ORACLE_HOME/rdbms/lib  
make -f ins_rdbms.mk rac_on ioracle

2. Fix the inventory using the same procedure as case II.

Case IV

Change RDBMS ORACLE_HOME to be owned by oracle user:

As root user:  
cd /u01/app/oracle/product/11.2.0  
chown oracle dbhome_1


Case V

Manually shutdown the instance DBUA0.
Please ensure there is no instance with name DBUA0 running on any node of the cluster before start dbca. This is not required if there is no earlier abnormal termination of dbca happened.


Case VI

Create the default audit directory under $ORACLE_HOME/rdbms/audit if it is missing, otherwise correct the permission and ownership:

$ mkdir $ORACLE_HOME/rdbms/audit  
It should have permission 755 (drwxr-xr-x) and ownership <oracle>:oinstall

 

Case VII

Either increase the semaphores setting or clear up left over semaphores. Please refer to Note 825001.1 for solution.


Others

1. Reinstall RDBMS oracle software

2. Relinking Grid Infrastructure binaries, refer to
Oracle? Grid Infrastructure Installation Guide
11g Release 2 (11.2) for Linux
6.3 Relinking Oracle Grid Infrastructure for a Cluster Binaries 


文章分类
后端
版权声明:本站是系统测试站点,无实际运营。本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容, 请发送邮件至 XXXXXXo@163.com 举报,一经查实,本站将立刻删除。
相关推荐