ORA-48189: OS command to create directory failed while trying to start database using srvctl on Exadata
So, yesterday, during routine troubleshooting and maintenance at one of my customers, we wanted to restart a database running on Exadata. The database was brought down successfully, but we encountered ORA-48189 when we tried to bring it back up using srvctl. See the full error message below:
[orcldb1]-[oracle@exadata-01]-[/u02/app/oracle]-[12:20:39] $ srvctl start database -d orcldb
PRCR-1079 : Failed to start resource ora.orcldb.db
CRS-5017: The resource action "ora.orcldb.db start" encountered the following error:
ORA-48189: OS command to create directory failed
Linux-x86_64 Error: 1: Operation not permitted
Additional information: 2
. For details refer to "(:CLSN00107:)" in "/u01/app/oracle/diag/crs/exadata-01/crs/trace/crsd_oraagent_oracle.trc".
CRS-2674: Start of 'ora.orcldb.db' on 'exadata-01' failed
CRS-2632: There are no more servers to try to place resource 'ora.orcldb.db' on that would satisfy its placement policy
[orcldb1]-[oracle@exadata-01]-[/u02/app/oracle]-[12:21:50] $
This is a two-node RAC. While checking the logs, I found that the database instance on node-02 had started successfully; the error above was raised only while starting the instance on node-01.
Details of the error and a possible solution are documented under Oracle Doc ID: 2791944.1. Well, I checked and found that the permissions on the $ORACLE_BASE/diag folder were indeed 775. So, what is the issue?
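For anyone who wants to repeat the check, here is a minimal sketch of how the mode and ownership of the diag folder and its immediate children can be listed. The function name show_perms is made up for this example, and it assumes GNU find and stat, which is what Oracle Linux on Exadata ships.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print octal mode, owner:group, and path for a
# directory and its immediate children. Assumes GNU find and GNU stat.
show_perms() {
    local dir="$1"
    find "$dir" -maxdepth 1 -exec stat -c '%a %U:%G %n' {} +
}

# Example (run as the database software owner):
#   show_perms "$ORACLE_BASE/diag"
```

Looking only at the top level keeps the output short; it will not reveal a problem that sits deeper in the tree.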
What “might” have happened?
Unlike a traditional setup, where one would usually find just one Oracle home owner (OS user), i.e. “oracle”, the customer had multiple Oracle home owners defined, for example oracle, oracledev, and oracletest, for ease of identification and segregation of environments on the same Exadata machine.
The $ORACLE_BASE owner (“oracledev:oinstall”) of the database that we were trying to start was different from the $ORACLE_HOME owner (“oracle:oinstall”). Even with all the Oracle home owners being part of the oinstall group and with the right permissions of 775 set on $ORACLE_BASE, we still encountered the error. Strange..
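The group membership part of this can be verified quickly. Below is a small sketch, not anything taken from the Oracle note: the function name in_group is invented for this example, and the user and group names in the usage comment are the ones from this environment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) if the given OS user is a member
# of the given group, based on the output of `id -nG`.
in_group() {
    local user="$1" group="$2"
    id -nG "$user" | tr ' ' '\n' | grep -qx "$group"
}

# Example, using the owners described above:
#   for u in oracle oracledev oracletest; do
#       in_group "$u" oinstall && echo "$u: in oinstall" || echo "$u: NOT in oinstall"
#   done
```

Note that a user being in the group only helps if every directory on the path also grants the group write and execute access.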
What I think might have happened is that, while a new 19c Oracle home was being installed, the permissions of some files/folders under the $ORACLE_BASE/diag folder got changed, and that prevented us from starting the database.
We checked and found that the permissions on $ORACLE_BASE on node-02, where the database had started without any issues, were set to 777. I usually do not recommend world-writable permissions, so instead we reapplied 775 recursively on the $ORACLE_BASE/diag folder and tried to restart the database, which came up cleanly this time.
# chmod -R 775 $ORACLE_BASE/diag
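If you hit the same error, it may be worth capturing which entries are actually blocking the group before running a recursive chmod, since the chmod wipes out the evidence. The following is a rough sketch under my own assumptions: the function name find_group_blocked is made up, it relies on GNU find (for -printf), and it treats "group cannot write" (plus "group cannot traverse" for directories) as the likely cause, which I have not confirmed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list entries under a directory that the group
# cannot use for creating files or directories.
#   - directories missing group write or group execute (octal 030)
#   - regular files missing group write (octal 020)
# Prints: octal mode, owner:group, path. Assumes GNU find.
find_group_blocked() {
    local dir="$1"
    find "$dir" \( \( -type d ! -perm -030 \) -o \( -type f ! -perm -020 \) \) \
        -printf '%m %u:%g %p\n'
}

# Example: save the list before fixing anything
#   find_group_blocked "$ORACLE_BASE/diag" > /tmp/diag_perms_before.txt
```

Comparing that list between the two nodes should narrow down which files/folders differ.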
I am still researching which files/folders had their permissions changed and caused this, and will let you know once I find something. Meanwhile, let me know if you have encountered this in the past and already know which files/folders might have had their permissions changed and led to the issue.