Intermittent ORA-12533 / ORA-12521 errors in a RAC instance alert log

The symptom: “Fatal NI connect error” messages flooding the alert log of a RAC instance.

Create an entry similar to the following in tnsnames.ora. Note that the IP addresses must be the same as the addresses assigned to the SCAN name; to obtain them, run nslookup <SCAN_NAME>.

 

SCAN_LSNR =
  (ADDRESS_LIST =
    (ADDRESS = (PROTOCOL = TCP)(HOST = <Scan IP Address 1>)(PORT = <port #>))
    (ADDRESS = (PROTOCOL = TCP)(HOST = <Scan IP Address 2>)(PORT = <port #>))
    (ADDRESS = (PROTOCOL = TCP)(HOST = <Scan IP Address 3>)(PORT = <port #>))
  )
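Before editing tnsnames.ora, it is worth confirming that the SCAN name really resolves to three addresses. An nslookup against the SCAN name should return all three; the SCAN name, DNS server, and IP addresses below are placeholders for illustration only:

```text
$ nslookup myrac-scan.example.com
Server:   192.0.2.53
Address:  192.0.2.53#53

Name:    myrac-scan.example.com
Address: 192.0.2.101
Name:    myrac-scan.example.com
Address: 192.0.2.102
Name:    myrac-scan.example.com
Address: 192.0.2.103
```

These three addresses are the ones to plug into the ADDRESS entries above.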

Connect to the instance as SYSDBA:

 

 

 

SQL> show parameter listener

 

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
listener_networks                    string
local_listener                       string      (ADDRESS=(PROTOCOL=tcp)(HOST=<Host VIP Address>)(PORT=1521)), (ADDRESS=(PROTOCOL=tcp)(HOST=<Host Public IP Address>)(PORT=1521))
remote_listener                      string      <scan name>:<port #>

SQL>

SQL>

SQL> alter system set remote_listener='SCAN_LSNR' scope=both sid='*';

alter system set remote_listener='SCAN_LSNR' scope=both sid='*'

*

ERROR at line 1:

ORA-32008: error while processing parameter update at instance <instance name>

ORA-02097: parameter cannot be modified because specified value is invalid

ORA-00119: invalid specification for system parameter REMOTE_LISTENER

ORA-00132: syntax error or unresolved network name 'SCAN_LSNR'
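The ORA-00132 here typically means that the instance could not resolve the alias SCAN_LSNR through the tnsnames.ora in its TNS_ADMIN directory; on a RAC cluster the entry must be resolvable from every node, since sid='*' applies the change cluster-wide. A quick sanity check is to tnsping the alias from each node; the version string and paths below are placeholders, only the final OK matters:

```text
$ tnsping SCAN_LSNR

TNS Ping Utility for Linux: Version 11.2.0.3.0 - Production

Used TNSNAMES adapter to resolve the alias
Attempting to contact (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = <Scan IP Address 1>)(PORT = <port #>)) ...)
OK (10 msec)
```

If tnsping fails on any node, fix that node's tnsnames.ora before retrying the ALTER SYSTEM.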

 

 

SQL> show parameter listener

 

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
listener_networks                    string
local_listener                       string      (ADDRESS=(PROTOCOL=tcp)(HOST=<Host VIP IP Address>)(PORT=1521)), (ADDRESS=(PROTOCOL=tcp)(HOST=<Host Public IP Address>)(PORT=1521))
remote_listener                      string      SCAN_LSNR

 

 

SQL>

SQL> alter system set remote_listener='<scan name>:<port #>' scope=both sid='*';

 

System altered.

 

SQL> show parameter listener

 

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
listener_networks                    string
local_listener                       string      (ADDRESS=(PROTOCOL=tcp)(HOST=<Host VIP IP Address>)(PORT=1521)), (ADDRESS=(PROTOCOL=tcp)(HOST=<Host Public IP Address>)(PORT=1521))
remote_listener                      string      <scan name>:<port #>

SQL>

SQL> alter system register;

 

System altered.


Oracle Database Administration is my passion

When you think you can’t go on, force yourself to keep going. Your success is based on persistence, not luck


Why is a Virtual IP Necessary in an Oracle RAC Environment


By Scott Jesse, Bill Burton, and Bryan Vongray


Why must we use a VIP address in an Oracle RAC environment? The simple answer to this question is TCP timeouts. Let’s discuss this a bit more.

TCP timeouts, believe it or not, play a huge part in the perceived availability of applications. When a node in an Oracle RAC environment goes down, or in any MAA environment with multiple addresses to attempt, there may be no way for the client to know this. If a client is connecting using a TNS alias or a service that allows connection to multiple nodes, the client may unknowingly try its first connection attempt to the node that is down. This in and of itself is not a problem, as multiple addresses should be in the list, so that when the client fails to get a response from the first address in the list, the next address will be tried, until the connection succeeds. The problem lies with the time that it takes to go to the next address in the list.

How long does the client wait to determine that the host it is trying to reach is not accessible? The time can range anywhere from a few seconds to a few minutes, and in some environments this is simply unacceptable. If a node goes down for several hours, days, or weeks, the database may still be humming along just fine, with the remaining nodes still serving it. However, some clients may always be trapped into making their initial connection attempt to the down node, and will therefore be stuck in front of a seemingly interminable hourglass while the connection times out, before being rerouted to the next address in the list.
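The multi-address alias described above looks roughly like the sketch below; the alias name, host names, and service name are placeholders. With FAILOVER=ON (the default for an ADDRESS_LIST), a client that cannot reach the first address moves on to the next one in the list:

```text
PROD =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (FAILOVER = ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = node2-vip)(PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = prod.example.com))
  )
```

The failover itself works; the problem discussed here is only how long the client spends waiting on a dead address before trying the next one.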

Reining in TCP timeouts at the OS

Unfortunately, this time is generally outside of Oracle's control. It varies from client to client and from operating system to operating system, because it is governed by operating system timeout settings on the client side. Modifying every client can be cumbersome, since there may be many clients with many configuration variations. In addition, lowering the timeout values may have adverse consequences for other applications running on those clients, if those applications rely on a higher TCP timeout value for whatever reason.

To make matters worse, the behavior may not be consistent. If client-side load-balancing is enabled, it is possible that some connections will succeed immediately on their first attempt, because they just happened to connect randomly to a node that is available. At other times, however, the connection time increases, because the client randomly and unwittingly picks the down node for its first connection attempt. The result of this is confusion and frustration at the client side, even though from the database’s perspective everything is functioning as it should.

Giving the MAA DBA control over TCP timeouts

Enter the VIP address. By using a VIP address, Oracle eliminates the problem of TCP timeouts on the initial connection, without requiring changes to a single client machine. This is done by having all client connections come in on the VIP address. When all nodes are functioning properly, each VIP runs on its assigned node, and connections are directed to the appropriate listener and service. When the unthinkable happens and a node fails (gasp!), CRS kicks in, and the VIP for that node is brought online on one of the remaining nodes of the cluster, where it can respond to a ping and also to connection attempts. Note that this VIP will not be accepting connections to the database at this time. However, since the IP address is available, it can respond to a connection attempt immediately.

The response given to the client will generally be an ORA-12541, advising that no listener is available. This is because the node where the VIP now resides has its own listener, but that listener listens on its own VIP, not on the VIPs of other nodes. The client, receiving the message that there is no listener, immediately retries using the next IP in the ADDRESS_LIST, rather than waiting up to two minutes for the timeout we would normally expect. Thus, a connect-time failover has still occurred, but the connection attempt succeeds within a second or less. Note that even when a client uses the SCAN, the local listener is still listening on the VIP.

Why a SCAN virtual IP?

Having answered the question of why we need VIPs, your next logical question might be, “Why a SCAN VIP?” and subsequently, “Why does Oracle recommend three of them?” The SCAN VIP works a bit differently from the “normal” VIP. We will call the “normal” VIP a local VIP for the moment, since these VIPs are associated with the local listener. We know that if a node fails, the VIP fails over to another node. So far, the SCAN VIP and the local VIP act the same. The difference comes into play after the failover: the relocated local VIP replies to a ping, but it does not accept SQL*Net connections, because the local listener cannot fail over with it. The SCAN listeners, in contrast, can run on any node of the cluster. So the SCAN VIP not only has the task of avoiding a wait for the TCP timeout, but must also ensure that the SCAN listener associated with it can be started on every available node in the cluster, if needed.

Why does Oracle recommend that you set up the SCAN name with three IP addresses, thus having three SCAN VIPs and three SCAN listeners? The answer is related to the subject of this book: Maximum Availability. You can configure your cluster with only one SCAN VIP/listener, but this would make a responsible MAA DBA very nervous, unable to sleep at night for fear that the single SCAN listener would fail. Redundancy is part of the answer, but not the whole story, because one could still ask, “Why not two?” or “Why not one SCAN per node?” Two would cover the redundancy requirement, but three spreads the connection load so that no single SCAN listener is exhausted, reduces the CPU cost per node, and keeps the configuration independent of the number of nodes in the cluster. Having three is plenty.

NOTE

The benefit of using SCAN is that the network configuration files on the client computer do not need to be modified when nodes are added to or removed from the cluster.
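With SCAN in place, a client connect string can be as simple as an EZConnect-style string against the single SCAN name; the SCAN name, port, and service name below are placeholders. DNS round-robin and the SCAN listeners take care of routing the connection to a surviving node:

```text
sqlplus scott/tiger@myrac-scan.example.com:1521/prod.example.com
```

This is why adding or removing nodes requires no client-side changes: the client only ever knows the SCAN name.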


Oracle RAC Basics

RAC Basics : CRS, Voting Disk, OCR, Cache Fusion ….

Oracle Clusterware (Cluster Ready Services in 10g, Cluster Manager in 9i) – provides the infrastructure that binds multiple nodes so that they operate as a single server. Clusterware monitors all components, such as instances and listeners. There are two important components in Oracle Clusterware: the voting disk and the OCR (Oracle Cluster Registry).


Voting Disk – a file that resides on shared storage and manages cluster membership. The voting disk is used to arbitrate cluster ownership between the nodes in case of a failure.

OCR (Oracle Cluster Registry) – resides on shared storage and maintains information about the cluster configuration and the cluster database, such as which database instances run on which nodes and which services run on which database.

CRS Resource – anything that Oracle Clusterware manages is classified as a CRS resource: a database, instance, service, listener, VIP address, and so on.

Cluster-Aware Storage – a storage solution for Oracle RAC, such as raw devices, OCFS, or ASM.
Interconnect – the private network that connects all the servers in the cluster. The interconnect uses a switch that only the cluster nodes can access, and the instances in the cluster communicate with each other over it.

Cache Fusion – the diskless cache-coherency mechanism in Oracle RAC that ships copies of data blocks directly from the memory cache of the instance holding the block to the instance requesting it. Cache Fusion thus presents a single buffer cache for all instances in the cluster, stitched together over the interconnect.

In a single-node Oracle database, an instance looking for a data block first checks its buffer cache; if the block is not in the cache, it reads the block from disk into the cache and returns it to the client.

In a RAC database there are also remote caches, so an instance must look not only in its local cache but also in the caches of the remote instances. If the block is in the local cache, it is returned from there; if not, instead of going to disk, the instance first checks (via the interconnect) whether the block is available in a remote cache.

This is because fetching a data block from a remote cache is faster than reading it from disk.


Cache Fusion Model
The Cache Fusion model depends on three services:
— Global Resource Directory (GRD)
— Global Cache Service (GCS)
— Global Enqueue Service (GES)

More on Cache Fusion Model coming soon …..
SSH User Equivalency – means that the operating system user that installs and owns the RAC database has the same properties (username, user ID, group, group ID, and password) on every node in the cluster, so that the nodes can ssh to one another as that user without prompting.
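A quick way to verify user equivalency is to run a non-interactive ssh command from each node to every other node as the software owner; it should complete without a password prompt. The node name below is a placeholder:

```text
$ ssh node2 date
Tue Jan  7 10:00:00 UTC 2014
```

If any node pair prompts for a password or a host-key confirmation, equivalency is not fully set up and the installer will fail its checks.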

CVU (Cluster Verification Utility) – a utility that verifies the system meets all the criteria for an Oracle Clusterware installation.
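CVU is typically run from the installation media before installing Clusterware; for example, a pre-install check across the cluster nodes looks like this (node names are placeholders, and the exact script location depends on your media):

```text
$ ./runcluvfy.sh stage -pre crsinst -n node1,node2 -verbose
```

The same utility can also be invoked as cluvfy from an installed Grid Infrastructure home for post-install checks.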

 


This blog contains information about my passion for Oracle Database Administration, Unix Shell scripting, ……
