My Journey as an Oracle DBA: January 2012

Tuesday 31 January 2012

RMAN - adclone.pl failed with error

Issue:
Cloning failed with error.
On EBS 12.1.2 during restoration of RMAN Backup using adclone,we got the error:
StackTrace -
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at oracle.apps.ad.clone.util.CloneProcessor.run(CloneProcessor.java:67)
at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.NoClassDefFoundError: oracle/jdbc/OracleDriver

Solution:
1 - Copy $RDBMS_HOME/jdbc/lib/ojdbc6.jar to $RDBMS_HOME/appsutil/clone/jlib/ojdbc6.jar
2 - Run the adclone.pl again, it will now find the ojdbc6.jar and will also continue

Monday 30 January 2012

ORA-609 and TNS-12537 in 11g Alert.log

Issue: ORA-609 and TNS-12537 in 11g Alert.log

Findings:
1) lsnrctl status hanged
2) conn user/pwd@orcl hanged
3) Database is up and running
4) Error in alert log
nt OS err code: 0
Fatal NI connect error 12537, connecting to:
(LOCAL=NO)
VERSION INFORMATION:
TNS for Linux: Version 11.1.0.6.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.1.0.6.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.1.0.6.0 - Production
Time: 25-JAN-2012 12:38:49
Tracing not turned on.
Tns error struct:
ORA-609 : opiodr aborting process unknown ospid (4018_47693892033648)
ns main err code: 12537
TNS-12537: TNS:connection closed
ns secondary err code: 12560
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
ORA-609 : opiodr aborting process unknown ospid (4016_47051970669680)

Cause:
The ORA-609 error is thrown when a client connection of any kind failed to complete or aborted the connection
process before the server process was completely spawned.
Beginning with 10gR2, a default value for inbound connect timeout has been set at 60 seconds.

Solution:
Increase the values for INBOUND_CONNECT_TIMEOUT at both listener and server side sqlnet.ora file as a preventive measure.
If the problem is due to connection timeouts,an increase in the following parameters should eliminate or reduce the occurrence of the ORA-609s.

Sqlnet.ora: SQLNET.INBOUND_CONNECT_TIMEOUT=180
Listener.ora: INBOUND_CONNECT_TIMEOUT_listener_name=120

I had faced this issue on Linux Server.In may case the issue got resolved after changing one of the Kernel Parameter.
Change the value of net.ipv4.ip_forward from 0 to 1 in /etc/sysctl.conf file.
# Controls IP packet forwarding
net.ipv4.ip_forward = 1

Sunday 29 January 2012

Digital Certificate Expired Details

Digital Certificate Expired Details:
The digital certificate used to sign Oracle Forms deployment related JAR files has expired on Thu Mar 24 19:59:59 EDT 2011.
So, a dialog box appears whenever form application opens.

Impact on the Application:
The expiration of the JAR signature does not prevent the use of the product or applications created with it . It simply means that inspection of the certificate details will show the expiration date. It also means that end users may be presented with a dialog at startup which indicates that the ce! rtificate has expired, but are offered an option to run the application regardless.

Choosing Cancel or Deny, will prevent the Forms applet from running.

Workaround:
Clicking on yes on the dialog box, the application runs fine.

Solution:
Apply the following patches:
1.) Forms bundle patch: 9593176 (Size :5.9MB)
2.) Patch:11710576 (Size :4.7MB)

Validity of the solution:
These patches are available to extend the jar file expiration date to Sat Jul 06 19:59:59 EDT 2013.

Approach:
The patch will be applied on the Pre-prod environment first. Testing will be done in preprod first and if successful the patch will be applied to the Production Environment.

Thursday 26 January 2012

Listener Services stops automatically every 2-3 minutes

Issue:
Listener Services stops automatically every 2-3 minutes.

Error in listener log:
Started with pid=9907
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.1)(PORT=1521)))

TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
26-FEB-2011 11:50:47 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=test01)(USER=oracle))(COMMAND=status)(ARGUMENTS=64)(SERVICE=LISTENER)(VERSION=169870080)) * status * 0
26-FEB-2011 11:51:39 * service_register * unixpi * 0
26-FEB-2011 11:51:58 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=test01)(USER=oracle))(COMMAND=status)(ARGUMENTS=64)(SERVICE=LISTENER)(VERSION=169870080)) * status * 0
No longer listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.1)(PORT=1521)))
26-FEB-2011 11:52:01 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=test01)(USER=oracle))(COMMAND=stop)(ARGUMENTS=64)(SERVICE=LISTENER)(VERSION=169870080)) * stop * 0

Findings:
As listener it seems that indeed listener is stopped "automatically every 2-3 minutes" but this as a result of COMMAND=STOP
issues by USER=oracle (it's OS user here) and this is issued from HOST=test01 ... and command comes from a Oracle
software having as client libraries VERSION=169870080 (A200300 in hex which means 10.2.0.3) ... the only suggestion we can
offer is to check on the machine test01 for a script or (in case it's a cluster node) a rule in cluster software ... about a sequence
where it is checked the status,probably the script or the rules are having some bad conclusion about that ... and the consequence
is that the command issued is COMMAND=STOP. It is visible the same sequence all the time ....

(COMMAND=status)
(COMMAND=status)
(COMMAND=STOP)...

So there is a check about something expected to be found in the status command, the status is considered not the one expected
and as a result the listener is restarted. If this is a cluster then you will need the CLUSTER admin to check about the crs scripts...
The CRS scripts could provide the same results / symptoms if the checks for the as named "cluster resources" (listener is one)
will not be interpreted well. The script is checking is a listener status is OK in base of some scripts logics (specific to your script)
and if not then the listener is restarted. Usually such would require manual (human) interventions. Other possible source is a DBA
script running in in there ... performing the same check in loop. LSNRCTL STATUS (2 times) ... the result is parsed and if there is a
certain condition meed the listener is stopped using LSNRCTL STOP command... This is not a listener issue, some DBA have to
check and fix the condition checked or to simple disable such check ... Please verify and confirm this ...

Exact Issue:
The Application(hard coded) is checking the status of default listener and stop it.

Solution
Change the Name of the Listener.

Sunday 22 January 2012

ORA-00600: [kpnatdm-1], [2501], [533]

Errors in file /orahome/oracle/product/10.2.0/admin/orcl/udump/orcl2_ora_26752.trc:
ORA-00600: internal error code, arguments: [kpnatdm-1], [2501], [533], [], [], [], [], []

Findings:
The error encounters when the below mentioned SQL is fired by program oracle@test02 from machine test02 using operating system user orauser.

SELECT :1||"A1"."ERR_DESC"||',' FROM "DIR"."DI_ERROR_LIST" "A1" WHERE "A1"."ERR_CD"=SUBSTR(:1,1,3) AND "A1"."MESG_ID"='ALCHI01'

This type of argument [kpnatdm-1], [2501], [533] for ORA-00600 is unknown in the bug database.

Recomendation:
It is suggested to Upgrade the database to 10.2.0.5

Tuesday 17 January 2012

New Jobs in Oracle 11g

1. ORA$AUTOTASK_CLEAN

The job is created by the 11g script catmwin.sql - Catalog script for
Maintenance Window catmwin.sql defines maintenance window and stats collection job.This job is an autotask repository data ageing job and deletes obsolete AUTOTASK repository data. The delete of this old data will be performed by the procedure ora$age_autotask_data which will be executed by the job.

2. HM_CREATE_OFFLINE_DICTIONARY

The job is created by the 11g script catmwin.sql which mentions that this is a job for creation of offline dictionary for Database Repair Advisor.

3. DRA_REEVALUATE_OPEN_FAILURES

The job is created by the 11g script catmwin.sql which mentions that this is a job for reevaluate open failures for Database Recovery Advisor. In other words it is used to get assistance in case of loss or corruption of datafiles, redo log files or controlfiles.The job executes the procedure dbms_ir.reevaluateopenfailures.

4. MGMT_CONFIG_JOB

Comes with the OCM(Oracle Configuration Manager) installation.
This is a configuration collection job.The job is created by the script ocmjb10.sql by running procedure 'ORACLE_OCM.MGMT_CONFIG.collect_config'.

5. MGMT_STATS_CONFIG_JOB

This is an OCM Statistics collection job created in ocmjb10.sql by running
'ORACLE_OCM.MGMT_CONFIG.collect_stats'.

6. BSLN_MAINTAIN_STATS_JOB

This job is a compute statistics job. This job runs the BSLN_MAINTAIN_STATS_PROG program on the BSLN_MAINTAIN_STATS_SCHED schedule. The program BSLN_MAINTAIN_STATS_PROG will keep the default baseline's statistics up-to-date.

7. XMLDB_NFS_CLEANUP_JOB

The job is created in xdbu102.sql and runs the procedure dbms_xdbutil_int.cleanup_expired_nfsclients.

8. RSE$CLEAN_RECOVERABLE_SCRIPT

This job is STREAMS related. It is created in execstr.sql and is new in 11.2. The script execstr.sql executes anonymous blocks for STReam. The job is responsible with the cleaning of recoverable scripts and calls ' 'sys.dbms_streams_auto_int.clean_recoverable_script;'.

9. SM$CLEAN_AUTO_SPLIT_MERGE

This job is STREAMS related. It is created in file execstr.sql and is new in 11.2 too. The job is responsible with the cleaning of auto split merge views.

Monday 16 January 2012

Oracle PCTFREE and PCTUSED Parameters

PCTFREE and PCTUSED are the block storage parameters

PCTFREE- This parameter is used to specify how much space should be left in the block for updates.The default value is 10.

For Eg, if The PctFree for a block is 30, then Oracle keeps on adding new rows to the block until the block is 70 % full.
It leaves the 30% for future updates. That is, in case, any row is updated and requires some more space, it is taken
out from the 30% remaining in the Block.Specify the value of this parameter HIGH if future updates in the rows of
the table are going to need more space.In case ur table is less prone to updates, then this value can be specified LOW.

PCTUSED : Oracle stops INSERTING new Rows in a table when the space usage reaches the PCTFREE limit and removes it from the free list.
The data block is again added to the free list when the used space in it falls below the percent value set by the PCTUSED parameter.The default value is 40.
Once the data block is added to the free list, Oracle again starts inserting new rows into it.
Hence, if a large value is set for PCTUSED, the data blocks are more frequently returned to the free list and vice versa.

For Eg,Suppose u have specified PCTUSED as 40 %. and PCTFREE as 20 %.
1. Oracle will keep on inserting new rows till the space is 80 % used. It will reserve the remaining 20% for future updates.
2. To start adding new rows again to the block, Oracle will check the space in the Block and the PCTUSED parameter.
3. When the space falls below 40 %, Oracle will start adding new rows to the block.

Saturday 14 January 2012

ORA-48913 - error message in Oracle 11g alert log

Error Message in Alert Log:

On 11.2.0.1, encountered the following error in the alert.log.
Wed Jan 04 11:23:10 2012
Non critical error ORA-48913 caught while writing to trace file "/orahome/app/oracle/diag/rdbms/vis/vis2/trace/vis2_lmd0_991420.trc"
Error message: ORA-48913: Writing into trace file failed, file size limit [10485760] reached
Writing to the above trace file is disabled for now on...

Cause:
Parameter MAX_DUMP_FILE_SIZE is set too low

Solution:
Increase the setting for the parameter MAX_DUMP_FILE_SIZE or set it to unlimited

Friday 13 January 2012

nt secondary err code in Oracle 11g Alert log

Error message in 11g database alert log:

Wed Jan 04 18:56:39 2012
*********************************************************
Fatal NI connect error 12170.
VERSION INFORMATION:
TNS for Solaris: Version 11.2.0.1.0 - Production
Oracle Bequeath NT Protocol Adapter for Solaris: Version 11.2.0.1.0 - Production
TCP/IP NT Protocol Adapter for Solaris: Version 11.2.0.1.0 - Production
Time: 04-JAN-2012 18:56:39
Tracing not turned on.
Tns error struct:
ns main err code: 12535

TNS-12535: TNS:operation timed out
ns secondary err code: 12560
nt main err code: 505

TNS-00505: Operation timed out
nt secondary err code: 145
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=10.159.34.117)(PORT=1092))

Changes:
No change are necessary, but may have recently upgraded the database to 11g release 1 or higher, or installed a new Oracle11g database.
Note: Prior to 11gR1 these same 'Fatal NI connect error 12170' are written to the sqlnet.log

Cause:
These time out related messages are mostly informational in nature.
The messages indicate the specified client connection (identified by the 'Client address:' details) has experienced a time out. The 'nt secondary err code' identifies the underlying network transport, such as (TCP/IP) timeout limits after a client has abnormally terminated the database connection.

The 'nt secondary err code' translates to underlying network transport timeouts for the following Operating Systems:

For the Solaris system: nt secondary err code: 145:
ETIMEDOUT 145 /* Connection timed out */

For the Linux operating system: nt secondary err code: 110
ETIMEDOUT 110 Connection timed out

For the HP-UX system: nt secondary err code: 238:
ETIMEDOUT 238 /* Connection timed out */

For AIX: nt secondary err code: 78:
ETIMEDOUT 78 /* Connection timed out */

For Windows based platforms: nt secondary err code: 60 (which translates to Winsock Error: 10060)

Description: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

The reason the messages are written to the alert log is related to the use of the new 11g Automatic Diagnostic Repository (ADR) feature being enabled by default. See (Doc ID 454927.1).

Suggested Actions:

- Search the corresponding text version of the listener log located on the database server for the corresponding client connection referenced by the Client address details referenced in the alert log message.

For the message incident below you would search the listener log for the 'Client address' string:

(ADDRESS=(PROTOCOL=tcp)(HOST=10.159.34.117)(PORT=1092))

The search of the listener log should find the most recent connection before the time reference displayed in the alert log message, e.g. '04-JAN-2012 18:56:39'.

-Corresponding listener log entry:

04-JAN-2012 21:20:12 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=AMN11264.us.oracle.com)(CID=(PROGRAM=D:\app\mcassady\product\11.2.0\dbhome_1\bin\sqlplus.exe)(HOST=mcassady-lap)(USER=mca
ssady))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.159.34.117)(PORT=1092)) * establish * AMN11264.us.oracle.com * 0

Note the time of the client corresponding client connection(s) in the listener log. Here you may find a particular client, set of clients or particular applications that are improperly disconnecting causing the timeout errors to be raised and recorder in the database alert log.

You may choose to revert from the new Automatic Diagnostic Repository (ADR) method to prevent the Oracle Net diagnostic details from being written to the alert log(s) by setting the following Oracle Net configuration parameters:

To revert to Oracle Net Server tracing/logging, set following parameter in the server's sqlnet.ora :

DIAG_ADR_ENABLED = OFF

Also, to back out the ADR diag for the Listener component, set following parameter in the server's listener.ora:

DIAG_ADR_ENABLED_<listenername> = OFF

- Where the <listenername> would be replaced with the actual name of the configured listener(s) in the listener.ora configuration file. For example, if the listener name is 'LISTENER', the parameter would read:

DIAG_ADR_ENABLED_LISTENER = OFF

-Reload or restart the TNS Listener for the parameter change to take effect.

Thursday 12 January 2012

Change Static Parameter in RAC Environment

1) Check Parameter is static or dynamic
11:49:35 orcl1 > select name, value, issys_modifiable from v$parameter where name='processes';

NAME                 VALUE                ISSYS_MOD
-------------------- -------------------- ---------
processes            400                  FALSE

1 row selected.

The Value is False.It means the parameter is static.

2) Check Whether database Instance Started with PFILE or SPFILE?
11:49:44 orcl1 > select count(1) from v$spparameter where isspecified = 'TRUE';

COUNT(1)
----------
81

1 row selected.

The view returns NULL values if a PFILE was used to start up the instance.

or,

11:50:26 orcl1 > select name, value from v$parameter where name = 'spfile';

NAME                 VALUE
-------------------- --------------------
spfile               /ora_data/db/apps_st/data/spfileorcl.ora

1 row selected.

or,

11:50:45 orcl1 > show parameter spfile;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      /ora_data/db/apps_st/data/spfileorcl.ora

Check the same on node 2 also.If the valus is same it means the instance is running with common spfile.

3) Change the Parameter
ALTER SYSTEM SET processes=500 scope=spfile sid='*';

4) Bounce the Instances

Wednesday 11 January 2012

Why we need VIP in Oracle 10g RAC

In Oracle RAC database if user process is trying to get connection with some listener and found listener is down or node is down then Oracle RAC automatically transfer this request to another listener on another node. Up to Oracle 9i we use physical IP address in listener configuration. Means if requested connection gets failed then it will be diverting to another node using physical IP address of another surviving node. But during this automatically transfer, connection should need to wait up to get error message of node down or listener down using TCP/IP connection timeout. Means session should need to wait up to getting TCP/IP timeout error dictation. Once error message is received oracle RAC automatically divert this new connection request to another surviving node.Using physical IP address there is biggest gap to get TCP/IP timeout for failover suggestion. Session should need to wait for same timeout. High availability of Oracle RAC depends on this time wasting error message.

In Oracle 10g we have a concept of Virtual IP.Using virtual IP we can save our TCP/IP timeout problem because Oracle notification service maintains communication between each nodes and listeners. Once ONS found any listener down or node down, it will notify another nodes and listeners with same situation. While new connection is trying to establish connection to failure node or listener, virtual IP of failure node automatically divert to surviving node and session will be establishing in another surviving node. This process doesn't wait for TCP/IP timeout event. Due to this new connection gets faster session establishment to another surviving nodes.

Virtual IP (VIP) is for fast connection establishment in failover dictation. Still we can use physical IP address in Oracle 10g in listener if we have no worry for failover timing. We can change default TCP/IP timeout using operating system utilities or commands and kept smaller. But taking advantage of VIP (Virtual IP address) in Oracle 10g RAC database is advisable.

Using VIP configuration, client can be able to get connection fast even fail over of connection request to node. Because vip automatically assign to another surviving node faster and it can't wait for TNS timeout old fashion.

Tuesday 10 January 2012

If One Voting Disk is Corrupted - RAC

Voting disks are used in a RAC configuration for maintaining nodes membership. They are critical pieces in a cluster configuration. Starting with ORACLE 10gR2, it is possible to mirror the OCR and the voting disks. Using the default mirroring template, the minimum number of voting disks necessary for a normal functioning is two.

Scenario Setup
In this scenario it is simulated the crash of one voting disk by using the following steps:
1.   Identify votings:
crsctl query css votedisk
0. 0 /dev/raw/raw1
1. 0 /dev/raw/raw2
2. 0 /dev/raw/raw3
2.   corrupt one of the voting disks (as root):
dd if=/dev/zero /dev/raw/raw3 bs=1M

Recoverability Steps
1.   check the “$CRS_HOME/log/[hostname]/alert[hostname].log” file. The following message should be written there which allows us to determine which voting disk became corrupted:
[cssd(9120)]CRS-1604:CSSD voting file is offline: /opt/oracle/product/10.2.0/crs_1/Voting1. Details in /opt/oracle/product/10.2.0/crs_1/log/aut-arz-ractest1/cssd/ocssd.log.

2.   According to the above listing the Voting1 is the corrupted disk. Shutdown the CRS stack:
srvctl stop database -d fitstest -o immediate
srvctl stop asm -n aut-vie-ractest1
srvctl stop asm -n aut-arz-ractest1
srvctl stop nodeapps -n aut-vie-ractest1
srvctl stop nodeapps -n aut-arz-ractest1
crs_stat -t
On every node as root:
crsctl stop crs

3.   Pick a good voting from the remaining ones and copy it over the corrupted one:
dd if=/dev/raw/raw4 of=/dev/raw/raw3 bs=1M

4.   Start CRS (on every node as root):
crsctl start crs

5.   Check log file “$CRS_HOME/log/[hostname]/alert[hostname].log”. It should look like shown below:
[cssd(14463)]CRS-1601:CSSD Reconfiguration complete. Active nodes are aut-vie-ractest1 aut-arz-ractest1 .
2011-11-10 15:19:53.954
[crsd(14268)]CRS-1012:The OCR service started on node aut-vie-ractest1.
2011-11-10 11:20:53.987
[evmd(14228)]CRS-1401:EVMD started on node aut-vie-ractest1.
2011-11-10 11:20:55.861 [crsd(14268)]CRS-1201:CRSD started on node aut-vie-ractest1.

6.   After a couple of minutes check the status of the whole CRS stack:
[oracle@aut-vie-ractest1 ~]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM2.asm application ONLINE ONLINE aut-...est1
ora....T1.lsnr application ONLINE ONLINE aut-...est1
ora....st1.gsd application ONLINE ONLINE aut-...est1
ora....st1.ons application ONLINE ONLINE aut-...est1
ora....st1.vip application ONLINE ONLINE aut-...est1
ora....SM1.asm application ONLINE ONLINE aut-...est1
ora....T1.lsnr application ONLINE ONLINE aut-...est1
ora....st1.gsd application ONLINE ONLINE aut-...est1
ora....st1.ons application ONLINE ONLINE aut-...est1
ora....st1.vip application ONLINE ONLINE aut-...est1
ora....test.db application ONLINE ONLINE aut-...est1
ora....t1.inst application ONLINE ONLINE aut-...est1
ora....t2.inst application ONLINE ONLINE aut-...est1

Sunday 8 January 2012

Benefits of Automatic Storage Management

a) Automatic IO load balancing as data is striped across disks and better performance.
b) Available for both Single Instance as well as RAC instance.
c) Free built in 10g kernel – Available even in Standard Edition.
d) Prevents Fragmentation and hence no need to relocate data to reclaim space.
e) Overcomes file system size limitations.
f) Prevents accidental file deletion.
g) Backup with RMAN.

Friday 6 January 2012

BSLN_MAINTAIN_STATS_JOB is getting failed

Issue:
BSLN_MAINTAIN_STATS_JOB is getting failed

Error in Alert Log:
Sun Jan 01 13:30:01 2012
Errors in file /orahome/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_j000_1781806.trc:
ORA-12012: error on auto execute of job 353101
ORA-06550: line 1, column 780:
PLS-00201: identifier 'DBSNMP.BSLN_INTERNAL' must be declared
ORA-06550: line 1, column 780:
PL/SQL: Statement ignored

Trace file:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /orahome/app/oracle/product/11.2.0/dbhome_1
System name:    AIX
Node name:      PROD
Release:        1
Version:        6
Machine:        00C8A8E54C00
Instance name: orcl1
Redo thread mounted by this instance: 1
Oracle process number: 56
Unix process pid: 1781806, image: oracle@PROD (J000)
*** 2012-01-01 13:30:01.195
*** SESSION ID:(308.19730) 2012-01-01 13:30:01.195
*** CLIENT ID:() 2012-01-01 13:30:01.195
*** SERVICE NAME:(SYS$USERS) 2012-01-01 13:30:01.195
*** MODULE NAME:(DBMS_SCHEDULER) 2012-01-01 13:30:01.195
*** ACTION NAME:(BSLN_MAINTAIN_STATS_JOB) 2012-01-01 13:30:01.195
ORA-12012: error on auto execute of job 353101
ORA-06550: line 1, column 780:
PLS-00201: identifier 'DBSNMP.BSLN_INTERNAL' must be declared
ORA-06550: line 1, column 780:
PL/SQL: Statement ignored

Findings:
SQL> select object_name from dba_objects where object_id=353101;

OBJECT_NAME
-----------------------------------
BSLN_MAINTAIN_STATS_JOB

09:58:16 orcl1 > select log_date,status
from dba_scheduler_job_run_details
where job_name='BSLN_MAINTAIN_STATS_JOB'
order by log_date desc;

LOG_DATE                                                                    STATUS
------------------------------------------------------------------------ ------------
01-JAN-12 01.30.01.224309 PM +05:30                                         FAILED
25-DEC-11 01.30.00.855555 PM +05:30                                         FAILED
18-DEC-11 01.30.00.761203 PM +05:30                                         FAILED
11-DEC-11 01.30.00.583605 PM +05:30                                         FAILED
04-DEC-11 01.30.00.450731 PM +05:30                                         FAILED

Role of BSLN_MAINTAIN_STATS_JOB:
This job is a compute statistics job. This job runs the   BSLN_MAINTAIN_STATS_PROG program on the
BSLN_MAINTAIN_STATS_SCHED schedule. The program BSLN_MAINTAIN_STATS_PROG will keep the default
baseline's statistics up-to-date.

Cause:
Table DBSNMP.BSLN_BASELINES contains inconsistent information. After database cloning , existing records in table “DBSNMP.BSLN_BASELINES” conflict with new baseline information inserted in the cloned database. The DBSNMP user needs to be dropped and re-created.

Fixed:
SQL> @?/rdbms/admin/catnsnmp.sql

SQL> @?/rdbms/admin/catsnmp.sql

Wednesday 4 January 2012

Why should Apps & Applsys passwords always be the same

The need to have the same password for Apps and Applsys is because when you sign on to apps, intially it connects to a public schema called APPLSYSPUB.
This validates AOL username and password that we enter (operations/welcome) using guest user account. Once this is verified we select responsibility, this is validated by APPLSYS schema and then it connects to APPS schema.
Since it uses both applsys and apps during signon process this expects both the password to be identical.

Apps is a universal schema has synonyms to all base product tables and sequences.This also has code objects for all products (triggers, views, packages, synonyms etc.).
Applsys schema has applications technology layer products like FND and AD etc.

Monday 2 January 2012

Check Oracle High Redo Generation

Step 1: Get the segment that experienced the most changes 
during  the period
SELECT to_char(begin_interval_time,'YYYY_MM_DD HH24:MI') snap_time,
        dhsso.object_name,
        sum(db_block_changes_delta)
  FROM dba_hist_seg_stat dhss,
         dba_hist_seg_stat_obj dhsso,
         dba_hist_snapshot dhs
  WHERE dhs.snap_id = dhss.snap_id
    AND dhs.instance_number = dhss.instance_number
    AND dhss.obj# = dhsso.obj#
    AND dhss.dataobj# = dhsso.dataobj#
    AND begin_interval_time
BETWEEN to_date('2008_05_02 12','YYYY_MM_DD HH24')
       AND to_date('2008_05_04 12','YYYY_MM_DD HH24')
  GROUP BY to_char(begin_interval_time,'YYYY_MM_DD HH24:MI'),
           dhsso.object_name
 
Step 2: Run the following query to examine the changes during 
the period to what they were historically.
SELECT to_char(begin_interval_time,'YYYY_MM_DD HH24:MI') snap_time,
        sum(db_block_changes_delta)
  FROM dba_hist_seg_stat dhss,
         dba_hist_seg_stat_obj dhsso,
         dba_hist_snapshot dhs
  WHERE dhs.snap_id = dhss.snap_id
    AND dhs.instance_number = dhss.instance_number
    AND dhss.obj# = dhsso.obj#
    AND dhss.dataobj# = dhsso.dataobj#
    AND dhsso.object_name = 'OT_PENDING'
  GROUP BY to_char(begin_interval_time,'YYYY_MM_DD HH24:MI')

 Step 3: Gather all SQL statements that were run during the 
period that referenced the table, using what is below
SELECT to_char(begin_interval_time,'YYYY_MM_DD HH24:MI'),
         dbms_lob.substr(sql_text,4000,1),
         dhss.instance_number,
         dhss.sql_id,executions_delta,rows_processed_delta
  FROM dba_hist_sqlstat dhss,
         dba_hist_snapshot dhs,
         dba_hist_sqltext dhst
  WHERE upper(dhst.sql_text) LIKE '%OT_PENDING%'
    AND dhss.snap_id=dhs.snap_id
    AND dhss.instance_Number=dhs.instance_number
    AND dhss.sql_id = dhst.sql_id

 Step 4: Take the SQL_ID for the statement and searched 
through DBA_HIST_ACTIVE_SESS_HISTORY, using what is below
SELECT instance_number, to_char(sample_time,'yyyy_mm_dd hh24:mi:ss'),
         user_id,
         program
  FROM dba_hist_active_sess_history
  WHERE sql_id = 'bxjz5d4p6x02u'
    AND snap_id BETWEEN 43212 AND 43220

One of the best document to check the high generation of redo:
http://appcrawler.com/wordpress/2009/04/15/who-is-generating-all-the-redo/