Database Diva Presents Random Thoughts on Oracle Database Administration and Security » Random

Random


Random16 Apr 2010 02:44 pm

I recently needed to duplicate a database to point in time more than a week earlier.  I had all of the files from the backupset available on disk.  When I ran a crosscheck, I could see they were available, but when I ran a list, the backup was only a few MB in size, rather  than a few GB, and there was only one file associated.  There were no data files listed, and when I tried to run the restore, I got the following message:

RMAN-01005: not all datafiles have backups than can be recovered to scn 9878899591331;

This was extremely frustrating, because I could see all of the files.  Fortunately, an Oracle support rep told me about the ‘catalog backuppiece’ command, which registers missing backup data in the control file.

rman target /
catalog backuppiece ‘/path/to/backup/file’;

After that I ran a list and the information for the backup was correct in size, and the datafiles were once again associated with it.

It took me a week to get there, but I was able to recover everything.  While it was a very educational week, it was rather frustrating for the people who were waiting for the database to be ready.

Random16 Apr 2010 02:27 pm

Have you ever needed to “get back in time”, only to discover that the Delorean wouldn’t start?  I recently had a customer that needed a schema restored from a day where we didn’t take an export.  We use RMAN to provide disaster recovery, but our SLA doesn’t require us to provide daily “I shot myself in the foot” insurance. To make matters worse, they waited over a week to ask for the restore.

The system in question uses RMAN to disk, so I restored the files from the time frame and attempted to duplicate the database to that point in time, only to get an error that there were no backups on record for that SCN.  Huh?  I could see the backups when I ran “list backup”, and when I did the crosscheck, I could see they were available.  Further inspection showed that the backup was in the catalog, but the pieces were not.  I’ll cover that in a separate post, for search purposes.  Once you have everything you need, this is the syntax that will get you a point in time clone:

  1. Create a pfile for the clone
    create pfile ‘initclone.ora’ from spfile
    If the clone is on the same host, be sure to edit it to include the path to the control, log and data files
    Add the following lines to force filename conversion during the restore
    db_file_name_convert=(/path/to/source/,/path/to/clone/)
    log_file_name_convert=(/path/to/source/,/path/to/clone/)
  2. Create the bdump, udump and cdump directories
  3. Add the clone to oratab
  4. Create a password file for the clone.  You will need one for the source database, if you don’t already have one.
    orapwd file=$ORACLE_HOME/dbs/orapwclone password=syspassword
  5. Perform a ’startup nomount’ of the clone
  6. Making sure that the clone is the current ORACLE_SID, run
    rman target sys@<source> nocatalog auxiliary /
    duplicate target database to ‘<clone>’ until time ‘<timestamp>’;

You may need to recover files from tape if all of the backup pieces aren’t on disk.  If they are on disk, but rman doesn’t recognize them, see my next post.

Once you’ve got your clone open, you can export the missing schema objects and save the day.

Random & Security19 Feb 2009 10:14 pm

This is only security related in that the underlying issue is permission, and for a time I thought I had created the problem by changing the permissions to secure the environment.

I upgraded my RAC environments to 11g late last year, and I have found very little about 11g that I do not love.  There is one small thing which is extremely unlovable, but once you know what is wrong, it is easy to fix.

Our process for moving to 11g was intended to be slow and methodical.  We upgraded the clusterware, and then waited a week before installing the 11g DBMS in preparation for the database upgrade.  The first sign of trouble came when I was preparing to shut things down for the database upgrade.  I did a srvctl stop nodeapps and got a VIP error on the remote node.  Alas, I was a Diva with a mission (as in DEADLINE), so I continued with my cold backup, and 11g database build.  I still didn’t make the connection when the 11g database would not start on the remote node.  Eventually my outage window, and my patience expired and I reverted to the cold backup of the 10g database.  Imagine my surprise when it wouldn’t run on the remote node, either.

I started browsing the crsd logs when the following messages caught my eye

… [ CREVT][518] CAAMonitorHandler  0:Could not execute /opt/oracle/product/clusterware/bin/rcgwrap(check) for ora.hostname.vip
category: 1234, operation: scls_canexec, loc: , OS error: 0, other: no exe permission, file /opt/oracle/product/clusterware/bin/racgwrap
… [ CRSAPP][518] CheckResource error for ora.hostname.vip error code = -1

This doesn’t look good.  I checked the permissions for racgwrap and they were set at 750.  I check the other node, and they were 751.  The other node works and the message says “other no exe permission”.  Let’s try setting the permission to 751.

This improves the situation but does not restore full functionality.  I ultimately end up doing an ls -l of the clusterware $ORACLE_HOME/bin directory on each node and find that there are 115 files with 751 permission on the functional node, and only 8 on the problem node. Still thinking that I’ve been overzealous in my effort to secure binaries, I restore the permissions and make plans to upgrade the database another evening.

When a peer had the very same problem, and I knew he hasn’t been making permission changes, I started to wonder if we have a defective root.sh.  I had another RAC pair to upgrade.  The clusterware was done, but the DBMS was not.  Throughout the DBMS install process I ran

ls -lc | grep -c — -rwxr-x–x

on the remote node.  During the “Remote operations” part of the install, the count dropped from 115 to 8.  I then ran

find . -ctime -1 -exec chmod 751 {} \;

This left me with 116 binaries, so I ran the command

ls -lc | grep -v — -rwx-rx–x

on each node and discovered that onsctl should be set to 711.

I was feeling pretty smug at this point.  This never works for me. Without fail I end up discovering I’ve got dog poop on my shoe.  In this case, the dog poop was racgwrap, which is a symbolic link that points to  $ORACLE_HOME/racg/admin/racgwrap.  Once this was set back to 751, my humility was restored and so was the functionality of the remote node.

I have since been told that this problem is documented somewhere in Metalink, but I have not been able to locate it.

While removing world execute from binaries is considered a good security practice, you really don’t want to apply it to your clusterware binaries!

Random & Tuning11 Aug 2007 02:29 am

I got a call from a customer who was concerned because he was using a sequence to order the activities he was tracking, and there were gaps in the list. He was concerned that the trigger was firing but transactions weren’t being committed. I did some quick research courtesy of Google, and discovered that Oracle typically caches 20 sequences in memory. The fact that there were gaps should have tipped me off to the underlying SGA issue, but I was still obsessed with the sequence. You know how it is, some days we are just so obsessed with putting out the immediate fire that we can’t see the leaky gas can sitting next to the open flame. Sequences take up very little memory, and this one was used so often that it shouldn’t have been aging out. But rather than ask why is this happening, I decided to “pin” the sequence into memory. It is very easy

exec dbms_shared_pool.keep(’schema.sequence’,'Q’);

You can verify this with the following query

select owner, name, type from gv$db_object_cache
where kept = ‘YES’ and owner = ’schema’;

I even followed it up with a nifty trigger to pin the sequence every time the database is started

create trigger pin_on_startup
after startup on database
begin
dbms_shared_pool.keep(’schema.sequence’,'Q’);
end;
/

The fire is out. I’m happy. Hey, what’s that smell? While basking in the glow of another job well done, I happened to fire up Enterprise Manager, and there on page 1 is an urgent warning telling me that the SGA is too small, and I need to increase it to 160M. Gee, do you suppose that was why the tiny little sequence that was frequently used was still getting tossed out of memory?

The sequence is still pinned in memory while we work out some issues with a new report that is running most inefficiently, and consuming all of the available resources. So the next time someone tells you there are gaps in their sequences, you might want to check your SGA size.

Oh, and a big thank you to Dan Morgan and his amazing library on the PSOUG site (http://psoug.org/library.html), as  I would never have figured out how to verify that the sequence was pinned.  As far as I’m concerned, Dan is one of the rock stars of Oracle database administration.

Random & Security08 Aug 2007 09:38 pm

I can’t believe it has been 14 months since I finished my certificate in Data and Internet Security at the University of Washington. This site was my final project. I was so anxious to make Oracle security my main focus, and grow the site into a comprehensive set of security tutorials for busy DBAs. I created a Drupal blog and announced my intentions in my first post, which was also my last post. Life got in the way. There was shoulder surgery, followed by more software releases and block points than I can even remember, followed by RAC and Data Guard. In short, I’ve been keeping busy, which is what DBAs do, and why we need information that is direct and to the point.

While database security will continue to be the main focus of this site, the blog may occasionally wander into other aspects of Oracle database administration. This is mostly because of my failing memory, as it will give me a place to store information I think I will need again. As a bonus, it gets the information into Google, for other people who are asking the same questions. Google is quite possibly the most important tool I use in analyzing problems, and I am happy to give something back. I hope you find something useful while you are here.