Thursday, July 25, 2013

ASM... Some Mysterious and important information

ASMCMD is very slow. How can I speed it up?

The asmcmd utility appears to be very slow. This slowness is a result of queries against the v$asm_diskgroup view. To solve this problem edit the $ORACLE_HOME/bin/asmcmdcorescript and change all v$asm_diskgroup references to v$asm_diskgroup_stat.
V$asm_diskgroup and v$asm_diskgroup_stat provides exactly the same information, but the %_stat view operates from cache, while v$asm_diskgroup rescans all disk headers. This method is also used by Oracle in their Enterprise Manager product.

[edit]ASM disk header/superblock backups?

ASM disk headers (superblocks) cannot be backed up and restored in Oracle 10g. By implication, if you use EXTERNAL REDUNDANCY and a single disk's header is accidentally overwritten, the entire disk group will have to be restored. To solve this problem, Oracle introduced the md_backup and md_restore asmcmd commands in Oracle 11g. In Oracle 10g, the only viable method to prevent logical corruption of ASM header block is to add failgroup, storage vendor has no responsibility to verify/checksum ASM disk header blocks (EXTERNAL REDUNDANCY is not going to help). There is a kfed utility to backup ASM disk headers and restore them for LIMITED scenario. It is best to be executed under guidance of a few elite support engineers. Oracle did not advertise the utility due to the potential damage it could cause. For those unrecoverable (tedious manual fixes) cases, restoring disk group is the last resort.


In an Oracle RAC database there are multiple nodes all going to the same ASM instance. If you use normal mirroring in an ASM diskgroup, the behavior of the access to the disks may not be what you assume.
Suppose you have a diskgroup called DG3 with two failgroups (DG2_0000 and DG2_0001), each with a separate disk, as shown in the figure below:
When something is written to the diskgroup DG2, the extents are written in a round robin manner: the first one goes to DG2_0000 with a copy going to DG2_0001, the second extent to DG2_0001 with a copy on DG2_0000, the third one back to DG2_0000 with copy on DG2_0001, and so on. In this manner ASM maintains the copy of one disk on the other.
But when the extents are read, they are always read from the primary failgroup (DG2_0000, in this case); not from the secondary (DG2_0001). The secondary is read only when the primary is not available.
This works well in most cases but sometimes may not be desirable. In Oracle Database 11g, you can configure a node to read from a specific failgroup. For instance, in the above example, if you want to configure the instance 1 to read from failgroup DG2_0000 and instance 2 to read DG2_0001, you can set the preferred read group for those diskgroups. The following command, executed in instance 1, causes the failgroups DG2_0000 and DG3_0000 in diskgroup DG2 and DG3 respectively to be preferred on instance 1:
SQL> alter system set asm_preferred_read_failure_groups = 'DG2.DG2_0000','DG3.DG3_0000'
Similarly, on the other instance, you can issue the following to make the other failgroups the preferred disks:
SQL> alter system set asm_preferred_read_failure_groups = 'DG2.DG2_0001','DG3.DG3_0001'        
After the statements are executed, when some session from instance 1 wants to read from the diskgroup DG2, the disk DG2_0000 will be read. If the disk is not available, the other disk, DG2_0001, is read. Similarly when a session connected to the instance 2 reads the data, the disk DG2_0001 is read.
If you want to check how the different disks of the diskgroups are utilized, you can refer to a new dictionary view, V$ASM_DISK_IOSTAT, which simulates the IOSTAT utility found in UNIX systems:
select
instname,
dbname,
group_number,
failgroup,
disk_number,
reads,
writes
from v$asm_disk_iostat
order by 1,2,3,4,5,6
/
Here is sample output:
INSTNAM DBNAME   GROUP_NUMBER FAILGROUP  DISK_NUMBER      READS     WRITES
------- -------- ------------ ---------- ----------- ---------- ----------
PRONE31 PRONE3 2 DG2_0000 0 4450 910
PRONE32 PRONE3 2 DG2_0001 1 2256 910
PRONE31 PRONE3 3 DG3_0000 0 300 29
PRONE32 PRONE3 3 DG3_0001 1 560 29
This output shows instances PRONE31 and PRONE32 have their preferred failgroups DG2_0000 and DG2_0001 respectively. Note the WRITES column; they are identical at 910. This is because the writes occur uniformly to both disks. Now note the READS column. It's 4450 and 2256 for the instances PRONE31 and PRONE32 respectively. Why? Because the instance PRONE3 issued more reads and the reads came from its preferred failgroup, DG2_0000. As for diskgroup DG3, the instance PRONE32 issued more reads which came from its preferred failgroup (DG3_0001) and thus that disk shows more reads.
Preferred reads are especially helpful in "stretch" clusters (clusters with a large geographic distance between nodes). The preferred reads make it faster for reads by isolating reads to specific disks.

What happens when a disk is no longer present (or destroyed beyond repair)? You want to drop the diskgroup completely and recreate it or add the disks of the diskgroup to other diskgroups. The diskgroup is not mounted yet. Since one of the disks is missing, you can't even mount it. To drop the diskgroup you have to mount it but you can't mount it because the disk is missing—a perfect "catch-22" situation. What should you do?
In Oracle Database 10g you can use a workaround by erasing the header of the disk using the dd command:
$ dd if=/dev/zero of=/dev/raw/raw13 bs=1024 count=4
This puts zeroes in the header of the disk /dev/raw/raw13, erasing all information. While this is effective it erases the information in the disk header completely, along with the fact that the disk used to be a part of the diskgroup.
In Oracle Database 11g, you don't need to resort to this workaround. All you have to do is issue the drop command with a force option:
SQL> drop diskgroup dg7 force including contents;
This drops the diskgroup even if the disks are not mounted. The available disks show up as FORMER; that is, they used to be part of some diskgroup. (Note: You have to use the "including contents" clause.)

Many people think of ASM as a database with its own storage. This is not at the case at all—ASM does not store data; the database does. The ASM instance, however, maintains metadata such as the diskgroup names, the disks in them, the directories, and so on. This metadata is stored in the disk headers.
Suppose all the disks crash and the header information disappears. What do you do? Of course, you have taken the backup of the database using RMAN and you can restore it. But you can restore it only after you have created all the diskgroups and directories. Hopefully you kept a record of all that. (Right?) Even if you did, this process takes time.
What if you had a backup? In Oracle Database 11g, you can backup the metadata of the ASM instance through the ASM command line option (ASMCMD), using the command md_backup.
$ asmcmd -p

ASMCMD [+] > md_backup
It creates a file named ambr_backup_intermediate_file. Here is a portion of that file from the top:
@diskgroup_set = (
{
'DISKSINFO' => {
'DG1_0000' => {
'DG1_0000' => {
'TOTAL_MB' => '103',
'FAILGROUP' => 'DG1_0000',
'NAME' => 'DG1_0000',
'DGNAME' => 'DG1',
'PATH' => '/dev/raw/raw5'
}
}
},
'DGINFO' => {
'DGTORESTORE' => 0,
'DGCOMPAT' => '10.1.0.0.0',
'DGNAME' => 'DG1',
'DGDBCOMPAT' => '10.1.0.0.0',
'DGTYPE' => 'EXTERN',
'DGAUSZ' => '1048576'
},
'ALIASINFO' => {},
'TEMPLATEINFO' => {
'6' => {
'DGNAME' => 'DG1',
'STRIPE' => 'COARSE',
'TEMPNAME' => 'ASM_STALE',
'REDUNDANCY' => 'UNPROT',
'SYSTEM' => 'Y'
... and more ...
I have not shown the entire file here to save space. It records all the diskgroups, disks, directories, the disk attributes and so on. By default, this file records all the diskgroups. If you want to backup only a specific diskgroup, you can use the -g option. In addition, you can use -b option to create a specific named file.
ASMCMD [+] > md_backup -g dg1 -b prolin3_asm.backup
This backs up the metadata of the diskgroup DG1 to a file named prolin3_asm.backup instead of the default ambr_backup_intermediate_file. This file must be new so you should remove it, if it exists, prior to the generation.
Now let's see how the restore works. There are several types of restore. The easiest use is to restore a diskgroup along with the directories that was dropped earlier. First create a directory on the diskgroup:
ASMCMD [+] > cd DG7
ASMCMD [+DG7] > mkdir TEST
ASMCMD [+DG7] > ls
TEST/
The diskgroup has a directory called TEST. Now, back up the diskgroup:
ASMCMD [+] > md_backup -g dg7 -b g7.backup
Once backed up, suppose you drop the diskgroup to simulate an accidental drop:
SQL> drop diskgroup dg7;     

Diskgroup dropped.
Now the diskgroup DG7 is gone from the ASM instance and you want to restore it from the earlier backup. You can reinstate it using the md_restore command:
$ asmcmd md_restore -b dg7.backup -t full  

Current Diskgroup being restored: DG7
Diskgroup DG7 created!
System template TEMPFILE modified!
System template FLASHBACK modified!
System template ARCHIVELOG modified!
System template BACKUPSET modified!
System template XTRANSPORT modified!
System template DATAGUARDCONFIG modified!
System template CONTROLFILE modified!
System template AUTOBACKUP modified!
System template DUMPSET modified!
System template ONLINELOG modified!
System template PARAMETERFILE modified!
System template ASM_STALE modified!
System template CHANGETRACKING modified!
System template DATAFILE modified!
Directory +DG7/TEST re-created!
Look at the output; it creates the diskgroup as well as the templates and the directory. If there were any data, of course it will be lost. The md_backup is not a backup of the data but rather the metadata of the ASM instance. The data, ostensibly, is being backed up by RMAN. After the diskgroup is created, along with all the directories, you can restore the RMAN backup to this diskgroup.
Another option, -f, allows you to place the commands in a script file, rather than execute them:
ASMCMD [+] > md_restore -b dg7.backup -t full -f cr_dg7.sql
It creates an SQL script with the name cr_dg7.sql that crates the diskgroup and all other objects. You can manually run this in the ASM instance. Here is how the file looks:
create diskgroup DG7 EXTERNAL redundancy  disk '/dev/raw/raw14' name DG7_0000 size 100M ;
alter diskgroup /*ASMCMD AMBR*/DG7 alter template TEMPFILE attributes (UNPROTECTED COARSE);
alter diskgroup /*ASMCMD AMBR*/DG7 alter template FLASHBACK attributes (UNPROTECTED FINE);
alter diskgroup /*ASMCMD AMBR*/DG7 alter template ARCHIVELOG attributes (UNPROTECTED COARSE);
alter diskgroup /*ASMCMD AMBR*/DG7 alter template BACKUPSET attributes (UNPROTECTED COARSE);
alter diskgroup /*ASMCMD AMBR*/DG7 alter template XTRANSPORT attributes (UNPROTECTED COARSE);
alter diskgroup /*ASMCMD AMBR*/DG7 alter template DATAGUARDCONFIG attributes (UNPROTECTED COARSE);
alter diskgroup /*ASMCMD AMBR*/DG7 alter template CONTROLFILE attributes (UNPROTECTED FINE);
alter diskgroup /*ASMCMD AMBR*/DG7 alter template AUTOBACKUP attributes (UNPROTECTED COARSE);
alter diskgroup /*ASMCMD AMBR*/DG7 alter template DUMPSET attributes (UNPROTECTED COARSE);
alter diskgroup /*ASMCMD AMBR*/DG7 alter template ONLINELOG attributes (UNPROTECTED FINE);
alter diskgroup /*ASMCMD AMBR*/DG7 alter template PARAMETERFILE attributes (UNPROTECTED COARSE);
alter diskgroup /*ASMCMD AMBR*/DG7 alter template ASM_STALE attributes (UNPROTECTED COARSE);
alter diskgroup /*ASMCMD AMBR*/DG7 alter template CHANGETRACKING attributes (UNPROTECTED COARSE);
alter diskgroup /*ASMCMD AMBR*/DG7 alter template DATAFILE attributes (UNPROTECTED COARSE);
alter diskgroup /*ASMCMD AMBR */ DG7 add directory '+DG7/TEST';
One of the most useful applications of this feature is to document the metadata of the ASM instance. You can take backups at regular intervals, or after some major change, such as after adding a diskgroup, adding/removing disks or creating directories.

One of the biggest complaints from the ASM users accustomed to a traditional volume manager is the ability to check many things through command line. The ASM Command Line option (ASMCMD) closes this gap to a large extent. In Oracle Database 11g, some additional commands in the ASMCMD prompt makes it extremely easy to manage the ASM instance. One such example is the metadata backup you saw earlier. The other notable is the command to check disks managed by the instance. The command is lsdsk.
ASMCMD> lsdsk
Path
/dev/raw/raw10
/dev/raw/raw11
/dev/raw/raw13
... snipped ...
Without any flag, the command merely lists all the disks available to the instance. There are several flags that modify the output. The most common is -k as shown below:
ASMCMD> lsdsk -k
Total_MB Free_MB OS_MB Name Failgroup Library Label UDID Product Redund Path
103 41 103 DG4_0000 DG4_0000 System UNKNOWN /dev/raw/raw10
103 41 103 DG5_0000 DG5_0000 System UNKNOWN /dev/raw/raw11
... snipped ...
Another flag, -s, shows various I/O related statistics of the disks:
ASMCMD> lsdsk -s
Reads Write Read_Errs Write_Errs Read_time Write_Time Bytes_Read Bytes_Written Path
207485 207916 0 0 245.820323 159.634398 851251200 /dev/raw/raw10
207481 207912 0 0 229.996931 144.73954 851234816 /dev/raw/raw11
To quickly check the status of the disks, use the -p flag:
ASMCMD> lsdsk -p
Group_Num Disk_Num Incarn Mount_Stat Header_Stat Mode_Stat State Path
4 0 3915926174 CACHED MEMBER ONLINE NORMAL /dev/raw/raw10
5 0 3915926175 CACHED MEMBER ONLINE NORMAL /dev/raw/raw11
6 0 3915926193 CACHED MEMBER ONLINE NORMAL /dev/raw/raw13
Finally, the -t flag shows repair related information (described later in this installment):
ASMCMD> lsdsk -t
Create_Date Mount_Date Repair_Timer Path
27-SEP-07 28-SEP-07 0 /dev/raw/raw10
27-SEP-07 28-SEP-07 0 /dev/raw/raw11
28-SEP-07 28-SEP-07 0 /dev/raw/raw13
So far, the ASMCMD option extracted the values from the various V$ views in the ASM instance. But the metadata is stored on the disk itself. If the instance is not available, there should be a way to extract that information from the disks. In Oracle Database 11g, the lsdsk command has a flag "I" (uppercase "I", not lowercase "L") that extracts the information from the disk headers instead of V$ views. Here is an example of the -k flag extracting the information from the disk headers.
ASMCMD> lsdsk -Ik
Total_MB Name Failgroup Path
103 DG4_0000 DG4_0000 /dev/raw/raw10
103 DG5_0000 DG5_0000 /dev/raw/raw11
102 DG6_0000 DG6_0000 /dev/raw/raw13
To list disks of a specific diskgroup, say, DG1, you can use the -d flag as shown below:
ASMCMD> lsdsk -t -d dg1
Create_Date Mount_Date Repair_Timer Path
28-SEP-07 28-SEP-07 0 /dev/raw/raw5

You can also give a pattern for the disks:
ASMCMD> lsdsk -t /dev/raw/raw1*
Create_Date Mount_Date Repair_Timer Path
27-SEP-07 28-SEP-07 0 /dev/raw/raw10
27-SEP-07 28-SEP-07 0 /dev/raw/raw11
28-SEP-07 28-SEP-07 0 /dev/raw/raw13
28-SEP-07 05-OCT-07 0 /dev/raw/raw14
This shows only the disks that match the pattern. Finally, no need to remember these options; the help command shows all the options:
ASMCMD> help lsdsk
lsdsk [-ksptcgHI] [-d <diskgroup_name>] [pattern]
The lsdsk command brings the more advanced volume management like functionality to the ASM world.
Suppose you added a disk to a diskgroup. ASM immediately starts the rebalance operation. This operation is online so ASM must coordinate with the RDBMS instance the blocks accessed and changed, through a complex system of locks. In a RAC database, this process is exacerbated since the locks must be managed not just within the database but across multiple instances now.
What if you are adding the disks to a diskgroup no one is using? If ASM could somehow know that, it could eliminate the locking mechanism and make the process faster.
In Oracle Database 11g, a new way of mounting diskgroup makes it possible. The diskgroup can be mounted with a RESTRICT clause as shown below:
alter diskgroup dg7 mount restricted;
When the diskgroup is mounted this way, the ASM instance is aware of the exclusivity of the operations on the underlying disks and minimizes the locking mechanism. This, in turn, affects the disk operations such as rebalance faster.
Consider a diskgroup DG2 with two failgroups each with a single disk. When a certain area of one of the disks is damaged, it's not fatal for the diskgroup. Since they are mirrored, the damaged extents are reads from the other, intact disk and the operation goes through. But what happens to the damaged portion of the disk?
In Oracle Database 10g, the damaged disk is made offline and either the same disk or another one must be presented to the diskgroup. When the new disk is added, it must be completely cloned from the surviving disk to be used as a mirror. But if only a few blocks are damaged, copying the contents of a 34GB disk (for instance) is not efficient.
Therefore, in Oracle Database 11g, the damaged portions of the disks are repaired instead of copying the whole disk. This feature uses a new attribute of diskgroups, disk_repair_time, which specifies how long the ASM instance should tolerate a disk with errors before dropping it from the diskgroup. Here is how you set the attribute of the diskgroup DG2 to two hours:
SQL> alter diskgroup dg2 set attribute 'disk_repair_time'='2H';
Suppose DG2 has two disks, DISK1 and DISK2, and a few blocks on DISK2 suddenly go bad. Since the repair time is two hours, the ASM instance will not drop the disk immediately but rather will wait. If you fix the issue on DISK2 and make it online once again, then those bad blocks will be repaired from the surviving disk.
Let's see that with an example. Suppose the diskgroup DG2 has two failgroups. First, check the diskgroup configuration:
ASMCMD [+dg2] > lsdg dg2
State Type Rebal Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Name
MOUNTED NORMAL N 512 4096 1048576 206 78 0 39 0 DG2/
You can also confirm that from the du command:
ASMCMD [+dg2] > du  
Used_MB Mirror_used_MB
11 22
The du command confirms that the diskgroup has 22MB but only 11MB is useful. Now check the disks for the group dg2:
ASMCMD [+dg2] > lsdsk -d dg2
Path
/dev/raw/raw6
/dev/raw/raw7
You can also confirm that as well as get the names of the disks by:
SQL> select name, path 
2 from v$asm_disk
3 where group_number = 2
4 /

NAME PATH
-------- -----------------------------
DG2_0000 /dev/raw/raw7
DG2_0001 /dev/raw/raw6
This shows that the diskgroup has two disks and shows their names. Now to simulate a block corruption, put some characters in the middle of the device:
$ dd if=/dev/zero of=/dev/raw/raw7 bs=1024 skip=10 count=1
This makes one of the disks of the diskgroup corrupt. Now force a check of the diskgroup using the new command ALTER DISKGROUP ... CHECK.
SQL> alter diskgroup dg2 check
If you check the alert log of the ASM instance, you will notice, among several other lines, the following:
...
NOTE: starting check of diskgroup DG2
WARNING: cache read a corrupted block gn=2 fn=3 indblk=1 from disk 0
...
NOTE: cache successfully reads gn 2 fn 3 indblk 1 count 15 from one mirror side
kfdp_checkDsk(): 89
...
NOTE: cache initiating offline of disk 0 group 2
WARNING: initiating offline of disk 0.3915926170 (DG2_0000) with mask 0x7e
...
WARNING: Disk (DG2_0000) will be dropped in: (7200) secs on ASM inst: (1)
...
The last line says it all. The disk you just corrupted will be dropped from the diskgroup in 7200 secs, which corresponds to two hours you set earlier as the repair timer. The message will be repeated in the alert log as the time drops:
...
WARNING: Disk (DG2_0000) will be dropped in: (5550) secs on ASM inst: (1)
GMON SlaveB: Deferred DG Ops completed.
Sat Oct 06 00:25:52 2007
WARNING: Disk (DG2_0000) will be dropped in: (5366) secs on ASM inst: (1)
GMON SlaveB: Deferred DG Ops completed.
Sat Oct 06 00:28:55 2007
WARNING: Disk (DG2_0000) will be dropped in: (5183) secs on ASM inst: (1)
GMON SlaveB: Deferred DG Ops completed.
Sat Oct 06 00:31:59 2007
WARNING: Disk (DG2_0000) will be dropped in: (5000) secs on ASM inst: (1)
GMON SlaveB: Deferred DG Ops completed.
...
Finally, the countdown will reach 0 and the disk will be dropped, unless you fix the issue and the disk goes through fast failure repair. If you know the disk is not repairable and should be dropped sooner, you can expedite its demise by issuing:
SQL> alter diskgroup dg2 offline disks in failgroup dg2_0000 drop after 1m
This drops the failgroup dg2_0000 after 1 minute allowing you to either take it physically out or move a different disk to the diskgroup. To drop the disk forcibly:
SQL> alter diskgroup dg2 drop disks in failgroup dg2_0001 force;
When the disk failure is resolved, you can initiate the fast repair by issuing:
SQL> alter diskgroup dg2 online disks in failgroup dg2_0001;
This will start the process of syncing the damaged and changed blocks on the disks of the failgroup DG2_0001 from the other surviving diskgroups. As the entire disk is not replicated but a few blocks alone, this dramatically reduces the time it takes to synchronize the disks after a small failure.

Sunday, July 21, 2013

Oracle Table Fragmentations and Defragmentation

Symptom
    1. What does space utilization and fragmentation mean?
    2. Which types of fragmentation are there?
    3. Which other factors are important with regard to space utilization?
    4. What are the disadvantages of fragmentation and poor space utilization?
    5. How can I determine the scope of the different fragmentation types and the unused space?
    6. How can I reduce fragmentation and improve space utilization?
Other terms
FAQ, frequently asked questions
Solution
    1. What does space utilization and fragmentation mean?
The term 'space utilization' refers to the percentage of available space that is actually filled with user data.
You can evaluate the space utilization on different levels (file system, tablespace, segment, block).
For detailed information about the logical components of the Oracle database, 
such as tablespaces, segments, and extents, refer to Note 666061.

The term 'fragmentation' refers to unused space in the database that cannot be used for technical reasons.
Frequently, the effects of fragmentation are responsible for poor space utilization.
    2. Which types of fragmentation are there?
  • Block fragmentation ("internal fragmentation")
The term 'block fragmentation' refers to the average space utilization in an Oracle block.
Depending on the segment type in question, the following subtypes occur :
    • Index fragmentation
                    The term 'index fragmentation' refers to block fragmentation in an index tree.
Index fragmentation is caused by repeatedly deleting and entering data, in particular if you usually delete the entries
on one side of the index and make new entries on the other side. For more information, refer to Note 771929 .
Index fragmentation also occurs when you delete a large number of table entries and therefore the
index entries are deleted as well.
    • Table fragmentation
The term 'table fragmentation' refers to block fragmentation in a table.
Table fragmentation mainly occurs when you delete a large amount of data from a table.
After the deletion, the blocks that are still allocated contain considerably less data than before,
which leads to poorer space utilization.
    • LOB segment fragmentation
                    LOB columns are generally paged out into separate LOB segments (see Note 500340).
Unfavorable space utilization in an LOB segment is often caused by the fact that a lot of undo information is stored
in the segment itself. To limit the scope of retained undo information, make sure that PCTVERSION storage
parameter is not set too high. Default value 10 is usually sufficient.
  • Tablespace fragmentation ("external fragmentation")
           Tablespace fragmentation occurs if a tablespace contains a large number of small freespace areas
that cannot be reused due to their size.
           Tablespace fragmentation is mainly a DMTS problem. Due to the AUTOALLOCATE and UNIFORM
assignment strategies, fragmentation effects rarely occur in LMTS  (see Note 214995).
  • File system space utilization
           Space utilization in the file systems of the Oracle data files does not affect the datafiles directly,
but it may be critical due to the cost of disk space. If the file system is configured considerably larger than
the datafiles contained in the file system, a lot of space is wasted.
    3. Which other factors are important with regard to space utilization?
  • By default, indexes are created with PCTFREE set to 10, which means that 10% of every index block is 
  • reserved for future updates. Frequently, this 10 % is permanently unused space in index blocks.
  • Due to the "PCTFREE = 10" and "PTCUSED = 40" storage parameters used in the SAP environment, 
  • the fill levels of Oracle blocks vary between 40 % and 90 %, even if you do not delete large data volumes. 
  • Therefore, the space that is actually used is significantly less than 100 % simply because of these storage parameters. 
  • PCTUSED no longer has a function if you use ASSM (see Note 620803). However, PCTFREE is still taken into account.
  • Due to an increase in INITRANS (in accordance with Note 84348, for example), more space for the transaction slots 
  • is required in every block. Each INITRANS slot requires about 20 bytes. 
  • The higher the INITRANS selected, the less space is available for user data.
  • If a segment is created with Parallel Query, each Parallel Query Thread allocates separate extents. 
  • As a result, more space may be allocated than during the sequential creation of a segment.
  • The last extent of each segment is usually only filled partially. The block that was filled last is specified by the 
  • High Water Mark. Usually, all blocks that are located before the High Water Mark have been filled at some point.  
  • However, when you use ASSM (Note 620803), several FREELISTs or several FREELIST GROUPs, blocks that have 
  • not been used yet may be located before the High Water Mark.
  • The larger the last extent, the more space may be wasted. To avoid empty tables and indexes using up space 
  • unnecessarily, SAP creates segments with a very small INITIAL extent size (for example, 16K for DMTS, see also 
  • tables TGORA and IGORA). If segments with an INITIAL or NEXT size that is too big (due to a handling or tool error) 
  • are created , a lot of space is lost due to unused extent areas that are unnecessarily large. 
  • The space lost due to unused extent areas can also be considerable if there is a large number of small objects. 
  • If a tablespace is created as LMTS or AUTOALLOCATE, the segments are always created with extents of at least 64 K. 
  • As a result, LMTS or AUTOALLOCATE lead to a slightly higher space utilization in empty tables and indexes than DMTS.
  • The space currently unused in a tablespace depends on the tablespace size and the segments contained in 
  • the tablespace. If a tablespace is defined as relatively large from the start, the space utilization is relatively small.
  • If you use long LONG RAW column values, extreme fragmentations may occur in tables such as DDLOG or VBDATA 
  • as a result of the Oracle response described in Note 830965.
  • As part of a Unicode conversion to Basis Release 7.00, the LOB columns of the DYNPSOURCE table may grow 
  • significantly. For more information, see Note 1010237.
    4. What are the disadvantages of fragmentation and poor space utilization?
  • Increased space requirements
           Due to the poor utilization of the existing space, the database requires more space.
  • Longer runtimes for Backup and Restore
           The more space required by the database, the longer the runtimes for operations such as Backup and Restore.
  • Performance problems due to index fragmentation
           If indexes are fragmented, SEVERE performance problems may occur when you access the indexes.
  • Performance problems due to table fragmentation
           Table fragmentation may also have a negative effect on system performance when the relevant tables
are accessed. However, in general, the consequences are less severe than in the case of index fragmentations.
  • Global performance reduction due to buffer pool displacements
           The poorer the space in index and table blocks is utilized, the more blocks need to be loaded into the Oracle buffer
pool during the data access. As a result, more blocks of other segments are displaced.
These blocks must be reloaded into the buffer later, which generally impairs database performance.
  • Higher costs for the disk space used
           Often, the disk space used incurs costs. These costs increase with the amount of disk space required.
    5. How can I determine the scope of the different fragmentation types and the unused space?
  • To determine the degree of fragmentation for the different fragmentation types, proceed as follows:
    • Index fragmentation
 You can determine the index fragmentation by determining the index storage quality or the deleted
leaf rows as described in Note 771929.
    • Table fragmentation
 You can use the following statement to determine the tables with the - in absolute figures - largest unused space:

SELECT * FROM
(SELECT
    SUBSTR(TABLE_NAME, 1, 21) TABLE_NAME,
    NUM_ROWS,
    AVG_ROW_LEN ROWLEN,
    BLOCKS,
    ROUND((AVG_ROW_LEN + 1) * NUM_ROWS / 1000000, 0) NET_MB,
    ROUND(BLOCKS * (8000 - 23 * INI_TRANS) *
      (1 - PCT_FREE / 100) / 1000000, 0) GROSS_MB,
    ROUND((BLOCKS * (8000 - 23 * INI_TRANS) * (1 - PCT_FREE / 100) -
      (AVG_ROW_LEN + 1) * NUM_ROWS) / 1000000) "WASTED_MB"
  FROM DBA_TABLES
  WHERE
    NUM_ROWS IS NOT NULL AND
    OWNER LIKE 'SAP%' AND
    PARTITIONED = 'NO' AND
    (IOT_TYPE != 'IOT' OR IOT_TYPE IS NULL)
  ORDER BY 7 DESC)
WHERE ROWNUM <=20;
The above query is based on statistics values such as BLOCKS or NUM_ROWS.
For correct results, you must therefore ensure that the statistics are up-to-date.
The system ignores tables without statistics.
Statistics created with DBMS_STATS did not take LONG, LONG RAW, and LOB columns into account correctly.
For this reason, these tables can be displayed as highly fragmented. This is incorrect.
For segments in ASSM tablespaces, you can create a segment dump that displays the levels to which
the individual segment blocks are filled (this level is defined in the bitmap blocks). To do this, proceed as follows:

SELECT HEADER_FILE, HEADER_BLOCK FROM DBA_SEGMENTS
  WHERE SEGMENT_NAME = '<segment_name>';
EXEC DBMS_SPACE_ADMIN.SEGMENT_DUMP('<tablespace_name>',
  <header_file_from_above>, <header_block_from_above>);
You can also use BRCONNECT to create statistics for space utilization. For more information, see Note 554031.
As of Oracle 10g, the DBMS_SPACE function (described in Note 1295200) for determining
table fragmentation is available. The Oracle Segment Advisor (Note 927813) is also based on this function and
therefore can be used to determine fragmentation.
    • LOB segment fragmentation
You can use DBMS_LOB.GETLENGTH to determine the space used within an LOB segment.
You can use the following script to determine the net allocation of the existing LOB segments:

connect <sapuser>/<password>
SET HEADING OFF
SET FEEDBACK OFF
SET ECHO OFF
SET PAGESIZE 50000
SET SQLPROMPT ''
SPOOL lob.sql
SELECT
  'SELECT
    ''' || TABLE_NAME || ''' TABLE_NAME,
    ''' || COLUMN_NAME || ''' COLUMN_NAME,
    ROUND(SUM(DBMS_LOB.GETLENGTH("' || COLUMN_NAME || '"))
      / 1024 / 1024) "NET (MB)"
  FROM
    "' || USER || '"."' || TABLE_NAME || '";'
FROM
  USER_LOBS;
SPOOL OFF
COLUMN TABLE_NAME FORMAT A30
COLUMN COLUMN_NAME FORMAT A25
SET HEADING ON
@lob
SET FEEDBACK ON
SET SQLPROMPT 'SQL>'
In the next step, you can compare the net demand determined in this way with the BYTES value from
DBA_SEGMENTS, if you suspect the space is being utilized poorly.
    • Tablespace fragmentation
To determine the number of unused space fragments in a table space and their breakdown into different
sizes, proceed as follows:

SELECT BLOCKS, COUNT(*) FROM DBA_FREE_SPACE
WHERE TABLESPACE_NAME = '<tablespace_name>'
GROUP BY BLOCKS ORDER BY BLOCKS;
  • Provided that the statistics are up-to-date, you can determine the number of blocks on the other side 
  • of the High Water Mark as follows:

    SELECT EMPTY_BLOCKS FROM DBA_TABLES
    WHERE TABLE_NAME = '<table>';
You can use the following statement to determine all tables that have more than 1,000 unused blocks:

SELECT TABLE_NAME FROM DBA_TABLES WHERE EMPTY_BLOCKS > 1000;
  • You can use the DBMS_SPACE.UNUSED_SPACE function to determine the number of blocks on the 
  • other side of the High Water Mark of an index:

    VARIABLE TBL NUMBER;
    VARIABLE TBY NUMBER;
    VARIABLE UBL NUMBER;
    VARIABLE UNUSED_BYTES NUMBER;
    VARIABLE LUEFI NUMBER;
    VARIABLE LUEBI NUMBER;
    VARIABLE LUB NUMBER;
    BEGIN
      DBMS_SPACE.UNUSED_SPACE('<owner>', '<index_name>', 'INDEX',
        :TBL, :TBY, :UBL, :UNUSED_BYTES, :LUEFI, :LUEBI, :LUB);
    END;
    /
    PRINT UNUSED_BYTES;
  • The percentage of space in a tablespace used by extents can be determined as follows:

    SELECT TO_CHAR(
      (1 - (SELECT SUM(BYTES) FROM DBA_FREE_SPACE
      WHERE TABLESPACE_NAME = '<tablespace_name>') /
      (SELECT SUM(BYTES) FROM DBA_DATA_FILES
      WHERE TABLESPACE_NAME = '<tablespace_name>'))
      * 100,
      999999999999.99) "% (USED)"
    FROM DUAL;
  • To determine the freespace in file systems, you can use operating system tools (for example, "df -m" in UNIX).
    6. How can I reduce fragmentation and improve space utilization?
  • You can reduce the different fragmentation types as follows:
    • General segment fragmentation
As of Oracle 10g, you can use the Oracle Segment Advisor for a general analysis of segments,
such as tables, indexes, or LOBs. For more information, see Note 927813.
    • Index fragmentation
You can use REBUILD, COALESCE or DROP and CREATE to eliminate index fragmentation. If you want to use these functions, refer to Notes 771929 and 332677 for more information.
    • Table fragmentation
Allocated blocks are usually poorly utilized after you have deleted large volumes of data from the table
(for example, while archiving). If you know that the size of the table will in future be similar to its size before
you deleted the data, you do not have to intervene. However, if the table will permanently have less data, you
can reconstruct it during a reorganization to reduce table fragmentation (see Note 541538). As of Oracle 10g, 
you also have the option of using Segment Shrinking (Note 910389).
Refer to Note 572060 for more information.
If the fragmentation has LONG RAW fields due to the Oracle response described in Note 830965,
refer to the solutions listed there.
If an ASSM segment requires an unnecessary amount of space because the levels of filled space
defined in the bitmap blocks do not reflect reality, and because they accept blocks that are far too full
(for example, due to Oracle bugs), you can correct the inconsistencies as follows:

EXEC DBMS_REPAIR.SEGMENT_FIX_STATUS('<owner>','<segment_name>');
After you have done this, all free areas can be used again productively. For more information, see Note 896717.
    • LOB segment fragmentation
You can clean up LOB segment fragmentation by carrying out a reorganization (of the relevant table) and,
where necessary, by reducing PCTVERSION (recommended: <= 10).
    • Tablespace fragmentation
If tablespaces are fragmented in DMTS due to a large number of small areas of unused space,
you can use COALESCE to combine adjacent areas of unused space:

ALTER TABLESPACE <tablespace_name> COALESCE;
Note that this operation may cause problems with the ST enqueue if there are a lot of areas with
unused space (see Note 745639). If the number of areas with unused space remains high after a
COALESCE, you can defragment the tablespace by reorganizing it (see Note 541538).
To reduce tablespace fragmentation, we recommend that you use the "brconnect -f next" function to
choose adequate sizes for the NEXT extent for growing segments. This way, you can keep fragmentation to a minimum.
In the medium-term and the long-term, you can reduce tablespace fragmentation by changing to
LMTS (see Note 214995).
  • The space utilization of an index can be improved by decreasing the value of PCTFREE for the relevant index. 
  • However, this may lead to an increased number of index block splits and should therefore only be used in 
  • exceptional cases.
  • To avoid space losses due to PCTUSED, consider switching to ASSM (Note 620803), where this 
  • parameter is no longer relevant.
  • Occasionally (see, for example, Note 70513), you can also adjust PCTFREE and PCTUSED of a table during a 
  • reorganization for a better utilization of the blocks and to reduce the size of the table.
  • In individual cases, the unused space on the other side of the High Water Mark of the segment can be released 
  • by using DEALLOCATE UNUSED as described in Note 570832. In general, you do not have to intervene and remove 
  • the currently unused areas. You should only use this option if too much space was inadvertently allocated to the 
  • last extent, or as a temporary alternative to avoid carrying out a reorganization.
  • If a tablespace requires a large amount of space because the data file is too large, you can reduce the 
  • size of the data file by using RESIZE:

    ALTER DATABASE DATAFILE '<file>' RESIZE <new_size>;
Alternatively, you can also reduce the size by reorganizing the tablespaces (with data files).
  • To optimize space utilization in file systems with the Oracle data files, consider the following:
    • The freespace in the file system should be specified so you can carry out expected AUTOEXTEND operations 
    • or manual tablespace extensions without causing a file system overflow.
    • The fewer file systems you use for the data files, the smaller the "safety margin" you require in total on file system 
    • level. You can use one single file system for all data files. (You should distribute the file system over as 
    • many disks as possible.)
    • However, depending on the file system used, it is difficult to reduce an existing file system. 
    • If a reduction makes sense to avoid unused space, but you cannot carry out this step for technical reasons, 
    • you can move data files or reorganize tablespaces to move to a new, smaller tablespace.
Related Notes
1295200Oracle 10g or higher: Space statistics based on DBMS_SPACE
1010237Growth of DYNPSOURCE table after Unicode migration
948197Merge fix for DBMS_STATS package on Oracle 9.2.x and 10.2.x
927813Oracle 10g: Using Oracle Segment Advisor to optimize space
910389FAQ: Oracle Segment Shrinking
896717Problems with tables in ASSM tablespaces
832343FAQ: Clustering factor
771929FAQ: Index fragmentation
745639FAQ: Oracle enqueues
666061FAQ: Database objects, segments and extents