Search This Blog

Monday, August 30, 2010

Shortcut To Load XML file in Oracle table

There are many ways to load the XML file in the Oracle table. Many of the users are facing the Time constraint issue while loading/extracting data from XML file.
Many ETL tools also provide automated solution . But if the size of the XML data file is large than 60 MB then a lot of time is consumed. Over here there is a solution for this & in this solution the loading of XML file in Oracle tables is divided in two parts.
For example FAMILY.XML is an raw file.
So first we will transform the XML file into the Text file and this would be done through the XSLT. To more know about the XLST please refer to the link http://www.w3.org/TR/xslt .
In second phase we load the TEXT file with the help of Control file /Sqlloader in Oracle database tables.

The Design of the workflow would be like this :-


  
Suppose FAMILY.XML  file contains attributes ;


<Family>
<FAMILY> 1</FAMILY>
<VHCL_TYPE>C</VHCL_TYPE>
<FAMILY_DESC>Colt</FAMILY_DESC>
<DESC_ORIGIN_ID>10200</DESC_ORIGIN_ID>
<DESC_TAG>0</DESC_TAG>
<ACTIVE_IND>1</ACTIVE_IND>
<UPDATE_DATE>2000-01-11T09:59:26</UPDATE_DATE>
</Family>

Add this code to the beginning of the XML file:


<?xml version="1.0" standalone="yes"?>
<?xml-stylesheet type='text/xsl' href='FAMILY.xslt'?>
<Family xmlns="http://www.ibm.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.ibm.com FAMILY.xsd">


-- Over here in the above statement  href='FAMILY.xslt'?  &  xsi:schemaLocation="http://www.ibm.com FAMILY.xsd"> 
 is the xslt file name  & xsd file name which is to be created fpr this Family.xml file.


After appending this code to the XML , file appears like this : ---

<?xml version="1.0" standalone="yes"?>
<?xml-stylesheet type='text/xsl' href='FAMILY.xslt'?>
<Family xmlns="http://www.ibm.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.ibm.com FAMILY.xsd">
<Family>
<FAMILY> 1</FAMILY>
<VHCL_TYPE>C</VHCL_TYPE>
<FAMILY_DESC>Colt</FAMILY_DESC>
<DESC_ORIGIN_ID>10200</DESC_ORIGIN_ID>
<DESC_TAG>0</DESC_TAG>
<ACTIVE_IND>1</ACTIVE_IND>
<UPDATE_DATE>2000-01-11T09:59:26</UPDATE_DATE>
</Family>

Now the second task is to create XSLT file i.e FAMILY.XSLT

<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" 
xmlns:lxslt="http://xml.apache.org/xslt" xmlns:sL="http://www.ibm.com" 
exclude-result-prefixes="sL">

  
  <xsl:variable name="vrFamily"/>
  <xsl:variable name="vrFAMILY"/>
  <xsl:variable name="vrVHCL_TYPE"/>
  <xsl:variable name="vrFAMILY_DESC"/>
  <xsl:variable name="vrDESC_ORIGIN_ID"/>
  <xsl:variable name="vrDESC_TAG"/>



 <xsl:template match="sL:FamilyCollection">    

     <xsl:for-each select="sL:Family">
        <xsl:variable name="vrFAMILY">
<xsl:value-of select="sL:FAMILY"/>
        </xsl:variable>
<xsl:variable name="vrVHCL_TYPE">
<xsl:value-of select="sL:VHCL_TYPE"/>
        </xsl:variable>

<xsl:variable name="vrFAMILY_DESC">
<xsl:value-of select="sL:FAMILY_DESC"/>
        </xsl:variable>
        <xsl:variable name="vrDESC_ORIGIN_ID">
<xsl:value-of select="sL:DESC_ORIGIN_ID"/>
        </xsl:variable>
        <xsl:variable name="vrDESC_TAG">
<xsl:value-of select="sL:DESC_TAG"/>
        </xsl:variable>
        <xsl:variable name="vrACTIVE_IND">
<xsl:value-of select="sL:ACTIVE_IND"/>
        </xsl:variable>
        <xsl:variable name="vrUPDATE_DATE">
<xsl:value-of select="sL:UPDATE_DATE"/>
        </xsl:variable>
<xsl:value-of select="$vrFAMILY"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="$vrVHCL_TYPE"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="$vrFAMILY_DESC"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="$vrDESC_ORIGIN_ID"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="$vrDESC_TAG"/>

         <xsl:text>###</xsl:text>
</xsl:for-each>

  </xsl:template>
</xsl:stylesheet>

Once the XSLT file is created then keep both the FAMILY.XSLT & FAMILY.XML file on the same location in the directory folder on your local system.
Double click the on the FAMILY.XML file then the TXT file would be automatically  opened in the internet explorer window . You can save this by the name of FAMILY.TXT file.

This process can be automated......................... according to various requirements.


Now  create a control file  FAMILY.CTL file

LOAD DATA
CHARACTERSET UTF8
INFILE '%FileName' "STR '###'"
REPLACE
INTO TABLE FAMILY
FIELDS TERMINATED BY '|' TRAILING NULLCOLS

(
FAMILY CHAR TERMINATED BY '|',
VHCL_TYPE CHAR TERMINATED BY '|',
FAMILY_DESC CHAR TERMINATED BY '|',
DESC_ORIGIN_ID CHAR TERMINATED BY '|',
DESC_TAG CHAR TERMINATED BY '###',
FEEDFile_ID CONSTANT '$$FF'




%FileName : In the above pass /write  the directory path of the FAMILY.TXT file above.



Then run the Sql Loader command to load the data from TXT file to Oracle tables

sqlldr username@server/password control='FAMILY.CTL'  log = 'FAMILY.log'

control = path of the control file.


Through this approach any user reduce it' costs by at least 3 times of another approach.
Execution time of loading XML data into Oracle tables would be reduced by three times of another approach