Migrating data from Oracle to Teradata:
the "no frills" method
All kinds and sizes of companies are migrating their business
data to Teradata warehouses these days. Some are accomplishing
this migration through the use of third-party ETL tools written
to manage the data movement between Teradata and Oracle. But
even if your company doesnt purchase third-party tools
for this purpose, you can still migrate the data with the
tools provided by Teradata and Oracle.
The "no frills" method of data migration, as it
is sometimes called, is not as elegant or as high performing
as the approach provided by ETL tool vendors. This is because
it uses Oracle's SQL*Plus rather than the low-level access
methods used by third-party products that dig into the internals
of Oracle. But, you can still get the job done. There are
even some optimizations that can be used with SQL*Plus if
the Oracle system can support them.
With SQL*Plus, you have control over the column formats for
the extracted data. The most neutral extract format is externally
readable ASCII character formatssomething you can read
if printed out in a report. By doing this, you eliminate the
issue of database-unique, internal formats for different data
types used in tables. Formats must be considered because databases
store dates, times and numeric data in their own internal
formats. You have to be the interpreter for Teradata and translate
Oracle's date, time, and number formats into something Teradata
can understand and store.
The issues of date handling provide good examples of some
of the challenges you might face when migrating data from
Oracle. Teradata can store date and time as different columns
or together in a timestamp. Oracle stores date and time together
in the same column. You have to determine if a date column
in Oracle only contains significant dates, only significant
time values, or whether date and time are both important.
Each column in the Oracle table must be reviewed, and an appropriate
data type and format must be chosen for the extraction so
the data can be loaded into Teradata.
Another benefit to the ASCII-character extract approach is
that there will be no character translation errors if the
extracted files are FTPed from one platform type to another.
For instance, when data is extracted on an EBCDIC host computer,
character translation from EBCDIC to ASCII is easy for most
basic platform software to perform. In order to keep things
simple, creating the extracts from Oracle using fixed-length
records eliminates another possible area of errors and confusion
in scripting the data-migration process. Depending upon whether
you extract data on a UNIX host, a mainframe or a Microsoft
Windows host, you might need to account for carriage-return
and end-of-record markers on each record.
If there is enough disk space or enough tape drives on your
Oracle source system for parallel data extraction from an
Oracle table, you can partition your SQL*Plus extracts. This
might speed up the Oracle extract process. Parallel extracts
are accomplished by designing separate SQL*Plus scripts that
select different ranges of data specified by the WHERE clauses
of the SELECT statements. Each SELECT script can write to
its own file. If the Oracle system can support parallel query
execution, parallelism can be used to speed up each of the
SELECT scripts. The extracted files can then be used as data
sources in the FastLoad, MultiLoad, or TPump script that is
used to insert the data into Teradata.
The following are sample scripts for migrating table data
from Oracle to Teradata. There is a SQL*Plus script that extracts
the data and places it in a file. This script is followed
by a Teradata FastLoad script that creates a new table and
loads the data. The "set" and "column"
commands in the SQL*Plus script document the formats of the
columns in the ASCII extract file. The name of the file to
which SQL*Plus will write the extracted data is documented
in the SPOOL command. The Oracle SELECT statement is coded
so that all the columns are named. Since the order of the
columns in the SELECT statement is the order that the data
will be in the output record, this is a form of documentation
for those who might need to maintain these scripts.
The "No Frills" Method Code
for Migrating Data from Oracle to Teradata
Sample Oracle SQL*Plus Script:
set termout off
set newpage 0
set space 0
set pagesize 0
set echo off
set feedback off
set heading off
set linesize 86
column CUST_NO format 099999990
column DATE_UPDATED format A10
column FIRST_NAME format A15
column MIDDLE_NAME format A15
column LAST_NAME format A20
column TOTAL_DEPOSITS format S0999999999999.90
spool customer.txt
SELECT
CUST_NO
,TO_CHAR(DATE_UPDATED,'YYYY/MM/DD')
,FIRST_NAME
,MIDDLE_NAME
,LAST_NAME
,TOTAL_DEPOSITS
FROM CUSTOMER
/
Sample Teradata FastLoad Script:
LOGON mytera/jack,jill;
DROP TABLE CUSTOMER;
DROP TABLE CUSTOMER_ERR1;
DROP TABLE CUSTOMER_ERR2;
CREATE TABLE CUSTOMER (
CUST_NO INTEGER FORMAT '9(9)',
DATE_UPDATED DATE FORMAT 'YYYY/MM/DD',
FIRST_NAME CHAR(15),
MIDDLE_NAME CHAR(15),
LAST_NAME CHAR(20),
TOTAL_DEPOSITS DECIMAL(15,2) FORMAT '-9(13).99')
UNIQUE PRIMARY INDEX(CUST_NO ) ;
SET RECORD UNFORMATTED;
DEFINE
CUST_NO (CHAR(9)),
DATE_UPDATED (CHAR(10)),
FIRST_NAME (CHAR(15)),
MIDDLE_NAME (CHAR(15)),
LAST_NAME (CHAR(20)),
TOTAL_DEPOSITS (CHAR(17)),
WINDOWS_EOR_CHARS (CHAR(2))
FILE=customer.txt;
begin loading CUSTOMER
errorfiles CUSTOMER_ERR1, CUSTOMER_ERR2
;
INSERT INTO CUSTOMER VALUES (
:CUST_NO,
:DATE_UPDATED,
:FIRST_NAME,
:MIDDLE_NAME,
:LAST_NAME,
:TOTAL_DEPOSITS
)
;
end loading;
QUIT;
Download
a Microsoft Word document containing the previous code.