Register | Log in


Subscribe Now>>
Home News Tech2Tech Features Viewpoints Facts & Fun Teradata.com
Applied Solutions
Download PDF|Send to Colleague

How secure is your data?

Enhancing security with views, external stored procedures and user-defined functions.

by Geoffrey C. Plummer and Adriaan W. Veldhuisen

The introduction of privacy regulations has brought about new concerns for the security of every organization's data. How a company protects confidential customer, partner and employee data is often given the same attention by investors as the company's profit.

Furthermore, improving existing data security to meet compliance and shareholder demands will require an aggressive and potentially costly corporate strategy in the coming years.

The ideal location to secure a company's data is at the source—the detail level within the data warehouse. Protecting the data at this level adds a more extensive and seamless firewall between the data and the potential hacker, while still allowing appropriate users access to the data. Teradata readily accomplishes this layered protection and select accessibility through the use of views, external stored procedures (XSPs) and user-defined functions (UDFs).

What about data masking?
Data masking, or obfuscation, involves obscuring data elements during the retrieval process. While data is stored in the underlying tables, data masking changes the presentation such that, for example, a Social Security number (SSN) stored in the customer table as "123-45-6789" will appear as "***-**-****" when retrieved via a query against the database. Many structures can be put in place to mask data. However, most organizations develop view definitions to enable data masking such as the one below:

	REPLACE VIEW CUSTOMER (Customer_Name,SSN) as
	SELECT Customer_Name, '***-**-****' FROM Customer_Table;

A user querying the Customer_Table using the CUSTOMER view will see only the customer's name and not his or her SSN. This access control option may be adequate depending on the policy requirements. But what if users require multiple levels of access to the data, with some users needing to see the SSN while the system continues to mask the SSN for others?

The easy solution would be to create another view on the Customer_Table that does not mask the SSN and grants limited access to only those authorized users. While this strategy may initially seem easy to implement, the cost of maintaining multiple views for the Customer_Table may prove expensive. A better solution would be to enable conditional data access via a single view described below:

	REPLACE VIEW CUSTOMER (Customer_Name,SSN) as
	SELECT Customer_Name, 
	CASE WHEN Protection_Table.ProtectionNeeded = 'Y' 
		THEN '***-**-****' ELSE SSN
	FROM Customer_Table JOIN Protection_Table 
		ON Protection_Table.User_ID = USER;

In this scenario, a user querying the CUSTOMER view can see the customer's SSN if a row exists in the Protection_Table with the user's User_ID, and if that row has the ProtectionNeeded column set to 'N.'

Conversely, a row in the Protection_Table for the user with ProtectionNeeded set to 'Y' will mask the SSN. The Protection_Table must have a row for every User_ID. Eliminating a user's access to the tables is done by removing the user's row from the Protection_Table. Enhancing the above structures to include row-level data masking is similarly straightforward.

For a large user community, managing a significant-sized User_ID table incurs greater effort and increased expense. And with many columns needing protection from various users, the issue multiplies. Applying Teradata roles and profiles functionality to the tables will mitigate this issue.

Initial database-level encryption
Although data masking may protect data from a variety of applications users, there is no separation of duties as the data itself remains in the database. "Super Users" with access rights to base tables could bypass the view layer and access the data directly. While best practices dictate that users not be allowed direct access to the base tables, most DBAs have unlimited access rights.

Early database encryption practices were implemented within the data extract, transform and load (ETL) processes. Since many load processes were based on custom-created code, ETL designers created transformation logic using SQL-like SUBSTRing logic. So an SSN of 123456789 in an input file was transformed into 587314926 and stored in the database. It wasn't long before some users discovered, by viewing known SSNs, that all the 4s were turned to 3s, 5s turned to 1s, 6s turned to 4s, etc. Besides, the DBA and ETL designers likely had access to the ETL code and the data transformation rules.

One point to note is that when the encrypt function was used, every time the SELECT code (see below) was executed, a different response would result:

	SELECT Encrypt('123456789','SSN1') "SSN1";

To circumvent hackers, this randomization built into the encrypt code (based on certain dynamic parameters) produced a different encryption sequence every time.

After the initial encryption practices, database architects enabled more procedural logic. Much effort went into developing these coding structures and enabling access to ETL processes. The ETL designer simply made a call to the program to retrieve the encrypted data, which was then loaded into the database. Database architects described the processes required, and companies built these solutions—until users wanted immediate access to the decrypted data. Although extracting a customer's SSN involved additional processing for the extracts, it was a simple process and decrypted data never had to be stored in the data warehouse.

Eventually other users, such as customer service department employees, started to require access to decrypted data. Additionally, data elements not originally considered sensitive—such as birth dates, driver's license numbers and credit card numbers—were increasingly incorporated into the new regulations. To accommodate these requirements, database functionality expanded to include enhanced data protection.

External stored procedures and user-defined functions
The Teradata Database enables access to external processing structures through the use of XSPs and UDFs. These structures can be written in many programming languages and are accessible to ETL designers and users.

A "C" program created to encrypt a customer's SSN can now be made into an XSP and made available to end users through the procedure creation. The following steps bypass encrypted data and allow user accessibility:

CREATE PROCEDURE EncryptSSN
The "C" program DecryptSSN is used to reverse the encryption process. The ETL designer calls the EncryptSSN XSP [CALL EncryptSSN('123456789');] and stores the encrypted SSN in the database table. The user needing immediate access to the decrypted data can call the DecryptSSN XSP and immediately retrieve the customer's decrypted SSN.

CREATE PROCEDURE DecryptSSN
UDFs provide another option for users needing access to SSNs. Whereas an XSP returns only a single value via the call request, a UDF creates a function that is available to a user within a SELECT statement. The UDF is created through a creation process similar to an XSP. The "C" code within the UDF can be the same as within the XSP.

CREATE FUNCTION Encrypt (EncryptStr VARCHAR(30), EncryptSeq VARCHAR(30))
CREATE FUNCTION Decrypt (DecryptStr VARCHAR(30), DecryptSeq VARCHAR(30))
ETL processes can utilize the XSP while users utilize the UDF. This is important because ETL design strategies may involve using packaged ETL solutions, such as Ab Initio or Informatica, to deliver data to the warehouse. The flexibility of having multiple access methods for the same encryption code is a powerful database extension.

Once the UDF is created, the user may access the function within his or her application. Multiple data elements (within a VARCHAR(30)) can be encrypted and decrypted using the same XSP and UDF. The first parameter, EncryptStr, is the string to be encrypted, and the second parameter, EncryptSeq, is the specific encryption sequence employed. Various encryption sequences can be specified and made available to the program through define statements in the "C" include files. Passing the same EncryptStr to be encrypted and specifying different encryption sequences will bring different results when using the UDF as shown below:

	SELECT Encrypt('220384294','SSN1') "SSN1";
	SELECT Encrypt('220384294','SSN2') "SSN2";
	SELECT Encrypt('4264827290262739','CCD1') "Credit Card Number";

Multiple encryption sequences can be established, used and buried within the coding structures to enhance data protection. For instance, different encryption sequences can be employed by the column being protected, the month in which the data is loaded, the state in which the customer lives, etc.

While offering additional data protection, these encryption sequences allow immediate access to those who need it—provided the user has the correct encryption code available. In addition to functions shown in this article, other packaged applications are available to manage encryption sequences, easily establish and maintain flexible access rules, and track and audit changes in the controls.

Integrating the solution
With encryption and decryption functions in place, the database architect may now integrate the solution into the data warehouse structure. A combination of the aforementioned functions and view layers allows for the decrypt operation to happen with the view layer—but only for those users who are allowed decrypt functionality.

The following section of code further develops the SSN encryption/decryption example:

	REPLACE VIEW CUSTOMER (Customer_Name, SSN) AS
	SELECT Customer_Name,
	CASE WHEN Protection_Table.ProtectionNeeded = 'Y' 
		AND Protection_Table.DecryptAllowed = 'N' THEN '***-**-****'
	CASE WHEN Protection_Table.ProtectionNeeded = 'Y' 
		AND Protection_Table.DecryptAllowed = 'Y' THEN Decrypt(SSN)
	ELSE Decrypt(SSN) END
	FROM Customer_Table JOIN Protection_Table 
		ON Protection_Table.User_ID = USER;
Teradata, Protegrity partner to keep data secure

The Protegrity solution supports full separation of duties as required by the Payment Card Industry Data Security Standard (PCI-DSS). The security administrator enforces policy-based authorization through the policy manager and assigns access rights to the individual users. The policy manager can also provide encryption at the table or database level as part of Teradata's backup, archive and restore (BAR) solution.

Protegrity data at rest or via the Teradata BAR encryption, and the Teradata Database and Client encryption of authentication and network traffic, provide organizations with protection from the time data enters the data warehouse to the time it is aged out and deleted from the data warehouse and archive copies.

With these encryption tools, an organization can leverage existing data protection functionality and protect its most valuable asset against data security breaches.

—G.C.P. and A.W.V.

Because the interface to the decryption algorithms is placed within the view layer, the encryption and decryption processes are invisible to the user. The Protection_Table is augmented for all users to have the column DecryptAllowed. Only those users enabled to see the decrypted data will be able to view it.

Safety now
Companies lacking a data security strategy will be pushed to implement one in the next few years. Regulatory bodies are increasing data protection requirements, pressuring data warehouse vendors to support the changes, and organizations with data warehouses to implement them.

Employing an aggressive strategy with the proper encryption tools and solutions is the only way organizations can keep their data secure, their customers protected and their investors satisfied. T

Geoffrey C. Plummer is a Teradata Certified Master and principal consultant with Teradata. He has worked with Teradata for 20 years within the financial, healthcare and communications industries.

Adriaan W. Veldhuisen, also a Teradata Certified Master, is a member of Teradata's Research & Development, Product Management Team. He is co-inventor of three privacy patents, is a privacy architect at Teradata and is responsible for setting development requirements for privacy and security requirements in Teradata releases.

Teradata Magazine-March 2008

More Applied Solutions

Related Links

Reference Library

Get complete access to Teradata articles and white papers specific to your area of interest by selecting a category below. Reference Library
Search our library:
Manthan
Trillium
Protegrity
Teradata.com | About Us | Contact Us | Media Kit | Subscribe | Privacy/Legal | RSS
Copyright © 2008 Teradata Corporation. All rights reserved.