Executive Summary
This white paper describes the Queue Tables feature that was introduced in Teradata® Database V2R6. This feature enables new and improved ways of building applications to support event processing use cases.
Introduction
A prerequisite to a successful implementation of an active data warehouse in Teradata Database is the ability to kickoff tactical event-driven business processes and applications based on recent activity. Queue tables enable new ways of building these event processing applications. These provide significant performance and usability improvements over capabilities in prior releases that were not optimized for event processing.
The Need for Queue Tables
Here are some examples of architectural approaches to event processing before queue tables functionality was available:
Event Alert
When an event is detected by application code running in the Teradata Database, data are inserted into an event table (e.g., by a trigger) for later processing. Before queue tables, an external event processing application pulled these events out of the database using a polling loop in a stored procedure. In the following example, the anyevent stored procedure is an example of a polling loop, repeating a SELECT statement to determine if any events are present.
CREATE PROCEDURE anyevent ()
BEGIN
anyrows:
LOOP
FOR checkevents AS checkcount
CURSOR FOR
SELECT * FROM eventtab
DO
IF ACTIVITY_COUNT > 0 THEN
LEAVE anyrows;
END IF;
END FOR;
END LOOP anyrows;
END;
Once an eventtab row is present, those event data are returned to the external application for further processing. The external application then proceeds back to the anyevent polling loop to find more events. The polling loop and repeated SELECT processing are an inefficient use of CPU resources.
A retail inventory management system is an example of an Event Alert application. Point-of-sales data capture can automate the process of recognizing low inventory level events and replenishing retail store items.
Load with Asynchronous Processing
A typical case is data being loaded into a table from an external source. Each piece of data may require a significant amount of processing, e.g., complex logic, business rules, analytic modeling, or XML shredding. Instead of processing the incoming data immediately upon arrival, it needs to be buffered in a table so that the data loading can be considered complete. Before queue tables, the data were processed directly by a group of stored procedures using a polling loop. The stored procedures shared the work, i.e., each piece of data was given to one of the stored procedures for processing. As in the Event Alert case, the polling loop is an inefficient mechanism to scan for incoming data.
Message Processing
When an airline needs to change a flight schedule, different systems need to know about it. The booking system handles rebooking transactions. Call centers work through a queue of customers who must be called. Before queue tables, the input of each of these systems would be modeled as regular base tables. Messages in one table represent schedule changes. Stored procedures reading that table could put messages into another table for the booking system and call centers. Call center agents are assigned to customers from the call center table. Application logic must ensure that messages are not dropped or duplicated and are processed in order. Messages must be deleted after they're processed from the message queue.
What is a Queue Table?
Teradata Database is differentiated from competitive implementations by the queue table functionality introduced in the Teradata Database V2R6. It supports the usual properties of a persistent database table, and now also the characteristics that enable queue-oriented applications. Teradata Database queue tables are implemented as a Teradata extension to ANSI SQL at the database/ table level. This provides greater flexibility, power, and functionality, and leverages the natural performance characteristic of Teradata Database.
The SQL CREATE TABLE statement is extended to support queue tables. The queue table definition is different from a standard base table in that a queue table always contains a user-defined insertion timestamp (QITS) as the first column of the table. The QITS contains the time the row was inserted into the queue table as the means for approximate FIFO ordering.
Teradata Database queue tables support asynchronous first-in-first-out (FIFO) push, pop, and peek queue operations:
- The FIFO push operation is defined as a SQL INSERT operation to store rows into a queue table.
- The FIFO peek operation is defined as an SQL SELECT operation to retrieve rows from a queue table without deleting them. This is also referred to as browse mode.
- The FIFO pop operation is defined as an SQL SELECT AND CONSUME operation to retrieve a row from a queue table and delete that selected row upon completion of the read. This is also referred to as consume mode. A consume mode request goes into a delayed state when a SELECT AND CONSUME finds no rows in the queue table. The request remains idle until an INSERT to that queue table awakens the request; that row is returned, and then it's deleted. Consumed rows are returned in FIFO order.
The default FIFO ordering of a queue table may be altered before consumption by altering the QITS. This is done with the SQL UPDATE, UPSERT form of UPDATE, or MERGE statements. Rows may also be removed from a queue table with the SQL DELETE.
Since a Teradata Database queue table is a persistent base table, the basic features for transactions are provided. Queue table transactions are isolated, atomic, and recoverable. Consumed rows are rolled back when a transaction containing a SELECT AND CONSUME does not complete successfully.
Queue table operations may be triggered. The triggered statements may access queue tables through SQL, an executed macro, or a stored procedure.
The Teradata Database maintains a memory cache on every PE to track information about, and the FIFO order of, non-consumed rows for each queue table. The cache is updated as rows are inserted, updated, deleted, dropped, restored, or consumed in the queue tables.
Event Processing with Queue Tables
Now we look at event processing cases explaining how queue tables are used to simplify logic and improve performance of event-driven applications.
Event Alerts are handled more efficiently when using queue tables. A SELECT AND CONSUME is used by the event processing application to retrieve the event rows in order. The oldest event row (based on the QITS value) from the queue table is the next row to be popped. If the event queue table is empty, the SELECT AND CONSUME request will wait for the table to be populated. When data arrive in the queue table, the SELECT AND CONSUME statement resumes processing, returning the result row to the application that processes the event. The event row that was retrieved is deleted from the queue table as part of the SELECT AND CONSUME.
The event detecting application populates the event queue table with a SQL INSERT operation. A trigger may be used to push rows into the queue table. The event processing application may repeat the pop sequence with another SELECT AND CONSUME statement to wait for more events.
Queue tables solve the complex and inefficient polling loop problem. A repeated SELECT statement is no longer needed. Numerous DML statements of the 'anyevent' stored procedure are replaced with a single SQL SELECT AND CONSUME statement.
SELECT AND CONSUME TOP 1 *
FROM eventtab;
Consider Load with Asynchronous Processing using queue tables. A load application buffers all incoming rows into a queue table with SQL INSERTs. Rows may arrive and be inserted at different rates. An asynchronous processing application pops the oldest data row using SELECT AND CONSUME and calls a series of stored procedures to perform the analysis on the loaded data. The time it takes to analyze the queue table data may be longer than the time it takes to load one incoming row. Hence, rows may back up in the queue table before they are analyzed. Using a queue table, SELECT AND CONSUME solves the polling loop problem in the asynchronous processing application.
Controlling the number of asynchronous processing applications can smooth out the resource consumption in response to varying insertion rates to the queue table. It is also possible to control the maximum rate of resource consumption and the maximum throughput rate. For example, during a short time period with a high inflow rate, the data are buffered in the queue table and processed by the stored procedures at their limited rate.
Now let's see how message processing in the airline booking system handles rebooking transactions when using queue tables. Messages representing booking changes are first pushed into a schedule change queue table using a triggered INSERT. Stored procedures waiting for a schedule change event to occur awaken from their SELECT AND CONSUME on the schedule change queue table and INSERT messages into other queue tables for use for the booking system or call centers. Data integrity is maintained when these operations are processed in a single transaction. For example, a row consumed from a schedule change queue table is rolled back when an INSERT to the booking or call center queue tables fails.
The booking system pops messages from a queue table of rebooking transactions using SELECT AND CONSUME. Similarly, the call center application will pop messages from a queue table of customers who need to be called. Call center agents are assigned customers by consuming from the call center queue table.
Application logic is simplified since messages, as queue table rows, are not dropped, duplicated, or processed out of order. Messages are automatically deleted after they are consumed from each queue table. The use of transactions when accessing the queue tables allows isolation and atomic operations in the message processing sequence.
Competitive Advantage
The queue table functionality in Teradata Database differs from competitive implementations in the following areas:
- Queue tables are persistent database tables with the properties of an asynchronous FIFO queue.
- Queue tables are administered with SQL DDL. The CREATE TABLE syntax is the only DDL extended to support the queue tables feature.
- Competitive implementations provide a Stored Procedure, or UDF SQL interface. In Teradata Database, queue tables are accessed with SQL DML. A new SELECT AND CONSUME statement provides the FIFO pop capability.
Conclusion
The new Teradata Database V2R6 Queue Tables feature simplifies the implementation of asynchronous, event-driven applications. Performance is improved by saving CPU and network resources.
Queue tables possess all the properties of regular Teradata Database tables: persistence, scalability, parallelism, ease of use, reliability, availability, recovery, and security. In addition, queue tables possess FIFO queue properties. These properties allow queue tables to provide flexibility, power, functionality, and leverage the natural performance characteristic of the Teradata Database.
With queue tables, the Teradata Database provides the capability to run real-time, event-driven active data warehouse activity while still having access to historical data providing the capability to make more effective business decisions in realtime processing environments.
David Craig is a senior member of the Teradata Corporation's technical staff. He has been with Teradata since 1990. During this time, David has architected and developed such features as Queue Tables, Enhanced Cultural Data Type Formatting, and International Character Set Support.