Manage risk factors with Teradata Support Services.
by Aaron Dalton and Amy Rukamp
The trend is clear—mainstream data warehousing customers are migrating to and developing mission-critical applications in their Teradata environment. In addition
to driving technical requirements such as data freshness, near real-time access and mixed-workload management, mission-critical applications require an evaluation
of support needs in order to minimize the impact of planned and unplanned system outages.
While reactive support may have provided reasonable response times for strategic reporting and ad hoc queries, mission-critical workloads require proactive
support—the ability to detect incidents before they impact users. To address these needs, Teradata created Business Critical Support, a program designed for
systems that will experience continued growth and evolve toward the highest levels of availability.
Teradata Business Critical Support includes several features and services that work together to prevent costly downtime and promote system availability. These
features include:
> An experienced and reliable support team that clearly defines a support process and plan to ensure requirements are met
> A monthly software patch review and change control management to guarantee appropriate software patches are recommended and installed as necessary
> Automatic availability reporting and routine system health checks that report on hardware conditions, identify components impacting availability and highlight problem areas requiring corrective action
In the end, Teradata Business Critical Support favors an ounce of prevention over a pound of cure.
An ounce of prevention
When patches are not closely monitored and inappropriate or misapplied patches are installed, a system is left vulnerable to unplanned downtime as well as
inaccurate or incomplete testing and documentation. By providing monthly recommendations and critical patch reviews, patch management can play a crucial role in
preventive maintenance. In concert with an assigned support team, patch management provides documented change control plans for hardware and software patches as
well as minor releases and maintenance releases.
Ongoing feedback is a management requirement, and system management is no exception to this rule. Teradata Business Critical Support customers receive valuable
proactive and preventive reports on how their systems are performing. The system health check analyzes error logs to identify patterns and trends, as well as
hardware conditions that may require corrective actions. A Teradata Support representative reviews this on a bi-weekly basis to assist customers in minimizing any
impact to system availability.
Another proactive and preventive measure available only to Teradata Business Critical Support customers is system availability reporting. With this feature, users
can track database availability by site ID and date for both planned and unplanned downtime. It also identifies components impacting availability, highlights
problem areas requiring corrective action and enables users to make decisions based on trend analysis.
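As a rough illustration of the arithmetic behind such a report, the Python sketch below computes availability per site ID and date from a list of outage records. The record layout, the site IDs and the outage durations are invented for the example and do not reflect the format of the actual Teradata report.

```python
from collections import defaultdict

MINUTES_PER_DAY = 24 * 60

# Hypothetical outage records: (site_id, date, kind, minutes_down).
OUTAGES = [
    ("SITE001", "2007-03-01", "planned", 90),
    ("SITE001", "2007-03-01", "unplanned", 30),
    ("SITE001", "2007-03-02", "unplanned", 12),
    ("SITE002", "2007-03-01", "planned", 60),
]


def availability_report(outages):
    """Return {(site_id, date): summary} with downtime split by kind."""
    downtime = defaultdict(lambda: {"planned": 0, "unplanned": 0})
    for site_id, date, kind, minutes in outages:
        downtime[(site_id, date)][kind] += minutes

    report = {}
    for key, minutes in downtime.items():
        total_down = minutes["planned"] + minutes["unplanned"]
        report[key] = {
            "planned_min": minutes["planned"],
            "unplanned_min": minutes["unplanned"],
            # Share of the day the database was up, as a percentage.
            "availability_pct": round(
                100 * (MINUTES_PER_DAY - total_down) / MINUTES_PER_DAY, 2
            ),
        }
    return report


REPORT = availability_report(OUTAGES)
for (site_id, date), row in sorted(REPORT.items()):
    print(site_id, date, row)
```

Keeping planned and unplanned minutes in separate columns makes it possible to see whether a dip in availability came from scheduled maintenance or from a fault.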
To tie everything together, a proactive support approach requires dedicated personnel. Teradata offers one-source support: one team that knows the system inside
and out; one team that schedules and conducts changes. This knowledgeable and experienced Teradata support team utilizes proprietary tools to recognize patterns
of potential system problems and take corrective action to prevent them.
Proactive support software available to all customers
Teradata Vital Infrastructure, the next generation of the Customer Care Link, is built-in support software available to all customers. This software plays a key
role in proactive, preventive and predictive support by continually collecting, retaining and analyzing information about a customer's system. When a fault event
is detected, the event data is recorded, automatic incident reports are created and alert notifications are sent to the Teradata support staff and tracked.
The software can solve some problems without human intervention, much like how an anti-virus software program automatically eliminates hidden viruses. However, if
Teradata Vital Infrastructure is unable to resolve a problem on its own after recognizing that the customer's system is reaching a dangerous instability threshold,
it will issue an automatic alert to support personnel.
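The detect, record, report and alert flow described above can be pictured with a short Python sketch. It is only an illustration: the event fields, the five-minute grouping window and the function name are assumptions made for the example, not a description of how Teradata Vital Infrastructure is implemented.

```python
from datetime import datetime, timedelta

# Assumed window: an event on the same component that arrives within this
# interval of the incident's latest event is attached to that incident.
WINDOW = timedelta(minutes=5)

# Hypothetical fault events: (timestamp, component, message).
EVENTS = [
    (datetime(2007, 3, 1, 9, 30), "node1-disk", "Vdisk path failed"),
    (datetime(2007, 3, 1, 9, 31), "node1-disk", "I/O error"),
    (datetime(2007, 3, 1, 9, 33), "node1-disk", "Taken out of service"),
    (datetime(2007, 3, 1, 11, 0), "node2-fan", "Fan speed low"),
]


def group_into_incidents(events, window=WINDOW):
    """Group related fault events so each problem opens one incident."""
    incidents = []
    open_by_component = {}
    for timestamp, component, message in sorted(events):
        incident = open_by_component.get(component)
        if incident is None or timestamp - incident["last_seen"] > window:
            # First event for this component, or the window has expired:
            # open a new incident. This is the point to send one alert.
            incident = {"component": component, "events": []}
            incidents.append(incident)
            open_by_component[component] = incident
        incident["events"].append(message)
        incident["last_seen"] = timestamp
    return incidents


INCIDENTS = group_into_incidents(EVENTS)
for incident in INCIDENTS:
    print(incident["component"], len(incident["events"]), "event(s)")
```

Grouping of this kind keeps several alerts about the same failing component from turning into several unrelated incidents.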
In any of these cases, the diagnostic information collected by Teradata Vital Infrastructure assists support personnel in identifying and quickly resolving
problems; conversely, running without this proactive software may prolong the time it takes to resolve incidents. In fact, internal Teradata studies indicate that
Teradata Vital Infrastructure uncovers 62% to 70% of all system incidents.
To take full advantage of Teradata Vital Infrastructure, customers must implement remote monitoring services, such as a secure Virtual Private Network. These
monitoring services greatly reduce incident resolution time by providing multiple support personnel with the ability to work on high-priority incidents at the same
time.
To ensure data cannot be accessed or modified as it crosses the network, Teradata provides industry-standard secure transport protocols: 128-bit encryption,
Secure Shell (SSH) and SSH File Transfer Protocol (SFTP). Furthermore, Teradata support personnel can view only the state and event data, not the customer
or network data. This level of database access is controlled at each customer site.
Proactive support with an interface
Another service available to all Teradata support customers that reinforces individually paced self-help and self-learning is Teradata @ Your Service. Many customers
with mission-critical systems log into Teradata @ Your Service, whether it is to review technical product documentation and performance guides, search the newly
improved knowledge base to find solutions to problems, join a discussion forum or check the status of an incident. The Web portal provides easy access to a
plethora of pertinent information and is available to customers at all support levels. (See "The search is on" below.)
Growth and market demands are helping to push more and more systems into becoming mission critical. With more riding on the bottom line, mission-critical customers
are taking advantage of the full suite of proactive, preventive and predictive features and services available through Teradata Business Critical Support to
further reduce the risk of degraded performance and downtime. Teradata Business Critical Support has the capacity to grow with businesses and provide them with
system stability and peace of mind.
The search is on
When David Belcher, Teradata Database Analyst (DBA) team leader at British Airways, was frustrated with the unpredictability and the
length of time it took to back up large tables of data, he turned to the Teradata @ Your Service Web portal. He knew about the cluster
dumps solution, but was under the impression it wasn't configured to work on the British Airways system. After a Teradata backup
specialist informed him that this solution could, in fact, be transferred to their Disaster Recovery system, Belcher searched the Web
portal and found an article with supporting information. With this newfound knowledge, Belcher implemented a strategy that reduced the
time it took to back up the table from 18 hours to six hours.
It's not the first time Belcher has benefited from the knowledge base
stored within the Teradata @ Your Service online site. Belcher often utilizes its easy-to-access performance guides and information on
tools like Teradata Priority Scheduler that impact his system's operation.
"I am interested in having up-to-date information on software release dates and de-support dates," says Belcher. "[Having this knowledge]
can be helpful in planning the exact dates and times of software upgrades." With his findings on Teradata @ Your Service, Belcher
seldom needs to request information from his Teradata Account Manager. "The information contained in the roadmaps relates to all
customers looking to upgrade, so it's much better if it is fully available to all, rather than each site having to request the same
info from their account manager."
Figure: The new Teradata @ Your Service search engine helps users find what they need faster and more efficiently.
Of course, any informational online site is useful only to the extent that it can be easily accessed and manipulated. Belcher says that
the search engine within Teradata @ Your Service easily clears this hurdle with its significantly improved use of its natural search
technology. After typing in search words or phrases, Belcher is instantly rewarded with numerous results. Within each result, multiple
lines are displayed and the search words highlighted, making it easier to determine the usefulness of the returned items.
Furthermore, Belcher says he appreciates the ability to conduct secondary searches within search results—winnowing down, for
instance, a 300-plus item return on the word "partition" to just 16 items with a secondary search on the word "index." Belcher
notes that the relevancy ratings and the categorization of search returns into product areas and subsets also help him get the
most out of Teradata @ Your Service.
Meanwhile, Teradata developers are constantly working to make Teradata @ Your Service even more useful. For instance, searchable manuals,
including Orange Books, will be accessible through the proactive platform in the future. This new feature will make it easier than ever
for users to efficiently find the product information that they are looking for without having to download an entire document.
—A.D.
Teradata Vital Infrastructure enables success
Major retailers "A" and "B" are competitors with comparable systems. Both will be launching a multi-tiered promotional marketing
campaign that will include:
> Promotional product pricing on select retail items
> National 30-second ad spots opposite evening network news and syndicated shows
> Radio ads in specific geographic markets
> Full-scale consumer print campaign, with coupons, in major daily newspapers across the country
> Direct-mail effort, with coupons, targeting roughly 500,000 to 800,000 consumers
> Full-scale national press relations tour to coincide with scheduled promotions
Each promotional component is scheduled to launch at specific dates and times to support an in-store pricing promotion. The failure to
launch any part of the campaign could mean a loss of customers and revenue. Both retailers' campaigns are dependent upon holiday
promotional pricing, and each retailer needs to complete its final analysis to arrive at the discounted pricing.
The final analysis includes different queries to the database to determine current inventory levels, product pricing, market-basket
analysis and year-over-year comparison at the individual store level. Business analysts from both retailers are charged with pulling
the latest numbers and assigning the final promotional pricing.
The tables (shown below) illustrate how retailers A and B handle an unexpected problem and how Teradata Support
Services enables one retailer to meet its deadline. The timeline shown begins as both business analysts attempt to
run a query. Each analyst faces the same fault, but one resolves the problem more quickly and moves forward with
pricing his or her promotional items.
—A.R.
Major Retailer "A"
System: 56 Node, 5380 co-existent system
Teradata Vital Infrastructure: Active
9:30 a.m. | Teradata Vital Infrastructure detects multiple degraded internal hard drives; alert sent to Auto Incident Create (AIC).
9:32 a.m. | AIC opens an incident and reports degraded internal drives and marks "OUT OF SERVICE." Also recognizes that a pattern of alerts is developing and events need to be "tagged" to show their relationship.
9:33 a.m. | AIC e-mails specific incident details to Teradata Customer Support Rep (CSR): Vdisk path failed due to I/O errors; items taken out of service; check stream logs for corresponding SCSI errors; may be problem with internal mirrored disk drives in the node.
9:44 a.m. | CSR dials into the system via secure VPN and views details of the alerts.
9:45 a.m. | CSR is able to discern with report details that the database is being affected by the disk failures and follows the software's recommended action of checking stream logs for SCSI errors and checking internal mirrored disk drives in the node.
9:48 a.m. | CSR engages Teradata Global Support Center (GSC) and determines that the internal disk drives need to be replaced. CSR determines spares are onsite.
9:53 a.m. | CSR notifies customer of the problem and the estimated time to fix is roughly one hour.
9:58 a.m. | CSR notifies GSC that spares are on site and the new set of drives will be installed once the prep work is completed, with scheduled maintenance activity at 12:30 p.m.
10:15 a.m. | Prep work completed for drive replacement.
12:30 p.m. | CSR replaces drives and incident is closed.
12:35 p.m. | CSR notifies customer that internal drives have been replaced and the incident is closed.
12:35 p.m. | Business analyst is able to complete analysis and appropriately price promotional items.
Retailer "A" is able to successfully complete final analysis and price promotional items prior to the scheduled launch of the promotional marketing campaign.
Major Retailer "B"
System: 40 Node, 5400
Teradata Vital Infrastructure: Inactive
*This scenario assumes the customer is running an internally developed custom script every 10 minutes to check the fault log.
9:30 a.m. | Series of internal drives are failing and fault alerts are being sent to a fault log; notice sent to internal help desk: Vdisk path failed and marked "OUT OF SERVICE." No incident opened via Teradata Vital Infrastructure/Auto Incident Create (AIC). No page sent to Customer Support Rep (CSR).
9:40 a.m. | Internal help desk analyst, Anne, is alerted via the internal custom script and opens an internal incident.
9:45 a.m. | Anne manually submits incident to internal service technician.
9:55 a.m. | Internal service technician begins following internal procedures to research failure.
10:15 a.m. | Internal service technician remotely dials into Administrative Work Station (AWS). Teradata Customer Services is still not engaged at this point.
10:17 a.m. | Duplicate fault alert comes in and is routed to internal help desk analyst Bob. Bob is unaware of first alert and moves to manually open another incident.
10:20 a.m. | A third internal help desk analyst, Caitlin, receives an alert regarding a database problem. Collectively, the analysts realize there is a problem but do not know the incidents are related. Meanwhile, the failing disks are continuing to send alerts.
10:30 a.m. | Anne initiates conference call with Bob and Caitlin to engage Teradata Global Support Center (GSC).
10:45 a.m. | Business analyst, Tom, calls internal help desk regarding his unanswered queries and indicates he is looking for an estimated "fix" time for the problem. Tom also reiterates to the help desk the time urgency of the queries.
10:47 a.m. | Anne lets Tom know that the problem is being investigated as a priority and that the help desk will notify him as soon as it discovers the root of the problem and has an estimated fix time.
11:00 a.m. | Customer does not have a detailed report on where to start looking for the root of the problem; GSC and internal analysts are forced to start hunting for symptoms.
4:30 p.m. | It is discovered that internal disk drives have failed and need to be replaced.
4:45 p.m. | GSC is alerted that spares are on site and the new set of drives will be installed after prep work is completed. CSR schedules maintenance activity and change control for 7:30 p.m. that evening.
5:15 p.m. | Prep work is completed for drive replacement.
7:30 p.m. | Work begins on replacing the drives.
9:30 p.m. | Drives are replaced; incident is closed.
9:35 p.m. | Message is left for Tom that queries are ready to run.
Retailer "B" is able to complete analysis and promotional pricing by noon the following day. The marketing campaign had already launched at 6 a.m., which led to
embarrassment and a loss of customers and revenue for the retailer, not to mention frustrated consumers.
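The gap between the two timelines is easy to quantify. The Python sketch below uses only times listed in the scenario (the 9:30 a.m. fault and the point at which each analyst could run queries again); the date and the variable names are placeholders chosen for the example.

```python
from datetime import datetime

DAY = (2007, 3, 1)  # placeholder date; the scenario does not give one

fault_detected = datetime(*DAY, 9, 30)       # both retailers, 9:30 a.m.
retailer_a_ready = datetime(*DAY, 12, 35)    # Retailer A, 12:35 p.m.
retailer_b_ready = datetime(*DAY, 21, 35)    # Retailer B, 9:35 p.m.

# Time each business analyst was blocked by the disk failures.
outage_a = retailer_a_ready - fault_detected
outage_b = retailer_b_ready - fault_detected

print("Retailer A:", outage_a)               # 3:05:00
print("Retailer B:", outage_b)               # 12:05:00
print("Difference:", outage_b - outage_a)    # 9:00:00
```

In this scenario, Retailer A's analyst is blocked for about three hours and Retailer B's for about twelve.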
Aaron Dalton, a New York-based writer, covers business and technology for publications including Popular Mechanics, Wired News and Linux Executive Report.
Amy Rukamp is a Teradata Customer Services Marketing Manager with more than 10 years of product and marketing experience.
Teradata Magazine-March 2007