Examples Of Analytical functions present in Teradata.

Let’s work out some examples of OLAP functions that are being discussed in the previous articles. Before that make sure you have a table and there is data present init. Follow the below instruction to create a temporary table and insert data into it. While inserting data you might encounter one of the following errors:

3520:  A constant value in a query is not valid for column sale_month.

3535:  A character string failed conversion to a numeric value.

2665:  Invalid date.

All of these errors are due to the date value that you are inserting. If you encounter any of these errors, try permutation and combination of the date value and see if you can get rid of the error. If you still can not then find out from your sys admin what date format in allowed while you are inserting dates. Here in this example the format is ‘yyyy-mm-dd’. All of the below statements are tested in SQL assistant. Create a Volatile Table “Sales” as follows:

CREATE SET VOLATILE TABLE <username>.sales, NO FALLBACK,CHECKSUM = DEFAULT,LOG               (                “store_no” INTEGER,                “sale_month” DATE,                “prod_code” CHAR (20) CHARACTER SET LATIN NOT CASESPECIFIC,                “projected_sale” INTEGER,                “actual_sale” INTEGER                )PRIMARY INDEX ( “store_no” )ON COMMIT PRESERVE ROWS;

Insert data as follows:

INSERT INTO sales VALUES(10,’2007-01-01′,’Apple’,15000,13500);

INSERT INTO sales VALUES(20,’2007-02-01′,’Banana’,10000,11000);

INSERT INTO sales VALUES(30,’2007-03-01′,’Orange’,11000,10500);

INSERT INTO sales VALUES(40,’2007-04-01′,’Apricot’,12500,12500);

INSERT INTO sales VALUES(50,’2007-05-01′,’Grapes’,15000,12500);

INSERT INTO sales VALUES(60,’2007-06-01′,’Apple’,12000,11500);

INSERT INTO sales VALUES(70,’2007-07-01′,’Banana’,15500,11500);

INSERT INTO sales VALUES(80,’2007-07-01′,’Grapes’,15000,15500);

INSERT INTO sales VALUES(90,’2007-06-01′,’Orange’,16000,15500);

INSERT INTO sales VALUES(10,’2007-05-01′,’Apricot’,14500,13500);

INSERT INTO sales VALUES(20,’2007-04-01′,’Mango’,15500,12500);

INSERT INTO sales VALUES(30,’2007-03-01′,’Mango’,15000,12500);

INSERT INTO sales VALUES(40,’2007-02-01′,’Guava’,14000,11500);

INSERT INTO sales VALUES(50,’2007-01-01′,’Guava’,12000,10500);

INSERT INTO sales VALUES(30,’2007-02-01′,’Strawberry’,12000,10000);

Now take the example of AVG () function. As you know AVG () computes the cumulative or moving average of a column, we will calculate something similar in the following scenario.

Scenario is to find out the actual sales in each store, averaged over the current month and the preceding month. Here preceding month would be understood as the previous month to the current month, now what is current month, it is the first row returned after the result set is ordered after partitioning. Following is the query.

SELECT store_no, sale_month, actual_sale, projected_sale,AVG(actual_sale) OVER (PARTITION BY store_noORDER BY projected_saleROWS 1 PRECEDING)FROM sales;

And it would return the following result set.

store_no     sale_month    actual_sale    projected_sale  Moving Avg(actual_sale)10                5/1/2007       13,500          14,500          13,500.00

10                1/1/2007       13,500          15,000          13,500.00

20                2/1/2007       11,000          10,000          11,000.00

20                4/1/2007       12,500          15,500          11,750.00

30                3/1/2007       10,500          11,000          10,500.00

30                2/1/2007       10,000          12,000          10,250.00

30                3/1/2007       12,500          15,000          11,250.00

40                4/1/2007       12,500          12,500          12,500.00

40                2/1/2007       11,500          14,000          12,000.00

50                1/1/2007       10,500          12,000          10,500.00

50                5/1/2007       12,500          15,000          11,500.00

60                6/1/2007       11,500          12,000          11,500.00

70                7/1/2007       11,500          15,500          11,500.00

80                7/1/2007       15,500          15,000          15,500.00

90                6/1/2007       15,500          16,000          15,500.00

Note that the moving average number is changing every 2nd row in case there is a difference in between the 1st and 2nd row figure within the same store number. Particularly for the store number 30, you can see that there is a slight increase in the moving average in spite of good increase in actual sales. Projected Sales is also added just to have an idea how the actual sale is performing over the project sale.

Advertisement

  1. Rastislav

    The scripts did not work for me initially; I had to create MULTISET table as I was getting error due to duplicates… Is TD checking only PRIMARY INDEX value or whole row?

    I got this when I created SET table:
    Code = 2802.
    Statement 1 – 2802: Duplicate row error in RSTANA.sales.
    Output directed to Answerset window

    changing table to MULTISET worked

    When are you getting those useful info? I am new to TD and hungry for info. Thanks for useful blog! Keep it up!

  2. Rastislav

    one more note:

    Particularly for the store number 30, you can see that there is a slight increase in the moving average in spite of good increase in actual sales.

    sounds incorrect; I would say there is a [b]decrease[/b] in moving average despite increase in actual sales

  3. readvitamin

    Hi Rastislav,

    Thank you for leaving a reply.

    First let’s see what are SET and MULTISET.

    These two claues are used to create Table in Teradata with characterstics to
    allow duplicate or not. If you specify SET, which adheres to relational data model,
    it will not permit duplicate rows in a table. On the other hand, MULTISET tables
    follows multidimensional data model and allows duplicate rows in a table.

    So what happens if you do not use either….it depends on the session.

    If your current session is ANSI, then by default clause would be SET.
    Otherwise if the session is Teradata, the default clause would be MULTISET.

    Secondly, regarding “increase in the moving average”.

    When I said increase, I meant increase from last month. As you see the actual sales in Feb’07
    is 10,000 and the moving avg is 10,250 and the following months have 10,500 and 12,500 as actual sales and the MAVG is 10,500 and 11,250 respectively. As I have ordered the result set by “projected sales”
    the figures are a bit confusing. But I appreciate your minute observation. Ordering the results with “sale month” as in the following query, would give a clear picture. Hope this helps.

    SELECT store_no, sale_month, actual_sale, projected_sale,
    AVG(actual_sale) OVER (PARTITION BY store_no
    ORDER BY sale_month
    ROWS 1 PRECEDING)
    FROM sales;

  4. Rastislav

    How do I know if my session is ANSI or Teradata? Is there a way to change it? (ALTER SESSION or sthg. similar?)

  5. readvitamin

    Check with the ‘help session’ command
    and look for the value of
    Transaction Semantics.




Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s



Follow

Get every new post delivered to your Inbox.