BITS WILP Data Warehousing Mid-Sem Exam 2017-H1
Data Warehousing (SSZG515) EC2R SEM2 2016-17 (Closed Book)
Nature of Exam : Closed Book
Data Warehousing (SSZG515) EC2R SEM2 2016-17 (Closed Book)
Birla
Institute of Technology & Science, Pilani
Work-Integrated
Learning Programmes Division
Second
Semester 2016-2017
Mid-Semester
Test
(EC-2
Regular)
Course No. : SS ZG515
Course Title : DATA WAREHOUSING
Nature of Exam : Closed Book
Weightage : 35%
Duration : 2 Hours
Date
of Exam : 26/02/2017 (AN)
No. of pages: 1
No. of quesetions: 6
Note:
1.
Please follow all the Instructions
to Candidates given on the cover page of the answer book.
2.
All parts of a question should be
answered consecutively. Each answer should start from a fresh page.
3.
Assumptions made if any, should
be stated clearly at the beginning of your answer.
Q.1. What
are the advantages and disadvantages of having finest granularity data in the
data warehouse and data marts? [6]
Q.2.
Why
it is recommended that we maintain separate systems for OLTP and Data
Warehousing in an organization? Does it make sense to have a single system?
Justify your answer. [6]
Q.3. Design
a data mart for analyzing attendance of students in courses in BITS - a
multi-campus university. Show all the dimensions involved and the arrange them
in a star schema. Show some examples tuples from fact table, assuming finest
granularity. [6]
Answer 3:
Data Warehouse Design
Steps:
Step
1: Identify the Business Process
Step
2: Declare the Grain
Step
3: Identify the Dimensions
Step
4: Identify the Facts
Q.4.
(a)
Explain how surrogate keys help in saving space in a Data Warehouse.
(b) Why there is
a need to store the natural keys along with the surrogate keys in the dimension
tables? [3 + 3 = 6]
Q.5.
Discuss
the role of views in a data warehouse environment. Give three situations where
we use views. [6]
Answer 5: (From L7-P1-Lec11)
Dimension
Role Playing:
l A single table
that plays multiple roles (using views) to create synonym dimension attributes.
l Most common role
playing dimension is the Date Dimension. I.e. separate role playing dimensions
for order date and ship date.
l What to do when
a single dimension appears several times in the same fact table?
l Consider a fact
table to record the status and final disposition of a customer order
l Dimensions of
this table could be Order Date, Packaging Date, Shipping Date, Delivery Date,
Payment Date, Return Date, Refer to Collection Date, Order Status, Customer,
Product, Warehouse, and Promotion.
l Note that the first
7 dimensions are all time
l 7 FKs from the
FT to the time dimension!!
l We can not join
these 7 FKs to the same table
l SQL would
interpret such a seven-way simultaneous join as requiring that all of the dates
be the same
l Is this what we
want?
l We need to make
SQL believe that there are 7 independent time dimension tables
l The column
labels in each of these tables should also be different!
l WHY?
l We will not be
able to tell the columns apart if several of them have been dragged into a
report
l How can we do
this?
l We cannot
literally use a single time table
l But we still
want to build & maintain single time table behind the scenes
l Create an
illusion for the user
l Make 7 identical
physical copies of the time table
l Make 7 “virtual”
copies using the SQL’s SYNONYM command
l Once these
clones are in place, we still need to define a SQL view on each copy in order
to make the field names uniquely different.
Use of ‘View’ as ‘Role
playing Dimension’:
…
Another
use of ‘Views’:
Figure
20-1 Cube and Views by Different Users
…
Q.6.
What
are mini-dimensions? Give two situations which warrant the use of
mini-dimensions. [5]
Answer
6:
Mini-Dimensions
¡ Single technique
to handle browsing-performance & change tracking problems
¡ Separate
out frequently analyzed or frequently changing attributes into a separate
dimension, called mini-dimension
¡
Demographic
Key AGE GENDER INCOME LEVEL
1 20-24 M < 20000
2 20-24 M 20K-24999
3 20-24 M 25K-29999
18 25-29 M 20K-24999
10 25-29 M 25K-29999
¡ Mini-dimension
can not be itself allowed to grow very large
¡ 5
demographic attributes
¡ Each
attribute can take 10 distinct values
¡ How
many rows in mini-dimension? Ans: 10,000
¡ Separate
out a package of demographic attributes into a demographic mini-dimension
¡ Age,
gender, marital status, no. of children, income level, etc.
¡ One
row in mini-dimension for each unique combination of these attributes.
¡ Example:
…
•
Advantages
– frequent ‘snapshoting’ of
customers profiles with no increase in data storage or data complexity
•
Drawbacks
– the demographic attributes are
clumped into banded ranges of discrete values (it is impractical to change the
set of value bands at a later time)
– the demographic dimension itself
can not be allowed to grow too large
– slows down the browsing
•
What if the fact table (connecting the demographic mini-dimension with the
customer dimension) is sparse?
•
What to do if the fact table (connecting the demographic mini-dimension with
the customer dimension) is sparse?
– Define a demographic transaction
event, i.e., introduce a new fact table
or
– Add a current demographic key to
the customer dimension table
********
Tags: BITS Pilani, Work Integrated Learning Program (WILP), Previous Year Question Papers, Data Warehousing
Tags: BITS Pilani, Work Integrated Learning Program (WILP), Previous Year Question Papers, Data Warehousing
No comments:
Post a Comment