Question:

When computing a cube of high dimensionality, we encounter the inherent curse of dimensionality problem: There exists a huge number of subsets of combinations of dimensions.

a. Suppose that there are only two base cells, \(\{(a_{1}, a_{2}, a_{3}, \ldots, a_{100}), (a_{1}, a_{2}, b_{3}, \ldots, b_{100})\}\), in a 100-D base cuboid. Compute the number of nonempty aggregate cells. Comment on the storage space and time required to compute these cells.

b. Suppose we are to compute an iceberg cube from (a). If the minimum support count in the iceberg condition is 2, how many aggregate cells will there be in the iceberg cube? Show the cells.
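
One way to sanity-check a count for parts (a) and (b) is to brute-force a small analog of the problem. The sketch below assumes the same setup scaled down: two base cells that share their first two values and differ in every other dimension, each covering one tuple. The helper names (`generalizations`, `cell_counts`, `count_aggregates`) are hypothetical and chosen only for this illustration. The closed form checked in the loop follows from counting: each base cell has \(2^{d}-1\) aggregate ancestors, and the two sets of ancestors overlap in exactly the 4 cells that have \(*\) in every differing dimension.

```python
from itertools import product

STAR = "*"

def generalizations(cell):
    """Yield every cell obtained by replacing any subset of values with '*'."""
    for mask in product((False, True), repeat=len(cell)):
        yield tuple(STAR if starred else v for v, starred in zip(cell, mask))

def cell_counts(base_cells):
    """Map each nonempty cell (base or aggregate) to the tuple count it covers."""
    counts = {}
    for cell, n in base_cells.items():
        for g in generalizations(cell):
            counts[g] = counts.get(g, 0) + n
    return counts

def two_base_cells(dims, shared=2):
    """Build (a1, a2, a3, ..., ad) and (a1, a2, b3, ..., bd)."""
    first = tuple(f"a{i}" for i in range(1, dims + 1))
    second = tuple(f"a{i}" if i <= shared else f"b{i}"
                   for i in range(1, dims + 1))
    return first, second

def count_aggregates(dims, min_support=1):
    """Count aggregate cells (at least one '*') meeting the support threshold."""
    first, second = two_base_cells(dims)
    counts = cell_counts({first: 1, second: 1})
    return sum(1 for cell, n in counts.items()
               if STAR in cell and n >= min_support)

# Brute force agrees with 2 * (2^d - 1) - 4 = 2^(d+1) - 6 on small cases,
# and only the 4 shared ancestors reach a support count of 2.
for d in (3, 5, 8):
    assert count_aggregates(d) == 2 ** (d + 1) - 6
    assert count_aggregates(d, min_support=2) == 4

# The same counting argument with d = 100 gives 2^101 - 6 aggregate cells.
print(2 ** 101 - 6)
```

Under these assumptions, the four cells that survive a minimum support of 2 are \((a_{1}, a_{2}, *, \ldots, *)\), \((a_{1}, *, *, \ldots, *)\), \((*, a_{2}, *, \ldots, *)\), and \((*, *, *, \ldots, *)\). Enumeration is only feasible for small \(d\); at 100 dimensions the count itself shows that materializing every cell is impractical in both space and time.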

c. Introducing iceberg cubes will lessen the burden of computing trivial aggregate cells in a data cube. However, even with iceberg cubes, we could still end up having to compute a large number of trivial uninteresting cells (i.e., with small counts). Suppose that a database has 20 tuples that map to (or cover) the two following base cells in a 100-D base cuboid, each with a cell count of 10: \(\{(a_{1}, a_{2}, a_{3}, \ldots, a_{100}): 10, (a_{1}, a_{2}, b_{3}, \ldots, b_{100}): 10\}\).

i. Let the minimum support be 10. How many distinct aggregate cells will there be like the following: \(\{(a_{1}, a_{2}, a_{3}, a_{4}, \ldots, a_{99}, *): 10, \ldots, (a_{1}, a_{2}, *, a_{4}, \ldots, a_{99}, a_{100}): 10, \ldots, (a_{1}, a_{2}, a_{3}, *, \ldots, *, *): 10\}\)?

ii. If we ignore all the aggregate cells that can be obtained by replacing some constants with \(*\)'s while keeping the same measure value, how many distinct cells remain? What are the cells?
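
Part (c) can be explored the same way on a scaled-down instance. The sketch below assumes a 5-D analog with two base cells of count 10 each, and it treats a cell as redundant when some more specific cell (one with constants in place of some of its \(*\)'s) has the same count. That is the usual reading of the condition in (ii), often called a closed cell. The function names are hypothetical, and the brute-force comparison is meant only for small inputs.

```python
from itertools import product

STAR = "*"

def generalizations(cell):
    """Yield every cell obtained by replacing any subset of values with '*'."""
    for mask in product((False, True), repeat=len(cell)):
        yield tuple(STAR if starred else v for v, starred in zip(cell, mask))

def cell_counts(base_cells):
    """Map each nonempty cell (base or aggregate) to the tuple count it covers."""
    counts = {}
    for cell, n in base_cells.items():
        for g in generalizations(cell):
            counts[g] = counts.get(g, 0) + n
    return counts

def is_specialization(child, parent):
    """True if child differs from parent only where parent has '*'."""
    return child != parent and all(
        p == STAR or p == c for c, p in zip(child, parent))

def closed_cells(base_cells, min_support):
    """Keep cells meeting min_support that no specialization matches in count."""
    counts = cell_counts(base_cells)
    return {
        cell: n for cell, n in counts.items()
        if n >= min_support and not any(
            counts[other] == n and is_specialization(other, cell)
            for other in counts)
    }

# Assumed 5-D analog of the 100-D example: two base cells, count 10 each.
base = {
    ("a1", "a2", "a3", "a4", "a5"): 10,
    ("a1", "a2", "b3", "b4", "b5"): 10,
}

counts = cell_counts(base)
# Part (i) analog: every aggregate cell has count >= 10, so the iceberg
# condition removes nothing and 2^(d+1) - 6 aggregate cells remain.
aggregates = [c for c, n in counts.items() if STAR in c and n >= 10]
assert len(aggregates) == 2 ** 6 - 6

# Part (ii) analog: only the two base cells and their most specific
# common ancestor survive.
for cell, n in sorted(closed_cells(base, 10).items()):
    print(cell, n)
```

By the same reasoning, the 100-D case would leave three cells: \((a_{1}, a_{2}, a_{3}, \ldots, a_{100}): 10\), \((a_{1}, a_{2}, b_{3}, \ldots, b_{100}): 10\), and \((a_{1}, a_{2}, *, \ldots, *): 20\), while part (i) would still yield \(2^{101}-6\) aggregate cells.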

Related Book

Data Mining Concepts And Techniques

ISBN: 9780128117613

4th Edition

Authors: Jiawei Han, Jian Pei, Hanghang Tong
