Q4. Chapter 3: Data Pre-processing [ PLO S3/CLO 2.1/502] [6.5 marks] Redundancy is an important issue...
Question:
Q4. Chapter 3: Data Pre-processing [PLO S3/CLO 2.1/502] [6.5 marks]

Redundancy is an important issue in data integration. An attribute may be redundant if it can be "derived (obtained)" from another attribute or set of attributes. Some redundancies can be detected by correlation analysis between attributes. For nominal data, we use the χ² (chi-square) test, which assesses how one attribute's values vary from those of another.

1. What hypothesis does the χ² (chi-square) test assess? [1.5 marks]
Answer:

2. Considering the following table, compute the χ² value. [2 marks]

                            Male        Female        Sum (row)
Like science fiction        250 (90)    200 (360)     450
Not like science fiction    50 (210)    1000 (840)    1050
Sum (col)                   300         1200          1500

Values in parentheses are the expected counts.

Answer:
χ² = Σ (Observed − Expected)² / Expected
Expected count, e.g. for (Male, Like science fiction): 450 × 300 / 1500 = 90
χ² = (250 − 90)²/90 + (50 − 210)²/210 + (200 − 360)²/360 + (1000 − 840)²/840
   = 25600/90 + 25600/210 + 25600/360 + 25600/840
   = 284.444 + 121.905 + 71.111 + 30.476
   ≈ 507.94

3. Explain the meaning of a large χ² value. [1 mark]
Answer:

4. What is the alternate name used by the RapidMiner software tool to designate a Boxplot diagram? [2 marks]
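The χ² value asked for in part 2 can be checked with a short script. The following is a minimal sketch in plain Python, not part of the original exam sheet: the function name `chi_square` and the row/column ordering of `observed` are assumptions chosen for illustration. It derives each expected count as (row total × column total) / grand total and then sums (Observed − Expected)² / Expected over all cells.

```python
def chi_square(observed):
    """Return (chi-square statistic, expected counts) for a contingency table.

    `observed` is a list of rows; each row is a list of observed counts.
    The name and layout are illustrative assumptions, not from the source.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected count for a cell = row total * column total / grand total.
    expected = [
        [r * c / grand_total for c in col_totals]
        for r in row_totals
    ]

    # Sum (O - E)^2 / E over every cell of the table.
    stat = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )
    return stat, expected


# Rows: like / not like science fiction; columns: male, female.
observed = [
    [250, 200],
    [50, 1000],
]

stat, expected = chi_square(observed)
print(expected)        # [[90.0, 360.0], [210.0, 840.0]]
print(round(stat, 2))  # 507.94
```

For a 2 × 2 table the statistic has (2 − 1)(2 − 1) = 1 degree of freedom, so a value this large would normally lead to rejecting the hypothesis that the two attributes are independent.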
Expert Answer:
Related Book For
Probability And Statistics For Engineers And Scientists
ISBN: 9780495107576
3rd Edition
Authors: Anthony Hayter