Question:

This question requires knowledge on Data Analytics. Please show clear and detailed explanations.

Someone, please help to assist. I had to repost as the previous answers for parts (b) and (c) were not answered appropriately.

For details on descriptive and predictive data mining to answer (b) and (c), please refer to the screenshots below.

Shared e-scooters have emerged as an affordable means of transportation for short-distance trips. They also help alleviate traffic congestion by reducing the number of trips made by vehicles. However, e-scooter sharing service providers face numerous operational challenges. One of the challenges is to ensure that riders can always rent and park e-scooters at stations and that the e-scooters can be sufficiently charged before the next trip. Ideally, at the start of a trip, stations should have sufficiently charged e-scooters for riders to rent; at the end of a trip, stations should not be full so that riders can park the e-scooters. From an operational perspective, this requires service providers to send trucks to redistribute e-scooters from full stations to empty stations so as to have balanced stations. Some providers have been spending a lot of money to perform overnight charging and re-allocation of e-scooters among stations.

Assume that a dataset is collected from an e-scooter sharing service provider to perform data mining in an attempt to generate useful insights to solve the abovementioned rebalancing problem. The dataset contains details of each e-scooter trip made by individual riders. The variables in the dataset are described in Table 1.

a) Give one (1) example of data quality issues that may potentially exist in the dataset described in Table 1 and propose a solution for it. (20 marks)

b) Based on the dataset described in Table 1, give one (1) example of descriptive data mining and discuss how it might generate useful insights to solve the rebalancing problem. (20 marks)

c) Based on the dataset described in Table 1, give one (1) example of predictive data mining and discuss how it might generate useful insights to solve the rebalancing problem. (20 marks)

d) Suggest two (2) additional variables (not in Table 1) that could be included in the analysis for solving the rebalancing problem. Explain the rationale of their inclusion and describe how you can collect them (e.g., deriving from existing variables, integrating other data sources, etc). (20 marks)

e) When the rebalancing problem is not handled properly, one of the consequences is that some riders do not return the e-scooters to the designated stations if there is no empty parking slot at their desired end stations. Describe how this would bring negative impacts to the society.

Descriptive data mining involves exploration of patterns and relationships that may exist in data. The objective is to understand what happened in the past and gain insights from the past. Basic descriptive functions include summarisation, association and clustering (Köksal et al., 2011).

Summarisation is the presentation of general characteristics of a dataset. Descriptive statistics and graphical displays fall into this category. A variety of numerical measures, such as mean, median, mode, percentiles, range and standard deviation, are useful to summarise the data. For example, they can show information ranging from total stock inventory to the progress of sales figures over the years. Furthermore, data visualisation techniques can be used to reconfigure the data into easily interpretable forms such as pie charts, scatter plots and histograms, making it easier to uncover patterns that lead to insights. Summarisation can be viewed as a starting point to inform or prepare data for further analysis down the line.

Association is the identification of correlations among various variables from the dataset. Two variables are positively associated when the values of one increase with the values of the other. They are negatively associated when the values of one decrease as the values of the other increase. For instance, income and education are usually positively associated; student absenteeism is generally negatively associated with student achievement. Association can also be used to detect frequently occurring patterns between items.
Assuming that a database consists of a set of records which contain a set of "items", most association algorithms identify correlations based on the co-occurrence of the items. For example, they are useful in detecting what products (i.e., items) are frequently purchased together by customers in a supermarket.

Clustering is the grouping of data into classes of similar objects. The similarity among objects is usually measured by distance measures. The goal of clustering is that objects in a group will be similar to one another and different from objects in other groups (Agyapong et al., 2016). For instance, clustering can be used to divide customers into various groups based on historical spending behaviours. Each cluster may represent an individual target group for marketing.

In short, descriptive data mining focuses on what has already happened in the past. It is useful when one needs to describe at an aggregate level what is going on in business operations, learn from past behaviours and identify areas for business improvement.

Generally, major predictive data mining methods can be classified into three groups: statistical-based, decision tree-based and neural network-based methods. Statistical-based methods use classical techniques which depend on statistical theory. Regression analysis is a statistical methodology that is most often used for numeric prediction. Decision tree-based methods use a flowchart-like tree structure. Following a path of the tree generates "if-then" prediction rules that can be easily interpreted. Neural network-based methods consist of a set of connected input-output units, each having a weight which is updated by a learning algorithm. There are many other methods for constructing predictive models. This course focuses on decision trees for predictive data mining.

The term prediction refers to both numeric prediction (i.e., estimation) and class label prediction (i.e., classification).
The difference between classification and estimation lies in the type of the target variable (i.e., the variable to be predicted), which is either categorical or continuous.

Classification refers to the prediction of a target variable that is categorical in nature. Examples of classification include predicting whether a transaction is fraud or non-fraud, whether a customer is a buyer or a non-buyer, and whether a loan approval is high-risk, medium-risk or low-risk. In classification, a value of the target (e.g., fraud, non-fraud) is called a class. Classification is the process of finding a model that distinguishes classes. A classification model is derived based on analysis of a set of training data (i.e., data objects for which the class labels are known). It is used to predict the class label of objects for which the class label is unknown.

Estimation refers to the prediction of a target variable that is quantitative (i.e., continuous) in nature. Examples of estimation include predicting the amount a customer spends in an order and the duration of an international call a customer makes. The most common form of an estimation model is linear regression, in which the dependent variable is estimated at the point on the fitted line that corresponds to the values of the independent variables.
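
As a rough illustration of the estimation idea described above, the following minimal Python sketch fits a straight line by ordinary least squares and reads off the estimate at a new value of the independent variable. The variable names and the numbers are hypothetical toy values chosen for illustration only; they are not taken from Table 1, and a real analysis would use the dataset's actual variables and more than one predictor.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one independent variable.

    Returns (intercept, slope) for the line y = intercept + slope * x.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Estimate the dependent variable at the point on the fitted line for x."""
    return intercept + slope * x


# Hypothetical toy data (assumption, not from Table 1):
# rentals at one station in an hour vs. rentals in the following hour.
rentals_this_hour = [1, 2, 3, 4]
rentals_next_hour = [3, 5, 7, 9]

intercept, slope = fit_line(rentals_this_hour, rentals_next_hour)
estimate = predict(intercept, slope, 5)

# For this toy data: slope 2.0, intercept 1.0, estimate 11.0
print(f"slope={slope:.1f} intercept={intercept:.1f} estimate={estimate:.1f}")
```

In the rebalancing setting, an estimate of this kind (for example, expected rentals at a station in the next hour) could in principle indicate which stations are likely to run empty, although which predictors are actually available depends on Table 1.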
