Question: need only conclusions

values, and as for the second, we used the operator Replace Missing Values, which replaced the missing instances with the average value. After the completion of these steps, we got the final data set of 120,269 instances with no missing values.

From the original attributes shown in Table 1 we derived several new attributes, in order to inspect whether some aggregated attributes can provide more information to the algorithms than the original ones. These are listed in Table 2.

Table 2. Derived attributes

Variable name        Description
Debt                 MonthlyIncome * DebtRatio
CumulativeLateness   Sum of all lateness attributes
WagesPerCapita       MonthlyIncome / NumberOfDependents
AgeDiscretized       Age is discretized into three categories: Young (under 35), Middle age (between 35 and 60), Old (above 60 years of age)

2.2. Algorithms

Because of the imbalanced nature of the data set we evaluated several ensemble algorithms, since they often show good performance on this kind of data. Additionally, we compared their performance with single algorithms. Brief descriptions of the algorithms used in this study are given in Table 3.

Table 3. Types of used algorithms

Type      Algorithm          Description
Ensemble  Random Forest      Learns a set of random trees, i.e. for each split only a random subset of attributes is available. The resulting model is a voting model of all trees.
Ensemble  Bayesian Boosting  Trains an ensemble of classifiers for boolean target attributes. In each iteration the training set is reweighted so that previously discovered patterns and other kinds of prior knowledge are "sampled out".
Ensemble  MetaCost           Uses a given cost matrix to compute label predictions according to classification costs.
Ensemble  Bagging            Bagging operator allowing all learners.
Single    Naive Bayes        Returns a classification model using estimated normal distributions.
Single    Decision Tree      Generates decision trees to classify nominal data.
Single    Perceptron         A single perceptron finding a separating hyperplane if one exists.

2.3. Evaluation

Because of the high class imbalance and the binary class type, classification accuracy is not suitable for measuring the performance of the algorithms, since it gives misleading results on this type of data. In our case, classification accuracy would be 93.05% for the majority algorithm, i.e. if the algorithm always predicts the negative case (the person will return the credit). Even though this accuracy is high, it does not reflect the success of the model, since in this case it is more important to predict the positive cases (identify persons who will not return the credit).

This is the reason we used the Area Under the Curve (AUC) measure for algorithm evaluation. It represents the area under the Receiver Operating Characteristic (ROC) curve, a graphical plot which illustrates the performance of a binary classifier as its discrimination threshold is varied. The ROC curve is created by plotting the fraction of true positives out of the total actual positives (TPR, the true positive rate) against the fraction of false positives out of the total actual negatives (FPR, the false positive rate) at various threshold settings. The true positive rate, or recall, is calculated as the number of true positives divided by the total number of positives. The false positive rate is calculated as the number of false positives divided by the total number of negatives.
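As a side illustration (not part of the original study), the accuracy figure and the evaluation measures above can be reproduced from the class counts that appear later in Table 4. The following minimal Python sketch assumes scikit-learn is available and uses a toy constant prediction purely for illustration.

# Minimal sketch (assumption: scikit-learn; the constant prediction is illustrative only).
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

# Class counts of the cleaned data set, taken from the sample column in Table 4.
n_neg, n_pos = 111911, 8357
y_true = np.array([0] * n_neg + [1] * n_pos)

# A "majority" model that always predicts the negative class (credit will be returned).
y_majority = np.zeros_like(y_true)
print("Majority-class accuracy:", round((y_majority == y_true).mean(), 4))  # ~0.9305

# The same constant score gives an uninformative AUC of 0.5.
print("Majority-class AUC:", roc_auc_score(y_true, np.zeros(len(y_true))))

# TPR = TP / (TP + FN) and FPR = FP / (FP + TN); here both are 0, i.e. no defaulter is ever identified.
tn, fp, fn, tp = confusion_matrix(y_true, y_majority).ravel()
print("TPR:", tp / (tp + fn), "FPR:", fp / (fp + tn))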
2.4. Experiments and results

In the first process, we evaluated the performance of all algorithms on all available attributes. AUC is measured based on 10-fold cross-validation with stratified sampling. Stratified sampling builds random subsets and ensures that the class distribution in the subsets is the same as in the whole data set; in our case of binominal classification, each subset contains roughly the same proportions of the two class label values. Stratification is the process of dividing the members of the population into homogeneous subgroups before sampling; afterwards, simple random sampling or systematic sampling is applied within each stratum. This often improves the representativeness of the sample by reducing sampling error, and it enables a correct interpretation of the AUC values.

Imbalanced class distributions can lead the algorithms to learn overly complex models that overfit the data and have little relevance. It is known that the class imbalance problem can be successfully addressed with under-sampling techniques, so in this experiment we evaluated every algorithm on different sub-samples.

Table 4. Evaluation of algorithms on the basic data set (AUC per sub-sample; each sub-sample is given as class 0 count / class 1 count)

Algorithm                            111911/8357   50000/8357   30000/8357   8357/8357       5000/8357   2000/8357
Random Forest                        0.792         0.5          0.500        0.575 (0.667)   0.642       0.0
Bayesian Boosting (Decision Stump)   0.5           0.570        0.570        0.771           0.715       0.5
MetaCost (Decision Stump)            0.511         0.571        0.648        0.688           0.775       0.697
Bagging (Decision Stump)             0.615         0.505        0.570        0.739           0.626       0.748
Naive Bayes                          0.801         0.797        0.797        0.795           0.797       0.789
Decision Tree                        0.719         0.500        0.500        0.773           0.715       0.701
Perceptron                           0.812         0.813        0.814        0.812           0.737       0.775

After the experiment with the basic attributes we started another analysis, this time utilizing the derived attributes, hoping to get better results. These results are presented in Table 5.

Table 5. Evaluation of algorithms on the data set that contained the derived attributes (AUC per sub-sample; each sub-sample is given as class 0 count / class 1 count)

Algorithm           111911/8357   50000/8357   30000/8357   8357/8357       5000/8357   2000/8357
Random Forest       0.803         0.5          0.5          0.593           0.794       0.5 (0.737, 0.810)
Bayesian Boosting   0.502         0.631        0.664        0.789 (0.708)   0.737       0.5
MetaCost            0.54          0.587        0.657        0.758           0.757       0.776
Bagging             0.513         0.579        0.625        0.692           0.743       0.579
Naive Bayes         0.70          0.789        0.782        0.782 (0.68)    0.784       0.789
Decision Tree       0.5           0.5          0.546        0.753           0.753       0.753
Perceptron          0.809         0.809        0.807        0.809           0.811       0.81 (0.756, 0.803)

It can be seen from Table 4 that the different samples influence algorithm performance and that the Perceptron gives the best results in all cases (around 0.81). The Perceptron is a type of artificial neural network; it can be seen as the simplest kind of feed-forward neural network, a linear classifier. From this we can conclude that the Perceptron successfully handles the imbalanced data problem on all sub-samples, which indicates that it could be trained on smaller sub-samples in order to reduce the time cost without worsening the performance.

It can also be noticed from Table 4 that the ensemble algorithms showed their best performance when the data is completely balanced (8,357 negative and 8,357 positive cases). As expected, the Decision Tree algorithm showed the worst results, especially when the data is imbalanced (111,911 negative and 8,357 positive cases).

In the second experiment we used feature weighting techniques (Evolutionary and PSO) in order to identify
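The sub-sampling and stratified 10-fold cross-validation setup behind Tables 4 and 5 could be sketched roughly as follows. This is an illustrative reconstruction in Python rather than the authors' RapidMiner process: the DataFrame layout and the label column name SeriousDlqin2yrs are assumptions, and the Perceptron is used here only as one of the evaluated learners.

# Illustrative reconstruction (not the original RapidMiner process). Assumes a numeric
# pandas DataFrame with a binary label column; the column name is a guess.
import pandas as pd
from sklearn.linear_model import Perceptron
from sklearn.model_selection import StratifiedKFold, cross_val_score

def auc_on_subsample(data, n_negative, label="SeriousDlqin2yrs", seed=42):
    """Under-sample the majority (negative) class to n_negative cases, keep all positive
    cases, and return the mean AUC from stratified 10-fold cross-validation."""
    positives = data[data[label] == 1]
    negatives = data[data[label] == 0].sample(n=n_negative, random_state=seed)
    sample = pd.concat([positives, negatives])
    X, y = sample.drop(columns=[label]), sample[label]
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    return cross_val_score(Perceptron(), X, y, scoring="roc_auc", cv=cv).mean()

# Negative-class sizes as in the table headers; all 8,357 positive cases are always kept.
# for n in [111911, 50000, 30000, 8357, 5000, 2000]:
#     print(n, auc_on_subsample(credit_data, n))   # credit_data is a hypothetical DataFrame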
