Germany's financial regulator has warned that the country's banking system is undergoing a real-life stress test
Question:
Germany's financial regulator has warned that the country's banking system is undergoing a real-life stress test amid the current
volatility, also predicting significant weakness for the commercial property sector. The banking sector has been under the
spotlight since March 2023, following the collapse of Silicon Valley Bank and the rescue of several other embattled lenders. Pressures
facing the sector have intensified as many central banks push up their benchmark rates, leading to specific market dislocations.
The Federal Financial Supervisory Authority said that the banking system "has taken some pain," but highlighted that there is "no
systemic danger" and that the financial system has managed to absorb the impact of higher rates well. Released data has shown
that in the euro zone, banks have started to tighten credit conditions, while borrowers have also demanded less credit. These
dynamics could translate into a further economic slowdown. With the development of financial consumption, demand for credit
has soared. Since the bank has detailed client data, it is important to build effective models to distinguish between high-risk
groups and low-risk groups. Progress in data mining and machine learning makes it possible to conduct accurate credit analysis.
Solve the experiment with classification algorithms in the WEKA tool, using the credit risk dataset provided. This dataset classifies
people described by a set of attributes as good or bad credit risks: class {good, bad}.
The credit risk data attribute information is given as below.
attribute : 'checking_status' { '<0', '0<=X<200', '>=200', 'no checking'}
attribute : 'duration'
attribute : 'credit_history' { 'no credits/all paid', 'all paid', 'existing paid', 'delayed previously', 'critical/other existing credit'}
attribute : 'purpose' { 'new car', 'used car', furniture/equipment, radio/tv, 'domestic appliance', repairs, education, vacation,
retraining, business, other}
attribute : 'credit_amount'
attribute : 'savings_status' { '<100', '100<=X<500', '500<=X<1000', '>=1000', 'no known savings'}
attribute : 'employment' { unemployed, '<1', '1<=X<4', '4<=X<7', '>=7'}
attribute : 'installment_commitment'
attribute : 'personal_status' { 'male div/sep', 'female div/dep/mar', 'male single', 'male mar/wid', 'female single'}
attribute : 'other_parties' { none, 'co applicant', guarantor}
attribute : 'residence_since'
attribute : 'property_magnitude' { 'real estate', 'life insurance', car, 'no known property'}
attribute : 'age'
attribute : 'other_payment_plans' { bank, stores, none}
attribute : 'housing' { rent, own, 'for free'}
attribute : 'existing_credits'
attribute : 'job' { 'unemp/unskilled non res', 'unskilled resident', skilled, 'high qualif/self emp/mgmt'}
attribute : 'num_dependents'
attribute : 'own_telephone' { none, yes}
attribute : 'foreign_worker' { yes, no}
The data instances: Download this file to get the data https://drive.google.com/file/d/1_7EjFd77t-_Bmkhzye9xYKpG3kDLqDFa/view?usp=sharing
*Reminder: PLEASE USE THE PROVIDED DATA IN THE LINK ONLY
Answer ALL of the following questions.
i. Convert the data into the ARFF file format with the relation name credit-risk. Then load the credit-risk.arff file that you have created into
the WEKA Explorer interface. [Please upload the ARFF file that you create as the attachment below]
**IMPORTANT: attach your ARFF file before proceeding to the next sub-question. If the ARFF file is not attached,
the answers to the subsequent sub-questions will not be counted.
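As a rough illustration of the ARFF layout, the sketch below builds a header from a small subset of the attributes listed above. It assumes that attributes given without a value list (such as 'duration') are numeric, and the helper names `quote` and `arff_header` are made up for this example. The data rows from the provided file would still have to follow the `@data` line.

```python
import re

# Subset of the attributes listed above; None marks an attribute that is
# ASSUMED to be numeric because no value list is given for it.
ATTRIBUTES = [
    ("checking_status", ["<0", "0<=X<200", ">=200", "no checking"]),
    ("duration", None),
    ("credit_amount", None),
    ("housing", ["rent", "own", "for free"]),
    ("class", ["good", "bad"]),
]


def quote(value):
    """Quote a value unless it is a plain word (letters, digits, underscore)."""
    return value if re.fullmatch(r"\w+", value) else "'" + value + "'"


def arff_header(relation, attributes):
    """Build the @relation / @attribute / @data header lines of an ARFF file."""
    lines = ["@relation " + quote(relation), ""]
    for name, values in attributes:
        if values is None:
            lines.append("@attribute " + quote(name) + " numeric")
        else:
            nominal = ",".join(quote(v) for v in values)
            lines.append("@attribute " + quote(name) + " {" + nominal + "}")
    lines += ["", "@data"]
    return "\n".join(lines)


header = arff_header("credit-risk", ATTRIBUTES)
print(header)
```

Values containing spaces or symbols are wrapped in single quotes, which is why the relation name and entries such as 'for free' appear quoted in the printed header.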
ii. How many attributes are there in the dataset? List all the attributes and the type of each attribute.
Attribute Types
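The attribute list can also be cross-checked outside the WEKA Explorer. The sketch below is a minimal, hypothetical parser that reads the `@attribute` lines of an ARFF header and labels each attribute as nominal (value list in braces) or by its declared type. The `SAMPLE` text is a made-up fragment, not the real file, and the regular expression only covers simple headers.

```python
import re


def attribute_types(arff_text):
    """Return (name, type) pairs from the @attribute lines of an ARFF header."""
    pattern = re.compile(r"@attribute\s+('[^']*'|\S+)\s+(.+)", re.IGNORECASE)
    result = []
    for line in arff_text.splitlines():
        match = pattern.match(line.strip())
        if match:
            name = match.group(1).strip("'")
            spec = match.group(2).strip()
            # A brace-delimited value list means the attribute is nominal.
            kind = "nominal" if spec.startswith("{") else spec.lower()
            result.append((name, kind))
    return result


# Made-up header fragment for illustration only.
SAMPLE = """@relation credit-risk
@attribute 'duration' numeric
@attribute 'housing' {rent,own,'for free'}
@attribute 'class' {good,bad}
@data
"""

types = attribute_types(SAMPLE)
print(len(types), "attributes")
for name, kind in types:
    print(name, kind)
```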
iii. Find the min, max, mean and standard deviation of each of the following attributes: 'duration', 'credit_amount',
'installment_commitment', 'existing_credits' and 'num_dependents'.
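For reference, the four statistics can be sketched in a few lines of Python. The values below are made up and are not taken from the provided dataset. The sketch assumes the sample standard deviation (divisor n - 1), which is what WEKA's Explorer is understood to report; treat that as an assumption and compare against the value shown in the Explorer.

```python
import statistics


def summarize(values):
    """Return min, max, mean and sample standard deviation of a numeric column."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        # statistics.stdev uses the n - 1 divisor (sample standard deviation).
        "stddev": statistics.stdev(values),
    }


# Made-up 'duration' values for illustration only.
durations = [6, 12, 24, 48]
summary = summarize(durations)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```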
iv. Run the classification algorithm called NaiveBayes (weka.classifiers.bayes.NaiveBayes). Use cross-validation to test its
performance, leaving the number of folds at the default value of 10. Recall that you can examine the classifier options in the
Generic Object Editor window that pops up when you click the text beside the Choose button. Set the batchSize to 300. This is the
preferred number of instances to process if batch prediction is being performed. What is the accuracy of Naïve Bayes?
*Note: the result must be given to four (4) decimal places.
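To make the terms concrete, the sketch below shows how k-fold cross-validation partitions the instances and how an accuracy percentage is formatted to four decimal places. It is a simplified illustration: WEKA also shuffles and stratifies the data before splitting, which this sketch omits, and the instance and correct-prediction counts are made up.

```python
def fold_indices(n_instances, n_folds=10):
    """Assign instance indices 0..n-1 to n_folds test folds of near-equal size."""
    folds = [[] for _ in range(n_folds)]
    for index in range(n_instances):
        folds[index % n_folds].append(index)
    return folds


def accuracy_percent(n_correct, n_total):
    """Accuracy as the percentage of correctly classified instances."""
    return 100.0 * n_correct / n_total


# Made-up numbers: 25 instances split into 10 folds, 17 classified correctly.
folds = fold_indices(25, 10)
print([len(fold) for fold in folds])
print(f"{accuracy_percent(17, 25):.4f} %")  # prints "68.0000 %"
```

Each fold serves once as the test set while the remaining folds are used for training, and the accuracy is computed over all test predictions combined.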
v. Run the classification algorithm from the functions group called LibSVM (weka.classifiers.functions.LibSVM). Use cross-validation to test its
performance, leaving the number of folds at the default value of 10. Recall that you can examine the classifier options in the
Generic Object Editor window that pops up when you click the text beside the Choose button. The default value of the batchSize
is 100. This is the preferred number of instances to process if batch prediction is being performed. The default setting for SVM Type
is nu-SVC (classification), while the default Kernel Type is the radial basis function. What is the accuracy of LibSVM?
*Note: the result must be given to four (4) decimal places.
vi. Run the classification algorithm from the functions group called MultilayerPerceptron (weka.classifiers.functions.MultilayerPerceptron). Use
cross-validation to test its performance, leaving the number of folds at the default value of 10. Recall that you can examine the
classifier options in the Generic Object Editor window that pops up when you click the text beside the Choose button. Set
hiddenLayers to 3 and batchSize to 300. The batchSize is the preferred number of instances to process if batch prediction is being
performed. What is the accuracy of Multilayer Perceptron?
*Note: the result must be given to four (4) decimal places.
vii. Run the classification algorithm called RandomForest (weka.classifiers.trees.RandomForest). Use cross-validation
to test its performance, leaving the number of folds at the default value of 10. Recall that you can examine the classifier options
in the Generic Object Editor window that pops up when you click the text beside the Choose button. Set the batchSize to 300.
This is the preferred number of instances to process if batch prediction is being performed. What is the accuracy of
RandomForest?
*Note: the result must be given to four (4) decimal places.
viii. Run and compare the performance of AdaBoostM1 (weka.classifiers.meta.AdaBoostM1) and SMO
(weka.classifiers.functions.SMO). For both algorithms, use cross-validation with the number of folds set to 5, 10
and 15. Please examine the classifier options in the Generic Object Editor window that pops up when you click the text beside
the Choose button. For both algorithms, the batchSize is set to 500. This is the preferred number of instances to
process if batch prediction is being performed. For AdaBoostM1, set the classifier to ADTree; for SMO, set the kernel
to RBFKernel and the calibrator to MultilayerPerceptron. What are the Accuracy and the Weighted Avg. of the TP
Rate, FP Rate, Precision and Recall for AdaBoostM1 and SMO with 5-, 10- and 15-fold cross-validation?
*Note: accuracy must be given to four (4) decimal places, and the Weighted Avg. of TP Rate, FP Rate, Precision and
Recall must be given to three (3) decimal places.
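The weighted averages requested here can be recomputed from a confusion matrix, which is a useful sanity check on the figures read from the WEKA output. The sketch below assumes rows are actual classes and columns are predicted classes, and that each per-class metric is weighted by the number of actual instances of that class. The counts are made up for illustration and are not results from the credit risk data.

```python
def weighted_metrics(confusion):
    """Weighted-average TP rate, FP rate, precision and recall.

    confusion[i][j] = number of instances of actual class i predicted as class j.
    Each per-class metric is weighted by the number of actual instances of the class.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    sums = {"tp_rate": 0.0, "fp_rate": 0.0, "precision": 0.0}
    for c in range(n_classes):
        tp = confusion[c][c]
        actual = sum(confusion[c])                       # TP + FN
        predicted = sum(row[c] for row in confusion)     # TP + FP
        fp = predicted - tp
        negatives = total - actual                       # FP + TN
        sums["tp_rate"] += actual * (tp / actual if actual else 0.0)
        sums["fp_rate"] += actual * (fp / negatives if negatives else 0.0)
        sums["precision"] += actual * (tp / predicted if predicted else 0.0)
    result = {name: value / total for name, value in sums.items()}
    result["recall"] = result["tp_rate"]  # recall is the same quantity as TP rate
    return result


# Made-up confusion matrix: rows = actual (good, bad), columns = predicted.
example = [[600, 100],
           [150, 150]]
metrics = weighted_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For this made-up matrix the weighted TP rate and recall come out at 0.750, the FP rate at 0.393 and the precision at 0.740.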