Dataset Description

You will get two datasets. train.csv is for training your model, and test.csv contains the information to predict. The submission has to be strictly in the format indicated in the sample_submission.csv.

Files

  • train.csv - the training set
  • test.csv - the test set

Download the dataset:

https://drive.google.com/drive/folders/105jPIlN8sK-lprLibpC6iEMkfv2K135x?usp=sharing

(Note that the submitted outcome must be the class probability, not a class label.)

Columns

Client information

  • id - client id (numeric)
  • age - age of client (numeric)
  • job - type of job (categorical: "admin.", "artisan", "entrepreneur", "housemaid", "management", "retired", "self-employed", "services", "student", "technician", "unemployed", "unknown")
  • civil - marital status of client (categorical: "divorced", "married", "single", "unknown"; note: "divorced" means divorced or widowed)
  • education - education of client (categorical: "4K", "6K", "K9", "K12", "illiterate", "apprenticeship", "university", "unknown")
  • credit - has credit in default? (categorical: "no", "yes", "unknown")
  • hloan - has housing loan? (categorical: "no", "yes", "unknown")
  • ploan - has personal loan? (categorical: "no", "yes", "unknown")

Campaign details

  • ctype - contact communication type (categorical: "cellular", "telephone")
  • month - last contact month of year (categorical: "jan", "feb", "mar", ..., "nov", "dec")
  • day - last contact day of the week (categorical: "mon", "tue", "wed", "thu", "fri")
  • ccontact - number of contacts performed during this campaign for this client (numeric, includes last contact)
  • lcdays - number of days since the client was last contacted in a previous campaign (numeric; 999 means the client was not previously contacted)
  • pcontact - number of contacts performed before this campaign for this client (numeric)
  • presult - outcome of the previous marketing campaign (categorical: "failure", "nonexistent", "success")
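The value 999 in lcdays is a sentinel code rather than a real day count, so a model that reads it as a number may be misled. One possible way to handle it is sketched below in R on a made-up toy data frame; the helper columns prev_contacted and lcdays_clean are hypothetical names, not part of the dataset.

```r
# Toy rows standing in for train.csv; the values are illustrative only.
toy <- data.frame(id = 1:4, lcdays = c(999, 3, 999, 6))

# Flag clients who were contacted in a previous campaign (hypothetical helper column).
toy$prev_contacted <- as.integer(toy$lcdays != 999)

# Replace the sentinel with NA so it is not read as "999 days" (hypothetical helper column).
toy$lcdays_clean <- ifelse(toy$lcdays == 999, NA, toy$lcdays)

print(toy)
```

Whether to keep the NA, impute it, or rely only on the flag depends on the model; tree-based methods and linear models tend to react differently to such codes.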

Socioeconomic indicators

  • employment - employment variation rate, quarterly indicator (numeric)
  • cprice - consumer price index, monthly indicator (numeric)
  • cconf - consumer confidence index, monthly indicator (numeric)
  • euri3 - euribor 3 month rate, daily indicator (numeric)
  • employees - number of employees, quarterly indicator (numeric)

Outcome variable (target)

  • outcome - has the client opened a savings account? (binary: 1 = "yes", 0 = "no")
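Many of the columns above are categorical, and in R they are usually modelled as factors. A minimal sketch, assuming you want train and test to share one level set per column; the toy data frames and the name job_levels are made up for illustration.

```r
# Toy stand-ins for train.csv and test.csv; values are illustrative only.
train_toy <- data.frame(job = c("admin.", "student", "retired"), stringsAsFactors = FALSE)
test_toy  <- data.frame(job = c("student", "unknown"), stringsAsFactors = FALSE)

# Build the level set from both files so both factors are coded identically.
job_levels <- sort(unique(c(train_toy$job, test_toy$job)))
train_toy$job <- factor(train_toy$job, levels = job_levels)
test_toy$job  <- factor(test_toy$job,  levels = job_levels)

print(levels(test_toy$job))
```

A category that appears only in the test set still carries no training information, so the model cannot learn an effect for it; the shared levels only keep the coding consistent.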

Model Evaluation

We will evaluate models using the area under the ROC curve (AUC).

AUC is commonly used to compare classifiers. It measures how well a model ranks positive cases above negative ones. The maximum value that can be achieved is 1 (perfect classifier). An AUC of 0.5 means the model performs no better than a random classifier, and an AUC below 0.5 means it performs worse than random. You can see the details of the grading evaluation in the grading tab.
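To make these three reference values concrete, here is a small base-R sketch that computes AUC through the rank-sum (Mann-Whitney) identity. The function name auc and the toy labels and scores are illustrative assumptions; this is not the grader's implementation.

```r
# AUC via the rank-sum (Mann-Whitney) identity; tied scores get average ranks.
auc <- function(labels, scores) {
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  r <- rank(scores)  # ties.method defaults to "average"
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy labels and scores, illustrative only.
y <- c(0, 0, 1, 1)
auc(y, c(0.1, 0.2, 0.8, 0.9))  # 1: every positive outranks every negative
auc(y, c(0.9, 0.8, 0.2, 0.1))  # 0: ranking fully reversed
auc(y, c(0.5, 0.5, 0.5, 0.5))  # 0.5: constant score carries no ranking information
```

Because AUC depends only on the ordering of the scores, it rewards well-ranked probabilities rather than hard 0/1 labels.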

Submission for Models

Submission files must be .csv files. Every customer in the given dataset has a unique customer ID under the Id column, which you can obtain from the test.csv file.

The file should contain a header and have the following format:

Id, outcome

103024, \hat{y}

where outcome is the predicted probability of being class 1 (opened a savings account) and id is the customer ID. You can combine your predictions with the test set id values, for example, using the commands

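submission <- data.frame(Id = test$id, outcome = my.prediction)  # name the columns to match the required header

write.csv(submission, file = "submission.csv", row.names = FALSE)  # omit the extra row-name column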

Kaggle matches each prediction to its id, so the error is calculated correctly even if you change the order of the test set. There is a submission example in the data section.
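As a rough end-to-end illustration of producing class probabilities and a file with the required header, the sketch below fits a logistic regression on simulated data. The choice of glm as a baseline, the single predictor age, and the simulated values are all assumptions made for the example; the task does not prescribe a model.

```r
set.seed(1)

# Simulated stand-ins for train.csv and test.csv; not the real data.
train_toy <- data.frame(age = round(runif(200, 18, 80)))
train_toy$outcome <- rbinom(200, 1, plogis(-3 + 0.05 * train_toy$age))
test_toy <- data.frame(id = 1001:1005, age = c(25, 40, 55, 70, 33))

# Logistic regression as a simple baseline (an assumption, not a prescribed method).
fit <- glm(outcome ~ age, data = train_toy, family = binomial)

# type = "response" returns P(outcome = 1) rather than log-odds or class labels.
my.prediction <- predict(fit, newdata = test_toy, type = "response")

# Build the submission with the required header and no row-name column.
submission <- data.frame(Id = test_toy$id, outcome = my.prediction)
write.csv(submission, file = file.path(tempdir(), "submission.csv"), row.names = FALSE)
```

The file is written to a temporary directory here only to keep the example side-effect free; for a real submission you would write submission.csv to your working directory.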
