Question:

Overview

Machine learning methods are used effectively to detect malicious websites. In this assignment, you are required to classify malicious websites using the provided dataset (malicious_and_benign_websites1.csv). The features have been extracted and clearly structured in CSV format, as summarized in Table 1.

Table 1: Feature descriptions for the malicious and benign websites dataset

COLUMN NAME                  DESCRIPTION
URL                          the anonymous identification of the URL analyzed in the study
URL_LENGTH                   the number of characters in the URL
NUMBER_SPECIAL_CHARACTERS    the number of special characters identified in the URL, such as /, %, . and &
CHARSET                      the character encoding standard (also known as the character set)
SERVER                       the operating system of the server, obtained from the packet response
CONTENT_LENGTH               the content size of the HTTP header
WHOIS_COUNTRY                the country of the server
WHOIS_STATEPRO               the state of the country of the server (if known)
WHOIS_REGDATE                the server date and time
WHOIS_UPDATED_DATE           the last update of the server
TCP_CONVERSATION_EXCHANGE    the number of TCP packets exchanged between the server and our honeypot client
DIST_REMOTE_TCP_PORT         the number of the ports detected and different to TCP
REMOTE_IPS                   the total number of IPs connected to the honeypot
APP_BYTES                    the number of bytes transferred
SOURCE_APP_PACKETS           packets sent from the honeypot to the server
REMOTE_APP_PACKETS           packets received from the server
APP_PACKETS                  the total number of IP packets generated during the communication between the honeypot and the server
DNS_QUERY_TIMES              the number of DNS packets generated during the communication between the honeypot and the server
TYPE                         the class label: one value is for malicious websites and another is for benign websites

Problem Statement

This is an individual assessment task. Each student is required to submit a report of approximately 1000 words, along with exhibits to support findings with respect to the provided malicious and benign websites.

This report should consist of:

- Literature review of malicious website detection
- Construction of datasets, data pre-processing and features
- Workflow of malicious website detection, describing the process of conducting malicious website detection
- Technical findings of classification results
- Justified discussion of the performance evaluation outcomes for different classifiers
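
The last report item asks for a justified comparison of classifiers. As a rough sketch of how such a comparison could be tabulated, the snippet below computes accuracy, precision, recall and F1 from true and predicted TYPE labels using only the Python standard library. Everything here is an assumption made for illustration: the function name binary_metrics, the classifier names, the choice of 1 as the "malicious" label (the label values are not stated in the question text), and the tiny made-up label lists, which are not results from the actual dataset.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Return accuracy, precision, recall and F1 for one set of predictions.

    `positive` is the label treated as "malicious" (assumed to be 1 here).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    # Count the confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = len(y_true) - tp - fp - fn

    # Guard against division by zero when a classifier never predicts
    # the positive class or no positive samples exist.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels and predictions, made up for illustration only.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
predictions = {
    "classifier_a": [1, 0, 0, 0, 0, 0, 1, 0],  # misses one malicious site
    "classifier_b": [1, 1, 0, 1, 0, 1, 1, 0],  # flags two benign sites
}

results = {name: binary_metrics(y_true, y_pred)
           for name, y_pred in predictions.items()}
for name, metrics in results.items():
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

Reporting precision and recall alongside accuracy is a deliberate choice in this sketch: if malicious sites are a minority of the rows, accuracy alone can look high even for a classifier that misses most of them, so the discussion section is easier to justify with all four numbers per classifier.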
