Question:

The goal of this assignment is to write a program that will scan a web page and harvest as many email addresses as possible. Many of these email addresses will be obfuscated in some way. Your job is to get the computer to figure out how to recognize the obfuscation and return a good result!

Here are some examples to get you started (in the form obfuscated email => what your program should interpret the email as):

mst3k@Virginia.EDU => mst3k@Virginia.EDU

thomas.jefferson@cs.virginia.edu => thomas.jefferson@cs.virginia.edu

mst3k at virginia.edu => mst3k@virginia.edu

mst3k at virginia dot edu => mst3k@virginia.edu

You can come up with regular expressions that will look for particular patterns in a line that could be an email address.
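As a rough sketch of what such a pattern could look like, the code below handles the four example forms listed above: a literal "@" or the word "at" between the local part and the domain, and a literal "." or the word "dot" between domain labels. The names EMAIL_PATTERN and find_emails_in_text are hypothetical and not part of the assignment, and the pattern is an assumption about which obfuscations to accept, not a complete solution.

```python
import re

# Hypothetical pattern (an assumption, not the assignment's official answer):
# local part, then "@" or the word "at", then domain labels joined by "."
# or by the word "dot".
EMAIL_PATTERN = re.compile(
    r"""
    (?P<local>[A-Za-z0-9_.+-]+)                    # local part, e.g. mst3k
    (?:@|[ \t]+at[ \t]+)                           # "@" or " at "
    (?P<domain>[A-Za-z0-9-]+                       # first domain label
        (?:(?:\.|[ \t]+dot[ \t]+)[A-Za-z0-9-]+)+)  # more labels via "." or " dot "
    """,
    re.IGNORECASE | re.VERBOSE,
)


def find_emails_in_text(text):
    """Return normalized addresses found in text, in order, without duplicates."""
    found = []
    for match in EMAIL_PATTERN.finditer(text):
        # Turn " dot " back into "." inside the domain.
        domain = re.sub(r"[ \t]+dot[ \t]+", ".", match.group("domain"),
                        flags=re.IGNORECASE)
        email = match.group("local") + "@" + domain
        if email not in found:
            found.append(email)
    return found


print(find_emails_in_text("mst3k at virginia dot edu"))
```

A pattern this loose will also produce false positives: ordinary prose such as "look at virginia.edu" would be reported as look@virginia.edu, so some tuning against the actual page is likely needed.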

Your program must implement the following function:

find_emails_in_website(url): This function takes as input a string representation of the URL of the website that you want to search. The page https://cs1110.cs.virginia.edu/emails.html has a set of example emails you should be able to find (and some that you can look for but that we are not requiring). This function should return a list of all of the valid email addresses that you find.

This is what I have so far, but for some reason it is not working:

import urllib.request
import re

stream = urllib.request.urlopen("https://cs1110.cs.virginia.edu/emails.html")
for line in stream:
    decoded = line.decode("UTF-8")
    print(decoded.strip())

xx = re.search(r'[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+', re.IGNORECASE)
xx.findall(stream)
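A few things in the attempt above are likely to be the cause. re.search expects the text to search as its second argument, so passing re.IGNORECASE there raises a TypeError. Even when called correctly, re.search returns a match object (or None), which has no findall method; a pattern built with re.compile does. Finally, stream yields bytes and has already been consumed by the for loop, so the search has to run on each decoded line inside the loop. The sketch below applies those three fixes while keeping the same pattern. It only finds plain addresses containing "@"; the obfuscated "at" and "dot" forms would need additional patterns.

```python
import re
import urllib.request

# Same pattern as the attempt, compiled once so that findall is available.
EMAIL_RE = re.compile(
    r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+", re.IGNORECASE
)


def find_emails_in_website(url):
    """Fetch url and return the plain (non-obfuscated) addresses it contains."""
    emails = []
    with urllib.request.urlopen(url) as stream:
        for line in stream:
            # Each line arrives as bytes; decode it before matching.
            decoded = line.decode("UTF-8")
            for email in EMAIL_RE.findall(decoded):
                if email not in emails:  # skip duplicates, keep page order
                    emails.append(email)
    return emails
```

Calling find_emails_in_website("https://cs1110.cs.virginia.edu/emails.html") should then return a list rather than raising an error, assuming the page is reachable and encoded as UTF-8.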
