Question:

You will create a Python script on HPC6 that will detect the presence of broken links on a given Web page. You will use Python requests and BeautifulSoup to help you accomplish this.

The script should reside at ~/bin/check-links. The script can be run using the following syntax.

check-links <url>

where <url> is a valid Web address, like https://bvu.edu/ or http://10.92.21.106:8008/. The latter Web address is the one your instructor would like you to use to test your code.

When run, the check-links script places a list of bad link text into the file ~/.links/bad-links.txt. If there are no bad links, the check-links script should ensure that ~/.links/bad-links.txt does not exist (that is, it may have to remove the file if it does exist).
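The write-or-remove rule above can be sketched with the standard library alone. This is a minimal sketch, not the required implementation: the function name record_bad_links and the one-link-text-per-line file layout are assumptions, since the assignment only says the file holds "a list of bad link text".

```python
from pathlib import Path

# Location named in the assignment.
DEFAULT_REPORT = Path.home() / ".links" / "bad-links.txt"


def record_bad_links(bad_link_texts, report_path=DEFAULT_REPORT):
    """Write the bad link texts to report_path, or remove it if there are none.

    Assumption: one link text per line; the assignment does not fix a format.
    """
    report_path = Path(report_path)
    if bad_link_texts:
        # Create the parent directory (e.g. ~/.links) on first use.
        report_path.parent.mkdir(parents=True, exist_ok=True)
        report_path.write_text("\n".join(bad_link_texts) + "\n", encoding="utf-8")
    else:
        # missing_ok=True (Python 3.8+) makes this a no-op when no report exists.
        report_path.unlink(missing_ok=True)
```

Calling record_bad_links([]) after a clean run removes any stale report, so a later check for the file's existence only succeeds when the most recent run actually found bad links.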

Links

Links in HTML files look like this.

The <a href="https://bvu.edu/academics/calendar/">BVU Academic Calendar</a> page.

Visit the <a href="/president/">Office of the President</a>.

From these examples, you can see that the link text (e.g., "BVU Academic Calendar" or "Office of the President") is surrounded by <a> and </a>, and the <a> tag includes within its angle brackets an href attribute that gives the actual link itself. Thus, clicking on "BVU Academic Calendar" will take the user to the Web page found at https://bvu.edu/academics/calendar/.

The actual link may have several forms. In the case of https://bvu.edu/academics/calendar/, the link is a full Web address which consists of the host (bvu.edu) and a path (/academics/calendar/). Another link form we may encounter is simply a path without a host. If there is no host given, a Web browser assumes the current host. In other words, if we were visiting a Web page located at bvu.edu, and we clicked on the "Office of the President" link, the browser would navigate us to https://bvu.edu/president/ by concatenating https://bvu.edu and /president/.
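One possible way to pull out each link's text and turn both link forms into a full Web address is sketched below, using BeautifulSoup for parsing and urljoin from the standard library for the host-less case. The function names extract_links and is_broken, the sample HTML string, and the rule that a link counts as broken when the request fails or returns a status of 400 or above are all assumptions; the assignment does not define "broken" precisely.

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def extract_links(html, page_url):
    """Return (link text, absolute URL) pairs for every <a> tag that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):
        text = anchor.get_text(strip=True)
        # urljoin leaves a full Web address unchanged and resolves a bare
        # path against the page the link was found on.
        links.append((text, urljoin(page_url, anchor["href"])))
    return links


def is_broken(url, timeout=10):
    """Assumed rule: broken means the request fails or the status is >= 400."""
    try:
        response = requests.get(url, timeout=timeout)
    except requests.RequestException:
        return True
    return response.status_code >= 400


# Hypothetical sample modelled on the two example links.
sample = (
    '<p>The <a href="https://bvu.edu/academics/calendar/">BVU Academic Calendar</a> page.</p>'
    '<p>Visit the <a href="/president/">Office of the President</a>.</p>'
)
pairs = extract_links(sample, "https://bvu.edu/")
# pairs[1] == ("Office of the President", "https://bvu.edu/president/")
```

Keeping the parsing separate from the network check means the link extraction can be tried on a literal HTML string before the script is pointed at the instructor's test address.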

Automating the Running of check-links

You will create a cron job that runs check-links at 7:00 a.m. every day. Then, in your ~/.profile, you will check to see if ~/.links/bad-links.txt exists. If it does, you will display its contents whenever you log in.
