Question: I want to add another step into the following code: remove duplicate URLs.

from bs4 import BeautifulSoup
from urllib.request import urlopen
from urllib.parse import urljoin
import csv

my_url = 'https://www.census.gov/programs-surveys/popest/about/schedule.html'

# opening up connection, grabbing the page
page = urlopen(my_url)

# html parsing
soup = BeautifulSoup(page, 'html.parser')

# save as csv file
with open('index.csv', 'w') as csv_file:
    writer = csv.writer(csv_file)
    for link in soup.find_all('a', href=True):
        url = link.get('href')
        url = urljoin(my_url, url)
        print(url)
        writer.writerow([url])

I am trying to add this part:

# remove duplicate links
file = open('index.csv', 'w')
links = {}
for link in soup.find_all('a', href=True):
    url = link.get('href')
    url = urljoin(my_url, url)
    if url not in links:
        file.write("%s " % url)
        links[url] = True
file.close()

It doesn't seem to be working. I want to find all links on the page, convert relative links to absolute URLs, remove duplicate links, and save them as a CSV file.
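Two things keep the second snippet from doing what is wanted: opening 'index.csv' in 'w' mode a second time truncates whatever the csv.writer loop already wrote, and file.write("%s " % url) produces one space-separated line rather than CSV rows. Instead of a second pass, the de-duplication can be folded into the original loop by remembering which URLs have already been written. Below is a minimal sketch of that idea; it keeps the same URL and filename from the question, and the set name "seen" is just an illustrative choice:

from bs4 import BeautifulSoup
from urllib.request import urlopen
from urllib.parse import urljoin
import csv

my_url = 'https://www.census.gov/programs-surveys/popest/about/schedule.html'

# open the connection and parse the page
page = urlopen(my_url)
soup = BeautifulSoup(page, 'html.parser')

seen = set()  # URLs already written, used to skip duplicates

with open('index.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    for link in soup.find_all('a', href=True):
        url = urljoin(my_url, link['href'])  # make relative links absolute
        if url not in seen:                  # write each URL only once
            seen.add(url)
            print(url)
            writer.writerow([url])

A set is enough here because only membership is needed (the dict with True values in the original attempt works the same way, just less idiomatically), and passing newline='' when opening the file is the csv module's recommended way to avoid blank rows on Windows.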
