Question: The program has to be in Python using regular expressions. Could you please help me find the problem in this code? I am trying to deidentify "names" and "emails" in a text file. So far the program deidentifies emails, but it does NOT deidentify any names that start with a prefix. Please help!

# This program removes names and email addresses occurring in a given input file and saves it in an output file.

import re

def deidentify():
    infilename = input("Give the input file name: ")
    outfilename = input("Give the output file name: ")

    infile = open(infilename,"r")
    text = infile.read()
    infile.close()

    # replace names
    nameRE = "(Ms\.|Mr\.|Dr\.|Prof\.) [A-Z](\.|[a-z]+) [A-Z][a-z]+"
    deidentified_text = re.sub(nameRE,"**name**",text)

    emailRE = "(\S*@\S*\S?)"
    deidentified_text = re.sub(emailRE, "**email**", text)

    outfile = open(outfilename,"w")
    print(deidentified_text, file=outfile)
    outfile.close()

deidentify()
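
Looking at the code as posted, the name pattern does appear to match; its result is just thrown away. The second re.sub call is given text (the original input) instead of deidentified_text, so the email pass starts over from the unmodified text and overwrites the name replacements. Below is a minimal sketch of one possible fix. It assumes a hypothetical helper, deidentify_text, that works on a string, so the file reading and writing are left out. The patterns are the ones from the question, written as raw strings so the backslashes reach the regex engine unchanged.

```python
import re

# Patterns copied from the question; the r"" prefix is the only change.
NAME_RE = r"(Ms\.|Mr\.|Dr\.|Prof\.) [A-Z](\.|[a-z]+) [A-Z][a-z]+"
EMAIL_RE = r"(\S*@\S*\S?)"


def deidentify_text(text):
    """Hypothetical helper (not in the original): replace names, then emails."""
    # First pass: names that start with one of the listed prefixes.
    result = re.sub(NAME_RE, "**name**", text)
    # Second pass runs on the output of the first pass, not on the
    # original text, so the name replacements are kept.
    result = re.sub(EMAIL_RE, "**email**", result)
    return result


sample = "Dr. John Smith wrote to Ms. A. Jones at jones@example.com yesterday."
print(deidentify_text(sample))
# prints: **name** wrote to **name** at **email** yesterday.
```

In the original function the equivalent change would be to pass deidentified_text as the last argument of the second re.sub call. Note also that the name pattern requires a prefix followed by two name parts (a first name or initial, then a last name), so something like "Dr. Smith" would not be replaced; whether that is intended depends on the assignment.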
