Lab Scrape Emails
This lab is to be done on your own.
- Create a Python program named scrape_emails.py that does the following:
- Comments at the top of the module with your name, date, and description. 1 point.
- Put this website URL into a variable: http://ridgewater.edu/contact-us/staff-directory/ 1 point.
- Set up the context that is needed to read https pages (secure pages). 2 points.
- Set up the headers dictionary. 1 point.
- Use urllib.request.Request to request a website. 1 point.
- Use urllib.request.urlopen to open the website. 1 point.
- Use '(mailto:)([a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)'
as the regular expression to find all email addresses. 2 points.
- Open a new file named 'email_addresses.txt'. 1 point.
- Iterate through the regular expression matches to extract the individual email addresses. 2 points.
- Write the email addresses to the screen. 2 points.
- Write the email addresses to the text file. 5 points.
- Close the text file. 1 point.
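The steps above could be organized roughly as in the sketch below. It is only one possible layout, not the required solution. The function names (fetch_html, extract_emails, save_emails), the User-Agent value, and the use of ssl.create_default_context() are assumptions made for illustration; the lab does not specify them. Because the regular expression has two groups, re.findall returns (prefix, address) pairs, so the sketch keeps only the second item of each pair.

```python
import re
import ssl
import urllib.request

URL = "http://ridgewater.edu/contact-us/staff-directory/"
EMAIL_PATTERN = r'(mailto:)([a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)'


def fetch_html(url):
    """Download a page and return its text (this makes a network request)."""
    # Assumption: a default, certificate-verifying context is what the lab wants.
    context = ssl.create_default_context()
    # Assumption: a browser-like User-Agent; the lab does not give header values.
    headers = {"User-Agent": "Mozilla/5.0"}
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, context=context) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_emails(html):
    """Return the address part (group 2) of every mailto: match."""
    # findall yields (group 1, group 2) tuples because the pattern has two groups.
    return [address for _prefix, address in re.findall(EMAIL_PATTERN, html)]


def save_emails(emails, path="email_addresses.txt"):
    """Print each address and write it to the text file, one per line."""
    out_file = open(path, "w")
    for email in emails:
        print(email)
        out_file.write(email + "\n")
    out_file.close()


# Small offline check of the regular expression on made-up HTML.
sample_html = (
    '<a href="mailto:jane.doe@example.edu">Jane</a> '
    '<a href="mailto:sam@example.edu">Sam</a>'
)
sample_emails = extract_emails(sample_html)
```

To run against the real page, call save_emails(extract_emails(fetch_html(URL))). The sample at the bottom only exercises the regular expression, so it works without a network connection.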
Submit all these files to the D2L dropbox:
- The scrape_emails.py code file.
- The email_addresses.txt file.
- A screenshot in .jpg or .png format showing the start of the program's output.
- A screenshot in .jpg or .png format showing the end of the program's output.
The image below shows an example start run of the program.
The image below shows an example end run of the program.