Python: Scraping a Local Web Page with Beautiful Soup

In order to get related data from other web pages, some websites use web scraping to parse HTML files and extract information. Learn how to use the Beautiful Soup library to parse a web page.

Learn the HTML structure

To read the content of a web page, we need to understand some basic HTML tags.

Anatomy of an HTML Tag

Tag Name	Description
<html>	The root-level tag of an HTML document. It encapsulates all other HTML tags.
<head>	The head section of an HTML document that contains metadata about the page.
<title>	The title of the web page, to be displayed on the browser tab.
<body>	The body of an HTML document, with all displayed content.
<h1>	A level 1 heading, for example, the title of a news article.
<p>	A paragraph of displayed content.
<div>	A container for page elements that divides the HTML document into sections.
<a>	A hyperlink that links one page to another.
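
To see how these tags nest inside one another, here is a minimal sketch that uses only Python's standard-library html.parser module (not Beautiful Soup) to print an indented outline of a tiny page. The SAMPLE markup, the example.com link, and the TagOutline class name are assumptions made up for this illustration.

```python
from html.parser import HTMLParser

# A tiny, made-up document that uses every tag from the table above.
SAMPLE = """<html>
<head><title>Demo page</title></head>
<body>
<h1>A heading</h1>
<div><p>Some text with <a href="https://example.com">a link</a>.</p></div>
</body>
</html>"""


class TagOutline(HTMLParser):
    """Collect each opening tag together with its nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []  # list of (depth, tag name) pairs

    def handle_starttag(self, tag, attrs):
        # Record the tag at the current depth, then go one level deeper.
        self.outline.append((self.depth, tag))
        self.depth += 1

    def handle_endtag(self, tag):
        # A closing tag brings us back up one level.
        self.depth -= 1


parser = TagOutline()
parser.feed(SAMPLE)
parser.close()

# Print the outline, indenting each tag by its depth.
for depth, tag in parser.outline:
    print("  " * depth + f"<{tag}>")
```

The outline shows <html> at the top, <head> and <body> one level below it, and the <a> tag nested deepest inside <div> and <p>.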

Create a sample local web page

We will create a sample HTML page for the demo in this article.
Create CSS styles named redtext and leftmargin to use on the hyperlinks.
Add hyperlinks to the libraries and the directory of the University of Kentucky.
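
The page described above could look roughly like the sketch below, which writes the HTML to a local file from Python. The file name sample.html, the helper name write_sample_page, the exact CSS rules, and both example.com URLs are assumptions for illustration only; replace the URLs with the real University of Kentucky addresses.

```python
from pathlib import Path

# Assumed content: the CSS values and the two URLs are placeholders,
# not the original article's page.
SAMPLE_HTML = """<!DOCTYPE html>
<html>
<head>
<title>Sample page</title>
<style>
.redtext { color: red; }
.leftmargin { margin-left: 20px; }
</style>
</head>
<body>
<h1>University of Kentucky</h1>
<div>
<p><a class="redtext leftmargin" href="https://example.com/uk-libraries">Libraries</a></p>
<p><a class="redtext leftmargin" href="https://example.com/uk-directory">Directory</a></p>
</div>
</body>
</html>
"""


def write_sample_page(path="sample.html"):
    """Write the sample page to disk and return its Path (hypothetical helper)."""
    page = Path(path)
    page.write_text(SAMPLE_HTML, encoding="utf-8")
    return page


if __name__ == "__main__":
    print(f"Wrote {write_sample_page()}")
```

Opening the resulting file in a browser should show two red, indented links under the heading.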

Extract Information with Beautiful Soup

Import the Beautiful Soup library from the bs4 package (Beautiful Soup 4).
Open the local web page that we created.
Load it with Beautiful Soup's HTML parser to generate a BeautifulSoup object, which represents the document as a nested data structure.
Use find_all (spelled findAll in older code) to get the matching tags.
Print each <a> hyperlink tag and its text.
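
The steps above can be sketched as follows, assuming the bs4 package is installed and the local page was saved as sample.html. The file name and the helper name extract_links are assumptions, not part of the original article.

```python
from pathlib import Path

from bs4 import BeautifulSoup


def extract_links(html_path):
    """Return a (text, href) pair for every <a> tag in a local HTML file."""
    # Open the local page and parse it with Python's built-in HTML parser.
    with open(html_path, encoding="utf-8") as page:
        soup = BeautifulSoup(page, "html.parser")

    links = []
    # find_all("a") returns every hyperlink tag in document order.
    for anchor in soup.find_all("a"):
        links.append((anchor.get_text(strip=True), anchor.get("href")))
    return links


if __name__ == "__main__":
    page = Path("sample.html")  # assumed file name
    if page.exists():
        for text, href in extract_links(page):
            print(f"{text} -> {href}")
```

Returning the pairs as a list, instead of printing inside the loop, keeps the parsing step reusable when the same page needs to be processed further.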

Finally, you can now work on an offline web page to scrape its data content. Next, we will discuss scraping a live web page.

Take care and see you again.


Learn Scratch SG

learning scratch from singapore cool projects
