This blog post is about understanding CSS selectors for web scraping, and about tools that come in handy alongside the Python beautifulsoup and lxml libraries.

📌Note: This blog post is not a complete CSS selectors reference, but a mini-guided tour of frequently used types of selectors and how to work with them.


pip install lxml beautifulsoup4

Basic familiarity with the bs4 library, or whichever HTML parser package/framework you're using.

CSS selectors are used in much the same way across different languages, frameworks, and packages.

What are CSS Selectors

CSS selectors are patterns used to select (match) the element(s) you want to style or, in the case of web scraping, extract.


Let's start with the easy one, the SelectorGadget extension. It lets you quickly grab CSS selector(s) by clicking on the desired element in your browser, and returns the matching selector(s).

SelectorGadget is an open-source tool that makes CSS selector generation and discovery on complicated sites a breeze.

Use cases:
  • for web page scraping with tools such as Nokogiri and BeautifulSoup.
  • to generate jQuery selectors for dynamic sites.
  • as a tool to examine JavaScript-generated DOM structures.
  • as a tool to help you style only particular elements on the page with your stylesheets.
  • for selenium or phantomjs testing.

When using SelectorGadget, it highlights element(s) in:

  • yellow, which means that it's guessing what the user is looking for and may need additional clarification.

  • red, which excludes the element from the selection.

  • green, which includes the element in the selection.

Pick CSS Selectors by Hand

Now is the time to think a little, just a little. Since SelectorGadget isn't a magical all-around tool, sometimes it can't get the desired element. This happens when the website's HTML tree is not well structured, or when the site is rendered via JavaScript.

When that happens, I use the Elements tab in Dev Tools (F12 on the keyboard) to locate HTML elements and grab their CSS selector(s) by:

  • type selector: input
  • class selector: .class
  • id selector: #id
  • attribute selector: [attribute]

Type Selectors

Syntax: element_name

Type selectors match elements by node name. In other words, a type selector selects all elements of the given type within an HTML document.

soup.select('a')       # returns all <a> elements
soup.select('span')    # returns all <span> elements
soup.select('input')   # returns all <input> elements
soup.select('script')  # returns all <script> elements
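As a minimal, self-contained sketch of a type selector in action: the HTML below is made up for illustration, and the built-in "html.parser" is used instead of lxml only so that the snippet needs nothing beyond bs4.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, invented for this example.
html = """
<div>
  <a href="/first">First</a>
  <span>label</span>
  <a href="/second">Second</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

links = soup.select("a")            # list with every <a> element
first_link = soup.select_one("a")   # only the first match (or None)

print([a["href"] for a in links])
print(first_link.text)
```

select() always returns a list (possibly empty), while select_one() returns a single element or None.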

Class Selectors

Syntax: .class_name

Class selectors match elements based on the contents of their class attribute. It reads a bit like calling a method on a class: PressF().when_playing_cod().

soup.select('.mt-5')                   # returns all elements with the mt-5 class
soup.select('.crayons-avatar__image')  # returns all elements with the crayons-avatar__image class
soup.select('.w3-btn')                 # returns all elements with the w3-btn class
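A small sketch of class matching, assuming made-up HTML (only the w3-btn class name comes from the examples above; "primary" is hypothetical). It also shows that class selectors can be chained to require several classes on the same element.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, invented for this example.
html = """
<button class="w3-btn primary">Save</button>
<button class="w3-btn">Cancel</button>
<p class="primary">Note</p>
"""

soup = BeautifulSoup(html, "html.parser")

# Every element whose class attribute contains "w3-btn".
buttons = [el.text for el in soup.select(".w3-btn")]

# Chained classes: the element must have BOTH classes.
primary_buttons = [el.text for el in soup.select(".w3-btn.primary")]

print(buttons)
print(primary_buttons)
```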

ID Selectors

Syntax: #id_value

ID selectors match an element based on the value of the element's id attribute. For the element to be selected, its id attribute must exactly match the value given in the selector.

soup.select('#eob_16')              # returns all elements with id="eob_16"
soup.select('#notifications-link')  # returns all elements with id="notifications-link"
soup.select('#value_hover')         # returns all elements with id="value_hover"
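Since an id is normally unique on a page, select_one() is often a natural fit here. A minimal sketch, assuming invented HTML (the settings-link and does-not-exist ids are hypothetical):

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, invented for this example.
html = """
<nav>
  <a id="notifications-link" href="/notifications">Notifications</a>
  <a id="settings-link" href="/settings">Settings</a>
</nav>
"""

soup = BeautifulSoup(html, "html.parser")

link = soup.select_one("#notifications-link")   # exact id match
missing = soup.select_one("#does-not-exist")    # no such id on the page

print(link["href"])
print(missing)
```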

Attribute Selectors

Syntax: [attribute=attribute_value] or [attribute], more examples.

Attribute selectors match elements based on the presence or value of a given attribute.

The only difference is that this selector uses square brackets [] instead of a dot (.) as for a class, or a hash (or octothorpe) symbol (#) as for an ID.

soup.select('[jscontroller="K6HGfd"]')          # returns all elements whose jscontroller attribute equals "K6HGfd"
soup.select('[data-ved="2ascASqwfaspoi_SA8"]')  # returns all elements whose data-ved attribute equals "2ascASqwfaspoi_SA8"

# elements with an attribute name of data-id
soup.select('[data-id]')                        # returns all elements that have a data-id attribute, whatever its value
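The sketch below illustrates presence and exact-value matching, plus two standard CSS operators not covered above: ^= (value starts with) and *= (value contains). The HTML and the data-type attribute are assumptions made up for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, invented for this example.
html = """
<div data-id="1" data-type="album">A</div>
<div data-id="2" data-type="single">B</div>
<div data-type="album-deluxe">C</div>
"""

soup = BeautifulSoup(html, "html.parser")

has_id = [el.text for el in soup.select("[data-id]")]                 # attribute is present
exact = [el.text for el in soup.select('[data-type="album"]')]        # value equals "album"
prefix = [el.text for el in soup.select('[data-type^="album"]')]      # value starts with "album"
contains = [el.text for el in soup.select('[data-type*="ing"]')]      # value contains "ing"

print(has_id, exact, prefix, contains)
```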

Selector List

Syntax: element, element, element, ...

A selector list selects all nodes (elements) that match any of the comma-separated selectors. From a web scraping perspective this is great (in my opinion) for handling different HTML layouts: whichever of the selectors is present on the page, its elements will be grabbed.

As an example, in Google Search carousel results the HTML layout differs depending on the country the search is coming from.

When the country of the search is not the United States:

Image description

When the country of the search is set to the United States:

Image description

The following code snippet handles both HTML layouts:

# will return all elements matched by either of these selectors
soup.select('#kp-wp-tab-Albums .PZPZlf, .keP9hb')
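To make the idea concrete, here is a rough sketch that runs the same selector list against two tiny hand-written layouts. The selector names come from the snippet above, but the HTML itself is a simplified assumption and not Google's real markup.

```python
from bs4 import BeautifulSoup

# Two hypothetical layouts that expose the same data under different selectors.
layout_a = '<div id="kp-wp-tab-Albums"><div class="PZPZlf">Album one</div></div>'
layout_b = '<div class="keP9hb">Album one</div>'

# One selector list that covers both layouts.
selector = "#kp-wp-tab-Albums .PZPZlf, .keP9hb"

results = []
for html in (layout_a, layout_b):
    soup = BeautifulSoup(html, "html.parser")
    results.append([el.text for el in soup.select(selector)])

print(results)
```

The same extraction code works for either layout, because each page only needs to satisfy one selector from the list.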

Descendant combinator

Syntax: selector1 selector2

The descendant combinator is represented by a single space ( ) character and combines two selectors such that elements matched by the second selector are selected if they have an ancestor (parent, parent's parent, parent's parent's parent, etc.) element matching the first selector.

soup.select('.NQyKp .REySof')           # dives inside .selector -> dives again to the other .selector and grabs it
soup.select('div cite.iUh30')           # dives inside div -> dives inside cite.selector and grabs it
# an id that starts with a digit is not valid after #, so the [id="..."] form is used here
soup.select('span[id="21Xy"] a.XZx2')   # dives inside the span with that id -> dives inside a.selector and grabs it
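A short sketch of how deep the descendant combinator reaches, contrasted with the child combinator (>), which is standard CSS but not covered in this post. The HTML and the result/url class names are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, invented for this example.
html = """
<div class="result">
  <cite class="url">example.com</cite>
  <p><cite class="url">nested.example.com</cite></p>
</div>
<cite class="url">outside.example.com</cite>
"""

soup = BeautifulSoup(html, "html.parser")

# Descendant combinator (space): matches at any depth inside div.result.
descendants = [el.text for el in soup.select("div.result cite.url")]

# Child combinator (>): matches direct children of div.result only.
children = [el.text for el in soup.select("div.result > cite.url")]

print(descendants)
print(children)
```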

Other Useful CSS Selectors

  • :nth-child(n): Selects every element that is the n-th child of its parent.
  • :nth-of-type(n): Selects every element that is the n-th element of its type among its parent's children.
  • a:has(img): Selects every <a> element that contains an <img> element.
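The difference between :nth-child and :nth-of-type is easier to see on a concrete document. The sketch below uses made-up HTML and relies on the selector support that ships with recent beautifulsoup4 releases (via the soupsieve package).

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, invented for this example.
html = """
<ul>
  <li>one</li>
  <li>two</li>
  <li>three</li>
</ul>
<div>
  <h2>Heading</h2>
  <p>first paragraph</p>
  <p>second paragraph</p>
</div>
<a href="/with-image"><img src="logo.png"></a>
<a href="/text-only">plain link</a>
"""

soup = BeautifulSoup(html, "html.parser")

# <li> that is the 2nd child of its parent.
second_item = [el.text for el in soup.select("li:nth-child(2)")]

# <p> that is the 2nd child of its parent (the <h2> counts as the 1st child).
p_second_child = [el.text for el in soup.select("p:nth-child(2)")]

# <p> that is the 2nd <p> among its siblings (the <h2> is ignored).
p_second_of_type = [el.text for el in soup.select("p:nth-of-type(2)")]

# <a> elements that contain an <img>.
links_with_image = [a["href"] for a in soup.select("a:has(img)")]

print(second_item, p_second_child, p_second_of_type, links_with_image)
```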

You can find additional useful CSS selectors in the W3C Selectors Level 4 specification, the W3Schools CSS Selectors Reference, and the MDN documentation.

Test CSS Selectors

To test whether a selector extracts the correct data you can:

Place the CSS selector(s) in the SelectorGadget window and see which elements are selected:

Image description

Use the DevTools Console tab with the $$(".selector") method, which creates an array (similar to a Python list()) of matching elements:

$$(".selector")

This is equivalent to the document.querySelectorAll(".selector") method (according to the Chrome Developers website):

document.querySelectorAll(".selector")

The output in the DevTools Console is the same for both methods:

Image description
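A similar check can be done from Python before writing the full scraper: count how many elements each candidate selector matches. This is only a sketch; the HTML is made up, and on a real page soup would be built from the downloaded document.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, invented for this example.
html = """
<article><h2 class="post-card-title">First post</h2></article>
<article><h2 class="post-card-title">Second post</h2></article>
"""

soup = BeautifulSoup(html, "html.parser")

# Candidate selectors to compare; the second one contains a deliberate typo.
candidates = [".post-card-title", ".post-card-titel", "article h2"]

# Number of matched elements per selector, like checking the array length in the console.
match_counts = {selector: len(soup.select(selector)) for selector in candidates}

for selector, count in match_counts.items():
    print(f"{selector!r}: {count} match(es)")
```

A count of zero usually points to a typo in the selector or to content that is rendered by JavaScript and therefore absent from the downloaded HTML.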

Cons of CSS Selectors

Betting only on classes might not be a good idea, since they can change.

A "better" way (in terms of CSS selectors) would be to use selectors such as attribute selectors (mentioned above), which are likely to change less frequently.

See the attribute selector examples in the screenshot below (HTML from Google organic results):

Image description

Many modern websites use autogenerated class names that change whenever a style component is modified, which means that relying exclusively on them is not a good idea. But again, it depends on how often they really change.

The biggest problem is that when a selector stops matching, the code blows up with an error, and the maintainer has to manually update the CSS selector(s) to make the code run properly.

That doesn't seem like a big deal, which is true, but it gets annoying if selectors change frequently.
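One way to soften this failure mode is to check the result of select_one() before reading from it, since it returns None when nothing matches and calling .text on None raises an AttributeError. A minimal sketch, where text_or_none is a hypothetical helper and the HTML is made up:

```python
from bs4 import BeautifulSoup

# Hypothetical HTML: the .snippet element is intentionally absent.
html = '<div class="result"><h3 class="title">Hello</h3></div>'
soup = BeautifulSoup(html, "html.parser")


def text_or_none(parent, selector):
    """Return the stripped text of the first match, or None when nothing matches."""
    element = parent.select_one(selector)
    return element.get_text(strip=True) if element is not None else None


result = soup.select_one(".result")
title = text_or_none(result, ".title")      # selector still matches
snippet = text_or_none(result, ".snippet")  # selector no longer matches

print(title, snippet)
```

The scraper keeps running and the missing field shows up as None, which is easier to spot and log than a crash halfway through a page.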

Code Examples

This section shows a couple of real examples from different websites to help you get a bit more familiar with CSS selectors.


Test the CSS container selector:

Image description


import requests
from bs4 import BeautifulSoup  # the lxml parser must be installed, but does not need to be imported

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36"
}

html = requests.get("", headers=headers)  # target URL goes here
soup = BeautifulSoup(html.text, "lxml")

# each .tF2Cxc element is the container of a single search result
for result in soup.select(".tF2Cxc"):
    title = result.select_one(".DKV0Md").text
    link = result.select_one(".yuRUbf a")["href"]
    displayed_link = result.select_one(".lEBKkf span").text
    snippet = result.select_one(".lEBKkf span").text

    print(title, link, displayed_link, snippet, sep="\n")


# part of the output 
Log in | Minecraft › login
Still have a Mojang account? Log in here: Email. Password. Forgot your password? Login. Mojang © 2009-2021. "Minecraft" is a trademark of Mojang AB.

What is Minecraft? | Minecraft › en-us › about-minecraft
Prepare for an adventure of limitless possibilities as you build, mine, battle mobs, and explore the ever-changing Minecraft landscape.

Extract titles from SerpApi Blog


Testing the .post-card-title CSS selector in the DevTools Console:

$$(".post-card-title")
(7) [,,,,,,]
length: 7
[[Prototype]]: Array(0)


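import requests
from bs4 import BeautifulSoup  # the lxml parser must be installed, but does not need to be imported

html = requests.get("")  # target URL goes here
soup = BeautifulSoup(html.text, "lxml")

# print the text of every element matched by the selector
for title in soup.select(".post-card-title"):
    print(title.text)

# output: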
Scrape Google Carousel Results with Python
SerpApi’s YouTube Search API
DuckDuckGo Search API for SerpApi
Extract all search engines ad results at once using Python
Scrape Multiple Google Answer Box Layouts with Python
SerpApi’s Baidu Search API
How to reduce the chance of being blocked while web scraping search engines


Test the CSS selector with either SelectorGadget or the DevTools Console:

Image description


import requests
from bs4 import BeautifulSoup  # the lxml parser must be installed, but does not need to be imported

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36"
}

html = requests.get("", headers=headers)  # target URL goes here
soup = BeautifulSoup(html.text, "lxml")

for result in soup.select(".crayons-story__title"):
    title = result.text.strip()
    # href of the <a> element nested inside the title, exactly as it appears in the page
    link = result.a["href"].strip()

    print(title, link, sep="\n")

# part of the output:
How to Create and Publish a React Component Library
A One Piece of CSS Art!
Windster - Tailwind CSS admin dashboard interface [MIT License]
