w3resource

Python BeautifulSoup: Extract all the URLs from the webpage python.org that are nested within <li> tags

BeautifulSoup: Exercise-8 with Solution

Write a Python program to extract all the URLs from the webpage python.org that are nested within <li> tags.

Sample Solution:

Python Code:

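import requests
from bs4 import BeautifulSoup

url = 'https://www.python.org/'
reqs = requests.get(url)
soup = BeautifulSoup(reqs.text, 'lxml')

# Collect the href of the first link inside each <li> element.
urls = []
for h in soup.find_all('li'):
    a = h.find('a')
    # Skip <li> elements that contain no link, or a link without an href.
    if a is not None and a.has_attr('href'):
        urls.append(a['href'])
print(urls)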

Sample Output:

['/', '/psf-landing/', 'https://docs.python.org', 'https://pypi.python.org/', '/jobs/', '/community/', '#', 'javascript:;', 'javascript:;', 'javascript:;', '#', 'https://www.facebook.com/pythonlang?fref=ts', 'https://twitter.com/ThePSF', '/community/irc/', '/about/', '/about/apps/', '/about/quotes/', '/about/gettingstarted/', '/about/help/', 'http://brochure.getpython.info/', '/downloads/', '/downloads/', '/downloads/source/', '/downloads/windows/', '/downloads/mac-osx/', '/download/other/', 'https://docs.python.org/3/license.html', '/download/alternatives', '/doc/', '/doc/', '/doc/av', 'https://wiki.python.org/moin/BeginnersGuide', 'https://devguide.python.org/', 'https://docs.python.org/faq/', 'http://wiki.python.org/moin/Languages', 'http://python.org/dev/peps/', 'https://wiki.python.org/moin/PythonBooks', '/doc/essays/', '/community/', '/community/survey', '/community/diversity/', '/community/lists/', '/community/irc/', '/community/forums/', '/community/workshops/', '/community/sigs/', '/community/logos/', 'https://wiki.python.org/moin/', '/community/merchandise/', '/community/awards', 'https://www.python.org/psf/codeofconduct/', '/success-stories/', '/success-stories/category/arts/', '/success-stories/category/business/', '/success-stories/category/education/', '/success-stories/category/engineering/', '/success-stories/category/government/', '/success-stories/category/scientific/', '/success-stories/category/software-development/', '/blogs/', '/blogs/', 'http://planetpython.org/', 'http://pyfound.blogspot.com/', 'http://pycon.blogspot.com/', '/events/', '/events/python-events', '/events/python-user-group/', '/events/python-events/past/', '/events/python-user-group/past/', 'https://wiki.python.org/moin/PythonEventsCalendar#Submitting_an_Event', '/shell/', '//docs.python.org/3/tutorial/controlflow.html#defining-functions', '//docs.python.org/3/tutorial/introduction.html#lists', 'http://docs.python.org/3/tutorial/introduction.html#using-python-as-a-calculator', 
'//docs.python.org/3/tutorial/', '//docs.python.org/3/tutorial/controlflow.html', 'http://feedproxy.google.com/~r/PythonSoftwareFoundationNews/~3/NXMcoIchkxY/2018-in-review.html', 'http://feedproxy.google.com/~r/PythonSoftwareFoundationNews/~3/t_DSEH1vASY/python-core-developer-mentorship.html', 'http://feedproxy.google.com/~r/PythonSoftwareFoundationNews/~3/v7pD576k9iA/mariatta-wijaya-lets-use-github-issues.html', 'http://feedproxy.google.com/~r/PythonSoftwareFoundationNews/~3/mnSfdQZDRUM/petr-viktorin-extension-modules-and.html', 'http://feedproxy.google.com/~r/PythonSoftwareFoundationNews/~3/-JcoXQeMgsQ/scott-shawcroft-history-of-circuitpython.html', '/events/python-events/809/', '/events/python-user-group/848/', '/events/python-user-group/838/', '/events/python-events/827/', '/events/python-events/826/', 'http://www.djangoproject.com/', 'http://wiki.python.org/moin/TkInter', 'http://www.scipy.org', 'http://buildbot.net/', 'http://www.ansible.com', '/about/', '/about/apps/', '/about/quotes/', '/about/gettingstarted/', '/about/help/', 'http://brochure.getpython.info/', '/downloads/', '/downloads/', '/downloads/source/', '/downloads/windows/', '/downloads/mac-osx/', '/download/other/', 'https://docs.python.org/3/license.html', '/download/alternatives', '/doc/', '/doc/', '/doc/av', 'https://wiki.python.org/moin/BeginnersGuide', 'https://devguide.python.org/', 'https://docs.python.org/faq/', 'http://wiki.python.org/moin/Languages', 'http://python.org/dev/peps/', 'https://wiki.python.org/moin/PythonBooks', '/doc/essays/', '/community/', '/community/survey', '/community/diversity/', '/community/lists/', '/community/irc/', '/community/forums/', '/community/workshops/', '/community/sigs/', '/community/logos/', 'https://wiki.python.org/moin/', '/community/merchandise/', '/community/awards', 'https://www.python.org/psf/codeofconduct/', '/success-stories/', '/success-stories/category/arts/', '/success-stories/category/business/', '/success-stories/category/education/', 
'/success-stories/category/engineering/', '/success-stories/category/government/', '/success-stories/category/scientific/', '/success-stories/category/software-development/', '/blogs/', '/blogs/', 'http://planetpython.org/', 'http://pyfound.blogspot.com/', 'http://pycon.blogspot.com/', '/events/', '/events/python-events', '/events/python-user-group/', '/events/python-events/past/', '/events/python-user-group/past/', 'https://wiki.python.org/moin/PythonEventsCalendar#Submitting_an_Event', '/dev/', 'https://devguide.python.org/', 'https://bugs.python.org/', 'https://mail.python.org/mailman/listinfo/python-dev', '/dev/core-mentorship/', '/news/security/', '/about/help/', '/community/diversity/', 'https://github.com/python/pythondotorg/issues', 'https://status.python.org/']
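
The sample output mixes absolute URLs, site-relative paths such as '/downloads/', protocol-relative links such as '//docs.python.org/3/tutorial/', and placeholders such as '#' and 'javascript:;'. If a list of usable absolute URLs is wanted, one possible post-processing step is sketched below. It is not part of the original solution: the helper name absolute_urls and the rule for which hrefs to skip are assumptions made for illustration, and the short sample list is copied from the output above so the sketch runs without a network request.

```python
from urllib.parse import urljoin

BASE_URL = 'https://www.python.org/'


def absolute_urls(hrefs, base=BASE_URL):
    """Resolve each href against base, dropping placeholders and duplicates.

    Hypothetical helper for illustration; the skip rule is an assumption.
    """
    result = []
    for href in hrefs:
        # Skip in-page anchors and javascript pseudo-links.
        if href.startswith('#') or href.lower().startswith('javascript:'):
            continue
        absolute = urljoin(base, href)
        # Keep only the first occurrence so the original order is preserved.
        if absolute not in result:
            result.append(absolute)
    return result


# A few entries taken from the sample output above.
sample = ['/', '/psf-landing/', 'https://docs.python.org', '#', 'javascript:;',
          '//docs.python.org/3/tutorial/', '/downloads/', '/downloads/']
print(absolute_urls(sample))
```

urljoin leaves absolute URLs unchanged and fills in the scheme and host for relative ones, so the same helper could be applied directly to the urls list built by the solution.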

Python: Tips of the Day

Getting the last element of a list:

some_list[-1] is the shortest and most Pythonic.

In fact, you can do much more with this syntax. The some_list[-n] syntax gets the nth-to-last element. So some_list[-1] gets the last element, some_list[-2] gets the second-to-last, and so on, all the way down to some_list[-len(some_list)], which gives you the first element.

You can also set list elements in this way. For instance:

>>> some_list = [1, 2, 3]
>>> some_list[-1] = 5 # Set the last element
>>> some_list[-2] = 3 # Set the second to last element
>>> some_list
[1, 3, 5]

Note that getting a list item by index will raise an IndexError if the expected item doesn't exist. This means that some_list[-1] will raise an exception if some_list is empty, because an empty list can't have a last element.
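
A minimal sketch of the points above, including one way to guard against the empty-list case. The helper name last_or_default is hypothetical and chosen only for this example.

```python
def last_or_default(items, default=None):
    """Return the last element of items, or default if items is empty."""
    return items[-1] if items else default


some_list = [1, 2, 3]
print(some_list[-1])               # last element
print(some_list[-len(some_list)])  # first element
print(last_or_default([]))         # empty list: falls back to the default

# Indexing an empty list directly raises IndexError.
try:
    [][-1]
except IndexError as exc:
    print('IndexError:', exc)
```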

Ref: https://bit.ly/3d8TfFP