A Python script for checking for broken links and analyzing link status on a single webpage or on all webpages of a website.
It also supports using a website's XML sitemap to check for broken links and analyze links across all of the site's webpages.
Dependencies: requests, BeautifulSoup4 and lxml
pip install requests
pip install beautifulsoup4
pip install lxml
If you get an error running pip, try using pip3 instead.
The script runs from the command line and receives its input through sys.argv from the Python sys module.
To call it from your own code, pass sys.argv to the main() function as its parameter.
import sys
# ...
main(sys.argv)
from broken_links_checker import main as links_checker_main # or any other name
import sys
# ...
links_checker_main(sys.argv)
broken_links_checker https://full-url
python3 broken_links_checker.py https://full-url
The URL doesn't have to use https; http also works.
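As a rough illustration of what a "full URL" means here, the sketch below accepts a value only when it has an http or https scheme and a host. The helper name is_full_url is hypothetical and is not part of broken_links_checker; the script's own validation, if any, may differ.

```python
from urllib.parse import urlparse


def is_full_url(value):
    """Return True if value looks like an absolute http or https URL.

    Hypothetical helper for illustration only; not taken from the script.
    """
    parts = urlparse(value)
    # A full URL needs both a supported scheme and a network location (host).
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

With this check, https://full-url and http://full-url would both pass, while a bare domain such as example.com would not.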
broken_links_checker --with-sitemap C:/full/path/to/the/sitemap.xml
python3 broken_links_checker.py --with-sitemap /full/path/to/the/sitemap.xml
The --with-sitemap flag doesn't have to come before the sitemap path; it can also come after it.
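One way order-independent flag handling like this could work is sketched below. It is a minimal sketch, not the script's actual code: the function name parse_args and its return shape are assumptions made for illustration.

```python
def parse_args(argv):
    """Split argv into (target, use_sitemap), ignoring where the flag appears.

    Hypothetical sketch; the real script may parse its arguments differently.
    """
    args = list(argv[1:])  # argv[0] is the script name
    use_sitemap = "--with-sitemap" in args
    # Everything that is not the flag is treated as the URL or sitemap path.
    targets = [arg for arg in args if arg != "--with-sitemap"]
    if len(targets) != 1:
        raise ValueError("expected exactly one URL or sitemap path")
    return targets[0], use_sitemap
```

Because the flag is found by membership rather than by position, placing --with-sitemap before or after the path gives the same result.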
You can also run the Python file under a custom command name, the way you run npm, git, composer and so on. My YouTube video on how to set this up is linked below.
How to run python file with custom name like npm on Windows and Linux
Fork the repo and create a pull request with the description of the changes you made