Linkie looks through files for broken links using Python 3.5+.
Linkie searches all files within the directory it is run from and any subdirectories, and requires a simple YAML config file to run. You can then run Linkie from the command line:
linkie
You can also pass Linkie a YAML file of configuration values (for example, linkie linkie.yaml). The YAML file can contain the following optional settings:
- exclude-directories: Any directories listed will be ignored. These are relative to the directory Linkie is run from.
- file-types: The file extensions to search for URLs.
- skip-urls: URLs to skip checking.
Example configuration file (these are the default values Linkie uses):
exclude-directories:
- .git/
- docs/build/
file-types:
- html
- md
- rst
- txt
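To illustrate how optional settings can sit on top of these defaults, here is a minimal sketch in plain Python. The helper name merge_settings and the sample skip-urls entry are hypothetical, and the sketch assumes that any key you supply replaces the default value wholesale; Linkie's own merging logic may differ.

```python
# Defaults copied from the example configuration above.
DEFAULT_SETTINGS = {
    "exclude-directories": [".git/", "docs/build/"],
    "file-types": ["html", "md", "rst", "txt"],
}


def merge_settings(overrides=None):
    """Return a new dict with overrides layered on top of the defaults.

    Hypothetical helper for illustration only; it is not part of Linkie.
    Assumption: a supplied key replaces the default list entirely.
    """
    # Copy each default list so the module-level defaults are never mutated.
    merged = {key: list(value) for key, value in DEFAULT_SETTINGS.items()}
    for key, value in (overrides or {}).items():
        merged[key] = list(value)
    return merged


# The URL below is made up for the example.
settings = merge_settings({
    "file-types": ["md", "rst"],
    "skip-urls": ["https://example.com/private"],
})
print(settings)
```

A dictionary built this way has the same shape as the settings dictionary that can be passed to Linkie from Python.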
Linkie can also be used within Python:
import linkie
checker = linkie.Linkie() # Creates a linkie object.
result = checker.run() # Returns 1 if broken links found, otherwise 0.
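Because run() returns 1 when broken links are found and 0 otherwise, its result can double as a process exit status, for example in a continuous integration script. The sketch below is one possible way to do that; exit_status and FakeChecker are hypothetical names used only for illustration, and FakeChecker stands in for a real Linkie object so the example runs without Linkie installed.

```python
def exit_status(checker):
    """Run the checker and map its result to a process exit status."""
    result = checker.run()  # 1 if broken links were found, otherwise 0
    return 0 if result == 0 else 1


class FakeChecker:
    """Stand-in for a Linkie object; not part of Linkie."""

    def __init__(self, result):
        self._result = result

    def run(self):
        return self._result


print(exit_status(FakeChecker(0)))  # no broken links
print(exit_status(FakeChecker(1)))  # broken links found

# With Linkie installed, the same helper could be used as:
#     import sys
#     import linkie
#     sys.exit(exit_status(linkie.Linkie()))
```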
You can pass a dictionary of settings directly using Python:
import linkie
settings = {"file-types": ["md", "rst"]}
checker = linkie.Linkie(config=settings)
You can also use a config file within Python:
import linkie
checker = linkie.Linkie(config_file_path='linkie.yaml')
You can also access the following attributes from the Linkie object after it has run:
checker.urls # Dictionary of processed URLs and their data.
checker.status_counts # Dictionary of status codes and their counts.
checker.file_count # Number of files processed.
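As a rough sketch of how these attributes might be used, the helper below formats a short report from a status-count dictionary and a file count. The function name summarise_status_counts and the sample numbers are made up for illustration. It assumes keys may be either numeric status codes or error names, so it sorts them by their string form.

```python
def summarise_status_counts(status_counts, file_count):
    """Build a plain-text report, one line per status (hypothetical helper)."""
    lines = ["Files processed: {}".format(file_count)]
    # Keys may mix integers and strings, so sort on their string form.
    for status, count in sorted(status_counts.items(), key=lambda item: str(item[0])):
        lines.append("{}: {}".format(status, count))
    return "\n".join(lines)


# Sample values invented for this example, not real Linkie output.
example_counts = {200: 12, 404: 2, "ConnectionError": 1}
print(summarise_status_counts(example_counts, file_count=5))
```

After a real run, the object's status_counts and file_count attributes could be passed in place of the sample values.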
Linkie is licensed under the MIT License. Read the license file for more details.
- Add logic to delay Linkie requesting from a domain if it responds with code 429.
- Reorganise logging output so that the URL is printed last.
- Update dependencies.
- Update to only check links prefixed by one of [=", (, <, ' '(a space)].
- Linkie now finds all unique links at once, then uses multithreading to check them all.
- Linkie now rechecks links that had a ConnectionError, as these are often valid.
- Broken links in the SUMMARY are now also displayed with their status code.
- Update logging configuration.
- Update dependencies.
- Set User-Agent to emulate browser viewing.
- Use Python logging module.
- Allow passing of variable of config settings in Python.
- Update method for URLs with brackets.
- Allow adding URLs to skip to configuration file.
- Skip checking URLs that have already been checked.
- Show connection error names instead of 999 status.
- Use a class-based object, allowing the user to retrieve values after running.
- Initial linkie release.
We required a script to check our repositories for broken links. This tool was initially written in Python, and a published Python package makes it easy for repositories to use this tool, in combination with pyup notifying if the package is updated.
Probably not. This script was initially created as an internal tool, so we are not actively developing and supporting it compared to our other repositories. However, we have published it freely under the MIT License to allow you to copy and modify linkie as you wish.
Maybe. This script was initially created as an internal tool, so it doesn't have the same level of polish as other projects we create. If we have more time down the road, we may spend more time developing linkie.
$ git clone https://github.com/uccser/linkie.git
$ cd linkie
$ pip3 install .