Couldn't find a tree builder with the features you requested: html5lib. Do you need to install a parser library? #17

Open
mem89de opened this issue Aug 3, 2023 · 1 comment

mem89de commented Aug 3, 2023

Hi, I'm trying to get recipemd-extract running, but it fails with a bs4.FeatureNotFound error:

$ recipemd-extract https://www.chefkoch.de/rezepte/2625281412358611/Wurst-Pasta.html
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\Marc\AppData\Local\Programs\Python\Python311\Scripts\recipemd-extract.exe\__main__.py", line 7, in <module>
  File "C:\Users\Marc\AppData\Local\Programs\Python\Python311\Lib\site-packages\recipemd_extract\main.py", line 57, in main
    recipe=extract(url,args.debug)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Marc\AppData\Local\Programs\Python\Python311\Lib\site-packages\recipemd_extract\main.py", line 21, in extract
    soup = BeautifulSoup(page.text, "html5lib")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Marc\AppData\Local\Programs\Python\Python311\Lib\site-packages\bs4\__init__.py", line 193, in __init__
    raise FeatureNotFound(
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: html5lib. Do you need to install a parser library?
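
One thing worth ruling out: a package can be listed as installed and still fail to import, and bs4 raises this same `FeatureNotFound` whenever the html5lib tree builder was not registered. Whether that is what happens here is an assumption, not something confirmed in this thread. A minimal sketch that surfaces the underlying import error, if there is one (the helper name `check_import` is hypothetical):

```python
import importlib


def check_import(module_name):
    """Try to import a module; return (ok, detail) instead of raising."""
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # ImportError is the usual case, but report anything
        return False, f"{type(exc).__name__}: {exc}"
    # Not every module exposes __version__, so fall back to a placeholder.
    return True, getattr(module, "__version__", "unknown version")


if __name__ == "__main__":
    for name in ("html5lib", "bs4"):
        ok, detail = check_import(name)
        print(f"{name}: {'OK' if ok else 'FAILED'} ({detail})")
```

If `html5lib` prints FAILED, the detail string shows the exception raised on import, which is more specific than the `FeatureNotFound` message.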

Do you have any suggestions? I use Python on Windows 11.

$ python --version
Python 3.11.4

$ pip show recipemd-extract html5lib recipe-scrapers recipemd requests scrape-schema-recipe
Name: recipemd-extract
Version: 1.1.1
Summary: Extracts recipes from websites and saves them in the RecipeMD format
Home-page:
Author: AberDerBart
Author-email: nonatz@web.de
License:
Location: C:\Users\Marc\AppData\Local\Programs\Python\Python311\Lib\site-packages
Requires: beautifulsoup4, html5lib, recipe-scrapers, recipemd, requests, scrape-schema-recipe
Required-by:
---
Name: html5lib
Version: 1.0.1
Summary: HTML parser based on the WHATWG HTML specification
Home-page: https://github.com/html5lib/html5lib-python
Author: James Graham
Author-email: james@hoppipolla.co.uk
License: MIT License
Location: C:\Users\Marc\AppData\Local\Programs\Python\Python311\Lib\site-packages
Requires: six, webencodings
Required-by: mf2py, pyRdfa3, recipemd-extract
---
Name: recipe-scrapers
Version: 5.3.0
Summary: Python package, scraping recipes from all over the internet
Home-page: https://github.com/hhursev/recipe-scrapers/
Author: Hristo Harsev
Author-email: r+pypi@hharsev.com
License:
Location: C:\Users\Marc\AppData\Local\Programs\Python\Python311\Lib\site-packages
Requires: beautifulsoup4, requests
Required-by: recipemd-extract
---
Name: recipemd
Version: 4.0.8
Summary: Markdown recipe manager, reference implementation of RecipeMD
Home-page: https://recipemd.org
Author: Tilman Stehr
Author-email: tilman@tilman.ninja
License: UNKNOWN
Location: C:\Users\Marc\AppData\Local\Programs\Python\Python311\Lib\site-packages
Requires: argcomplete, commonmark, dataclasses-json, pyparsing, yarl
Required-by: recipemd-extract
---
Name: requests
Version: 2.22.0
Summary: Python HTTP for Humans.
Home-page: http://python-requests.org
Author: Kenneth Reitz
Author-email: me@kennethreitz.org
License: Apache 2.0
Location: C:\Users\Marc\AppData\Local\Programs\Python\Python311\Lib\site-packages
Requires: certifi, chardet, idna, urllib3
Required-by: mf2py, recipe-scrapers, recipemd-extract, scrape-schema-recipe
---
Name: scrape-schema-recipe
Version: 0.0.4
Summary: Extracts cooking recipe from HTML structured data in the https://schema.org/Recipe format.
Home-page: https://github.com/micahcochran/scrape-schema-recipe
Author: Micah Cochran
Author-email:
License: Apache-2
Location: C:\Users\Marc\AppData\Local\Programs\Python\Python311\Lib\site-packages
Requires: extruct, isodate, requests, setuptools, validators
Required-by: recipemd-extract
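
The same version information can be read from inside the interpreter that actually runs recipemd-extract, which guards against `pip` and `python` pointing at different installations. A minimal sketch using the standard-library `importlib.metadata`; the helper name `installed_versions` is hypothetical, and the package list is copied from the `Requires:` line of recipemd-extract above (including beautifulsoup4, which the `pip show` command did not query):

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    # Names taken from the "Requires:" line of recipemd-extract.
    requires = ["beautifulsoup4", "html5lib", "recipe-scrapers",
                "recipemd", "requests", "scrape-schema-recipe"]
    for name, ver in installed_versions(requires).items():
        print(f"{name}: {ver or 'not installed'}")
```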
@AberDerBart
Collaborator

Sorry for the late response - I neglected the project for quite a while. We have now moved the project to the RecipeMD organization, and I also merged a PR updating the dependencies. Could you check whether this fixes your issue?
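
To check the result after updating without fetching a recipe, the failing call from the traceback (`BeautifulSoup(page.text, "html5lib")` in `main.py`) can be run in isolation on a tiny HTML string. A minimal sketch, assuming beautifulsoup4 is installed in the same environment; the helper name `parser_available` is hypothetical:

```python
def parser_available(feature="html5lib"):
    """Return True if bs4 can build a tree with `feature`, False if it
    raises FeatureNotFound, or None if bs4 itself cannot be imported."""
    try:
        from bs4 import BeautifulSoup, FeatureNotFound
    except ImportError:
        return None
    try:
        # Same constructor call as in the traceback, with a fixed input.
        BeautifulSoup("<p>test</p>", feature)
    except FeatureNotFound:
        return False
    return True


if __name__ == "__main__":
    print(parser_available("html5lib"))
```

A result of `True` means the `FeatureNotFound` error from the report no longer occurs for this parser; `False` means it still does.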
