Add recipe json-ld extraction #50

Merged 1 commit on Jul 24, 2020

egh (Contributor) commented Jul 4, 2020

Many (most?) recipe sites include recipe data in the JSON-LD format (see https://developers.google.com/search/docs/data-types/recipe).

This adds a recipe extractor for JSON-LD format.

Many of the built-in sites that org-chef supports also provide JSON-LD (Fine Cooking, Serious Eats, Allrecipes, NYT, etc.). I think that using JSON-LD would be easier to maintain, because it should be a more stable data source than parsed HTML.
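For illustration, here is a minimal sketch of the idea in Python. It is not org-chef's Emacs Lisp implementation: the names `JsonLdCollector`, `find_recipe`, and `SAMPLE_PAGE` are hypothetical, and the sample page is made up. It assumes the recipe lives in a `<script type="application/ld+json">` element and is marked with `"@type": "Recipe"`, either at the top level, in a list, or inside an `@graph` array.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self.blocks.append("".join(self._buffer))
            self._in_json_ld = False


def _is_recipe(node):
    """True if a JSON-LD node declares @type Recipe (string or list form)."""
    node_type = node.get("@type")
    if isinstance(node_type, list):
        return "Recipe" in node_type
    return node_type == "Recipe"


def _candidates(data):
    """Yield dict nodes from parsed JSON-LD, including members of @graph."""
    items = data if isinstance(data, list) else [data]
    for item in items:
        if isinstance(item, dict):
            yield item
            graph = item.get("@graph")
            if isinstance(graph, list):
                yield from (n for n in graph if isinstance(n, dict))


def find_recipe(html):
    """Return the first JSON-LD Recipe node found in html, or None."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole page
        for node in _candidates(data):
            if _is_recipe(node):
                return node
    return None


# Made-up page used only to exercise the sketch.
SAMPLE_PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Recipe",
 "name": "Example Burger",
 "recipeIngredient": ["1 bun", "1 patty"],
 "recipeInstructions": "Grill the patty and assemble."}
</script>
</head><body></body></html>
"""

if __name__ == "__main__":
    recipe = find_recipe(SAMPLE_PAGE)
    print(recipe["name"], recipe["recipeIngredient"])
```

Because the fields (`name`, `recipeIngredient`, `recipeInstructions`) come from the schema.org Recipe vocabulary rather than from each site's markup, one extractor of this shape can cover many sites.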

To test this extractor against the custom extractors, I defined a custom variable, org-chef-prefer-json-ld. If it is t, org-chef prefers the JSON-LD extractor. Otherwise, the JSON-LD extractor is used only as a last resort.
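The ordering described above can be sketched roughly as follows, again in Python rather than org-chef's Emacs Lisp. Everything here is hypothetical: `extract_recipe`, the extractor callables, and the `prefer_json_ld` parameter (standing in for org-chef-prefer-json-ld). The sketch also assumes that whichever extractor is tried first falls through to the other when it returns nothing, which the description does not spell out.

```python
from typing import Callable, Optional, Sequence, Tuple

Recipe = dict
Extractor = Callable[[str], Optional[Recipe]]
# Each site entry pairs a URL predicate with the extractor for that site.
SiteEntry = Tuple[Callable[[str], bool], Extractor]


def extract_recipe(
    url: str,
    html: str,
    site_extractors: Sequence[SiteEntry],
    json_ld_extractor: Extractor,
    prefer_json_ld: bool = False,
) -> Optional[Recipe]:
    """Try extractors in an order controlled by prefer_json_ld."""
    # Pick the site-specific extractor whose predicate matches the URL, if any.
    site_extractor = next(
        (extractor for matches, extractor in site_extractors if matches(url)),
        None,
    )
    if prefer_json_ld:
        ordered = [json_ld_extractor, site_extractor]
    else:
        # JSON-LD is the last resort: only reached if nothing else worked.
        ordered = [site_extractor, json_ld_extractor]

    for extractor in ordered:
        if extractor is None:
            continue
        recipe = extractor(html)
        if recipe is not None:
            return recipe
    return None


# Stand-in extractors, used only to show which path is taken.
def fake_site_extractor(html: str) -> Optional[Recipe]:
    return {"name": "from site-specific parser"}


def fake_json_ld_extractor(html: str) -> Optional[Recipe]:
    return {"name": "from JSON-LD"}


SITE_EXTRACTORS = [(lambda url: "example.com" in url, fake_site_extractor)]

if __name__ == "__main__":
    for prefer in (True, False):
        result = extract_recipe(
            "https://example.com/recipe", "", SITE_EXTRACTORS,
            fake_json_ld_extractor, prefer_json_ld=prefer,
        )
        print(prefer, result["name"])
```

Keeping the preference as a single flag makes it easy to run the same URL through both paths and compare the results, which is the testing use described here.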

This addresses #16 and #48 and (I think) #49.

egh force-pushed the json-ld branch 3 times, most recently from b217f19 to bc4b54b (July 4, 2020 06:28)
Chobbes self-requested a review (July 4, 2020 18:54)

Chobbes (Owner) commented Jul 20, 2020

Thanks for looking into this! This looks great. I wasn't aware of this format, and I'm excited to have this functionality added.

One thing I'm noticing, though, is that this uses dom-search, which I don't seem to have in Emacs 26.3. It seems to exist only in Emacs development branches? Is it possible to modify this to work with earlier Emacs versions?

Thanks again :).

Many recipe sites include recipe data in the JSON-LD format (see
https://developers.google.com/search/docs/data-types/recipe)

Add a recipe extractor for JSON-LD format and prefer it if
org-chef-prefer-json-ld is set. Otherwise, use it as a last resort.

egh (Contributor, Author) commented Jul 20, 2020

@Chobbes Yes, I just learned about it. It's pretty amazing. The whole recipe is right there, all structured :)

Thanks for the info on dom-search. It should be fixed in ec51cd1.

Chobbes (Owner) commented Jul 24, 2020

Seems to be working now!

I do get a "Bad string format" error with the Weber recipes, though.

https://www.weber.com/US/en/recipes/red-meat/the-ultimate-burger/weber-2008421.html

But we can figure this out later :). Thanks for submitting this!

Chobbes merged commit 77f97ad into Chobbes:master on Jul 24, 2020

egh (Contributor, Author) commented Jul 24, 2020

Thank you!

egh deleted the json-ld branch (July 24, 2020 18:25)