fixed faz #21 #22

Merged
merged 1 commit into from
Oct 15, 2024

schochastics
Contributor

Some selectors were not working anymore but should be fixed with this PR (tested 5 articles)

@JBGruber
Owner

Thanks a lot!

FYI, the package comes with a test function for new/updated parsers, which flags this as failing:

library(paperboy)
test_df <- pb_collect("https://www.faz.net/rss/aktuell/")
test_df_parsed <- paperboy:::test_parser(test_df)
#> ℹ Trying to parse raw data
#> ℹ Checking results
#> ! 4.35% of datetime values failed to parse
#> ✖ 13.04% of text values failed to parse
#> ✖ Some tests failed. But you will get there! Don't stop planning now!

Created on 2024-10-15 with reprex v2.1.0

But I checked the failing articles and the results make sense. The missing datetime comes from an article that does not seem to have a date. Three texts are missing: one from that same article, one from a live ticker, and one from an article that just doesn't have any text for some reason.

I'm happy to accept this, since I don't care much about the live ticker. Just let me know if you want to have a look first.
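
For readers who want to repeat the check on the failing articles, here is a minimal sketch of one way to list the rows behind those percentages. It does not call paperboy at all: `parsed` is a hypothetical stand-in data frame, and the column names `url`, `datetime` and `text` are assumptions about the parsed output, not something confirmed in this thread. The 23 rows are chosen only so that one missing datetime and three missing texts reproduce the 4.35% and 13.04% reported above.

```r
# Hypothetical stand-in for a parsed result; the column names (url, datetime,
# text) are assumptions, not confirmed output of paperboy:::test_parser().
n <- 23
parsed <- data.frame(
  url = sprintf("https://example.com/article-%02d", seq_len(n)),
  datetime = as.POSIXct("2024-10-15 08:00:00", tz = "UTC") + seq_len(n) * 60,
  text = rep("Some article text", n),
  stringsAsFactors = FALSE
)

# Simulate the failures described above: one article without a date,
# three articles without usable text (one of them is the same article).
parsed$datetime[5] <- NA
parsed$text[c(5, 11, 17)] <- c(NA, "", NA)

# A value counts as failed if it is NA or, for text, empty after trimming.
failed_datetime <- is.na(parsed$datetime)
failed_text <- is.na(parsed$text) | !nzchar(trimws(parsed$text))

# Share of failed values in percent, rounded to two decimals.
share_failed <- function(x) round(100 * mean(x), 2)
share_failed(failed_datetime)  # 4.35
share_failed(failed_text)      # 13.04

# Rows worth opening in a browser to see why parsing failed.
failed_rows <- parsed[failed_datetime | failed_text, c("url", "datetime", "text")]
failed_rows
```

With real data, the URLs in `failed_rows` are the ones to open by hand to decide whether a failure is a parser bug or an edge case such as a live ticker.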

@schochastics
Contributor Author

Nice, I didn't know about the test function yet. Pretty helpful. But given that the failing cases seem to be edge cases, I am fine with this and you can merge.

@JBGruber JBGruber merged commit cdcf4d5 into JBGruber:main Oct 15, 2024
5 checks passed