diff --git a/CHANGELOG.md b/CHANGELOG.md
index 5045b41..ca7c162 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -45,6 +45,8 @@ First release after the fork. This release is based on the 0.1.7 release of the
 - **parsing**: improved publication date extraction([`4d137eb`](https://github.com/AndyTheFactory/newspaper4k/commit/4d137eb0b6d5b3df971a01f4aa8c1961af9da118)) (by Andrei)
 - some linter errors, whitespaces and spelling([`79553f6`](https://github.com/AndyTheFactory/newspaper4k/commit/79553f6302cea1a6e36103fb4dc1c675ca704cd3)) (by Andrei)
 
+################################### These are the original newspaper3k release notes ###################################
+########################################################################################################################
 ## [0.1.7](https://github.com/codelucas/newspaper/tree/0.1.7) (2016-01-30)
 
 [Full Changelog](https://github.com/codelucas/newspaper/compare/0.1.6...0.1.7)
diff --git a/README.md b/README.md
index 07517c1..cd1e5f6 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# Newspaper4k: Article scraping & curation, a continuation of the beloved newspaper3k by codelucas
+# Newspaper4k: Article Scraping & Curation, a continuation of the beloved newspaper3k by codelucas
 [![PyPI version](https://badge.fury.io/py/newspaper4k.svg)](https://badge.fury.io/py/newspaper4k)
 ![Build status](https://github.com/AndyTheFactory/newspaper4k/actions/workflows/pipeline.yml/badge.svg)
 [![Coverage status](https://coveralls.io/repos/github/AndyTheFactory/newspaper4k/badge.svg?branch=master)](https://coveralls.io/github/AndyTheFactory/newspaper4k)
@@ -152,15 +152,22 @@ Also, in any case, please provide the following information:
 
 # Requirements and dependencies
 
+Following system packages are required:
+
+- PIL: `libjpeg-dev` `zlib1g-dev` `libpng12-dev`
+- lxml: `libxml2-dev` `libxslt-dev`
+- Python Development version: `python-dev`
+
+
 **If you are on Debian / Ubuntu**, install using the following:
 
-- Install `pip3` command needed to install `newspaper3k` package:
+- Install `python3` and `python3-dev`:
 
-        $ sudo apt-get install python3-pip
+        $ sudo apt-get install python3 python3-dev
 
-- Python development version, needed for Python.h:
+- Install `pip3` command needed to install `newspaper4k` package:
 
-        $ sudo apt-get install python-dev
+        $ sudo apt-get install python3-pip
 
 - lxml requirements:
 
@@ -173,13 +180,17 @@ Also, in any case, please provide the following information:
 NOTE: If you find problem installing `libpng12-dev`, try installing
 `libpng-dev`.
 
-- Download NLP related corpora:
+- Install the distribution via pip:
+
+        $ pip3 install newspaper4k
+
+- Download NLP (nltk) related corpora:
 
         $ curl https://raw.githubusercontent.com/AndyTheFactory/newspaper4k/master/download_corpora.py | python3
 
-- Install the distribution via pip:
+- Download NLP (nltk) related corpora:
 
-        $ pip3 install newspaper3k
+        $ curl https://raw.githubusercontent.com/AndyTheFactory/newspaper4k/master/download_corpora.py | python3
 
 **If you are on OSX**, install using the following, you may use both
 homebrew or macports:
@@ -188,25 +199,10 @@ homebrew or macports:
 
     $ brew install libtiff libjpeg webp little-cms2
 
-    $ pip3 install newspaper3k
+    $ pip3 install newspaper4k
 
     $ curl https://raw.githubusercontent.com/AndyTheFactory/newspaper4k/master/download_corpora.py | python3
 
-**Otherwise**, install with the following:
-
-NOTE: You will still most likely need to install the following libraries
-via your package manager
-
-- PIL: `libjpeg-dev` `zlib1g-dev` `libpng12-dev`
-- lxml: `libxml2-dev` `libxslt-dev`
-- Python Development version: `python-dev`
-
-```{=html}
-
-```
-    $ pip3 install newspaper3k
-
-    $ curl https://raw.githubusercontent.com/codelucas/newspaper/master/download_corpora.py | python3
 
 # LICENSE
 
diff --git a/tests/create_test_data.py b/tests/create_test_data.py
index cd3e502..144752b 100644
--- a/tests/create_test_data.py
+++ b/tests/create_test_data.py
@@ -48,7 +48,7 @@ def main(args):
 
 
 if __name__ == "__main__":
-    parser = argparse.ArgumentParser(description="Create test data for newspaper3k")
+    parser = argparse.ArgumentParser(description="Create test data for newspaper4k")
     parser.add_argument("--url", type=str, help="URL to download", required=True)
     parser.add_argument(
         "--language",