Commit

added feature sentiment analysis. Breaking Change!
tassoman committed Nov 28, 2023
1 parent 69f0cb9 commit a8820e7
Showing 5 changed files with 48 additions and 13 deletions.
29 changes: 21 additions & 8 deletions README.md
@@ -1,8 +1,10 @@
# RSS Newsfeed reader bot for Misskey 😻

This Python bot fetches RSS feeds every 5 minutes. It then "cherry-picks" one news item at a time, each minute, choosing from the freshest to the oldest.
This Python bot posts RSS news from your chosen feeds. You can choose the posting frequency (in minutes) and the number of Notes to post each time.

The News and Notes flows are asynchronous, so the bot can always pick up the freshest news and Note them as soon as possible.
Before posting, it runs a **sentiment analysis**, then flags the Note with a CW (Content Warning) and :NSFW: if the sentiment is negative (war, deaths, bad news).

The News and Notes flows are asynchronous, so the bot can always pick up the freshest news and Note it as soon as possible.

Notes will not bloat your Misskey profile, because they are deleted once they are older than a month.

@@ -19,7 +21,6 @@ Please, follow these instructions once, before starting:
- Remember to set `isBot = True`
- Visit the page: `https://your.misskey.instance/settings/api`
- Create a new API-key having at minimum `notes:write` privilege.
- Copy `.env-example` to `.env`, then fill in your configuration

### Prepare a Python virtual environment

@@ -29,10 +30,26 @@ Please use python3. In latest GNU/Linux distros, it's already in as default. Oth
2. `python -m venv .venv` Python sandboxed environment creation
3. `. .venv/bin/activate` Environment activation
4. `pip install -r requirements.txt` Dependencies installation
5. `python -m spacy download en_core_web_lg` Gets sentiment analysis data (for NSFW posts)

## Configuration

Now that the software is installed, it needs a small amount of configuration.

### Fill the bot with RSS feeds

Edit the file `sources.txt` and list one RSS URL per line.
First of all, put the RSS feed source URLs into the file `sources.txt`, one per line.

### Environment variables

Now copy `.env-example` to `.env` and fill in your personal configuration:

- **HOST** your Misskey domain
- **APIKEY** the API key created before
- **VISIBILITY** choose [in which Timeline to post](https://misskey-hub.net/en/docs/features/timeline.html).
- **LOCAL** boolean for federated Notes
- **EVERY_MINUTES** posting frequency, in minutes
- **HOW_MANY** number of Notes posted each time
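
As a rough illustration of how these variables could be read and type-checked at startup, here is a minimal sketch using only the standard library. The `Settings` class, the `load_settings` helper, the default values, and the reading of **LOCAL** as a "local only" flag are assumptions made for this example; they are not taken from the bot's code.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values listed above."""
    host: str
    apikey: str
    visibility: str
    local_only: bool
    every_minutes: int
    how_many: int


def _as_bool(raw: str) -> bool:
    """Treat common 'true' spellings as True, anything else as False."""
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=os.environ) -> Settings:
    """Build Settings from a mapping such as os.environ.

    HOST and APIKEY are required (a missing key raises KeyError).
    The defaults for the other keys are illustrative assumptions,
    not the bot's documented defaults.
    """
    return Settings(
        host=env["HOST"],
        apikey=env["APIKEY"],
        visibility=env.get("VISIBILITY", "public"),
        local_only=_as_bool(env.get("LOCAL", "false")),
        every_minutes=int(env.get("EVERY_MINUTES", "5")),
        how_many=int(env.get("HOW_MANY", "1")),
    )
```

Converting `EVERY_MINUTES` and `HOW_MANY` to `int` once, at load time, means a typo in `.env` fails immediately instead of in the middle of a scheduled job.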

## Run!

@@ -42,10 +59,6 @@ Inside your python environment:

It runs the setup if needed, then starts three scheduled jobs.

- Fetch RSS every 5 minutes
- Post Notes every minute
- Delete Posted notes older than one month. Hourly.

### Service daemon configuration

You probably want to run it detached from the console, either with `nohup` or inside a `screen` session, so that you can close the SSH shell without stopping the bot.
14 changes: 10 additions & 4 deletions jobs/create.py
@@ -5,6 +5,7 @@
import os
from misskey import Misskey
from dotenv import load_dotenv
from jobs.sentiment import getSentiment

load_dotenv()

@@ -24,26 +25,31 @@ def publish_note():
mk = Misskey(os.getenv('HOST'), i=os.getenv('APIKEY'))

c.execute('''
SELECT * FROM news WHERE noted = 0 ORDER BY publishedAt DESC LIMIT ?
SELECT * FROM news
WHERE notedAt IS NULL OR notedAt = ''
ORDER BY publishedAt DESC LIMIT ?
''', (quantity,))  # one-element tuple: a bare str binds one parameter per character
data = c.fetchall()

if data is not None:
for d in data:
text = d[1] + "\n<b>" + d[4] + "</b>\n" + d[5] + "\n\n" + d[3]
sentiment = getSentiment(d[4] + d[5])
text = "\n<b>" + d[4] + "</b>\n" + d[5] + "<i>(" +d[1] + ")</i>\n\n" + d[3]
cw = None if sentiment >= 0 else ":nsfw: News article"
time.sleep(2)
api = mk.notes_create(
text=text,
visibility=visibility,
local_only=local_only,
cw=cw
)
n_id = api['createdNote']['id']
n_at = int(datetime.strptime(
api['createdNote']['createdAt'], '%Y-%m-%dT%H:%M:%S.%fZ'
).timestamp())

c.execute('''
UPDATE news SET noted = 1, noteId = ?, notedAt = ? WHERE id = ?
''', (n_id, n_at, d[0]))
UPDATE news SET sentiment = ?, noteId = ?, notedAt = ? WHERE id = ?
''', (sentiment, n_id, n_at, d[0]))
db.commit()
db.close()
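
Two steps of the loop above can be isolated as small, testable helpers: the content-warning decision and the conversion of `createdAt` to a Unix timestamp. The following is a sketch under assumptions, not code from this commit; `content_warning` and `created_at_to_epoch` are hypothetical names. It treats the trailing `Z` as UTC explicitly, because calling `.timestamp()` on the naive `datetime` returned by `strptime` interprets it in the machine's local time zone.

```python
from datetime import datetime, timezone


def content_warning(score, label=":nsfw: News article"):
    """Return a CW string for negative scores, else None (no warning)."""
    return label if score < 0 else None


def created_at_to_epoch(created_at):
    """Convert a timestamp like '2023-11-28T10:15:30.123Z' to Unix seconds.

    strptime returns a naive datetime; attaching UTC explicitly keeps
    .timestamp() from applying the local time zone offset.
    """
    parsed = datetime.strptime(created_at, "%Y-%m-%dT%H:%M:%S.%fZ")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())
```

For example, `created_at_to_epoch("1970-01-01T00:00:10.500Z")` returns `10` regardless of the server's time zone, and `content_warning(0.0)` returns `None`, matching the `sentiment >= 0` test in the diff.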
3 changes: 2 additions & 1 deletion jobs/fetch.py
@@ -20,7 +20,7 @@ def install():
"link" TEXT NOT NULL UNIQUE,
"title" TEXT NOT NULL,
"body" TEXT,
"noted" INTEGER NOT NULL DEFAULT 0,
"sentiment" DECIMAL(1,2),
"noteId" TEXT,
"notedAt" INTEGER,
PRIMARY KEY("id" AUTOINCREMENT)
Expand All @@ -32,6 +32,7 @@ def install():
CREATE TABLE IF NOT EXISTS "feeds" (
"id" INTEGER NOT NULL UNIQUE,
"url" TEXT NOT NULL UNIQUE,
"title" TEXT,
PRIMARY KEY("id" AUTOINCREMENT)
);
''')
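
Because both tables are created with `CREATE TABLE IF NOT EXISTS`, a database created before this commit keeps its old columns and does not gain `sentiment` or the feeds `title`. One possible way to upgrade an existing file in place is sketched below. The helper name `add_missing_columns` is hypothetical, the `REAL` type is an assumption, and the sketch leaves the old `noted` column untouched. Table and column names are interpolated into the SQL, so they must be trusted constants and never user input.

```python
import sqlite3


def add_missing_columns(db, table, columns):
    """Add each (name, declaration) pair to `table` unless it already exists.

    Returns the list of column names that were actually added.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in db.execute(f'PRAGMA table_info("{table}")')}
    added = []
    for name, declaration in columns:
        if name not in existing:
            db.execute(f'ALTER TABLE "{table}" ADD COLUMN "{name}" {declaration}')
            added.append(name)
    db.commit()
    return added
```

A call such as `add_missing_columns(db, "news", [("sentiment", "REAL")])` followed by `add_missing_columns(db, "feeds", [("title", "TEXT")])` is safe to repeat, since columns that already exist are skipped.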
14 changes: 14 additions & 0 deletions jobs/sentiment.py
@@ -0,0 +1,14 @@
""" Sentiment Analysis Module """
import asent # pylint: disable=unused-import
import spacy

def getSentiment(text):
    """ Sentiment analysis result """
    # load spacy pipeline
    nlp = spacy.load("en_core_web_lg")
    # add the rule-based sentiment model
    nlp.add_pipe("asent_en_v1")
    sentiment = nlp(text)
    #print(f"total: {sentiment._.polarity.compound}")

    return sentiment._.polarity.compound
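
The function above reloads the `en_core_web_lg` model on every call, which is likely to be slow when several Notes are scored in one run. A possible variant that loads the pipeline once and reuses it is sketched here. `get_pipeline` and `get_sentiment` are hypothetical names; the spaCy and asent calls are the same ones used in the module above. The imports are deferred to first use (an assumption of this sketch) so that the module can be imported before the model has been downloaded.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_pipeline(model="en_core_web_lg"):
    """Load the spaCy model and attach asent once; later calls reuse it."""
    import asent  # noqa: F401  (importing registers the "asent_en_v1" factory)
    import spacy

    nlp = spacy.load(model)
    nlp.add_pipe("asent_en_v1")
    return nlp


def get_sentiment(text):
    """Return the compound polarity score for `text`."""
    return get_pipeline()(text)._.polarity.compound
```

With this shape, only the first call to `get_sentiment` pays the model loading cost. The sign convention is unchanged, so the `sentiment >= 0` check in `jobs/create.py` would work as before.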
1 change: 1 addition & 0 deletions requirements.txt
@@ -2,4 +2,5 @@ feedparser
Misskey.py
python-dotenv
schedule
asent
pip-review
