
RSS feed reader with web interface and SQLite data storage

RRSS - Ruby RSS feed reader

RRSS is an RSS/RDF/Atom feed reader written in Ruby, built on Sinatra and SQLite.

It reads simple configuration files in YML format, downloads items and stores them in SQLite databases (one per feed), and sports a nice web GUI to read and manage them (modify, comment, mark, etc.).

RRSS is also able to:

  • download gzipped files
  • use scripts for scraping a web page (placed in ./scrapes)
  • use scripts for manipulating the downloaded file (placed in ./scripts)
  • use regular expressions for manipulating the downloaded file (:regexp: option)
  • export stored items in JSON or Atom format (including their mark status)
  • run in batch mode without starting the GUI
  • GUI: set custom feed favicon
  • GUI: use a custom skin (CSS)

0. Installation

  1. check your Ruby version with ruby -v and make sure it is >= 1.9
  2. obtain a copy of the project from GitHub; you can either:
     - download the ZIP: `wget https://github.com/acavalin/rrss/archive/master.zip && unzip master.zip && cd rrss-master`
     - clone the repository: `git clone https://github.com/acavalin/rrss.git && cd rrss`
  3. install the bundler gem: gem install bundler
  4. install all required gems with bundle install
  5. edit config.yml as you prefer
  6. create a feeds.yml config file (see feeds.yml.example)
  7. run the application with ./rrss.rb
  8. point your browser to http://localhost:3333
  9. ????
  10. profit! ;^)

1. Configuration

1.1 config.yml

1.1.1 rss_dler options

Configuration for the feed downloader and parser:

| Key | Descr |
| --- | --- |
| hash_keys | Item properties to be hashed for generating its unique id. Available keys are: :id, :link, :title, :descr, :date |
| max_item_days | Number of days after which an item is marked as old |
| max_old_items | Maximum number of old items to keep |
| parse_timeout | RSS parsing timeout (in seconds) |
| period | Default feed check interval (in minutes) |
| timeout | Download/scrape timeout (in seconds) |
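
To illustrate how hash_keys might drive the item id, here is a minimal Ruby sketch. The helper name `item_uid`, the SHA1 digest and the newline separator are assumptions made for illustration; RRSS may build its ids differently.

```ruby
require 'digest'

# Hypothetical helper: derive an id from the properties listed in hash_keys.
# Missing properties are treated as empty strings.
def item_uid(item, hash_keys)
  payload = hash_keys.map { |key| item[key].to_s }.join("\n")
  Digest::SHA1.hexdigest(payload) # digest choice is an assumption
end

item = {
  id: 'a1', link: 'http://www.foo.com/1', title: 'Hello',
  descr: 'text', date: '2014-01-01'
}

# :descr is not part of the key set, so editing it leaves the id unchanged.
uid_a = item_uid(item, [:link, :title])
uid_b = item_uid(item.merge(descr: 'edited text'), [:link, :title])
puts uid_a == uid_b # prints true
```

In this sketch, the fewer properties you hash, the more tolerant the id is to later edits of the item; hashing :descr or :date as well would turn an edited item into a new one.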

1.1.2 rss_mngr options

Configuration for the feed manager (the web GUI):

| Key | Descr |
| --- | --- |
| check_interval | periodic feed check interval time (in minutes) |
| exit_grace_time | feed download grace time on exit (in minutes) |
| layout | layout css file name |
| port | webserver (GUI) listening port |

1.2 feeds.yml

The file represents the feeds tree as an array of key-value options. Here is an example (see also feeds.yml.example):

  ---
  # this item is on the root folder
  - :name:    example1
    :link:    http://www.foo.com
    :period:  10
    :enabled: true
    :regexp:  [['HELLO', 'Hello'], ['hi', 'HI']]
  
  - :name:    example2
    :url:     http://bar.org/rss.xml
    :period:  60
    :enabled: true
  
  # an open folder (w/o ':' at the beginning)
  - folder:
    # a collapsed subfolder (w/ ':' at the beginning)
    - :subfolder:
      # these two items are children of "subfolder"
      - :name:    example3A
        :url:     http://www.foo2.com/rss.php
        :period:  720
        :enabled: true
      
      - :name:    example3B
        :url:     http://www.bar2.com/atom.xml
        :period:  720
        :enabled: true
    # these two items are children of "folder"
    - :name:    example4A
      :url:     http://www.fb.org/en/feeds/news.rss
      :period:  1440
      :enabled: true
  
    - :name:    example4B
      :url:     http://www.xyz.net/news.xml
      :period:  1440
      :enabled: true

Every feed has a set of options you can use to customize it:

| Key | Descr |
| --- | --- |
| :name: | feed name ([a-z_]) |
| :enabled: | enable the periodic download for this feed (default false) |
| :hash_keys: | array of item properties to be hashed for generating the unique id |
| :limit: | only consider this many of the most recently downloaded items |
| :link: | clickable link on the feeds tree |
| :parse_timeout: | override the default parse timeout (in seconds) |
| :period: | periodic download interval (in minutes) |
| :regexp: | array of pairs [regexp, replace_string] to manipulate items |
| :summary: | save and show the summary of the item (default true) |
| :timeout: | override the default download timeout (in seconds) |
| :url: | url of the xml file to download |
| :validation: | apply feed validation during parsing (default true) |
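
The relationship between the global defaults and the per-feed overrides can be sketched in Ruby as a plain hash merge. The helper name `effective_options` and the numeric values are made up for the example, and the assumption that a feed-level key simply replaces the global one is mine, not taken from the RRSS source.

```ruby
# Global defaults, as they might be read from config.yml
# (the values here are invented for the example).
GLOBAL_DEFAULTS = {
  period: 30, timeout: 45, parse_timeout: 20, hash_keys: [:link, :title]
}.freeze

# Per-feed defaults as documented in the options table.
FEED_DEFAULTS = { enabled: false, summary: true, validation: true }.freeze

# Hypothetical helper: keys set on the feed win over every default.
def effective_options(feed, globals = GLOBAL_DEFAULTS)
  FEED_DEFAULTS.merge(globals).merge(feed)
end

feed = { name: 'example2', url: 'http://bar.org/rss.xml', period: 60, enabled: true }
opts = effective_options(feed)
puts opts[:period]  # 60, set on the feed
puts opts[:timeout] # 45, inherited from the global defaults
```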

As you can see in the previous example, a folder comes in two flavors:

  • << name : >> renders an expanded folder on the feeds tree
  • << : name : >> renders a collapsed folder on the feeds tree
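
To make the two folder flavors concrete, the following Ruby sketch parses a small feeds tree and prints it. It relies on the fact that Ruby's YAML parser turns `:name:` style keys into Symbols and plain `name:` keys into Strings; the `walk` helper and the output format are illustrative assumptions, not RRSS code. `YAML.safe_load` with `permitted_classes` is used so the example also runs on current Ruby versions.

```ruby
require 'yaml'

FEEDS_YML = <<~YML
  ---
  - :name:    example1
    :url:     http://bar.org/rss.xml
    :enabled: true
  - folder:
    - :subfolder:
      - :name:    example3a
        :url:     http://www.foo2.com/rss.php
YML

# Hypothetical helper: a node with a :name key is a feed, anything else is a
# folder whose single key is the label and whose value is the list of children.
def walk(nodes, depth = 0, out = [])
  nodes.each do |node|
    pad = '  ' * depth
    if node.key?(:name)
      out << "#{pad}feed #{node[:name]}"
    else
      label, children = node.first
      state = label.is_a?(Symbol) ? 'collapsed' : 'expanded'
      out << "#{pad}#{state} folder #{label}"
      walk(children || [], depth + 1, out)
    end
  end
  out
end

tree = YAML.safe_load(FEEDS_YML, permitted_classes: [Symbol])
puts walk(tree)
```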

1.2.1 OPML import/conversion

Here is a handy command-line pipeline that converts an OPML file (indented XML) to YML; it relies on GNU sed for the \L and \E case conversions:

  cat feedlist.opml | \
    sed 's/<outline title="\(.*\)" text=".*">/- :\L\1:/' | \
    sed 's/\( \+\)<outline text="\([^"]*\)".*htmlUrl="\([^"]*\)" xmlUrl="\([^"]*\)".*\/>/\1- :name:    \L\2\E\n\1  :url:     \4\n\1  :link:    \3\n\1  :enabled: true\n/' | \
    grep -v "<.outline>" > feeds.yml
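
If GNU sed is not available, a similar conversion can be sketched in Ruby with REXML. This is only an approximation of the pipeline above: the helper name `outline_to_yml` is hypothetical, outlines without an xmlUrl attribute are treated as folders, and names are merely lowercased, so titles containing spaces or colons would still need manual cleanup.

```ruby
require 'rexml/document'

# Hypothetical helper: turn nested <outline> elements into feeds.yml entries.
def outline_to_yml(node, indent = '')
  entries = node.get_elements('outline').map do |o|
    if o.attributes['xmlUrl']
      # a feed entry
      "#{indent}- :name:    #{o.attributes['text'].to_s.downcase}\n" \
      "#{indent}  :url:     #{o.attributes['xmlUrl']}\n" \
      "#{indent}  :link:    #{o.attributes['htmlUrl']}\n" \
      "#{indent}  :enabled: true\n"
    else
      # a folder (collapsed, since the key starts with ':')
      title = (o.attributes['title'] || o.attributes['text']).to_s.downcase
      "#{indent}- :#{title}:\n" + outline_to_yml(o, indent + '  ')
    end
  end
  entries.join
end

opml = <<~XML
  <opml version="1.0"><body>
    <outline title="News" text="News">
      <outline text="Example" htmlUrl="http://www.foo.com" xmlUrl="http://www.foo.com/rss.xml"/>
    </outline>
  </body></opml>
XML

body = REXML::Document.new(opml).root.elements['body']
puts outline_to_yml(body)
```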

1.3 Feed processing

When adding a new feed, keep in mind the retrieval/manipulation steps the application will perform on the downloaded file:

  1. if ./scrapes/feedname exists and is executable then run it and capture its output
  2. otherwise download the file specified in :url:
  3. if ./scripts/feedname exists and is executable then use it to convert the previous output (it must read the input from stdin and print output to stdout)
  4. apply each regexp specified in :regexp: (if any), in the given order
  5. convert contents to UTF-8, parse and store them to ./db/feedname.db
  6. autopurge old items (only read and unkept ones)
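
The steps above can be sketched in Ruby as a handful of small helpers. All helper names, the `:read`/`:kept` item fields, the use of `gsub`, the ISO-8859-1 source encoding and the exact purge policy are assumptions made for illustration; the real implementation may differ. Storage in SQLite (second half of step 5) is left out.

```ruby
require 'open3'

# Steps 1-2: prefer an executable ./scrapes/feedname, else download :url:.
def pick_source(feed, root = '.')
  scrape = File.join(root, 'scrapes', feed[:name])
  File.file?(scrape) && File.executable?(scrape) ? [:scrape, scrape] : [:download, feed[:url]]
end

# Step 3: pipe the text through ./scripts/feedname (stdin -> stdout) if present.
def run_script(feed, text, root = '.')
  script = File.join(root, 'scripts', feed[:name])
  return text unless File.file?(script) && File.executable?(script)
  output, _status = Open3.capture2(script, stdin_data: text)
  output
end

# Step 4: apply the [regexp, replace_string] pairs one after the other.
def apply_regexps(text, pairs)
  Array(pairs).inject(text) do |acc, (pattern, replacement)|
    acc.gsub(Regexp.new(pattern), replacement)
  end
end

# Step 5 (first half): convert to UTF-8 from an assumed source encoding.
def to_utf8(text, from = 'ISO-8859-1')
  text.dup.force_encoding(from).encode('UTF-8', invalid: :replace, undef: :replace)
end

# Step 6: among old, read and unkept items, keep the newest max_old_items
# and return the rest as purge candidates.
def purge_candidates(items, max_item_days, max_old_items, now = Time.now)
  cutoff = now - max_item_days * 86_400
  old = items.select { |i| i[:date] < cutoff && i[:read] && !i[:kept] }
  old.sort_by { |i| i[:date] }.reverse.drop(max_old_items)
end

puts apply_regexps('HELLO, hi there', [['HELLO', 'Hello'], ['hi', 'HI']])
# prints "Hello, HI there"
```

Because the pairs are applied in order, a later regexp sees the output of the earlier ones, so the order in :regexp: matters.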

2. Feed GUI/Webserver

2.1 Keyboard shortcuts

| Key | Function |
| --- | --- |
| h | show help |
| n | select next unread item |
| down | select next item |
| m/up | select previous item |
| home | select first item |
| end | select last item |
| u | toggle unread on selected item |
| k | toggle kept on selected item |
| esc | close/reset view |
| v | change view filter |
| l | show linear view in list mode |
| L | show linear view in thumbs mode |
| r | refresh feeds tree |
| R | mark all feed items as read |
| s | search items in current feed/folder |

2.2 Feed items export

You can download all the items of a feed by using the following URLs:

  • http://ip_address:port/dump/feed_name.xml (atom feed)
  • http://ip_address:port/dump/feed_name.json (json object)

Note: Feed/item preferences are included in the XML/Atom file within dc_type tags.
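
As an illustration, a dump could be fetched from a script with Ruby's standard library. The helper names `dump_uri` and `fetch_dump` are hypothetical, the defaults assume a local instance on port 3333, and the structure of the returned JSON is not specified here, so the sketch only parses it.

```ruby
require 'net/http'
require 'uri'
require 'json'

# Hypothetical helper: build the export URL for a feed.
def dump_uri(feed_name, format, host = 'localhost', port = 3333)
  raise ArgumentError, "unknown format: #{format}" unless %w[xml json].include?(format)
  URI("http://#{host}:#{port}/dump/#{feed_name}.#{format}")
end

# Hypothetical helper: download the dump; JSON is parsed, Atom is returned as text.
def fetch_dump(feed_name, format, host = 'localhost', port = 3333)
  body = Net::HTTP.get(dump_uri(feed_name, format, host, port))
  format == 'json' ? JSON.parse(body) : body
end

puts dump_uri('example1', 'json')
# With a running instance you could then call:
#   data = fetch_dump('example1', 'json')
```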

2.3 Change feed favicon

To set a custom favicon for a specific feed use the script set_favicon.rb:

set_favicon.rb feed_name favicon_uri

where feed_name is the name specified in :name: and favicon_uri can be either a URL or a local file path.

3. Feed downloader

3.1 Batch mode/dump

You can run the download process of your feeds in batch mode using the script check_feeds.rb:

check_feeds.rb [dump_dir [format]]

If you supply a dump directory, the processed feeds are dumped there. The format can be either xml or json.
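
For example, a wrapper script could build the command line like this. The helper name `check_feeds_command` and the dump path are hypothetical; the sketch only encodes the rule from the usage line that a format is accepted only together with a dump directory.

```ruby
# Hypothetical helper: assemble the argument list for check_feeds.rb.
def check_feeds_command(dump_dir = nil, format = nil)
  raise ArgumentError, 'format requires a dump_dir' if format && dump_dir.nil?
  raise ArgumentError, "unknown format: #{format}" if format && !%w[xml json].include?(format)
  ['./check_feeds.rb', dump_dir, format].compact
end

puts check_feeds_command('/tmp/rrss_dump', 'json').join(' ')
# To actually run it from the project directory:
#   system(*check_feeds_command('/tmp/rrss_dump', 'json'))
```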

4. Referrer/External resources

Web servers tend to block requests carrying a localhost referrer, which breaks feeds that rely on external resources like images :'(

If you use Firefox, you can bypass this problem by installing the Referrer Control extension; you can find the full documentation on its wiki page.

You just need to add a custom rule:

*localhost*, <any>, <remove>

and remember to set the default rule to Skip if you wish to preserve the browser default behaviour.

A. Reference documentation

Here is a list of the specs, libraries and tools used to develop RRSS: