Home
Main features:

- URLs are saved to a NoSQL database (Apache CouchDB or Riak) that supports map/reduce queries
- External and internal URL referrals can be saved
- Many crawlers can be run concurrently, even on remote hosts
- URLs to be analysed are divided into several queues, depending on their depth/priority
- You can run any custom method over the body of visited URLs (a sketch is given after this list)
- URLs/domains can be filtered using regular expressions
- URLs can be normalized/rewritten using many options (max_depth, remove_queries, …)
- Many other options: see the ebot.app and ebot_local.config files (a hypothetical excerpt is sketched after this list)
- Ebot statistics are saved to Round Robin Databases (using rrdtool; see the sketch below)
- Web REST interface for:
  - managing start/stop of crawlers
  - submitting URLs to crawlers (sync or async; an example call is sketched below)
  - showing ebot statistics
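
The custom processing hook mentioned above could look roughly like the following Erlang module. The module name, function name and arity are illustrative assumptions, not ebot's actual plugin contract.

```erlang
%% Hypothetical custom body-processing function; ebot's real callback
%% contract may differ (name, arity and argument types are assumptions).
-module(my_body_analyzer).
-export([analyze/2]).

%% Url is the visited URL as a string, Body its raw HTML as a binary.
analyze(Url, Body) ->
    case binary:match(Body, <<"erlang">>) of
        nomatch -> ok;
        _Found  -> io:format("keyword found on ~s~n", [Url])
    end.
```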
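As a rough illustration of the URL filtering and normalization options, an Erlang-term configuration excerpt could look like this. Only max_depth and remove_queries are named on this page; the other key names are invented for the sketch and are not ebot's actual schema.

```erlang
%% Hypothetical excerpt in the style of ebot_local.config (key names are
%% illustrative; check the real ebot.app and ebot_local.config for the
%% supported options).
{url_filters, [
    {accept, "^http://www\\.example\\.com/"},   % crawl only this domain
    {reject, "\\.(jpg|png|gif|pdf)$"}           % skip binary resources
]}.
{url_normalization, [
    {max_depth, 2},            % depth limit used during URL normalization
    {remove_queries, true}     % drop ?query=strings before storing URLs
]}.
```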
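For the statistics feature, here is a minimal sketch of how a counter could be pushed into a Round Robin Database from Erlang; the RRD file name, data source and metric are assumptions, not ebot's actual layout.

```erlang
%% Hypothetical example: append the current number of visited URLs to an RRD.
%% Assumes the RRD was created beforehand, e.g.:
%%   rrdtool create ebot.rrd --step 60 DS:visited:GAUGE:120:0:U RRA:AVERAGE:0.5:1:1440
Visited = 42,                                   % placeholder metric value
Cmd = io_lib:format("rrdtool update ebot.rrd N:~b", [Visited]),
os:cmd(lists:flatten(Cmd)).
```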
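A sketch of driving the REST interface from an Erlang shell using OTP's httpc client; the host, port and endpoint paths are illustrative assumptions, not ebot's documented routes.

```erlang
%% Submit a URL to a crawler and read back statistics over HTTP.
%% "http://localhost:8000" and the /crawl and /statistics paths are
%% hypothetical; adjust them to the routes exposed by your ebot instance.
inets:start(),
{ok, {_Status1, _Hdrs1, _Resp}} =
    httpc:request(post,
                  {"http://localhost:8000/crawl", [],
                   "text/plain", "http://www.example.com"},
                  [], []),
{ok, {_Status2, _Hdrs2, Stats}} =
    httpc:request(get, {"http://localhost:8000/statistics", []}, [], []),
io:format("~s~n", [Stats]).
```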