blabber/grawler

grawler

A gopherspace crawler

Project status

This project is no longer maintained. If you are interested in it, ping me and I will transfer maintainership/ownership of this project to you.

In its current state this software should not be used, because it is not a well-behaved crawler:

  • It ignores any robots.txt that gopherholes provide to restrict crawling
  • It makes no attempt to reduce the load on gopher servers, which often run on limited resources, by spreading its requests over time

What is it?

grawler is a gopherspace crawler that visits every server reachable, directly or indirectly, from a gopherhole used as its starting point. By default grawler starts crawling at gopher.floodgap.com, probably the most central gopherhole in existence.
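To illustrate how servers become "reachable" from one another: a gopher menu, as described in RFC 1436, is a list of lines consisting of a type character followed by tab-separated display string, selector, host and port. The sketch below extracts the hosts that a menu links to via submenu items (type `1`). It is an illustration under that assumption, not grawler's actual code; `menuItem`, `parseMenuLine` and `linkedHosts` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// menuItem is one parsed line of a gopher menu (RFC 1436 layout).
type menuItem struct {
	Type     byte // item type character, e.g. '0' file, '1' menu, 'i' info
	Display  string
	Selector string
	Host     string
	Port     string
}

// parseMenuLine parses a single menu line. It reports false for blank lines,
// the terminating "." line, and lines with fewer than four fields.
func parseMenuLine(line string) (menuItem, bool) {
	line = strings.TrimRight(line, "\r\n")
	if line == "" || line == "." {
		return menuItem{}, false
	}
	fields := strings.Split(line[1:], "\t")
	if len(fields) < 4 {
		return menuItem{}, false
	}
	return menuItem{
		Type:     line[0],
		Display:  fields[0],
		Selector: fields[1],
		Host:     fields[2],
		Port:     fields[3],
	}, true
}

// linkedHosts returns the distinct, lowercased hosts referenced by submenu
// items in menu, in order of first appearance.
func linkedHosts(menu string) []string {
	seen := make(map[string]bool)
	var hosts []string
	for _, line := range strings.Split(menu, "\n") {
		item, ok := parseMenuLine(line)
		if !ok || item.Type != '1' {
			continue
		}
		h := strings.ToLower(item.Host)
		if !seen[h] {
			seen[h] = true
			hosts = append(hosts, h)
		}
	}
	return hosts
}

func main() {
	// Hand-written sample menu, not real server output.
	menu := "iWelcome\tfake\t(NULL)\t0\r\n" +
		"1Floodgap\t/\tgopher.floodgap.com\t70\r\n" +
		"1Somewhere else\t/\texample.org\t70\r\n" +
		".\r\n"
	fmt.Println(linkedHosts(menu))
}
```

A crawler can feed each newly discovered host back into a work queue and repeat until no unseen hosts remain.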

To crawl the whole gopherspace, grawler needs a few hours on a reasonably fast computer and internet connection. Today's gopherspace is not as big as it used to be :(

Installation

grawler is written in Go, so you need a Go environment. You can then install grawler by running

go install github.com/blabber/grawler

Let the grawling begin

To start the crawler, just run grawler. It writes log messages to stderr and generates a file called grawler.dot that can be post-processed with the Graphviz graph visualization software.
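For readers unfamiliar with the DOT format, the sketch below shows one way a set of host-to-host links could be serialized as a directed Graphviz graph. The exact layout of grawler.dot is not documented here, so the graph name, the edge representation and the function `dotString` are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// dotString renders edges (host -> hosts it links to) as a DOT digraph.
// Source hosts are sorted so the output is deterministic.
// Hypothetical helper; grawler's real output may differ.
func dotString(edges map[string][]string) string {
	var b strings.Builder
	b.WriteString("digraph gopherspace {\n")

	froms := make([]string, 0, len(edges))
	for from := range edges {
		froms = append(froms, from)
	}
	sort.Strings(froms)

	for _, from := range froms {
		for _, to := range edges[from] {
			// %q yields a double-quoted string, which is a valid DOT ID for
			// plain ASCII host names.
			fmt.Fprintf(&b, "\t%q -> %q;\n", from, to)
		}
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	edges := map[string][]string{
		"gopher.floodgap.com": {"example.org", "example.net"},
	}
	fmt.Print(dotString(edges))
}
```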

Results

You can find an example grawler.dot in the results folder. If you do something cool with this data or your own result sets, please let me know.

Statistics

Gopherholes
  alive:  228
  dead:   162
  total:  390

Top 5 TLDs for alive gopherholes
  .org   73
  .net   41
  .com   36
  .de    9
  .uk    5

Top 5 TLDs for dead gopherholes
  .org   33
  .net   21
  .hu    18
  .edu   18
  .com   13
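A per-TLD tally like the ones above can be computed from a list of host names. The following is a minimal sketch of such a count, not the tool used to produce these numbers; `topTLDs` and `tldCount` are hypothetical names, and skipping IP addresses and dotless names is an assumption about what should count as a TLD.

```go
package main

import (
	"fmt"
	"net"
	"sort"
	"strings"
)

// tldCount pairs a top-level domain (with leading dot) with its frequency.
type tldCount struct {
	TLD   string
	Count int
}

// topTLDs returns the n most frequent TLDs among hosts, most frequent first.
// Ties are broken alphabetically. IP addresses and names without a dot are
// ignored.
func topTLDs(hosts []string, n int) []tldCount {
	counts := make(map[string]int)
	for _, h := range hosts {
		h = strings.TrimSuffix(strings.ToLower(h), ".")
		if net.ParseIP(h) != nil {
			continue // an IP address has no TLD
		}
		i := strings.LastIndex(h, ".")
		if i < 0 {
			continue // e.g. "localhost"
		}
		counts[h[i:]]++
	}

	out := make([]tldCount, 0, len(counts))
	for tld, c := range counts {
		out = append(out, tldCount{TLD: tld, Count: c})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Count != out[j].Count {
			return out[i].Count > out[j].Count
		}
		return out[i].TLD < out[j].TLD
	})
	if len(out) > n {
		out = out[:n]
	}
	return out
}

func main() {
	// Made-up host list for illustration.
	hosts := []string{"gopher.floodgap.com", "a.example.org", "b.example.org", "c.example.net"}
	for _, tc := range topTLDs(hosts, 5) {
		fmt.Printf("  %-6s %d\n", tc.TLD, tc.Count)
	}
}
```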

Some graphs

Raw graph

This graph was generated using the following command:

sfdp -Tsvg -o graphs/raw.svg results/grawler.dot

All identified gopherholes, unresponsive ones colored red

This graph was generated using the following commands:

gvpr -f tools/colorize.g results/grawler.dot | \
	sfdp -Tsvg -Goverlap=false -Gsplines=true -o graphs/colored.svg

All responsive gopherholes

This graph was generated using the following commands:

gvpr -f tools/cleanup.g results/grawler.dot | \
	sfdp -Tsvg -Goverlap=false -Gsplines=true -o graphs/alive.svg
