Library to retrieve metadata (title, favicon address, etc.) from a URL
$ npm install --save url-info-scraper
var urlInfoScraper = require('url-info-scraper');
urlInfoScraper('http://en.wikipedia.org/wiki/Wikipedia', function(error, linkInfo) {
  if (error) {
    return console.error(error); // do not read linkInfo if the request failed
  }
  var title = linkInfo.title; // 'Wikipedia - Wikipedia, the free encyclopedia'
});
The response is an object with the following properties:
{
  isWebResource: boolean, // true if the link is valid
  title: string,          // title of the page requested
  mime: string,           // content-type header of the page, e.g. image/jpeg
  parsable: boolean,      // false if the content type is 'application'
  tooLarge: boolean,      // true if the link body is greater than 5MB
  faviconUrl: string      // the URL of the favicon for the root site, null if not found
}
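As a rough illustration of how these properties might be combined, the sketch below turns a response object into a one-line description. The `describeLink` helper and the `sample` object are hypothetical and not part of url-info-scraper. The order of the checks is an assumption, and the sample values are made up so that the snippet runs without a network request.

```javascript
// Hypothetical helper (not part of url-info-scraper): builds a one-line
// description from an object with the properties documented above.
function describeLink(linkInfo) {
  // Treat a missing object or an invalid link the same way
  if (!linkInfo || !linkInfo.isWebResource) {
    return 'Not a valid web resource';
  }
  // Assumed order: check the size flag before the parsable flag
  if (linkInfo.tooLarge) {
    return 'Resource is too large to inspect (' + linkInfo.mime + ')';
  }
  if (!linkInfo.parsable) {
    return 'Unparsable resource (' + linkInfo.mime + ')';
  }
  // faviconUrl is null when no favicon was found
  var favicon = linkInfo.faviconUrl || 'no favicon';
  return linkInfo.title + ' [' + favicon + ']';
}

// Illustrative object shaped like the documented response
var sample = {
  isWebResource: true,
  title: 'Example page',
  mime: 'text/html',
  parsable: true,
  tooLarge: false,
  faviconUrl: null
};

console.log(describeLink(sample)); // Example page [no favicon]
```

In real use, `describeLink(linkInfo)` would be called inside the `urlInfoScraper` callback once the error has been checked.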
- Rewrite tests to use mocked resources instead of real URLs
- "Best image" support
- Store additional metadata (response time etc.)
- Screenshots
- ...?
MIT © Paul Cleary