RateBeer is a database of user-created reviews about beers and breweries. However, its API has been down for some time, making it difficult to get that information programmatically. This library simplifies that process, allowing you to access the data as painlessly as possible. Data is returned in a friendly, Pythonic way:
>>> import ratebeer
>>> rb = ratebeer.RateBeer()
>>> rb.search('Summit')
{'beers': [<Beer('/beer/21st-amendment-summit-ipa/61118/')>,
<Beer('/beer/3-daughters-summitennial-double-ipa/427367/')>,
...
<Beer('/beer/elland-summit-for-the-weekend/197690/')>,
<Beer('/beer/elland-summit-from-another-galaxy/197112/')>],
'breweries': [<Brewery('/brewers/summit-cider/22197/')>,
<Brewery('/brewers/summit-city-brewerks/18972/')>,
...
<Brewery('/brewers/summit-hard-cider-and-perry/18260/')>,
<Brewery('/brewers/summit-station-restaurant-brewery/346/')>]}
Because they're evil, and they issue takedown notices left and right. We like RateBeer. Scratch that, we love RateBeer.
Requires requests[security], beautifulsoup4, and lxml.
Use pip:
pip install ratebeer
Or clone the package:
git clone https://github.com/alilja/ratebeer.git
Because ratebeer.py does not use an API, since one is not provided, no key is required. Simply:
>>> import ratebeer
>>> rb = ratebeer.RateBeer()
>>> rb.search("summit extra pale ale")
Methods
get_beer -- Pass in the URL for a beer page and this function will return a Beer object containing information about the beer. In addition to the URL, it accepts an optional fetch argument (default: False), which can be set to True to immediately download the object's attributes. See the Beer class below. You can replicate the RateBeer.beer(URL) functionality using RateBeer.get_beer(URL, True).__dict__.
beer -- Returns a dictionary with information about that beer.
>>> rb.beer("/beer/new-belgium-tour-de-fall/279122/")
{'_has_fetched': True,
'abv': 6.0,
'brewed_at': None,
'brewery': <Brewery('/brewers/new-belgium-brewing-company/77/')>,
'calories': 180,
'description': "New Belgium's love for beer, bikes and benefits is best "
'described by being at Tour de Fat. Our love for Cascade and '
'Amarillo hops is best tasted in our Tour de Fall Pale Ale. '
"We're cruising both across the country during our favorite "
'time of year. Hop on and find Tour de Fall Pale Ale in fall '
'2014.',
'ibu': 38,
'img_url': 'https://res.cloudinary.com/ratebeer/image/upload/w_120,c_limit/beer_279122.jpg',
'mean_rating': None,
'name': 'New Belgium Tour de Fall',
'num_ratings': 261,
'overall_rating': 72,
'retired': False,
'seasonal': 'Autumn',
'style': 'American Pale Ale',
'style_rating': 70,
'style_url': '/beerstyles/american-pale-ale/18/',
'tags': ['cascade', 'amarillo'],
'url': '/beer/new-belgium-tour-de-fall/279122/',
'weighted_avg': 3.34}
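The fetch argument, together with the "lazy" behaviour mentioned in the changelog, can be illustrated with a small stand-alone sketch. This is not the library's actual implementation: LazyPage, its loader callback, and fake_loader are hypothetical names invented for this example, and no network request is made.

```python
class LazyPage:
    """Hypothetical stand-in for a lazily populated Beer/Brewery object."""

    def __init__(self, url, loader, fetch=False):
        self.url = url
        self._loader = loader        # callable: url -> dict of attributes
        self._has_fetched = False
        if fetch:                    # comparable to get_beer(URL, True)
            self._populate()

    def _populate(self):
        # Copy every downloaded value onto the instance, then mark it fetched.
        self.__dict__.update(self._loader(self.url))
        self._has_fetched = True

    def __getattr__(self, name):
        # Python only calls this when normal attribute lookup has failed.
        if name.startswith("_") or self._has_fetched:
            raise AttributeError(name)
        self._populate()
        try:
            return self.__dict__[name]
        except KeyError:
            raise AttributeError(name) from None


calls = []


def fake_loader(url):
    """Pretend to scrape a page; records each call so requests can be counted."""
    calls.append(url)
    return {"name": "New Belgium Tour de Fall", "abv": 6.0}


beer = LazyPage("/beer/new-belgium-tour-de-fall/279122/", fake_loader)
print(len(calls))   # 0: nothing has been fetched yet
print(beer.abv)     # 6.0: the first missing attribute triggers one fetch
print(beer.name)    # served from the instance, no second fetch
print(len(calls))   # 1
```

Passing fetch=True in this sketch populates the object in the constructor instead, which trades an immediate request for predictable attribute access later.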
get_brewery -- Pass in the URL for a brewery page and this function will return a Brewery object containing information about that brewery. In addition to the URL, it accepts an optional fetch argument (default: False), which can be set to True to immediately download the object's attributes. See the Brewery class below. You can replicate the RateBeer.brewery(URL) functionality using RateBeer.get_brewery(URL, True).__dict__.
brewery -- Returns a dictionary with information about the brewery. Includes a 'get_beers()' generator that provides information about the brewery's beers.
>>> rb.brewery("/brewers/deschutes-brewery/233/")
{'_has_fetched': True,
'city': 'Bend',
'country': 'USA',
'name': 'Deschutes Brewery',
'postal_code': '97702',
'state': 'Oregon',
'street': '901 SW Simpson Ave',
'telephone': '(541) 385-8606',
'type': 'Microbrewery',
'url': '/brewers/deschutes-brewery/233/',
'web': 'https://www.facebook.com/deschutes.brewery'}
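The dictionary returned by brewery lends itself to simple post-processing. As a sketch, the hypothetical helper format_address below (not part of ratebeer) joins the address fields into one line and skips any that are missing or None. The sample values are copied from the output above rather than fetched.

```python
def format_address(brewery):
    """Join the address fields of a brewery dictionary, skipping empty ones."""
    keys = ("street", "city", "state", "postal_code", "country")
    parts = [brewery.get(key) for key in keys]
    return ", ".join(part for part in parts if part)


# Sample values copied from the rb.brewery(...) output above.
deschutes = {
    "city": "Bend",
    "country": "USA",
    "name": "Deschutes Brewery",
    "postal_code": "97702",
    "state": "Oregon",
    "street": "901 SW Simpson Ave",
}

print(format_address(deschutes))
# 901 SW Simpson Ave, Bend, Oregon, 97702, USA
```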
search -- A generic search. Returns a dictionary with two keys: beers and breweries. Each of those contains a list of Beer or Brewery objects, respectively.
>>> rb = ratebeer.RateBeer()
>>> results = rb.search("summit extra pale ale")
>>> results
{'beers': [<Beer('/beer/summit-extra-pale-ale/7344/')>,
<Beer('/beer/summit-extra-pale-ale--rose-petals/317841/')>],
'breweries': []}
>>> results['beers'][0].__dict__
{'_has_fetched': False,
'name': 'Summit Extra Pale Ale',
'num_ratings': 721,
'overall_rating': 60,
'url': '/beer/summit-extra-pale-ale/7344/'}
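Because search returns plain lists, ordinary Python tools work on the results. The sketch below picks the beer with the most ratings. most_rated is a hypothetical helper, not part of ratebeer, and the SimpleNamespace objects are stand-ins for Beer objects so that the example runs without a network connection. The second entry is a made-up placeholder with unknown ratings.

```python
from types import SimpleNamespace


def most_rated(results):
    """Return the beer with the most ratings, or None if there are no beers."""
    beers = results.get("beers", [])
    # Treat a missing or None num_ratings as zero so the comparison never fails.
    return max(
        beers,
        key=lambda beer: getattr(beer, "num_ratings", None) or 0,
        default=None,
    )


# Stand-ins for Beer objects; the first entry uses the values shown above.
results = {
    "beers": [
        SimpleNamespace(name="Summit Extra Pale Ale", num_ratings=721,
                        overall_rating=60),
        SimpleNamespace(name="Placeholder Beer", num_ratings=None,
                        overall_rating=None),
    ],
    "breweries": [],
}

best = most_rated(results)
print(best.name if best else "no beers found")
# Summit Extra Pale Ale
```

With real Beer objects, reading an attribute that the search page did not supply may cause the object to fetch its RateBeer page, so it is worth sticking to the attributes shown in the search output when many results are involved.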
beer_style_list -- Returns a dictionary mapping each beer style name to its style id.
>>> rb.beer_style_list()
{'Abbey Dubbel': 71,
'Abbey Tripel': 72,
...
'Witbier': 48,
'Zwickel/Keller/Landbier': 74}
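Since the keys of this dictionary are display names, a forgiving lookup can be handy before calling beer_style. The helper find_style_id below is a hypothetical sketch, not part of ratebeer; it matches names case-insensitively against a subset of the output shown above.

```python
def find_style_id(styles, name):
    """Case-insensitive lookup of a style id; returns None if not found."""
    wanted = name.strip().lower()
    for style_name, ident in styles.items():
        if style_name.lower() == wanted:
            return ident
    return None


# Subset of the beer_style_list() output shown above.
styles = {
    "Abbey Dubbel": 71,
    "Abbey Tripel": 72,
    "Witbier": 48,
    "Zwickel/Keller/Landbier": 74,
}

print(find_style_id(styles, "abbey dubbel"))   # 71
print(find_style_id(styles, "Imaginary Ale"))  # None
```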
beer_style -- Returns a generator of Beer objects from the beer style page. Takes an ident for a beer style (see output of beer_style_list()) and optional sort_type ("score" (default), "count", or "abv") and sort_order ("ascending" (low-to-high) or "descending" (high-to-low, default)).
>>> [b for b in rb.beer_style(71)]
[<Beer('/beer/st-bernardus-prior-8/2531/')>,
<Beer('/beer/westmalle-dubbel/2205/')>,
...
<Beer('/beer/belgh-brasse-mons-abbey-dubbel/187593/')>,
<Beer('/beer/new-glarus-thumbprint-series-dubbel/254781/')>]
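The list comprehension above consumes the whole generator. When only the first few results are needed, itertools.islice stops early. The sketch below demonstrates this with fake_beer_style, a hypothetical stand-in that yields made-up URLs instead of scraping, so the amount of work done can be counted.

```python
from itertools import islice

produced = []


def fake_beer_style(ident):
    """Hypothetical stand-in for rb.beer_style(ident); yields made-up URLs."""
    for n in range(1, 1001):
        url = "/beer/example-{}/{}/".format(ident, n)
        produced.append(url)   # record how much work was actually done
        yield url


# Take only the first five items; the generator is never run past that point.
first_five = list(islice(fake_beer_style(71), 5))
print(first_five)
print(len(produced))  # 5
```

The same pattern should apply to any generator, for example list(islice(rb.beer_style(71), 5)), assuming the real generator produces its results incrementally.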
Beer requires the url of the beer you're looking for, like RateBeer.beer and RateBeer.get_beer.
Attributes
- abv (float): percentage alcohol
- brewery (Brewery object): the beer's brewery
- brewed_at (Brewery object): actual brewery if contract brewed
- calories (float): estimated calories for the beer
- description (string): the beer's description
- img_url (string): a url to an image of the beer
- mean_rating (float): the mean rating for the beer (out of 5)
- name (string): the full name of the beer (may include the brewery name)
- num_ratings (int): the number of reviews
- overall_rating (int): the overall rating (out of 100)
- retired (boolean): True if the beer is retired, otherwise False
- seasonal (string): Summer, Winter, Autumn, Spring, Series, Special, None
- style (string): beer style
- style_url (string): beer style URL
- style_rating (int): rating of the beer within its style (out of 100)
- url (string): the url of the beer's ratebeer page
- tags (list of strings): tags given to the beer
- weighted_avg (float): the beer rating average, weighted using some unknown algorithm (out of 5)
Any attributes not available will be returned as None.
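Because unavailable attributes come back as None, code that formats or compares them should guard against it. The following sketch shows one way to do that with a hypothetical summarize helper (not part of ratebeer). The sample values are copied from the rb.beer(...) output earlier in this document, where mean_rating is None.

```python
def summarize(beer):
    """Build a one-line summary from a beer dictionary, tolerating None."""
    def show(value, suffix=""):
        # Unavailable values are None, so substitute a readable marker.
        return "n/a" if value is None else "{}{}".format(value, suffix)

    return "{} ({}): {} ABV, mean {}, weighted {}".format(
        show(beer.get("name")),
        show(beer.get("style")),
        show(beer.get("abv"), "%"),
        show(beer.get("mean_rating")),
        show(beer.get("weighted_avg")),
    )


# Values copied from the rb.beer(...) output earlier in this document.
tour_de_fall = {
    "name": "New Belgium Tour de Fall",
    "style": "American Pale Ale",
    "abv": 6.0,
    "mean_rating": None,
    "weighted_avg": 3.34,
}

print(summarize(tour_de_fall))
# New Belgium Tour de Fall (American Pale Ale): 6.0% ABV, mean n/a, weighted 3.34
```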
Methods
get_reviews -- Returns a generator of Review objects for all the reviews of the beer. Takes a review_order argument, which can be "most recent", "top raters", or "highest score".
Review returns a datatype that contains information about a specific review. For efficiency reasons, it requires the soup of the individual review. Probably best to not try to make one yourself: use beer.get_reviews instead.
Attributes
- appearance (int): rating for appearance (out of 5)
- aroma (int): aroma rating (out of 10)
- date (datetime): review date
- overall (int): overall rating (out of 20, for some reason)
- palate (int): palate rating (out of 5)
- rating (float): another overall rating provided in the review. It is unclear how this differs from overall.
- taste (int): taste rating (out of 10)
- text (string): actual text of the review
- user_location (string): writer's location
- user_name (string): writer's username
Brewery requires the url of the brewery you want information on.
Attributes
- city (string): the brewery's city
- country (string): the brewery's country
- name (string): the brewery's name
- postal_code (string): the brewery's postal code
- state (string): the brewery's state/municipality/province
- street (string): the street address of the brewery
- telephone (string): the brewery's telephone number
- type (string): the type of brewery. Typically "microbrewery" or "macrobrewery"
- url (string): the url of the brewery's ratebeer page
- web (string): the url of the brewery's homepage
Methods
get_beers -- Returns a generator of Beer objects for every beer produced by the brewery. Some brewery pages list beers that are produced but do not have any pages, ratings, or information besides a name. For now, these beers are omitted from the results.
ratebeer uses the standard Python unit testing library. The tests can be run via python test.py.
Note that the nature of web scraping means this might break at any time.
- Overhauled the Beer object so that it will be a little easier to fix with future changes. Beer object now also returns Brewery objects rather than strings for the brewery and brewed_at attributes. Also returns the url for the image of the beer and a list of user-assigned tags. The test.py file has been updated to be a bit clearer about where failures occur.
- Fixes to work with the new RateBeer search page.
- Beer and Brewery objects are now "lazy", meaning they will not fetch the RateBeer page unless the requested attributes are not available. This should help minimize unnecessary requests.
- RateBeer.search() now returns two lists of Beer and Brewery objects.
- RateBeer.beer_style_list() now returns Beer and Brewery objects.
- Beer and Brewery objects now allow custom attributes to be set.
- Bugfixes and performance enhancements.
- Python 3 compatibility.
Major changes.
- New Beer, Review, and Brewery classes.
- Substantial overhaul in ratebeer.py, addition of new files including separation of responsibilities.
- New generator functions in new classes.
- reviews is now a generator.
- Several improvements to results, particularly for edge cases and situations where search results are not in the expected order.
- Metadata for beers returns floats when appropriate.
- Captures more metadata.
- Plays better with foreign beers.
- Now if information is missing from a beer entry, its key is not added to the beer output.
- Captures aliases for beer names.
- Added beer_style_list and beer_style.
- Everything conforms to PEP8 now. Thanks to the fine folks here.
- Minor refactoring.
- Added reviews.
- Better exceptions (no more LookupError for 404s)
- Initial release.
Creator: Andrew Lilja
Contributors:
- Vincent Castellano (Surye) - Python 2 and 3 compatibility
- Steven A. Cholewiak - General bug squishing
- parryc - Scraping updates and general bug squishing
All code released under the Unlicense (a.k.a. Public Domain).