Allow downloading tweets by hashtag or cashtag #3

Open: flxai wants to merge 3 commits into master

flxai commented Jul 24, 2021

This branch adds the ability to download tweets not only for a profile, but also for hashtags or cashtags.

Changes were made to the functions get_tweets and pagination_parser in nitter_scraper/tweets.py and get_tweets in nitter_scraper/nitter.py. Please tell me if you're okay with the implementation or have suggestions for improvement.

Example usage for hashtags (leading #):

import nitter_scraper
from nitter_scraper import NitterScraper
from pprint import pprint

hashtags = ["ToTheMoon"]

print("Scraping with local nitter docker instance.")

with NitterScraper(host="0.0.0.0", port=8008) as nitter:
    for hashtag in hashtags:
        tweets = nitter.get_tweets(hashtag, query_type='hashtag', pages=2)
        for tweet in tweets:
            print()
            pprint(tweet.dict())
            print(tweet.json(indent=4))

Example for cashtags (leading $):

import nitter_scraper
from nitter_scraper import NitterScraper
from pprint import pprint

cashtags = ["USDT"]

print("Scraping with local nitter docker instance.")

with NitterScraper(host="0.0.0.0", port=8008) as nitter:
    for cashtag in cashtags:
        tweets = nitter.get_tweets(cashtag, query_type='cashtag', pages=2)
        for tweet in tweets:
            print()
            pprint(tweet.dict())
            print(tweet.json(indent=4))
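
For context on what a hashtag or cashtag lookup involves on the URL side, here is a minimal sketch. It assumes the Nitter instance serves tweet searches at a path of the form /search?f=tweets&q=..., and search_path is a hypothetical helper written for illustration, not a function from nitter_scraper or this branch. The point it shows is that the leading # or $ has to be percent-encoded, since a raw # would otherwise start the URL fragment.

```python
from urllib.parse import quote

# Hypothetical mapping from query type to the symbol that prefixes the term.
PREFIXES = {"hashtag": "#", "cashtag": "$"}


def search_path(term: str, query_type: str) -> str:
    """Build a search path for a hashtag or cashtag (assumed URL layout)."""
    prefix = PREFIXES[query_type]
    # safe="" forces '#' and '$' to be percent-encoded as %23 and %24.
    return "/search?f=tweets&q=" + quote(prefix + term, safe="")


print(search_path("ToTheMoon", "hashtag"))
print(search_path("USDT", "cashtag"))
```

If the endpoint layout differs on a given instance, only the path prefix in search_path would need to change; the encoding step stays the same.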

flxai commented Jul 24, 2021

Do you think it would be better to drop the query_type parameter and make it implicit? Query strings that start with # would be treated as hashtags, those that start with $ as cashtags, and everything else as a user account.
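
The prefix rule proposed here can be sketched as follows. classify_query is a hypothetical helper for illustration only and is not the code in this branch; the type names "hashtag", "cashtag" and "user" are likewise assumptions chosen to mirror the earlier query_type values.

```python
from typing import Tuple


def classify_query(query: str) -> Tuple[str, str]:
    """Return (query_type, bare_term) inferred from the leading character."""
    if query.startswith("#"):
        return "hashtag", query[1:]
    if query.startswith("$"):
        return "cashtag", query[1:]
    # Anything without a recognized prefix is treated as a username.
    return "user", query


for q in ["dgnsrekt", "#ToTheMoon", "$USDT"]:
    print(q, "->", classify_query(q))
```

One consequence of this design is that a username can never begin with # or $, which should be fine given that neither character is allowed in Twitter handles.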

flxai commented Jul 25, 2021

Made it implicit now. I think this is a more intuitive user experience. It works as before, but also allows #hashtag or $cashtag queries, like so:

import nitter_scraper
from nitter_scraper import NitterScraper
from pprint import pprint

queries = ["dgnsrekt", "#ToTheMoon", "$USDT"]

print("Scraping with local nitter docker instance.")

with NitterScraper(host="0.0.0.0", port=8008) as nitter:
    for query in queries:
        print('=' * 80, '\n', query, '\n', '=' * 80)
        tweets = nitter.get_tweets(query, pages=1)
        for tweet in tweets:
            print('-' * 80)
            pprint(tweet.dict())
            print(tweet.json(indent=4))

Or, arguably a bit more readable, with colored output borrowed from pygments:

import nitter_scraper
from nitter_scraper import NitterScraper
from pprint import pformat
from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import TerminalFormatter

def pprint_color(obj):
    print(highlight(pformat(obj), PythonLexer(), TerminalFormatter()))

queries = ["dgnsrekt", "#ToTheMoon", "$USDT"]

print("Scraping with local nitter docker instance.")

with NitterScraper(host="0.0.0.0", port=8008) as nitter:
    for query in queries:
        print('=' * 80, '\n', query, '\n', '=' * 80)
        tweets = nitter.get_tweets(query, pages=1)
        for tweet in tweets:
            print('-' * 80)
            pprint_color(tweet.dict())
