This project is an open-source backend service that scrapes data from Instagram and generates embeddable widgets from the scraped data. It uses Redis for caching and requires a proxy for making requests to Instagram.
Features:
- Scrapes Instagram user profile information and posts
- Caches responses in Redis to reduce the load on Instagram
- Generates widgets based on the scraped data
- Manages session handling with Instagram
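The caching feature listed above can be pictured as a cache-aside lookup: check the cache first and scrape only on a miss. The following is a minimal sketch, not the project's actual code. The names `getCachedProfile` and `MemoryStore`, the `profile:` key prefix, and the 300-second TTL are all assumptions made for illustration, and the in-memory store stands in for a Redis client so the example runs on its own.

```javascript
// Cache-aside sketch. `store` is any object with async get/set and stands in
// for a Redis client. Key format and TTL are assumptions, not project values.
const CACHE_TTL_SECONDS = 300; // hypothetical default TTL

async function getCachedProfile(store, username, scrapeFn, ttl = CACHE_TTL_SECONDS) {
  const key = `profile:${username.toLowerCase()}`; // hypothetical key format
  const cached = await store.get(key);
  if (cached !== null && cached !== undefined) {
    return { profile: JSON.parse(cached), fromCache: true };
  }
  // Cache miss: this is the only path that reaches Instagram.
  const profile = await scrapeFn(username);
  await store.set(key, JSON.stringify(profile), ttl);
  return { profile, fromCache: false };
}

// Minimal in-memory store with expiry, used here in place of Redis.
class MemoryStore {
  constructor(now = () => Date.now()) {
    this.now = now;
    this.items = new Map();
  }

  async get(key) {
    const entry = this.items.get(key);
    if (!entry) return null;
    if (entry.expiresAt <= this.now()) {
      this.items.delete(key); // drop expired entries lazily
      return null;
    }
    return entry.value;
  }

  async set(key, value, ttlSeconds) {
    this.items.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }
}
```

Passing the store and the scrape function as arguments keeps the lookup logic independent of any particular Redis client library.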
Prerequisites:
- Node.js
- Redis
- Proxy service
- Clone the repository:
git clone https://github.com/hormold/instagram-scraper-widget.git
cd instagram-scraper-widget
- Install dependencies:
npm install
- Set up environment variables:
Create a .env file in the root of the project and add the following variables:
REDIS_URL=redis://localhost:6379
PROXY_URL=https://your-proxy-service.com/{session}
My recommendation for a proxy service is Scraper API, which provides session-based proxies for scraping Instagram and other websites. You can generate residential or datacenter proxies with a unique session for each user.
- Start the Redis server:
redis-server
- Run the backend service:
npm start
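As an illustration of how the two environment variables from the setup steps might be consumed, here is a minimal sketch. It assumes the `{session}` placeholder in `PROXY_URL` is replaced with a session ID at request time, which the variable's shape suggests but the README does not state. The function names `loadConfig` and `buildProxyUrl` are hypothetical, and the sketch reads from `process.env` directly rather than assuming any particular .env loader.

```javascript
// Reads REDIS_URL and PROXY_URL from an environment-like object.
// Function names are hypothetical; the project may structure this differently.
function loadConfig(env = process.env) {
  const redisUrl = env.REDIS_URL;
  const proxyTemplate = env.PROXY_URL;
  if (!redisUrl || !proxyTemplate) {
    throw new Error('REDIS_URL and PROXY_URL must be set');
  }
  return { redisUrl, proxyTemplate };
}

// Fills the {session} placeholder with a session ID.
// Assumption: the placeholder is substituted per session at request time.
function buildProxyUrl(proxyTemplate, sessionId) {
  if (!proxyTemplate.includes('{session}')) {
    throw new Error('PROXY_URL is missing the {session} placeholder');
  }
  // split/join replaces every occurrence without needing a regular expression.
  return proxyTemplate.split('{session}').join(encodeURIComponent(sessionId));
}
```

Failing fast on a missing variable or placeholder makes a misconfigured .env file visible at startup rather than on the first request.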
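To embed the widget in a web page, use an iframe and replace {username} with the Instagram username:
<iframe src="http://localhost:3000/widget/{username}" width="300" height="400" frameborder="0"></iframe>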
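If the embed markup is generated programmatically, the username should be URL-encoded before it goes into the path. The helper below is a small sketch under that assumption. `widgetEmbedHtml` and `escapeAttr` are hypothetical names, and the default base URL and dimensions are simply copied from the iframe example.

```javascript
// Escapes characters that would break out of a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Builds the iframe markup for a given username.
// Defaults mirror the README example; the function itself is illustrative.
function widgetEmbedHtml(username, baseUrl = 'http://localhost:3000', width = 300, height = 400) {
  const src = `${baseUrl}/widget/${encodeURIComponent(username)}`;
  return `<iframe src="${escapeAttr(src)}" width="${Number(width)}" ` +
    `height="${Number(height)}" frameborder="0"></iframe>`;
}
```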
Scrape endpoint:
GET /scrape?username={username}
Response (each value shows the type of the field):
{
  "username": "string",
  "fullname": "string",
  "description": "string",
  "profilePhoto": "string",
  "metrics": {
    "followers": "number",
    "following": "number",
    "posts": "number"
  },
  "posts": [
    {
      "shortcode": "string",
      "photo": "string",
      "accessibility_caption": "string",
      "caption": "string",
      "location": "string",
      "likes": "number",
      "comments": "number"
    }
  ]
}
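A client might call the endpoint and work with the response along these lines. This is a sketch only: it assumes the API is served from the same host and port as the widget (`http://localhost:3000`), that Node.js 18 or later is available for the global `fetch`, and that a non-2xx status signals failure, none of which is confirmed above. The names `scrapeUrl`, `fetchProfile`, and `summarizeProfile` are hypothetical.

```javascript
// Builds the request URL for the scrape endpoint.
function scrapeUrl(baseUrl, username) {
  const url = new URL('/scrape', baseUrl);
  url.searchParams.set('username', username); // handles URL encoding
  return url.toString();
}

// Fetches a profile. Requires Node.js 18+ for the global fetch.
// Assumption: non-2xx responses indicate a failed scrape.
async function fetchProfile(baseUrl, username) {
  const res = await fetch(scrapeUrl(baseUrl, username));
  if (!res.ok) {
    throw new Error(`Scrape failed with HTTP ${res.status}`);
  }
  return res.json();
}

// Derives a few figures from a response shaped like the JSON above.
function summarizeProfile(profile) {
  const posts = profile.posts || [];
  const totalLikes = posts.reduce((sum, post) => sum + (post.likes || 0), 0);
  return {
    username: profile.username,
    followers: profile.metrics.followers,
    postCountInResponse: posts.length,
    averageLikes: posts.length ? totalLikes / posts.length : 0,
  };
}
```

For example, `fetchProfile('http://localhost:3000', 'nasa').then(summarizeProfile)` would resolve to the summary object once the service is running.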
Session management:
- Generates a unique session ID for each user
- Stores session data in Redis with expiration
- Retrieves and reuses last working session if valid
- Revokes session if invalid response is received from Instagram
- Updates session activity to prevent expiration
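The session behaviour listed above can be sketched as a small manager class. This is an illustration under stated assumptions, not the project's implementation: the class and method names are hypothetical, the 30-minute TTL is invented, and an in-memory `Map` stands in for Redis keys with expiration so the example runs without a server.

```javascript
const { randomUUID } = require('crypto');

// In-memory sketch of the session lifecycle. In the real service the
// expiry would be handled by Redis; the TTL here is a hypothetical value.
class SessionManager {
  constructor({ ttlMs = 30 * 60 * 1000, now = () => Date.now() } = {}) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.expiry = new Map();      // sessionId -> expiry timestamp (ms)
    this.lastWorking = new Map(); // user -> last session ID that worked
  }

  isValid(sessionId) {
    const expiresAt = this.expiry.get(sessionId);
    return expiresAt !== undefined && expiresAt > this.now();
  }

  // Reuses the user's last working session if still valid, else creates one.
  acquire(user) {
    const previous = this.lastWorking.get(user);
    if (previous && this.isValid(previous)) return previous;
    const sessionId = randomUUID();
    this.expiry.set(sessionId, this.now() + this.ttlMs);
    return sessionId;
  }

  // Call after a successful response: refreshes expiry and remembers the session.
  markWorking(user, sessionId) {
    this.expiry.set(sessionId, this.now() + this.ttlMs);
    this.lastWorking.set(user, sessionId);
  }

  // Call after an invalid response from Instagram: discards the session.
  revoke(user, sessionId) {
    this.expiry.delete(sessionId);
    if (this.lastWorking.get(user) === sessionId) this.lastWorking.delete(user);
  }
}
```

A request loop would call `acquire`, make the request through the proxy with that session, then call `markWorking` on success or `revoke` on an invalid response.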
Contributions are welcome! Please open an issue or submit a pull request.
This project is licensed under the MIT License.