Amplify Spiders v1 is an AWS Amplify project that hosts a Next.js site with real-time data and custom Lambda handlers, including container-based Lambda functions that use TensorFlow.js. The project provides crawlers for several search engines and is intended for competitive analysis only.
To get started with Amplify Spiders v1, follow these steps:
- Clone the repository to your local machine.
- Install the necessary dependencies by running `npm install`.
- Set up your AWS Amplify environment by following the instructions in the amplify/README.md file.
- Run the project locally by running `npm run dev`.
- Build and push the container. CI/CD for this step still needs some work: see here for more information. Pre-push hook: this may need to be disabled on the first deploy.
- For the first deploy, rename the pre-push hook, run `amplify push`, then restore the hook's original name.
- Update the following secrets with `amplify update function`:
  - googleKey
  - googleCx
  - foursquareClientId
  - foursquareClientSecret
  - facebookAccessToken
  - infogroupApiKey
  - yellowpagesKey
  - yelpApiToken
  - foursquareApiKey
- Deploy the project to the cloud with the hook enabled by running `ECR_REPO_NAME="" ACCOUNT_ID="" amplify push`, where `ECR_REPO_NAME` is the repo that CDK generates.
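
Before the final push, it can help to confirm that every secret named in the steps above has a value. The following is a minimal sketch and not part of the project: `missingSecrets` is a hypothetical helper, and it assumes the secret values are available as a plain key/value map. How the project actually loads its secrets is not shown here.

```typescript
// Secret names copied from the setup steps above.
const REQUIRED_SECRETS: readonly string[] = [
  "googleKey",
  "googleCx",
  "foursquareClientId",
  "foursquareClientSecret",
  "facebookAccessToken",
  "infogroupApiKey",
  "yellowpagesKey",
  "yelpApiToken",
  "foursquareApiKey",
];

// Hypothetical helper: returns the required names that are absent or blank
// in `values`, in the same order as REQUIRED_SECRETS.
function missingSecrets(values: Record<string, string | undefined>): string[] {
  return REQUIRED_SECRETS.filter((name) => {
    const value = values[name];
    return value === undefined || value.trim() === "";
  });
}
```

An empty result means every listed secret has a non-blank value. Any names returned are the ones still to be set with `amplify update function`.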
Amplify Spiders v1 includes the following features and functionality:
- Next.js site
- Real-time data
- Custom Lambda handlers
- Lambda containers
- TensorFlow.js with the Universal Sentence Encoder to compare search results to the search query
- Crawler for Google custom search engine
- Crawler for Citysearch
- Crawler for Yelp
- Crawler for Yellow Pages
- Crawler for Foursquare
- Can create a new user
- Can log in as a user
- Can create a domain to monitor
- Can view all domains
- Can view domain details
- Can view historical rankings for a domain and a search engine as a line chart
- Detects whether the domain appears on the first page of search results for a search engine
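
The first-page detection feature listed above can be illustrated with a short sketch. It is an assumption about how such a check might work, not the project's implementation: `hostnameOf` and `isOnFirstPage` are hypothetical names, and the sketch treats a result as a match when its hostname equals the monitored domain or is a subdomain of it.

```typescript
// Hypothetical helper: extract a lowercase hostname from a URL or bare domain,
// dropping any scheme, credentials, port, path, and a leading "www.".
function hostnameOf(url: string): string {
  const pattern = /^(?:[a-z][a-z0-9+.-]*:\/\/)?(?:[^@\/?#]*@)?([^:\/?#]+)/i;
  const match = pattern.exec(url.trim());
  return match ? match[1].toLowerCase().replace(/^www\./, "") : "";
}

// Hypothetical check: true when any first-page result URL belongs to `domain`
// (the domain itself or one of its subdomains).
function isOnFirstPage(domain: string, firstPageUrls: readonly string[]): boolean {
  const target = hostnameOf(domain);
  if (target === "") {
    return false;
  }
  return firstPageUrls.some((url) => {
    const host = hostnameOf(url);
    // Compare against ".target" so "notexample.com" does not match "example.com".
    return host === target || host.slice(-(target.length + 1)) === "." + target;
  });
}
```

Matching on the hostname rather than on a substring of the URL avoids false positives from lookalike domains or from the domain appearing in a query string.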
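
Comparing search results to the search query with sentence embeddings, as the feature list above describes, is commonly done with cosine similarity. The sketch below illustrates that idea under stated assumptions: plain number arrays stand in for the vectors an encoder would produce, and `cosineSimilarity`, `rankBySimilarity`, and the `ScoredResult` shape are hypothetical names. The project's actual scoring may differ.

```typescript
// Cosine similarity between two equal-length vectors, in the range [-1, 1].
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("Vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as unrelated.
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical shape: a search result paired with its embedding.
interface EmbeddedResult {
  title: string;
  embedding: number[];
}

interface ScoredResult {
  title: string;
  score: number;
}

// Score each result against the query embedding and sort best match first.
function rankBySimilarity(
  queryEmbedding: readonly number[],
  results: readonly EmbeddedResult[],
): ScoredResult[] {
  return results
    .map((r) => ({ title: r.title, score: cosineSimilarity(queryEmbedding, r.embedding) }))
    .sort((x, y) => y.score - x.score);
}
```

Cosine similarity ignores vector length and compares direction only, which is why it is a common choice for comparing sentence embeddings.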
Amplify Spiders v1 is currently in development. The following features and functionality are planned for future releases:
- Finish the main site menubar (can log in, but not log out yet) IN PROGRESS
- Remove Lambda containers by tree-shaking this library to reduce bundle size.
- Crawler for Facebook Business: Need to get the app approved by Facebook for the demo site
- Crawler for Bing?
- Crawler for Yahoo?
- CI/CD for Container Lambda handlers?
- Find good sources of regional statistical and demographic data for cross-referencing with search results?
See the CONTRIBUTING.md file for information on how to contribute to Amplify Spiders v1.
Amplify Spiders v1 is licensed under the MIT License. See LICENSE.txt for more information.
Amplify Spiders v1 has adopted the Contributor Covenant Code of Conduct. See CODE_OF_CONDUCT.md for more information.