aws-appsync-session-manager-neptune is a sample project that extends previous work with AWS AppSync to include an Amazon Neptune-powered recommendation engine for a simple scheduling app.
The goals of this phase of work are to:
- Utilize a graph database in Amazon Neptune, a fully-managed graph database offered by AWS.
- Build the Neptune database as well as supporting infrastructure via AWS CloudFormation and the AWS Serverless Application Model.
- Integrate with earlier work by extending the AppSync-powered GraphQL API via AWS Lambda.
A blog post on this project can be found at TBD.
To get started, clone this repository:
$ git clone https://github.com/jkahn117/aws-appsync-session-manager-neptune.git
This project requires the following to get started:
- Select an AWS Region into which you will deploy services. Be sure that all required services (AWS AppSync and Amazon Neptune, in particular) are available in the Region you select (AWS Region Table).
- Install AWS SAM CLI and required dependencies (i.e. Docker, AWS CLI, Python).
- Install jq.
If you have not already, deploy Phase I of the AWS AppSync Session Manager.
Once complete, we can extend the API to include a recommendation query. In addition, we will deploy an Amazon Neptune cluster and two Lambda functions to put data in Neptune as sessions and user schedules are modified. Below is a high-level overview of the architecture after Phase II (resources created in Phase I are shaded out):
Before we deploy new AWS resources, we first need to modify the existing Session Manager schema.
- Open the AWS Console and navigate to AWS AppSync.
- Open the previously created Session Manager API and click on Schema.
- Within the Session Manager schema, modify the Query type as follows (note that we are adding a recommendations query):
type Query {
  userSchedule: UserSchedule
  allSessions(nextToken: String): SessionConnection
  getSession(SessionId: ID!): Session
  search(text: String!): SessionConnection
  recommendations(userId: String): [Session]
}
- Save the schema by clicking the "Save Schema" button in the upper right corner.
- Create a new S3 bucket to house deployment assets:
$ aws s3 mb s3://MY_BUCKET_NAME
- Next, we will use the SAM CLI to deploy our Amazon Neptune cluster, networking infrastructure, and Lambda functions. Note that deployment may take 10-20 minutes.
$ sam package \
    --template-file template.yaml \
    --s3-bucket MY_BUCKET_NAME \
    --output-template-file packaged.yaml
$ sam deploy \
    --template-file packaged.yaml \
    --capabilities CAPABILITY_NAMED_IAM \
    --stack-name aws-appsync-session-manager-neptune
- Once complete, we can create sample data by "registering" several users for sessions already in the session catalog (see blog post for further details):
$ node setup/setup.js
With test data loaded into Neptune, we can now generate recommendations via a collaborative filtering approach. In short, this approach uses the graph to find sessions that the user has not yet registered for but that attendees with similar registrations have.
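To make the idea concrete, here is a minimal in-memory sketch of collaborative filtering. It is not the project's Neptune query: the `registrations` map, the user names, and the session IDs are all hypothetical stand-ins for the graph, and "similar" is assumed to mean "shares at least one registration".

```javascript
// Hypothetical stand-in for the graph: user id -> set of registered session ids.
const registrations = new Map([
  ["alice", new Set(["S1", "S2"])],
  ["bob", new Set(["S1", "S2", "S3"])],
  ["carol", new Set(["S2", "S4"])],
  ["dave", new Set(["S5"])],
]);

// Recommend sessions the user is not registered for, ranked by how many
// attendees who share at least one registration with the user attend them.
function recommend(userId, registrations, limit = 10) {
  const mine = registrations.get(userId) || new Set();
  const counts = new Map();

  for (const [otherId, theirs] of registrations) {
    if (otherId === userId) continue;
    // Assumption: "similar" means sharing at least one session.
    const overlaps = [...theirs].some((sessionId) => mine.has(sessionId));
    if (!overlaps) continue;
    for (const sessionId of theirs) {
      if (!mine.has(sessionId)) {
        counts.set(sessionId, (counts.get(sessionId) || 0) + 1);
      }
    }
  }

  // Sort by count (descending), then by id so ties are deterministic.
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, limit)
    .map(([sessionId]) => sessionId);
}

console.log(recommend("alice", registrations));
```

In a graph database the same traversal walks from the user to their sessions, out to other attendees of those sessions, and on to those attendees' other sessions, so no full scan of all users is needed.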
- In the AWS AppSync Console, select "Queries" in the left-hand menu.
- The API requires an authenticated user. Click on the "Login with User Pools" button:
Enter the following:
- ClientId - available as an output of Phase I
- Username - 'user'
- Password - 'NewPassword1%'
- Register the user for a few sessions (a listing of all sessions can be found via the allSessions query; see Phase I for details):
mutation ScheduleSession {
  scheduleSession(SessionId: "SESSION_ID") {
    Sessions {
      Title
      StartTime
      EndTime
    }
  }
}
- Generate recommendations for the currently signed-in user:
query Recommendations {
  recommendations {
    SessionId
    Title
    StartTime
    EndTime
  }
}
- Next, we can generate recommendations for another user by including that user's unique identifier as the userId parameter in our query:
query RecommendationsOther {
  recommendations(userId: "28AA3C63-2454-4B3A-825B-983746CE935A") {
    SessionId
    Title
  }
}
The result of the latter query should be similar to the following:
{
"data": {
"recommendations": [
{
"SessionId": "C516AA84-B0B1-4092-BFD5-D664D743992A",
"Title": "C5 Instances and the Evolution of Amazon EC2 Virtualization (CMP332)"
},
{
"SessionId": "A0439B4B-D359-4253-8C20-989BE39C0EE1",
"Title": "Deep Dive on the Amazon Aurora PostgreSQL-compatible Edition (DAT402)"
},
{
"SessionId": "481BE0E0-DC48-4399-8885-CF4CB7D245AC",
"Title": "Use Amazon Lex to Build a Customer Service Chatbot in Your (DEM72)"
},
...
]
}
}
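Because userId is optional in the schema, the resolver has to decide whose recommendations to return. The following is only a sketch of one way a Lambda function might do that. The event shape (`arguments.userId`, `identity.sub`) is an assumption that depends on how the resolver passes the request to Lambda, and `resolveUserId` is a hypothetical helper, not a function from this project.

```javascript
// Pick the user to generate recommendations for: the explicit userId
// argument if present, otherwise the signed-in user.
// Assumption: the resolver forwards GraphQL arguments as event.arguments
// and the caller's identity as event.identity.
function resolveUserId(event) {
  const requested = event.arguments && event.arguments.userId;
  if (requested) return requested;

  const signedIn = event.identity && event.identity.sub;
  if (!signedIn) {
    throw new Error("No userId argument and no signed-in user");
  }
  return signedIn;
}

// Example events (hypothetical values).
console.log(resolveUserId({ arguments: { userId: "other-user" }, identity: { sub: "me" } }));
console.log(resolveUserId({ arguments: {}, identity: { sub: "me" } }));
```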
To clean up all resources associated with this project, enter the following:
$ aws cloudformation delete-stack \
--stack-name aws-appsync-session-manager-neptune
As part of this project, I have included several additional resources that you may find useful in learning Amazon Neptune:
I found AWS Cloud9 incredibly useful in building and debugging the Lambda functions for this phase of work. Cloud9 is a powerful browser-based IDE and, more importantly for this work, it can run in a VPC. By running my Cloud9 instance in a VPC, I was able to interface directly with my Neptune cluster and use the SAM CLI pre-installed on Cloud9 to debug locally.
To enable Cloud9, modify the included template.yaml by un-commenting the DEV block towards the bottom of the file and deploy again. In the AWS Console, you can then access your Cloud9 instance, which should also have access to the Neptune cluster.
In Cloud9, I used the Gremlin Console to interact with Neptune. Details on connecting to your cluster from the Gremlin Console are available in AWS documentation.
Amazon Neptune also provides the capability to bulk load data from Amazon S3. Within the included template.yaml, un-comment the SAMPLE DATA section towards the bottom of the file to create an S3 Bucket and an IAM Role to load data into Neptune from S3.
After creating the bucket, copy the files included in the data directory (the --recursive flag is needed to copy a directory):
$ aws s3 cp data s3://SAMPLE_DATA_BUCKET_NAME --recursive
Next, add the IAM Role to your Neptune Cluster (https://docs.aws.amazon.com/neptune/latest/userguide/bulk-load-tutorial-IAM.html#bulk-load-tutorial-IAM-add-role-cluster). As of writing, this feature is not included in CloudFormation, but can be achieved via the AWS Console or AWS CLI.
In Cloud9, you can then load the data via curl (note: the role ARN is included in the Outputs of the CloudFormation template):
$ curl -X POST \
-H 'Content-Type: application/json' \
http://your-neptune-endpoint:8182/loader -d '
{
"source" : "s3://SAMPLE_DATA_BUCKET_NAME",
"format" : "format",
"iamRoleArn" : "arn:aws:iam::{YOUR_ACCOUNT_ID}:role/{YOUR_ROLE_NAME}",
"region" : "YOUR_REGION",
"failOnError" : "FALSE"
}'
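Note that "format" in the request above is a placeholder. As a hedged sketch, the request body could be assembled and checked before it is sent. `buildLoaderRequest` is a hypothetical helper. The list of accepted format values reflects my understanding of the Neptune loader documentation and may be incomplete, so check it against the current reference.

```javascript
// Formats the Neptune bulk loader is understood to accept (assumption;
// verify against the current Neptune loader reference).
const LOADER_FORMATS = ["csv", "ntriples", "nquads", "rdfxml", "turtle"];

// Build the JSON body for a POST to the cluster's /loader endpoint.
function buildLoaderRequest({ bucket, format, roleArn, region, failOnError = false }) {
  if (!LOADER_FORMATS.includes(format)) {
    throw new Error(`Unsupported loader format: ${format}`);
  }
  return {
    source: `s3://${bucket}`,
    format,
    iamRoleArn: roleArn,
    region,
    // The loader expects the strings "TRUE" or "FALSE" rather than booleans.
    failOnError: failOnError ? "TRUE" : "FALSE",
  };
}

// Example with placeholder values.
const body = buildLoaderRequest({
  bucket: "SAMPLE_DATA_BUCKET_NAME",
  format: "csv",
  roleArn: "arn:aws:iam::123456789012:role/EXAMPLE_ROLE",
  region: "us-east-1",
});
console.log(JSON.stringify(body, null, 2));
```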
This project generally leverages DynamoDB Streams and Lambda functions to load data to Neptune, but bulk loading can be helpful in other scenarios.
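The stream-driven path can be sketched roughly as follows. This is an illustration under assumptions, not the project's Lambda code: the attribute names (`UserId`, `SessionId`), the `registered` edge label, and the operation descriptors are hypothetical, and the step that actually writes to Neptune is left out. Only the overall record layout (`eventName`, `dynamodb.NewImage`, `dynamodb.OldImage`) follows the standard DynamoDB Streams event format.

```javascript
// Translate one DynamoDB Streams record into a graph operation descriptor.
// Attribute names and the edge label are assumptions for illustration.
function toGraphOperation(record) {
  const removed = record.eventName === "REMOVE";
  // OldImage is only present if the stream view type includes old images.
  const image = removed ? record.dynamodb.OldImage : record.dynamodb.NewImage;
  if (!image || !image.UserId || !image.SessionId) return null;

  return {
    action: removed ? "dropEdge" : "upsertEdge",
    label: "registered",
    from: image.UserId.S,
    to: image.SessionId.S,
  };
}

// Lambda-style handler: each stream event carries a batch of records.
function handler(event) {
  return event.Records.map(toGraphOperation).filter((op) => op !== null);
}

// Example event with hypothetical values.
const sampleEvent = {
  Records: [
    {
      eventName: "INSERT",
      dynamodb: { NewImage: { UserId: { S: "user-1" }, SessionId: { S: "S1" } } },
    },
    {
      eventName: "REMOVE",
      dynamodb: { OldImage: { UserId: { S: "user-1" }, SessionId: { S: "S2" } } },
    },
  ],
};
console.log(handler(sampleEvent));
```

Keeping the translation step free of any database client makes it easy to test locally. A real function would then turn each descriptor into a Gremlin traversal against the cluster.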
- Josh Kahn - Initial work