Invoicely is a serverless application designed to process invoices (PDFs or images) uploaded to AWS S3. It extracts important data like CNPJ, total value, and items using AWS Textract, stores the extracted information in DynamoDB, and notifies the user upon success or failure via SNS. Additionally, the system includes robust error handling and retry mechanisms using SQS and DLQ (Dead Letter Queue) to ensure reliable processing.
- 🧾 Extracts data (CNPJ, total value, items) from invoices using Textract.
- 🗄️ Stores extracted data in DynamoDB.
- 🔄 Retry mechanism with SQS to reprocess failed files automatically.
- ⚠️ Dead Letter Queue for failed messages after max retry attempts.
- 📧 Notifications via SNS upon successful or failed processing.
- 🏷️ Tracks status of each invoice processing (processing, completed, failed) in DynamoDB.
- 🖥️ Fully containerized using Docker for local development.
- Why? S3 provides scalable storage to handle uploads of large volumes of invoices (PDFs/images) with secure access.
- Why? Textract is ideal for extracting structured and unstructured data from invoices, eliminating the need for manual input.
- Why? DynamoDB offers scalable, serverless data storage with fast access, which fits well in this event-driven architecture.
- Why? SQS queues allow us to decouple and asynchronously process invoice uploads. It also enables a retry mechanism in case of failure.
- Why? DLQ ensures that after several retry attempts, failed messages are stored in a separate queue for further inspection.
- Why? SNS enables real-time notifications to users or external systems about the processing status of invoices (success or failure).
- Why? Docker allows the application to run consistently across different environments by packaging all dependencies in containers.
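
Text returned by Textract usually needs some post-processing before it is stored. As an illustration, here is a minimal sketch of how a CNPJ could be located and validated in extracted text. The function names (`cnpjCheckDigit`, `isValidCnpj`, `extractCnpjs`) are hypothetical and not part of the project; the check-digit rule is the standard modulo-11 scheme for 14-digit numeric CNPJs.

```javascript
// Hypothetical helpers (not from the project): find and validate CNPJs in
// plain text such as the lines returned by Textract.
// Only the 14-digit numeric CNPJ format is handled.

// Computes one check digit with the modulo-11 rule. Weights run 2..9 from
// right to left and then start again at 2.
function cnpjCheckDigit(digits) {
  let weight = 2;
  let sum = 0;
  for (let i = digits.length - 1; i >= 0; i--) {
    sum += digits[i] * weight;
    weight = weight === 9 ? 2 : weight + 1;
  }
  const remainder = sum % 11;
  return remainder < 2 ? 0 : 11 - remainder;
}

// Returns true when the value contains exactly 14 digits and both check
// digits match. Punctuation (dots, slash, dash) is ignored.
function isValidCnpj(value) {
  const digits = String(value).replace(/\D/g, "").split("").map(Number);
  if (digits.length !== 14) return false;
  // Repeated-digit sequences are conventionally rejected; the all-zero
  // sequence would otherwise pass the arithmetic.
  if (digits.every((d) => d === digits[0])) return false;
  const first = cnpjCheckDigit(digits.slice(0, 12));
  const second = cnpjCheckDigit(digits.slice(0, 13));
  return first === digits[12] && second === digits[13];
}

// Returns every valid CNPJ found in the text, normalized to digits only.
function extractCnpjs(text) {
  const pattern = /\d{2}\.?\d{3}\.?\d{3}\/?\d{4}-?\d{2}/g;
  const matches = text.match(pattern) || [];
  return matches.filter(isValidCnpj).map((m) => m.replace(/\D/g, ""));
}

console.log(extractCnpjs("CNPJ: 11.222.333/0001-81 Total: R$ 150,00"));
```

Validating the check digits before saving helps catch OCR misreads (for example a `1` read as a `7`) early, so the invoice can be marked as failed instead of storing a wrong identifier.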
Ensure you have the following installed on your machine:
- Node.js (v14+)
- AWS CLI (configured with appropriate credentials)
- Docker (for running the app locally)
- npm (Node package manager)
- Clone the repository:

      git clone https://github.com/your-repo/invoicely.git
      cd invoicely

- Install dependencies:

      npm install
- Set up your environment variables: create a `.env` file in the root of the project and configure the following:

      AWS_REGION=your-region
      DYNAMODB_TABLE_NAME=your-dynamodb-table
      S3_BUCKET_NAME=your-s3-bucket
      SNS_TOPIC_ARN=your-sns-topic
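
A missing variable tends to surface late, as an obscure AWS SDK error. The sketch below shows one way the four variables listed above could be checked at startup. `loadConfig` and the returned property names are hypothetical, and the sketch assumes the variables have already been loaded into `process.env` (for example by Docker Compose or a dotenv-style loader).

```javascript
// Hypothetical startup check (not from the project): fail fast when a
// required environment variable is missing or blank.
const REQUIRED_VARS = [
  "AWS_REGION",
  "DYNAMODB_TABLE_NAME",
  "S3_BUCKET_NAME",
  "SNS_TOPIC_ARN",
];

// Accepts an env object so it can be tested without touching process.env.
function loadConfig(env = process.env) {
  const missing = REQUIRED_VARS.filter(
    (name) => !env[name] || env[name].trim() === ""
  );
  if (missing.length > 0) {
    throw new Error(
      `Missing required environment variables: ${missing.join(", ")}`
    );
  }
  // Property names below are illustrative assumptions.
  return {
    region: env.AWS_REGION,
    tableName: env.DYNAMODB_TABLE_NAME,
    bucketName: env.S3_BUCKET_NAME,
    topicArn: env.SNS_TOPIC_ARN,
  };
}
```

Calling `loadConfig()` once when the process starts, and passing the result to the rest of the code, keeps direct `process.env` lookups in a single place.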
- Running locally with Docker: build and run the Docker container:

      docker-compose up --build
- Upload an invoice (PDF/image) to S3:
  - You can manually upload a file to the S3 bucket you configured in your `.env` file, or use a tool like Postman to make a POST request to the upload route: `POST /api/upload`
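
Since only PDFs and images can be processed, it may be worth rejecting other files before they reach the bucket. The following is a minimal sketch of such a check. `isSupportedInvoiceFile` is a hypothetical helper, and the extension list reflects the document formats Textract accepts (PDF, PNG, JPEG, TIFF); whether the project's upload route performs a check like this is an assumption.

```javascript
const path = require("path");

// Extensions for the formats Textract accepts: PDF, PNG, JPEG and TIFF.
const SUPPORTED_EXTENSIONS = new Set([
  ".pdf",
  ".png",
  ".jpg",
  ".jpeg",
  ".tif",
  ".tiff",
]);

// Hypothetical helper (not from the project): returns true when the file
// name has an extension Textract can process. The comparison ignores case.
function isSupportedInvoiceFile(filename) {
  const extension = path.extname(filename).toLowerCase();
  return SUPPORTED_EXTENSIONS.has(extension);
}

console.log(isSupportedInvoiceFile("nota-fiscal.PDF"));
console.log(isSupportedInvoiceFile("notes.txt"));
```

An extension check is only a first filter; a renamed file can still fail inside Textract, which is the case the retry mechanism covers.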
- Automatic processing:
  - The upload automatically sends a message to the SQS queue, which invokes the Lambda function to extract data using Textract.
  - Once the data is extracted, it is saved in DynamoDB.
  - SNS will notify you upon success or failure.
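
To make the flow above more concrete, here is a sketch of how the Lambda function might work out which file to process. It assumes S3 publishes its event notifications directly to the SQS queue, so each SQS message body is an S3 event in JSON form. `extractS3Objects` is a hypothetical name.

```javascript
// Hypothetical helper (not from the project): list the S3 objects referenced
// by an SQS-triggered Lambda event.
// Assumes S3 event notifications are delivered straight to the queue.
function extractS3Objects(sqsEvent) {
  const objects = [];
  for (const record of sqsEvent.Records || []) {
    const body = JSON.parse(record.body);
    // S3 sends an "s3:TestEvent" message without Records when notifications
    // are first configured; it is skipped by the empty-array default.
    for (const s3Record of body.Records || []) {
      objects.push({
        messageId: record.messageId,
        bucket: s3Record.s3.bucket.name,
        // Object keys arrive URL-encoded, with spaces encoded as "+".
        key: decodeURIComponent(s3Record.s3.object.key.replace(/\+/g, " ")),
      });
    }
  }
  return objects;
}
```

Decoding the key matters in practice: an invoice uploaded as `nota fiscal 001.pdf` arrives as `nota+fiscal+001.pdf`, and passing the raw value to Textract or S3 would point at an object that does not exist.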
- Retry mechanism:
  - If a failure occurs during processing (e.g., a Textract failure), SQS will retry the process.
  - After a maximum of 3 attempts, the message will be sent to the Dead Letter Queue (DLQ) for further inspection.
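
The retry behaviour described above can be sketched from the Lambda side as follows. This is an illustration under stated assumptions, not the project's handler: it assumes the queue's redrive policy uses a `maxReceiveCount` of 3, and that the event source mapping has `ReportBatchItemFailures` enabled so that only failed messages return to the queue. `handleBatch`, `isFinalAttempt`, `processRecord` and `onFinalFailure` are hypothetical names.

```javascript
// Assumed to match maxReceiveCount in the queue's redrive policy.
const MAX_ATTEMPTS = 3;

// Lambda exposes the delivery count of an SQS message as a string attribute.
function isFinalAttempt(record, maxAttempts = MAX_ATTEMPTS) {
  return Number(record.attributes.ApproximateReceiveCount) >= maxAttempts;
}

// Processes each message and reports the failed ones back to Lambda.
// processRecord: caller-supplied async function that handles one message.
// onFinalFailure: optional async hook, called when no retries remain
// (for example to mark the invoice as failed and publish to SNS).
async function handleBatch(event, processRecord, onFinalFailure = async () => {}) {
  const batchItemFailures = [];
  for (const record of event.Records) {
    try {
      await processRecord(record);
    } catch (err) {
      if (isFinalAttempt(record)) {
        await onFinalFailure(record, err);
      }
      // Listing the message here leaves it on the queue; SQS either redelivers
      // it or moves it to the DLQ once the receive count is exhausted.
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }
  return { batchItemFailures };
}
```

Returning `batchItemFailures` instead of throwing means one bad invoice does not force the whole batch to be reprocessed.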
- Monitoring status:
  - The status of each file (`processing`, `completed`, or `failed`) can be tracked in the DynamoDB table.
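
One way to keep these statuses consistent is to validate every status change before writing it. The sketch below is an assumption about how that could look: the attribute names (`invoiceId`, `status`, `updatedAt`) and the allowed transitions, including `failed` back to `processing` on a retry, are illustrative and may differ from the project's actual table design.

```javascript
// Assumed transition rules (not from the project). A retry is modelled as
// failed -> processing; completed is treated as final.
const TRANSITIONS = {
  processing: ["completed", "failed"],
  failed: ["processing"],
  completed: [],
};

// Returns a new item with the updated status and timestamp, or throws when
// the transition is not allowed. The input item is not modified.
function nextStatusRecord(current, newStatus, now = new Date()) {
  const allowed = TRANSITIONS[current.status];
  if (!allowed || !allowed.includes(newStatus)) {
    throw new Error(
      `Invalid status transition: ${current.status} -> ${newStatus}`
    );
  }
  return { ...current, status: newStatus, updatedAt: now.toISOString() };
}

const item = { invoiceId: "inv-001", status: "processing" };
console.log(nextStatusRecord(item, "completed"));
```

The returned object could then be written to DynamoDB; the write itself is omitted here because it depends on the SDK version and table schema in use.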
- Build the Docker image:

      docker-compose build

- Start the application:

      docker-compose up

- Run tests (for services, controllers, etc.):

      npm test
- Architecture: This project follows a fully serverless architecture leveraging AWS Lambda, S3, DynamoDB, Textract, SQS, and SNS to create an event-driven, scalable system for invoice processing.
- Clean Code: The project adheres to clean code principles, ensuring maintainability and readability.
- Error Handling: Failed messages are retried through SQS up to 3 times before being moved to the DLQ for manual inspection.
You can deploy this application using AWS Lambda and AWS S3 via AWS CLI or using infrastructure-as-code tools such as Terraform or AWS CloudFormation.
Here’s an example of how you might deploy the Lambda function using AWS CLI:
    # Use a runtime that Lambda currently supports; nodejs14.x is deprecated.
    aws lambda create-function \
      --function-name InvoicelyLambda \
      --zip-file fileb://function.zip \
      --handler index.handler \
      --runtime nodejs20.x \
      --role arn:aws:iam::YOUR_ACCOUNT_ID:role/service-role/YOUR_ROLE_NAME
- 🔍 Advanced Error Handling: Integration with AWS CloudWatch Alarms to monitor processing failures in real-time.
- 🚀 Performance Improvements: Adding caching layers using Redis for faster access to previously processed invoices.
- 📊 Data Analytics: Build a dashboard for viewing the status and history of processed invoices using Amazon QuickSight.