chg ! dedup engine #15

Merged
merged 1 commit into from
Oct 8, 2024
3 changes: 1 addition & 2 deletions .gitignore
@@ -1,6 +1,5 @@
.*
~*
__pycache__

!.github

__pycache__
1 change: 1 addition & 0 deletions docs/components/hde/deduplication_description.md
@@ -0,0 +1 @@
It provides users with powerful capabilities to identify and remove duplicate records within the system, ensuring that data remains clean, consistent, and reliable.
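This one-line file is what the `--8<-- "components/hde/deduplication_description.md"` lines elsewhere in this diff pull in. The marker resembles the PyMdown Extensions Snippets notation, but that is an inference: the diff does not show the docs build configuration. As a rough illustration of the effect only (not the extension's actual implementation), a hypothetical `expand_snippets` helper could be sketched like this:

```python
import re
from pathlib import Path

# Matches a single-line include such as:
#   --8<-- "components/hde/deduplication_description.md"
SNIPPET_RE = re.compile(r'^--8<-- "(?P<path>[^"]+)"\s*$')


def expand_snippets(text: str, base_path: Path) -> str:
    """Replace each single-line include marker with the referenced file's content.

    Hypothetical helper for illustration; paths are resolved relative to base_path.
    """
    out = []
    for line in text.splitlines():
        match = SNIPPET_RE.match(line)
        if match:
            included = (base_path / match.group("path")).read_text(encoding="utf-8")
            out.append(included.rstrip("\n"))
        else:
            out.append(line)
    return "\n".join(out)
```

With `base_path` pointing at the `docs` directory, both `index.md` and the glossary page would render the same description text, which is the duplication this PR removes.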
1 change: 1 addition & 0 deletions docs/components/hde/development.md
@@ -3,6 +3,7 @@
To develop the service locally, you can utilize the provided `compose.yml` file. This configuration file defines all the necessary services, including the primary application and its dependencies, to create a consistent development environment. By using **Docker Compose**, you can effortlessly spin up the entire application stack, ensuring that all components work seamlessly together.

To build and start the service, along with its dependencies, run the following command:

docker compose up --build


5 changes: 5 additions & 0 deletions docs/components/hde/did/workflow.md
@@ -1,3 +1,8 @@
---
tags:
- Deduplication
---

The Image Processing and Duplicate Detection workflow is designed to provide reliable face detection, recognition, and duplicate detection by leveraging a pre-trained deep learning model.

## Inference Mode Operation
3 changes: 2 additions & 1 deletion docs/components/hde/index.md
@@ -1,7 +1,8 @@
# Deduplication

-Deduplication Engine component of the HOPE ecosystem. It provides users with powerful capabilities to identify and remove duplicate records within the system, ensuring that data remains clean, consistent, and reliable.
+Deduplication Engine component of the HOPE ecosystem.

--8<-- "components/hde/deduplication_description.md"

## Repository

7 changes: 6 additions & 1 deletion docs/components/hde/setup.md
@@ -1,3 +1,8 @@
---
tags:
- Deduplication
---

## Prerequisites

This project utilizes [PDM](https://pdm-project.org/) as the package manager for managing Python dependencies and environments.
@@ -78,7 +83,7 @@ This backend is used for storing locally downloaded DNN model files and encoded
##### FILE_STORAGE_DNN
This backend is dedicated to storing DNN model files. Ensure that the following two files are present in this storage:

-1. *deploy.prototxt*: Defines the model architecture.
+1. *deploy.prototxt.txt*: Defines the model architecture.
2. *res10_300x300_ssd_iter_140000.caffemodel*: Contains the pre-trained model weights.

The current process involves downloading files from a [GitHub repository](https://github.com/sr6033/face-detection-with-OpenCV-and-DNN) and saving them to this specific Azure Blob Storage using the command `django-admin upgrade --with-dnn-setup`, or the specialized `django-admin dnnsetup` command.
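A quick way to confirm that both files are in place is a presence check. The sketch below is an assumption-laden illustration: it inspects a local directory rather than the Azure Blob Storage backend the docs describe, and `missing_dnn_files` is a hypothetical helper, not part of the project. The file names are taken from the updated list above.

```python
from pathlib import Path

# File names as listed in the updated setup.md.
REQUIRED_DNN_FILES = (
    "deploy.prototxt.txt",
    "res10_300x300_ssd_iter_140000.caffemodel",
)


def missing_dnn_files(storage_dir: Path) -> list[str]:
    """Return the required DNN file names that are absent from storage_dir.

    Hypothetical helper: checks a local directory only, not a remote backend.
    """
    return [name for name in REQUIRED_DNN_FILES if not (storage_dir / name).is_file()]
```

An empty list means both files were found; otherwise the returned names show what still needs to be downloaded.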
3 changes: 2 additions & 1 deletion docs/components/hde/troubleshooting.md
@@ -2,4 +2,5 @@ If you encounter issues while running the service, the **admin panel** can be a

To efficiently track and monitor errors within the application, **Sentry** is integrated as the primary tool for error logging and alerting.

-For Sentry to work correctly, ensure that the **SENTRY_DSN** environment variable is set.
+!!! warning "Sentry environment"
+    For Sentry to work correctly, ensure that the **SENTRY_DSN** environment variable is set.
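The warning can be backed by a small startup check. This is a minimal sketch, assuming the service reads `SENTRY_DSN` from the process environment; `sentry_dsn_configured` is a hypothetical name and the check does not validate the DSN itself.

```python
import os


def sentry_dsn_configured(environ=os.environ) -> bool:
    """Return True when SENTRY_DSN is present and not blank.

    Hypothetical helper: it only checks presence, not whether the DSN is valid.
    """
    return bool(environ.get("SENTRY_DSN", "").strip())
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.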
2 changes: 2 additions & 0 deletions docs/glossary/terms/process.md
@@ -26,4 +26,6 @@ Sometimes used as a term pre-intervention to talk about who we are targeting.</p

## Deduplication

--8<-- "components/hde/deduplication_description.md"

#