A common problem is finding every appearance of a family member in the photos of a family archive. Here is a solution.
- Scans images in a given directory (/dir) and its subdirectories and detects the faces in them.
- Saves the detected faces in a separate directory (/dir/tmp).
- Clusters the faces using several methods (DBSCAN, OPTICS, HDBSCAN, KMeans, AgglomerativeClustering).
- Saves the clustered faces in separate directories (/dir/tmp/method/cluster_number).
- Saves the results in the clusters.csv file, which records the clustering along with the image paths, the faces, and the hashes of the original images.
- The results can then be used for:
  - finding images of a specific person;
  - indexing and tagging images by predefined persons;
  - finding similar images.
- Before running the script, install the dependencies from a terminal:
  - sudo apt-get install libboost-all-dev libgtk-3-dev build-essential cmake
  - pip install face-recognition
- Run the file clusters.py.
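
The scanning, detection, and clustering steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in clusters.py: the helper names (`find_images`, `encode_faces`, `cluster_faces`, `cluster_dir`), the list of image extensions, and the DBSCAN parameters are assumptions, and it assumes that scikit-learn and NumPy are installed alongside face-recognition.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of photo extensions


def find_images(root):
    """Recursively list image files under root, skipping the tmp output directory."""
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file()
        and p.suffix.lower() in IMAGE_SUFFIXES
        and "tmp" not in p.relative_to(root).parts
    )


def encode_faces(image_paths):
    """Return (source path, face box, 128-d encoding) for every detected face."""
    import face_recognition  # imported lazily so the other helpers work without it

    faces = []
    for path in image_paths:
        image = face_recognition.load_image_file(str(path))
        boxes = face_recognition.face_locations(image)
        encodings = face_recognition.face_encodings(image, known_face_locations=boxes)
        faces.extend((path, box, enc) for box, enc in zip(boxes, encodings))
    return faces


def cluster_faces(faces, eps=0.5):
    """Label each face with a cluster number using DBSCAN (-1 marks noise)."""
    import numpy as np
    from sklearn.cluster import DBSCAN  # assumption: scikit-learn is available

    encodings = np.array([enc for _, _, enc in faces])
    model = DBSCAN(eps=eps, min_samples=2, metric="euclidean")  # assumed parameters
    return model.fit_predict(encodings).tolist()


def cluster_dir(root, method, label):
    """Directory for one cluster, following the /dir/tmp/method/cluster_number layout."""
    return Path(root) / "tmp" / method / str(label)
```

A typical call chain would be `faces = encode_faces(find_images("/dir"))` followed by `labels = cluster_faces(faces)`, after which each face crop could be written under `cluster_dir("/dir", "DBSCAN", label)`. DBSCAN is used here only because it does not need the number of clusters in advance; any of the other listed methods could be substituted.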
You can check the results for some group photos of G20 leaders in the directory ./stock/tmp.
The AffinityPropagation method appears to give the best result, given that the number of clusters is not known a priori.
The file clusters.csv contains data for the 389 faces found, including the paths to the original photos and their hash256 values.
This makes it possible to look up indexed photos by their hashes in any other storage.
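
As an illustration of the hash lookup, the sketch below hashes a photo found elsewhere and returns the matching rows of the CSV file. It assumes that the stored hash is a hex SHA-256 digest of the file contents and that it lives in a column called `image_hash`; both the column name and the helper names are hypothetical, so adjust them to the actual header of the CSV file.

```python
import csv
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Hex SHA-256 digest of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def rows_for_photo(csv_path, photo_path, hash_column="image_hash"):
    """Return the CSV rows whose stored hash matches the given photo."""
    wanted = file_sha256(photo_path)
    with open(csv_path, newline="") as handle:
        # hash_column is an assumed column name, not taken from the real file
        return [row for row in csv.DictReader(handle) if row.get(hash_column) == wanted]
```

Because the lookup depends only on file contents, a copy of a photo that was renamed or moved to another disk still matches its row, as long as the file itself was not re-encoded.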