This repository is the implementation of Latent BKI, aiming to reproduce the experiment results shown in the paper.
conda env create -f environment.yml # this may take some time
conda activate latentbki_env
Some packages need manual installation:
- CLIP: follow the instructions in its repo.
- torchsparse (sudo apt-get install libsparsehash-dev, pip install --upgrade git+https://github.com/mit-han-lab/torchsparse.git@v1.4.0)
- pytorch3d
(Optional) Install ROS Noetic to visualize the map in RViz.
Note: If you run into any issues, also check the dependent codebases.
Follow VLMap's Generate dataset section to get MP3D sequences. The generated ground truth semantics are slightly incorrect, so we provide a modification to obtain correct ground truth data here.
Follow 2DPASS' Data Preparation section to obtain the SemanticKITTI dataset under the Dataset folder.
Download the SPVCNN model checkpoint from here and put it under ./TwoDPASS/pretrained/SPVCNN.
- Download the Record3D app on an iPhone/iPad, record a video, and export it as a .r3d file.
- Extract the .r3d file the same way you would extract a zip file; you will get a folder named rgbd and a metadata file.
- Run Data/select_r3d_frames.py with customized parameters to create the following real-world dataset folder structure:
/[your dataset path]
├──
├── ...
└── real_world/
    ├── [sequence name]
    │   ├── conf/
    │   │   ├── 000000.npy
    │   │   ├── 000001.npy
    │   │   └── ...
    │   ├── depth/
    │   │   ├── 000000.npy
    │   │   ├── 000001.npy
    │   │   └── ...
    │   ├── rgb/
    │   │   ├── 000000.jpg
    │   │   ├── 000001.jpg
    │   │   └── ...
    │   ├── intrinsics.txt
    │   └── poses.txt
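Before running the mapping code, it can help to confirm that a sequence folder matches the layout above. The following is a minimal sketch, not part of the repository: the function name `check_sequence` is hypothetical, and the rule that every frame index should appear in all of conf/, depth/, and rgb/ is an assumption inferred from the matching file names in the tree.

```python
from pathlib import Path


def check_sequence(seq_dir):
    """Return a list of problems found in one real_world/[sequence name] folder."""
    seq_dir = Path(seq_dir)
    problems = []

    # Per-frame folders and the file extension each one holds in the tree above.
    expected = {"conf": ".npy", "depth": ".npy", "rgb": ".jpg"}
    frame_ids = {}
    for name, ext in expected.items():
        folder = seq_dir / name
        if not folder.is_dir():
            problems.append(f"missing folder: {name}/")
            continue
        frame_ids[name] = {p.stem for p in folder.iterdir() if p.suffix == ext}
        if not frame_ids[name]:
            problems.append(f"no {ext} files in {name}/")

    # Assumption: every frame index should be present in all three folders.
    if len(frame_ids) == len(expected):
        common = set.intersection(*frame_ids.values())
        for name, ids in frame_ids.items():
            extra = sorted(ids - common)
            if extra:
                problems.append(f"{name}/ has unmatched frames, e.g. {extra[0]}")

    for name in ("intrinsics.txt", "poses.txt"):
        if not (seq_dir / name).is_file():
            problems.append(f"missing file: {name}")
    return problems
```

Calling `check_sequence("/[your dataset path]/real_world/[sequence name]")` returns an empty list when nothing is missing, so the result can be printed or asserted on directly.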
You can download an already processed Record3D data example, my_house_long.zip, here.
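Since a .r3d export can be extracted like a zip file, the extraction step can also be scripted. This is a small sketch using only the Python standard library; the function name `extract_r3d` and the output location are hypothetical, and it does not replace Data/select_r3d_frames.py, which still has to be run afterwards.

```python
import zipfile
from pathlib import Path


def extract_r3d(r3d_path, out_dir):
    """Unpack a Record3D .r3d export (treated as a zip archive) into out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(r3d_path) as archive:
        archive.extractall(out_dir)
    # Return the top-level entries so the caller can confirm that
    # the rgbd folder and the metadata file are present.
    return sorted(p.name for p in out_dir.iterdir())
```

For a fresh output folder, the returned list should contain `rgbd` and `metadata`, as described in the extraction step.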
Download mp3d_pca_64.pkl here to ./PCAonGPU/PCA_instance.
In ./config/mp3d.yaml, provide paths for the following parameters:
- data_dir: "/path/to/realworld/dataset/folder"
- pca_path: "/path/to/trained/pca/.pkl/file"
Other parameters are optional if you only want to reproduce the results.
Modify realworld.yaml under ./config.
Required parameters:
- num_classes: [number of class desired]
- data_dir: "/path/to/realworld/dataset/folder"
- pca_path: "/path/to/trained/pca/.pkl/file"
- intrinsic: [matrix from intrinsics.txt]
- sequences: [
[your_sequences_name]
]
- category: [
[List of words you want to decode]
]
Optional parameters:
- feature_size: [PCA downsampled size, default 64]
- grid_mask: [ignore points outside local grid, default True]
- down_sample_feature: [default True]
- raw_data: [Set to True only if features are saved to disk]
- subsample_points: [how many pixel features to use; default 1, which uses all features]
- feature_dir: [set it only if you save latent feature to disk]
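A quick check for forgotten required parameters can save a failed run. The sketch below is illustrative only: it assumes the file has already been loaded into a plain dict (for example with `yaml.safe_load` from PyYAML) and that the required parameters are top-level keys, which the repository may or may not do. The helper name `missing_required` is hypothetical.

```python
# Required realworld.yaml parameters, copied from the list above.
REQUIRED_KEYS = (
    "num_classes",
    "data_dir",
    "pca_path",
    "intrinsic",
    "sequences",
    "category",
)


def missing_required(config):
    """Return the required keys that are absent or left empty in config."""
    # Assumption: config is a plain dict with the parameters at the top level.
    return [key for key in REQUIRED_KEYS if config.get(key) in (None, "", [])]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = {"num_classes": 5, "data_dir": "/path/to/realworld/dataset/folder"}
    print(missing_required(example))
```

An empty list means every required parameter has a value; the optional parameters are deliberately not checked.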
NOTE: semantic_kitti.yaml is used to provide additional parameters, such as feature size. We are using the dataloader in 2DPASS. Change the following parameters in TwoDPASS/config/SPVCNN-semantickitti.yaml:
train_data_loader:
  data_path: "/path/to/kitti/dataset/sequences"
val_data_loader:
  data_path: "/path/to/kitti/dataset/sequences"
In ./generate_results.py, set MODEL_NAME to one of the following:
- "LatentBKI_default": latent mapping using MP3D
- "LatentBKI_kitti": latent mapping using semantic KITTI
- "LatentBKI_vlmap": including vlmap heuristic for comparison experiment
- "LatentBKI_realworld": map real-world environment captured by Record3D
The generated latent map and evaluation results for each sequence will be under the Results folder.
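Because MODEL_NAME is a free-form string, a typo is easy to make. The following sketch shows one way to guard against that before launching a long run; it is not code from generate_results.py, and the names `MODEL_NAMES` and `describe_model` are hypothetical.

```python
# Accepted MODEL_NAME values and their descriptions, copied from the list above.
MODEL_NAMES = {
    "LatentBKI_default": "latent mapping using MP3D",
    "LatentBKI_kitti": "latent mapping using semantic KITTI",
    "LatentBKI_vlmap": "including vlmap heuristic for comparison experiment",
    "LatentBKI_realworld": "map real-world environment captured by Record3D",
}


def describe_model(model_name):
    """Return the description for model_name, or raise ValueError if it is unknown."""
    if model_name not in MODEL_NAMES:
        valid = ", ".join(sorted(MODEL_NAMES))
        raise ValueError(f"unknown MODEL_NAME {model_name!r}; expected one of: {valid}")
    return MODEL_NAMES[model_name]
```

For example, `describe_model("LatentBKI_kitti")` returns the KITTI description, while a misspelled name raises an error that lists the accepted values.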
In ./inference.py, provide the following parameters:
- RESULT_SAVE: the folder that contains the map you want to evaluate
- MODEL_NAME: The model you used to create the above map
- scenes: the sequences you want to evaluate
The evaluation result will be saved as a results.txt file under the folder you provided to RESULT_SAVE.
- Run ./publish_map.py with latent_map and category_map set to the map you want to visualize.
- Open Rviz and subscribe to topic visualization_marker_array

- Run ./publish_map.py with customized MODEL_NAME and latent_map_path parameter.
- Open Rviz and subscribe to topic Open_Query/Heatmap and Open_Query/Uncertainty
- In terminal follow the prompt to query arbitrary word.