use pre-trained models to automatically add many more labels to LAION-Aesthetics and/or LAION-400M
- semantic segmentation
- instance segmentation and classification
- depth
- ImageNet class
- common and/or multilingual text embeddings
- common image embeddings (how big are the pre-computed CLIP embeddings?)
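
On the CLIP embedding size question above, a back-of-the-envelope estimate is easy to sketch. The inputs are assumptions, not measurements: 512-dimensional vectors (the output size of CLIP ViT-B/32), float16 storage, and 400 million rows. Swap in the real values once the actual embedding files are inspected.

```python
def embedding_storage_bytes(num_items: int, dim: int, bytes_per_value: int) -> int:
    """Raw size of a dense embedding matrix, ignoring file-format overhead."""
    return num_items * dim * bytes_per_value


# Assumed values (not taken from the released files):
NUM_IMAGES = 400_000_000   # roughly the size of the 400m set
DIM = 512                  # CLIP ViT-B/32 embedding width
BYTES_FP16 = 2             # float16

total = embedding_storage_bytes(NUM_IMAGES, DIM, BYTES_FP16)
print(f"{total / 1e9:.1f} GB")  # prints "409.6 GB"
```

Storing float32 instead doubles this, and text embeddings of the same width add the same amount again.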
be sure to put the dataset version in its name, so maybe LAIONpp_2022v1 or something like that.
- anticipate that we will update low-confidence labels and/or apply better estimation models in the future
- choose a handful of enrichments to start with
- pick some models
- benchmark models to get a rough latency-per-batch estimate
- form an estimate of how much compute/time this would require to run. probably not much.
- estimate data storage requirement. this will be nontrivial. maybe coordinate with TheEye for hosting?
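
The benchmark and compute-estimate steps above could be sketched roughly as follows. This is only a minimal illustration: `measure_throughput` and `estimate_compute_hours` are hypothetical helper names, `run_batch` stands in for whatever model call ends up being benchmarked, and the 1000 items/second figure in the usage line is a placeholder, not a measured number.

```python
import time
from typing import Callable, Sequence


def measure_throughput(run_batch: Callable[[Sequence], object],
                       batch: Sequence,
                       repeats: int = 5) -> float:
    """Return items processed per second, averaged over `repeats` timed calls."""
    run_batch(batch)  # warm-up call, excluded from the timing
    start = time.perf_counter()
    for _ in range(repeats):
        run_batch(batch)
    elapsed = time.perf_counter() - start
    return len(batch) * repeats / elapsed


def estimate_compute_hours(num_items: int, items_per_second: float) -> float:
    """Extrapolate single-worker wall-clock hours for the full dataset."""
    return num_items / items_per_second / 3600


# Placeholder throughput of 1000 items/s over 400 million items:
hours = estimate_compute_hours(400_000_000, 1000.0)
print(f"{hours:.1f} hours on one worker")  # prints "111.1 hours on one worker"
```

When timing a model on a GPU, the device should be synchronized before reading the clock (for example with `torch.cuda.synchronize()` in PyTorch), otherwise asynchronous kernel launches make the measured latency look too small. Dividing the result by the number of workers gives a first-order wall-clock figure, ignoring data loading and I/O.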