- Explanatory interactive learning: Can we, by interacting with models during training, encourage their explanations to line up with our priors on what parts of the input are relevant?
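One concrete instance of this idea is a "right for the right reasons"-style penalty: alongside the usual classification loss, penalize the model's input gradients on regions an expert has marked irrelevant. The sketch below is a minimal, hypothetical illustration for logistic regression (where the input gradient has a closed form), not the implementation from any particular paper; `irrelevant_mask` and `lam` are names chosen here for illustration.

```python
import numpy as np

def rrr_loss(w, X, y, irrelevant_mask, lam=10.0):
    """Cross-entropy plus an input-gradient penalty on features
    the expert marked irrelevant (mask entry 1 = irrelevant).
    Minimal sketch for logistic regression only."""
    p = 1.0 / (1.0 + np.exp(-X @ w))  # predicted probabilities
    eps = 1e-9
    ce = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    # For logistic regression, the gradient of the cross-entropy
    # w.r.t. the inputs of sample i is (p_i - y_i) * w.
    input_grads = (p - y)[:, None] * w[None, :]
    # Penalize explanation mass that falls on irrelevant features.
    penalty = lam * np.mean((input_grads * irrelevant_mask) ** 2)
    return ce + penalty
```

A model that achieves the same accuracy by leaning on a spuriously correlated (masked) feature incurs a strictly larger loss, which is the interaction signal this line of work exploits.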
- Mosaic makes it easier for ML practitioners to interact with high-dimensional, multi-modal data. It provides simple abstractions for data inspection, model evaluation, and model training, supported by efficient and robust I/O under the hood. Mosaic's core contribution is the DataPanel, a simple columnar data abstraction whose columns can be of arbitrary type, from integers and strings to complex, high-dimensional objects like videos, images, medical volumes, and graphs.
- Introducing Mosaic (blog post)
- Working with Images in Mosaic (Google Colab)
- Working with Medical Images in Mosaic (Google Colab)
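To make the columnar idea concrete, here is a toy container in the spirit of the DataPanel: heterogeneous columns aligned by row index, with expensive objects (images) materialized lazily on access. This is a sketch of the concept only; the class and method names below are invented for illustration and are not Mosaic's actual API.

```python
import numpy as np

class LazyColumn:
    """A column that materializes expensive objects (e.g. images) on access."""
    def __init__(self, loaders):
        self.loaders = list(loaders)   # one zero-arg callable per row
    def __len__(self):
        return len(self.loaders)
    def __getitem__(self, i):
        return self.loaders[i]()       # load only when asked

class MiniDataPanel:
    """Toy columnar container: each column may hold a different type,
    and rows are aligned by index across columns."""
    def __init__(self, columns):
        lengths = {len(c) for c in columns.values()}
        assert len(lengths) == 1, "all columns must have the same length"
        self.columns = columns
    def __getitem__(self, key):
        if isinstance(key, str):       # string key -> whole column
            return self.columns[key]
        # integer key -> one row, pulled from every column
        return {name: col[key] for name, col in self.columns.items()}

dp = MiniDataPanel({
    "id": [0, 1],
    "label": ["cat", "dog"],
    "image": LazyColumn([lambda: np.zeros((8, 8)), lambda: np.ones((8, 8))]),
})
```

Row access (`dp[1]`) touches the image loader only for that row, which is what makes this layout workable for datasets of large medical volumes or videos.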
With the ability to train models on unlabelled data through self-supervision, the focus shifted to scaling models up and training on more data.
- GPT-3, developed by OpenAI, was the first 175B-parameter model capable of few-shot in-context learning.
- Moore's Law for Everything is a post about scale and its effect on AI / society.
- Switch Transformers use a mixture-of-experts architecture to train massive models beyond the scale of GPT-3.
The way experts interact with their data (e.g. a radiologist's eye movements) contains rich information about the task (e.g. classification difficulty) and the expert (e.g. drowsiness level). With the current trend in wearable technology (e.g. AR with eye-tracking capability), the hardware needed to collect such human-data interactions is expected to become more ubiquitous, affordable, and standardized. In observational supervision, we investigate how to extract the rich information embedded in human-data interaction, either to supervise models from scratch or to improve model robustness.
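As a toy illustration of both directions, gaze logs can be turned into (noisy) training signal. The heuristics below are hypothetical examples invented here, not the methods of the papers cited later: one converts per-image dwell time into a weak binary label, the other normalizes a fixation-count heatmap into a spatial prior that could, for instance, reweight model attention.

```python
import numpy as np

def gaze_weak_labels(dwell_times, threshold):
    """Hypothetical heuristic: cases an expert dwells on for longer than
    `threshold` seconds are likelier to be abnormal/difficult. Returns
    noisy {0, 1} labels usable as weak supervision."""
    return (np.asarray(dwell_times, dtype=float) > threshold).astype(int)

def gaze_prior(fixation_heatmap):
    """Normalize a fixation-count heatmap into a spatial prior that sums
    to ~1, e.g. to softly indicate where the expert looked."""
    h = np.asarray(fixation_heatmap, dtype=float)
    return h / (h.sum() + 1e-9)
```

Weak labels like these are typically denoised downstream (e.g. with a label model) rather than used raw; the point is only that the interaction trace carries supervision for free.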
Notable works have collected observational signals such as:
- Eye tracking data in medicine (chest x-ray dataset)
- Eye tracking plus brain activity in NLP (Zuco dataset)
- We have also collaborated with Stanford radiologists to curate two additional medical datasets with eye tracking data [coming soon!].
Critical papers in observational supervision:
- Some of the pioneering work on using gaze data: N. Hollenstein and C. Zhang showed how gaze data can improve NLP models paper.
- Improving zero-shot learning with gaze by N. Karessli et al. paper
- Improving sample complexity with gaze by K. Saab et al. paper
- Our recent work on supervising medical models from scratch [coming soon!]