Visual Dialog requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content. Specifically, given an image, a dialog history, and a follow-up question about the image, the task is to answer the question. [ref: https://visualdialog.org/]

afarmer2005/Visual-Dialog-VisDial-

 
 

About me

I am 虚步, a Ph.D. student at BUPT, and I am very fortunate to work under the guidance of my advisor. My research is in the area of Vision, Language, and Reasoning, with a focus on Visual Dialogue. I am particularly interested in building a visually grounded conversational AI (a social robot) that can see the world and talk with us in natural language. Other interests include Visual/Language Grounding, Visual Reasoning, Visual Question Generation, and Visually Grounded Referring Expressions.
I am currently working on the GuessWhich, Visual Dialog (VisDial), and Talking-to-Videos (Video-Grounded Dialogue) tasks. If you have any questions or concerns, please feel free to contact me at pangweitf@bupt.edu.cn or pangweitf@163.com.

Visual Dialog (VisDial) task

Visual Dialog requires an AI agent to converse with humans in natural language about visual content. Specifically, given an image, a dialog history, and a follow-up question about the image, the agent must answer the question in free-form natural language.
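
To make the task inputs concrete, here is a minimal Python sketch of one query. It is only an illustration under stated assumptions: the names `DialogExample` and `build_context`, the field names, and the flattened prompt layout are hypothetical, and they do not reflect the official VisDial data format or any model in this repository.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DialogExample:
    """One Visual Dialog query (hypothetical structure, not the official schema)."""
    image_path: str   # hypothetical path to the image the dialog is about
    caption: str      # image caption that opens the dialog
    history: List[Tuple[str, str]] = field(default_factory=list)  # past (question, answer) rounds
    question: str = ""  # follow-up question the agent must answer


def build_context(example: DialogExample) -> str:
    """Flatten the caption, past rounds, and the new question into one text prompt."""
    lines = [f"Caption: {example.caption}"]
    for past_question, past_answer in example.history:
        lines.append(f"Q: {past_question}")
        lines.append(f"A: {past_answer}")
    # The final answer slot is left empty for the agent to fill in.
    lines.append(f"Q: {example.question}")
    lines.append("A:")
    return "\n".join(lines)


if __name__ == "__main__":
    example = DialogExample(
        image_path="img.jpg",
        caption="a cat on a sofa",
        history=[("what color is the cat?", "black")],
        question="is it sleeping?",
    )
    print(build_context(example))
```

An agent would consume the image together with this text context and produce the answer to the last question. How the image is encoded is left out of the sketch on purpose.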

Paper

Performance

Training

References

  1. https://visualdialog.org/
  2. Abhishek Das, Satwik Kottur, Khushi Gupta, Avi Singh, Deshraj Yadav, José M.F. Moura, Devi Parikh, Dhruv Batra. Visual Dialog. In CVPR 2017.
  3. Abhishek Das, Satwik Kottur, Khushi Gupta, Avi Singh, Deshraj Yadav, José M.F. Moura, Devi Parikh, Dhruv Batra. Visual Dialog: Supplementary Document. In CVPR 2017.
  4. ...
