A set of analogy tasks of the form A:B::C:D, intended as a benchmark for analogical reasoning and planning. Analogies are augmented with Penn Treebank part-of-speech tags and include both one-to-many and many-to-one relationships. The dataset contains 23,692 analogies in all.
The dataset was originally introduced in "Harvesting Common-sense Navigational Knowledge for Robotics from Uncurated Text Corpora", CoRL 2017.
The BYU Analogical Reasoning Dataset is provided in four separate formats, each in its own subdirectory. Each subdirectory contains the same 23,692 analogical queries, with the following distinctions:
text - A set of plain text files containing one analogical query per line.
text-reversed - Plain text files with word positions swapped (e.g. cereal:box::broom:closet ==> box:cereal::closet:broom)
python - A set of Python dictionaries containing each analogy subcorpus
python-reversed - Python dictionaries with word positions swapped (e.g. cereal:box::broom:closet ==> box:cereal::closet:broom)
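The word swap used by the reversed formats can be sketched in a few lines of Python. This is only an illustration: it assumes each query is a bare A:B::C:D string with no colons inside the words, which may not match the exact layout of the distributed files (for example, if part-of-speech tags are stored on the same line). The function names parse_analogy and reverse_analogy are hypothetical and not part of the dataset.

```python
def parse_analogy(line):
    """Split an 'A:B::C:D' string into ((A, B), (C, D)).

    Assumes exactly one '::' separator and one ':' inside each half.
    """
    left, right = line.strip().split("::")
    a, b = (word.strip() for word in left.split(":"))
    c, d = (word.strip() for word in right.split(":"))
    return (a, b), (c, d)


def reverse_analogy(line):
    """Swap word positions within each pair: A:B::C:D -> B:A::D:C."""
    (a, b), (c, d) = parse_analogy(line)
    return f"{b}:{a}::{d}:{c}"


print(reverse_analogy("cereal:box::broom:closet"))
# box:cereal::closet:broom
```

Stripping whitespace around each word means a line such as "box:cereal:: closet:broom" parses to the same four words.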
The analogies are grouped into the following categories (number of analogies in parentheses):
accessing_containers (420)
affordance (2448)
belong (992)
causation (210)
containers (420)
locations for objects (2070)
rooms for containers (1406)
rooms for objects (1406)
tools (930)
trash or treasure (552)
travel (992)
11,846 analogies total
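The per-category counts can be tallied to check the stated total. A minimal sketch, with the counts copied from the list above; the dictionary name category_counts is only illustrative.

```python
# Counts copied from the category list; keys are the category names as listed.
category_counts = {
    "accessing_containers": 420,
    "affordance": 2448,
    "belong": 992,
    "causation": 210,
    "containers": 420,
    "locations for objects": 2070,
    "rooms for containers": 1406,
    "rooms for objects": 1406,
    "tools": 930,
    "trash or treasure": 552,
    "travel": 992,
}

total = sum(category_counts.values())
print(total)      # 11846
print(2 * total)  # 23692
```

The category sum matches the 11,846 total, and twice that value equals the 23,692 figure quoted for the dataset as a whole. The text does not state how the two figures are related.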