Source Data Explanation
Weikai Huang edited this page Jul 4, 2024 · 2 revisions
Task-Me-Anything comprises various types of source data, including:
- 2D images
- 3D assets
- Real images and videos with scene graphs
- Human annotations detailing angles, materials, colors, and shapes of 3D objects.
- A taxonomy that reflects the relationships between different concepts within the source data.
This document explains the source data annotations in the TaskMeAnything/annotations folder.
The following four files contain all the annotations for the source data:
- attribute_category.json
- cateid_to_concept.json
- cateid_to_objects.json
- taxonomy.json
attribute_category.json contains human-annotated classifications of all attributes in the Scene Graphs, grouping them into categories such as “color” and “size” so that more detailed Scene Graph questions can be generated.
cateid_to_concept.json, cateid_to_objects.json, and taxonomy.json contain the human-annotated knowledge graph (taxonomy) and the annotations of angles, materials, colors, and shapes of the 3D assets in Task-Me-Anything.
- cateid_to_concept.json:
  - Collects all concepts from the Scene Graphs and 3D assets (e.g. apple, glass, dog) and normalizes them to their corresponding Wikidata pages.
  - For example, “eyeglass”, “eyeglasses”, “glasses”, and “spectacles” in the 3D assets, and “eye glasses” and “glasses” in the Scene Graphs, are normalized to the QID Q37501, which corresponds to a concept page on Wikidata: https://www.wikidata.org/wiki/Q37501.

        "Q37501": {
            "surface_name": [ "eyeglasses", "glasses", "spectacles" ],
            "wikipedia": "Glasses",
            "wikidata": "glasses",
            "wikidata_description": "accessories that improve human vision",
            "objaverse": [ "eyeglass", "eyeglasses", "glasses", "spectacles" ],
            "scene_graph": [ "eye glasses", "glasses" ]
        },

  - surface_name is the normalized name used to generate questions (e.g. the concept named orange_(fruit) in the 3D assets is normalized to orange as its surface_name, for better readability in question generation).
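The name normalization described for cateid_to_concept.json can be sketched as a reverse lookup from raw source names to QIDs. This is a minimal illustration, not the project's own loader: the entry is the Q37501 example from this page, the commented file path is an assumption based on the folder name above, and `build_name_index` is a hypothetical helper.

```python
import json

# Example entry copied from this page. In practice you would load the whole
# file instead, e.g. (path is an assumption):
#   with open("TaskMeAnything/annotations/cateid_to_concept.json") as f:
#       concepts = json.load(f)
concepts = json.loads("""
{
  "Q37501": {
    "surface_name": ["eyeglasses", "glasses", "spectacles"],
    "wikipedia": "Glasses",
    "wikidata": "glasses",
    "wikidata_description": "accessories that improve human vision",
    "objaverse": ["eyeglass", "eyeglasses", "glasses", "spectacles"],
    "scene_graph": ["eye glasses", "glasses"]
  }
}
""")


def build_name_index(concepts):
    """Map each raw name (3D asset or Scene Graph) to the QIDs it was normalized to."""
    index = {}
    for qid, entry in concepts.items():
        for source in ("objaverse", "scene_graph"):
            for raw_name in entry.get(source, []):
                index.setdefault(raw_name, set()).add(qid)
    return index


name_to_qids = build_name_index(concepts)
print(name_to_qids["eye glasses"])         # {'Q37501'}
print(concepts["Q37501"]["surface_name"])  # names used when generating questions
```

A set is used for the values because nothing on this page rules out one raw name appearing under more than one QID.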
- cateid_to_objects.json:
  - Contains the annotations of angles, materials, colors, and shapes of the 3D assets under each QID.
  - For example, QID Q37501 contains the 3D asset a709eff74e544fd6b9390bb2bae0f77e. images lists its visible perspectives in 2D sticker image scenarios, attributes contains the color, material, and shape of the 3D asset, and angles contains the visible angles of the 3D asset in 3D scenarios.

        "a709eff74e544fd6b9390bb2bae0f77e": {
            "images": [ "000.png", "001.png", "002.png", "003.png", "004.png", "005.png", "006.png", "007.png", "008.png" ],
            "attributes": {
                "color": [ "blue" ],
                "material": [],
                "shape": []
            },
            "angles": [ 0, 120, 240 ]
        },
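As a rough sketch of how the per-asset annotations in cateid_to_objects.json might be queried, the snippet below filters the assets of one QID by an attribute value. It assumes the file is a mapping from QID to asset ID to annotation, which is inferred from the Q37501 example on this page rather than confirmed; `assets_with_attribute` is a hypothetical helper.

```python
import json

# Example asset copied from this page, nested under its QID.
# The QID -> asset ID -> annotation nesting is an assumption.
objects = json.loads("""
{
  "Q37501": {
    "a709eff74e544fd6b9390bb2bae0f77e": {
      "images": ["000.png", "001.png", "002.png", "003.png", "004.png",
                 "005.png", "006.png", "007.png", "008.png"],
      "attributes": {"color": ["blue"], "material": [], "shape": []},
      "angles": [0, 120, 240]
    }
  }
}
""")


def assets_with_attribute(objects, qid, attribute, value):
    """Return the asset IDs under `qid` whose `attribute` list contains `value`."""
    return [
        asset_id
        for asset_id, annotation in objects.get(qid, {}).items()
        if value in annotation.get("attributes", {}).get(attribute, [])
    ]


blue_glasses = assets_with_attribute(objects, "Q37501", "color", "blue")
print(blue_glasses)  # ['a709eff74e544fd6b9390bb2bae0f77e']

# The visible angles and 2D views of a matching asset can then be read directly.
asset = objects["Q37501"][blue_glasses[0]]
print(asset["angles"], len(asset["images"]))  # [0, 120, 240] 9
```

Empty lists such as "material": [] simply never match, so assets without an annotation for that attribute are skipped.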
- taxonomy.json:
  - Uses the concept network in Wikidata to build a concept graph (taxonomy) over all concepts (QIDs) in Task-Me-Anything.
  - Includes information such as “glasses (Q37501) is a subclass of optical instrument (Q1751850)”.
  - In taxonomy.json, an edge such as [ "Q11422", "Q682582" ] means Q682582 is a subclass of Q11422.
  - The nodes listed below the edges are the concepts (QIDs) that do not appear in the Scene Graphs or 3D assets but help to build the taxonomy.
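The [parent, child] edge convention described for taxonomy.json can be illustrated by collecting every ancestor of a concept. This is a sketch under assumptions: the two edges come from the relations stated on this page, written in the same [parent, child] order, the exact layout of the real file is not shown here, and `build_parent_map` and `ancestors` are hypothetical helpers.

```python
from collections import defaultdict

# Edges in [parent, child] order, i.e. child is a subclass of parent.
edges = [
    ["Q11422", "Q682582"],   # edge quoted on this page
    ["Q1751850", "Q37501"],  # glasses is a subclass of optical instrument
]


def build_parent_map(edges):
    """Map each child QID to the set of its direct parent QIDs."""
    parents = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
    return parents


def ancestors(qid, parents):
    """Return every QID reachable from `qid` by following subclass-of edges upward."""
    seen = set()
    stack = [qid]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


parents = build_parent_map(edges)
print(ancestors("Q37501", parents))  # {'Q1751850'}
```

The traversal tracks visited nodes, so it still terminates if a concept has several parents or if the graph happens to contain a cycle.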