Dataset Card for GEM/dstc10_track2_task2

Link to Main Data Card

You can find the main data card on the GEM Website.

Dataset Summary

The DSTC10 Track2 Task 2 follows the DSTC9 Track1 task, where participants have to implement knowledge-grounded dialog systems. The training dataset is inherited from the DSTC9 challenge and is in the written domain, while the test set is newly collected and consists of noisy ASR transcripts. Hence, the dataset facilitates building models for grounded dialog response generation.

You can load the dataset via:

import datasets
data = datasets.load_dataset('GEM/dstc10_track2_task2')

The data loader can be found here.

website

https://github.com/alexa/alexa-with-dstc10-track2-dataset

paper

https://assets.amazon.science/54/a1/5282d47044179737b4289622c824/how-robust-are-you-evaluating-task-oriented-dialogue-systems-on-spoken-conversations.pdf

authors

Seokhwan Kim, Yang Liu, Di Jin, Alexandros Papangelis, Karthik Gopalakrishnan, Behnam Hedayatnia, Dilek Hakkani-Tur (Amazon Alexa AI)

Dataset Overview

Where to find the Data and its Documentation

BibTex

@inproceedings{kim2021robust,
  title={"How Robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken Conversations},
  author={Kim, Seokhwan and Liu, Yang and Jin, Di and Papangelis, Alexandros and Gopalakrishnan, Karthik and Hedayatnia, Behnam and Hakkani-Tur, Dilek},
  booktitle={IEEE Automatic Speech Recognition and Understanding Workshop},
  year={2021}
}

Contact Name

Seokhwan Kim

Contact Email

seokhwk@amazon.com

Has a Leaderboard?

yes

Leaderboard Link

https://eval.ai/challenge/1663/overview

Leaderboard Details

It evaluates the models based on the automatic metrics defined in the task paper for the three tasks of detection, selection and generation.

Languages and Intended Use

Multilingual?

Covered Languages

English

License

apache-2.0: Apache License 2.0

Intended Use

To conduct research on dialogue state tracking and knowledge-grounded response generation.

Primary Task

Dialog Response Generation

Communicative Goal

This dataset aims to explore the robustness of conversational models when they are evaluated on spoken data. It has two aspects: multi-domain dialogue state tracking and conversation modeling with access to unstructured knowledge.

Credit

Curation Organization Type(s)

industry

Curation Organization(s)

Amazon

Dataset Creators

Seokhwan Kim, Yang Liu, Di Jin, Alexandros Papangelis, Karthik Gopalakrishnan, Behnam Hedayatnia, Dilek Hakkani-Tur (Amazon Alexa AI)

Funding

Amazon

Who added the Dataset to GEM?

Alexandros Papangelis (Amazon Alexa AI), Di Jin (Amazon Alexa AI), Nico Daheim (RWTH Aachen University)

Dataset Structure

Data Fields

features = datasets.Features(
    {
        "id": datasets.Value("string"),
        "gem_id": datasets.Value("string"),
        "turns": [
            {
                "speaker": datasets.Value("string"),
                "text": datasets.Value("string"),
                "nbest": [
                    {
                        "hyp": datasets.Value("string"),
                        "score": datasets.Value("float"),
                    }
                ],
            }
        ],
        "knowledge": {
            "domain": datasets.Value("string"),
            "entity_name": datasets.Value("string"),
            "title": datasets.Value("string"),
            "body": datasets.Value("string"),
        },
        "response": datasets.Value("string"),
        "source": datasets.Value("string"),
        "linearized_input": datasets.Value("string"),
        "target": datasets.Value("string"),
        "references": [datasets.Value("string")],
    }
)

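nbest contains the n-best list of hypotheses (hyp) generated by an ASR system for a turn, along with their scores (score). In the example instance below, system turns carry an empty nbest list.

knowledge holds the annotated grounding snippet (title and body) together with its metadata (domain and entity_name).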
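As a rough illustration of how the nbest field might be consumed, the sketch below picks the highest-scoring hypothesis for each turn and falls back to text when the list is empty. The helper names best_hypothesis and top1_history are hypothetical and not part of the dataset or its loader. Treating a higher score as a better hypothesis is an assumption based on the example instance, where the first (least negative) hypothesis matches the turn text. The sample is an abbreviated copy of the first two turns of that instance.

```python
from typing import Any, Dict, List


def best_hypothesis(turn: Dict[str, Any]) -> str:
    """Return the top-scoring ASR hypothesis, or the turn text if nbest is empty.

    Assumption: a higher (less negative) score means a more likely hypothesis.
    """
    nbest = turn.get("nbest") or []
    if not nbest:
        return turn["text"]
    return max(nbest, key=lambda h: h["score"])["hyp"]


def top1_history(instance: Dict[str, Any]) -> List[str]:
    """Build a speaker-prefixed dialogue history from the top-1 hypotheses."""
    return [f'{t["speaker"]}: {best_hypothesis(t)}' for t in instance["turns"]]


# Abbreviated from the example instance: two turns, two hypotheses kept.
sample = {
    "turns": [
        {
            "speaker": "U",
            "text": "hi uh i'm looking for restaurant in lower ha",
            "nbest": [
                {"hyp": "hi uh i'm looking for restaurant in lower ha",
                 "score": -25.625450134277344},
                {"hyp": "hi uh i'm looking for restaurant in lower hai",
                 "score": -25.969446182250977},
            ],
        },
        {
            "speaker": "S",
            "text": "yeah definitely i can go ahead and help you with that "
                    "ummm what kind of option in a restaurant are you looking for",
            "nbest": [],
        },
    ],
}

history = top1_history(sample)
for line in history:
    print(line)
```

Keeping the fallback to text means the same function works for system turns, which have no ASR hypotheses in the example instance.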

Reason for Structure

It was kept compatible with MultiWOZ 2.X data.

Example Instance

{'id': '0', 'gem_id': 'GEM-dstc10_track2_task2-test-0', 'turns': [{'speaker': 'U', 'text': "hi uh i'm looking for restaurant in lower ha", 'nbest': [{'hyp': "hi uh i'm looking for restaurant in lower ha", 'score': -25.625450134277344}, {'hyp': "hi uh i'm looking for restaurant in lower hai", 'score': -25.969446182250977}, {'hyp': "hi uh i'm looking for restaurant in lower haig", 'score': -32.816890716552734}, {'hyp': "hi uh i'm looking for restaurant in lower haigh", 'score': -32.84316635131836}, {'hyp': "hi uh i'm looking for restaurant in lower hag", 'score': -32.8637580871582}, {'hyp': "hi uh i'm looking for restaurant in lower hah", 'score': -33.1048698425293}, {'hyp': "hi uh i'm looking for restaurant in lower hait", 'score': -33.96509552001953}, {'hyp': "hi um i'm looking for restaurant in lower hai", 'score': -33.97885513305664}, {'hyp': "hi um i'm looking for restaurant in lower haig", 'score': -34.56083679199219}, {'hyp': "hi um i'm looking for restaurant in lower haigh", 'score': -34.58711242675781}]}, {'speaker': 'S', 'text': 'yeah definitely i can go ahead and help you with that ummm what kind of option in a restaurant are you looking for', 'nbest': []}, {'speaker': 'U', 'text': 'yeah umm am looking for an expensive restaurant', 'nbest': [{'hyp': 'yeah umm am looking for an expensive restaurant', 'score': -21.272899627685547}, {'hyp': 'yeah umm m looking for an expensive restaurant', 'score': -21.444047927856445}, {'hyp': 'yeah umm a m looking for an expensive restaurant', 'score': -21.565458297729492}, {'hyp': 'yeah ummm am looking for an expensive restaurant', 'score': -21.68832778930664}, {'hyp': 'yeah ummm m looking for an expensive restaurant', 'score': -21.85947608947754}, {'hyp': 'yeah ummm a m looking for an expensive restaurant', 'score': -21.980886459350586}, {'hyp': "yeah umm a'm looking for an expensive restaurant", 'score': -22.613924026489258}, {'hyp': "yeah ummm a'm looking for an expensive restaurant", 'score': -23.02935218811035}, 
{'hyp': 'yeah um am looking for an expensive restaurant', 'score': -23.11180305480957}, {'hyp': 'yeah um m looking for an expensive restaurant', 'score': -23.28295135498047}]}, {'speaker': 'S', 'text': "lemme go ahead and see what i can find for you ok great so i do ummm actually no i'm sorry is there something else i can help you find i don't see anything expensive", 'nbest': []}, {'speaker': 'U', 'text': "sure ummm maybe if you don't have anything expensive how about something in the moderate price range", 'nbest': [{'hyp': "sure ummm maybe if you don't have anything expensive how about something in the moderate price range", 'score': -27.492507934570312}, {'hyp': "sure umm maybe if you don't have anything expensive how about something in the moderate price range", 'score': -27.75853729248047}, {'hyp': "sure ummm maybe if you don't have anything expensive how about something in the moderate price rang", 'score': -29.44410514831543}, {'hyp': "sure umm maybe if you don't have anything expensive how about something in the moderate price rang", 'score': -29.710134506225586}, {'hyp': "sure um maybe if you don't have anything expensive how about something in the moderate price range", 'score': -31.136560440063477}, {'hyp': "sure um maybe if you don't have anything expensive how about something in the moderate price rang", 'score': -33.088157653808594}, {'hyp': "sure ummm maybe i you don't have anything expensive how about something in the moderate price range", 'score': -36.127620697021484}, {'hyp': "sure umm maybe i you don't have anything expensive how about something in the moderate price range", 'score': -36.39365005493164}, {'hyp': "sure ummm maybe if yo don't have anything expensive how about something in the moderate price range", 'score': -36.43605041503906}, {'hyp': "sure umm maybe if yo don't have anything expensive how about something in the moderate price range", 'score': -36.70207977294922}]}, {'speaker': 'S', 'text': 'ok moderate lemme go ahead and check 
to see what i can find for moderate ok great i do have several options coming up how does the view lounge sound', 'nbest': []}, {'speaker': 'U', 'text': 'that sounds good ummm do they have any sort of happy hour special', 'nbest': [{'hyp': 'that sounds good ummm do they have any sort of happy hour special', 'score': -30.316478729248047}, {'hyp': 'that sounds good umm do they have any sort of happy hour special', 'score': -30.958009719848633}, {'hyp': 'that sounds good um do they have any sort of happy hour special', 'score': -34.463165283203125}, {'hyp': 'that sounds good ummm do they have any sirt of happy hour special', 'score': -34.48350143432617}, {'hyp': 'that sounds good umm do they have any sirt of happy hour special', 'score': -35.12503433227539}, {'hyp': 'that sounds good ummm do they have any sord of happy hour special', 'score': -35.61939239501953}, {'hyp': 'that sounds good umm do they have any sord of happy hour special', 'score': -36.26092529296875}, {'hyp': 'that sounds good ummm do they have any sont of happy hour special', 'score': -37.697105407714844}, {'hyp': 'that sounds good umm do they have any sont of happy hour special', 'score': -38.33863830566406}, {'hyp': 'that sounds good um do they have any sirt of happy hour special', 'score': -38.630191802978516}]}], 'knowledge': {'domain': 'restaurant', 'entity_name': 'The View Lounge', 'title': 'Does The View Lounge offer happy hour?', 'body': 'The View Lounge offers happy hour.'}, 'response': 'uhhh great question lemme go ahead and check that out for you ok fantastic so it looks like they do offer happy hour', 'source': 'sf_spoken', 'linearized_input': "