A new dataset from Stanford University is designed to train artificial intelligence (AI) systems to answer questions more reliably by recognizing when they lack enough information to answer accurately.
The update to the Stanford Question Answering Dataset (SQuAD 2.0) upgrades a dataset companies often use to tout the question-answering precision of their language-understanding AI systems.
Earlier datasets gave the algorithm a paragraph of text and then asked it questions about that text. Those datasets usually assumed the answer existed in the text, but systems trained on SQuAD 2.0 must first decide whether a question is answerable at all and, if it is, answer it correctly.
SQuAD 2.0 has about 50,000 unanswerable questions that loosely relate to the subject matter of the reference text.
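The answer-or-abstain task described above can be sketched roughly as follows. This is a minimal illustration, not the Stanford team's code: the two records are invented for the example and only loosely follow the published SQuAD 2.0 JSON layout (a context, a question, a list of answers, and an is_impossible flag), and the helper names answer_or_abstain and exact_match, along with the score threshold, are hypothetical.

```python
from typing import Optional

# Invented reference text; not taken from the real dataset.
CONTEXT = "The river Ost flows north through the town of Vale."

# Hypothetical answerable record, loosely following the SQuAD 2.0 layout.
ANSWERABLE = {
    "context": CONTEXT,
    "question": "In which direction does the river Ost flow?",
    "answers": [{"text": "north", "answer_start": CONTEXT.index("north")}],
    "is_impossible": False,
}

# Hypothetical unanswerable record: related to the text, but the text
# never mentions a city called Marn, so no answer span exists.
UNANSWERABLE = {
    "context": CONTEXT,
    "question": "In which direction does the river Ost flow through Marn?",
    "answers": [],
    "is_impossible": True,
}


def answer_or_abstain(
    candidate: str,
    answer_score: float,
    no_answer_score: float,
    threshold: float = 0.0,
) -> Optional[str]:
    """Return the candidate answer, or None to abstain.

    Assumes a model that produces one score for its best answer span and
    one score for "no answer"; the system abstains when the no-answer
    score beats the answer score by more than the threshold.
    """
    if no_answer_score - answer_score > threshold:
        return None
    return candidate


def exact_match(prediction: Optional[str], record: dict) -> bool:
    """Score a prediction: abstaining is correct only on unanswerable records."""
    if record["is_impossible"]:
        return prediction is None
    if prediction is None:
        return False
    golds = {a["text"].strip().lower() for a in record["answers"]}
    return prediction.strip().lower() in golds


if __name__ == "__main__":
    # The scores below are made up to exercise both branches.
    confident = answer_or_abstain("north", answer_score=2.0, no_answer_score=0.5)
    unsure = answer_or_abstain("north", answer_score=0.5, no_answer_score=2.0)
    print(exact_match(confident, ANSWERABLE))
    print(exact_match(unsure, UNANSWERABLE))
```

Under this scoring, a system that always produces an answer is marked wrong on every unanswerable question, which is why abstention has to be learned rather than ignored.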
The first attempt to train question-answering systems on the dataset yielded 66% accuracy.
The publication of SQuAD 2.0 will let other researchers train algorithms with better question-answering capability.
Abstracts Copyright © 2018 Information Inc., Bethesda, Maryland, USA