In recent years researchers have achieved considerable success applying neural network methods to question answering (QA). These approaches have achieved state-of-the-art results in simplified closed-domain settings such as the SQuAD dataset (Rajpurkar et al. 2016), which provides a pre-selected passage from which the answer to a given question may be extracted. More recently, researchers have begun to tackle open-domain QA, in which the model is given a question and access to a large corpus (e.g., Wikipedia) instead of a pre-selected passage (Chen et al. 2017a). This setting is more complex as it requires large-scale search for relevant passages by an information retrieval component, combined with a reading comprehension model that "reads" the passages to generate an answer to the question. Performance in this setting lags well behind closed-domain performance. In this paper, we present a novel open-domain QA system called Reinforced Ranker-Reader (R³), based on two algorithmic innovations. First, we propose a new pipeline for open-domain QA with a Ranker component, which learns to rank retrieved passages in terms of the likelihood of extracting the ground-truth answer to a given question. Second, we propose a novel method that jointly trains the Ranker along with an answer-extraction Reader model, based on reinforcement learning. We report extensive experimental results showing that our method significantly improves on the state of the art for multiple open-domain QA datasets.
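As a reading aid, here is a minimal sketch of the policy-gradient idea behind jointly training a Ranker from a Reader-derived reward: the Ranker samples a passage, the Reader's answer quality on that passage (e.g., token-level F1 against the ground truth) supplies a scalar reward, and a REINFORCE-style update pushes the Ranker toward passages that let the Reader answer correctly. The function names, the bandit-style setup, and the toy reward are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def reinforce_ranker_step(ranker_scores, reward_fn, lr=0.1):
    """One REINFORCE-style update for a passage ranker (illustration only).

    ranker_scores: unnormalized scores the ranker assigns to each passage.
    reward_fn:     maps a sampled passage index to a scalar reward, e.g. the
                   F1 of the answer the reader extracts from that passage.
    """
    probs = softmax(ranker_scores)
    k = np.random.choice(len(probs), p=probs)   # sample a passage to read
    reward = reward_fn(k)                        # reader-derived reward
    grad = -probs                                # d log pi(k) / d scores = one_hot(k) - probs
    grad[k] += 1.0
    return ranker_scores + lr * reward * grad

# Toy usage: passage 2 is the only one from which the reader answers correctly.
scores = np.zeros(4)
for _ in range(200):
    scores = reinforce_ranker_step(scores, lambda k: 1.0 if k == 2 else 0.0)
print(np.argmax(scores))  # -> 2
```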
Recently, open-domain question answering (QA) has been combined with machine comprehension models to find answers in a large knowledge source. Since open-domain QA requires retrieving relevant documents from a text corpus to answer questions, its performance largely depends on the performance of the document retriever. However, traditional information retrieval systems are not effective at obtaining documents that are likely to contain the answer, which lowers the performance of the QA system. Simply retrieving more documents increases the number of irrelevant documents, which also degrades QA performance. In this paper, we introduce Paragraph Ranker, which ranks the paragraphs of retrieved documents for higher answer recall with less noise. We show that ranking paragraphs and aggregating answers with Paragraph Ranker improves the average performance of an open-domain QA pipeline on four open-domain QA datasets by 7.8%.
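The answer recall that the passage above optimizes can be pictured as a simple top-k containment check; the string-matching recall below is an assumption for illustration, not Paragraph Ranker's actual model or metric.

```python
def answer_recall_at_k(ranked_paragraphs, gold_answers, k):
    """1.0 if any of the top-k ranked paragraphs contains a gold answer string, else 0.0."""
    top = ranked_paragraphs[:k]
    return float(any(a.lower() in p.lower() for p in top for a in gold_answers))

# Toy usage: the paragraph with the answer is ranked second, so recall@1 = 0 but recall@2 = 1.
ranked = ["The Eiffel Tower opened in 1889.", "Paris is the capital of France."]
print(answer_recall_at_k(ranked, ["France"], k=1))  # 0.0
print(answer_recall_at_k(ranked, ["France"], k=2))  # 1.0
```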
This study considers the task of machine reading at scale (MRS), in which, given a question, a system first performs an information retrieval (IR) task to find relevant passages in a knowledge source and then performs a reading comprehension (RC) task to extract an answer span from those passages. Previous MRS studies, in which the IR component was trained without considering answer spans, struggled to accurately find a small number of relevant passages within a large collection. In this paper, we propose a simple and effective approach that combines the IR and RC tasks through supervised multi-task learning, so that the IR component is trained with answer spans taken into account. Experimental results on a standard benchmark, answering SQuAD questions using the full Wikipedia as the knowledge source, show that our model achieves state-of-the-art performance. In addition, we thoroughly evaluate the individual contributions of our model's components using a new Japanese dataset and SQuAD. The results show significant improvements on the IR task and provide a new perspective on IR for RC: teaching which part of a passage answers the question, rather than giving only the relevance of the whole passage, is effective.
This paper proposes to tackle open-domain question answering using Wikipedia as the unique knowledge source: the answer to any factoid question is a text span in a Wikipedia article. This task of machine reading at scale combines the challenges of document retrieval (finding the relevant articles) with that of machine comprehension of text (identifying the answer spans from those articles). Our approach combines a search component based on bigram hashing and TF-IDF matching with a multi-layer recurrent neural network model trained to detect answers in Wikipedia paragraphs. Our experiments on multiple existing QA datasets indicate that (1) both modules are highly competitive with respect to existing counterparts and (2) multitask learning using distant supervision on their combination is an effective complete system on this challenging task.
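A rough approximation of a hashed-bigram TF-IDF retriever can be assembled with scikit-learn; the hyperparameters (2**20 hashed features) and the toy documents are placeholders, and this is only a sketch in the spirit of the search component described above, not its actual implementation.

```python
from sklearn.feature_extraction.text import HashingVectorizer, TfidfTransformer

# Hashed unigram+bigram counts, re-weighted by TF-IDF.
hasher = HashingVectorizer(ngram_range=(1, 2), n_features=2**20,
                           alternate_sign=False, norm=None)
tfidf = TfidfTransformer()

docs = ["Paris is the capital of France.",
        "The mitochondrion is the powerhouse of the cell."]
doc_vecs = tfidf.fit_transform(hasher.transform(docs))

query_vec = tfidf.transform(hasher.transform(["capital of France"]))
scores = (doc_vecs @ query_vec.T).toarray().ravel()
print(scores.argmax())  # -> 0, the Paris document
```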
This paper aims at improving how machines can answer questions directly from text, with the focus of having models that can answer correctly multiple types of questions and from various types of texts, documents or even from large collections of them. To that end, we introduce the Weaver model that uses a new way to relate a question to a textual context by weaving layers of recurrent networks, with the goal of making as few assumptions as possible as to how the information from both question and context should be combined to form the answer. We show empirically on six datasets that Weaver performs well in multiple conditions. For instance, it produces solid results on the very popular SQuAD dataset (Rajpurkar et al., 2016), solves almost all bAbI tasks (Weston et al., 2015) and greatly outperforms state-of-the-art methods for open domain question answering from text (Chen et al., 2017).
We consider the problem of adapting neural paragraph-level question answering models to the case where entire documents are given as input. Our proposed solution trains models to produce well calibrated confidence scores for their results on individual paragraphs. We sample multiple paragraphs from the documents during training, and use a shared-normalization training objective that encourages the model to produce globally correct output. We combine this method with a state-of-the-art pipeline for training models on document QA data. Experiments demonstrate strong performance on several document QA datasets. Overall, we are able to achieve a score of 71.3 F1 on the web portion of TriviaQA, a large improvement from the 56.7 F1 of the previous best system.
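The shared-normalization objective can be read as follows: the softmax over candidate answer spans is taken across all sampled paragraphs of a document at once, so span scores stay comparable between paragraphs. Below is a small numerical sketch of that reading; the toy scores and the (paragraph, span) indexing are illustrative assumptions.

```python
import numpy as np

def shared_norm_log_prob(span_scores_per_paragraph, gold):
    """Log-probability of the gold span under a softmax shared across paragraphs.

    span_scores_per_paragraph: list of 1-D arrays of span scores, one per paragraph.
    gold: (paragraph_index, span_index) of the correct answer span.
    """
    all_scores = np.concatenate(span_scores_per_paragraph)
    m = all_scores.max()
    log_z = m + np.log(np.exp(all_scores - m).sum())   # shared normalizer
    p_idx, s_idx = gold
    return span_scores_per_paragraph[p_idx][s_idx] - log_z

# Toy usage: two sampled paragraphs; the gold span is the first span of paragraph 1.
scores = [np.array([0.2, -1.0]), np.array([2.5, 0.0])]
print(shared_norm_log_prob(scores, gold=(1, 0)))
```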
This paper is concerned with open-domain question answering (OpenQA). Recently, some work has treated this problem as a reading comprehension (RC) task and directly applied successful RC models to it. However, these models do not perform as well as they do on the RC task. In our view, the RC perspective ignores three characteristics of the OpenQA task: 1) many paragraphs that contain no answer span are included in the data; 2) multiple answer spans may exist within a given paragraph; and 3) the end position of an answer span depends on its start position. In this paper, we first propose a new probabilistic formulation of OpenQA based on a three-level hierarchy: the question level, the paragraph level and the answer-span level. A Hierarchical Answer Spans model (HAS-QA) is then designed to capture each of these probabilities. HAS-QA can address the three issues above, and experiments on public OpenQA datasets show that it significantly outperforms both traditional RC baselines and recent OpenQA baselines.
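Read literally, the three-level formulation suggests a factorization along these lines (a hedged reconstruction of the idea in my own notation, not the paper's exact equations):

```latex
P(a \mid q) = \sum_{p \in \mathcal{P}(q)} P(p \mid q)\, P(a \mid p, q),
\qquad
P(a \mid p, q) = \sum_{s \in \mathcal{S}(a, p)} P(s_{\text{start}} \mid p, q)\, P(s_{\text{end}} \mid s_{\text{start}}, p, q)
```

where the outer sum runs over the retrieved paragraphs for question q and the inner sum runs over the spans of paragraph p that match answer a; paragraphs with no matching span then contribute nothing (point 1), multiple matching spans are summed over (point 2), and the end position is conditioned on the start position (point 3).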
We present a framework for question answering that can efficiently scale to longer documents while maintaining or even improving performance of state-of-the-art models. While most successful approaches for reading comprehension rely on recurrent neural networks (RNNs), running them over long documents is prohibitively slow because it is difficult to parallelize over sequences. Inspired by how people first skim the document, identify relevant parts, and carefully read these parts to produce an answer, we combine a coarse, fast model for selecting relevant sentences and a more expensive RNN for producing the answer from those sentences. We treat sentence selection as a latent variable trained jointly from the answer only using reinforcement learning. Experiments demonstrate state-of-the-art performance on a challenging subset of the WIKIREADING dataset (Hewlett et al., 2016) and on a new dataset, while speeding up the model by 3.5x-6.7x.
Open-domain question answering (QA) is an important problem in AI and NLP, and it is emerging as a bellwether for progress on the generalizability of AI methods and techniques. Much of the progress in open-domain QA systems has been achieved through advances in information retrieval methods and corpus construction. In this paper, we focus on the recently introduced ARC Challenge dataset, which contains 2,590 multiple-choice questions authored for grade-school science exams. These questions were selected to be the most challenging for current QA systems, and current state-of-the-art performance is only slightly better than random chance. We present a system that rewrites a given question into queries used to retrieve supporting text from a large corpus of science-related text. Our rewriter is able to incorporate background knowledge from ConceptNet and, in combination with a generic textual entailment system trained on SciTail that identifies support in the retrieved results, outperforms several strong baselines on the end-to-end QA task despite only being trained to identify essential terms in the original question. We use a generalizable decision method over the retrieved evidence and answer candidates to select the best answer. By combining query rewriting, background knowledge and textual entailment, our system is able to outperform several strong baselines on the ARC dataset.
We propose a simple yet robust stochastic answer network (SAN) that simulates multi-step reasoning in machine reading comprehension. Compared to previous work such as ReasoNet which used reinforcement learning to determine the number of steps, the unique feature is the use of a kind of stochastic prediction dropout on the answer module (final layer) of the neural network during training. We show that this simple trick improves robustness and achieves results competitive to the state-of-the-art on the Stanford Question Answering Dataset (SQuAD), the Adversarial SQuAD, and the Microsoft MAchine Reading COmprehension Dataset (MS MARCO).
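The stochastic prediction dropout can be pictured as follows: the multi-step answer module emits one answer distribution per reasoning step; during training, whole per-step predictions are randomly dropped before averaging, while at test time all steps are averaged. The sketch below shows that averaging rule only; the drop rate and shapes are illustrative assumptions.

```python
import numpy as np

def stochastic_prediction_average(step_predictions, train=True, drop_rate=0.4, rng=None):
    """Average per-step answer distributions, randomly dropping whole steps in training."""
    rng = rng or np.random.default_rng()
    preds = np.asarray(step_predictions)          # shape: (num_steps, num_candidates)
    if train:
        keep = rng.random(len(preds)) >= drop_rate
        if not keep.any():                        # always keep at least one step
            keep[rng.integers(len(preds))] = True
        preds = preds[keep]
    return preds.mean(axis=0)
```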
Open-domain question answering remains a challenging task, as it requires models that can understand questions and answers, gather useful information and reason over evidence. Previous work typically formulates this task as reading comprehension or entailment over evidence retrieved from a search engine. However, existing techniques struggle to find indirectly related evidence when no directly related evidence is provided, especially for complex questions whose requirements are hard to parse accurately. In this paper, we propose a retriever-reader model that learns to attend to essential terms during the question answering process. We build (1) an essential-term selector, which first identifies the most important words in a question and then reformulates the query and searches for related evidence; and (2) an enhanced reader that distinguishes between essential terms and distracting words to predict the answer. We evaluate our model on multiple open-domain QA datasets, where it outperforms the existing state of the art, notably with a relative improvement of 8.1% on the AI2 Reasoning Challenge (ARC) dataset.
In this paper, we present the gated self-matching networks for reading comprehension style question answering, which aims to answer questions from a given passage. We first match the question and passage with gated attention-based recurrent networks to obtain the question-aware passage representation. Then we propose a self-matching attention mechanism to refine the representation by matching the passage against itself, which effectively encodes information from the whole passage. We finally employ the pointer networks to locate the positions of answers from the passages. We conduct extensive experiments on the SQuAD dataset. The single model achieves 71.3% on the evaluation metrics of exact match on the hidden test set, while the ensemble model further boosts the results to 75.9%. At the time of submission of the paper, our model holds the first place on the SQuAD leaderboard for both single and ensemble model.
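Decoding the answer location from pointer-style start/end distributions is usually done by picking the span that maximizes the product of start and end probabilities under a start ≤ end (and often a maximum-length) constraint. The snippet below shows that common decoding rule; the length cap is an assumption, not a detail taken from the paper.

```python
import numpy as np

def best_span(p_start, p_end, max_len=15):
    """Return (start, end) maximizing p_start[i] * p_end[j] with i <= j < i + max_len."""
    best, best_score = (0, 0), -1.0
    for i, ps in enumerate(p_start):
        for j in range(i, min(i + max_len, len(p_end))):
            score = ps * p_end[j]
            if score > best_score:
                best, best_score = (i, j), score
    return best

print(best_span(np.array([0.1, 0.7, 0.2]), np.array([0.1, 0.2, 0.7])))  # -> (1, 2)
```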
Teaching a computer to read a document and answer general questions about it is a challenging yet unsolved problem. In this paper, we describe a novel neural network architecture called the Reasoning Network (ReasoNet) for machine comprehension tasks. ReasoNets take multiple turns to effectively exploit, and then reason over, the relations among queries, documents and answers. Unlike previous approaches that use a fixed number of turns during inference, ReasoNets introduce a termination state to relax the constraint on the depth of reasoning. Using reinforcement learning, a ReasoNet can dynamically decide whether to continue the comprehension process after digesting intermediate results, or to terminate reading when it concludes that the available information is sufficient to produce an answer. ReasoNets achieve superior performance on machine comprehension datasets, including the unstructured CNN and Daily Mail datasets, the Stanford SQuAD dataset, and a structured Graph Reachability dataset.
We frame Question Answering (QA) as a Reinforcement Learning task, an approach that we call Active Question Answering. We propose an agent that sits between the user and a black box QA system and learns to reformulate questions to elicit the best possible answers. The agent probes the system with, potentially many, natural language reformulations of an initial question and aggregates the returned evidence to yield the best answer. The reformulation system is trained end-to-end to maximize answer quality using policy gradient. We evaluate on SearchQA, a dataset of complex questions extracted from Jeopardy!. The agent outperforms a state-of-the-art base model, playing the role of the environment, and other benchmarks. We also analyze the language that the agent has learned while interacting with the question answering system. We find that successful question reformulations look quite different from natural language paraphrases. The agent is able to discover non-trivial reformulation strategies that resemble classic information retrieval techniques such as term re-weighting (tf-idf) and stemming.
Understanding natural language requires common-sense and background knowledge, but in most neural natural language understanding (NLU) systems this knowledge must be acquired from the training corpus during learning and is then static at test time. We introduce a new architecture for the dynamic integration of explicit background knowledge into NLU models. A general-purpose reading module reads background knowledge in the form of free-text statements, together with the task-specific inputs, and produces refined word representations for a task-specific NLU architecture that reprocesses the task inputs with these representations. Experiments on document question answering (DQA) and recognizing textual entailment (RTE) demonstrate the effectiveness and flexibility of the approach. Analysis shows that our model learns to exploit knowledge in a semantically appropriate way.
Reading comprehension models are based on recurrent neural networks that sequentially process the document tokens. As interest turns to answering more complex questions over longer documents, reading large portions of text sequentially becomes a significant bottleneck. Inspired by how humans use document structure, we propose a novel framework for reading comprehension. We represent documents as trees and model an agent that learns to alternate between fast navigation through the document tree and more expensive answer extraction. To encourage exploration of the document tree, we propose a new algorithm based on Deep Q-Networks (DQN) that strategically samples tree nodes at training time. Empirically, we find that our algorithm improves question answering performance over DQN and a strong information retrieval (IR) baseline, and that combining our model with the IR baseline further improves performance.
Recent development of large-scale question answering (QA) datasets triggered a substantial amount of research into end-to-end neural architectures for QA. Increasingly complex systems have been conceived without comparison to simpler neural baseline systems that would justify their complexity. In this work, we propose a simple heuristic that guides the development of neural baseline systems for the extractive QA task. We find that there are two ingredients necessary for building a high-performing neural QA system: first, the awareness of question words while processing the context and, second, a composition function that goes beyond simple bag-of-words modeling, such as recurrent neural networks. Our results show that FastQA, a system that meets these two requirements, can achieve very competitive performance compared with existing models. We argue that this surprising finding puts results of previous systems and the complexity of recent QA datasets into perspective.
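The first ingredient, awareness of question words while processing the context, can be as simple as a per-token feature marking whether a context word also appears in the question (FastQA additionally uses a weighted variant). A minimal sketch of the binary version:

```python
def word_in_question_features(context_tokens, question_tokens):
    """1.0 for context tokens that also occur in the question, else 0.0."""
    question_words = {t.lower() for t in question_tokens}
    return [1.0 if t.lower() in question_words else 0.0 for t in context_tokens]

print(word_in_question_features(
    ["The", "Eiffel", "Tower", "is", "in", "Paris"],
    ["Where", "is", "the", "Eiffel", "Tower"]))
# -> [1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
```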
In this paper we study the problem of answering cloze-style questions over documents. Our model, the Gated-Attention (GA) Reader, integrates a multi-hop architecture with a novel attention mechanism, which is based on multiplicative interactions between the query embedding and the intermediate states of a recurrent neural network document reader. This enables the reader to build query-specific representations of tokens in the document for accurate answer selection. The GA Reader obtains state-of-the-art results on three benchmarks for this task: the CNN & Daily Mail news stories and the Who Did What dataset. The effectiveness of multiplicative interaction is demonstrated by an ablation study, and by comparing to alternative compositional operators for implementing the gated attention.
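The multiplicative interaction at the heart of gated attention can be sketched as: each document token attends over the query token states, and its representation is then gated by elementwise multiplication with the attended query vector. The shapes and the plain dot-product attention below are illustrative simplifications of the layer described above.

```python
import numpy as np

def gated_attention(doc_states, query_states):
    """doc_states: (doc_len, dim); query_states: (q_len, dim); returns (doc_len, dim)."""
    scores = doc_states @ query_states.T                      # (doc_len, q_len)
    alphas = np.exp(scores - scores.max(axis=1, keepdims=True))
    alphas /= alphas.sum(axis=1, keepdims=True)               # softmax over query tokens
    attended_query = alphas @ query_states                    # (doc_len, dim)
    return doc_states * attended_query                        # multiplicative gate
```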
This paper describes a novel hierarchical attention network for reading comprehension style question answering, which aims to answer questions for a given narrative paragraph. In the proposed method, attention and fusion are conducted horizontally and vertically across layers at different levels of granularity between question and paragraph. Specifically, it first encodes the question and paragraph with fine-grained language embeddings, to better capture their respective representations at the semantic level. Then it proposes a multi-granularity fusion approach to fully fuse information from both global and attended representations. Finally, it introduces a hierarchical attention network to focus on the answer span progressively with multi-level soft alignment. Extensive experiments on the large-scale SQuAD and TriviaQA datasets validate the effectiveness of the proposed method. At the time of writing (Jan. 12th, 2018), our model held first place on the SQuAD leaderboard for both single and ensemble models. We also achieve state-of-the-art results on the TriviaQA, AddSent and AddOneSent datasets.