The Second "iFLYTEK Cup" Evaluation Workshop on Chinese Machine Reading Comprehension (CMRC2018)

Posted by HFL-RC on January 22, 2018

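The Joint Laboratory of HIT and iFLYTEK Research (HFL) will again organize the Second "iFLYTEK Cup" Evaluation Workshop on Chinese Machine Reading Comprehension

CMRC 2018 official website: http://www.hfl-tek.com/cmrc2018/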

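Preface

The First "iFLYTEK Cup" Evaluation Workshop on Chinese Machine Reading Comprehension (CMRC2017) concluded successfully in Nanjing in October 2017, alongside the 16th China National Conference on Computational Linguistics (CCL2017), with active participation and strong support from the community. Starting in 2017, the China National Conference on Computational Linguistics (CCL) has planned to host evaluation activities.

As part of the CCL evaluation series, this year we continue with the Second "iFLYTEK Cup" Evaluation Workshop on Chinese Machine Reading Comprehension (CMRC2018), which will be held in the second half of 2018 together with the 17th China National Conference on Computational Linguistics (CCL2018, October 19 ~ 21, 2018, Changsha, Hunan). This evaluation is hosted by the Technical Committee on Computational Linguistics, Chinese Information Processing Society of China (CIPS-CL), organized by the Joint Laboratory of HIT and iFLYTEK Research (HFL), and title-sponsored by iFLYTEK Co., Ltd. It aims to further promote research and development in Chinese machine reading comprehension and to give researchers in related fields a good platform for exchange. The CMRC2018 evaluation committee sincerely invites all organizations to take part!

The Joint Laboratory of HIT and iFLYTEK Research (HFL) has long been committed to research and development in machine reading comprehension. For Chinese reading comprehension, HFL proposed the first Chinese cloze-style reading comprehension dataset, PD&CFT, in July 2016. In 2017 it organized the First "iFLYTEK Cup" Evaluation Workshop on Chinese Machine Reading Comprehension (CMRC2017) and released the related Chinese reading comprehension dataset, further advancing research on Chinese machine reading comprehension. Through an annual evaluation, we hope to work with researchers in related fields to raise the technical level of Chinese machine reading comprehension.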

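Task Description

In last year's CMRC2017 evaluation, the task focused on cloze-style machine reading comprehension. It attracted dozens of organizations to register, and more than ten of them submitted final system results. This year we will focus on span-extraction machine reading comprehension, a further extension of the cloze-style task. English reading comprehension research already has span-extraction datasets such as Stanford's SQuAD and NewsQA, but comparable Chinese resources are still missing. This evaluation will release the first human-annotated Chinese span-extraction reading comprehension dataset. Participants need to model the passage and the question and extract a contiguous span from the passage as the answer. As before, the training and development sets will be public and the test set will be hidden to keep the evaluation fair.

Please watch this website for later updates with a more detailed task description and evaluation procedure.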

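Prizes

This evaluation will select the following awards and grant prize money and certificates of honor. The certificates for the winning teams are provided by the Technical Committee on Computational Linguistics, Chinese Information Processing Society of China (CIPS-CL).

Gold Prize ¥20,000 + Certificate
Silver Prize ¥10,000 + Certificate
Bronze Prize ¥5,000 + Certificate
Best Single System Prize ¥10,000 + Certificate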

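Important Dates

All deadlines below are 23:59 Beijing Time (GMT+8).

The dates below are for reference only. Registered participants should watch this website and email notifications closely.

| Event | Status | Time |
| --- | --- | --- |
| Pre-registration | Open | From now on |
| Confirmation of registration | Not started | April 23, 2018 ~ April 27, 2018 |
| Release of training and development sets | Not started | May 7, 2018 |
| System building and tuning | Not started | May 7, 2018 ~ August 7, 2018 |
| System validation on the development set | Not started | June 7, 2018 ~ August 7, 2018 |
| System submission for the test set | Not started | August 13, 2018 ~ August 17, 2018 |
| Writing the system description abstract | Not started | Mid-September 2018 |
| CMRC2018 workshop | Not started | TBD, co-located with CCL2018 (October 20 or 21, 2018) |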

Registration

Organizations that plan to participate should use the following link: https://wj.qq.com/s/1822356/e14c

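Organization

Host: Technical Committee on Computational Linguistics, Chinese Information Processing Society of China

Organizer: Joint Laboratory of HIT and iFLYTEK Research

Title sponsor: iFLYTEK Co., Ltd.

Evaluation Chairs

Ting Liu (Harbin Institute of Technology)
Yiming Cui (iFLYTEK Co., Ltd.)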

Contact Us

If you have any questions about this evaluation, please contact the organizing team at any time. Organizing team email: cmrc2018@126.com

HFL-RC will organize ‘The Second Evaluation Workshop on Chinese Machine Reading Comprehension (CMRC2018)’

CMRC 2018 Official Website (Chinese only): http://www.hfl-tek.com/cmrc2018/

Introduction

The First Evaluation Workshop on Chinese Machine Reading Comprehension (CMRC2017) was held successfully in Nanjing in October 2017, co-located with CCL2017. CMRC2017 attracted considerable attention from the Chinese NLP community. We sincerely thank all participants and the community for their support.

To further accelerate progress in Chinese machine reading comprehension, we are organizing the Second Evaluation Workshop on Chinese Machine Reading Comprehension (CMRC2018) this year. It will be co-located with CCL2018 in Changsha on October 19 ~ 21, 2018.

CMRC2018 is hosted by the Technical Committee on Computational Linguistics, Chinese Information Processing Society of China (CIPS-CL), organized by the Joint Laboratory of HIT and iFLYTEK Research, and sponsored by iFLYTEK Co., Ltd. We aim to provide a platform for researchers in the field and a forum for exchanging ideas on related research. You are welcome to join us!

The Joint Laboratory of HIT and iFLYTEK Research (HFL) is devoted to research and development in machine reading comprehension. For Chinese reading comprehension, HFL released the first Chinese cloze-style reading comprehension dataset, PD&CFT. In 2017, HFL organized the first evaluation workshop on Chinese machine reading comprehension, which accelerated research on Chinese reading comprehension. Through an annual Chinese machine reading comprehension workshop, we hope researchers in related fields can jointly raise the technical level of Chinese machine reading comprehension.

Task Description

Last year, we focused on the cloze-style reading comprehension task, which attracted dozens of registered teams, more than ten of which submitted system results. This year, we will focus on span-extraction machine reading comprehension, which is an extension of cloze-style reading comprehension. Several English datasets exist for this kind of reading comprehension, such as SQuAD and NewsQA. However, to our knowledge, no comparable Chinese corpus exists yet. To add diversity to Chinese datasets, we will release the first human-annotated Chinese span-extraction dataset. Participants will model the context and the query, and extract a contiguous span from the context as the answer.

Following the rule of the previous evaluation, we will first release the training and validation sets and keep the test set hidden to ensure a fair evaluation.
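To make the span-extraction setting concrete, here is a minimal Python sketch. It is only an illustration: the announcement does not specify the data format, so the `SpanExample` class, its field names, and the use of character offsets are assumptions loosely modeled on SQuAD-style datasets, not the official CMRC2018 format.

```python
from dataclasses import dataclass


@dataclass
class SpanExample:
    """One hypothetical span-extraction item (field names are assumptions)."""
    context: str       # the passage to read
    question: str      # the query about the passage
    answer_start: int  # character offset of the answer inside the context
    answer_text: str   # the gold answer, a contiguous piece of the context


def extract_span(context: str, start: int, end: int) -> str:
    """Return the contiguous character span context[start:end]."""
    if not (0 <= start < end <= len(context)):
        raise ValueError("span must lie inside the context")
    return context[start:end]


def is_valid(example: SpanExample) -> bool:
    """Check that the gold answer really is a contiguous span of the context."""
    end = example.answer_start + len(example.answer_text)
    return example.context[example.answer_start:end] == example.answer_text


# Toy example built from a sentence in this announcement.
context = "CMRC2018 will be co-located with CCL2018 in Changsha."
example = SpanExample(
    context=context,
    question="Where will CMRC2018 be held?",
    answer_start=context.index("Changsha"),
    answer_text="Changsha",
)
```

A system for this task would predict the `start` and `end` positions itself. The sketch only shows that an answer is fully determined by two offsets into the passage, which is what distinguishes span extraction from free-form answer generation.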

Prizes

We will award the top three systems as well as the best single-model system in our evaluation. The prizes are as follows.

Gold Prize ¥20,000 + Certificate*
Silver Prize ¥10,000 + Certificate
Bronze Prize ¥5,000 + Certificate
Best Single System Prize ¥10,000 + Certificate

*The certificate is provided by CIPS-CL

Important Dates

All deadlines are 23:59 Beijing Time (GMT+8).

Please check the website regularly, as the deadlines may change.

| Process | State | Time |
| --- | --- | --- |
| Pre-registration | Open | From now on |
| Confirmation of registration | Not started | April 23, 2018 ~ April 27, 2018 |
| Release of training and development sets | Not started | May 7, 2018 |
| System tuning | Not started | May 7, 2018 ~ August 7, 2018 |
| Validation on the development set | Not started | June 7, 2018 ~ August 7, 2018 |
| Submission of final system | Not started | August 13, 2018 ~ August 17, 2018 |
| System description | Not started | Mid-September 2018 |
| CMRC2018 workshop | Not started | TBD, co-located with CCL2018 (October 20 or 21, 2018) |

Registration for Participation

Please fill in the following form to participate: https://wj.qq.com/s/1822356/e14c

Organization

Host: Technical Committee on Computational Linguistics, Chinese Information Processing Society of China (CIPS-CL)

Organizer: Joint Laboratory of HIT and iFLYTEK Research

Sponsor: iFLYTEK Co., Ltd.

Evaluation Committee:
Ting Liu, Harbin Institute of Technology
Yiming Cui, iFLYTEK Research

Contact us

Any questions? Feel free to contact us. E-mail: cmrc2018@126.com