ConvLab

ConvLab is an open-source, multi-domain, end-to-end dialog system platform that enables researchers to quickly set up experiments with reusable components and to compare a wide range of approaches, from conventional pipeline systems to end-to-end neural models, in common environments.

DSTC8 Track 1: End-to-End Multi-Domain Dialog Challenge

As part of DSTC8, Microsoft Research and Tsinghua University are hosting a track intended to foster progress in building complex bots that span multiple sub-domains to accomplish a complex user goal. To advance state-of-the-art technologies for handling complex dialogs, we offer a timely task focusing on multi-domain, end-to-end task-completion dialog in travel-planning settings.

Evaluation

For both automatic evaluation and human evaluation, we report each team's best submission, selected by success rate.

The Team Submission ID is the ID that CodaLab provided when you made the submission.

The human evaluation leaderboard determines the final ranking.
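As an illustration of the reporting rule above, the following is a minimal sketch of how one might keep each team's best submission by success rate and then order teams for a leaderboard. It is not the organizers' scoring code: the `Submission` type, the `best_per_team` function, and the example records are all hypothetical, and the tie-breaking behavior (first submission seen wins) is an assumption, since the page does not say how ties are handled.

```python
from typing import Dict, Iterable, List, NamedTuple


class Submission(NamedTuple):
    """Hypothetical record for one evaluated submission."""
    team_id: str
    spec: str
    success_rate: float  # percentage, e.g. 64.5 means 64.5%


def best_per_team(submissions: Iterable[Submission]) -> List[Submission]:
    """Keep each team's highest-success-rate submission, best team first.

    Assumption: on a tie within a team, the submission seen first is kept.
    """
    best: Dict[str, Submission] = {}
    for sub in submissions:
        current = best.get(sub.team_id)
        if current is None or sub.success_rate > current.success_rate:
            best[sub.team_id] = sub
    # Sort teams by their best success rate, highest first.
    return sorted(best.values(), key=lambda s: s.success_rate, reverse=True)


# Made-up example records, not real challenge results.
example = [
    Submission("team_a", "submission1", 61.0),
    Submission("team_a", "submission2", 64.5),
    Submission("team_b", "submission1", 70.2),
]

ranked = best_per_team(example)
for rank, sub in enumerate(ranked, start=1):
    print(rank, sub.team_id, sub.spec, f"{sub.success_rate:.2f}%")
```

Each team appears once in the result, which matches the one-row-per-team layout of the leaderboards below.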

Human Evaluation Leaderboard

Rank	Team Submission ID	Spec #	Success Rate	Language Understanding Score	Response Appropriateness Score	Turns
1	504430	submission4	68.32%	4.149	4.287	19.507
2	504429	submission1	65.81%	3.538	3.632	15.481
3	504563	submission2	65.09%	3.538	3.84	13.884
4	504651	submission1	64.10%	3.547	3.829	16.906
5	504641	submission2	62.91%	3.742	3.815	14.968
6	504569	submission4	54.90%	3.784	3.824	14.107
7	504529	submission1	43.56%	3.554	3.446	21.818
8	504582	submission2	36.45%	2.944	3.103	21.128
9	504666	submission1	25.77%	2.072	2.258	16.8
10	504502	submission2	23.30%	2.612	2.65	15.333
11	504524	submission1	18.81%	1.99	2.059	16.105
N/A	Baseline	milu_rule_rule_template	56.45%	3.097	3.556	17.543

Automatic Evaluation Leaderboard

Rank	Team Submission ID	Spec #	Success Rate	Return	Turns	Precision	Recall	F1	Book Rate
1	504429	submission1	88.80%	61.56	7	0.92	0.96	0.93	93.75%
2	504563	submission4	88.60%	61.63	6.69	0.83	0.94	0.87	96.39%
3	504651	submission1	82.20%	54.09	6.55	0.71	0.92	0.78	94.56%
4	504641	submission4	80.60%	51.51	7.21	0.78	0.89	0.81	86.45%
5	504430	submission1	79.40%	49.69	7.59	0.8	0.89	0.83	87.02%
6	504529	submission1	58.00%	23.7	7.9	0.61	0.73	0.64	75.71%
7	504666	submission1	56.60%	20.14	9.78	0.68	0.77	0.7	58.63%
8	504502	submission1	55.20%	17.18	11.06	0.73	0.74	0.71	71.87%
9	504524	submission1	54.00%	17.15	9.65	0.66	0.76	0.69	72.42%
10	504569	submission4	52.20%	15.81	8.83	0.46	0.75	0.54	76.38%
11	504582	submission2	34.80%	-6.39	10.15	0.65	0.75	0.68	N/A
12	504632	submission1	0%	-58.88	20.88	0	0.01	0	N/A
N/A	Baseline	milu_rule_rule_template	63.40%	30.41	7.67	0.72	0.83	0.75	86.37%