Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

8-2021

Abstract

In machine learning we often encounter structured output prediction problems (SOPPs), i.e., problems where the output space admits a rich internal structure. Application domains where SOPPs naturally arise include natural language processing, speech recognition, and computer vision. Typical SOPPs have an extremely large label set, which grows exponentially with the size of the output. Existing generalization analyses imply bounds with at least a square-root dependency on the cardinality d of the label set, which can be vacuous in practice. In this paper, we significantly improve the state of the art by developing novel high-probability bounds with a logarithmic dependency on d. Furthermore, we leverage the lens of algorithmic stability to develop generalization bounds in expectation without any dependency on d. Our results therefore build a solid theoretical foundation for learning in large-scale SOPPs. Finally, we extend our results to learning with weakly dependent data.
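
As a rough schematic of the claimed improvement (our illustration, not the paper's exact statements; the constant factors and the complexity term comp(n) are placeholders), the dependency on the label-set cardinality d shrinks from polynomial to logarithmic:

% Schematic only: comp(n) stands in for whatever sample-size-dependent
% complexity term the actual bounds carry; it is not from the paper.
\[
  \text{prior bounds:}\quad \mathrm{gap} \;\lesssim\; \sqrt{d}\cdot\mathrm{comp}(n),
  \qquad
  \text{this paper:}\quad \mathrm{gap} \;\lesssim\; \log(d)\cdot\mathrm{comp}(n),
\]

where n is the sample size; per the abstract, the stability-based bounds in expectation remove the dependence on d entirely.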

Keywords

Structured output prediction, multi-label, neural networks, multi-class, sequence-to-sequence, stochastic gradient descent

Discipline

Artificial Intelligence and Robotics | Graphics and Human Computer Interfaces

Research Areas

Intelligent Systems and Optimization

Publication

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI 2021), Montreal, Canada, August 19-27, 2021

Volume

30

First Page

2841

Last Page

2847

ISBN

978-0-9992411-9-6

Identifier

10.24963/ijcai.2021/391

Publisher

International Joint Conferences on Artificial Intelligence

City or Country

Montreal, Canada

Additional URL

https://doi.org/10.24963/ijcai.2021/391
