Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
10-2022
Abstract
Supporting real-time referring expression comprehension (REC) on pervasive devices is an important capability for human-AI collaborative tasks. Model pruning techniques, applied to DNN models, can enable real-time execution even on resource-constrained devices. However, existing pruning strategies are designed principally for uni-modal applications, and suffer a significant loss of accuracy when applied to REC tasks that require fusion of textual and visual inputs. We thus present a multi-modal pruning model, LGMDP, which uses language as a pivot to dynamically and judiciously select the relevant computational blocks that need to be executed. LGMDP also introduces a new SoftSkip mechanism, whereby 'skipped' visual scales are not completely eliminated but approximated with minimal additional computation. Experimental evaluation, using 3 benchmark REC datasets and an embedded device implementation, shows that LGMDP can achieve 33% latency savings with an accuracy loss of only 0.5%-2%.
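The two ideas in the abstract, language-guided selection of computational blocks and soft approximation of skipped visual scales, can be illustrated with a toy sketch. This is not the authors' implementation; all function names, shapes, and the pooling-based approximation are hypothetical stand-ins for the actual LGMDP/SoftSkip design:

```python
import numpy as np

rng = np.random.default_rng(0)

def gate_scales(lang_emb, weights, threshold=0.5):
    """Hypothetical language-conditioned gate: score each visual scale
    from the language embedding and keep only scales above threshold."""
    scores = 1.0 / (1.0 + np.exp(-(lang_emb @ weights)))  # sigmoid per scale
    keep = scores > threshold
    if not keep.any():
        keep[np.argmax(scores)] = True  # always process at least one scale
    return keep, scores

def soft_skip(features, keep_mask):
    """Toy stand-in for SoftSkip: a skipped scale is not zeroed out but
    approximated cheaply -- here by broadcasting the mean activation of
    the nearest kept scale instead of running its full block."""
    kept = [i for i, k in enumerate(keep_mask) if k]
    out = []
    for i, feat in enumerate(features):
        if keep_mask[i]:
            out.append(feat)  # full computation retained for this scale
        else:
            nearest = min(kept, key=lambda j: abs(j - i))
            out.append(np.full_like(feat, features[nearest].mean()))
    return out

lang_emb = rng.normal(size=8)                           # hypothetical text encoding
weights = rng.normal(size=(8, 3))                       # one gate score per visual scale
features = [rng.normal(size=(4, 4)) for _ in range(3)]  # 3 visual scales

keep, scores = gate_scales(lang_emb, weights)
fused = soft_skip(features, keep)
```

The point of the sketch is the control flow: skipped scales still contribute a cheap approximate tensor downstream rather than vanishing entirely, which is the distinction the abstract draws between SoftSkip and hard elimination.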
Keywords
Human-Robot Interaction, Referring Expression Comprehension, Pruning, Computer Vision, Natural Language Processing
Discipline
Computer Engineering | Software Engineering
Research Areas
Software and Cyber-Physical Systems
Publication
MM '22: Proceedings of the 30th ACM International Conference on Multimedia, Lisboa, October 10-14
First Page
3608
Last Page
3616
ISBN
9781450392037
Identifier
10.1145/3503161.3548432
Publisher
ACM
City or Country
New York
Citation
WEERAKOON, Dulanga; SUBBARAJU, Vigneshwaran; TRAN, Tuan; and MISRA, Archan.
SoftSkip: Empowering multi-modal dynamic pruning for single-stage referring comprehension. (2022). MM '22: Proceedings of the 30th ACM International Conference on Multimedia, Lisboa, October 10-14. 3608-3616.
Available at: https://ink.library.smu.edu.sg/sis_research/7707
Copyright Owner and License
Publisher
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1145/3503161.3548432