Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

5-2023

Abstract

An embedded system is a system consisting of software code, controller hardware, and I/O (Input/Output) hardware that performs a specific task. Developing an embedded system presents several challenges. First, the development often involves configuring hardware that requires domain-specific knowledge. Second, the library for the hardware may have API usage patterns that must be followed. To overcome such challenges, we propose a framework called ArduinoProg towards the automatic generation of Arduino applications. ArduinoProg takes a natural language query as input and outputs the configuration and API usage pattern for the hardware described in the query. Motivated by our findings on the characteristics of real-world queries posted in the official Arduino forum, we formulate ArduinoProg as three components, i.e., Library Retriever, Configuration Classifier, and Pattern Generator. First, Library Retriever preprocesses the input query and retrieves a set of relevant libraries using either lexical matching or vector-based similarity. Second, given Library Retriever's output, Configuration Classifier infers the hardware configuration by classifying the method definitions found in the library's implementation files into a hardware configuration class. Third, Pattern Generator also takes Library Retriever's output as input and leverages a sequence-to-sequence model to generate the API usage pattern. Having instantiated each component of ArduinoProg with various machine learning models, we have evaluated ArduinoProg on real-world queries. Library Retriever achieves a Precision@K of 44.0%-97.1%; Configuration Classifier achieves an Area under the Receiver Operating Characteristic curve (AUC) of 0.79-0.95; Pattern Generator yields a Normalized Discounted Cumulative Gain (NDCG)@K of 0.45-0.73. Such results indicate that ArduinoProg can generate practical and useful hardware configurations and API usage patterns to guide developers in writing Arduino code.
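
For readers unfamiliar with the ranking metrics quoted in the abstract, the following is a minimal sketch of one common way to compute Precision@K and NDCG@K. It is not taken from the paper: the function names, the binary relevance labels, the log2 discount, and the example library ranking are all assumptions made for illustration, and the authors' exact evaluation setup may differ.

```python
import math


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that appear in the relevant set."""
    top_k = ranked[:k]
    return sum(1 for item in top_k if item in relevant) / k


def dcg_at_k(gains, k):
    """Discounted cumulative gain: gain at rank i (0-based) is divided by log2(i + 2)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(gains, k):
    """DCG normalized by the DCG of the ideal (descending) ordering of the same gains."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


# Hypothetical retrieval result for a single query (made-up data, not from the paper).
ranked_libraries = ["Servo", "Wire", "LiquidCrystal", "SPI"]
relevant_libraries = {"Servo", "LiquidCrystal"}

# Binary gain per rank: 1 if the library at that rank is relevant, else 0.
gains = [1 if lib in relevant_libraries else 0 for lib in ranked_libraries]

p_at_2 = precision_at_k(ranked_libraries, relevant_libraries, 2)
ndcg_at_3 = ndcg_at_k(gains, 3)
```

In this made-up example, one of the top two libraries is relevant, so Precision@2 is 0.5. NDCG@3 falls below 1.0 because the second relevant library is ranked third rather than second.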

Keywords

API recommendation, Arduino, code generation, deep learning, embedded system, information retrieval

Discipline

Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

Proceedings of the 20th International Conference on Mining Software Repositories MSR 2023: Melbourne, May 15-16

ISBN

9798350311846

Identifier

https://doi.org/10.1109/MSR59073.2023.00069

Publisher

IEEE

City or Country

Piscataway, NJ

Comments

scisug
