AMS: Generating AutoML Search Spaces from Weak Specifications (ESEC/FSE 2020 - Research Papers)

Who

José Pablo Cambronero, Jürgen Cito, Martin C. Rinard

Track

ESEC/FSE 2020 Research Papers

When

Wed 11 Nov 2020 17:30 - 17:32 (UTC) at Virtual room 2 - ML Model Building

Abstract

We consider a usage model for automated machine learning (AutoML) in which users
can influence the generated pipeline by providing a weak pipeline
specification: an unordered set of API components from which the AutoML
system draws the components it places into the generated pipeline.
Such specifications allow users to express preferences over the components that appear in the
pipeline, for example, a desire for interpretable components to appear
in the pipeline. We present AMS, an approach to automatically strengthen
weak specifications to include unspecified
complementary and functionally related API components, populate the space of
hyperparameters and their values, and pair this configuration with a search
procedure to produce a strong pipeline specification: a full
description of the search space for candidate pipelines. AMS uses
normalized pointwise mutual information on a code corpus to identify
complementary components, BM25 as a lexical similarity score
over the target API's documentation to identify
functionally related components, and frequency distributions in the code corpus to
extract key hyperparameters and values. We show that strengthened specifications
can produce pipelines that outperform the pipelines generated from the
initial weak specification and an expert-annotated variant, while producing pipelines that still
reflect the user preferences captured in the original weak specification.
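
The abstract names normalized pointwise mutual information (NPMI) over a code corpus as the signal for complementary components. The following is a minimal sketch of that scoring idea, not the authors' implementation. It assumes the corpus has already been reduced to one set of API component names per code snippet; the function names `npmi_scores` and `complements`, the toy corpus, and the choice to score only pairs that co-occur at least once are all hypothetical choices made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def npmi_scores(corpus):
    """Score every co-occurring pair of components with NPMI.

    corpus: list of sets of component names, one set per code snippet
    (an assumed preprocessing step, not described in the abstract).
    Returns a dict mapping frozenset({a, b}) to a value in [-1, 1].
    """
    n = len(corpus)
    single = Counter()  # number of snippets containing each component
    pair = Counter()    # number of snippets containing both components
    for components in corpus:
        unique = set(components)
        single.update(unique)
        pair.update(frozenset(p) for p in combinations(sorted(unique), 2))

    scores = {}
    for key, count in pair.items():
        a, b = tuple(key)
        if count == n:
            # Both components appear in every snippet: -log p(a, b) is 0,
            # so use the limiting NPMI value of 1.
            scores[key] = 1.0
            continue
        p_ab = count / n
        p_a = single[a] / n
        p_b = single[b] / n
        pmi = math.log(p_ab / (p_a * p_b))
        scores[key] = pmi / -math.log(p_ab)
    return scores


def complements(component, scores, k=3):
    """Return up to k components with the highest NPMI against `component`."""
    ranked = []
    for key, score in scores.items():
        if component in key:
            (other,) = key - {component}
            ranked.append((score, other))
    # Highest score first; ties broken alphabetically for determinism.
    ranked.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in ranked[:k]]


# Toy corpus (invented for illustration only).
corpus = [
    {"StandardScaler", "LogisticRegression"},
    {"StandardScaler", "LogisticRegression"},
    {"StandardScaler", "SVC"},
    {"DecisionTreeClassifier"},
]

if __name__ == "__main__":
    print(complements("StandardScaler", npmi_scores(corpus)))
```

In this sketch, a weak specification containing only `StandardScaler` would be extended with the components that co-occur with it more often than chance in the toy corpus. The functionally related components (BM25 over documentation) and the hyperparameter extraction described in the abstract are not covered here.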

DOI

https://doi.org/10.1145/3368089.3409700

José Pablo Cambronero

Massachusetts Institute of Technology, USA

United States

Jürgen Cito

TU Wien and MIT

United States

Martin C. Rinard

Massachusetts Institute of Technology, USA

United States

Session Program

Wed 11 Nov
Displayed time zone: (UTC) Coordinated Universal Time

17:30 - 18:00	ML Model Building (Research Papers / Student Research Competition / Paper Presentations / Visions and Reflections) at Virtual room 2

17:30 2m Talk		AMS: Generating AutoML Search Spaces from Weak Specifications (Research Papers). José Pablo Cambronero (Massachusetts Institute of Technology, USA), Jürgen Cito (TU Wien and MIT), Martin C. Rinard (Massachusetts Institute of Technology, USA)
17:33 1m Talk		Continuous Experimentation on Artificial Intelligence Software: A Research Agenda (Visions and Reflections). Anh Nguyen-Duc (University of South Eastern Norway), Pekka Abrahamsson (University of Jyväskylä)
17:35 1m Talk		DENAS: Automated Rule Generation by Knowledge Extraction from Neural Networks (Research Papers). Simin Chen, Soroush Bateni, Sampath Grandhi, Xiaodi Li, Cong Liu, Wei Yang (all University of Texas at Dallas, USA)
17:37 1m Talk		On Decomposing a Deep Neural Network into Modules (Research Papers; ACM SIGSOFT Distinguished Paper Award). Rangeet Pan, Hridesh Rajan (both Iowa State University, USA)
17:39 1m Talk		Synthesizing Correct Code for Machine Learning Programs (Student Research Competition). Joshua Gisi (North Dakota State University, USA)
17:41 19m Talk		Conversations on ML Model Building (Paper Presentations). José Pablo Cambronero (Massachusetts Institute of Technology, USA), Rangeet Pan (Iowa State University, USA), Simin Chen, Wei Yang (University of Texas at Dallas, USA), M: John-Paul Ore (North Carolina State University)