Archive 12 API mapping research papers (related work for PhD)

Reference papers on API mapping, migration, and evolution collected during
PhD research (2018). Topics include: API usage adaptation (LibSync),
statistical API mapping mining (StaMiner, MAM), API mapping via vector
representations (Word2Vec), text mining for API mappings (TMAP),
library migration graphs, framework evolution (AURA), class library
migration refactoring, and API specification inference (Doc2Spec).
Yanxin Lu
2026-04-06 12:06:56 -07:00
parent b85169f4e7
commit ff146a9362
24 changed files with 6949 additions and 0 deletions

@@ -0,0 +1,15 @@
---
category: academic
type: academic
person: Yanxin Lu
date: 2016
source: Nguyen_16_vector.pdf
---
# Mapping API Elements for Code Migration with Vector Representations
Trong Duc Nguyen, Anh Tuan Nguyen, Tien N. Nguyen (Iowa State University)
ICSE Companion 2016
Code migration between languages is challenging because different languages require developers to use different software libraries and frameworks. This paper introduces a statistical approach with vector representations to mine single API mappings between Java JDK and C# .NET. The authors characterize an API element by its usage context consisting of surrounding, co-occurring APIs, and use Word2Vec to project the APIs into continuous vector spaces. The transformation matrix between the two vector spaces is learned from a small set of human-written pairs of mappings, then used to derive other mappings and generate corresponding API sequences in C# via a phrase-based translation model.
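The transformation-matrix step can be illustrated with a minimal sketch. It assumes the API embeddings already exist (the paper obtains them with Word2Vec; here they are made-up 2-D toy vectors), and it fits the matrix with plain stochastic gradient descent on squared error, which is a stand-in rather than the authors' exact training procedure. The API names and the helpers `learn_transform` and `best_mapping` are hypothetical.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def learn_transform(pairs, dim_src, dim_tgt, lr=0.1, epochs=500):
    """Fit W so that W @ src ~= tgt for each seed pair (SGD on squared error)."""
    W = [[0.0] * dim_src for _ in range(dim_tgt)]
    for _ in range(epochs):
        for x, y in pairs:
            pred = matvec(W, x)
            for i in range(dim_tgt):
                err = pred[i] - y[i]
                for j in range(dim_src):
                    W[i][j] -= lr * err * x[j]
    return W

def cosine(a, b):
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm else 0.0

def best_mapping(W, src_vec, tgt_vectors):
    """Project a source API into the target space and pick the nearest target API."""
    projected = matvec(W, src_vec)
    return max(tgt_vectors, key=lambda name: cosine(projected, tgt_vectors[name]))

# Toy embeddings (assumed, not real Word2Vec output).
java = {"File.exists": [1.0, 0.0], "List.add": [0.0, 1.0], "List.size": [0.6, 0.8]}
csharp = {"File.Exists": [0.0, 1.0], "List.Add": [-1.0, 0.0], "List.Count": [-0.8, 0.6]}

# Small set of human-written seed mappings used to learn the matrix.
seeds = [(java["File.exists"], csharp["File.Exists"]),
         (java["List.add"], csharp["List.Add"])]
W = learn_transform(seeds, dim_src=2, dim_tgt=2)

# Derive a mapping that was not among the seeds.
print(best_mapping(W, java["List.size"], csharp))
```

In this toy setup the C# space is simply a rotation of the Java space, so two seed pairs are enough to recover the matrix; real embeddings would need many more seeds and higher dimensions.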

@@ -0,0 +1,17 @@
---
category: academic
type: academic
person: Yanxin Lu
date: 2005
source: balaban_05_migration.pdf
---
# Refactoring Support for Class Library Migration
Ittai Balaban (NYU), Frank Tip, Robert Fuhrer (IBM T.J. Watson Research Center)
OOPSLA 2005
As object-oriented class libraries evolve, classes are occasionally deprecated in favor of others with roughly the same functionality. In Java, for example, Hashtable has been superseded by HashMap, and Iterator is now preferred over Enumeration. Migrating client applications to use the new idioms is desirable, but making the required changes to declarations and allocation sites can be quite labor-intensive. Moreover, migration becomes complicated if a legacy class is not completely equivalent to its replacement, or if multiple interdependent classes must be migrated simultaneously.
The authors present an approach in which mappings between legacy classes and their replacements are specified by the programmer. Then, an analysis based on type constraints determines where declarations and allocation sites can be updated. The method was implemented in Eclipse, and evaluated on a number of Java applications. On average, the tool could migrate more than 90% of the references to legacy classes.

File diff suppressed because it is too large

@@ -0,0 +1,15 @@
---
category: academic
type: academic
person: Yanxin Lu
date: 2013
source: gokhale_13_infer.pdf
---
# Inferring Likely Mappings between APIs
Amruta Gokhale, Vinod Ganapathy, Yogesh Padmanaban (Rutgers University)
ICSE 2013
Software developers often need to port applications written for a source platform to a target platform. A key task is to replace the source platform API with corresponding methods from the target platform API. This paper develops a novel approach to the problem of inferring likely mappings between the APIs of a source and target platform. The approach is tailored to the case where the source and target platform each have independently-developed applications that implement similar functionality. The authors observe that in building these applications, developers exercised knowledge of the corresponding APIs, and develop a technique to systematically harvest this knowledge and infer likely mappings between the APIs. The output is a ranked list of target API methods or method sequences that likely map to each source API method or method sequence. The prototype tool Rosetta was applied to infer likely mappings between the Java2 Platform Mobile Edition and Android graphics APIs.

@@ -0,0 +1,15 @@
---
category: academic
type: academic
person: Yanxin Lu
date: 2014
source: gokhale_14_data.pdf
---
# Data-Driven Inference of API Mappings
Amruta Gokhale, Daeyoung Kim, Vinod Ganapathy (Rutgers University)
PROMOTO 2014
Porting mobile applications from one platform to another is one strategy used by developers to write cross-platform apps. One challenging task in porting is transforming the app to use the appropriate platform-specific APIs. The authors propose a novel approach to extract functionally equivalent API methods of two platforms, inspired by a technique in natural language processing that extracts a translation dictionary from non-parallel corpora of two natural languages. The approach statically analyses reverse-engineered code of the app to construct program paths, which form sentences in an unknown language where words are individual API methods. These are fed to an inference engine that extracts mappings between words of the two languages.

@@ -0,0 +1,15 @@
---
category: academic
type: academic
person: Yanxin Lu
date: 2010
source: nguyen_10_adaptation.pdf
---
# A Graph-based Approach to API Usage Adaptation
Hoan Anh Nguyen, Tung Thanh Nguyen, Gary Wilson Jr., Anh Tuan Nguyen, Miryung Kim, Tien N. Nguyen (Iowa State University, UT Austin)
OOPSLA/SPLASH 2010
This paper presents LibSync, which guides developers in adapting API usage code by learning complex API usage adaptation patterns from other clients that already migrated to a new library version (and also from the API usages within the library's test code). LibSync uses several graph-based techniques to (1) identify changes to API declarations by comparing two library versions, (2) extract associated API usage skeletons before and after library migration, and (3) compare the extracted API usage skeletons to recover API usage adaptation patterns. Using the learned adaptation patterns, LibSync recommends the locations and edit operations for adapting API usages. Evaluation on real-world software systems shows precision of 100% and recall of 91%.

@@ -0,0 +1,15 @@
---
category: academic
type: academic
person: Yanxin Lu
date: 2014
source: nguyen_14_staminer.pdf
---
# Statistical Learning Approach for Mining API Usage Mappings for Code Migration
Anh Tuan Nguyen, Hoan Anh Nguyen, Tung Thanh Nguyen, Tien N. Nguyen (Iowa State University, Utah State University)
ASE 2014
The same software product can nowadays appear on multiple platforms and devices. To address business needs, software companies develop a product in one language and then migrate it to another. The authors introduce StaMiner, a novel data-driven approach that statistically learns the mappings between APIs from a corpus of corresponding client code in two languages, Java and C#. Instead of using heuristics on textual or structural similarity to map API methods and classes, StaMiner is based on a statistical model that learns the mappings from a corpus and provides mappings for APIs with all possible arities. Empirical evaluation shows StaMiner can detect API usage mappings with higher accuracy than state-of-the-art approaches. With the resulting API mappings mined by StaMiner, Java2CSharp, an existing migration tool, could achieve a higher level of accuracy.
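To make "statistically learns the mappings from a corpus" concrete, the sketch below runs an IBM Model 1 style expectation-maximization alignment over paired API sequences. This is a swapped-in textbook technique chosen for illustration; the summary above does not specify StaMiner's model in this detail. The corpus, the API names, and the helpers `train_alignment` and `best_target` are hypothetical.

```python
from collections import defaultdict

def train_alignment(parallel, iterations=20):
    """Estimate t(target_api | source_api) with IBM Model 1 style EM."""
    tgt_vocab = {t for _, tgt in parallel for t in tgt}
    # Start from a uniform distribution over target APIs.
    t_prob = defaultdict(lambda: 1.0 / len(tgt_vocab))
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for src, tgt in parallel:
            for t in tgt:
                # E-step: share each target API among the co-occurring source APIs.
                norm = sum(t_prob[(t, s)] for s in src)
                for s in src:
                    frac = t_prob[(t, s)] / norm
                    count[(t, s)] += frac
                    total[s] += frac
        # M-step: renormalize the expected counts per source API.
        t_prob = defaultdict(
            float, {(t, s): c / total[s] for (t, s), c in count.items()}
        )
    return t_prob

def best_target(t_prob, source_api):
    """Return the most probable target API for one source API."""
    candidates = {t: p for (t, s), p in t_prob.items() if s == source_api}
    return max(candidates, key=candidates.get)

# Toy parallel corpus: API sequences from corresponding Java and C# methods (assumed).
parallel = [
    (["File.exists"], ["File.Exists"]),
    (["File.exists", "List.add"], ["File.Exists", "List.Add"]),
]
t_prob = train_alignment(parallel)
print(best_target(t_prob, "List.add"))
```

No name similarity is used: `List.add` is paired with `List.Add` only because `File.exists` is already explained by the first sequence pair, which is the kind of evidence a corpus-driven model relies on.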

@@ -0,0 +1,15 @@
---
category: academic
type: academic
person: Yanxin Lu
date: 2015
source: pandita_15_text.pdf
---
# Discovering Likely Mappings between APIs using Text Mining
Rahul Pandita (NC State), Raoul Praful Jetley, Sithu D Sudarsan (ABB Corporate Research), Laurie Williams (NC State)
SCAM 2015
Developers often release different versions of their applications to support various platform/programming-language APIs. This paper proposes TMAP, a text-mining-based approach that discovers likely API method mappings from the similarity between the textual descriptions in the source and target API documentation. TMAP builds a vector space model of the target API method descriptions, then queries it with queries generated automatically from the source API. Results show that TMAP on average found relevant mappings for 57% more methods than previous approaches (Rosetta and StaMiner), and on average found exact mappings for 6.5 more methods per class, with a maximum of 21 additional exact mappings for a single class.
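The vector-space querying idea can be sketched with a small TF-IDF index and cosine ranking. This is a generic information-retrieval sketch under stated assumptions, not TMAP's actual pipeline: the method names are illustrative, the descriptions are made-up paraphrases rather than real documentation text, and the helpers `build_index` and `rank` are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def build_index(docs):
    """Build TF-IDF vectors for the target API method descriptions."""
    n = len(docs)
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    df = Counter(term for toks in tokenized.values() for term in set(toks))
    idf = {term: math.log(n / c) + 1.0 for term, c in df.items()}
    vectors = {
        name: {term: tf * idf[term] for term, tf in Counter(toks).items()}
        for name, toks in tokenized.items()
    }
    return idf, vectors

def cosine(a, b):
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank(query, idf, vectors):
    """Rank target methods by similarity to a query built from a source description."""
    qvec = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    return sorted(vectors, key=lambda name: cosine(qvec, vectors[name]), reverse=True)

# Target API descriptions (made-up paraphrases, for illustration only).
target_docs = {
    "Canvas.drawLine": "draws a line segment between two points",
    "Canvas.drawRect": "draws a rectangle with the given bounds",
    "Paint.setColor": "sets the color used for drawing",
}
idf, vectors = build_index(target_docs)

# Query generated from a (hypothetical) source method description.
ranking = rank("draws a line between the coordinates", idf, vectors)
print(ranking[0])
```

A real system would also need stemming, stop-word handling, and query generation from structured documentation; the sketch only shows the index-then-query shape.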

@@ -0,0 +1,15 @@
---
category: academic
type: academic
person: Yanxin Lu
date: 2017
source: phan_17_migration.pdf
---
# Statistical Migration of API Usages
Hung Dang Phan, Anh Tuan Nguyen, Trong Duc Nguyen (Iowa State University), Tien N. Nguyen (UT Dallas)
ICSE Companion 2017
To support code migration, the authors introduce JV2CS, a tool to generate a sequence of C# API elements and related control units that are needed to migrate a given Java code fragment. First, they mine the mappings between single APIs in Java and C#. To overcome the lexical mismatch between API names, they represent an API by its usages instead of its name, characterizing it with its context consisting of surrounding APIs, and use Word2Vec to project the APIs into continuous vector spaces. The transformation matrix is learned from a small set of human-written pairs of mappings, then used to derive other mappings and generate corresponding API sequences in C# via a statistical machine translation (SMT) tool.

@@ -0,0 +1,15 @@
---
category: academic
type: academic
person: Yanxin Lu
date: 2012
source: teyton_12_graphs.pdf
---
# Mining Library Migration Graphs
Cedric Teyton, Jean-Remy Falleri, Xavier Blanc (Univ. Bordeaux, LaBRI)
WCRE 2012
Software systems depend heavily on external libraries, which are chosen at design time. However, the relevance of any library inevitably changes over the life cycle of the project and of the library itself. This paper proposes an approach that identifies sets of similar libraries and produces library migration graphs that show how existing projects have performed migrations among them. These graphs, constructed from the observation of a large number of software projects, ease the discovery and selection of library replacements. The approach analyses modern software project management tools (Maven, Ivy, Gradle), where dependencies are explicit, to mine common migration rules.
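One plausible reading of "mining migration rules from explicit dependencies" is sketched below: compare the dependency sets of consecutive project versions, treat every (removed, added) pair as a candidate migration, and keep the pairs seen in enough projects as graph edges. The project histories, the `min_support` threshold, and the helper `mine_migration_edges` are assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter
from itertools import product

def mine_migration_edges(histories, min_support=2):
    """Count (removed, added) library pairs across version changes; keep frequent ones."""
    counts = Counter()
    for versions in histories.values():
        for before, after in zip(versions, versions[1:]):
            removed = before - after
            added = after - before
            # Every removed/added combination is a candidate migration rule.
            counts.update(product(sorted(removed), sorted(added)))
    return {edge: n for edge, n in counts.items() if n >= min_support}

# Dependency sets per project version (toy data, assumed).
histories = {
    "project-a": [{"log4j", "junit"}, {"slf4j", "junit"}],
    "project-b": [{"log4j"}, {"slf4j", "guava"}],
    "project-c": [{"commons-logging", "junit"}, {"slf4j", "testng"}],
}
edges = mine_migration_edges(histories)
print(edges)  # → {('log4j', 'slf4j'): 2}
```

The support threshold is what filters out coincidental pairs such as `('log4j', 'guava')`, which appear when a project adds an unrelated library in the same version change.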

@@ -0,0 +1,15 @@
---
category: academic
type: academic
person: Yanxin Lu
date: 2010
source: wu_10_aura.pdf
---
# AURA: A Hybrid Approach to Identify Framework Evolution
Wei Wu, Yann-Gael Gueheneuc, Giuliano Antoniol (Ecole Polytechnique de Montreal), Miryung Kim (UT Austin)
ICSE 2010
Software frameworks and libraries are indispensable to today's software systems. As they evolve, it is often time-consuming for developers to keep their code up-to-date. The authors introduce AURA, a novel hybrid approach that combines call dependency and text similarity analyses to overcome the limitations of existing approaches, which cannot automatically handle one-replaced-by-many or many-replaced-by-one change rules. AURA was implemented in a Java system and compared with three previous approaches. On average, the recall of AURA is 53.07% higher while its precision is similar (0.10% lower).

File diff suppressed because it is too large

@@ -0,0 +1,15 @@
---
category: academic
type: academic
person: Yanxin Lu
date: 2010
source: zhong_10_mam.pdf
---
# Mining API Mapping for Language Migration
Hao Zhong, Suresh Thummalapenta, Tao Xie, Lu Zhang, Qing Wang (Chinese Academy of Sciences, Peking University, NC State)
ICSE 2010
To address business requirements, companies often have to release different versions of their projects in different languages. Manually migrating projects (e.g., from Java to C#) is tedious and error-prone. This paper proposes MAM (Mining API Mapping), a novel approach that automatically mines how APIs of one language are mapped to APIs of another using API client code. MAM accepts a set of projects, each with two versions in two languages, and mines API mapping relations between those two languages based on how APIs are used by the two versions. Results show that the tool mines 25,805 unique mapping relations of APIs between Java and C# with more than 80% accuracy, and that the mined relations help reduce compilation errors by 54.4% and defects by 43.0% during migration with Java2CSharp.

Binary file not shown.

@@ -0,0 +1,13 @@
---
category: academic
type: academic
person: Yanxin Lu
date: 2012
source: zhong_12_spec.pdf
---
# Inferring Resource Specifications from Natural Language API Documentation
Hao Zhong, Lu Zhang, Tao Xie, Hong Mei (Peking University, NC State)
Typically, software libraries provide API documentation through which developers can learn how to use libraries correctly. However, developers may still write code inconsistent with API documentation and thus introduce bugs. The authors propose Doc2Spec, an approach that infers resource specifications from existing natural-language API documentation. The approach uses Natural Language Processing (NLP) techniques to analyze API documentation and infer resource specifications. Evaluation on Javadocs of five libraries shows the approach infers various specifications with relatively high precision, recall, and F-score. The inferred specifications are useful for detecting both previously known and unknown bugs in open source projects.