ACM Transactions on Software Engineering and Methodology (TOSEM)

Latest Articles


An Empirical Study on Learning Bug-Fixing Patches in the Wild via Neural Machine Translation

Millions of open source projects with numerous bug fixes are available in code repositories. This proliferation of software development histories can...

The Virtual Developer: Integrating Code Generation and Manual Development with Conflict Resolution

Model Driven Development (MDD) requires proper tools to derive the implementation code from the application models. However, the integration of handwritten and generated code is a long-standing issue that affects the adoption of MDD in the industry. This article presents a model and code co-evolution approach that addresses such a problem a...

Automated Reuse of Model Transformations through Typing Requirements Models

Model transformations are key elements of model-driven engineering, where they are used to automate...

Recommending New Features from Mobile App Descriptions

The rapidly evolving mobile applications (apps) have brought great demand for developers to identify new features by inspecting the descriptions of...

Precise Learn-to-Rank Fault Localization Using Dynamic and Static Features of Target Programs

Finding the root cause of a bug requires a significant effort from developers. Automated fault localization techniques seek to reduce this cost by...

Differential Testing of Certificate Validation in SSL/TLS Implementations: An RFC-guided Approach

Certificate validation in Secure Sockets Layer or Transport Layer Security protocol (SSL/TLS) is critical to Internet security. Thus, it is...


ACM TOSEM announces the first Continuous Special Section on AI and SE with section editors Paolo Tonella, Tim Menzies, and Michael R. Lyu.
The special section welcomes papers presenting novel results in the emerging area at the intersection of AI and SE. Special section papers can be submitted at any time up to December 31, 2020. The special section call is available here.

ACM TOSEM announces new 'fast-impact' track

TOSEM has launched the new fast-impact track review process. Papers that qualify as journal-first papers and do not exceed a reasonable length will benefit from a review time of no more than 90 days for the first review and 45 days for the subsequent review.

Forthcoming Articles

Towards Better Evolutionary Program Repair: An Integrated Approach

Visualizing distributed system executions

Distributed systems pose unique challenges for software developers. Reasoning about concurrent activities, and even understanding the system's communication topology, can be difficult. Three key tasks frequently performed during distributed system analysis but poorly supported by current tools are: (1) understanding the relative ordering of events, (2) searching for specific patterns of interaction between hosts, and (3) identifying structural similarities and differences between pairs of executions. This paper presents a new method, consisting of XVector and ShiViz, that supports analysis of distributed systems. XVector instruments systems to capture the happens-before relation between events. ShiViz visualizes the distributed system executions as interactive time-space diagrams to support the three tasks above. We evaluated ShiViz to measure how it aids developers performing the three tasks, including a controlled experiment and two case studies. Participants using ShiViz answered statistically significantly more system comprehension questions correctly than control groups with a very large effect size, and all participants found ShiViz helpful in their analyses of complex distributed system executions.
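The abstract does not say how the happens-before relation is recorded. One common mechanism is a vector clock, sketched below purely as an illustration; the function names (`tick`, `merge`, `happens_before`, `concurrent`) and the host names are hypothetical and are not taken from XVector or ShiViz.

```python
from typing import Dict

# A vector clock maps each host name to the number of events seen from it.
VectorClock = Dict[str, int]


def tick(clock: VectorClock, host: str) -> VectorClock:
    """Return a copy of `clock` advanced by one local event on `host`."""
    updated = dict(clock)
    updated[host] = updated.get(host, 0) + 1
    return updated


def merge(local: VectorClock, received: VectorClock, host: str) -> VectorClock:
    """Combine the local clock with a received one, then count the receive event."""
    hosts = set(local) | set(received)
    merged = {h: max(local.get(h, 0), received.get(h, 0)) for h in hosts}
    return tick(merged, host)


def happens_before(a: VectorClock, b: VectorClock) -> bool:
    """True if the event stamped `a` causally precedes the event stamped `b`."""
    hosts = set(a) | set(b)
    no_entry_greater = all(a.get(h, 0) <= b.get(h, 0) for h in hosts)
    some_entry_smaller = any(a.get(h, 0) < b.get(h, 0) for h in hosts)
    return no_entry_greater and some_entry_smaller


def concurrent(a: VectorClock, b: VectorClock) -> bool:
    """True if neither event precedes the other and the timestamps differ."""
    differ = any(a.get(h, 0) != b.get(h, 0) for h in set(a) | set(b))
    return differ and not happens_before(a, b) and not happens_before(b, a)


# Hypothetical three-host example: alice sends to bob, carol acts independently.
send = tick({}, "alice")
receive = merge({}, send, "bob")
unrelated = tick({}, "carol")
```

Here `happens_before(send, receive)` holds because the receive clock dominates the send clock, while `send` and `unrelated` are concurrent. A time-space diagram draws exactly this partial order.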

Quality Indicators in Search-Based Software Engineering: An Empirical Evaluation

Search-Based Software Engineering (SBSE) researchers who apply multi-objective search algorithms (MOSAs) often assess the quality of solutions produced by MOSAs with one or more quality indicators (QIs). However, SBSE lacks evidence providing insights on commonly used QIs, especially about agreements among them and their relations with SBSE problems and applied MOSAs. To this end, we conducted an extensive empirical evaluation to provide insights on commonly used QIs in the context of SBSE, by studying agreements among QIs with and without considering differences of SBSE problems and MOSAs. In addition, by defining a systematic process based on three common ways of comparing MOSAs in SBSE, we present additional observations that were automatically produced based on the results of our empirical evaluation. These observations can be used by SBSE researchers to gain a better understanding of the commonly used QIs in SBSE and their agreements, and also be useful for QI designers to design new QIs with such a comprehensive view of agreements among the studied QIs.

Many-Objective Test Suite Generation for Software Product Lines

A Software Product Line (SPL) is a set of products that are built from a number of features, the set of valid products being defined by a feature model. Typically, it does not make sense to test all of the products defined by an SPL and so in testing one needs to choose a set of products to test (test selection) and, ideally, derive a good order in which to test them (test prioritisation). This paper introduces a new technique for solving the test selection and prioritisation problems. The approach, the grid-based evolution strategy (GrES), considers a number of fitness functions that assess how good a selection or prioritisation is and aims to optimise on all of these. The problem tackled is thus a many-objective optimisation problem. We introduce a new approach, in which all of the fitness functions are considered but one (pairwise coverage) is seen as the most important. We also derive a novel evolution strategy on the basis of domain knowledge. The results of the evaluation, on randomly generated and realistic feature models, were promising, with GrES outperforming previously proposed techniques and a range of many-objective optimisation algorithms.
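To make the pairwise coverage objective concrete, the sketch below computes it for a selection of products. It assumes one common definition (the fraction of feature-value pairs appearing in some valid product that also appear in a selected product); the paper may define the fitness function differently, and the feature names and helper functions here are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Set, Tuple

Product = FrozenSet[str]  # a product is the set of features it selects
Pair = Tuple[Tuple[str, bool], Tuple[str, bool]]


def covered_pairs(product: Product, features: Iterable[str]) -> Set[Pair]:
    """All pairs of (feature, selected?) assignments exhibited by one product."""
    assignment = [(f, f in product) for f in sorted(features)]
    return set(combinations(assignment, 2))


def pairwise_coverage(selection, valid_products, features) -> float:
    """Fraction of achievable feature-value pairs covered by `selection`."""
    achievable: Set[Pair] = set()
    for p in valid_products:
        achievable |= covered_pairs(p, features)
    if not achievable:
        return 1.0  # nothing to cover
    covered: Set[Pair] = set()
    for p in selection:
        covered |= covered_pairs(p, features)
    return len(covered & achievable) / len(achievable)


# Hypothetical feature model: A is mandatory, B and C are optional.
features = {"A", "B", "C"}
valid_products = [
    frozenset({"A"}),
    frozenset({"A", "B"}),
    frozenset({"A", "C"}),
    frozenset({"A", "B", "C"}),
]
selection = [frozenset({"A"}), frozenset({"A", "B", "C"})]
print(pairwise_coverage(selection, valid_products, features))  # prints 0.75
```

Testing only the two selected products covers 6 of the 8 achievable pairs. A many-objective search would trade this value off against the other fitness functions while treating it as the primary one.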

Is Static Analysis Able to Identify Unnecessary Source Code?

Grown software systems often contain code that is not necessary anymore. Unnecessary code wastes resources during development and maintenance, for example, when preparing code for migration or certification. Running a profiler may reveal code that is not used in production, but it is often time-consuming to obtain representative data this way. We investigate to what extent a static analysis approach, which is based on code stability and code centrality, is able to identify unnecessary code and whether its recommendations are relevant in practice. To study the feasibility and usefulness of our static approach, we conducted a study involving 14 open-source and closed-source software systems. As there is no perfect oracle for unnecessary code, we compared recommendations of our approach with historical cleanup actions, runtime usage data, and feedback from 25 developers of 5 software projects. Our study shows that recommendations generated from stability and centrality information point to unnecessary code. Developers confirmed that 34% of recommendations were indeed unnecessary and deleted 20% of the recommendations shortly after our interviews. Overall, our results suggest that static analysis can provide quick feedback on unnecessary code and is useful in practice.

Assessing and Improving Malware Detection Sustainability through App Evolution Studies

Learning-based classification dominates malware detectors for Android. However, due to the evolution of the Android ecosystem, existing techniques of this kind are limited by their reliance on new malware samples, which may not be available in time, and on constant retraining, which is often costly. A practical detector needs not only to be accurate on particular datasets but, more critically, to be able to sustain its capabilities over time without frequent retraining. We propose and study the sustainability problem for learning-based app classifiers. We define sustainability metrics and compare them among five state-of-the-art malware detectors. We further developed DroidSpan, a novel classification system based on a new behavioral profile that captures sensitive access distribution. We evaluated the sustainability of DroidSpan versus the five detectors on longitudinal datasets across eight years, which include 13,627 benign apps and 12,755 malware samples. We showed that DroidSpan significantly outperformed these baselines in sustainability at reasonable costs, by 6-32% for same-period detection and 21-37% for over-time detection. The main takeaway, which also explains the superiority of DroidSpan, is that the use of features consistently differentiating malware from benign apps over time is essential for sustainable learning-based malware detection, and that these features can be learned from app evolution studies.

On the Monitoring of Decentralized Specifications: Semantics, Properties, Analysis, and Simulation

We define two complementary approaches to monitor decentralized systems. The first applies to systems with a centralized specification, i.e., one written for the behavior of the entire system. To do so, our approach introduces a data structure that i) keeps track of the execution of an automaton, ii) has predictable parameters and size, and iii) guarantees strong eventual consistency. The second approach defines decentralized specifications wherein multiple specifications are provided for separate parts of the system. We study two properties of decentralized specifications pertaining to monitorability and compatibility between specification and architecture. We also present a general algorithm for monitoring decentralized specifications. We map three existing algorithms to our approaches and provide a framework for analyzing their behavior. Furthermore, we introduce THEMIS, a framework for designing such decentralized algorithms and simulating their behavior. We show the usage of THEMIS to compare multiple algorithms and verify the trends predicted by the analysis by studying two scenarios: a synthetic benchmark and a real example.

Desen: Specification of Sociotechnical Systems via Patterns of Regulation and Control

We address the problem of engineering a sociotechnical system (STS) with respect to its stakeholders' requirements. We motivate a two-tier conception of an STS comprising (i) a technical tier that provides control mechanisms and describes what actions are allowed by the software components; and (ii) a social tier that characterizes the stakeholders' expectations of each other in terms of norms. Specifically, we adopt agents as computational entities, each representing a different stakeholder. Unlike previous approaches, our framework, Desen, incorporates the social dimension into the formal verification process. Thus, Desen supports agents potentially violating applicable norms, a consequence of their autonomy. In addition to formal requirements verification via model checking, Desen supports refinement of system specifications via design patterns to meet stated (and changing) requirements. We demonstrate how Desen carries out refinement on a scenario involving information sharing in a hospital during an emergency. We show via a human-subject study that a design process based on our patterns is helpful for participants who are inexperienced in conceptual modeling and norms.

Automatically Generating SystemC Code from HCSP Formal Models

In the model-driven design of embedded systems, generating code from high-level control models seamlessly and correctly is challenging, as control models are normally modeled as hybrid systems, which involve continuous evolution, discrete jumps, and the complicated entanglement between them, while code only contains discrete actions. In this paper, we investigate the code generation from a formal control model, given by Hybrid CSP (HCSP), to SystemC. We first introduce the notion of approximate bisimulation, which is used as a criterion to check the consistency between two different systems, especially between the original control model and the final generated code. We prove that it is decidable whether two HCSP processes are approximately bisimilar in bounded time and unbounded time, respectively. For both cases, we define the discretization of HCSP processes and prove that the original HCSP model and its discretization are approximately bisimilar. Furthermore, based on the discretization, we define a transfer function to map a discretized HCSP model into SystemC code such that they are bisimilar. Finally, we implement a tool that automatically translates HCSP processes to SystemC code, and illustrate our approach with case studies.

How C++ templates are used for generic programming - an empirical study on 50 open source systems

Generic programming is a key paradigm for developing reusable software components. The inherent support for generic constructs is therefore important in programming languages. In C++, the generic construct, templates, has been supported since the language was released. However, little is currently known about how C++ templates are actually used in developing real software. In this study, we conduct an experiment to investigate the use of templates in practice. First, we conduct a survey to understand developers' perception of templates. Then, we analyze 1267 historical revisions of 50 open-source software systems, consisting of 566 million lines of C++ code, to collect data on the practical use of templates. Finally, we perform statistical analysis on the collected data. We uncover the following important findings: (1) the new template features are not used more often than their old substitutes; (2) user-defined templates do not play a significant role in reducing code replication in the client code; and (3) freestanding function templates in most software systems do not, in practice, play a role in reducing generic function-like macros. These findings should be helpful for practitioners to understand and use template


ACM Transactions on Software Engineering and Methodology (TOSEM) is part of the family of journals produced by the ACM, the Association for Computing Machinery.

TOSEM publishes one volume yearly. Each volume comprises four issues, which appear in January, April, July and October.
