SSL/TLS is critical to Internet security, so it is important to check whether certificate validation in SSL/TLS implementations is correctly implemented. With this motivation, we propose a novel differential testing approach based on the standard Request for Comments (RFC) documents. First, certificate rules are extracted automatically from RFCs. Second, low-level test cases are generated through dynamic symbolic execution. Third, high-level test cases, i.e., certificates, are assembled automatically. Finally, with the assembled certificates as test cases, certificate validation in SSL/TLS implementations is tested to reveal latent vulnerabilities or bugs. Our approach, named RFCcert, has the following advantages: (1) the certificates of RFCcert are discrepancy-targeted, since they are assembled according to standards rather than by genetic mutation; (2) with the obtained certificates, RFCcert not only reveals the invalidity of traditional differential testing but is also able to conduct testing that traditional differential testing cannot; and (3) the supporting tool of RFCcert has been implemented, and extensive experiments show that the approach is effective in finding bugs in SSL/TLS implementations. In addition, by providing seed certificates to mutation approaches with RFCcert, the ability of mutation approaches to find distinct discrepancies is significantly enhanced.
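The core differential-testing loop can be sketched as follows. This is a minimal illustration, not RFCcert's actual pipeline: the validator functions and the single RFC 5280 rule below are hypothetical stand-ins for real SSL/TLS libraries and automatically extracted rule sets.

```python
# Sketch of rule-driven differential testing of certificate validation.
# validator_a and validator_b are hypothetical implementations; a real
# harness would invoke actual SSL/TLS libraries on assembled certificates.

def rule_version_must_be_3(cert):
    # Example rule (RFC 5280): a certificate carrying extensions must be v3.
    return cert.get("version") == 3 or not cert.get("extensions")

def validator_a(cert):
    # "Strict" implementation: enforces the version rule.
    return rule_version_must_be_3(cert)

def validator_b(cert):
    # "Lenient" implementation: ignores the version field entirely.
    return True

def differential_test(certs, validators):
    """Return certificates on which the implementations disagree."""
    discrepancies = []
    for cert in certs:
        verdicts = {name: v(cert) for name, v in validators.items()}
        if len(set(verdicts.values())) > 1:
            discrepancies.append((cert, verdicts))
    return discrepancies

certs = [
    {"version": 3, "extensions": ["keyUsage"]},  # rule-conforming
    {"version": 1, "extensions": ["keyUsage"]},  # violates the version rule
]
found = differential_test(certs, {"A": validator_a, "B": validator_b})
# Only the rule-violating certificate triggers a discrepancy.
```

Because the rule-violating certificate is constructed deliberately from a standard's requirement, every reported discrepancy points at a concrete deviation from the RFC rather than at random mutation noise.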
Model Driven Development (MDD) requires proper tools to derive implementation code from application models. However, the integration of manually written and automatically generated code is a long-standing issue that hinders the adoption of MDD in industry. This paper presents a model and code co-evolution approach that addresses this problem a posteriori, using the standard collision-detection capabilities of Version Control Systems to support the semi-automatic merge of the two types of code. We assess the proposed approach by contrasting it with the more traditional template-based forward engineering process adopted by most MDD tools.
We define two complementary approaches to monitoring decentralized systems. The first relies on a centralized specification, i.e., a specification written for the behavior of the entire system. To do so, our approach introduces a data structure that i) keeps track of the execution of an automaton, ii) has predictable parameters and size, and iii) guarantees strong eventual consistency. The second approach defines decentralized specifications, wherein multiple specifications are provided for separate parts of the system. We study two properties of decentralized specifications pertaining to monitorability and to compatibility between specification and architecture. We also present a general algorithm for monitoring decentralized specifications. We map three existing algorithms to our approaches and provide a framework for analyzing their behavior. Furthermore, we introduce THEMIS, a framework for designing such decentralized algorithms and simulating their behavior. We show how THEMIS can be used to compare multiple algorithms and verify the trends predicted by the analysis, studying two scenarios: a synthetic benchmark and a real example.
Millions of open-source projects with numerous bug fixes are available in code repositories. This proliferation of software development histories can be leveraged to learn how to fix common programming bugs. To explore this potential, we perform an empirical study to assess the feasibility of using Neural Machine Translation techniques for learning bug-fixing patches for real defects. First, we mine millions of bug fixes from the change histories of projects hosted on GitHub in order to extract meaningful examples of such fixes. Next, we abstract the buggy and corresponding fixed code and use them to train an Encoder-Decoder model able to translate buggy code into its fixed version. In our empirical investigation, we found that such a model is able to fix thousands of unique buggy methods in the wild. Overall, this model is capable of predicting fixed patches generated by developers in 9-50% of the cases, depending on the number of candidate patches we allow it to generate. Also, the model is able to emulate a variety of different Abstract Syntax Tree operations and generate candidate patches in a split second.
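The abstraction step can be illustrated with a minimal sketch. This is a hypothetical, simplified variant: it renames every non-keyword identifier to an indexed placeholder, whereas a full abstraction would typically distinguish token kinds and preserve frequent idioms.

```python
import re

# Simplified sketch of code abstraction for NMT-based bug fixing:
# concrete identifiers are replaced with indexed placeholders so the
# model sees a small, shared vocabulary across many projects.

KEYWORDS = {"int", "return", "if", "else", "for", "while"}

def abstract_code(source):
    mapping = {}  # original identifier -> placeholder
    def repl(match):
        tok = match.group(0)
        if tok in KEYWORDS:
            return tok  # language keywords are kept verbatim
        if tok not in mapping:
            mapping[tok] = f"VAR_{len(mapping) + 1}"
        return mapping[tok]
    abstracted = re.sub(r"[A-Za-z_]\w*", repl, source)
    return abstracted, mapping

abstracted, mapping = abstract_code("int total = count + offset;")
# abstracted == "int VAR_1 = VAR_2 + VAR_3;"
```

The mapping is kept so that, after the model emits a fix over placeholders, the concrete names can be restored in the generated patch.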
Model transformations are key in model-driven engineering, where they automate the manipulation of models. However, they are typed with respect to concrete source and target metamodels, making their reuse for other (even similar) metamodels challenging. For this purpose, we propose capturing the typing requirements for reusing a transformation with other metamodels by the notion of typing requirements model (TRM). A TRM describes the prerequisites that a transformation imposes on source and target metamodels to obtain a correct typing. This way, any metamodel pair that satisfies the TRM is a valid reuse context for the transformation. A TRM is made of two domain requirement models (DRMs) describing the requirements for source and target metamodels, and a compatibility model expressing dependencies between them. We define a notion of refinement between DRMs, seeing metamodels as a special case of DRM. We provide a catalogue of refinements and describe how to automatically extract a TRM from an ATL transformation. The approach is supported by our tool TOTEM. We report on two experiments -- based on transformations developed by third parties and metamodel mutation -- validating the correctness and completeness of TRM extraction and confirming the power of TRMs to encode variability and support flexible reuse.
Finding the root cause of bugs requires a significant effort from developers. Automated fault localization techniques seek to reduce this cost by computing suspiciousness scores (i.e., the likelihood of program entities being faulty). Existing techniques have been developed by utilizing input features of specific types for the computation of suspiciousness scores, such as program spectrum or mutation analysis results. This paper presents a novel learn-to-rank fault localization technique called PRINCE (PRecise machINe learning-based fault loCalization tEchnique). PRINCE uses genetic programming (GP) to combine multiple sets of localization input features that have been studied separately until now. For dynamic features, PRINCE encompasses both Spectrum Based Fault Localization (SBFL) and Mutation Based Fault Localization (MBFL) techniques. It also uses static features, such as dependency information and structural complexity of program entities. All such information is used by GP to train a ranking model for fault localization. The empirical evaluation on CoREBench, SIR, and Defects4J shows that PRINCE outperforms the state-of-the-art SBFL, MBFL, and learn-to-rank techniques significantly. PRINCE localizes a fault after reviewing 2.4% of the executed statements on average (4.2 and 3.0 times more precise than the best of the compared SBFL and MBFL techniques, respectively). Also, PRINCE ranks 52.9% of the target faults within the top ten suspicious statements.
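As a minimal illustration of one classic spectrum feature of the kind PRINCE combines (not PRINCE's full GP-trained model), the widely used Ochiai score ranks statements by how strongly their coverage correlates with failing tests. The coverage data below is hypothetical.

```python
import math

def ochiai(ef, ep, total_failed):
    """Ochiai suspiciousness: ef / sqrt(total_failed * (ef + ep)),
    where ef/ep count the failing/passing tests covering the statement."""
    if total_failed == 0 or ef + ep == 0:
        return 0.0
    return ef / math.sqrt(total_failed * (ef + ep))

# statement -> (failing tests covering it, passing tests covering it)
coverage = {"s1": (2, 0), "s2": (1, 3), "s3": (0, 4)}
total_failed = 2

ranking = sorted(coverage,
                 key=lambda s: ochiai(*coverage[s], total_failed),
                 reverse=True)
# s1 ranks first: it is covered by every failing test and no passing test.
```

A learn-to-rank technique like PRINCE treats scores such as this one, alongside MBFL results and static features, as input features and lets the trained model decide how to weight them.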
The rapid evolution of mobile applications (apps) has created strong demand for developers to identify new features by inspecting the descriptions of similar apps and spotting features missing from their own apps. Unfortunately, due to the large number of apps, this manual process is time-consuming and does not scale. To help developers identify new features, we propose a new approach named Similar App based FEature Recommender (SAFER). SAFER must address two research challenges: the identification of features from app descriptions (feature identification) and the identification of similar apps (similar app identification). In this study, to address the feature identification challenge, we first develop a new tool that automatically extracts features from app descriptions. Then, we apply a topic model to the extracted features and the apps' API invocations to identify similar apps, thus addressing the similar app identification challenge. Finally, the features of the identified similar apps are aggregated and sorted for recommendation. Experiments validate that SAFER can accurately recommend features for new apps from a collection of more than 8,000 apps; evaluated over a collection of 533 annotated features from 100 apps, SAFER achieves a [email protected] score of up to 78.68% and outperforms a baseline approach by 17.23% on average.
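The similar-app identification step can be sketched as follows: once a topic model has produced a topic distribution per app, apps are compared by the similarity of those distributions. The app names and topic vectors below are hypothetical, and cosine similarity stands in for whatever distance the full approach uses.

```python
import math

def cosine(u, v):
    """Cosine similarity between two topic-distribution vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical per-app topic distributions (e.g., from a topic model over
# extracted description features and API invocations).
topics = {
    "PhotoEditorX": [0.7, 0.2, 0.1],
    "PicRetouch":   [0.6, 0.3, 0.1],
    "RunTracker":   [0.1, 0.1, 0.8],
}

def most_similar(app, topics):
    others = [(cosine(topics[app], vec), name)
              for name, vec in topics.items() if name != app]
    return max(others)[1]

# The photo-editing apps end up closest to each other, not to the fitness app.
```

Features from the top-ranked similar apps would then be aggregated and sorted to form the recommendation list.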
Generic programming is a key paradigm for developing reusable software components, so inherent support for generic constructs is important in programming languages. In C++, the generic construct -- templates -- has been supported since the language was released. However, little is currently known about how C++ templates are actually used in developing real software. In this study, we conduct an experiment to investigate the use of templates in practice. First, we conduct a survey to understand developers' perception of templates. Then, we analyze 1267 historical revisions of 50 open-source software systems, comprising 566 million lines of C++ code, to collect data on the practical use of templates. Finally, we perform statistical analysis on the collected data and obtain many interesting results. We uncover the following important findings: (1) the new template features are not used more often than their old substitutes; (2) user-defined templates do not play a significant role in reducing code replication in client code; and (3) freestanding function templates in most software systems do not in practice play a role in reducing generic function-like macros. These findings should help practitioners understand and use templates.