In this paper, the term formula code refers to fragments of source code that implement a mathematical formula. We present empirical studies that analyze the diversity and frequency of formula code in open-source-software projects. In an exploratory study, we investigated what kinds of formulas are implemented in real-world Java projects and derived syntactical patterns and constraints. We refined these patterns for sum and product formulas to automatically detect formula code in software archives and to reconstruct the implemented formula in mathematical notation. In a quantitative study of a large sample of engineered Java projects on GitHub we analyzed the frequency of formula code and estimated that one of 700 lines of code in this sample implements a sum or product formula. For a sample of scientific-computing projects, we found that one of 100 lines of code implements a sum or product formula. To assess the need for tool support, we investigated the helpfulness of comments for program understanding in a sample of formula-code fragments and performed an online survey. Our findings provide first insights into the characteristics of formula code, that can motivate further studies on the role of formula code in software projects and the design of formula-related tools.

0
下载
关闭预览

相关内容

Java 是一门编程语言,拥有跨平台、面向对象、泛型编程等特性。

Hamiltonian cycles in graphs were first studied in the 1850s. Since then, an impressive amount of research has been dedicated to identifying classes of graphs that allow Hamiltonian cycles, and to related questions. The corresponding decision problem, that asks whether a given graph is Hamiltonian (i. e. admits a Hamiltonian cycle), is one of Karp's famous NP-complete problems. It remains NP-complete on planar cubic graphs. In this paper we study graphs of bounded degree that are far from being Hamiltonian, where a graph G on n vertices is far from being Hamiltonian, if modifying a constant fraction of n edges is necessary to make G Hamiltonian. We exhibit classes of graphs of bounded degree that are locally Hamiltonian, i.e. every subgraph induced by the neighbourhood of a small vertex set appears in some Hamiltonian graph, but that are far from being Hamiltonian. We then use these classes to obtain a lower bound in property testing. We show that in the bounded-degree graph model, Hamiltonicity is not testable with one-sided error probability and query complexity o(n). This contrasts the known fact that on planar (or minor-free) graph classes, Hamiltonicity is testable with constant query complexity in the bounded-degree graph model with two-sided error. Our proof is an intricate construction that shows how to turn a d-regular graph into a graph that is far from being Hamiltonian, and we use d-regular expander graphs to maintain local Hamiltonicity.

0
0
下载
预览

We studied the hyperlogic HyperQPTL, which combines the concepts of trace relations and $\omega$-regularity. We showed that HyperQPTL is very expressive, it can express properties like promptness, bounded waiting for a grant, epistemic properties, and, in particular, any $\omega$-regular property. Those properties are not expressible in previously studied hyperlogics like HyperLTL. At the same time, we argued that the expressiveness of HyperQPTL is optimal in a sense that a more expressive logic for $\omega$-regular hyperproperties would have an undecidable model checking problem. We furthermore studied the realizability problem of HyperQPTL. We showed that realizability is decidable for HyperQPTL fragments that contain properties like promptness. But still, in contrast to the satisfiability problem, propositional quantification does make the realizability problem of hyperlogics harder. More specifically, the HyperQPTL fragment of formulas with a universal-existential propositional quantifier alternation followed by a single trace quantifier is undecidable in general, even though the projection of the fragment to HyperLTL has a decidable realizability problem. Lastly, we implemented the bounded synthesis problem for HyperQPTL in the prototype tool BoSy. Using BoSy with HyperQPTL specifications, we have been able to synthesize several resource arbiters. The synthesis problem of non-linear-time hyperlogics is still open. For example, it is not yet known how to synthesize systems from specifications given in branching-time hyperlogics like HyperCTL$^*$.

0
0
下载
预览

Quantifying the amount of polarization is crucial for understanding and studying political polarization in political and social systems. Several methods are used commonly to measure polarization in social networks by purely inspecting their structure. We analyse eight of such methods and show that all of them yield high polarization scores even for random networks with similar density and degree distributions to typical real-world networks. Further, some of the methods are sensitive to degree distributions and relative sizes of the polarized groups. We propose normalization to the existing scores and a minimal set of tests that a score should pass in order for it to be suitable for separating polarized networks from random noise. The performance of the scores increased by 38%-220% after normalization in a classification task of 203 networks. Further, we find that the choice of method is not as important as normalization, after which most of the methods have better performance than the best-performing method before normalization. This work opens up the possibility to critically assess and compare the features and performance of structural polarization methods.

0
0
下载
预览

We propose a major revision of the format XCSP 2.1, called XCSP3, to build integrated representations of combinatorial constrained problems. This new format is able to deal with mono/multi optimization, many types of variables, cost functions, reification, views, annotations, variable quantification, distributed, probabilistic and qualitative reasoning. The new format is made compact, highly readable, and rather easy to parse. Interestingly, it captures the structure of the problem models, through the possibilities of declaring arrays of variables, and identifying syntactic and semantic groups of constraints. The number of constraints is kept under control by introducing a limited set of basic constraint forms, and producing almost automatically some of their variations through lifting, restriction, sliding, logical combination and relaxation mechanisms. As a result, XCSP3 encompasses practically all constraints that can be found in major constraint solvers developed by the CP community. A website, which is developed conjointly with the format, contains many models and series of instances. The user can make sophisticated queries for selecting instances from very precise criteria. The objective of XCSP3 is to ease the effort required to test and compare different algorithms by providing a common test-bed of combinatorial constrained instances.

0
0
下载
预览

In this letter, we consider the proof of non-stationary channel polarization theory. First we construct a multi-channel stochastic process for the non-stationary channel polarization operation. Then based on this stochastic process, we extend Ar{\i}kan's standard martingale proof method on the average channel capacity and average channel Bhattacharyya parameter, by which we have proved the non-stationary channel polarization theory.

0
0
下载
预览

It is argued that 4-valued paraconsistent truth values (called here "p-bits") can serve as a conceptual, mathematical and practical foundation for highly AI-relevant forms of probabilistic logic and probabilistic programming and concept formation. First it is shown that appropriate averaging-across-situations and renormalization of 4-valued p-bits operating in accordance with Constructible Duality (CD) logic yields PLN (Probabilistic Logic Networks) strength-and-confidence truth values. Then variations on the Curry-Howard correspondence are used to map these paraconsistent and probabilistic logics into probabilistic types suitable for use within dependent type based programming languages. Zach Weber's paraconsistent analysis of the sorites paradox is extended to form a paraconsistent / probabilistic / fuzzy analysis of concept boundaries; and a paraconsistent version of concept formation via Formal Concept Analysis is presented, building on a definition of fuzzy property-value degrees in terms of relative entropy on paraconsistent probability distributions. These general points are fleshed out via reference to the realization of probabilistic reasoning and programming and concept formation in the OpenCog AGI framework which is centered on collaborative multi-algorithm updating of a common knowledge metagraph.

0
0
下载
预览

To obtain the highest confidence on the correction of numerical simulation programs implementing the finite element method, one has to formalize the mathematical notions and results that allow to establish the soundness of the method. Sobolev spaces are the correct framework in which most partial derivative equations may be stated and solved. These functional spaces are built on integration and measure theory. Hence, this chapter in functional analysis is a mandatory theoretical cornerstone for the definition of the finite element method. The purpose of this document is to provide the formal proof community with very detailed pen-and-paper proofs of the main results from integration and measure theory.

0
0
下载
预览

Identifier names convey useful information about the intended semantics of code. Name-based program analyses use this information, e.g., to detect bugs, to predict types, and to improve the readability of code. At the core of name-based analyses are semantic representations of identifiers, e.g., in the form of learned embeddings. The high-level goal of such a representation is to encode whether two identifiers, e.g., len and size, are semantically similar. Unfortunately, it is currently unclear to what extent semantic representations match the semantic relatedness and similarity perceived by developers. This paper presents IdBench, the first benchmark for evaluating semantic representations against a ground truth created from thousands of ratings by 500 software developers. We use IdBench to study state-of-the-art embedding techniques proposed for natural language, an embedding technique specifically designed for source code, and lexical string distance functions. Our results show that the effectiveness of semantic representations varies significantly and that the best available embeddings successfully represent semantic relatedness. On the downside, no existing technique provides a satisfactory representation of semantic similarities, among other reasons because identifiers with opposing meanings are incorrectly considered to be similar, which may lead to fatal mistakes, e.g., in a refactoring tool. Studying the strengths and weaknesses of the different techniques shows that they complement each other. As a first step toward exploiting this complementarity, we present an ensemble model that combines existing techniques and that clearly outperforms the best available semantic representation.

0
0
下载
预览

This paper introduces a problem in which the state of a system needs to be determined through costly tests of its components by a limited number of testing units and before a given deadline. We also consider a closely related search problem in which there are multiple searchers to find a target before a given deadline. These natural generalizations of the classical sequential testing problem and search problem are applicable in a wide range of time-critical operations such as machine maintenance, diagnosing a patient, and new product development. We show that both problems are NP-hard, develop a pseudo-polynomial dynamic program for the special case of two time slots, and describe a partial-order-based as well as an assignment-based mixed integer program for the general case. Based on extensive computational experiments, we find that the assignment-based formulation performs better than the partial-order-based formulation for the testing variant, but that this is the other way round for the search variant. Finally, we propose a pairwise-interchange-based local search procedure and show that, empirically, it performs very well in finding near-optimal solutions.

0
0
下载
预览

For extracting meaningful topics from texts, their structures should be considered properly. In this paper, we aim to analyze structured time-series documents such as a collection of news articles and a series of scientific papers, wherein topics evolve along time depending on multiple topics in the past and are also related to each other at each time. To this end, we propose a dynamic and static topic model, which simultaneously considers the dynamic structures of the temporal topic evolution and the static structures of the topic hierarchy at each time. We show the results of experiments on collections of scientific papers, in which the proposed method outperformed conventional models. Moreover, we show an example of extracted topic structures, which we found helpful for analyzing research activities.

0
7
下载
预览
小贴士
相关论文
Isolde Adler,Noleen Köhler
0+阅读 · 1月18日
Bernd Finkbeiner,Christopher Hahn,Jana Hofmann,Leander Tentrup
0+阅读 · 1月18日
Ali Salloum,Ted Hsuan Yun Chen,Mikko Kivelä
0+阅读 · 1月18日
Frederic Boussemart,Christophe Lecoutre,Gilles Audemard,Cédric Piette
0+阅读 · 1月16日
Yizhi Zhao,Hongmei Chi,Shiwei Xu,Lingjuan Wu,Yuling Fan
0+阅读 · 1月15日
Lebesgue integration. Detailed proofs to be formalized in Coq
François Clément,Vincent Martin
0+阅读 · 1月14日
Yaza Wainakh,Moiz Rauf,Michael Pradel
0+阅读 · 1月14日
Alessandro Agnetis,Ben Hermans,Roel Leus,Salim Rostami
0+阅读 · 1月14日
Rem Hida,Naoya Takeishi,Takehisa Yairi,Koichi Hori
7+阅读 · 2018年5月6日
相关VIP内容
专知会员服务
31+阅读 · 2020年7月26日
Stabilizing Transformers for Reinforcement Learning
专知会员服务
21+阅读 · 2019年10月17日
Keras François Chollet 《Deep Learning with Python 》, 386页pdf
专知会员服务
43+阅读 · 2019年10月12日
强化学习最新教程,17页pdf
专知会员服务
45+阅读 · 2019年10月11日
【新书】Python编程基础,669页pdf
专知会员服务
81+阅读 · 2019年10月10日
MIT新书《强化学习与最优控制》
专知会员服务
108+阅读 · 2019年10月9日
最新BERT相关论文清单,BERT-related Papers
专知会员服务
28+阅读 · 2019年9月29日
相关资讯
强化学习的Unsupervised Meta-Learning
CreateAMind
6+阅读 · 2019年1月7日
Unsupervised Learning via Meta-Learning
CreateAMind
26+阅读 · 2019年1月3日
RL 真经
CreateAMind
4+阅读 · 2018年12月28日
A Technical Overview of AI & ML in 2018 & Trends for 2019
待字闺中
10+阅读 · 2018年12月24日
计算机类 | LICS 2019等国际会议信息7条
Call4Papers
3+阅读 · 2018年12月17日
Disentangled的假设的探讨
CreateAMind
7+阅读 · 2018年12月10日
disentangled-representation-papers
CreateAMind
20+阅读 · 2018年9月12日
Hierarchical Disentangled Representations
CreateAMind
3+阅读 · 2018年4月15日
【推荐】免费书(草稿):数据科学的数学基础
机器学习研究会
12+阅读 · 2017年10月1日
强化学习 cartpole_a3c
CreateAMind
8+阅读 · 2017年7月21日
Top