Jul 17 2017 cs.DL
Computer science offers a large set of tools for prototyping, writing, running, testing, validating, sharing and reproducing results; computational science, however, lags behind. In the best case, authors may provide their source code as a compressed archive and feel confident that their research is reproducible. But this is not exactly true. More than two decades ago, James Buckheit and David Donoho proposed that an article about computational results is advertising, not scholarship: the actual scholarship is the full software environment, code, and data that produced the result. This implies new workflows, in particular in peer review. Existing journals have been slow to adapt: source code is rarely requested and hardly ever actually executed to check that it produces the results advertised in the article. ReScience is a peer-reviewed journal that targets computational research and encourages the explicit replication of already published research, promoting new and open-source implementations in order to ensure that the original research can be replicated from its description. To achieve this goal, the whole publishing chain is radically different from that of traditional scientific journals. ReScience resides on GitHub, where each new implementation of a computational study is made available together with comments, explanations, and software tests.
Dec 07 2016 cs.ET
Optical interconnects have attracted significant research interest for use in short-reach board-level optical communication links in supercomputers and data centres. Multimode polymer waveguides in particular constitute an attractive technology for on-board optical interconnects as they provide high bandwidth, offer relaxed alignment tolerances, and can be cost-effectively integrated onto standard printed circuit boards (PCBs). However, the continuing improvements in bandwidth performance of optical sources make it important to investigate approaches to develop high bandwidth polymer waveguides. In this paper, we present dispersion studies on a graded-index (GI) waveguide in siloxane materials designed to deliver high bandwidth over a range of launch conditions. Bandwidth-length products of >70 GHz·m and ~65 GHz·m are observed using a 50/125 µm multimode fibre (MMF) launch for input offsets of ±10 µm without and with the use of a mode mixer respectively; and enhanced values of >100 GHz·m are found under a 10× microscope objective launch over an input offset area of ~18 × 20 µm². The large range of offsets is within the -1 dB alignment tolerances. A theoretical model is developed using the measured refractive index profile of the waveguide, and general agreement is found with experimental bandwidth measurements. The reported results clearly demonstrate the potential of this technology for use in high-speed board-level optical links, and indicate that data transmission of 100 Gb/s over a multimode polymer waveguide is feasible with appropriate refractive index engineering.
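The feasibility argument above rests on simple arithmetic with the bandwidth-length product: the usable -3 dB channel bandwidth scales inversely with link length. The sketch below illustrates this; the 1 m link length and the NRZ rate-of-thumb factor are illustrative assumptions, not values taken from the paper.

```python
# Hedged sketch: estimating usable channel bandwidth from a bandwidth-length
# product. The link length and the NRZ rule of thumb are assumptions chosen
# for illustration only.

def channel_bandwidth_ghz(bl_product_ghz_m: float, length_m: float) -> float:
    """-3 dB bandwidth (GHz) implied by a bandwidth-length product at a given length."""
    return bl_product_ghz_m / length_m

def max_nrz_rate_gbps(bandwidth_ghz: float, factor: float = 1.5) -> float:
    """Rough rule of thumb: achievable NRZ rate ~ 1.5x the -3 dB bandwidth."""
    return factor * bandwidth_ghz

bw = channel_bandwidth_ghz(100.0, 1.0)  # 100 GHz·m product over a 1 m on-board link
print(bw)                               # 100.0 (GHz)
print(max_nrz_rate_gbps(bw))            # 150.0 (Gb/s), above the 100 Gb/s target
```

Under these assumptions, a >100 GHz·m product leaves comfortable margin for 100 Gb/s over typical board-level distances of a metre or less.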
Nov 29 2016 cs.LG
Functional MRI (fMRI) and diffusion MRI (dMRI) are non-invasive imaging modalities that allow in-vivo analysis of a patient's brain network (known as a connectome). Use of these technologies has enabled faster and better diagnoses and treatments of neurological disorders and a deeper understanding of the human brain. Recently, researchers have been exploring the application of machine learning models to connectome data in order to predict clinical outcomes and analyze the importance of subnetworks in the brain. Connectome data has unique properties, which present both special challenges and opportunities when used for machine learning. The purpose of this work is to review the literature on the topic of applying machine learning models to MRI-based connectome data. This field is growing rapidly and now encompasses a large body of research. To summarize the research done to date, we provide a comparative, structured summary of 77 relevant works, tabulated according to different criteria, that represent the majority of the literature on this topic. (We also published a living version of this table online at http://connectomelearning.cs.sfu.ca that the community can continue to contribute to.) After giving an overview of how connectomes are constructed from dMRI and fMRI data, we discuss the variety of machine learning tasks that have been explored with connectome data. We then compare the advantages and drawbacks of different machine learning approaches that have been employed, discussing different feature selection and feature extraction schemes, as well as the learning models and regularization penalties themselves. Throughout this discussion, we focus particularly on how the methods are adapted to the unique nature of graphical connectome data. Finally, we conclude by summarizing the current state of the art and by outlining what we believe are strategic directions for future research.
We propose a new internal guidance method for automated theorem provers based on the given-clause algorithm. Our method influences the choice of unprocessed clauses using positive and negative examples from previous proofs. To this end, we present an efficient scheme for Naive Bayesian classification by generalising label occurrences to types with monoid structure. This makes it possible to extend existing fast classifiers, which consider only positive examples, with negative ones. We implement the method in the higher-order logic prover Satallax, where we modify the delay with which propositions are processed. We evaluate our method on a simply-typed higher-order logic version of the Flyspeck project, where it solves 26% more problems than Satallax without internal guidance.
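The core idea can be sketched as follows: per-feature statistics form a monoid, here pairs (pos, neg) of counts under componentwise addition with identity (0, 0), so negative examples can be folded in alongside positive ones. All names and the exact scoring formula below are illustrative assumptions; the Satallax implementation uses its own features and weighting.

```python
# Hedged sketch of Naive-Bayes-style clause guidance with monoid-structured
# label statistics. The feature sets and log-odds score are assumptions for
# illustration, not the paper's exact scheme.
from math import log
from collections import defaultdict

IDENTITY = (0, 0)  # monoid identity: no observations yet

def combine(a, b):
    """Monoid operation: componentwise addition of (pos, neg) counts."""
    return (a[0] + b[0], a[1] + b[1])

class NaiveBayesGuide:
    def __init__(self):
        self.stats = defaultdict(lambda: IDENTITY)  # feature -> (pos, neg)

    def learn(self, features, useful):
        """Record one clause from a previous proof as a positive or negative example."""
        inc = (1, 0) if useful else (0, 1)
        for f in features:
            self.stats[f] = combine(self.stats[f], inc)

    def score(self, features):
        """Higher score -> process the clause earlier (smaller delay)."""
        s = 0.0
        for f in features:
            pos, neg = self.stats[f]
            # Laplace-smoothed log-odds that this feature marks a useful clause.
            s += log((pos + 1) / (neg + 1))
        return s

guide = NaiveBayesGuide()
guide.learn({"f", "g"}, useful=True)   # clause mentioning symbols f, g helped a proof
guide.learn({"g", "h"}, useful=False)  # clause mentioning g, h did not
print(guide.score({"f"}) > guide.score({"h"}))  # True: f looks more promising
```

Because the statistics only need the monoid operations, the same classifier skeleton works for plain positive counts, (pos, neg) pairs, or richer aggregates.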
Certain constructs allowed in Mizar articles cannot be represented in first-order logic but can be represented in higher-order logic. We describe a way to obtain higher-order theorem proving problems from Mizar articles that make use of these constructs. In particular, higher-order logic is used to represent schemes, a global choice construct, and set-level binders. The higher-order automated theorem provers Satallax and LEO-II have been run on collections of these problems, and the results are discussed.
Online social systems are multiplex in nature as multiple links may exist between the same two users across different social networks. In this work, we introduce a framework for studying links and interactions between users beyond the individual social network. Exploring the cross-section of two popular online platforms - Twitter and location-based social network Foursquare - we represent the two together as a composite multilayer online social network. Through this paradigm we study the interactions of pairs of users differentiating between those with links on one or both networks. We find that users with multiplex links, who are connected on both networks, interact more and have greater neighbourhood overlap on both platforms, in comparison with pairs who are connected on just one of the social networks. In particular, the most frequented locations of users are considerably closer, and similarity is considerably greater among multiplex links. We present a number of structural and interaction features, such as the multilayer Adamic/Adar coefficient, which are based on the extension of the concept of the node neighbourhood beyond the single network. Our evaluation, which aims to shed light on the implications of multiplexity for the link generation process, shows that multilayer features, constructed from properties across social networks, perform better than their single network counterparts in predicting links across networks. We propose that combining information from multiple networks in a multilayer configuration can provide new insights into user interactions on online social networks, and can significantly improve link prediction overall with valuable applications to social bootstrapping and friend recommendations.
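The multilayer Adamic/Adar coefficient mentioned above extends the standard coefficient, AA(u,v) = Σ_{w ∈ N(u)∩N(v)} 1/log|N(w)|, by widening the node neighbourhood beyond a single network. The sketch below assumes the multilayer neighbourhood is the union of a node's neighbourhoods across layers; the paper's exact definition may differ, and the toy graphs are invented.

```python
# Hedged sketch: single-network Adamic/Adar and a possible multilayer variant
# where N(x) is the union of x's neighbourhoods across layers (an assumption).
from math import log

def adamic_adar(neigh, u, v):
    """Sum over common neighbours w of 1 / log(deg(w))."""
    common = neigh[u] & neigh[v]
    return sum(1.0 / log(len(neigh[w])) for w in common if len(neigh[w]) > 1)

def multilayer_adamic_adar(layers, u, v):
    """Same score, with neighbourhoods pooled across all layers."""
    def n(x):
        pooled = set()
        for neigh in layers:
            pooled |= neigh.get(x, set())
        return pooled
    common = n(u) & n(v)
    return sum(1.0 / log(len(n(w))) for w in common if len(n(w)) > 1)

# Toy example: two layers standing in for, e.g., Twitter and Foursquare ties.
twitter = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
foursquare = {"a": {"c", "d"}, "c": {"a"}, "d": {"a", "b"}, "b": {"d"}}
print(adamic_adar(twitter, "a", "b"))
print(multilayer_adamic_adar([twitter, foursquare], "a", "b"))
```

In this toy example the pooled neighbourhoods expose a second common neighbour that neither layer reveals on its own, which is exactly why such multilayer features can outperform their single-network counterparts in link prediction.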
We analyze two large datasets from technological networks with location and social data: user location records from an online location-based social networking service, and anonymized telecommunications data from a European cellphone operator, in order to investigate the differences between individual and group behavior with respect to physical location. We discover agreements between the two datasets: individuals are more likely to meet with one friend at a place they have not visited before, but tend to meet at familiar locations when with a larger group. We also find that groups of individuals are more likely to meet at places that their other friends have visited, and that the type of a place strongly affects the propensity for groups to meet there. These differences between group and solo mobility have potential technological applications, for example in venue recommendation in location-based social networks.
The layouts of the buildings we live in shape our everyday lives. In office environments, building spaces affect employees' communication, which is crucial for productivity and innovation. However, accurately measuring how spatial layouts affect interactions is a major challenge, and traditional techniques may not give an objective view. We measure the impact of building spaces on social interactions using wearable sensing devices. We study a single organization that moved between two different buildings, affording a unique opportunity to examine how space alone can affect interactions. The analysis is based on two large-scale deployments of a wireless sensing technology: short-range, lightweight RFID tags capable of detecting face-to-face interactions. We analyze the traces to study the impact of the building change on social behavior, providing a first example of using ubiquitous sensing technology to study how the physical design of two workplaces combines with organizational structure to shape contact patterns.
Mar 27 2014 cs.SC
In this paper we introduce the notion of an Open Non-uniform Cylindrical Algebraic Decomposition (NuCAD), and present an efficient model-based algorithm for constructing an Open NuCAD from an input formula. A NuCAD is a generalization of Cylindrical Algebraic Decomposition (CAD) as defined by Collins in his seminal work from the early 1970s, and as extended in concepts like Hong's partial CAD. A NuCAD, like a CAD, is a decomposition of n-dimensional real space into cylindrical cells. But unlike a CAD, the cells in a NuCAD need not be arranged cylindrically. It is in this sense that NuCADs are not uniformly cylindrical. However, NuCADs --- like CADs --- carry a tree-like structure that relates different cells. It is a very different tree but, as with the CAD tree structure, it allows some operations to be performed efficiently, for example locating the containing cell for an arbitrary input point.
Oct 15 2013 cs.CY
In many work environments, serendipitous interactions between members of different groups may lead to enhanced productivity, collaboration and knowledge dissemination. Two factors that may have an influence on such interactions are cultural differences between individuals in highly multicultural workplaces, and the layout and physical spaces of the workplace itself. In this work, we investigate how these two factors may facilitate or hinder inter-group interactions in the workplace. We analyze traces collected using wearable electronic badges to capture face-to-face interactions and mobility patterns of employees in a research laboratory in the UK. We observe that those who interact with people of different roles tend to come from collectivist cultures that value relationships and where people tend to be comfortable with social hierarchies, and that some locations in particular are more likely to host serendipitous interactions, knowledge that could be used by organizations to enhance communication and productivity.
The focused organization theory of social ties proposes that the structure of human social networks can be arranged around extra-network foci, which can include shared physical spaces such as homes, workplaces, restaurants, and so on. Until now, this has been difficult to investigate on a large scale, but the huge volume of data available from online location-based social services now makes it possible to examine the friendships and mobility of many thousands of people, and to investigate the relationship between meetings at places and the structure of the social network. In this paper, we analyze a large dataset from Foursquare, the most popular online location-based social network. We examine the properties of city-based social networks, finding that they have common structural properties, and that the category of place where two people meet has very strong influence on the likelihood of their being friends. Inspired by these observations in combination with the focused organization theory, we then present a model to generate city-level social networks, and show that it produces networks with the structural properties seen in empirical data.
Thanks to widely available, cheap Internet access and the ubiquity of smartphones, millions of people around the world now use online location-based social networking services. Understanding the structural properties of these systems and their dependence upon users' habits and mobility has many potential applications, including resource recommendation and link prediction. Here, we construct and characterise social and place-focused graphs by using longitudinal information about declared social relationships and about users' visits to physical places collected from a popular online location-based social service. We show that although the social and place-focused graphs are constructed from the same data set, they have quite different structural properties. We find that the social and location-focused graphs have different global and meso-scale structure, and in particular that social and place-focused communities have negligible overlap. Consequently, group inference based on community detection performed on the social graph alone fails to isolate place-focused groups, even though these do exist in the network. By studying the evolution of tie structure within communities, we show that the time period over which location data are aggregated has a substantial impact on the stability of place-focused communities, and that information about place-based groups may be more useful for user-centric applications than that obtained from the analysis of social communities alone.
We introduce design and optimization considerations for the 'khmer' package.
Scientists spend an increasing amount of time building and using software. However, most scientists are never taught how to do this efficiently. As a result, many are unaware of tools and practices that would allow them to write more reliable and maintainable code with less effort. We describe a set of best practices for scientific software development that have solid foundations in research and experience, and that improve scientists' productivity and the reliability of their software.
Apr 13 2010 cs.LO
We study simple type theory with primitive equality (STT) and its first-order fragment EFO, which restricts equality and quantification to base types but retains lambda abstraction and higher-order variables. As deductive system we employ a cut-free tableau calculus. We consider completeness, compactness, and existence of countable models. We prove these properties for STT with respect to Henkin models and for EFO with respect to standard models. We also show that the tableau system yields a decision procedure for three EFO fragments.
Apr 27 2009 cs.CC
Whether P equals NP is an important open problem in computer science. Jaeger's 2008 paper, "Solving the P/NP Problem Under Intrinsic Uncertainty" (arXiv:0811.0463), attempts to tackle the problem by discussing the possibility that all computation is uncertain to some degree. However, a number of logical oversights in that paper preclude it from serious consideration as a resolution of P-versus-NP. There are several differences between the model of computation presented in Jaeger's paper and the standard model, as well as several bold assumptions that are not well supported in the paper or in the literature. In addition, we find several omissions of rigorous proof that ultimately weaken the paper to the point where it cannot be considered a candidate solution to the P-versus-NP problem.
We investigate cut-elimination and cut-simulation in impredicative (higher-order) logics. We illustrate that adding simple axioms such as Leibniz equations to a calculus for an impredicative logic -- in our case a sequent calculus for classical type theory -- is like adding cut. The phenomenon equally applies to prominent axioms like Boolean- and functional extensionality, induction, choice, and description. This calls for the development of calculi where these principles are built-in instead of being treated axiomatically.