Information Nutritional Label and Word Embedding to Estimate Information Check-Worthiness

Automatic fact-checking is an important challenge nowadays, since anyone can write about anything and spread it on social media, regardless of the quality of the information. In this paper, we revisit the information check-worthiness problem and propose a method that combines the “information nutritional label” features with POS-tags and word-embedding representations. To predict check-worthy claims, we train a machine learning

Clustering for Traceability Managing in System Specifications

System specifications are generally organized into several levels of document hierarchies, linked in order to represent traceability information. Requirements engineering experts manually verify the links between specifications, which makes it possible to generate a traceability matrix. The purpose of this paper is to automate the generation of the traceability matrix, since it is a time-consuming and costly task. We

DEMOS: a participatory design approach for democratic empowerment of IS users

The issue of democracy in society is at the heart of our current concerns. Organizations and their information systems are also affected by this issue. Democracy in organizations requires a debate about the norms, values and language encapsulated in information systems. Participatory design approaches address this issue by proposing democratic empowerment of users during the design phase of

Shape-based Outlier Detection in Multivariate Functional Data

Multivariate functional data refer to a population of multivariate functions generated by a system involving dynamic parameters depending on continuous variables (e.g., multivariate time series). Outlier detection in such a context is a challenging problem because both the individual behavior of the parameters and the dynamic correlation between them are important. To address this problem, recent work has focused on

PRYNT: a tool for prioritization of disease candidates from proteomics data using a combination of shortest-path and random walk algorithms

The urinary proteome is a promising pool of biomarkers of kidney disease. However, the protein changes observed in urine only partially reflect the deregulated mechanisms within kidney tissue. In order to improve on the mechanistic insight based on the urinary protein changes, we developed a new prioritization strategy called PRYNT (PRioritization bY protein NeTwork) that employs a combination of two

Coalitional Strategies for Efficient Individual Prediction Explanation

As Machine Learning (ML) is now widely applied in many domains, in both research and industry, understanding what happens inside the black box is a growing demand, especially among non-experts in these models. Several approaches have thus been developed to provide clear insights into a model's prediction for a particular observation but at the cost of

Improving vehicle re‐identification using CNN latent spaces: Metrics comparison and track‐to‐track extension

Herein, the problem of vehicle re-identification using distance comparison of images in CNN latent spaces is addressed. First, the impact of the distance metric is studied by comparing the performance obtained with different metrics: the minimal Euclidean distance (MED), the minimal cosine distance (MCD) and the residue of the sparse coding reconstruction (RSCR). These metrics are applied using features extracted from five

Efficient query of multidimensional RDF data with aggregates : comparing NOSQL, RDF and relational data stores

This paper proposes an approach to tackle the problem of querying large volumes of statistical RDF data. Our approach relies on pre-aggregation strategies to better manage the analysis of this kind of data. Specifically, we define a conceptual model to represent original RDF data with aggregates in a multidimensional structure. A set of translation rules for converting a well-known multidimensional

OCL Constraints Checking on NoSQL Systems Through an MDA-Based Approach

Big data have received a great deal of attention in recent years. Not only is the amount of data on a completely different level than before, but there are also different types of data, varying in factors such as format, structure, and source. This has definitely changed the tools one needs to handle big data, giving rise to NoSQL systems.