IV · Feb 13, 2023
Deep Anatomical Federated Network (Dafne): An open client-server framework for the continuous, collaborative improvement of deep learning-based medical image segmentation
Francesco Santini, Jakob Wasserthal, Abramo Agosti et al.
Purpose: To present and evaluate Dafne (deep anatomical federated network), a freely available decentralized, collaborative deep learning system for the semantic segmentation of radiological images through federated incremental learning. Materials and Methods: Dafne is free software with a client-server architecture. The client side is an advanced user interface that applies the deep learning models stored on the server to the user's data and allows the user to check and refine the prediction. Incremental learning is then performed at the client's side and sent back to the server, where it is integrated into the root model. Dafne was evaluated locally, by assessing the performance gain across model generations on 38 MRI datasets of the lower legs, and through the analysis of real-world usage statistics (n = 639 use-cases). Results: Dafne demonstrated a statistically significant improvement in the accuracy of semantic segmentation over time (average increase of the Dice Similarity Coefficient by 0.007 points/generation on the local validation set, p < 0.001). Qualitatively, the models showed enhanced performance on various radiologic image types, including those not present in the initial training sets, indicating good model generalizability. Conclusion: Dafne showed improvement in segmentation quality over time, demonstrating potential for learning and generalization.
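For readers unfamiliar with the metric reported above, the Dice Similarity Coefficient of two binary masks A and B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch on flattened masks. The function name, the toy masks, and the empty-mask convention are illustrative assumptions, not taken from the Dafne code base.

```python
def dice_coefficient(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention (an assumption): two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical flattened masks: 1 = voxel belongs to the structure.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 1, 1, 0, 0, 1]
score = dice_coefficient(pred, truth)  # 2 * 2 / (3 + 3)
```

On this scale, the reported gain of 0.007 points per generation is an absolute change in a score bounded between 0 and 1.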
90.5 · CL · May 20
On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists
Seungone Kim, Dongkeun Yoon, Kiril Gashteovski et al.
With the advancement of AI capabilities, AI reviewers are beginning to be deployed in scientific peer review, yet their capability and credibility remain in question: many scientists simply view them as probabilistic systems without the expertise to evaluate research, while other researchers are more optimistic about their readiness without concrete evidence. Understanding what AI reviewers do well, where they fall short, and what challenges remain is essential. However, existing evaluations of AI reviewers have focused on whether their verdicts match human verdicts (e.g., score alignment, acceptance prediction), which is insufficient to characterize their capabilities and limits. In this paper, we close this gap through a large-scale expert annotation study, in which 45 domain scientists in Physical, Biological, and Health Sciences spent 469 hours rating 2,960 individual criticisms (each targeting one specific aspect of a paper) from human-written and AI-generated reviews of 82 Nature-family papers on correctness, significance, and sufficiency of evidence. On a composite of all three dimensions, a reviewing agent powered by GPT-5.2 scores above each paper's top-rated human reviewer (60.0% vs. 48.2%, p = 0.009), while all three AI reviewers (including Gemini 3.0 Pro and Claude Opus 4.5) exceed the lowest-rated human across every dimension. AI reviewers' accurate criticisms are also more often rated significant and well-evidenced, and surface a distinct 26% of issues no human raises. However, AI reviewers overlap far more than humans do (21% vs. 3% for cross-reviewer pairs), and exhibit 16 recurring weaknesses humans do not share, such as limited subfield knowledge, lack of long context management over multiple files, and overly critical stance on minor issues. Overall, our results position current AI reviewers as complements to, not substitutes for, human reviewers.
QUANT-PH · Sep 9, 2024
An encoding of argumentation problems using quadratic unconstrained binary optimization
Marco Baioletti, Francesco Santini
In this paper, we develop a way to encode several NP-Complete problems in Abstract Argumentation to Quadratic Unconstrained Binary Optimization (QUBO) problems. In this form, a solution for a QUBO problem involves minimizing a quadratic function over binary variables (0/1), where the coefficients can be represented by a symmetric square matrix (or an equivalent upper triangular version). With the QUBO formulation, exploiting new computing architectures, such as Quantum and Digital Annealers, is possible. A more conventional approach consists of developing approximate solvers, which, in this case, are used to tackle the intrinsic complexity. We performed tests to prove the correctness and applicability of classical problems in Argumentation and enforcement of argument sets. We compared our approach to two other approximate solvers in the literature during tests. In the final experimentation, we used a Simulated Annealing algorithm on a local machine. Also, we tested a Quantum Annealer from the D-Wave Ocean SDK and the Leap Quantum Cloud Service.
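To make the QUBO form concrete, the sketch below evaluates a quadratic function over binary variables given by an upper-triangular matrix and minimizes it by exhaustive search. The toy encoding, which rewards each selected argument and penalizes selecting both ends of an attack, is an illustrative assumption and not the encoding developed in the paper. All function names are hypothetical, and brute force stands in for the annealers mentioned above.

```python
from itertools import product


def qubo_energy(Q, x):
    """Value of sum_{i<=j} Q[i][j] * x[i] * x[j] for an upper-triangular Q."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))


def brute_force_minimum(Q):
    """Try all 2^n binary assignments; only feasible for very small n."""
    n = len(Q)
    return min(product((0, 1), repeat=n), key=lambda x: qubo_energy(Q, x))


def conflict_free_qubo(n, attacks, penalty=2):
    """Toy encoding (assumption): -1 per selected argument,
    +penalty whenever both endpoints of an attack are selected."""
    Q = [[0] * n for _ in range(n)]
    for i in range(n):
        Q[i][i] = -1
    for a, b in attacks:
        i, j = min(a, b), max(a, b)  # keep the matrix upper triangular
        Q[i][j] += penalty
    return Q


# Hypothetical framework: argument 0 attacks 1, and 1 attacks 2.
Q = conflict_free_qubo(3, [(0, 1), (1, 2)])
best = brute_force_minimum(Q)
best_energy = qubo_energy(Q, best)
```

Here the minimizer selects arguments 0 and 2, the largest conflict-free set of this small framework. An annealer would be handed the same matrix `Q` in place of the exhaustive search.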
AI · Jan 18, 2019
Block Argumentation
Ryuta Arisaka, Stefano Bistarelli, Francesco Santini
We contemplate a higher-level bipolar abstract argumentation for non-elementary arguments such as: X argues against Y's sincerity with the fact that Y has presented his argument to draw a conclusion C, by omitting other facts which would not have validated C. Argumentation involving such arguments requires us to potentially consider an argument as a coherent block of argumentation, i.e. an argument may itself be an argumentation. In this work, we formulate block argumentation as a specific instance of Dung-style bipolar abstract argumentation with the dual nature of arguments. We consider internal consistency of an argument(ation) under a set of constraints, of graphical (syntactic) and of semantic nature, and formulate acceptability semantics in relation to them. We discover that classical acceptability semantics do not in general hold good with the constraints. In particular, acceptability of unattacked arguments is not always warranted. Further, there may not be a unique minimal member in complete semantics, thus sceptic (grounded) semantics may not be its subset. To retain set-theoretically minimal semantics as a subset of complete semantics, we define semi-grounded semantics. Through comparisons, we show how the concept of block argumentation may further generalise structured argumentation.
AI · Feb 22, 2018
On Looking for Local Expansion Invariants in Argumentation Semantics: a Preliminary Report
Stefano Bistarelli, Francesco Santini, Carlo Taticchi
We study invariant local expansion operators for conflict-free and admissible sets in Abstract Argumentation Frameworks (AFs). Such operators are directly applied to AFs, and are invariant with respect to a chosen "semantics" (that is, w.r.t. each of the conflict-free/admissible sets of arguments). Accordingly, we derive a definition of robustness for AFs in terms of the number of times such operators can be applied without producing any change in the chosen semantics.
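As a reminder of the two notions involved, the sketch below enumerates the conflict-free and admissible sets of a small framework under the standard Dung-style definitions. A set is conflict-free if no member attacks another member, and admissible if it is also defended against every attacker. The example framework and function names are illustrative assumptions. The expansion operators themselves are not implemented here.

```python
from itertools import combinations


def is_conflict_free(S, attacks):
    """No argument in S attacks an argument in S."""
    return not any((a, b) in attacks for a in S for b in S)


def is_admissible(S, args, attacks):
    """S is conflict-free and counter-attacks every attacker of its members."""
    if not is_conflict_free(S, attacks):
        return False
    for target in S:
        for attacker in args:
            if (attacker, target) in attacks and not any(
                (defender, attacker) in attacks for defender in S
            ):
                return False
    return True


def subsets(args):
    """All subsets of args, smallest first."""
    items = sorted(args)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


# Hypothetical framework: a attacks b, b attacks c.
args = {"a", "b", "c"}
attacks = {("a", "b"), ("b", "c")}

conflict_free = {S for S in subsets(args) if is_conflict_free(S, attacks)}
admissible = {S for S in subsets(args) if is_admissible(S, args, attacks)}
```

Recomputing `conflict_free` or `admissible` after adding an argument or an attack, and comparing with the previous result, is one direct way to check whether a given expansion leaves the chosen semantics unchanged.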
LO · Sep 29, 2015
Semiring-based Specification Approaches for Quantitative Security
Fabio Martinelli, Ilaria Matteucci, Francesco Santini
Our goal is to provide different semiring-based formal tools for the specification of security requirements: we quantitatively enhance the open-system approach, according to which a system is partially specified. Therefore, we suppose the existence of an unknown and possibly malicious agent that interacts in parallel with the system. Two specification frameworks are designed along two different (but still related) lines. First, by comparing the behaviour of a system with the expected one, or by checking if such a system satisfies some security requirements: we investigate a novel approximate behavioural-equivalence for comparing processes' behaviour, thus extending the Generalised Non Deducibility on Composition (GNDC) approach with scores. As a second result, we equip a modal logic with semiring values with the purpose of having a weight related to the satisfaction of a formula that specifies some requested property. Finally, we generalise the classical partial model-checking function, which we name quantitative partial model-checking, so as to point out the necessary and sufficient conditions that a system has to satisfy in order to be considered secure with respect to a fixed security/functionality threshold-value.
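The role a semiring plays in such scores can be sketched with two textbook instances: the weighted semiring (min, +) over costs and the fuzzy semiring (max, min) over [0, 1]. The class, helper names, and numbers below are illustrative assumptions and do not reproduce the paper's formal definitions. The threshold test relies on the usual induced order, in which a value is at least as good as a threshold exactly when "plus" of the two returns the value.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable


@dataclass(frozen=True)
class Semiring:
    plus: Callable[[float, float], float]   # picks the better of two values
    times: Callable[[float, float], float]  # combines values along a behaviour
    zero: float                             # worst value, identity of plus
    one: float                              # best value, identity of times


WEIGHTED = Semiring(plus=min, times=lambda a, b: a + b, zero=float("inf"), one=0.0)
FUZZY = Semiring(plus=max, times=min, zero=0.0, one=1.0)


def combine(sr, values):
    """Score of one behaviour: fold its step values with 'times'."""
    return reduce(sr.times, values, sr.one)


def best(sr, values):
    """Best score among alternatives: fold with 'plus'."""
    return reduce(sr.plus, values, sr.zero)


def meets_threshold(sr, value, threshold):
    """True when value is at least as good as threshold in the semiring order."""
    return sr.plus(value, threshold) == value


# Hypothetical step costs of two alternative runs of a system.
run_costs = [combine(WEIGHTED, [2, 3]), combine(WEIGHTED, [1, 1, 5])]
cheapest = best(WEIGHTED, run_costs)
ok_weighted = meets_threshold(WEIGHTED, cheapest, 6)

# Hypothetical trust levels along one run, checked against threshold 0.7.
trust = combine(FUZZY, [0.9, 0.6])
ok_fuzzy = meets_threshold(FUZZY, trust, 0.7)
```

Swapping `WEIGHTED` for `FUZZY` changes what "better" means without touching the helper functions, which is the main appeal of parameterizing a specification framework by a semiring.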
PL · Feb 24, 2014
Timed Soft Concurrent Constraint Programs: An Interleaved and a Parallel Approach
Stefano Bistarelli, Maurizio Gabbrielli, Maria Chiara Meo et al.
We propose a timed and soft extension of Concurrent Constraint Programming. The time extension is based on the hypothesis of bounded asynchrony: the computation takes a bounded period of time and is measured by a discrete global clock. Action prefixing is then considered as the syntactic marker which distinguishes a time instant from the next one. Supported by soft constraints instead of crisp ones, tell and ask agents are now equipped with a preference (or consistency) threshold which is used to determine their success or suspension. In the paper we provide a language to describe the agents' behavior, together with its operational and denotational semantics, for which we also prove the compositionality and correctness properties. After presenting a semantics using maximal parallelism of actions, we also describe a version for their interleaving on a single processor (with maximal parallelism for time elapsing). Coordinating agents that need to take decisions both on preference values and time events may benefit from this language. To appear in Theory and Practice of Logic Programming (TPLP).