CL · Jan 6, 2024
MultiSiam: A Multiple Input Siamese Network For Social Media Text Classification And Duplicate Text Detection
Sudhanshu Bhoi, Swapnil Markhedkar, Shruti Phadke et al.
Social media accounts post increasingly similar content, creating a chaotic experience across platforms that makes it difficult to access desired information. These posts can be organized by categorizing them and grouping duplicates across social handles and accounts. A post can have more than one duplicate; however, a conventional Siamese neural network considers only a pair of inputs for duplicate text detection. In this paper, we first propose a multiple-input Siamese network, MultiSiam. We then use this condensed network to propose another model, SMCD (Social Media Classification and Duplication Model), which performs both duplicate text grouping and categorization. The MultiSiam network, just like the Siamese network, can be used in multiple applications by changing the sub-network appropriately.
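The idea of passing more than two inputs through one shared sub-network can be illustrated with a minimal sketch. This is not the MultiSiam architecture from the paper: the encoder here is a toy normalised bag-of-words stand-in for a learned sub-network, and the names `encode`, `cosine`, `group_duplicates` and the `threshold` value are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def encode(text):
    """Stand-in for the shared sub-network (assumption: a real model would
    use a learned text encoder). Returns an L2-normalised bag-of-words."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {token: c / norm for token, c in counts.items()}


def cosine(a, b):
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(weight * b.get(token, 0.0) for token, weight in a.items())


def group_duplicates(texts, threshold=0.8):
    """Encode every input with the same encoder, then merge inputs whose
    pairwise similarity reaches the (hypothetical) threshold."""
    embeddings = [encode(t) for t in texts]  # one encoder shared by all inputs
    parent = list(range(len(texts)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(texts)):
        for j in range(i + 1, len(texts)):
            if cosine(embeddings[i], embeddings[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(texts)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


posts = [
    "Big sale today only",
    "big sale today only",
    "Match postponed due to rain",
    "BIG SALE TODAY ONLY",
]
groups = group_duplicates(posts)
print(groups)
```

Because every input goes through the same `encode` function, the sketch handles any number of posts at once instead of one pair at a time; swapping `encode` for a different sub-network changes the application without touching the grouping step.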
CR · Jun 8, 2020
An operational architecture for privacy-by-design in public service applications
Prashant Agrawal, Anubhutie Singh, Malavika Raghavan et al.
Governments around the world are trying to build large data registries for effective delivery of a variety of public services. However, these efforts are often undermined by serious concerns over the privacy risks associated with the collection and processing of personally identifiable information. While a rich set of special-purpose privacy-preserving techniques exists in computer science, they are unable to provide end-to-end protection in alignment with legal principles in the absence of an overarching operational architecture to ensure purpose limitation and protection against insider attacks. This leads either to weak privacy protection in large designs, or to the adoption of overly defensive strategies that protect privacy by compromising on utility. In this paper, we present an operational architecture for privacy-by-design based on independent regulatory oversight stipulated by most data protection regimes, regulated access control, purpose limitation and data minimisation. We briefly discuss the feasibility of implementing our architecture based on existing techniques. We also present some sample case studies of privacy-preserving design sketches of challenging public service applications.
CR · Aug 26, 2019
OpenVoting: Recoverability from Failures in Dual Voting
Prashant Agrawal, Kabir Tomer, Abhinav Nakarmi et al.
In this paper, we address the problem of recovering from failures without re-running entire elections when elections fail to verify. We consider the setting of dual voting protocols, where the cryptographic guarantees of end-to-end verifiable voting (E2E-V) are combined with the simplicity of audit using voter-verified paper records (VVPR). We first consider the design requirements of such a system and then suggest a protocol called OpenVoting, which identifies a verifiable subset of error-free votes consistent with the VVPRs, as well as the polling booths corresponding to the votes that fail to verify, with possible reasons for the failures. To an ordinary voter, OpenVoting looks just like an old-fashioned paper-based voting system, with minimal additional cognitive overload.