Debajyoti Mukhopadhyay

IR
h-index: 12
Papers: 28
Citations: 380
Novelty: 20%
AI Score: 35

28 Papers

DB · Dec 26, 2025
Cost-Aware Text-to-SQL: An Empirical Study of Cloud Compute Costs for LLM-Generated Queries

Saurabh Deochake, Debajyoti Mukhopadhyay

Text-to-SQL systems powered by Large Language Models (LLMs) achieve high accuracy on standard benchmarks, yet existing efficiency metrics such as the Valid Efficiency Score (VES) measure execution time rather than the consumption-based costs of cloud data warehouses. This paper presents the first systematic evaluation of cloud compute costs for LLM-generated SQL queries. We evaluate six state-of-the-art LLMs across 180 query executions on Google BigQuery using the StackOverflow dataset (230GB), measuring bytes processed, slot utilization, and estimated cost. Our analysis yields three key findings: (1) reasoning models process 44.5% fewer bytes than standard models while maintaining equivalent correctness (96.7%-100%); (2) execution time correlates weakly with query cost (r=0.16), indicating that speed optimization does not imply cost optimization; and (3) models exhibit up to 3.4x cost variance, with standard models producing outliers exceeding 36GB per query. We identify prevalent inefficiency patterns including missing partition filters and unnecessary full-table scans, and provide deployment guidelines for cost-sensitive enterprise environments.
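To make the link between bytes processed and consumption-based cost concrete, here is a minimal sketch of an on-demand cost estimate. The function name `estimate_on_demand_cost`, the default per-TiB rate, and the example scan sizes are assumptions for illustration only; they are not figures from the paper, and current BigQuery pricing should be checked before relying on any number.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def estimate_on_demand_cost(bytes_processed: int, usd_per_tib: float = 6.25) -> float:
    """Estimate query cost from bytes scanned under per-TiB on-demand pricing.

    usd_per_tib is an assumed rate, not a value reported in the paper.
    """
    if bytes_processed < 0:
        raise ValueError("bytes_processed must be non-negative")
    return bytes_processed / TIB * usd_per_tib


# Hypothetical sizes: a 36 GB outlier scan versus a 2 GB partition-filtered scan.
full_scan = estimate_on_demand_cost(36 * 10**9)
filtered = estimate_on_demand_cost(2 * 10**9)
print(f"full scan: ${full_scan:.4f}, filtered: ${filtered:.4f}")
```

Under this pricing model the cost depends only on bytes scanned, which is consistent with the abstract's point that a fast query is not necessarily a cheap one.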

DS · Oct 20, 2025
Sorting by Strip Swaps is NP-Hard

Swapnoneel Roy, Asai Asaithambi, Debajyoti Mukhopadhyay

We show that \emph{Sorting by Strip Swaps} (SbSS) is NP-hard by a polynomial reduction of \emph{Block Sorting}. The key idea is a local gadget, a \emph{cage}, that replaces every decreasing adjacency $(a_i,a_{i+1})$ by a guarded triple $a_i,m_i,a_{i+1}$ enclosed by guards $L_i,U_i$, so the only decreasing adjacencies are the two inside the cage. Small \emph{hinge} gadgets couple adjacent cages that share an element and enforce that a strip swap that removes exactly two adjacencies corresponds bijectively to a block move that removes exactly one decreasing adjacency in the source permutation. This yields a clean equivalence between exact SbSS schedules and perfect block schedules, establishing NP-hardness.
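As a rough illustration of the objects involved, the sketch below decomposes a permutation into strips and swaps two of them. It assumes the common definition in which a strip is a maximal run of consecutive increasing integers; the helper names `strips` and `strip_swap` are hypothetical, and nothing here reproduces the cage or hinge gadgets of the reduction.

```python
def strips(perm):
    """Split a permutation into maximal runs of consecutive increasing values."""
    if not perm:
        return []
    runs = [[perm[0]]]
    for value in perm[1:]:
        if value == runs[-1][-1] + 1:
            runs[-1].append(value)  # extends the current strip
        else:
            runs.append([value])    # starts a new strip
    return runs


def strip_swap(perm, i, j):
    """Exchange the i-th and j-th strips and return the resulting permutation."""
    runs = strips(perm)
    runs[i], runs[j] = runs[j], runs[i]
    return [value for run in runs for value in run]


print(strips([3, 4, 1, 2, 5]))
print(strip_swap([3, 4, 1, 2, 5], 0, 1))
```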

AI · Feb 4, 2022
HENRI: High Efficiency Negotiation-based Robust Interface for Multi-party Multi-issue Negotiation over the Internet

Saurabh Deochake, Shashank Kanth, Subhadip Chakraborty et al.

This paper proposes a framework for a full-fledged negotiation system that supports multi-party, multi-issue negotiation. It focuses on the negotiation protocol to be observed and provides a platform for concurrent, independent negotiation on individual issues using multithreading. It describes the architecture of an agent and details its components. The paper sets out a hierarchical pattern for the multiple issues concerning each party. The system also provides enhancements such as time-to-live counters for every advertisement, refinement of utility by considering non-functional attributes, and prioritization of issues by assigning weights to them.

LG · Mar 25, 2015
A Survey of Classification Techniques in the Area of Big Data

Praful Koturwar, Sheetal Girase, Debajyoti Mukhopadhyay

Big Data concerns large-volume, growing data sets that are complex and have multiple autonomous sources. Earlier technologies could not handle the storage and processing of such huge data, so the concept of Big Data came into existence. Working with unstructured data is a tedious job for users. There should therefore be some mechanism that classifies unstructured data into an organized form, helping users easily access the data they need. Classification techniques over big transactional databases provide the required data to users from large datasets in a simpler way. There are two main classification techniques: supervised and unsupervised. In this paper we focus on the study of different supervised classification techniques. The paper also presents their advantages and limitations.

IR · Mar 25, 2015
Role of Matrix Factorization Model in Collaborative Filtering Algorithm: A Survey

Dheeraj kumar Bokde, Sheetal Girase, Debajyoti Mukhopadhyay

Recommendation Systems apply Information Retrieval techniques to select the online information relevant to a given user. Collaborative Filtering (CF) is currently the most widely used approach for building Recommendation Systems. CF techniques use user behavior, in the form of user-item ratings, as their information source for prediction. CF algorithms face major challenges such as the sparsity of the rating matrix and the growing nature of the data. These challenges are handled well by Matrix Factorization (MF). In this paper we attempt to present an overview of the role of different MF models in addressing the challenges of CF algorithms, which can serve as a roadmap for research in this area.
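The basic idea of matrix factorization on a sparse rating matrix can be sketched as follows. This is a generic stochastic-gradient-descent factorization, not a method taken from the survey; the function names, hyperparameters, and the toy ratings are all assumptions chosen for illustration.

```python
import random


def factorize(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02, epochs=500, seed=0):
    """Learn user factors P and item factors Q from (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted rating is the dot product of user and item factor vectors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def rmse(ratings, P, Q):
    """Root-mean-square error over the observed ratings only."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (total / len(ratings)) ** 0.5


# Toy data: only 6 of the 9 user-item cells are observed (a sparse matrix).
observed = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
P, Q = factorize(observed, n_users=3, n_items=3)
print("training RMSE:", rmse(observed, P, Q))
print("predicted rating for unobserved cell (0, 2):", predict(P, Q, 0, 2))
```

Only observed cells contribute to the updates, which is why this family of models copes with sparsity: the unobserved cells are filled in by the learned factors rather than stored.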

IR · Mar 25, 2015
User Profiling Trends, Techniques and Applications

Sumitkumar Kanoje, Sheetal Girase, Debajyoti Mukhopadhyay

The personalization of information has taken recommender systems to a very high level. With personalization, these systems can generate user-specific recommendations accurately and efficiently. User profiling supports personalization: information retrieval is tailored by maintaining a separate profile for each individual user. The main objective of this paper is to explore the field of personalization in the context of user profiling and to make researchers aware of user profiling. Various trends, techniques, and applications are discussed in the paper to fulfill this aim.

IR · Mar 23, 2015
An Item-Based Collaborative Filtering using Dimensionality Reduction Techniques on Mahout Framework

Dheeraj kumar Bokde, Sheetal Girase, Debajyoti Mukhopadhyay

Collaborative Filtering (CF) is the most widely used prediction technique in Recommendation Systems. Most current CF recommender systems maintain single-criterion user ratings in the user-item matrix. However, recent studies indicate that recommender systems based on multiple criteria can improve prediction and accuracy levels by considering user preferences across multiple aspects of items. This gives rise to Multi-Criteria Collaborative Filtering (MC CF). In MC CF, users rate multiple aspects of an item in new dimensions, thereby increasing the size of the rating matrix and aggravating the sparsity and scalability problems. Appropriate dimensionality reduction techniques are thus needed to address these challenges by reducing the dimension of the user-item rating matrix and improving the prediction accuracy and efficiency of the CF recommender system. The process of dimensionality reduction maps the high-dimensional input space into a lower-dimensional space. The objective of this paper is therefore to propose an efficient MC CF algorithm that uses dimensionality reduction to improve recommendation quality and prediction accuracy. Dimensionality reduction techniques such as Singular Value Decomposition and Principal Component Analysis are used to solve the scalability problem and alleviate the sparsity problem in the overall rating. The proposed MC CF approach will be implemented using Apache Mahout, which allows processing of massive datasets stored in distributed or non-distributed file systems.

DC · Mar 23, 2015
Algorithm for Back-up and Authentication of Data Stored on Cloud

Manali Raje, Debajyoti Mukhopadhyay

Every day a huge amount of data is generated in Cloud Computing. Maintaining this electronic data requires extremely efficient services. There is a need to properly collect this data, check its authenticity, and develop proper backups. The objective of this paper is to provide a Response Server, a solution for backing up and restoring data using the Cloud. The data is collected from the client and then sent to a central location. This process is platform independent. The data can then be used as required. The Remote Backup Server facilitates the collection of information from any remote location and provides services to recover the data in case of loss. The user is authenticated using an asymmetric key algorithm, which in turn leads to the authentication of the data.

IR · Mar 23, 2015
User Profiling for Recommendation System

Sumitkumar Kanoje, Sheetal Girase, Debajyoti Mukhopadhyay

A recommendation system is a type of information filtering system that recommends objects of interest to the user from a vast variety and quantity of items. It guides an individual in a personalized way to interesting or useful objects in a large space of possible options. Such systems also help many businesses achieve more profit and sustain themselves in their field against rivals. But given the amount of information a business holds, it becomes difficult to identify the items of interest to a user. Personalization, or user profiling, is therefore one of the challenging tasks: it gives access to user-relevant information that can be used to solve the difficult task of classifying and ranking items according to an individual's interest. Profiling can be done in various ways, such as supervised or unsupervised, individual or group, and distributive or non-distributive profiling. Our focus in this paper is on the dataset we use; we identify some interesting facts using the Weka tool that can be used for recommending items from the dataset. Our aim is to present a novel technique for user profiling in recommendation systems.

CR · Nov 25, 2014
Modified Apriori Approach for Evade Network Intrusion Detection System

Laxmi Lahoti, Chaitali Chandankhede, Debajyoti Mukhopadhyay

An Intrusion Detection System (IDS) is a software or hardware tool that repeatedly scans and monitors events taking place in a computer or a network. Signature-based Network Intrusion Detection Systems (NIDS) use a set of rules to detect hostile traffic in network segments or packets. They are so important in detecting malicious and anomalous behaviour on the network, such as known attacks, that hackers look for new techniques to go unseen. Sometimes a single failure at any layer will cause the NIDS to miss an attack. A technique that triggers a failure in that layer is known as an evasive technique. An evasion can be defined as any technique that modifies a visible attack into another form in order to avoid being detected. The proposed system detects attacks in progress on the network and also gives the actual categorization of the attacks. It has the advantage of a low false alarm rate and a high detection rate, which leads to a decrease in complexity and overhead on the system. The paper presents the evasion technique for a customized Apriori algorithm. It aims to build a new functional structure to evade NIDS, and this framework can be used to audit NIDS. The framework provides a proof of concept showing how to evade a self-built NIDS using two publicly available datasets.

IR · Nov 25, 2014
Efficient Fuzzy Search Engine with B-Tree Search Mechanism

Simran Bijral, Debajyoti Mukhopadhyay

Search engines play a vital role in day-to-day life on the internet. People use search engines to find content on the internet. Cloud computing is the computing concept in which data is stored and accessed with the help of a third-party server called the cloud. Data is not stored locally on our machines, and software and information are provided to the user on demand. Search queries are the most important part of searching data on the internet. A search query consists of one or more keywords. A search query is looked up in the database for an exact match, and traditional searchable schemes do not tolerate minor typos and format inconsistencies, which happen quite frequently. This drawback makes the existing techniques unsuitable, and they offer very low efficiency. In this paper, we formulate for the first time the problem of effective fuzzy search by introducing tree search methodologies. We explore the benefits of B-trees in the search mechanism and use them for efficient keyword search. We have taken security analysis strictly into consideration so as to obtain a secure and privacy-preserving system.
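The typo tolerance that exact-match schemes lack can be illustrated with a plain edit-distance filter. This is a generic Levenshtein sketch, not the B-tree mechanism the paper proposes; the function names, the `max_distance` threshold, and the keyword list are assumptions for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def fuzzy_lookup(query, keywords, max_distance=2):
    """Return the stored keywords within max_distance edits of the query."""
    return [word for word in keywords if edit_distance(query, word) <= max_distance]


print(fuzzy_lookup("serach", ["search", "cloud"]))
```

A linear scan like this compares the query against every keyword; an index structure such as the tree-based one described in the abstract would aim to avoid that cost.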

CR · Nov 25, 2014
Securing the Data in Clouds with Hyperelliptic Curve Cryptography

Debajyoti Mukhopadhyay, Ashay Shirwadkar, Pratik Gaikar et al.

In today's world, Cloud computing has attracted research communities because it provides services at reduced cost by virtualizing all the necessary resources. Even modern business architecture depends on Cloud computing. As it is an internet-based utility that provides various services over a network, it is prone to network-based attacks. Hence security is the most important concern in cloud computing. Cloud security concerns whether customers can fully rely on storing data in clouds. That is why Cloud security has attracted the attention of the research community. This paper discusses securing data in clouds by implementing key agreement, encryption, and signature verification/generation with hyperelliptic curve cryptography.

SE · Nov 25, 2014
A Tool to Automate the Sizing of Application Process for SOA based Platform

Debajyoti Mukhopadhyay, Juhi Jariwala, Payal Innani et al.

Service Oriented Architecture (SOA) is a loosely coupled architecture designed to tackle the problem of aligning business infrastructure with the needs of an organization. An SOA-based platform enables enterprises to develop applications in the form of independent services. To provide scalable service interactions, there is a need to maintain service performance and have a good sizing guideline for the underlying software platform. Sizing aids in finding the optimum resources required to configure and implement a system that would satisfy the requirements of the Business Process Integration being planned. A web-based Sizing Tool prototype is developed using Java Application Programming Interfaces to automate the process of sizing the applications deployed on an SOA platform; it not only scales the performance of the system but also predicts its business growth in the future.

CV · Nov 28, 2013
An Alternate Approach for Designing a Domain Specific Image Search Prototype Using Histogram

Sukanta Sinha, Rana Dattagupta, Debajyoti Mukhopadhyay

It is often said that a single image represents a thousand words. As a result, image search has become a very popular mechanism for Web searchers. In image search, the results produced by the search engine should be a set of images along with their Web page Uniform Resource Locators (URLs). A Web searcher can perform two types of image search: Text to Image and Image to Image. In Text to Image search, the search query is text. Based on the input text, the system generates a set of images along with their Web page URLs as output. In Image to Image search, on the other hand, the search query is an image, and based on this image the system generates a set of images along with their Web page URLs as output. Currently, the Text to Image search mechanism does not always return perfect results: it matches the text data and then displays the corresponding images as output, which is not always accurate. To resolve this problem, Web researchers have introduced the Image to Image search mechanism. In this paper, we propose an alternate approach to Image to Image search using histograms.
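One simple way to compare images by histogram is sketched below, using normalized grayscale histograms and histogram intersection as the similarity score. The abstract does not specify the histogram type or the similarity measure, so both are assumptions here, as are the function names and the toy pixel lists.

```python
def gray_histogram(pixels, bins=8):
    """Normalized histogram of 8-bit grayscale intensities (0-255)."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // 256] += 1
    return [c / len(pixels) for c in counts]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means the two histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def rank_by_similarity(query_pixels, candidates, bins=8):
    """Order candidate image names from most to least similar to the query."""
    q = gray_histogram(query_pixels, bins)
    scored = [(histogram_intersection(q, gray_histogram(px, bins)), name)
              for name, px in candidates.items()]
    return [name for _, name in sorted(scored, reverse=True)]


# Toy "images" given as flat lists of grayscale values (hypothetical data).
query = [10, 20, 200, 220]
library = {"dark": [5, 15, 25, 30], "mixed": [12, 22, 210, 230]}
print(rank_by_similarity(query, library))
```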

IR · Nov 28, 2013
A Hybrid Web Recommendation System based on the Improved Association Rule Mining Algorithm

Ujwala Wanaskar, Sheetal Vij, Debajyoti Mukhopadhyay

Given the growing interest in web recommendation systems that deliver customized data to their users, we started working on such a system. Recommendation systems are generally divided into two major categories: collaborative and content-based. Collaborative recommendation systems try to find users who share the tastes of a given user and recommend websites according to those users' preferences. Content-based recommendation systems, in contrast, try to recommend websites similar to those the user has liked. In recent research, an efficient technique based on an association rule mining algorithm was proposed to solve the problem of web page recommendation. Its major problem is that all web pages are given equal importance, whereas the importance of a page changes with how frequently it is visited and how much time the user spends on it. In addition, newly added web pages and pages not yet visited by users are not included in the recommendation set. To overcome this problem, we used the web usage log in adaptive association-rule-based web mining, where the association rules were applied to personalization. This algorithm was based purely on the Apriori data mining algorithm for generating the association rules. However, this method also suffers from some unavoidable drawbacks. In this paper we present and investigate a new approach based on a weighted Association Rule Mining algorithm and text mining. This improved algorithm adds semantic knowledge to the results, is more efficient, and hence gives better quality and performance than existing approaches.

IR · Nov 28, 2013
Searching and Establishment of S-P-O Relationships for Linked RDF Graphs : An Adaptive Approach

Ayan Chakraborty, Shiladitya Munshi, Debajyoti Mukhopadhyay

In the coming era of the semantic web, linked data analysis is a pressing issue for efficient searching and retrieval of information. One way of establishing this link is to implement the subject-predicate-object (S-P-O) relationship through a Set Theory approach, which was done in our previous work. For analyzing the inter-relationship between two RDF Graphs, RDF Schema (RDFS) should also be taken into account. In the present paper, an adaptive, combination-rule-based framework is proposed for establishing S-P-O relationships, and RDF Graph searching is reported. The identification of criteria for the inter-relationship of RDF Graphs thus opens up a new road in semantic search.

AI · Nov 26, 2013
A Framework for Semi-automated Web Service Composition in Semantic Web

Debajyoti Mukhopadhyay, Archana Chougule

The number of web services available on the Internet and their usage are increasing very fast. In many cases, one service is not enough to meet a business requirement, so composition of web services is carried out. Autonomous composition of web services to achieve new functionality is generating considerable attention in the semantic web domain. Development time and effort for new applications can be reduced with service composition. Various approaches to automated composition of web services are discussed in the literature. Web service composition using ontologies is one of the effective approaches. In this paper we demonstrate how ontology-based composition can be made faster for each customer. We propose a framework that provides precomposed web services to fulfil user requirements. We detail how ontology merging can be used for composition, which expedites the whole process. We discuss how the framework provides customer-specific ontology merging and a repository. We also elaborate on how the merging of ontologies is carried out.

DB · Nov 26, 2013
Reverse Proxy Framework using Sanitization Technique for Intrusion Prevention in Database

Vrushali Randhe, Archana Chougule, Debajyoti Mukhopadhyay

With the increasing importance of the internet in our day-to-day life, data security in web applications has become crucial. Ever-increasing online and real-time transaction services have led to a manifold rise in problems associated with database security. Attackers use illegal and unauthorized approaches to steal confidential information such as usernames, passwords, and other vital details. Hence real-time transactions require security against web-based attacks. SQL injection and cross-site scripting are the most common application-layer attacks. An SQL injection attacker passes SQL statements through a web application's input fields, URLs, or hidden parameters to gain access to the database or update it. The attacker exploits user-provided data in such a way that the user's input is handled as SQL code. Using this vulnerability, an attacker can execute SQL commands directly on the database. SQL injection attacks are among the most serious threats because they take user input and integrate it into an SQL query. Reverse Proxy is a technique used to sanitize user inputs that may turn into a database attack. In this technique, a data redirector program redirects the user's input to the proxy server before it is sent to the application server. At the proxy server, a data cleaning algorithm is triggered using a sanitizing application. In this framework we include detection and sanitization of the tainted information being sent to the database and develop a new prototype.
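The vulnerability described above, user input being handled as SQL code, can be reproduced in a few lines. The sketch below contrasts string concatenation with parameter binding using Python's standard `sqlite3` module. Note that parameter binding is a different defence from the proxy-side sanitization the paper proposes; the table, credentials, and function names are hypothetical.

```python
import sqlite3


def find_user_unsafe(conn, username, password):
    """Vulnerable: user input is concatenated directly into the SQL text."""
    sql = ("SELECT id FROM users WHERE username = '" + username +
           "' AND password = '" + password + "'")
    return conn.execute(sql).fetchone()


def find_user_safe(conn, username, password):
    """Input is bound as parameters, so it is treated as data, never as SQL."""
    sql = "SELECT id FROM users WHERE username = ? AND password = ?"
    return conn.execute(sql, (username, password)).fetchone()


# In-memory demo database with one hypothetical account.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, password TEXT)")
conn.execute("INSERT INTO users (username, password) VALUES (?, ?)", ("alice", "s3cret"))

payload = "' OR '1'='1"  # classic injection string
print("unsafe:", find_user_unsafe(conn, "alice", payload))
print("safe:  ", find_user_safe(conn, "alice", payload))
```

With the concatenated query, the payload turns the WHERE clause into a condition that is always true, so a row comes back without the correct password; the bound version compares the payload as a literal string and finds nothing.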

IR · Nov 25, 2013
A Model Approach to Build Basic Ontology

Debajyoti Mukhopadhyay, Sajeeda Shikalgar

As today's world grows with technology, it also seems to become smaller with the World Wide Web. With the use of the Internet, more and more information can be searched on the web. When users fire a query, they want relevant results. In general, search engines rank web pages in offline mode, that is, after the web pages have been retrieved and stored in the database. But most of the time this method does not provide relevant results, as most search engines use ranking algorithms such as PageRank, HITS, SALSA, and Hilltop, and these algorithms do not always provide results based on the semantic web. So the concept of Ontology has been introduced in search engines to get more meaningful and relevant results with respect to the user's query. Ontologies are used to capture knowledge about some domain of interest. An ontology describes the concepts in the domain and the relationships that hold between those concepts. Different ontology languages provide different facilities. The most recent development in standard ontology languages is OWL (Web Ontology Language) from the World Wide Web Consortium. OWL makes it possible to describe concepts to their full extent and enables search engines to provide accurate results to the user.

IR · Nov 25, 2013
Web-page Indexing based on the Prioritize Ontology Terms

Sukanta Sinha, Rana Dattagupta, Debajyoti Mukhopadhyay

In this world, globalization has become a basic and most popular human trend. To globalize information, people publish documents on the internet. As a result, the volume of information on the internet has become huge. To handle that volume, Web searchers use search engines. The Web-page indexing mechanism of a search engine plays a big role in retrieving Web search results faster from the huge volume of Web resources. Web researchers have introduced various Web-page indexing mechanisms to retrieve Web pages from a Web-page repository. In this paper, we illustrate a new approach to the design and development of Web-page indexing. The proposed mechanism is applied to domain-specific Web pages, and we identify the Web-page domain based on an Ontology. In our approach, we first prioritize the Ontology terms that exist in the Web-page content and then apply our own indexing mechanism to index that Web page. The main advantage of storing an index is to optimize speed and performance when finding relevant documents in the domain-specific search engine's storage area for a user-given search query.

IR · Nov 25, 2013
A Decision Tree Approach to Classify Web Services using Quality Parameters

Shilpa Sonawani, Debajyoti Mukhopadhyay

With the increase in the number of web services, many services offering the same functionality are available on the internet, making it difficult to choose the best one that fulfils all of a user's requirements. This problem can be solved by considering the quality of web services to distinguish functionally similar ones. Nine different quality parameters are considered. Web services can be classified and ranked using decision trees, since they do not require a long training period and can be easily interpreted. Various available decision tree and rule-based approaches are applied and tested to find the optimal decision method for correctly classifying functionally similar web services according to their quality parameters.

IR · Nov 25, 2013
Experience of Developing a Meta-Semantic Search Engine

Debajyoti Mukhopadhyay, Manoj Sharma, Gajanan Joshi et al.

Today's web search scenario, which is mainly keyword based, creates the need for the effective and meaningful search provided by the Semantic Web. Existing search engines often fail to provide relevant answers to users' queries because they depend on the simple data available in web pages. Semantic search engines, on the other hand, provide efficient and relevant results, as the semantic web manages information with well-defined meaning using ontologies. A meta-search engine is a search tool that forwards a user's query to several existing search engines and provides combined results using its own page ranking algorithm. SemanTelli is a meta-semantic search engine that fetches results from different semantic search engines such as Hakia, DuckDuckGo, and SenseBot through intelligent agents. This paper proposes an enhancement of SemanTelli with an improved page ranking algorithm based on snippet analysis and support for image and news search.

IR · May 4, 2013
Intelligent Agent Based Semantic Web in Cloud Computing Environment

Debajyoti Mukhopadhyay, Manoj Sharma, Gajanan Joshi et al.

Considering today's web scenario, there is a need for effective and meaningful search over the web, which the Semantic Web provides. Existing search engines are keyword based. They are weak at answering intelligent queries from the user because their results depend on the information available in web pages. Semantic search engines, in contrast, provide efficient and relevant results, as the semantic web is an extension of the current web in which information is given well-defined meaning. MetaCrawler is a search tool that uses several existing search engines and provides combined results using its own page ranking algorithm. This paper proposes the development of a meta-semantic search engine called SemanTelli, which works within the cloud. SemanTelli fetches results from different semantic search engines such as Hakia, DuckDuckGo, and SenseBot with the help of intelligent agents that eliminate the limitations of existing search engines.

CR · Mar 28, 2013
Enhanced Security for Cloud Storage using File Encryption

Debajyoti Mukhopadhyay, Gitesh Sonawane, Parth Sarthi Gupta et al.

Cloud computing is a term coined for a network that offers incredible processing power, a wide array of storage space, and unbelievable speed of computation. Social media channels, corporate structures, and individual consumers are all switching to the magnificent world of cloud computing. The flip side of this coin is that cloud storage brings the security issues of confidentiality, data integrity, and data availability. Since the cloud is a mere collection of tangible supercomputers spread across the world, authentication and authorization for data access are more than a necessity. Our work attempts to overcome these security threats. The proposed methodology suggests encrypting the files to be uploaded to the cloud. The integrity and confidentiality of the data uploaded by the user are doubly ensured, both by encrypting it and by granting access to the data only on successful authentication.

DB · Jul 27, 2012
Query Optimization Over Web Services Using A Mixed Approach

Debajyoti Mukhopadhyay, Dhaval Chandarana, Rutvi Dave et al.

A Web Service Management System (WSMS) can be regarded as a consistent and secure way of managing web services. Web services have become a quintessential part of the web world, managing and sharing the resources of the businesses they are associated with. In this paper, we focus on the query optimization aspect of handling "natural language" queries submitted to the WSMS. The map-select-composite operations are used to select specific web services. The main outcome of our research is an algorithm that uses both cost-based and heuristic-based approaches for query optimization. A query plan is formed after cost-based evaluation using a Greedy algorithm. The heuristic-based approach further optimizes the evaluation plan. This scheme not only guarantees an optimal solution, one with minimum deviation from the ideal solution, but also saves time that would otherwise be spent generating various query plans using many mathematical models and then evaluating each one.

IR · Jul 16, 2012
Identify Web-page Content meaning using Knowledge based System for Dual Meaning Words

Sukanta Sinha, Rana Dattagupta, Debajyoti Mukhopadhyay

The meaning of Web-page content plays a big role when a search engine produces search results. In most cases the Web-page meaning is stored in the title or meta-tag area, but those meanings do not always match the Web-page content. To overcome this situation, we need to go through the Web-page content to identify its meaning. When the Web-page content contains dual-meaning words, it is really difficult to identify the meaning of the Web page. In this paper, we introduce a new design and development mechanism for identifying the meaning of Web-page content that contains dual-meaning words.

IR · Jun 25, 2012
Web-page Prediction for Domain Specific Web-search using Boolean Bit Mask

Sukanta Sinha, Rana Duttagupta, Debajyoti Mukhopadhyay

A search engine is a Web-page retrieval tool. Nowadays, Web searchers make good use of their time by relying on an efficient search engine. To improve the performance of the search engine, we introduce a unique mechanism that gives Web searchers more prominent search results. In this paper, we discuss a domain-specific Web search prototype that generates the predicted Web-page list for a user-given search string using a Boolean bit mask.

IR · Jun 25, 2012
A Survey on Web Service Discovery Approaches

Debajyoti Mukhopadhyay, Archana Chougule

Web services play an important role in e-business and e-commerce applications. As web service applications are interoperable and can work on any platform, large-scale distributed systems can be developed easily using web services. Finding the most suitable web service from a vast collection of web services is crucial for the successful execution of applications. The traditional web service discovery approach is a keyword-based search using UDDI. Various other approaches for discovering web services are also available. Some discovery approaches are syntax based, while others are semantic based. Having a system for service discovery that can work automatically is also a concern of service discovery approaches. As these approaches differ, one solution may be better than another depending on the requirements. Selecting a specific service discovery system is a hard task. In this paper, we give an overview of the different approaches to web service discovery described in the literature. We present a survey of how these approaches differ from each other.