LG · Mar 4, 2022
Contrastive Graph Convolutional Networks for Hardware Trojan Detection in Third Party IP Cores
Nikhil Muralidhar, Abdullah Zubair, Nathanael Weidler et al.
The availability of wide-ranging third-party intellectual property (3PIP) cores enables integrated circuit (IC) designers to focus on designing high-level features in ASICs/SoCs. The massive proliferation of ICs brings with it an increased number of bad actors seeking to exploit those circuits for various nefarious reasons. This is not surprising as integrated circuits affect every aspect of society. Thus, malicious logic (Hardware Trojans, HT) being surreptitiously injected by untrusted vendors into 3PIP cores used in IC design is an ever-present threat. In this paper, we explore methods for identification of trigger-based HT in designs containing synthesizable IP cores without a golden model. Specifically, we develop methods to detect hardware trojans by detecting triggers embedded in ICs purely based on netlists acquired from the vendor. We propose GATE-Net, a deep learning model based on graph-convolutional networks (GCN) trained using supervised contrastive learning, for flagging designs containing randomly-inserted triggers using only the corresponding netlist. Our proposed architecture achieves significant improvements over state-of-the-art learning models, yielding an average 46.99% improvement in detection performance for combinatorial triggers and 21.91% improvement for sequential triggers across a variety of circuit types. Through rigorous experimentation and qualitative and quantitative performance evaluations, we demonstrate the effectiveness of GATE-Net and of its supervised contrastive training for HT detection.
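The abstract above does not spell out the loss used to train GATE-Net. As a rough illustration of what "supervised contrastive learning" usually means, here is a minimal sketch of the standard supervised contrastive (SupCon) objective on plain Python lists. The function name `supcon_loss`, the temperature value, and the toy embeddings are assumptions for illustration, not the paper's implementation.

```python
import math


def supcon_loss(embeddings, labels, temperature=0.1):
    """Standard SupCon loss over a batch (assumed form, not GATE-Net's exact loss).

    embeddings: list of non-zero vectors (lists of floats), one per sample.
    labels: list of class labels, e.g. 1 = trigger present, 0 = clean.
    """
    def normalize(v):
        norm = math.sqrt(sum(c * c for c in v))
        return [c / norm for c in v]

    z = [normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        # Positives are the other samples sharing the anchor's label.
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other sample.
        sims = {a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
                for a in range(n) if a != i}
        log_denom = math.log(sum(math.exp(s) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch: two tight clusters. Labels that match the clusters give a low loss.
toy = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
aligned = supcon_loss(toy, [0, 0, 1, 1])
misaligned = supcon_loss(toy, [0, 1, 0, 1])
```

The loss falls when same-label embeddings are close and different-label embeddings are far apart, which is the property a contrastively trained detector relies on.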
CR · Oct 1, 2022
Detecting Irregular Network Activity with Adversarial Learning and Expert Feedback
Gopikrishna Rathinavel, Nikhil Muralidhar, Timothy O'Shea et al.
Anomaly detection is a ubiquitous and challenging task relevant across many disciplines. With the vital role communication networks play in our daily lives, the security of these networks is imperative for smooth functioning of society. To this end, we propose a novel self-supervised deep learning framework CAAD for anomaly detection in wireless communication systems. Specifically, CAAD employs contrastive learning in an adversarial setup to learn effective representations of normal and anomalous behavior in wireless networks. We conduct rigorous performance comparisons of CAAD with several state-of-the-art anomaly detection techniques and verify that CAAD yields a mean performance improvement of 92.84%. Additionally, we also augment CAAD enabling it to systematically incorporate expert feedback through a novel contrastive learning feedback loop to improve the learned representations and thereby reduce prediction uncertainty (CAAD-EF). We view CAAD-EF as a novel, holistic and widely applicable solution to anomaly detection.
CL · Nov 3, 2022
Overcoming Barriers to Skill Injection in Language Modeling: Case Study in Arithmetic
Mandar Sharma, Nikhil Muralidhar, Naren Ramakrishnan
Through their transfer learning abilities, highly-parameterized large pre-trained language models have dominated the NLP landscape for a multitude of downstream language tasks. Though linguistically proficient, the inability of these models to incorporate the learning of non-linguistic entities (numerals and arithmetic reasoning) limits their usage for tasks that require numeric comprehension or strict mathematical reasoning. However, as we illustrate in this paper, building a general purpose language model that also happens to be proficient in mathematical reasoning is not as straight-forward as training it on a numeric dataset. In this work, we develop a novel framework that enables language models to be mathematically proficient while retaining their linguistic prowess. Specifically, we offer information-theoretic interventions to overcome the catastrophic forgetting of linguistic skills that occurs while injecting non-linguistic skills into language models.
LG · May 6, 2025 · Code
Plug-and-Play AMC: Context Is King in Training-Free, Open-Set Modulation with LLMs
Mohammad Rostami, Atik Faysal, Reihaneh Gh. Roshan et al.
Automatic Modulation Classification (AMC) is critical for efficient spectrum management and robust wireless communications. However, AMC remains challenging due to the complex interplay of signal interference and noise. In this work, we propose an innovative framework that integrates traditional signal processing techniques with Large-Language Models (LLMs) to address AMC. Our approach leverages higher-order statistics and cumulant estimation to convert quantitative signal features into structured natural language prompts. By incorporating exemplar contexts into these prompts, our method exploits the LLM's inherent familiarity with classical signal processing, enabling effective one-shot classification without additional training or preprocessing (e.g., denoising). Experimental evaluations on synthetically generated datasets, spanning both noiseless and noisy conditions, demonstrate that our framework achieves competitive performance across diverse modulation schemes and Signal-to-Noise Ratios (SNRs). Moreover, our approach paves the way for robust foundation models in wireless communications across varying channel conditions, significantly reducing the expense associated with developing channel-specific models. This work lays the foundation for scalable, interpretable, and versatile signal classification systems in next-generation wireless networks. The source code is available at https://github.com/RU-SIT/context-is-king
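To make the "cumulants to prompt" idea concrete, the sketch below estimates four textbook higher-order cumulants (C20, C21, C40, C42) of a complex baseband signal and renders them as a one-shot text prompt. The cumulant formulas are the standard zero-mean definitions. The prompt wording and the helper names `cumulants`, `describe`, and `features_to_prompt` are hypothetical and are not taken from the paper's repository.

```python
import cmath
import math


def cumulants(samples):
    """Estimate C20, C21, C40, C42 of a complex signal (mean removed first)."""
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]
    m20 = sum(v * v for v in x) / n          # E[x^2]
    m21 = sum(abs(v) ** 2 for v in x) / n    # E[|x|^2]
    m40 = sum(v ** 4 for v in x) / n         # E[x^4]
    m42 = sum(abs(v) ** 4 for v in x) / n    # E[|x|^4]
    return {
        "C20": m20,
        "C21": m21,
        "C40": m40 - 3 * m20 ** 2,
        "C42": m42 - abs(m20) ** 2 - 2 * m21 ** 2,
    }


def describe(stats):
    """Render cumulant magnitudes as a short text fragment."""
    return ", ".join(f"|{k}|={abs(v):.3f}" for k, v in stats.items())


def features_to_prompt(query_stats, exemplars):
    """Build a prompt with labeled exemplars (hypothetical wording)."""
    lines = ["Classify the modulation from its cumulant magnitudes."]
    for label, stats in exemplars:
        lines.append(f"Example: {describe(stats)} -> {label}")
    lines.append(f"Query: {describe(query_stats)} -> ?")
    return "\n".join(lines)


# Noiseless unit-power BPSK and QPSK symbols as toy inputs.
bpsk = [1 + 0j, -1 + 0j] * 50
qpsk = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)] * 25
prompt = features_to_prompt(cumulants(qpsk), [("BPSK", cumulants(bpsk))])
```

For these ideal constellations the estimates match the theoretical values (C40 = C42 = -2 for BPSK, C40 = C42 = -1 for QPSK), which is what makes the features discriminative.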
CL · Jun 9, 2025 · Code
Can AI Validate Science? Benchmarking LLMs for Accurate Scientific Claim → Evidence Reasoning
Shashidhar Reddy Javaji, Yupeng Cao, Haohang Li et al.
Large language models (LLMs) are increasingly being used for complex research tasks such as literature review, idea generation, and scientific paper analysis, yet their ability to truly understand and process the intricate relationships within complex research papers, such as the logical links between claims and supporting evidence, remains largely unexplored. In this study, we present CLAIM-BENCH, a comprehensive benchmark for evaluating LLMs' capabilities in scientific claim-evidence extraction and validation, a task that reflects deeper comprehension of scientific argumentation. We systematically compare three approaches inspired by divide-and-conquer strategies across six diverse LLMs, highlighting model-specific strengths and weaknesses in scientific comprehension. Through evaluation involving over 300 claim-evidence pairs across multiple research domains, we reveal significant limitations in LLMs' ability to process complex scientific content. Our results demonstrate that closed-source models like GPT-4 and Claude consistently outperform open-source counterparts in precision and recall across claim-evidence identification tasks. Furthermore, strategically designed three-pass and one-by-one prompting approaches significantly improve LLMs' abilities to accurately link dispersed evidence with claims, although this comes at increased computational cost. CLAIM-BENCH sets a new standard for evaluating scientific comprehension in LLMs, offering both a diagnostic tool and a path forward for building systems capable of deeper, more reliable reasoning across full-length papers.
LG · Feb 19, 2024 · Code
Reinforcement Learning as a Parsimonious Alternative to Prediction Cascades: A Case Study on Image Segmentation
Bharat Srikishan, Anika Tabassum, Srikanth Allu et al.
Deep learning architectures have achieved state-of-the-art (SOTA) performance on computer vision tasks such as object detection and image segmentation. This may be attributed to the use of over-parameterized, monolithic deep learning architectures executed on large datasets. Although such architectures lead to increased accuracy, this is usually accompanied by a large increase in computation and memory requirements during inference. While this is a non-issue in traditional machine learning pipelines, the recent confluence of machine learning and fields like the Internet of Things has rendered such large architectures infeasible for execution in low-resource settings. In such settings, previous efforts have proposed decision cascades where inputs are passed through models of increasing complexity until desired performance is achieved. However, we argue that cascaded prediction leads to increased computational cost due to wasteful intermediate computations. To address this, we propose PaSeR (Parsimonious Segmentation with Reinforcement Learning), a non-cascading, cost-aware learning pipeline, as an alternative to cascaded architectures. Through experimental evaluation on real-world and standard datasets, we demonstrate that PaSeR achieves better accuracy while minimizing computational cost relative to cascaded models. Further, we introduce a new metric, IoU/GigaFlop, to evaluate the balance between cost and performance. On the real-world task of battery material phase segmentation, PaSeR yields a minimum performance improvement of 174% on the IoU/GigaFlop metric with respect to baselines. We also demonstrate PaSeR's adaptability to complementary models trained on a noisy MNIST dataset, where it achieved a minimum performance improvement on IoU/GigaFlop of 13.4% over SOTA models. Code and data are available at https://github.com/scailab/paser.
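The abstract names the IoU/GigaFlop metric without defining it. A minimal sketch, assuming it is simply the intersection-over-union of a predicted mask divided by the inference cost in billions of floating-point operations, might look as follows. The function names and the convention for empty masks are assumptions.

```python
def iou(pred, target):
    """Intersection-over-union of two flat binary masks of equal length."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    # Convention (assumed): two empty masks count as a perfect match.
    return inter / union if union else 1.0


def iou_per_gigaflop(pred, target, flops):
    """Segmentation quality per unit of compute; flops is a raw FLOP count."""
    return iou(pred, target) / (flops / 1e9)


# A prediction with IoU 0.5 that cost 2 GFLOPs scores 0.25 IoU/GigaFlop.
score = iou_per_gigaflop([1, 1, 0, 0], [1, 0, 0, 0], flops=2e9)
```

Under this reading, a model that reaches the same IoU with half the compute doubles its score, which is the cost-performance balance the metric is meant to capture.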
CV · Jan 27 · Code
Semi-Supervised Masked Autoencoders: Unlocking Vision Transformer Potential with Limited Data
Atik Faysal, Mohammad Rostami, Reihaneh Gh. Roshan et al.
We address the challenge of training Vision Transformers (ViTs) when labeled data is scarce but unlabeled data is abundant. We propose Semi-Supervised Masked Autoencoder (SSMAE), a framework that jointly optimizes masked image reconstruction and classification using both unlabeled and labeled samples with dynamically selected pseudo-labels. SSMAE introduces a validation-driven gating mechanism that activates pseudo-labeling only after the model achieves reliable, high-confidence predictions that are consistent across both weakly and strongly augmented views of the same image, reducing confirmation bias. On CIFAR-10 and CIFAR-100, SSMAE consistently outperforms supervised ViT and fine-tuned MAE, with the largest gains in low-label regimes (+9.24% over ViT on CIFAR-10 with 10% labels). Our results demonstrate that when pseudo-labels are introduced is as important as how they are generated for data-efficient transformer training. Codes are available at https://github.com/atik666/ssmae.
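The gating idea described above can be sketched in a few lines. This is only an illustration of the stated conditions (a validation gate, high confidence, and agreement between weak and strong views). The threshold values and the function name `select_pseudo_labels` are hypothetical and not taken from the SSMAE code.

```python
def select_pseudo_labels(weak_probs, strong_probs, val_accuracy,
                         val_gate=0.7, conf_threshold=0.95):
    """Return (sample_index, pseudo_label) pairs that pass the gate.

    weak_probs / strong_probs: per-sample class-probability lists predicted
    for the weakly and strongly augmented view of the same unlabeled image.
    val_gate and conf_threshold are illustrative values, not the paper's.
    """
    # Gate: no pseudo-labels at all until the model is reliable on validation.
    if val_accuracy < val_gate:
        return []
    selected = []
    for idx, (pw, ps) in enumerate(zip(weak_probs, strong_probs)):
        label_w = max(range(len(pw)), key=pw.__getitem__)
        label_s = max(range(len(ps)), key=ps.__getitem__)
        # Keep only predictions that agree across views and are confident in both.
        if label_w == label_s and min(pw[label_w], ps[label_s]) >= conf_threshold:
            selected.append((idx, label_w))
    return selected


weak = [[0.97, 0.03], [0.60, 0.40], [0.02, 0.98]]
strong = [[0.96, 0.04], [0.30, 0.70], [0.50, 0.50]]
early = select_pseudo_labels(weak, strong, val_accuracy=0.5)   # gate closed
later = select_pseudo_labels(weak, strong, val_accuracy=0.8)   # gate open
```

Delaying pseudo-labeling until the gate opens is what targets confirmation bias: early, unreliable predictions never enter the training signal.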
LG · Mar 13, 2025 · Code
Model-Agnostic Knowledge Guided Correction for Improved Neural Surrogate Rollout
Bharat Srikishan, Daniel O'Malley, Mohamed Mehana et al.
Modeling the evolution of physical systems is critical to many applications in science and engineering. As the evolution of these systems is governed by partial differential equations (PDEs), there are a number of computational simulations which resolve these systems with high accuracy. However, as these simulations incur high computational costs, they are infeasible for large-scale analysis. Popular alternatives to simulators are neural network surrogates, which are trained in a data-driven manner and are much more computationally efficient. However, these surrogate models suffer from high rollout error when used autoregressively, especially when confronted with training data paucity. Existing work proposes to improve surrogate rollout error by either including physical loss terms directly in the optimization of the model or incorporating computational simulators as 'differentiable layers' in the neural network. Both of these approaches have their challenges, with physical loss functions suffering from slow convergence for stiff PDEs and simulator layers requiring gradients which are not always available, especially in legacy simulators. We propose the Hybrid PDE Predictor with Reinforcement Learning (HyPER) model: a model-agnostic, RL-based, cost-aware model which combines a neural surrogate, an RL decision model, and a physics simulator (with or without gradients) to reduce surrogate rollout error significantly. In addition to reducing in-distribution rollout error by 47%-78%, HyPER learns an intelligent policy that is adaptable to changing physical conditions and resistant to noise corruption. Code available at https://github.com/scailab/HyPER.
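The control flow implied by combining a surrogate, a decision model, and a simulator can be sketched as an autoregressive loop in which a policy decides, step by step, whether to pay for a simulator call. Everything here is a toy stand-in: `hybrid_rollout`, the scalar state, and the fixed every-other-step policy are assumptions for illustration, whereas the paper's policy is learned with RL.

```python
def hybrid_rollout(state, steps, surrogate, simulator, policy):
    """Roll a system forward, choosing the simulator when the policy says so.

    surrogate / simulator: callables mapping a state to the next state.
    policy: callable (state, step) -> bool; True means "call the simulator".
    Returns the trajectory and the number of (expensive) simulator calls.
    """
    trajectory = [state]
    simulator_calls = 0
    for t in range(steps):
        if policy(state, t):
            state = simulator(state)   # accurate but costly
            simulator_calls += 1
        else:
            state = surrogate(state)   # cheap but accumulates error
        trajectory.append(state)
    return trajectory, simulator_calls


# Toy dynamics: the true system halves the state; the surrogate is biased.
trajectory, calls = hybrid_rollout(
    1.0, 4,
    surrogate=lambda x: 0.5 * x + 0.1,
    simulator=lambda x: 0.5 * x,
    policy=lambda s, t: t % 2 == 1,
)
```

Because the simulator is only ever called as a black box, the loop needs no simulator gradients, which matches the "with or without gradients" claim in the abstract.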
NI · Jan 30, 2024
Large Multi-Modal Models (LMMs) as Universal Foundation Models for AI-Native Wireless Systems
Shengzhe Xu, Christo Kurisummoottil Thomas, Omar Hashash et al.
Large language models (LLMs) and foundation models have been recently touted as a game-changer for 6G systems. However, recent efforts on LLMs for wireless networks are limited to a direct application of existing language models that were designed for natural language processing (NLP) applications. To address this challenge and create wireless-centric foundation models, this paper presents a comprehensive vision on how to design universal foundation models that are tailored towards the deployment of artificial intelligence (AI)-native networks. Diverging from NLP-based foundation models, the proposed framework promotes the design of large multi-modal models (LMMs) fostered by three key capabilities: 1) processing of multi-modal sensing data, 2) grounding of physical symbol representations in real-world wireless systems using causal reasoning and retrieval-augmented generation (RAG), and 3) enabling instructibility from the wireless environment feedback to facilitate dynamic network adaptation thanks to logical and mathematical reasoning facilitated by neuro-symbolic AI. In essence, these properties enable the proposed LMM framework to build universal capabilities that cater to various cross-layer networking tasks and alignment of intents across different domains. Preliminary results from experimental evaluation demonstrate the efficacy of grounding using RAG in LMMs, and showcase the alignment of LMMs with wireless system designs. Furthermore, the enhanced rationale exhibited in the responses to mathematical questions by LMMs, compared to vanilla LLMs, demonstrates the logical and mathematical reasoning capabilities inherent in LMMs. Building on those results, we present a sequel of open questions and challenges for LMMs. We then conclude with a set of recommendations that ignite the path towards LMM-empowered AI-native systems.
SP · Oct 30, 2024
NMformer: A Transformer for Noisy Modulation Classification in Wireless Communication
Atik Faysal, Mohammad Rostami, Reihaneh Gh. Roshan et al.
Modulation classification is a very challenging task since the signals intertwine with various ambient noises. Methods are required that can classify them without adding extra steps like denoising, which introduces computational complexity. In this study, we propose a vision transformer (ViT) based model named NMformer to predict the channel modulation images with different noise levels in wireless communication. Since ViTs are most effective for RGB images, we generated constellation diagrams from the modulated signals. The diagrams provide the information from the signals in a 2-D representation form. We trained NMformer on 106,800 modulation images to build the base classifier and only used 3,000 images to fine-tune for specific tasks. Our proposed model has two different kinds of prediction setups: in-distribution and out-of-distribution. Our model achieves 4.67% higher accuracy than the base classifier when fine-tuned and tested on high signal-to-noise ratio (SNR) in-distribution classes. Moreover, the fine-tuned low SNR task achieves a higher accuracy than the base classifier. The fine-tuned classifier is also much more effective than the base classifier, achieving higher accuracy even on unseen data from out-of-distribution classes. Extensive experiments show the effectiveness of NMformer for a wide range of SNRs.
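A constellation diagram is a 2-D view of a signal's in-phase (real) and quadrature (imaginary) components. As a rough sketch of how such an image can be rasterized, the function below bins complex samples into a small count grid. The paper works with RGB images. This single-channel grid, the function name, and the `bins`/`limit` parameters are simplifying assumptions.

```python
import math


def constellation_image(samples, bins=8, limit=1.5):
    """Rasterize complex I/Q samples into a bins x bins count grid.

    Columns follow the in-phase axis (left = negative), rows follow the
    quadrature axis (top = positive). Samples outside [-limit, limit) on
    either axis are dropped.
    """
    grid = [[0] * bins for _ in range(bins)]
    scale = bins / (2 * limit)
    for s in samples:
        col = math.floor((s.real + limit) * scale)
        row = math.floor((limit - s.imag) * scale)
        if 0 <= row < bins and 0 <= col < bins:
            grid[row][col] += 1
    return grid


# Noiseless QPSK: four points at (+-1 +-1j)/sqrt(2), ten samples each.
a = 1 / math.sqrt(2)
qpsk = [complex(a, a), complex(-a, a), complex(-a, -a), complex(a, -a)] * 10
image = constellation_image(qpsk, bins=4, limit=1.0)
```

With noise added, each of the four clusters spreads over neighboring cells, and it is this texture that an image classifier can learn to recognize at different SNRs.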
LG · Jan 20, 2025
DenoMAE: A Multimodal Autoencoder for Denoising Modulation Signals
Atik Faysal, Taha Boushine, Mohammad Rostami et al.
We propose Denoising Masked Autoencoder (Deno-MAE), a novel multimodal autoencoder framework for denoising modulation signals during pretraining. DenoMAE extends the concept of masked autoencoders by incorporating multiple input modalities, including noise as an explicit modality, to enhance cross-modal learning and improve denoising performance. The network is pre-trained using unlabeled noisy modulation signals and constellation diagrams, effectively learning to reconstruct their equivalent noiseless signals and diagrams. Deno-MAE achieves state-of-the-art accuracy in automatic modulation classification tasks with significantly fewer training samples, demonstrating a 10% reduction in unlabeled pretraining data and a 3% reduction in labeled fine-tuning data compared to existing approaches. Moreover, our model exhibits robust performance across varying signal-to-noise ratios (SNRs) and supports extrapolation on unseen lower SNRs. The results indicate that DenoMAE is an efficient, flexible, and data-efficient solution for denoising and classifying modulation signals in challenging noise-intensive environments.
LG · Feb 25, 2025
DenoMAE2.0: Improving Denoising Masked Autoencoders by Classifying Local Patches
Atik Faysal, Mohammad Rostami, Taha Boushine et al.
We introduce DenoMAE2.0, an enhanced denoising masked autoencoder that integrates a local patch classification objective alongside traditional reconstruction loss to improve representation learning and robustness. Unlike conventional Masked Autoencoders (MAE), which focus solely on reconstructing missing inputs, DenoMAE2.0 introduces position-aware classification of unmasked patches, enabling the model to capture fine-grained local features while maintaining global coherence. This dual-objective approach is particularly beneficial in semi-supervised learning for wireless communication, where high noise levels and data scarcity pose significant challenges. We conduct extensive experiments on modulation signal classification across a wide range of signal-to-noise ratios (SNRs), from extremely low to moderately high conditions and in a low data regime. Our results demonstrate that DenoMAE2.0 surpasses its predecessor, DenoMAE, and other baselines in both denoising quality and downstream classification accuracy. For constellation diagram classification of modulation signals, DenoMAE2.0 achieves a 1.1% improvement over DenoMAE on our dataset and significant accuracy gains of 11.83% and 16.55% over DenoMAE on the RadioML benchmark.
CL · Apr 2, 2024
Laying Anchors: Semantically Priming Numerals in Language Modeling
Mandar Sharma, Rutuja Murlidhar Taware, Pravesh Koirala et al.
Off-the-shelf pre-trained language models have become the de facto standard in NLP pipelines for a multitude of downstream tasks. However, the inability of these models to properly encode numerals limits their performance on tasks requiring numeric comprehension. We introduce strategies to semantically prime numerals in any corpus by generating anchors governed by the distribution of numerals in said corpus, thereby enabling mathematically grounded representations of these numeral tokens. We establish the superiority of our proposed techniques through evaluation on a range of numeracy tasks for both in-domain (seen) and out-domain (unseen) numerals. Further, we expand our empirical evaluations to numerals ranging from 1 to 10 billion, a significantly broader range compared to previous studies of the same nature, and we demonstrate significant improvements in the mathematical grounding of our learned embeddings.
LG · Sep 30, 2025
DiSC-AMC: Token- and Parameter-Efficient Discretized Statistics In-Context Automatic Modulation Classification
Mohammad Rostami, Atik Faysal, Reihaneh Gh. Roshan et al.
Large Language Models (LLMs) can perform Automatic Modulation Classification (AMC) in an open-set manner without LLM fine-tuning when equipped with carefully designed in-context prompts (Rostami et al., 2025). Building on this prior work, we target the practical bottlenecks of long prompt contexts and large model sizes that impede in-the-loop deployment. We present Discretized Statistics in-Context Automatic Modulation Classification (DiSC-AMC), a token- and parameter-efficient variant that: (i) discretizes higher-order statistics and cumulants into compact symbolic tokens, (ii) prunes the exemplar list via a lightweight k-top neural prefilter and filters misleading/low-impact features using rationales extracted from prior LLM responses, and (iii) enforces label-only predictions through a calibrated prompt template. Together, these changes reduce both input/output tokens and the model parameter footprint by more than half while maintaining competitive accuracy. On synthetic AMC with ten modulation types under noise, a 7B DeepSeek-R1-Distill-Qwen baseline achieves 5.2% accuracy, whereas our system, using an approximately 5B-parameter Gemini-2.5-Flash (Comanici et al., 2025) model, attains 45.5% accuracy. These results demonstrate that careful discretization and context selection can cut inference cost by over 2x while preserving the advantages of prompt-based AMC and enabling practical in-the-loop use.
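Step (i) above, turning continuous statistics into compact symbolic tokens, can be illustrated with a simple quantizer. The abstract does not give the binning scheme, so the bin edges, the letter alphabet, and the function names below are assumptions for illustration only.

```python
import bisect
import string


def discretize(value, edges):
    """Map a real value to a letter: 'A' below the first edge, 'B' next, ...

    edges must be sorted ascending and have fewer than 26 entries.
    """
    return string.ascii_uppercase[bisect.bisect_right(edges, value)]


def encode_features(features, edges):
    """Encode named statistics as short 'name:symbol' tokens."""
    return " ".join(f"{name}:{discretize(v, edges)}" for name, v in features.items())


# Hypothetical bin edges shared by all features.
EDGES = [-1.5, -0.5, 0.5, 1.5]
encoded = encode_features({"C40": -2.0, "C42": -1.0, "C20": 0.0}, EDGES)
```

Replacing a multi-digit decimal with a single symbol shortens every exemplar in the prompt, which is one plausible way such a scheme reduces input tokens.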
LG · May 24, 2025
The Prompt is Mightier than the Example
Shengzhe Xu, Nikhil Muralidhar, Naren Ramakrishnan
Numerous recent prompt optimization approaches, like chain-of-thought, have been demonstrated to significantly improve the quality of content generated by large language models (LLMs). In-context learning (ICL), a recent paradigm where a few representative examples guide content generation, has also led to strong improvements in the quality of LLM-generated content. This idea has been applied to great effect in synthetic tabular data generation, where LLMs, through effective use of ICL and prompt optimization, can generate data that approximate samples from complex, heterogeneous distributions based on representative examples. However, ensuring high-fidelity synthetic data often requires a very large number of ICL examples, which may be unavailable or costly to obtain. At the same time, as LLMs get larger and larger, their in-built prior knowledge becomes vast and can potentially substitute for specific data examples. In this paper, we introduce Knowledge-Guided Prompting (KGP) as a new knob in prompt optimization and explore the ability of KGP-based prompt optimization to offset the cost of ICL. Specifically, we ask the question 'how many examples can a prompt substitute for?' and explore KGP, where domain knowledge, either inferred or available, is explicitly injected into the prompt, reducing dependence on ICL examples. Our experiments systematically explore the trade-off between ICL and KGP, revealing an empirical scaling law that quantifies how the quality of generated synthetic data varies with increasing domain knowledge and decreasing example count. Our results demonstrate that knowledge-guided prompting can be a scalable alternative, or addition, to in-context examples, unlocking new approaches to synthetic data generation.
LG · Jun 20, 2024
Why LLMs Are Bad at Synthetic Table Generation (and what to do about it)
Shengzhe Xu, Cho-Ting Lee, Mandar Sharma et al.
Synthetic data generation is integral to ML pipelines, e.g., to augment training data, replace sensitive information, and even to power advanced platforms like DeepSeek. While LLMs fine-tuned for synthetic data generation are gaining traction, synthetic table generation -- a critical data type in business and science -- remains under-explored compared to text and image synthesis. This paper shows that LLMs, whether used as-is or after traditional fine-tuning, are inadequate for generating synthetic tables. Their autoregressive nature, combined with random order permutation during fine-tuning, hampers the modeling of functional dependencies and prevents capturing conditional mixtures of distributions essential for real-world constraints. We demonstrate that making LLMs permutation-aware can mitigate these issues.
CL · Jun 20, 2024
Information Guided Regularization for Fine-tuning Language Models
Mandar Sharma, Nikhil Muralidhar, Shengzhe Xu et al.
The pretraining-fine-tuning paradigm has been the de facto strategy for transfer learning in modern language modeling. With the understanding that task adaptation in LMs is often a function of parameters shared across tasks, we argue that a more surgical approach to regularization needs to exist for smoother transfer learning. Towards this end, we investigate how the pretraining loss landscape is affected by these task-sensitive parameters through an information-theoretic lens. We then leverage the findings from our investigations to devise a novel approach to dropout for improved model regularization and better downstream generalization. This approach, named guided dropout, is both task & architecture agnostic and adds no computational overhead to the fine-tuning process. Through empirical evaluations, we showcase that our approach to regularization yields consistently better performance, even in scenarios of data paucity, compared to standardized baselines.
CL · May 14, 2023
Learning Non-linguistic Skills without Sacrificing Linguistic Proficiency
Mandar Sharma, Nikhil Muralidhar, Naren Ramakrishnan
The field of Math-NLP has witnessed significant growth in recent years, motivated by the desire to expand LLM performance to the learning of non-linguistic notions (numerals, and subsequently, arithmetic reasoning). However, non-linguistic skill injection typically comes at a cost for LLMs: it leads to catastrophic forgetting of core linguistic skills, a consequence that often remains unaddressed in the literature. Although Math-NLP has been able to create LLMs that can closely approximate the mathematical skills of a grade-schooler or the arithmetic reasoning skills of a calculator, the practicality of these models fails if they concomitantly shed their linguistic capabilities. In this work, we take a closer look into the phenomenon of catastrophic forgetting as it pertains to LLMs and subsequently offer a novel framework for non-linguistic skill injection for LLMs based on information-theoretic interventions and skill-specific losses that enable the learning of strict arithmetic reasoning. Our model outperforms the state-of-the-art both on injected non-linguistic skills and on linguistic knowledge retention, and does so with a fraction of the non-linguistic training data (1/4) and zero additional synthetic linguistic training data.
LG · Jun 30, 2021
Using AntiPatterns to avoid MLOps Mistakes
Nikhil Muralidhar, Sathappah Muthiah, Patrick Butler et al.
We describe lessons learned from developing and deploying machine learning models at scale across the enterprise in a range of financial analytics applications. These lessons are presented in the form of antipatterns. Just as design patterns codify best software engineering practices, antipatterns provide a vocabulary to describe defective practices and methodologies. Here we catalog and document numerous antipatterns in financial ML operations (MLOps). Some antipatterns are due to technical errors, while others are due to not having sufficient knowledge of the surrounding context in which ML results are used. By providing a common vocabulary to discuss these situations, our intent is that antipatterns will support better documentation of issues, rapid communication between stakeholders, and faster resolution of problems. In addition to cataloging antipatterns, we describe solutions, best practices, and future directions toward MLOps maturity.
LG · Sep 23, 2020
Steering a Historical Disease Forecasting Model Under a Pandemic: Case of Flu and COVID-19
Alexander Rodríguez, Nikhil Muralidhar, Bijaya Adhikari et al.
Forecasting influenza in a timely manner aids health organizations and policymakers in adequate preparation and decision making. However, effective influenza forecasting still remains a challenge despite increasing research interest. It is even more challenging amidst the COVID pandemic, when the influenza-like illness (ILI) counts are affected by various factors such as symptomatic similarities with COVID-19 and shift in healthcare seeking patterns of the general population. Under the current pandemic, historical influenza models carry valuable expertise about the disease dynamics but face difficulties adapting. Therefore, we propose CALI-Net, a neural transfer learning architecture which allows us to 'steer' a historical disease forecasting model to new scenarios where flu and COVID co-exist. Our framework enables this adaptation by automatically learning when it should emphasize learning from COVID-related signals and when it should learn from the historical model. Thus, we exploit representations learned from historical ILI data as well as the limited COVID-related signals. Our experiments demonstrate that our approach is successful in adapting a historical forecasting model to the current pandemic. In addition, we show that success in our primary goal, adaptation, does not sacrifice overall performance as compared with state-of-the-art influenza forecasting approaches.
LG · Nov 6, 2019
Physics-guided Design and Learning of Neural Networks for Predicting Drag Force on Particle Suspensions in Moving Fluids
Nikhil Muralidhar, Jie Bu, Ze Cao et al.
Physics-based simulations are often used to model and understand complex physical systems and processes in domains like fluid dynamics. Such simulations, although used frequently, have many limitations which could arise either due to the inability to accurately model a physical process owing to incomplete knowledge about certain facets of the process or due to the underlying process being too complex to accurately encode into a simulation model. In such situations, it is often useful to rely on machine learning methods to fill in the gap by learning a model of the complex physical process directly from simulation data. However, as data generation through simulations is costly, we need to develop models that are cognizant of data paucity issues. In such scenarios it is often helpful if the rich physical knowledge of the application domain is incorporated in the architectural design of machine learning models. Further, we can also use information from physics-based simulations to guide the learning process using aggregate supervision to favorably constrain the learning process. In this paper, we propose PhyDNN, a deep learning model using physics-guided structural priors and physics-guided aggregate supervision for modeling the drag forces acting on each particle in a Computational Fluid Dynamics-Discrete Element Method (CFD-DEM) simulation. We conduct extensive experiments in the context of drag force prediction and showcase the usefulness of including physics knowledge in our deep learning formulation both in the design and through the learning process. Our proposed PhyDNN model has been compared to several state-of-the-art models and achieves a significant performance improvement of 8.46% on average across all baseline models. The source code has been made available and the dataset used is detailed in [1, 2].