Md. Shamsuzzoha Bayzid

4 papers

Novelty: 55%

AI Score: 41

Ranked #94,866 of 201,326 authors (top 47%); #31,555 in CV (top 54%)

4 Papers

CV · May 25

DeCoDrift: Stabilizing Decoder Coupling in Closed-Loop Foundation Segmentation

H. M. Shadman Tabib, Md. Shamsuzzoha Bayzid, M Sohel Rahman

Foundation segmentation models such as Segment Anything Model (SAM) are now routinely used in iterative pipelines, where each predicted mask is fed back as the next prompt. This practice turns segmentation into a closed-loop dynamical process, yet the decoder-level behavior of these systems remains largely unexamined. We show that this feedback loop can induce a previously overlooked failure mode, decoder coupling drift, in which the mask decoder's cross-attention progressively loses alignment with the target object, causing errors to accumulate across iterations. We study this phenomenon by instrumenting SAM's mask decoder and deriving ground-truth-free measures of prompt-image coupling, attention stability, and temporal consistency. On volumetric electron microscopy data, these decoder-internal signals reveal that standard iterative prompting systematically degrades attention alignment and temporal coherence relative to oracle-anchored feedback. We then formalize iterative prompting as a discrete-time dynamical system and show how proximal anchoring reduces error amplification in the feedback loop. Building on this analysis, we introduce DeCoDrift, a training-free inference-time stabilization framework that constrains prompt updates and preserves decoder coupling across iterations. Across extensive experiments, DeCoDrift consistently improves attention stability, temporal coherence, and segmentation quality over standard iterative prompting, without retraining or ground-truth supervision. More broadly, our results show that decoder-internal dynamics are not merely diagnostic: they provide actionable signals for stabilizing foundation segmentation models in closed-loop use.
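The abstract does not state the update rule, so the following is only a toy illustration of why anchoring a feedback loop can curb error amplification, not DeCoDrift itself. It assumes a scalar "prompt offset" `x` that plain feedback multiplies by a hypothetical `gain` each iteration, and a hypothetical anchor value that each update is pulled toward with strength `lam`. All names and numbers here are illustrative assumptions.

```python
def iterate(gain, lam, anchor, x0, steps):
    """Run x_{t+1} = (gain * x_t + lam * anchor) / (1 + lam).

    With lam = 0 this is plain feedback, x_{t+1} = gain * x_t.
    With lam > 0 each update is the minimizer of
        0.5 * (x - gain * x_t)**2 + 0.5 * lam * (x - anchor)**2,
    i.e. a proximal step that pulls the new state toward the anchor.
    """
    xs = [x0]
    for _ in range(steps):
        xs.append((gain * xs[-1] + lam * anchor) / (1.0 + lam))
    return xs


# Toy setting (assumed): offset 0 means "aligned with the target".
plain = iterate(gain=1.2, lam=0.0, anchor=0.0, x0=1.0, steps=10)
anchored = iterate(gain=1.2, lam=0.5, anchor=0.0, x0=1.0, steps=10)

print(abs(plain[-1]) > abs(plain[0]))        # offset grows without anchoring
print(abs(anchored[-1]) < abs(anchored[0]))  # offset shrinks with anchoring
```

In this linear toy model the effective per-step gain becomes `gain / (1 + lam)`, so the loop contracts whenever `lam > gain - 1`. The real decoder dynamics are nonlinear and high-dimensional, so this only conveys the intuition.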

CO · Apr 21

On Threshold Compatibility Graphs

Sheikh Azizul Hakim, Md. Shamsuzzoha Bayzid

Pairwise Compatibility Graphs (PCGs) form a tree-metric graph class that originated in phylogeny and has since attracted sustained interest in graph theory. Several natural generalizations have been proposed to overcome the expressive limitations of classical PCGs, including $k$-interval-PCGs, $k$-OR-PCGs, and $k$-AND-PCGs. In this paper, we introduce $(k,t)$-threshold-PCGs, a threshold-based framework that unifies these generalized notions: adjacency is determined by whether at least $t$ among $k$ underlying PCG predicates accept the vertex pair. We investigate the expressive power of this model from both constructive and asymptotic viewpoints. On the positive side, we show that every graph on $n$ vertices is an $(n,t)$-threshold-PCG for every $1 \le t \le n$. On the negative side, we prove that for every fixed pair $(k,t)$, the class of $(k,t)$-threshold-PCGs is asymptotically rare among all graphs. As a consequence, we obtain sharp separations from previously studied models, including a strict expressive gap relative to $k$-interval-PCGs. We also study explicit obstruction families through incidence graphs and derive additional structural consequences for the conjunction case, including the strictness of the $k$-AND-PCG hierarchy and the failure of closure under complement.
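The threshold rule in the abstract can be sketched as follows. This is a minimal illustration, assuming each "PCG predicate" is the classical test $d_{\min} \le d_T(u,v) \le d_{\max}$ on leaf-to-leaf distances in an edge-weighted tree $T$; the function names and the example tree are hypothetical, and the paper's formal definition may differ in detail.

```python
from itertools import combinations


def tree_leaf_distances(edges, leaves):
    """Path lengths between every ordered pair of leaves in a weighted tree."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {}
    for src in leaves:
        # Depth-first walk; in a tree each node is reached by exactly one path.
        stack, seen = [(src, 0.0)], {src}
        while stack:
            node, d = stack.pop()
            if node in leaves and node != src:
                dist[(src, node)] = d
            for nxt, w in adj.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, d + w))
    return dist


def pcg_predicate(edges, leaves, d_min, d_max):
    """Classical PCG test: accept (u, v) iff d_min <= d_T(u, v) <= d_max."""
    dist = tree_leaf_distances(edges, set(leaves))
    return lambda u, v: d_min <= dist[(u, v)] <= d_max


def threshold_pcg_edges(vertices, predicates, t):
    """Edges accepted by at least t of the given predicates."""
    return {
        (u, v)
        for u, v in combinations(sorted(vertices), 2)
        if sum(p(u, v) for p in predicates) >= t
    }


# Star tree with center "x": d(a,b) = 3, d(a,c) = 4, d(b,c) = 5.
star = [("x", "a", 1), ("x", "b", 2), ("x", "c", 3)]
leaves = ["a", "b", "c"]
p1 = pcg_predicate(star, leaves, 3, 4)  # accepts ab, ac
p2 = pcg_predicate(star, leaves, 4, 5)  # accepts ac, bc

print(threshold_pcg_edges(leaves, [p1, p2], 1))  # t = 1: any predicate suffices
print(threshold_pcg_edges(leaves, [p1, p2], 2))  # t = k: all predicates required
```

Under this reading, $t = 1$ behaves like a disjunction of the $k$ predicates and $t = k$ like a conjunction, which is how a single threshold parameter can interpolate between OR-style and AND-style models.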

CV · Mar 15, 2021

3D-FFS: Faster 3D object detection with Focused Frustum Search in sensor fusion based networks

Aniruddha Ganguly, Tasin Ishmam, Khandker Aftarul Islam et al.

In this work we propose 3D-FFS, a novel approach to make sensor fusion based 3D object detection networks significantly faster using a class of computationally inexpensive heuristics. Existing sensor fusion based networks generate 3D region proposals by leveraging inferences from 2D object detectors. However, as images have no depth information, these networks rely on extracting semantic features of points from the entire scene to locate the object. By leveraging aggregated intrinsic properties (e.g. point density) of point cloud data, 3D-FFS can substantially constrain the 3D search space and thereby significantly reduce training time, inference time and memory consumption without sacrificing accuracy. To demonstrate the efficacy of 3D-FFS, we have integrated it with Frustum ConvNet (F-ConvNet), a prominent sensor fusion based 3D object detection model. We assess the performance of 3D-FFS on the KITTI dataset. Compared to F-ConvNet, we achieve improvements in training and inference times by up to 62.80% and 58.96%, respectively, while reducing the memory usage by up to 58.53%. Additionally, we achieve 0.36%, 0.59% and 2.19% improvements in accuracy for the Car, Pedestrian and Cyclist classes, respectively. 3D-FFS shows a lot of promise in domains with limited computing power, such as autonomous vehicles, drones and robotics where LiDAR-Camera based sensor fusion perception systems are widely used.

QM · Dec 6, 2020

Align-gram: Rethinking the Skip-gram Model for Protein Sequence Analysis

Nabil Ibtehaz, S. M. Shakhawat Hossain Sourav, Md. Shamsuzzoha Bayzid et al.

Background: The advent of next-generation sequencing technologies has exponentially increased the volume of biological sequence data. Protein sequences, often described as the "language of life", have been analyzed for a multitude of applications and inferences. Motivation: Owing to the rapid development of deep learning, there have been a number of breakthroughs in Natural Language Processing in recent years. Since these methods are capable of performing different tasks when trained with a sufficient amount of data, off-the-shelf models are used to perform various biological applications. In this study, we investigated the applicability of the popular Skip-gram model for protein sequence analysis and made an attempt to incorporate some biological insights into it. Results: We propose a novel $k$-mer embedding scheme, Align-gram, which is capable of mapping similar $k$-mers close to each other in a vector space. Furthermore, we experiment with other sequence-based protein representations and observe that the embeddings derived from Align-gram better aid the modeling and training of deep learning models. Our experiments with a simple baseline LSTM model and the much more complex CNN model of DeepGoPlus show the potential of Align-gram in performing different types of deep learning applications for protein sequence analysis.
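For readers unfamiliar with the starting point, the sketch below shows how a protein sequence might be turned into overlapping $k$-mers and then into the (center, context) training pairs that a standard Skip-gram model consumes. This is only the generic Skip-gram preprocessing, not Align-gram's biologically informed variant, whose details the abstract does not give; the function names and the example sequence are made up for illustration.

```python
def kmers(sequence, k):
    """Overlapping k-mers of a sequence, in order of appearance."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def skipgram_pairs(tokens, window):
    """(center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        pairs.extend((center, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs


tokens = kmers("MKVLA", 3)         # hypothetical 5-residue sequence
pairs = skipgram_pairs(tokens, 1)  # context = immediate neighbors only
print(tokens)
print(pairs)
```

A Skip-gram model trained on such pairs places $k$-mers that share contexts near each other; the abstract's stated goal is instead to place similar $k$-mers close together.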