LG · Apr 14
Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
Aakshita Chandiramani, Aaron Blakeman, Abdullahi Olaoye et al. · amazon-science, cmu
We describe the pre-training, post-training, and quantization of Nemotron 3 Super, a 120 billion (active 12 billion) parameter hybrid Mamba-Attention Mixture-of-Experts model. Nemotron 3 Super is the first model in the Nemotron 3 family to 1) be pre-trained in NVFP4, 2) leverage LatentMoE, a new Mixture-of-Experts architecture that optimizes for both accuracy per FLOP and accuracy per parameter, and 3) include MTP layers for inference acceleration through native speculative decoding. We pre-trained Nemotron 3 Super on 25 trillion tokens followed by post-training using supervised fine tuning (SFT) and reinforcement learning (RL). The final model supports up to 1M context length and achieves comparable accuracy on common benchmarks, while also achieving up to 2.2x and 7.5x higher inference throughput compared to GPT-OSS-120B and Qwen3.5-122B, respectively. Nemotron 3 Super datasets, along with the base, post-trained, and quantized checkpoints, are open-sourced on HuggingFace.
CL · Sep 29, 2025
Pretraining Large Language Models with NVFP4
Felix Abecassis, Anjulie Agrusa, Dong Ahn et al. · nvidia
Large Language Models (LLMs) today are powerful problem solvers across many domains, and they continue to get stronger as they scale in model size, training set size, and training set quality, as shown by extensive research and experimentation across the industry. Training a frontier model today requires on the order of tens to hundreds of yottaflops, which is a massive investment of time, compute, and energy. Improving pretraining efficiency is therefore essential to enable the next generation of even more capable LLMs. While 8-bit floating point (FP8) training is now widely adopted, transitioning to even narrower precision, such as 4-bit floating point (FP4), could unlock additional improvements in computational speed and resource utilization. However, quantization at this level poses challenges to training stability, convergence, and implementation, notably for large-scale models trained on long token horizons. In this study, we introduce a novel approach for stable and accurate training of large language models (LLMs) using the NVFP4 format. Our method integrates Random Hadamard transforms (RHT) to bound block-level outliers, employs a two-dimensional quantization scheme for consistent representations across both the forward and backward passes, utilizes stochastic rounding for unbiased gradient estimation, and incorporates selective high-precision layers. We validate our approach by training a 12-billion-parameter model on 10 trillion tokens -- the longest publicly documented training run in 4-bit precision to date. Our results show that the model trained with our NVFP4-based pretraining technique achieves training loss and downstream task accuracies comparable to an FP8 baseline. These findings highlight that NVFP4, when combined with our training approach, represents a major step forward in narrow-precision LLM training algorithms.
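The abstract credits stochastic rounding with giving unbiased gradient estimates. The sketch below illustrates that property only. It assumes a uniform grid with spacing `step`, which is a simplification and not the actual NVFP4 value set or its block scaling; the function names are hypothetical and not taken from the paper.

```python
import math
import random


def stochastic_round(x: float, step: float, rng: random.Random) -> float:
    """Round x to a multiple of `step` (assumed uniform grid).

    The value is rounded up with probability equal to its fractional
    distance from the lower grid point, so the expected result equals x.
    """
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower  # in [0, 1)
    round_up = rng.random() < frac
    return (lower + (1 if round_up else 0)) * step


def round_to_nearest(x: float, step: float) -> float:
    """Deterministic round-half-up onto the same grid, for comparison."""
    return math.floor(x / step + 0.5) * step


rng = random.Random(0)  # fixed seed so the run is repeatable
x, step, n = 0.3, 1.0, 100_000

# Average many independently rounded copies of the same value.
sr_mean = sum(stochastic_round(x, step, rng) for _ in range(n)) / n
rn_mean = sum(round_to_nearest(x, step) for _ in range(n)) / n

print(f"stochastic rounding mean: {sr_mean:.4f}")
print(f"round-to-nearest mean:    {rn_mean:.4f}")
```

On this grid, round-to-nearest always maps 0.3 to 0, so small values are lost systematically, while the stochastic average stays close to 0.3. That is the sense in which the rounding is unbiased.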
CL · Dec 24, 2025
NVIDIA Nemotron 3: Efficient and Open Intelligence
Aaron Blakeman, Aaron Grattafiori, Aarti Basant et al. · nvidia
We introduce the Nemotron 3 family of models: Nano, Super, and Ultra. These models deliver strong agentic, reasoning, and conversational capabilities. The Nemotron 3 family uses a Mixture-of-Experts hybrid Mamba-Transformer architecture to provide best-in-class throughput and context lengths of up to 1M tokens. Super and Ultra models are trained with NVFP4 and incorporate LatentMoE, a novel approach that improves model quality. The two larger models also include MTP layers for faster text generation. All Nemotron 3 models are post-trained using multi-environment reinforcement learning, which enables reasoning and multi-step tool use, and they support granular reasoning budget control. Nano, the smallest model, outperforms comparable models in accuracy while remaining extremely cost-efficient for inference. Super is optimized for collaborative agents and high-volume workloads such as IT ticket automation. Ultra, the largest model, provides state-of-the-art accuracy and reasoning performance. Nano is released together with its technical report and this white paper, while Super and Ultra will follow in the coming months. We will openly release the model weights, pre- and post-training software, recipes, and all data for which we hold redistribution rights.
HC · Oct 18, 2021
DroneStick: Flying Joystick as a Novel Type of Interface
Evgeny Tsykunov, Aleksey Fedoseev, Ekaterina Dorzhieva et al.
DroneStick is a novel hands-free method for smooth interaction between a human and a robotic system via one of its agents, without training or any additional handheld or wearable device or infrastructure. A flying joystick (DroneStick), being a part of a multi-robot system, is composed of a flying drone and a coiled wire with a vibration motor. By pulling on the coiled wire, the operator commands certain motions of the follower robotic system. The DroneStick system does not require the user to carry any equipment before or after performing the required interaction. DroneStick provides useful feedback to the operator in the form of force transferred through the wire, translation/rotation of the flying joystick, and motor vibrations at the fingertips. This feedback allows users to interact intuitively with different forms of robotic systems. A potential application is enhancing automated "last mile" delivery, where a recipient needs to gently guide a delivery drone or robot to the spot where a parcel has to be dropped.
CV · Apr 5, 2020
Hyper-spectral NIR and MIR data and optimal wavebands for detection of apple tree diseases
Dmitrii Shadrin, Mariia Pukalchik, Anastasia Uryasheva et al.
Plant diseases can lead to dramatic losses in yield and quality of food, becoming a problem of high priority for farmers. Apple scab, moniliasis, and powdery mildew are the most significant apple tree diseases worldwide and may cause yield losses of 50% to 60% annually; they are controlled by fungicide use at great financial and time expense. This research proposes a modern approach for analyzing spectral data in the Near-Infrared and Mid-Infrared ranges for apple tree diseases at different stages. Using the obtained spectra, we found optimal spectral bands for detecting a particular disease and discriminating it from other diseases and from healthy trees. The proposed instrument will provide farmers with accurate, real-time information on different stages of apple tree diseases, enabling more effective timing and selection of fungicide application, resulting in better control and increased yield. The obtained dataset, as well as Matlab scripts for processing data and finding optimal spectral bands, are available via the link: https://yadi.sk/d/ZqfGaNlYVR3TUA
RO · Apr 1, 2020
Coupling of localization and depth data for mapping using Intel RealSense T265 and D435i cameras
Evgeny Tsykunov, Valery Ilin, Stepan Perminov et al.
We propose to couple two types of Intel RealSense sensors (tracking T265 and depth D435i) in order to obtain localization and a 3D occupancy map of the indoor environment. We implemented a Python-based observer pattern with a multi-threaded approach for camera data synchronization. We compared different point cloud (PC) alignment methods (using transformations obtained from the tracking camera and from ICP family methods). Tracking camera and PC alignment allow us to generate a set of transformations between frames. Based on these transformations, we obtained different trajectories and analyzed them. Finally, having poses for all frames, we combined the depth data. First, we obtained a joint PC representing the whole scene. Then we used the OctoMap representation to build a map.
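The step from frame-to-frame transformations to a trajectory can be sketched as chaining homogeneous transforms. This is a minimal illustration, not the authors' code: it assumes each relative transform is a 4x4 matrix that maps coordinates in frame k+1 to frame k, and all function names are hypothetical.

```python
import math

Matrix = list[list[float]]


def identity() -> Matrix:
    """4x4 identity: the pose of the first frame in its own coordinates."""
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def yaw_translation(yaw: float, tx: float, ty: float, tz: float) -> Matrix:
    """Rigid transform: rotation about z by `yaw`, then translation."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, tx],
            [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


def accumulate_poses(relative: list[Matrix]) -> list[Matrix]:
    """Chain frame-to-frame transforms into poses in the first frame."""
    pose = identity()
    poses = [pose]
    for step in relative:
        pose = matmul(pose, step)  # pose_{k+1} = pose_k * T_{k -> k+1}
        poses.append(pose)
    return poses


def transform_point(pose: Matrix, p: tuple[float, float, float]):
    """Express a point given in a frame's coordinates in the first frame."""
    x, y, z = p
    return tuple(pose[i][0] * x + pose[i][1] * y + pose[i][2] * z + pose[i][3]
                 for i in range(3))


# Four identical steps: move 1 m forward, then turn 90 degrees left.
steps = [yaw_translation(math.pi / 2, 1.0, 0.0, 0.0) for _ in range(4)]
poses = accumulate_poses(steps)
trajectory = [transform_point(p, (0.0, 0.0, 0.0)) for p in poses]
print(trajectory)
```

Applying `transform_point` with each frame's pose to that frame's depth points would place them all in one coordinate system, which is the sense in which a joint point cloud is obtained before building the map.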
RO · Jan 31, 2020
SwarmCloak: Landing of Two Micro-Quadrotors on Human Hands Using Wearable Tactile Interface Driven by Light Intensity
Evgeny Tsykunov, Ruslan Agishev, Roman Ibrahimov et al.
For the human operator, it is often easier and faster to catch a small quadrotor in midair than to land it on a surface. However, interaction strategies for such cases have not yet been properly considered, especially when more than one drone has to be landed at the same time. In this paper, we propose a novel interaction strategy to land multiple robots on human hands using vibrotactile feedback. We developed a wearable tactile display that is activated by the intensity of light emitted from an LED ring on the bottom of the quadcopter. We conducted experiments in which participants were asked to adjust the position of the palm to land one or two vertically descending drones with different landing speeds, with only visual feedback, only tactile feedback, or visual-tactile feedback. We conducted a statistical analysis of the drone landing positions and of the landing pad and human head trajectories. Two-way ANOVA showed a statistically significant difference between the feedback conditions. Experimental analysis showed that, as the number of drones increases, tactile feedback plays a more important role in accurate hand positioning and operator convenience. The most precise landing of one and two drones was achieved with the combination of tactile and visual feedback.
RO · Nov 22, 2019
SwarmCloak: Landing of a Swarm of Nano-Quadrotors on Human Arms
Evgeny Tsykunov, Ruslan Agishev, Roman Ibrahimov et al.
We propose a novel system, SwarmCloak, for landing a fleet of four flying robots on human arms using light-sensitive landing pads with vibrotactile feedback. We developed two types of wearable tactile displays with vibromotors, which are activated by the light emitted from the LED array at the bottom of the quadcopters. In a user study, participants were asked to adjust the position of the arms to land up to two drones, with only visual feedback, only tactile feedback, or visual-tactile feedback. The experiment revealed that when the number of drones increases, tactile feedback plays a more important role in accurate landing and operator convenience. An important finding is that the best landing performance is achieved with the combination of tactile and visual feedback. The proposed technology could have a strong impact on human-swarm interaction, providing a new level of intuitiveness and engagement in swarm deployment directly from the skin surface.
RO · Nov 12, 2019
SlingDrone: Mixed Reality System for Pointing and Interaction Using a Single Drone
Evgeny Tsykunov, Roman Ibrahimov, Derek Vasquez et al.
We propose SlingDrone, a novel Mixed Reality interaction paradigm that utilizes a micro-quadrotor as both a pointing controller and an interactive robot with a slingshot motion type. The drone attempts to hover at a given position while the human pulls it in the desired direction using a hand grip and a leash. Based on the displacement, a virtual trajectory is defined. To allow for intuitive and simple control, we use virtual reality (VR) technology to trace the path of the drone based on the displacement input. The user receives force feedback propagated through the leash. Force feedback from SlingDrone coupled with the visualized trajectory in VR creates an intuitive and user-friendly pointing device. When the drone is released, it follows the trajectory that was shown in VR. An onboard payload (e.g. a magnetic gripper) can support various scenarios of real interaction with the surroundings, e.g. manipulation or sensing. Unlike the HTC Vive controller, SlingDrone does not require handheld devices, so it can be used as a standalone pointing technology in VR.
RO · Nov 12, 2019
WiredSwarm: High Resolution Haptic Feedback Provided by a Swarm of Drones to the User's Fingers for VR interaction
Evgeny Tsykunov, Dzmitry Tsetserukou
We propose a concept for a novel interaction strategy that provides rich haptic feedback in Virtual Reality (VR), in which each of the user's fingers is connected to a micro-quadrotor by a wire. The described technology represents the first flying wearable haptic interface. The solution can potentially deliver high-resolution force feedback to each finger during fine motor interaction in VR. The tips of the tethers are attached to the center of the underside of each quadcopter. As a result, flight stability increases and the interaction forces become stronger, which allows smaller drones to be used.
RO · Sep 5, 2019
SwarmTouch: Guiding a Swarm of Micro-Quadrotors with Impedance Control using a Wearable Tactile Interface
Evgeny Tsykunov, Ruslan Agishev, Roman Ibrahimov et al.
To achieve smooth and safe guiding of a drone formation by a human operator, we propose a novel interaction strategy for human-swarm communication which combines impedance control and vibrotactile feedback. The presented approach takes into account the human hand velocity and changes the formation shape and dynamics accordingly using impedance interlinks simulated between quadrotors, which helps to achieve natural swarm behavior. Several tactile patterns representing static and dynamic parameters of the swarm are proposed. The user feels the state of the swarm at the fingertips and receives valuable information to improve the controllability of the complex formation. A user study revealed the patterns with high recognition rates. A flight experiment demonstrated that the formation can be navigated accurately in a cluttered environment using only tactile feedback. Subjects stated that tactile sensation allows guiding the drone formation through obstacles and makes human-swarm communication more interactive. The proposed technology can potentially have a strong impact on human-swarm interaction, providing a higher level of awareness during swarm navigation.
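The abstract does not give the impedance model, so the following is only a generic sketch of what an impedance link typically looks like: a virtual mass-spring-damper, m·a + d·v + k·x = F, whose displacement x could serve as a position offset for a drone. The assumption that the driving force F is derived from the hand velocity, the parameter values, and the function name are all hypothetical and not taken from the paper.

```python
from typing import Callable


def simulate_impedance(
    force_fn: Callable[[float], float],
    mass: float = 1.0,       # virtual mass m (assumed value)
    damping: float = 4.0,    # virtual damping d (assumed value)
    stiffness: float = 4.0,  # virtual stiffness k (assumed value)
    dt: float = 0.01,
    steps: int = 1000,
) -> list[float]:
    """Integrate m*a + d*v + k*x = F(t) with semi-implicit Euler.

    Returns the displacement x at every time step, starting from rest.
    """
    x, v = 0.0, 0.0
    displacement = []
    for i in range(steps):
        force = force_fn(i * dt)
        accel = (force - damping * v - stiffness * x) / mass
        v += accel * dt  # update velocity first
        x += v * dt      # then position, using the new velocity
        displacement.append(x)
    return displacement


# Constant input, e.g. a force proportional to a steady hand velocity.
trajectory = simulate_impedance(lambda t: 2.0)
print(f"final displacement: {trajectory[-1]:.4f}")
print(f"expected steady state F/k: {2.0 / 4.0:.4f}")
```

With these values the link is critically damped, so the displacement settles toward F/k without oscillating. A smooth response of this kind is what makes such a filter attractive between an abrupt human input and the commanded drone positions.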
RO · Sep 5, 2019
SwarmTouch: Tactile Interaction of Human with Impedance Controlled Swarm of Nano-Quadrotors
Evgeny Tsykunov, Luiza Labazanova, Akerke Tleugazy et al.
We propose a novel interaction strategy for human-swarm communication in which a human operator guides a formation of quadrotors with impedance control and receives vibrotactile feedback. The presented approach takes into account the human hand velocity and changes the formation shape and dynamics accordingly using impedance interlinks simulated between quadrotors, which helps to achieve life-like swarm behavior. Experimental results with the Crazyflie 2.0 quadrotor platform validate the proposed control algorithm. Tactile patterns representing the dynamics of the swarm (extension or contraction) are proposed. The user feels the state of the swarm at the fingertips and receives valuable information to improve the controllability of the complex life-like formation. The user study revealed the patterns with high recognition rates. Subjects stated that tactile sensation improves the ability to guide the drone formation and makes human-swarm communication much more interactive. The proposed technology can potentially have a strong impact on human-swarm interaction, providing a new level of intuitiveness and immersion in swarm navigation.
RO · Aug 7, 2019
DronePick: Object Picking and Delivery Teleoperation with the Drone Controlled by a Wearable Tactile Display
Roman Ibrahimov, Evgeny Tsykunov, Vladimir Shirokun et al.
We report on the teleoperation system DronePick, which provides remote object picking and delivery by a human-controlled quadcopter. The main novelty of the proposed system is that the human user continuously receives visual and haptic feedback for accurate teleoperation. DronePick consists of a quadcopter equipped with a magnetic grabber, a tactile glove with a finger motion tracking sensor, a hand tracking system, and a Virtual Reality (VR) application. The human operator teleoperates the quadcopter by changing the position of the hand. The proposed vibrotactile patterns, which represent the location of the remote object relative to the quadcopter, are delivered to the glove. This helps the operator determine when the quadcopter is right above the object. When the "pick" command is sent by clasping the hand in the glove, the quadcopter decreases its altitude and the magnetic grabber attaches to the target object. The whole scenario is simulated in parallel in VR. The air flow from the quadcopter and the relative positions of VR objects help the operator determine the exact position of the delivered object to be picked. The experiments showed that users recognized the vibrotactile patterns reliably: the average recognition rate was 99% and the average recognition time was 2.36 s. A real-life implementation of DronePick featuring object picking and delivery to a human was developed and tested.