Diogo Almeida

12papers

47,714citations

Novelty53%

AI Score33

Ranked #133,125 of 201,326 authors (top 66%)#29,338 in LG (top 69%)

12 Papers

CLMar 15, 2023

GPT-4 Technical Report

Josh Achiam, Steven Adler, Sandhini Agarwal et al. · berkeley, deepmind

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document. The post-training alignment process results in improved performance on measures of factuality and adherence to desired behavior. A core component of this project was developing infrastructure and optimization methods that behave predictably across a wide range of scales. This allowed us to accurately predict some aspects of GPT-4's performance based on models trained with no more than 1/1,000th the compute of GPT-4.

CLMar 4, 2022

Training language models to follow instructions with human feedback

Long Ouyang, Jeff Wu, Xu Jiang et al.

Making language models bigger does not inherently make them better at following a user's intent. For example, large language models can generate outputs that are untruthful, toxic, or simply not helpful to the user. In other words, these models are not aligned with their users. In this paper, we show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning with human feedback. Starting with a set of labeler-written prompts and prompts submitted through the OpenAI API, we collect a dataset of labeler demonstrations of the desired model behavior, which we use to fine-tune GPT-3 using supervised learning. We then collect a dataset of rankings of model outputs, which we use to further fine-tune this supervised model using reinforcement learning from human feedback. We call the resulting models InstructGPT. In human evaluations on our prompt distribution, outputs from the 1.3B parameter InstructGPT model are preferred to outputs from the 175B GPT-3, despite having 100x fewer parameters. Moreover, InstructGPT models show improvements in truthfulness and reductions in toxic output generation while having minimal performance regressions on public NLP datasets. Even though InstructGPT still makes simple mistakes, our results show that fine-tuning with human feedback is a promising direction for aligning language models with human intent.

LGJun 2, 2021

A Generalizable Approach to Learning Optimizers

Diogo Almeida, Clemens Winter, Jie Tang et al.

A core issue with learning to optimize neural networks has been the lack of generalization to real world problems. To address this, we describe a system designed from a generalization-first perspective, learning to update optimizer hyperparameters instead of model parameters directly using novel features, actions, and a reward function. This system outperforms Adam at all neural network tasks including on modalities not seen during training. We achieve 2x speedups on ImageNet, and a 2.5x speedup on a language modeling task using over 5 orders of magnitude more compute than the training tasks.

ROMar 18, 2021

A Methodology for Approaching the Integration of Complex Robotics Systems Illustrated through a Bi-manual Manipulation Case-Study

Pavlos Triantafyllou, Rafael Afonso Rodrigues, Sirapoab Chaikunsaeng et al.

The multidisciplinarity of robotics creates a need for robust integration methodologies that can facilitate the adoption of state-of-the-art research components in an industrial application. Unfortunately, there are no clear, community accepted guidelines or standards that define the integration of such components in a single robotic system. In this paper, we propose a methodology that assesses the software components of a candidate system on the basis of the effort required to integrate them and the impact their integration will have on a target system. We demonstrate how this methodology can be applied using an industrial tool packing system as an example. The system integrates a wide range of both in-house and third-party research outputs and software components. We prove the effectiveness of our approach by evaluating system performance with an experimental benchmark that assesses the robustness, reliability and operational speed of the system for the given packing task. We also demonstrate how our methodology can be used to predict the amount of integration time required for a component. The proposed integration methodology can be applied to any robotic system to facilitate its transition from the research to an industrial environment.

SYMay 3, 2019

A Lyapunov-Based Approach to Exploit Asymmetries in Robotic Dual-Arm Task Resolution

Diogo Almeida, Yiannis Karayiannidis

Dual-arm manipulation tasks can be prescribed to a robotic system in terms of desired absolute and relative motion of the robot's end-effectors. These can represent, e.g., jointly carrying a rigid object or performing an assembly task. When both types of motion are to be executed concurrently, the symmetric distribution of the relative motion between arms prevents task conflicts. Conversely, an asymmetric solution to the relative motion task will result in conflicts with the absolute task. In this work, we address the problem of designing a control law for the absolute motion task together with updating the distribution of the relative task among arms. Through a set of numerical results, we contrast our approach with the classical symmetric distribution of the relative motion task to illustrate the advantages of our method.

ROMay 3, 2019

Asymmetric Dual-Arm Task Execution using an Extended Relative Jacobian

Diogo Almeida, Yiannis Karayiannidis

Coordinated dual-arm manipulation tasks can be broadly characterized as possessing absolute and relative motion components. Relative motion tasks, in particular, are inherently redundant in the way they can be distributed between end-effectors. In this work, we analyse cooperative manipulation in terms of the asymmetric resolution of relative motion tasks. We discuss how existing approaches enable the asymmetric execution of a relative motion task, and show how an asymmetric relative motion space can be defined. We leverage this result to propose an extended relative Jacobian to model the cooperative system, which allows a user to set a concrete degree of asymmetry in the task execution. This is achieved without the need for prescribing an absolute motion target. Instead, the absolute motion remains available as a functional redundancy to the system. We illustrate the properties of our proposed Jacobian through numerical simulations of a novel differential Inverse Kinematics algorithm.

RONov 1, 2016

Towards Blended Reactive Planning and Acting using Behavior Trees

Michele Colledanchise, Diogo Almeida, Petter Ögren

In this paper, we show how a planning algorithm can be used to automatically create and update a Behavior Tree (BT), controlling a robot in a dynamic environment. The planning part of the algorithm is based on the idea of back chaining. Starting from a goal condition we iteratively select actions to achieve that goal, and if those actions have unmet preconditions, they are extended with actions to achieve them in the same way. The fact that BTs are inherently modular and reactive makes the proposed solution blend acting and planning in a way that enables the robot to efficiently react to external disturbances. If an external agent undoes an action the robot reexecutes it without re-planning, and if an external agent helps the robot, it skips the corresponding actions, again without replanning. We illustrate our approach in two different robotics scenarios.

LGMay 23, 2016

Genetic Architect: Discovering Genomic Structure with Learned Neural Architectures

Laura Deming, Sasha Targ, Nate Sauder et al.

Each human genome is a 3 billion base pair set of encoding instructions. Decoding the genome using deep learning fundamentally differs from most tasks, as we do not know the full structure of the data and therefore cannot design architectures to suit it. As such, architectures that fit the structure of genomics should be learned not prescribed. Here, we develop a novel search algorithm, applicable across domains, that discovers an optimal architecture which simultaneously learns general genomic patterns and identifies the most important sequence motifs in predicting functional genomic outcomes. The architectures we find using this algorithm succeed at using only RNA expression data to predict gene regulatory structure, learn human-interpretable visualizations of key sequence motifs, and surpass state-of-the-art results on benchmark genomics challenges.

ROApr 22, 2016

Folding Assembly by Means of Dual-Arm Robotic Manipulation

Diogo Almeida, Yiannis Karayiannidis

In this paper, we consider folding assembly as an assembly primitive suitable for dual-arm robotic assembly, that can be integrated in a higher level assembly strategy. The system composed by two pieces in contact is modelled as an articulated object, connected by a prismatic-revolute joint. Different grasping scenarios were considered in order to model the system, and a simple controller based on feedback linearisation is proposed, using force torque measurements to compute the contact point kinematics. The folding assembly controller has been experimentally tested with two sample parts, in order to showcase folding assembly as a viable assembly primitive.

LGMar 25, 2016

Resnet in Resnet: Generalizing Residual Architectures

Sasha Targ, Diogo Almeida, Kevin Lyman

Residual networks (ResNets) have recently achieved state-of-the-art on challenging computer vision tasks. We introduce Resnet in Resnet (RiR): a deep dual-stream architecture that generalizes ResNets and standard CNNs and is easily implemented with no computational overhead. RiR consistently improves performance over ResNets, outperforms architectures with similar amounts of augmentation on CIFAR-10, and establishes a new state-of-the-art on CIFAR-100.

LGMar 16, 2016

Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units

Wenling Shang, Kihyuk Sohn, Diogo Almeida et al.

Recently, convolutional neural networks (CNNs) have been used as a powerful tool to solve many problems of machine learning and computer vision. In this paper, we aim to provide insight on the property of convolutional neural networks, as well as a generic method to improve the performance of many CNN architectures. Specifically, we first examine existing CNN models and observe an intriguing property that the filters in the lower layers form pairs (i.e., filters with opposite phase). Inspired by our observation, we propose a novel, simple yet effective activation scheme called concatenated ReLU (CRelu) and theoretically analyze its reconstruction property in CNNs. We integrate CRelu into several state-of-the-art CNN architectures and demonstrate improvement in their recognition performance on CIFAR-10/100 and ImageNet datasets with fewer trainable parameters. Our results suggest that better understanding of the properties of CNNs can lead to significant performance improvement with a simple modification.

LGNov 21, 2015

GradNets: Dynamic Interpolation Between Neural Architectures

Diogo Almeida, Nate Sauder

In machine learning, there is a fundamental trade-off between ease of optimization and expressive power. Neural Networks, in particular, have enormous expressive power and yet are notoriously challenging to train. The nature of that optimization challenge changes over the course of learning. Traditionally in deep learning, one makes a static trade-off between the needs of early and late optimization. In this paper, we investigate a novel framework, GradNets, for dynamically adapting architectures during training to get the benefits of both. For example, we can gradually transition from linear to non-linear networks, deterministic to stochastic computation, shallow to deep architectures, or even simple downsampling to fully differentiable attention mechanisms. Benefits include increased accuracy, easier convergence with more complex architectures, solutions to test-time execution of batch normalization, and the ability to train networks of up to 200 layers.