AIJul 31, 2024
The Llama 3 Herd of ModelsAaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri et al. · allen-ai, berkeley
Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.
CLJan 28, 2023
Underwater Robotics Semantic Parser AssistantParth Parekh, Cedric McGuire, Jake Imyak
Semantic parsing is a means of taking natural language and putting it in a form that a computer can understand. There has been a multitude of approaches that take natural language utterances and form them into lambda calculus expressions -- mathematical functions to describe logic. Here, we experiment with a sequence to sequence model to take natural language utterances, convert those to lambda calculus expressions, when can then be parsed, and place them in an XML format that can be used by a finite state machine. Experimental results show that we can have a high accuracy model such that we can bridge the gap between technical and nontechnical individuals in the robotics field.
MLOct 24, 2016
A Bayesian Ensemble for Unsupervised Anomaly DetectionEdward Yu, Parth Parekh
Methods for unsupervised anomaly detection suffer from the fact that the data is unlabeled, making it difficult to assess the optimality of detection algorithms. Ensemble learning has shown exceptional results in classification and clustering problems, but has not seen as much research in the context of outlier detection. Existing methods focus on combining output scores of individual detectors, but this leads to outputs that are not easily interpretable. In this paper, we introduce a theoretical foundation for combining individual detectors with Bayesian classifier combination. Not only are posterior distributions easily interpreted as the probability distribution of anomalies, but bias, variance, and individual error rates of detectors are all easily obtained. Performance on real-world datasets shows high accuracy across varied types of time series data.