SD Dec 4, 2024
Embedding-Space Diffusion for Zero-Shot Environmental Sound Classification
Ysobel Sims, Alexandre Mendes, Stephan Chalup
Zero-shot learning enables models to generalise to unseen classes by leveraging semantic information, bridging the gap between training and testing sets with non-overlapping classes. While much research has focused on zero-shot learning in computer vision, the application of these methods to environmental audio remains underexplored, with poor performance in existing studies. Generative methods, which have demonstrated success in computer vision, are notably absent from zero-shot environmental sound classification studies. To address this gap, this work investigates generative methods for zero-shot learning in environmental audio. Two successful generative models from computer vision are adapted: a cross-aligned and distribution-aligned variational autoencoder (CADA-VAE) and a leveraging invariant side generative adversarial network (LisGAN). Additionally, we introduce a novel diffusion model conditioned on class auxiliary data. Synthetic embeddings generated by the diffusion model are combined with seen class embeddings to train a classifier. Experiments are conducted on five environmental audio datasets (ESC-50, ARCA23K-FSD, FSC22, UrbanSound8k and TAU Urban Acoustics 2019) and one music classification dataset (GTZAN). Results show that the diffusion model outperforms all baseline methods on average across the six audio datasets. This work establishes the diffusion model as a promising approach for zero-shot learning and introduces the first benchmark of generative methods for zero-shot environmental sound classification, providing a foundation for future research.
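The step in which synthetic embeddings for unseen classes are combined with seen-class embeddings to train a classifier can be illustrated with a small sketch. Everything here is an assumption made for illustration: `synthesize` is a hypothetical placeholder that adds Gaussian noise around a class auxiliary vector (it is not a diffusion model), the auxiliary vectors are assumed to live in the same space as the audio embeddings, and a nearest-centroid classifier stands in for whatever classifier the paper actually uses.

```python
import random


def synthesize(aux_vector, n, noise=0.1, rng=None):
    """Hypothetical placeholder generator (NOT a diffusion model):
    returns n noisy copies of the class auxiliary vector."""
    rng = rng or random.Random(0)
    return [[a + rng.gauss(0.0, noise) for a in aux_vector] for _ in range(n)]


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def train(embeddings_by_class):
    """Nearest-centroid 'classifier': one mean vector per class."""
    return {label: centroid(vecs) for label, vecs in embeddings_by_class.items()}


def predict(model, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    def sq_dist(c):
        return sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return min(model, key=lambda label: sq_dist(model[label]))


# Toy 2-D embeddings for seen classes (made-up values).
seen = {
    "dog_bark": [[1.0, 0.0], [0.9, 0.1]],
    "rain": [[0.0, 1.0], [0.1, 0.9]],
}
# Auxiliary vector for an unseen class (made-up value).
unseen_aux = {"siren": [1.0, 1.0]}

# Combine real seen-class embeddings with synthetic unseen-class embeddings.
rng = random.Random(0)
training = dict(seen)
for label, aux in unseen_aux.items():
    training[label] = synthesize(aux, n=20, rng=rng)

model = train(training)
```

With this toy data, a query near the auxiliary vector of the unseen class is assigned to that class even though no real example of it was available at training time, which is the point of generating embeddings for unseen classes.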
LG Sep 11, 2018
Comparing Computing Platforms for Deep Learning on a Humanoid Robot
Alexander Biddulph, Trent Houliston, Alexandre Mendes et al.
The goal of this study is to test two different computing platforms with respect to their suitability for running deep networks as part of a humanoid robot software system. One of the platforms is the CPU-centered Intel NUC7i7BNH and the other is an NVIDIA Jetson TX2 system that puts more emphasis on GPU processing. The experiments addressed a number of benchmarking tasks including pedestrian detection using deep neural networks. Some of the results were unexpected but demonstrate that platforms exhibit both advantages and disadvantages when taking computational performance and electrical power requirements of such a system into account.
RO Feb 11, 2015
The NUbots Team Description Paper 2015
Josiah Walker, Trent Houliston, Brendan Annable et al.
The NUbots are an interdisciplinary RoboCup team from The University of Newcastle, Australia. The team has a history of strong contributions in the areas of machine learning and computer vision. The NUbots have participated in RoboCup leagues since 2002, placing first several times in the past. In 2014 the NUbots also partnered with the University of Newcastle Mechatronics Laboratory to participate in the RobotX Marine Robotics Challenge, which resulted in several new ideas and improvements to the NUbots vision system for RoboCup. This paper summarizes the history of the NUbots team, describes the roles and research of the team members, gives an overview of the NUbots' robots, their software system, and several associated research projects.
MS Dec 18, 2014
FlexDM: Enabling robust and reliable parallel data mining using WEKA
Madison Flannery, David M Budden, Alexandre Mendes
Performing massive data mining experiments with multiple datasets and methods is a common task faced by most bioinformatics and computational biology laboratories. WEKA is a machine learning package designed to facilitate this task by providing tools that allow researchers to select from several classification methods and specific test strategies. Despite its popularity, the current WEKA environment for batch experiments, namely Experimenter, has four limitations that impact its usability: the selection of value ranges for method options lacks flexibility and is not intuitive; there is no support for parallelisation when running large-scale data mining tasks; the XML schema is difficult to read, necessitating the use of the Experimenter's graphical user interface for generation and modification; and robustness is limited by the fact that results are not saved until the last test has concluded. FlexDM implements an interface to WEKA to run batch processing tasks in a simple and intuitive way. In a short and easy-to-understand XML file, one can define hundreds of tests to be performed on several datasets. FlexDM also allows those tests to be executed asynchronously in parallel to take advantage of multi-core processors, significantly increasing usability and productivity. Results are saved incrementally for better robustness and reliability. FlexDM is implemented in Java and runs on Windows, Linux and OS X. As we encourage other researchers to explore and adopt our software, FlexDM is made available as a pre-configured bootable reference environment. All code, supporting documentation and usage examples are also available for download at http://sourceforge.net/projects/flexdm.
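The two robustness ideas in this abstract, running tests asynchronously in parallel and saving each result as soon as it is available, can be sketched generically. This is not FlexDM's code or API (FlexDM is written in Java and driven by an XML file): `run_test` is a hypothetical stand-in that returns a dummy score, the dataset and method names are illustrative strings, and a thread pool is used only to keep the sketch short. A CPU-bound workload in Python would more likely use a process pool.

```python
import csv
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_test(dataset, method):
    """Hypothetical stand-in for one classification test; returns a dummy score."""
    return {"dataset": dataset, "method": method, "score": len(dataset) + len(method)}


def run_batch(tests, out_path, workers=4):
    """Run all tests in parallel and write each result as soon as it completes."""
    with open(out_path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["dataset", "method", "score"])
        writer.writeheader()
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [pool.submit(run_test, d, m) for d, m in tests]
            # as_completed yields futures in finishing order, not submission order.
            for fut in as_completed(futures):
                writer.writerow(fut.result())
                fh.flush()  # persist now, so a later failure keeps earlier results


# Every (dataset, method) pair becomes one test; names are illustrative only.
tests = [(d, m) for d in ("iris", "glass") for m in ("J48", "NaiveBayes")]
out_path = os.path.join(tempfile.mkdtemp(), "results.csv")
run_batch(tests, out_path)

with open(out_path, newline="") as fh:
    rows = list(csv.DictReader(fh))
```

Because rows are written in completion order, the output file may list tests in a different order from the one in which they were defined. Flushing after every row is what gives the incremental-save behaviour: if a later test raises, the rows already written remain on disk.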
CV Oct 31, 2014
Addressing the non-functional requirements of computer vision systems: A case study
Shannon Fenn, Alexandre Mendes, David Budden
Computer vision plays a major role in the robotics industry, where vision data is frequently used for navigation and high-level decision making. Although there is significant research in algorithms and functional requirements, there is a comparative lack of emphasis on how best to map these abstract concepts onto an appropriate software architecture. In this study, we distinguish between the functional and non-functional requirements of a computer vision system. Using a RoboCup humanoid robot system as a case study, we propose and develop a software architecture that fulfills the latter criteria. The modifiability of the proposed architecture is demonstrated by detailing a number of feature detection algorithms and emphasizing which aspects of the underlying framework were modified to support their integration. To demonstrate portability, we port our vision system (designed for an application-specific DARwIn-OP humanoid robot) to a general-purpose Raspberry Pi computer. We evaluate performance on both platforms and compare them to a vision system optimised for functional requirements only. The architecture and implementation presented in this study provide a highly generalisable framework for computer vision system design that is of particular benefit in research and development, competition and other environments in which rapid system evolution is necessary.
RO Mar 27, 2014
The NUbots Team Description Paper 2014
Josiah Walker, Trent Houliston, Brendan Annable et al.
The NUbots team, from The University of Newcastle, Australia, has had a strong record of success in the RoboCup Standard Platform League since first entering in 2002. The team has also competed within the RoboCup Humanoid Kid-Size League since 2012. The 2014 team brings a renewed focus on software architecture, modularity, and the ability to easily share code. This paper summarizes the history of the NUbots team, describes the roles and research of the team members, gives an overview of the NUbots' robots and software system, and addresses relevant research projects within the Newcastle Robotics Laboratory.