SY Mar 29, 2011
Converging an Overlay Network to a Gradient Topology
Håkan Terelius, Guodong Shi, Jim Dowling et al.
In this paper, we investigate the topology convergence problem for the gossip-based Gradient overlay network. In an overlay network where each node has a local utility value, a Gradient overlay network is characterized by the properties that each node has a set of neighbors with the same utility value (a similar view) and a set of neighbors containing higher utility values (gradient neighbor set), such that paths of increasing utilities emerge in the network topology. The Gradient overlay network is built using gossiping and a preference function that samples from nodes using a uniform random peer sampling service. We analyze it using tools from matrix analysis, prove both necessary and sufficient conditions for convergence to a complete gradient structure, estimate the convergence time, and provide bounds on the worst-case convergence time. Finally, we show the potential of the Gradient overlay in simulations by building a live-streaming peer-to-peer (P2P) system that is more efficient than one built using uniform random peer sampling.
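The construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' protocol: the preference function used here (keep the higher-utility peers closest to the node's own utility), the function names, and all parameters are assumptions for illustration, and the sketch omits the similar view and any exchange of views between neighbors. The uniform peer sampling service is simulated by drawing random nodes from the full node list.

```python
import random


def prefer(own_utility, candidates, utilities, size):
    """Hypothetical preference function: keep up to `size` candidates whose
    utility is strictly higher than ours, closest to our own utility first."""
    higher = [c for c in set(candidates) if utilities[c] > own_utility]
    higher.sort(key=lambda c: (utilities[c] - own_utility, c))
    return higher[:size]


def gossip_round(utilities, gradient_sets, sample_size, view_size, rng):
    """One round: every node draws a uniform random sample of other nodes
    (standing in for the peer sampling service) and re-selects its
    gradient neighbor set from the old set plus the sample."""
    nodes = list(utilities)
    for node in nodes:
        others = [n for n in nodes if n != node]
        sample = rng.sample(others, min(sample_size, len(others)))
        candidates = gradient_sets[node] + sample
        gradient_sets[node] = prefer(
            utilities[node], candidates, utilities, view_size
        )


def build_overlay(utilities, rounds=50, sample_size=3, view_size=2, seed=0):
    """Run a fixed number of gossip rounds from empty neighbor sets."""
    rng = random.Random(seed)
    gradient_sets = {n: [] for n in utilities}
    for _ in range(rounds):
        gossip_round(utilities, gradient_sets, sample_size, view_size, rng)
    return gradient_sets


if __name__ == "__main__":
    utilities = {i: float(i) for i in range(20)}
    overlay = build_overlay(utilities)
    for node in sorted(overlay):
        print(node, "->", overlay[node])
```

Under these assumptions every edge points to a node of strictly higher utility, so following gradient neighbors yields a path of increasing utility that ends at a node with no higher-utility peer.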
DC Jul 18, 2023 Code
Cloud-native RStudio on Kubernetes for Hopsworks
Gibson Chikafa, Sina Sheikholeslami, Salman Niazi et al.
To fully benefit from cloud computing, services are designed following the "multi-tenant" architectural model, which aims to maximize resource sharing among users. However, multi-tenancy introduces challenges of security, performance isolation, scaling, and customization. RStudio server is an open-source Integrated Development Environment (IDE) for the R programming language that is accessible over a web browser. We present the design and implementation of a multi-user distributed system on Hopsworks, a data-intensive AI platform, that follows the multi-tenant model and provides RStudio as Software as a Service (SaaS). We use the most popular cloud-native technologies, Docker and Kubernetes, to solve the problems of performance isolation, security, and scaling that arise in a multi-tenant environment. We further enable secure data sharing in RStudio server instances to provide data privacy and allow collaboration among RStudio users. We integrate our system with Apache Spark, which can scale to handle Big Data processing workloads. We also provide a UI where users can supply custom configurations and have full control of their own RStudio server instances. Our system was tested on a Google Cloud Platform cluster with four worker nodes, each allocated 30 GB of RAM. The tests on this cluster showed that 44 RStudio servers, each with 2 GB of RAM, can run concurrently. Our system can potentially scale out to support hundreds of concurrently running RStudio servers by adding more resources (CPUs and RAM) to the cluster.
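The capacity figures in the abstract can be checked with a short calculation. The sketch below uses only the numbers stated above (four worker nodes, 30 GB each, 44 servers at 2 GB each). The `nodes_needed` helper is hypothetical and assumes that capacity scales linearly at the observed density of servers per node, which the abstract does not claim; CPU limits and system overhead are ignored.

```python
import math

# Figures taken from the abstract.
WORKER_NODES = 4
RAM_PER_NODE_GB = 30
SERVERS_OBSERVED = 44
RAM_PER_SERVER_GB = 2


def memory_summary():
    """Return (total cluster RAM, RAM used by the servers, fraction used)."""
    total = WORKER_NODES * RAM_PER_NODE_GB
    used = SERVERS_OBSERVED * RAM_PER_SERVER_GB
    return total, used, used / total


def nodes_needed(target_servers):
    """Hypothetical estimate: worker nodes required for `target_servers`,
    assuming the observed density (servers per node) stays constant."""
    servers_per_node = SERVERS_OBSERVED / WORKER_NODES
    return math.ceil(target_servers / servers_per_node)


if __name__ == "__main__":
    total, used, fraction = memory_summary()
    print(f"{used} GB of {total} GB used ({fraction:.0%})")
    print("nodes for 200 servers:", nodes_needed(200))
```

With these numbers the 44 servers account for 88 GB of the 120 GB allocated, about 73% of cluster memory, and the linear estimate gives 19 nodes of the same size for 200 servers.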