RCloud enables Data Scientists to find value in data by sharing ideas and techniques with each other and results with any user in the community. RCloud employs notebooks composed of cells where topics may easily be broken up into concepts and where the relationship between the concepts can be easily understood.
In this environment, data science teams benefit from easier sharing of scripts and data feeds, experiments, annotations and automated recommendations, which go well beyond what traditional individual or locally based development environments provide. Since notebooks are saved and searchable, rather than spending time "reinventing the wheel", RCloud Data Scientists may reuse notebooks or parts of notebooks for similar projects.
Notebooks are project packages which contain all the components and dependencies of an analysis, including code, data, comments and other technical documentation, charts and deployment capabilities. This means results may be verified and reproduced by anyone with access to the notebook.
Unlike analyses that are stored locally, web-based content allows Data Scientists to collaborate over the Internet without concern for the location or syncing of system parameters, code and data. For example, the public instance of RCloud includes all CRAN packages. Local development environment parameters no longer create hurdles for viewing and running analyses, since RCloud is platform independent. User access and controls also remain constant, which promotes user confidence and engagement.
RCloud supports all R packages including iotools for processing large data files and SparkR - an R frontend for Apache Spark.
An RCloud session runs on both client and server, so it is possible for R functions on the server to call JavaScript functions on the client, and vice versa. This makes rich web-based content possible through easy integration of custom JavaScript user interfaces (UI) and JavaScript libraries such as jQuery and D3.
RCloud supports efficient, secure client/server connections via the FastRWeb package and a discipline known as object capabilities (ocap). This means that web browsers never directly instruct the RCloud backend to execute arbitrary code, which prevents unauthenticated clients from making unauthorized calls to the RCloud runtime environment; read more about ocap on our Documentation introduction page or see the Wikipedia article. RCloud also maintains an automatic Git-based trail of code modifications that documents the development history of an RCloud-based project, and RCloud notebooks may be encrypted for added security. In the notebook product space, authenticated client-server channeling and notebook encryption are unique to RCloud.
Code, widget, dashboard and analysis libraries are an ancillary benefit that comes with the creation of every RCloud notebook (on the public instance or in private clouds).
These "knowledge assets" may be used on similar projects or to train new employees, for example. RCloud notebooks are searchable and forkable (sharable), so anyone with access to the notebooks may view and reuse a notebook and its content or various parts.
Standard analytic workflows often involve performing a data science experiment using one set of tools and deploying the results using other tools.
Rather than stitching pipelines together and/or dealing with inconsistencies between development and production environments, every RCloud notebook is named by a URL so analyses may easily be transformed from code and technical documentation to markdown annotated notebooks or rich web-based dashboards.
RCloud is an ideal environment for research and data analysis departments that commonly use R and/or Python to share, publish and archive their code and data. RCloud is roughly comparable in compute efficiency to in-memory Business Intelligence (BI) tools like Tableau and QlikView, but unlike these systems, you can add users to RCloud without licensing concerns.
Open source software also means access to free online help and user training through forums like Stack Overflow.
System | Collaboration | Built-in Security | No-cost User Scalability | Versioning / Forking | Dashboards | Multi (Programming) Language Support | Integrated Reports | Integrated Analyses |
---|---|---|---|---|---|---|---|---|
RCloud | X | X | X | X | X | X | X | X |
RStudio | X | X | X | |||||
RStudio Shiny Pro | X | X | X | X | ||||
JSFiddle | X | X | X | X | ||||
bl.ocks | X | X | X | X | ||||
Jupyter | X | X | X | X | X | |||
Tableau | X | X | X | X |
2016 New York R Conference Presentation; RCloud - Collaborative Environment for Visualization and Big Data Analytics
Public Instance
The public RCloud instance is the perfect place to test drive RCloud. Anyone with a GitHub account can have full user access to this instance. If you don't already have a GitHub account you can create one at the GitHub login page; see the next section Data Scientist Access (or Logging in) for details.
Local Installation
A local instance has the advantage of granting you complete control of your notebooks and data. Local instances also afford the opportunity to evaluate and experiment with the RCloud infrastructure. Windows is not supported at this time, but the following describes the requirements for the OS X download:
Requirements:
Alternatively, you may use our RCloud Docker package.
Local Enterprise Installation
Complete detailed instructions for installing a local enterprise RCloud instance may be found on GitHub Setting up (Installing) RCloud.
Both the public instance of RCloud and any local installation come with two methods for accessing notebooks - Anonymous (unregistered) User access and Data Scientist (registered user) access.
Anonymous Users may view and interact with RCloud notebooks without having an RCloud account. This is done by navigating to the RCloud notebook web page using a URL / hyperlink. The purpose of Anonymous User access is to allow Data Scientists to easily share their work as a web page (hyperlink) with non-developers and/or users who do not have an RCloud account at any stage of the development process.
To find out how a Data Scientist may share their work through hyperlinks, please refer to the Sharing Notebooks tutorial.
Data Scientist Access
An RCloud Data Scientist is someone with an RCloud account who can create, edit, fork (copy) and share notebooks. Since RCloud provides automatic version control by storing notebooks as GitHub gists, creating an RCloud account requires that you have a GitHub account. If you already have a GitHub account, you may skip to Step 3.
Note: For new GitHub accounts, you will need to navigate back to the Log In page to get to the GitHub Authorize application page, or you can find it in your GitHub profile settings.
Anonymous User versus Data Scientist (Logging In) Access Video Tutorial
The RCloud Integrated Development Environment (IDE) is composed of a navigation (header) bar, a left windowshade panel, a right windowshade panel and, in the center, Prompt (R/Python) and Markdown cells. As an RCloud Data Scientist, you have the ability to create, run (execute), fork, edit and share every notebook in the RCloud IDE. This access is unique to RCloud and allows Data Scientists to leverage existing work and recreate (reproduce) past work.
Notebooks may be created using the + (plus) symbol located at the top of the left windowshade panel and run using the play button in the navigation bar. Prompt cells are the equivalent of R or Python command line sessions. Prompt cells may be switched to Markdown cells using the dropdown menu in each cell. Cells may be edited and deleted using the icons associated with each cell.
Entire notebooks may be deleted using the x symbol which appears when you place your mouse next to the notebook name. Likewise, notebooks may be starred ("liked"), made hidden and placed into groups using the respective icons next to the notebook name.
RCloud Data Scientists may view past versions of notebooks by clicking on the clock icon next to the notebook name: versions are automatically created every time you run or save a notebook. Popular notebooks may be viewed by clicking the Discover menu item in the navigation bar. The RCloud UI functionality in the left and right windowshade panels includes File Upload (covered in more detail in the Data - Loading and Saving tutorial), notebook Comments, Search, Workspace, Dataframe and Session information and Help access. Assets located in the right windowshade panel may be code or images and are accessed using the RCloud API.
Introduction to the RCloud Integrated Development Environment Video Tutorial
To create an RCloud notebook, you must be logged in as an RCloud Data Scientist. Once you are logged in, you may create a notebook using two methods:
For example, to get started, fork an existing notebook (you must be logged in to access this notebook):
RCloud Creating Notebooks Video Tutorial
RCloud supports Markdown and RMarkdown - select the desired markdown by changing the cell type in the RCloud editing environment:
There are several methods to access data in RCloud including the following:
Files may be uploaded to a User's home directory by using the GUI interface found in the right windowshade panel.
Note: In this example, the 'Upload to notebook' box is not ticked. Since the data is now located in your home directory, you would access the path to the data using the following RCloud API 'rcloud.home()':
    # Use RCloud API to read data
    # Data source: http://www2.census.gov/geo/docs/maps-data/data/rel/zcta_county_rel_10.txt
    fn1 <- read.csv(rcloud.home('zcta_county_rel_10.txt'), sep=',', colClasses="character")
    summary(fn1)
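The colClasses="character" argument in the call above matters for files like this one, where codes can start with zeros. The following is a minimal sketch in plain R, not RCloud-specific: the two inline rows and the column names ZCTA5 and STATE are illustrative stand-ins for the uploaded file, and read.csv is pointed at inline text rather than at a path under rcloud.home().

```r
# Tiny inline sample standing in for the uploaded file (illustrative rows only).
sample_csv <- "ZCTA5,STATE\n00601,72\n00602,72"

# Default type guessing turns the codes into integers and drops leading zeros.
as_numbers <- read.csv(text = sample_csv)

# colClasses = "character" keeps every column as text, zeros included.
as_text <- read.csv(text = sample_csv, colClasses = "character")

print(as_numbers$ZCTA5)
print(as_text$ZCTA5)
```

With the default settings the first code is read as the number 601, while the character version keeps it as "00601", which is usually what you want for identifiers that will later be joined or displayed.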
Files may be uploaded directly to a specific RCloud Notebook by using the same process as in the first method, but also ticking the 'Upload to notebook' box in the GUI.
Note: In this example, the 'Upload to notebook' box is ticked. Since the data is now an RCloud 'asset', you would access the path to the data using the following RCloud API 'rcloud.get.asset()':
    # Use RCloud API to read data
    # Data source: http://archive.ics.uci.edu/ml/datasets/Zoo
    fn2 = rcloud.get.asset('zoo_data.txt', as.file=TRUE)
    t2 = read.table(fn2,sep=",",header=TRUE)
Data may also be uploaded to RCloud by dragging and dropping files from your local machine to the Asset windowpane; the 'Drop File to Asset' GUI will automatically appear as you drag files.
Note: Using this method, file size is currently limited to 75KB on the public instance of RCloud.
Since the data is now an RCloud asset, it is referenced using the same method as in #2:
    # Use RCloud API to read data
    # Data source: http://archive.ics.uci.edu/ml/datasets/Wholesale+customers
    fn3 = rcloud.get.asset('Wholesale_customers_data.csv', as.file=TRUE)
    t3 = read.table(fn3,sep=",",header=TRUE)
RCloud Data Scientists may also manually enter data as an RCloud Asset. First click on the 'New Asset' tab in the Asset panel:
Type a file name:
Then either type data in the tab or use keyboard shortcut keys to copy and paste (e.g., Ctrl-A, Ctrl-C, Ctrl-V).
Since the data is now an RCloud asset, it is referenced using the same method as in #2:
    # Use RCloud API to read data
    # Data source: http://www.cs.waikato.ac.nz/ml/weka/
    fn4 = rcloud.get.asset('Play_tennis.csv', as.file=TRUE)
    t4 = read.table(fn4,sep=",",header=TRUE)
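To see what the read.table arguments used above do before typing or pasting a real asset, the same call can be tried on inline text. This is a rough sketch in plain R: the three rows and the column names are made up for illustration and are not the actual Play_tennis.csv contents, and text= replaces the file path that rcloud.get.asset would return.

```r
# A few comma-separated rows, as they might be typed into a new asset
# (illustrative values, not the full data set).
pasted <- "outlook,windy,play
sunny,FALSE,no
overcast,FALSE,yes
rainy,TRUE,no"

# Same sep and header arguments as in the tutorial, applied to inline text.
t_demo <- read.table(text = pasted, sep = ",", header = TRUE)

# Inspect the column names and the types read.table inferred.
str(t_demo)
```

header=TRUE takes the column names from the first row, and sep="," splits each row on commas, so a missing header row or a different delimiter in the pasted text would show up immediately in the str() output.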
There are several methods to save data in RCloud including the following:
Save data in your RCloud home directory by specifying the path:
    oDir = rcloud.home()
    outFn = paste(oDir,"/outputTest.txt",sep="")
    # Write to file
    write.table(t4, outFn)

    # Create the output file; "wb" = write binary
    # f = file( outFn, "wb")
    #
    # Standard R write binary function
    # writeBin(t4,f);

Some useful commands for data processing are:
1. file.remove(rcloud.upload.path("foo.txt"))
2. list.files(rcloud.home())
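The save step and the two helper commands can be tried outside RCloud as well. The sketch below is one way to do that under stated assumptions: a scratch directory created with tempfile() stands in for rcloud.home(), t_demo is a made-up data frame in place of t4, and file.path() is used instead of paste() to build the path. None of these substitutions are part of the RCloud API.

```r
# Stand-in for rcloud.home(): a scratch directory, so the sketch runs anywhere.
oDir <- tempfile("rcloud_home_")
dir.create(oDir)

# Small illustrative data frame in place of t4.
t_demo <- data.frame(outlook = c("sunny", "rainy"), play = c("no", "yes"))

# Build the output path and write the table, as in the tutorial.
outFn <- file.path(oDir, "outputTest.txt")
write.table(t_demo, outFn)

# Confirm the file exists, read it back, then remove it again.
files_before <- list.files(oDir)
t_back <- read.table(outFn, header = TRUE)
removed <- file.remove(outFn)
files_after <- list.files(oDir)
```

Because write.table also writes row names by default, the header line has one field fewer than the data lines; read.table then treats the first column as row names, so t_back comes back with the original two columns.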
This information may also be viewed in the RCloud Sample Notebook: Data - Loading and Saving.
Loading Data into RCloud Video Tutorial
RCloud Data Scientists may obtain a hyperlink (URL) to a notebook they have created by selecting the kind of URL they would like to share with registered and unregistered users. The simplest form of sharing is the view.html option which may be selected in the drop down menu of the share icon in the navigation bar.
Clicking the share icon will produce a web page (URL) that registered users can share with other registered users. RCloud Data Scientists (registered users) may view the underlying code by clicking the edit icon or run the notebook by clicking the play icon in the navigation bar. This means that by default users who wish to view notebooks must be logged into RCloud.
However, if the Publish Notebook box is checked in the Advanced Menu (located in the navigation bar), any user who has network access to the notebook's URL will be able to execute (run), view and share the notebook.
RCloud notebooks are not static web pages. Executing a notebook will fetch data and return live / updated results. Unregistered users may view the source code by selecting Show Source in the Advanced menu located in the navigation bar. Alternatively, source code may be viewed as a GitHub repository by selecting Open in GitHub in the Advanced menu.
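If you want to assemble share links programmatically, for example to list several notebooks in a report, something like the following may help. It is only a sketch: the host name and notebook ID are hypothetical placeholders, notebook_url is a helper invented here, and the page?notebook=id pattern is an assumption that mirrors the edit.html link built in the Shiny example of this documentation. Check the link produced by the share icon on your own instance before relying on it.

```r
# Hypothetical values: replace with your RCloud host and a real notebook ID.
base_url <- "https://rcloud.example.org"
notebook_id <- "0123456789abcdef"

# Assumed URL pattern: <host>/<page>?notebook=<id>.
# notebook_url() is an illustrative helper, not part of the RCloud API.
notebook_url <- function(page, id, base = base_url) {
  paste0(base, "/", page, "?notebook=", id)
}

view_link <- notebook_url("view.html", notebook_id)
edit_link <- notebook_url("edit.html", notebook_id)
print(view_link)
```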
RCloud Introduction to Sharing Video Tutorial
Make a notebook protected (private) by clicking on the "eye" icon next to the notebook name in the left windowpane:
Protected notebooks are readable only by the owner and (optionally) a select group of users and will not show up in search results (although previously unprotected versions might).
Use the second tab of the protection dialog to create/rename groups and/or assign other users as administrators/members of groups you administrate. Alternatively, you can select Manage Groups from the Advanced menu item in the navigation bar — note that the Notebook tab will be grayed out in that case, as Manage Groups is not notebook specific.
This tutorial describes how to write and deploy an R Shiny application on RCloud.
Hosting Shiny applications on RCloud allows you to enjoy the convenience of RCloud development and distribution with the elegant user interface features of Shiny.
This tutorial assumes you are familiar with the basics of Shiny application development. To learn more about constructing Shiny apps or for a refresher on the Shiny architecture visit the RStudio Shiny Tutorial page.
    library(rcloud.shiny)
    library(shiny)

    # Put code here that will run once when the app is loaded
    df = iris

    # This is your Shiny User Interface (UI) layout
    ui = fluidPage(
      titlePanel("My Shiny RCloud Example"),
      helpText(a("View Source in RCloud UI", target="_blank",
                 href=paste0("/edit.html?notebook=", rcloud.session.notebook.id()))),
      verticalLayout(
        sidebarPanel(
          selectInput("src0","Sepal",c("Length","Width")),
          selectInput("src1","Petal",c("Length","Width")),
          hr(),
          helpText(paste("This data set has",nrow(df),"rows"))
        ),
        mainPanel(
          plotOutput("thePlot")
        )
      )
    )

    # This is the standard Shiny server function
    server = function(input,output) {
      output$thePlot = renderPlot({
        # src0 selects the Sepal measurement, src1 the Petal measurement
        x = df[[paste0("Sepal.",input$src0)]]
        y = df[[paste0("Petal.",input$src1)]]
        plot(x,y,xlab=paste("Sepal",input$src0),ylab=paste("Petal",input$src1))
      })
    }

    # Start the Shiny app in your browser.
    rcloud.shinyApp(ui=ui,server=server)
View sample Shiny notebooks on the public instance by logging in as a Data Scientist and navigating to the 'RCloud Sample Notebooks/Dashboarding/RCloud shiny.html' directory, or click on the image below to view a Word Cloud example.
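The server function in the example turns each dropdown selection into an iris column name with paste0() and then pulls that column out with [[ ]]. The lookup can be checked on its own in plain R, without Shiny or RCloud. In this sketch the two strings are hypothetical stand-ins for input$src0 and input$src1; everything else is base R and the built-in iris data.

```r
# iris ships with R; these two strings stand in for input$src0 and input$src1.
sepal_choice <- "Length"
petal_choice <- "Width"

# paste0() glues the prefix and the selection into a column name.
x_name <- paste0("Sepal.", sepal_choice)
y_name <- paste0("Petal.", petal_choice)

# [[ ]] with a string pulls that column out as a plain numeric vector.
x <- iris[[x_name]]
y <- iris[[y_name]]

print(c(x_name, y_name))
```

Checking the composed names against names(iris) in this way is a quick test that a dropdown label and its column prefix actually belong together before wiring them into renderPlot.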