Dremio provides a SQL interface to various data sources such as MongoDB, JSON files, and Redshift. It is often described as a data fabric because it takes care of query optimization and data cache management across all the different types of data sources, so users don’t need to deal with the differences among them. It can also accelerate query performance, sometimes by up to 1000 times, by using highly optimized physical representations of source data with Apache Parquet, columnar in-memory processing with Apache Arrow, and advanced push-downs into the underlying data sources (when dealing with RDBMS or NoSQL sources).
In this post, I’m going to walk you through how to install Dremio on your local machine and connect to it from R and Exploratory.
If your local machine does not have Java yet, you need to install it. You can download Java from here.
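If you are not sure whether Java is already installed, a quick check like the one below can tell you. This is only a minimal sketch: the function name `java_status` is my own, and it simply runs `java -version` and looks at the exit code. I run the command rather than just looking for the file because, as far as I know, macOS can ship a placeholder `java` command that exists on the PATH but fails when no Java runtime is installed.

```python
import shutil
import subprocess


def java_status(executable="java"):
    """Return (ok, message) describing whether a working Java runtime is found.

    `java_status` is a hypothetical helper name used only for this sketch.
    """
    path = shutil.which(executable)
    if path is None:
        return False, f"'{executable}' was not found on PATH"
    # `java -version` prints its output to stderr on most distributions.
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    message = (result.stderr or result.stdout).strip()
    return result.returncode == 0, message


if __name__ == "__main__":
    ok, message = java_status()
    print("Java is available:" if ok else "Java is NOT available:")
    print(message)
```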
First, you need to download Dremio from the Dremio Download Site. In this blog post, I’ll explain the installation of Dremio Community Edition on Mac.
Once you have downloaded the installation file (in my case, a .dmg file), double-click it and you’ll see something like this. Drag the Dremio icon into the Applications folder.
That’s it! Now let’s start Dremio.
To start Dremio, double-click the Dremio application you just installed.
A controller window pops up like this. Click the Start button.
Once Dremio has started, the Open Dremio button becomes clickable like this. Click it.
This opens a browser (or a new tab if your browser is already open). Enter the required fields for your Admin account.
Click Next, and congratulations! Now you can see the Dremio page like below.
Dremio provides a sample source, so let’s check it out. Click the Add Sample Source button.
Now you can see that samples.dremio.com has been added under the Samples data source. Click it.
You can see there are three samples under samples.dremio.com. Let’s click the first one, ‘SF_incidents2016.json’, which is a JSON file about criminal incidents that happened in San Francisco in 2016.
This opens a dialog like the one below, where you can preview its data, so click
Now it shows Dremio’s query dialog, which means that you can access this JSON file with SQL! And you can access it not only from Dremio’s query dialog but also from many other applications, including R and Exploratory. That’s what I’m going to do in the next section.
Before accessing the data in Dremio from R and Exploratory, you need to set up ODBC on your local machine. Below is an example for Mac.
Now let’s download the ODBC driver so that you can connect to Dremio from R and Exploratory. You can download the Dremio ODBC Driver from the same Dremio Download Site.
Double-click the downloaded file (in my case, a .dmg file). This opens a window like the one below.
Double-click the Dremio ODBC.pkg file and follow the instructions in the dialog (basically, click Continue and agree to the license).
Open the Connection dialog either from the Exploratory Desktop Project List page or from inside a project.
If you have already opened a project, select the Connections menu from the project header menu.
Click the Add button on the Connection List dialog.
From the Connection chooser, click the Dremio icon.
Enter the following fields:
Host (in this example, localhost)
Port (by default it’s 31010)
Username (the username that you set up for the Dremio Admin account)
Password (the password that you set up for the Dremio Admin account)
Once you have entered these fields, click the Test Connection button and make sure the connection test succeeds.
After confirming it, click the Add button to save the connection.
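If you want to use the same four values outside of the Exploratory dialog, they typically end up in an ODBC connection string. The sketch below only assembles such a string and does not connect to anything. Please treat the details as assumptions on my side: the driver name `Dremio ODBC Driver` and the `ConnectionType` and `AuthenticationType` keys are what I would expect for this driver, but you should check them against the driver you actually installed. The function name is hypothetical too.

```python
def _quote(value):
    """Wrap a value in braces when it contains characters ODBC treats specially."""
    text = str(value)
    if ";" in text or text.startswith("{"):
        # Inside a braced value, a closing brace is escaped by doubling it.
        return "{" + text.replace("}", "}}") + "}"
    return text


def build_dremio_connection_string(host, port, username, password,
                                   driver="Dremio ODBC Driver"):
    """Assemble an ODBC connection string from the fields entered above.

    The driver name and the ConnectionType / AuthenticationType keys are
    assumptions; adjust them to match your installed driver.
    """
    parts = [
        ("Driver", "{" + driver + "}"),
        ("ConnectionType", "Direct"),
        ("HOST", _quote(host)),
        ("PORT", _quote(port)),
        ("AuthenticationType", "Plain"),
        ("UID", _quote(username)),
        ("PWD", _quote(password)),
    ]
    return ";".join(f"{key}={value}" for key, value in parts)


if __name__ == "__main__":
    # Placeholder credentials; use the Admin account you created in Dremio.
    print(build_dremio_connection_string("localhost", 31010, "admin", "secret"))
```

A string like this is what ODBC-based clients generally accept, so the same host, port, username, and password should carry over to whichever client you use.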
On the left-hand side tree, click the plus (+) button next to the Data Frames label and select
Then select Dremio.
On the Data Import dialog, select the Dremio connection (i.e. Dremio Local Mac) that you just created and expand Samples."samples.dremio.com", where you can see the SF_incidents2016.json file that you added in Dremio from its samples. Click the table name, which automatically generates a SQL query to get the whole data. By clicking the Preview button, you will see the data returned in Exploratory!
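To give you an idea of what such a generated query looks like, here is a minimal sketch that builds a "select everything" statement from the source, folder, and file names. The exact text Exploratory generates may differ; the helper names are hypothetical, and I’m assuming the usual SQL convention of wrapping each part of the path in double quotes, which is needed here because the names contain dots.

```python
def quote_identifier(name):
    """Wrap one path element in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def select_all_query(path_parts, limit=None):
    """Build a SELECT * query for a dataset addressed by its path elements."""
    table = ".".join(quote_identifier(part) for part in path_parts)
    query = f"SELECT * FROM {table}"
    if limit is not None:
        # An optional row limit is handy while previewing large datasets.
        query += f" LIMIT {int(limit)}"
    return query


if __name__ == "__main__":
    print(select_all_query(
        ["Samples", "samples.dremio.com", "SF_incidents2016.json"]
    ))
```

Without the quoting, the dots inside samples.dremio.com and SF_incidents2016.json would be read as path separators, so each element is quoted separately.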
Currently, non-ASCII column names are not imported correctly.
So now you know how to install and set up Dremio and the ODBC driver on your machine and how to access Dremio data from Exploratory. In the next couple of blog posts, I’ll talk about how you can query YOUR data (JSON files, MongoDB, Redshift, etc.) with Dremio and Exploratory.