# Amazon Redshift

You can quickly import data from your Amazon Redshift Database into Exploratory.

Here is a [blog post](https://blog.exploratory.io/exploratory-data-analysis-for-amazon-redshift-with-r-dplyr-9a14441020eb#.aqcbfa6h8) introducing this support in detail.

## 1. Create a Connection to use

Create a connection by following [these instructions](https://docs.exploratory.io/data_import/database-data/connection).

## 2. Open Redshift Import dialog

Click the '+' button next to 'Data Frames' and select 'Database Data'.

![](/files/-M4oN-qBMezITjny_pQQ)

Click Amazon Redshift to select it.

![](/files/-M4oN66k4YJQfHTwMU5I)

## 3. Preview and Import

Click the Preview button to see the data returned from your Redshift database.

![](/files/-M4oN66mzrfS3x8GJpWw)

If it looks OK, click 'Import' to import the data into Exploratory.

## 4. Querying Random Sample Data

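You might want to take a random sample of the data that is a reasonable size for your analysis.

You can use the [md5](http://docs.aws.amazon.com/redshift/latest/dg/r_MD5.html) function to generate a pseudo-random value for each row and order by it, as in the query below, to get a random sample of the data. Because md5 is deterministic, the same seed string returns the same sample; change 'randomSeed' to draw a different one.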

```
SELECT *
   FROM airline_2016_01
   ORDER BY md5('randomSeed' || flight_num)
   LIMIT 100000
```
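The behavior of this query can be tried locally before running it against Redshift. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for Redshift, a user-defined `md5` function registered from Python, and made-up flight numbers. It only illustrates that ordering by `md5(seed || column)` gives a repeatable sample for a given seed; it is not how Exploratory or Redshift runs the query.

```python
import hashlib
import sqlite3


def md5_hex(text):
    """Stand-in for Redshift's MD5(): hex digest of a string."""
    return hashlib.md5(str(text).encode("utf-8")).hexdigest()


def sample_rows(conn, seed, limit):
    """Return `limit` flight numbers ordered by md5(seed || flight_num)."""
    query = (
        "SELECT flight_num FROM airline_2016_01 "
        "ORDER BY md5(? || flight_num) LIMIT ?"
    )
    return [row[0] for row in conn.execute(query, (seed, limit))]


# Build a small in-memory table with hypothetical data (assumption).
conn = sqlite3.connect(":memory:")
conn.create_function("md5", 1, md5_hex)
conn.execute("CREATE TABLE airline_2016_01 (flight_num TEXT)")
conn.executemany(
    "INSERT INTO airline_2016_01 VALUES (?)",
    [(str(n),) for n in range(1, 1001)],
)

first = sample_rows(conn, "randomSeed", 10)   # sample for one seed
again = sample_rows(conn, "randomSeed", 10)   # same seed, same sample
other = sample_rows(conn, "anotherSeed", 10)  # different seed, different sample
```

Note that rows sharing the same `flight_num` get the same hash, so they sort next to each other; concatenating a column with more distinct values may give a more evenly mixed sample.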

## 5. Using Parameters in SQL

First, click the parameter link in the SQL Data Import dialog.

![](/files/-M4oN2yAtTsqxNt3r0_M)

Second, define a parameter and click the Save button.

![](/files/-M4oN2yC4K_GmMFFjzoo)

Finally, surround the parameter name with @{} inside the query, like below.

```
select *
from airline_2016_01
where carrier = @{carrier}
```

If you type @, the editor suggests the available parameters, like below.

![](/files/-M4oN2yEXevqQptAWUEl)

Here's a [blog post](https://exploratory.io/note/kanaugust/An-Introduction-to-Parameter-in-Exploratory-WCO4Vgn7HJ) with more detail.

## 6. AWS Security Group Setup

![](/files/-M4oN66v_0i1voeo7Yp-)

If you encounter a database connection error, go to the AWS console and make sure that your client PC's IP address has been added to the inbound rules of the Security Group associated with the Redshift cluster.
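Before changing the Security Group, it can help to confirm whether the cluster port is reachable from your PC at all. The following is a minimal sketch, assuming the cluster listens on Redshift's default port 5439; the function name `can_reach` and the endpoint in the usage note are hypothetical. A failed check is consistent with a missing inbound rule, but it can also have other causes such as a wrong host name or a cluster that is not publicly accessible.

```python
import socket


def can_reach(host, port=5439, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

For example, `can_reach("my-cluster.example.us-east-1.redshift.amazonaws.com")` (a placeholder endpoint) returns `False` when the connection is blocked or times out.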

## 7. Number of rows

For performance reasons, the actual number of rows is no longer shown by default, because it can only be obtained by executing the whole query again.

![](/files/-M4oN66xWkcCxqo19jUZ)

If you still want to show the actual number of rows for your query, you can do so by changing a setting in System Configuration.

![](/files/-M4oN66zCHttnANqtQHx)

Then set "Show Actual Number of Rows on SQL Data Import Dialog" to "Yes".

![](/files/-M4oN670aVGCuTHA4oG4)

The dialog then shows the actual number of rows, like below.

![](/files/-M4oN6725Fzzj5A208wn)
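To see why the actual row count costs a second execution, consider one common way of counting the rows of an arbitrary query: wrapping it in `SELECT COUNT(*)`. The following is a minimal sketch, assuming an in-memory SQLite database with made-up rows as a stand-in for Redshift; the helper `count_query` is hypothetical and is not a description of how Exploratory computes the count.

```python
import sqlite3


def count_query(sql):
    """Wrap a SELECT statement so it returns only its row count."""
    # Drop surrounding whitespace and a trailing semicolon before nesting.
    return "SELECT COUNT(*) FROM (" + sql.strip().rstrip(";") + ") AS counted"


# Hypothetical sample data (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE airline_2016_01 (carrier TEXT, flight_num TEXT)")
conn.executemany(
    "INSERT INTO airline_2016_01 VALUES (?, ?)",
    [("AA", "1"), ("AA", "2"), ("DL", "3")],
)

user_sql = "SELECT * FROM airline_2016_01 WHERE carrier = 'AA';"
# The database has to evaluate the whole inner query to produce the count.
total = conn.execute(count_query(user_sql)).fetchone()[0]
```

On a large Redshift table, this extra pass can take about as long as the original query, which is why the count is optional.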

## 8. Exploratory Data Analysis for Amazon Redshift with R & dplyr

Here is the link to the blog post [Exploratory Data Analysis for Amazon Redshift with R & dplyr](https://blog.exploratory.io/exploratory-data-analysis-for-amazon-redshift-with-r-dplyr-9a14441020eb).


