Correlation by Column
Calculates correlations among columns.
Input data should contain numeric columns.
- Variable Columns - Numeric columns among which correlations are calculated.
- Group By - Categorical column to group by. If you specify this, the analysis runs separately for each group.
- Method - Method to calculate correlations. The default is "Pearson". This can be:
- Show Only Lower Triangle - Show only the lower triangle of the matrix, so that the correlation for each pair of columns is not shown twice.
- Show Diagonal Values - Show the values on the diagonal of the matrix. These are always 1, because each column is perfectly correlated with itself.
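To make the effect of the "Show Only Lower Triangle" and "Show Diagonal Values" options concrete, here is a minimal sketch in plain Python. It is not how the feature is implemented (the feature relies on R's cor function); the function names, the parameter names `lower_only` and `show_diagonal`, the dict-of-lists input shape, and the sample data are all assumptions made for illustration. Only the Pearson method is shown.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns, lower_only=False, show_diagonal=True):
    """Return {(row_name, col_name): r} for the cells that would be displayed.

    columns: dict mapping column name -> list of numbers (hypothetical shape).
    lower_only: keep only cells on or below the diagonal.
    show_diagonal: keep the self-correlation cells on the diagonal.
    """
    names = list(columns)
    cells = {}
    for i, row in enumerate(names):
        for j, col in enumerate(names):
            if lower_only and j > i:
                continue  # skip the upper triangle (duplicate pairs)
            if not show_diagonal and i == j:
                continue  # skip self-correlations, which are always 1
            cells[(row, col)] = pearson(columns[row], columns[col])
    return cells


# Made-up sample data, for illustration only.
data = {
    "height": [150, 160, 170, 180],
    "weight": [50, 58, 69, 80],
    "errors": [9, 7, 4, 2],
}

full = correlation_matrix(data)
lower = correlation_matrix(data, lower_only=True, show_diagonal=False)
print(len(full), len(lower))
```

With three columns, the full matrix has nine cells, while the lower triangle without the diagonal keeps only the three distinct pairs.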
How to Use This Feature
- Click the Analytics View tab.
- If necessary, click the "+" button to the left of the existing Analytics tabs to create a new Analytics.
- Select "Correlation by Columns" for Analytics Type.
- Click Variable Columns to open the Column Selector dialog.
- Select the numeric columns for which you want to calculate correlations.
- Click the Run button to run the analytics.
- Click a view type link (explained below) to see each type of generated visualization.
"Correlation Matrix" View
"Correlation Matrix" View displays correlations as a heatmap. Red indicates a positive correlation and blue indicates a negative correlation. The darker the color, the stronger the correlation.
"Scatter Matrix" View
"Scatter Matrix" View displays the actual data distributions for each variable combination.
"Positive Correlations" View
"Positive Correlations" View displays the 100 most positive correlations. You can click the Correlation column header to sort the data by correlation value.
"Negative Correlations" View
"Negative Correlations" View displays the 100 most negative correlations. You can click the Correlation column header to sort the data by correlation value.
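The ranking behind the "Positive Correlations" and "Negative Correlations" views can be sketched as follows. This is an illustrative approximation in plain Python, not the feature's actual implementation: the function `ranked_pairs`, its parameters, and the sample data are hypothetical, and filtering each list to strictly positive or strictly negative values is an assumption about how the views behave.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranked_pairs(columns, top_n=100, positive=True):
    """Return up to top_n (col_a, col_b, r) tuples, most extreme first.

    Assumption: the positive list keeps only r > 0 and the negative list
    keeps only r < 0.
    """
    # Each unordered pair of distinct columns is evaluated exactly once.
    pairs = [
        (a, b, pearson(columns[a], columns[b]))
        for a, b in combinations(columns, 2)
    ]
    if positive:
        kept = [p for p in pairs if p[2] > 0]
        kept.sort(key=lambda p: p[2], reverse=True)  # largest r first
    else:
        kept = [p for p in pairs if p[2] < 0]
        kept.sort(key=lambda p: p[2])  # most negative r first
    return kept[:top_n]


# Made-up sample data, for illustration only.
data = {
    "height": [150, 160, 170, 180],
    "weight": [50, 58, 69, 80],
    "errors": [9, 7, 4, 2],
}

most_positive = ranked_pairs(data, top_n=100, positive=True)
most_negative = ranked_pairs(data, top_n=100, positive=False)
for a, b, r in most_positive + most_negative:
    print(f"{a} ~ {b}: {r:.3f}")
```

Because each pair is evaluated once, the lists contain no mirrored duplicates such as both (height, weight) and (weight, height).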
Correlation by Column uses the cor function from the stats R Package under the hood.
Exploratory R Package
For details about how stats is used in the Exploratory R Package, please refer to the GitHub repository.