Category (Binning)

You can categorize numeric values inside a chart. Category is supported in the following chart types, at the positions shown in parentheses.

  • Pivot Table (Row, Column)

  • Summarize Table (Group By)

  • Bar (X-Axis, Color, Repeat By)

  • Line (X-Axis, Color, Repeat By)

  • Area (X-Axis, Color, Repeat By)

  • Ring / Pie (Repeat By)

  • Histogram (Color, Repeat By)

  • Density Plot (Color, Repeat By)

  • Scatter (With Aggregation) (X-Axis, Y-Axis, Color, Group By, Repeat By)

  • Boxplot (X-Axis, Color, Repeat By)

  • Violin (X-Axis, Color, Repeat By)

  • Error Bar (X-Axis, Color, Repeat By)

  • Error Bar (Summarized Data) (X-Axis, Color, Repeat By)

  • Map - Extension (Color)

  • LongLat Map (Group By, Color)

  • Heatmap (X-Axis, Y-Axis, Repeat By)

  • Radar (Color)

Creating Categories in General

If you assign a numeric column, the function option automatically changes to ‘As Category’ and the values are divided into 5 groups by default. The following example shows assigning a numeric column to the X-Axis of a Boxplot chart.

You can see that the ‘As Category’ option is assigned.

You can open the property dialog by clicking on the green text.

You can change the setting in the property dialog. The following example shows changing the number of categories from 5 to 10.

Creating a Category at Color

Category Types

It supports the following category types.

Equal Width

It divides numeric values into groups by the data range. Each group has an equal data range. This is the default type.

The following options are available.

  • Number of categories: Number of categories to create.

  • Label Text: Names for categories separated by commas.

  • Target Group: Target data group to create categories.

    • All: Create categories against the whole data set. If you use Repeat By, all charts will have the same data range.

    • Repeat By: Create categories for each chart if you use Repeat By. Each chart will have a different data range.

  • Set 0 as Center: It uses 0 as a center value when it creates categories.

  • Edge Value Handling: Which end to include in each range. The following options are available. Default is "Include Lower Edge".

    • Include Upper Edge

    • Include Lower Edge

  • Upper Range: You can set the upper value range for creating categories. If you don't specify, the max value will be used.

  • Lower Range: You can set the lower value range for creating categories. If you don't specify, the min value will be used.

  • For Values Outside of Range: If you specify the Upper/Lower Range, you can set how to treat values outside the range. The following options are available.

    • Create Groups: Create extra groups for the values outside the range.

    • Ignore

If you don't specify the Upper/Lower Range, the edge values (the min and max values) are handled by the following rules.

  • If the "Edge Value Handling" is set to "Include Upper Edge", the min value will be included in the lowest category.

  • If the "Edge Value Handling" is set to "Include Lower Edge", the max value will be included in the highest category.
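
The Equal Width rules above can be sketched in a few lines of Python. This is only an illustration of the idea, not Exploratory's implementation: the function name `equal_width_bins` and its `include` argument are hypothetical, and the Upper/Lower Range and "Set 0 as Center" options are not modeled.

```python
import math

def equal_width_bins(values, n_bins=5, include="lower"):
    """Return a 0-based bin index per value, using equal-width ranges.

    include="lower": each range is [a, b); the max value joins the highest bin.
    include="upper": each range is (a, b]; the min value joins the lowest bin.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so there is only one meaningful bin.
        return [0 for _ in values]
    result = []
    for v in values:
        pos = (v - lo) / width
        if include == "lower":
            idx = math.floor(pos)
        else:
            idx = math.ceil(pos) - 1
        # Clamp so that the min/max edge values fall into the outermost bins.
        result.append(min(max(idx, 0), n_bins - 1))
    return result

print(equal_width_bins([0, 20, 50, 100], 5, "lower"))
print(equal_width_bins([0, 20, 50, 100], 5, "upper"))
```

In this sketch the value 20 sits exactly on a boundary, so it lands in the second bin with "lower" and in the first bin with "upper", while 0 and 100 always stay in the lowest and highest bins.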

Equal Frequency

It divides the numeric values into groups by the number of data points. Each group will have the same number of data points (data rows). Note that the groups won't have exactly the same number of rows if the data contains tied values.

The following options are available.

  • Number of categories: Number of categories to create.

  • Label Text: Names for categories separated by commas.

  • Target Group: Target data group to create categories.

    • All: Create categories against the whole data set. If you use Repeat By, all charts will have the same data range.

    • Repeat By: Create categories for each chart if you use Repeat By. Each chart will have a different data range.

  • Edge Value Handling: Which end to include in each range. The following options are available. Default is "Include Lower Edge".

    • Include Upper Edge

    • Include Lower Edge

If you don't specify the Upper/Lower Range, the edge values (the min and max values) are handled by the following rules.

  • If the "Edge Value Handling" is set to "Include Upper Edge", the min value will be included in the lowest category.

  • If the "Edge Value Handling" is set to "Include Lower Edge", the max value will be included in the highest category.
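
To see why tied values break the equal counts, here is a minimal Python sketch of equal-frequency binning. It assumes a simple rank-based rule for choosing the cut points and lower-edge-inclusive ranges; the actual rule used by Exploratory is not stated on this page, and `equal_frequency_bins` is a hypothetical name.

```python
from bisect import bisect_right
from collections import Counter

def equal_frequency_bins(values, n_bins=5):
    """Return a 0-based bin index per value so that bins hold similar counts."""
    ordered = sorted(values)
    # Assumed rule: take the cut point at every (len / n_bins)-th sorted value.
    cuts = [ordered[(k * len(ordered)) // n_bins] for k in range(1, n_bins)]
    # bisect_right makes each range include its lower edge: [cut, next_cut).
    return [bisect_right(cuts, v) for v in values]

# Without ties, every bin receives the same number of rows.
print(Counter(equal_frequency_bins(list(range(1, 11)), 5)))

# With ties, equal values must share a bin, so the counts become uneven.
print(Counter(equal_frequency_bins([1, 1, 1, 1, 2, 3], 3)))
```

Because identical values cannot be split across bins, the second call puts all four 1s into a single bin and leaves another bin empty.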

Equal Step

It divides numeric values into groups by the specified step (range). For example, if you specify "10", the groups will be "10-20", "20-30", "30-40" and so on. Each group has an equal data range.

The following options are available.

  • Step: Range for each group.

  • Edge Value Handling: Which end to include in each range. The following options are available. Default is "Include Lower Edge".

    • Include Upper Edge

    • Include Lower Edge

  • Upper Range: You can set the upper value range for creating categories. If you don't specify, the max value will be used.

  • Lower Range: You can set the lower value range for creating categories. If you don't specify, the min value will be used.

  • For Values Outside of Range: If you specify the Upper/Lower Range, you can set how to treat values outside the range. The following options are available.

    • Create Groups: Create extra groups for the values outside the range.

    • Ignore
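
As an illustration of Equal Step, the sketch below maps a value to a label such as "10-20". It assumes that group boundaries are multiples of the step, which matches the "10-20", "20-30" example above but is not confirmed by this page. The name `equal_step_label` is hypothetical, and the Upper/Lower Range options are not modeled.

```python
import math

def equal_step_label(value, step, include="lower"):
    """Return the label of the step-sized group that contains the value.

    include="lower": groups are [start, start + step).
    include="upper": groups are (start, start + step].
    """
    if include == "lower":
        start = math.floor(value / step) * step
    else:
        start = (math.ceil(value / step) - 1) * step
    return f"{start}-{start + step}"

print(equal_step_label(15, 10))           # inside a group
print(equal_step_label(20, 10, "lower"))  # boundary value goes to the upper group
print(equal_step_label(20, 10, "upper"))  # boundary value goes to the lower group
```

Only boundary values such as 20 are affected by the Edge Value Handling setting; every other value falls into the same group either way.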

Manual

It divides numeric values into groups at the Cutting Point values that you specify.

The following options are available.

  • Cutting Points: Boundary values of categories separated by commas. For example, if you enter "10, 20", it will create categories by splitting data at 10 and 20.

  • Edge Value Handling: Which end to include in each range. The following options are available. Default is "Include Lower Edge".

    • Include Upper Edge

    • Include Lower Edge

  • Label Text: Names for categories separated by commas.

  • Include Outside of the Range: If you check this, it will include the values outside of the Cutting Points. For example, if you enter "10, 20" at the Cutting Points, it will create 3 categories "-Inf - 10", "10 - 20" and "20 - Inf". If you uncheck this, it will ignore the values outside the Cutting Points. In this example, it will create 1 category "10 - 20".
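
The effect of "Include Outside of the Range" can be sketched as follows. This is an illustrative Python example under the assumption of lower-edge-inclusive ranges, not Exploratory's own code; `manual_category` and its parameters are hypothetical names.

```python
from bisect import bisect_right

def _fmt(edge):
    """Format an edge the way the labels above are written (-Inf / Inf)."""
    if edge == float("inf"):
        return "Inf"
    if edge == float("-inf"):
        return "-Inf"
    return str(edge)

def manual_category(value, cutting_points, include_outside=True):
    """Return the category label for a value, or None if it is ignored."""
    edges = sorted(cutting_points)
    if include_outside:
        # Add open-ended groups below the first and above the last cutting point.
        edges = [float("-inf")] + edges + [float("inf")]
    idx = bisect_right(edges, value) - 1  # ranges include their lower edge
    if idx < 0 or idx >= len(edges) - 1:
        return None  # outside the cutting points, so the value is ignored
    return f"{_fmt(edges[idx])} - {_fmt(edges[idx + 1])}"

for v in (5, 15, 25):
    print(v, manual_category(v, [10, 20]), manual_category(v, [10, 20], False))
```

With the cutting points "10, 20", the checked option yields three possible labels, while the unchecked option keeps only "10 - 20" and drops everything else.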

Outliers

It divides numeric values into groups by the outlier detection rules.

The following options are available.

  • Outlier Type: Type of outlier detection. The following outlier types are available.

    • IQR

    • Percentile

    • Standard Deviation

  • Threshold: You can specify the threshold values. Available for Percentile and Standard Deviation outlier types.

  • Label Text: Names for categories separated by commas.

  • Target Group: Target data group to create categories.

    • All: Create categories against the whole data set. If you use Repeat By, all charts will have the same data range.

    • Repeat By: Create categories for each chart if you use Repeat By. Each chart will have a different data range.
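
For the IQR outlier type, a common convention is to flag values more than 1.5 × IQR below the first quartile or above the third quartile. The sketch below assumes that convention; this page does not state which multiplier or quartile method Exploratory uses, and the labels "Lower", "Normal" and "Upper" as well as the name `iqr_categories` are hypothetical.

```python
import statistics

def iqr_categories(values, k=1.5):
    """Label each value as "Lower", "Normal" or "Upper" using IQR fences."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence = q1 - k * iqr    # values below this are lower outliers
    high_fence = q3 + k * iqr   # values above this are upper outliers
    labels = []
    for v in values:
        if v < low_fence:
            labels.append("Lower")
        elif v > high_fence:
            labels.append("Upper")
        else:
            labels.append("Normal")
    return labels

data = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(list(zip(data, iqr_categories(data))))
```

Here only the value 100 lies beyond the upper fence, so it forms its own category while the remaining values are grouped together.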

Logical Condition

You can define a logical condition to divide the data into 2 groups (TRUE and FALSE). Currently, this option is supported only at Color.

The following operators are available for each data type.

Numeric

  • Equal To

  • Not Equal To

  • Is In (Multiple Values)

  • Is Not In (Multiple Values)

  • Less Than

  • Less Than or Equal To

  • Greater Than

  • Greater Than or Equal To

  • Between

  • Not Between

  • Not NA

  • Is NA

  • Not Outliers

  • Is Outliers

Categorical (character, factor)

  • Equal To

  • Not Equal To

  • Is In (Multiple Values)

  • Is Not In (Multiple Values)

  • Starts With

  • Not Start With

  • Ends With

  • Not End With

  • Contains

  • Not Contain

  • Is NA

  • Not NA

  • Keep Only Empty

  • Remove Empty

  • Keep Only Stopword

  • Remove Stopword

  • Keep Only Alphabet

  • Remove Alphabet

Date/POSIXct

  • Relative Dates

  • Equal To

  • Not Equal To

  • Is In (Multiple Values)

  • Is Not In (Multiple Values)

  • Earlier Than

  • Earlier Than or Equal To

  • Later Than

  • Later Than or Equal To

  • Between

  • Not Between

  • Is NA

  • Not NA

Logical

  • Is TRUE

  • Is FALSE

  • Is NA

  • Not NA

None

Do nothing.

Last updated 2 years ago

You can refer to "Exploratory v5.3 Released!" for an overview of the Category feature.

See "Color" for how to create categories at Color.