Survival Curve

Calculates a survival curve (Kaplan-Meier estimator) from start time, end time, and event status.

Input Data

Input data should be survival data. Each row should represent one observation (e.g. one user of a subscription service). It should have the following columns.

(Note that this analytics calculates survival time from Start Time and End Time, rather than taking numeric survival time as input.)

  • Start Time - A Date or POSIXct column with the beginning of the observation of the subject.

  • End Time - A Date or POSIXct column with the end of the observation of the subject.

  • Event Status - A boolean or binary numeric (1 or 0) column indicating whether the event of interest (death) happened. When this column is true or 1, the event of interest happened to the subject at the End Time. When it is false or 0, we know that the event had not happened to the subject up to the End Time, but we don't know what happened or will happen to the subject after that point (the observation is censored).

  • Other columns to group observations - Optionally, if there are other columns that group subjects, they can be used to draw separate survival curves for each group, for comparison between groups.
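
As an illustration of the input described above, the following is a minimal Python sketch of how a survival time can be derived from a start date and an end date. The sample rows and the names `rows` and `survival_days` are hypothetical and not part of Exploratory; the sketch assumes days as the time unit.

```python
from datetime import date

# Hypothetical observations: one row per subscriber.
# event=True means the subscriber cancelled on the end date.
# event=False means the subscriber was still active on the end date (censored).
rows = [
    {"start": date(2023, 1, 1), "end": date(2023, 1, 31), "event": True},
    {"start": date(2023, 1, 10), "end": date(2023, 3, 11), "event": False},
    {"start": date(2023, 2, 1), "end": date(2023, 2, 15), "event": True},
]


def survival_days(row):
    """Survival time in days, derived from the start and end dates."""
    return (row["end"] - row["start"]).days


# Pair each survival time with its event status.
durations = [(survival_days(r), r["event"]) for r in rows]
print(durations)
```

Each pair of survival time and event status is what a Kaplan-Meier estimator consumes.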

Analytics Properties

  • Fill NA in End Time - How to fill NA in the End Time column. One of the following.

    • Use Latest Date/Time - Use the latest date/time that appears in the data.

    • Use Today - Use today's date to fill NA.

    • Enter Manually - Use a manually entered value.

  • Date to Fill NA in End Time - The date used to fill NA in the End Time column. This value takes effect when "Enter Manually" is selected for "Fill NA in End Time".
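
The three fill options above can be sketched in Python as follows. This is only an illustration of the behavior described here, not Exploratory's implementation; the function name `fill_end_time` and the option names "latest", "today", and "manual" are hypothetical.

```python
from datetime import date


def fill_end_time(end_times, method="latest", manual_date=None, today=None):
    """Fill missing (None) end times.

    method is "latest" (latest date in the data), "today", or "manual".
    These option names are illustrative, not Exploratory's internal values.
    """
    if method == "latest":
        fill = max(d for d in end_times if d is not None)
    elif method == "today":
        fill = today or date.today()
    elif method == "manual":
        if manual_date is None:
            raise ValueError("manual_date is required when method='manual'")
        fill = manual_date
    else:
        raise ValueError(f"unknown method: {method}")
    return [fill if d is None else d for d in end_times]


ends = [date(2023, 3, 1), None, date(2023, 5, 20)]
print(fill_end_time(ends, "latest"))
print(fill_end_time(ends, "manual", manual_date=date(2023, 6, 30)))
```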

How to Use This Feature

  1. Click the Analytics View tab.

  2. If necessary, click the "+" button to the left of the existing Analytics tabs to create a new Analytics.

  3. Select "Survival Analysis" for Analytics Type.

  4. Select the Start Time column with the "Start Time" column selector.

  5. Select the End Time column with the "End Time" column selector.

  6. Select the unit of time (Day, Week...) with the "Period By" dropdown. With the default "Auto", the unit is chosen based on the mean of the observations with an event.

  7. Select the Event Status column with the "Event Status" column selector.

  8. (Optional) Select a column to group subjects with the "Color By" column selector. For each group, a survival curve with a different color will be drawn in the same chart.

  9. (Optional) Select a column to group subjects with the "Repeat By" column selector. For each group, a separate small chart will be displayed.

  10. Click the Run button to run the analytics.

  11. Select a view type (explained below) by clicking its view type link to see each type of generated visualization.

"Summary" View

"Summary" View shows up only when the Color By column is set. It shows the result of the log-rank test, which tests whether the difference in survival between the subjects that belong to different Color By groups is statistically significant.

  • Chi-Square - The test statistic of the log-rank test.

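  • P Value (Log-Rank Test) - The p-value of the log-rank test. A small value (e.g. below 0.05) suggests that the difference in survival between the groups is statistically significant.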

  • Degree of Freedom - The degree of freedom of the chi-square distribution used for the log-rank test.

  • Number of Rows - Number of observations.

  • Number of Events - Number of events (death) in the entire observation data.
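
To make the Chi-Square and P Value entries concrete, here is a minimal pure-Python sketch of a log-rank test for the two-group case, which has 1 degree of freedom. It is an illustration under that assumption, not the code Exploratory runs; the function name `logrank_two_groups` and the sample data are hypothetical.

```python
import math


def logrank_two_groups(group_a, group_b):
    """Log-rank test for two groups.

    Each group is a list of (time, event) pairs, where event is True when the
    event of interest happened at `time` and False when the subject was censored.
    Returns (chi_square, p_value) with 1 degree of freedom.
    """
    pooled = [(t, e, 0) for t, e in group_a] + [(t, e, 1) for t, e in group_b]
    event_times = sorted({t for t, e, _ in pooled if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t, per group.
        n_a = sum(1 for ti, _, g in pooled if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in pooled if ti >= t and g == 1)
        n = n_a + n_b
        # Events occurring exactly at time t.
        d_a = sum(1 for ti, e, g in pooled if ti == t and e and g == 0)
        d = sum(1 for ti, e, _ in pooled if ti == t and e)

        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))

    chi_square = (observed_a - expected_a) ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical sample data for two Color By groups.
group_a = [(5, True), (8, True), (12, False), (20, True)]
group_b = [(2, True), (3, True), (4, True), (9, False)]
chi_square, p_value = logrank_two_groups(group_a, group_b)
print(chi_square, p_value)
```

With more than two groups the statistic is compared against a chi-square distribution with (number of groups - 1) degrees of freedom, which this sketch does not cover.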

"Survival Curve" View

"Survival Curve" View displays how the subjects survive (without experiencing the event/death) over time as a line chart. This example is with Color By.

This example is with Repeat By in addition to Color By.

Checking the "With Confidence Range" checkbox displays the range of the 95% confidence interval.

"Survival Table" View

"Survival Table" View displays how the subjects survive over time as a color-coded table.

"Data" View

"Data" View displays raw data for the survival curve.

Output Data

The following is the list of columns in the raw survival curve data displayed in the "Data" View.

  • Group Columns - If Color By or Repeat By is specified, those columns appear in the output data frame. Survival curve data rows for each group come with the corresponding group column values.

  • time - Survival time. This column should be used as the X-axis when drawing the survival curve in a line chart.

  • n_risk - The number of subjects that were still surviving (at risk) at the time.

  • n_event - Out of n_risk, the number of subjects to whom the event occurred at the time.

  • n_censor - Out of n_risk, the number of subjects we lost track of (censored) at the time.

  • estimate - Estimated survival probability at the time. This column should be used as the Y-axis when drawing the survival curve in a line chart.

  • std_error - Standard error of the estimated survival probability value.

  • conf_high - Upper bound of the confidence interval for the estimated survival probability value.

  • conf_low - Lower bound of the confidence interval for the estimated survival probability value.
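
The columns above can be reproduced with a short Kaplan-Meier sketch in Python. This is a simplified illustration, not Exploratory's implementation: the function name `kaplan_meier` is hypothetical, std_error is computed with Greenwood's formula, and the interval is a plain normal approximation, so the values may differ from what the "Data" View reports.

```python
import math


def kaplan_meier(observations, z=1.96):
    """Kaplan-Meier table from (time, event) pairs.

    Returns one dict per distinct time, using the column names of the
    "Data" View. std_error uses Greenwood's formula and the interval is a
    normal approximation clipped to [0, 1]; both are assumptions of this sketch.
    """
    rows = []
    survival = 1.0
    greenwood_sum = 0.0
    n_risk = len(observations)
    for t in sorted({time for time, _ in observations}):
        n_event = sum(1 for time, e in observations if time == t and e)
        n_censor = sum(1 for time, e in observations if time == t and not e)
        # Multiply by the fraction of at-risk subjects that survive time t.
        survival *= 1 - n_event / n_risk
        if n_risk > n_event:
            greenwood_sum += n_event / (n_risk * (n_risk - n_event))
        std_error = survival * math.sqrt(greenwood_sum)
        rows.append({
            "time": t, "n_risk": n_risk, "n_event": n_event, "n_censor": n_censor,
            "estimate": survival, "std_error": std_error,
            "conf_high": min(1.0, survival + z * std_error),
            "conf_low": max(0.0, survival - z * std_error),
        })
        # Subjects with an event or censoring at t leave the risk set.
        n_risk -= n_event + n_censor
    return rows


# Hypothetical sample: survival times with event status.
table = kaplan_meier([(5, True), (8, True), (12, False), (20, True)])
for row in table:
    print(row)
```

Plotting `estimate` against `time` as a step line gives the survival curve, and `conf_low` and `conf_high` give the band shown by "With Confidence Range".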

R Package

The Survival Analysis uses the survival R Package under the hood.

Exploratory R Package

For details about survival usage in Exploratory R Package, please refer to the github repository.

Last updated 3 years ago