# Cosine Similarity

## Introduction

Calculates the cosine similarity for each pair of categories. This is often used as a measure of similarity between documents.
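For reference, the standard definition of cosine similarity between two vectors is given below. The symbols $A$ and $B$ are notation introduced here for illustration: each is the vector of measure values for one category, and $i$ runs over the dimensions.

$$
\text{similarity}(A, B) = \frac{A \cdot B}{\|A\| \, \|B\|} = \frac{\sum_i A_i B_i}{\sqrt{\sum_i A_i^2} \, \sqrt{\sum_i B_i^2}}
$$

A value of 1 means the two vectors point in the same direction, and a value of 0 means they are orthogonal.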

## How to Access?

You can access it from the 'Add' (Plus) button.

![](/files/-M4oNDaJ-SzpC8OiNdjS)

## How to Use?

### Calculate Distances Among Categories

![](/files/-M4oNDaL7CxPGB0mkaUc)

#### Column Selection

The Category, Dimension, and Measure columns are selected as follows.

![](/files/-M4oNDaNPjridRfO5eqE)

The Category column holds the categories to compare. Each category is described by its Measure values across the Dimension values.

![](/files/-M4oNDaSq8iW09ffLg4h)

In this case, similarities between airline carriers are calculated based on flight counts. Each carrier is represented as a vector of its flight counts for each day of the week, and the cosine similarity is calculated between each pair of these vectors.
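The calculation described above can be sketched in plain Python. This is a minimal illustration, not the tool's implementation: the carrier codes, the flight counts, and the `cosine_similarity` helper are all made up for the example.

```python
import math

# Hypothetical flight counts per carrier, one entry per weekday (Mon..Sun).
flight_counts = {
    "AA": [120, 110, 115, 118, 130, 90, 95],
    "DL": [240, 220, 230, 236, 260, 180, 190],  # exactly twice AA's counts
    "WN": [60, 60, 60, 60, 60, 150, 150],       # weekend-heavy pattern
}

def cosine_similarity(u, v):
    """Return the cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

sim_aa_dl = cosine_similarity(flight_counts["AA"], flight_counts["DL"])
sim_aa_wn = cosine_similarity(flight_counts["AA"], flight_counts["WN"])
```

Because cosine similarity depends only on the direction of the vectors, "AA" and "DL" come out as fully similar even though "DL" has twice as many flights. "WN" has a different weekly pattern, so its similarity to "AA" is lower.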

If the same combination of category and dimension appears in more than one row, the measure values are aggregated with the function selected in "Aggregate with".
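As a rough sketch of what this aggregation step means, the snippet below collapses duplicate rows before any similarity is computed. The rows and the `aggregate` helper are hypothetical, and the use of `sum` as the default is an assumption for illustration only; the actual function is whichever one is selected in "Aggregate with".

```python
from collections import defaultdict

# Hypothetical long-format rows: (category, dimension, measure).
rows = [
    ("AA", "Mon", 70),
    ("AA", "Mon", 50),   # duplicate (AA, Mon) combination
    ("AA", "Tue", 110),
    ("DL", "Mon", 240),
    ("DL", "Tue", 220),
]

def aggregate(rows, agg=sum):
    """Collapse duplicate (category, dimension) rows with the given function."""
    grouped = defaultdict(list)
    for category, dimension, value in rows:
        grouped[(category, dimension)].append(value)
    return {key: agg(values) for key, values in grouped.items()}

# Assumed aggregation: sum. Pass agg=max, agg=min, etc. for other choices.
aggregated = aggregate(rows)
```

After this step each category has exactly one value per dimension, which is what the vector representation requires.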

### Parameters

* Keep Only Unique Pairs (Optional) - The default is FALSE. Whether each pair should appear only once in the output. If this is TRUE, a pair appears only once. If it's FALSE, a pair appears twice, once in each order. Keep this FALSE if you want to filter the pairs by name, because every category then appears in both positions.
* Keep Diagonal Pairs (Optional) - The default is FALSE. Whether the output should include the similarity of each category with itself.
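The effect of these two parameters on the set of output pairs can be sketched as follows. The `list_pairs` function and its argument names are hypothetical stand-ins for the two options, and the ordering of the pairs is an assumption; only the described behavior (swapped duplicates and self-pairs) is taken from the text above.

```python
from itertools import product

def list_pairs(categories, unique_pairs=False, keep_diagonal=False):
    """List the category pairs that would appear in the output."""
    pairs = []
    for i, j in product(range(len(categories)), repeat=2):
        if i == j and not keep_diagonal:
            continue  # drop the pair of a category with itself
        if unique_pairs and i > j:
            continue  # drop the swapped copy of an already listed pair
        pairs.append((categories[i], categories[j]))
    return pairs

carriers = ["AA", "DL", "WN"]
default_pairs = list_pairs(carriers)                      # both orders, no self-pairs
unique_only = list_pairs(carriers, unique_pairs=True)     # one order only
with_diagonal = list_pairs(carriers, keep_diagonal=True)  # adds (AA, AA), etc.
```

With the defaults, both `("AA", "DL")` and `("DL", "AA")` are present, so filtering on the first name alone finds every pair involving that category.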

