Pandas.cut() method in Python

Last Updated : 07 Apr, 2025

The cut() function in Pandas is used to divide or group numerical data into different categories (called bins). This is helpful when we have a list of numbers and want to separate them into meaningful groups.

Sometimes, instead of working with exact numbers, we want to group them into ranges. For example, suppose we have data on students' marks. Instead of listing every score, we might want to categorize the marks as "Low", "Average", or "High":

Python
import pandas as pd

d = {'Student': ['Aryan', 'Prajjwal', 'Vishakshi', 'Brijkant', 'Kareena'],
     'Marks': [77, 72, 19, 68, 45]}
df = pd.DataFrame(d)

# Bin edges give the intervals [0, 50], (50, 75], (75, 100]
bins = [0, 50, 75, 100]
lab = ['Low', 'Average', 'High']

# Use cut() to categorize the marks; include_lowest=True lets a mark of 0 fall in 'Low'
df['Category'] = pd.cut(df['Marks'], bins=bins, labels=lab, include_lowest=True)

print(df)

Output
     Student  Marks Category
0      Aryan     77     High
1   Prajjwal     72  Average
2  Vishakshi     19      Low
3   Brijkant     68  Average
4    Kareena     45      Low

This process is called binning, and it helps in data analysis by making large sets of numbers easier to understand and compare.

Syntax

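pd.cut(x, bins, right=True, labels=None, retbins=False, precision=3, include_lowest=False, duplicates="raise", ordered=True)

Parameters:

  • x: The input array to be binned. It must be one-dimensional.
  • bins: Either a sequence of bin edges, or an integer giving the number of equal-width bins to create over the range of x.
  • right (default: True): If True, each bin includes its right edge and excludes its left edge, for example (0, 50]. If False, bins are closed on the left instead, for example [0, 50).
  • labels: Labels for the resulting bins; there must be one label per bin. If False, only the integer position of each bin is returned.
  • retbins (default: False): If True, the bin edges are returned along with the binned values.
  • precision (default: 3): The number of decimal places used when storing and displaying the bin labels.
  • include_lowest (default: False): If True, the first bin also includes its left edge.
  • duplicates (default: "raise"): What to do if the bin edges are not unique. "raise" raises a ValueError, while "drop" removes the duplicate edges.
  • ordered (default: True): Whether the resulting categories are treated as ordered. Available in pandas 1.1 and later.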
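To make the effect of right and labels more concrete, here is a minimal sketch. The marks values and the variable names are made up for illustration:

```python
import pandas as pd

marks = pd.Series([50, 75, 100])
edges = [0, 50, 75, 100]

# right=True (default): intervals are (0, 50], (50, 75], (75, 100]
right_closed = pd.cut(marks, bins=edges)

# right=False: intervals are [0, 50), [50, 75), [75, 100)
left_closed = pd.cut(marks, bins=edges, right=False)

# labels=False: return the integer position of each bin instead of intervals
positions = pd.cut(marks, bins=edges, labels=False)

print(right_closed)
print(left_closed)
print(positions)
```

With right=False, each value that sits on an edge moves up into the next bin, and 100 becomes NaN because no interval contains it.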

Return Type:

1. When applied to a Pandas Series (DataFrame column), it returns a pandas.Series of categorical dtype, holding the bin of each value.

2. When applied to a NumPy array or list, it returns a pandas.Categorical, not a plain numpy.ndarray. If labels=False is passed, cut() returns integer bin positions instead of categories.

3. If retbins=True is used, it returns a tuple:

  • First element: The binned values, in the form described in points 1 and 2.
  • Second element: The array of bin edges.
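The tuple returned with retbins=True can be unpacked directly. The sketch below is only an illustration with made-up values; it passes an integer for bins, which asks cut() to compute equal-width edges itself, so the returned edges are actually informative:

```python
import pandas as pd

values = pd.Series([5, 15, 25, 35])

# bins=2 requests two equal-width bins over the range of the data;
# retbins=True returns the computed edges as the second element.
binned, edges = pd.cut(values, bins=2, retbins=True)

print(binned)
print(edges)       # three edges describe two bins
print(len(edges))
```

The lowest edge comes out slightly below the minimum of the data, because pandas extends the range a little so that the minimum value is included in the first bin.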

Examples of .cut() method:

Example 1: Categorizing Random Numbers into Bins

Let's create 10 random integers from 1 to 99 (the upper bound of np.random.randint is exclusive) and categorize them into 5 bins:

Python
import pandas as pd
import numpy as np

# Creating a DataFrame with random numbers
df = pd.DataFrame({'number': np.random.randint(1, 100, 10)})

# Using cut() to categorize numbers into 5 bins
df['bins'] = pd.cut(df['number'], bins=[1, 20, 40, 60, 80, 100])

print(df)

# Checking unique bins
print(df['bins'].unique())

Output
   number           bins
0       1            NaN
1      83  (80.0, 100.0]
2      33   (20.0, 40.0]
3      11    (1.0, 20.0]
4      32   (20.0, 40.0]
5       6    (1.0, 20.0]
6       9    (1.0, 20.0]
...

Explanation:

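  • The numbers are assigned to the bins (1, 20], (20, 40], and so on.
  • Each bin excludes its left edge, so the value 1 falls outside the first bin (1, 20] and appears as NaN.
  • cut() automatically determines which bin each number belongs to.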
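A value that sits exactly on the lowest edge, as 1 does in the output above, is left unbinned by default. One way to handle this is include_lowest=True. The following is a small sketch with sample numbers chosen for illustration:

```python
import pandas as pd

numbers = pd.Series([1, 11, 33, 83])
edges = [1, 20, 40, 60, 80, 100]

# Default: the first interval is (1, 20], so the value 1 is not binned
open_left = pd.cut(numbers, bins=edges)

# include_lowest=True makes the first interval include its left edge
closed_left = pd.cut(numbers, bins=edges, include_lowest=True)

print(open_left.isna().tolist())    # True only for the value 1
print(closed_left.isna().tolist())  # no missing bins
```

Another option is simply to start the edges below the smallest expected value, for example at 0 instead of 1.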

Example 2: Adding Labels to Bins

We can also assign labels to our bins to make the output more readable:

Python
import pandas as pd
import numpy as np

df = pd.DataFrame({'number': np.random.randint(1, 100, 10)})

# Categorizing numbers with labels
df['bins'] = pd.cut(df['number'], bins=[1, 20, 40, 60, 80, 100],
                    labels=['1 to 20', '21 to 40', '41 to 60', '61 to 80', '81 to 100'])

print(df)

# Checking unique bins
print(df['bins'].unique())

Output
   number       bins
0      55   41 to 60
1       8    1 to 20
2      51   41 to 60
3      26   21 to 40
4       5    1 to 20
5       7    1 to 20
6      48   41 to 60
7      50   41 to 60
8      37  ...

Explanation:

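  • Instead of bin ranges such as (1, 20], we now see labels like '1 to 20', '41 to 60', etc.
  • The labels list needs exactly one entry per bin, that is, one fewer than the number of edges.
  • The label text is purely descriptive: the bin behind '1 to 20' is still (1, 20], so the value 1 itself would not be binned.
  • This improves readability and makes it easier to analyze categorized data.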
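One common way to analyze labelled bins is to count how many values fall into each. The sketch below uses the first eight numbers from the sample output above as fixed input, so the result does not depend on random data; the variable names are illustrative:

```python
import pandas as pd

numbers = pd.Series([55, 8, 51, 26, 5, 7, 48, 50])
labels = ['1 to 20', '21 to 40', '41 to 60', '61 to 80', '81 to 100']

binned = pd.cut(numbers, bins=[1, 20, 40, 60, 80, 100], labels=labels)

# sort=False keeps the bins in their defined order instead of sorting by count
counts = binned.value_counts(sort=False)
print(counts)
```

Bins that received no values still appear in the result with a count of 0, because the categories are defined by the labels and not by the data.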

Example 3: Applying pd.cut() to a NumPy Array

Python
import numpy as np
import pandas as pd

n = np.array([10, 25, 45, 68, 90])

b_res = pd.cut(n, bins=[0, 20, 50, 100])

print(b_res)
print(type(b_res))

Output
[(0, 20], (20, 50], (20, 50], (50, 100], (50, 100]]
Categories (3, interval[int64, right]): [(0, 20] < (20, 50] < (50, 100]]
<class 'pandas.core.arrays.categorical.Categorical'>

The result is a pandas Categorical, not a NumPy array, as the printed type shows.
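If plain NumPy data is needed afterwards, the Categorical shown in the output above exposes it through its attributes. A minimal sketch, reusing the same input array:

```python
import numpy as np
import pandas as pd

n = np.array([10, 25, 45, 68, 90])
b_res = pd.cut(n, bins=[0, 20, 50, 100])

# .codes is a NumPy array with the integer position of each value's bin
print(b_res.codes)

# .categories holds the intervals themselves
print(b_res.categories)
```

Here the codes are 0 for (0, 20], 1 for (20, 50], and 2 for (50, 100].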

