Selecting rows and columns of Pandas DataFrame by row and column position

Use cases for code example

Use case examples – Selecting Pandas DataFrame rows and columns by position

When it comes to preparing data for analytics, we often need to select specific or range of rows and columns from our Pandas DataFrame.

For example, if we have a dataset that captures historical events and we only want to capture certain captured metrics (columns) and only a fixed set of observations (e.g. last x observations, last x rows)

Sample Dataset for this example

You can follow the examples by using our dataset. Our dataset contains historical international football results.

Walkthrough and explanation of code

Walkthrough of code – Selecting rows and columns of Pandas DataFrame by row and column position

We use the following code to access specific range of rows and columns

df.iloc[row number, column number]

or

df.iloc[range of rows, range of columns]
  • row or column numbers start from zero

See our illustration of our dataframe:

1st row is 0 and last row is 4, 1st column is 0 and last columns is 4

Selecting rows and columns of Pandas DataFrame

Accessing specific row and column within the dataframe

df.iloc[1,1]
You access the following shaded cells

Accessing a cell in pandas dataframe

Accessing range of rows and columns

df.iloc[0:3,0:4]

 

Selecting rows and columns of pandas dataframes by position

Select all rows or all columns

Select all columns

df.iloc[0:3,:]

Select all rows

df.iloc[:.0:4]

Lets take a look at a example scenario:

  • You are analysing a dataset that contains Japan’s historical international football matches since 7th May 1917
  • You are interested in a subset of the data:
    • 1st 7 historical matches starting from  7th May 1917
    • the last 7 matches as at 3rd July 2018
  • We will use the position of records in the dataset to filter the subset

Example code

Example Code – Select specific or range of rows and columns from Pandas DataFrame example

  1. Load the full dataset of data containing Japan’s international football match stats
import pandas as pd 
#source files to read historical international games 
df_games_src = pd.read_csv("data//japan_results.csv") 

#view the first 10 match details df_games_src.tail()
df_games_src.head(10)

 

Load Pandas CSV dataset and subset of interest

Filter 1st 7 rows and select all columns in Pandas Dataframe

# get 1st 7 records and all columns 
first7_matches = df_games_src.iloc[0:7,:] 
first7_matches

Filter Pandas Dataframe by 1st 7 records and all columns

Subset 2nd to 7th record in the dataframe

Filter 2nd to 7th rows and select all columns of Pandas DataFrame

# get 2nd to7th records and all columns 
secto7th_matches = df_games_src.iloc[1:7,:]

Filtered 2nd to 7th row in the dataframe

We look at the last 10 records in the dataframe  and identify the interested subset of data

#view the last 10 match details df_games_src.tail()
df_games_src.tail(10)

Last 10 records in dataframe

Filter out last 7 rows and select all columns of DataFrame

#select last 7 records and all columns except last

last7_matches = df_games_src.iloc[-7:,:] 
last7_matches

Filtered last 7 rows and all columns selected in dataframe

Filter out 2nd from last to 7th from last rows and select 1st 9 columns of Pandas DataFrame

In this example we are interested in the 2nd last record to the 7th from last record and only want to select the 1st 9 out of 10 columns in the dataframe. See illustration below for the identified subset.

Subset of interest in the Dataframe

We run the following code with the stated positions and get the output

last7to2ndlast_matches_drop_last_col =  df_games_src.iloc[-7:-1,0:9]
last7to2ndlast_matches_drop_last_col

Filtered 2nd from last to 7th from last rows and selected 1st 9 columns from DataFrame

Was this article helpful?