Last Updated on July 13, 2023 by Mayank Dham
The iloc indexer provides a straightforward way to access specific rows and columns in a pandas DataFrame using integer-based indexing. It retrieves data based on its position within the DataFrame, rather than relying on labels or names. Although it is often written as iloc(), it is an indexer used with square brackets, as in df.iloc[0], not a function called with parentheses.
This article serves as a guide to iloc in Python. It starts with the basics and moves on to slicing, stepping, and other indexing forms, whether you’re looking to extract specific rows, columns, or both. First, let’s discuss what iloc is in Python.
What is iloc in Python?
iloc is an indexer provided by the pandas library, which is widely used for data analysis and manipulation. The name stands for "integer location". It is an attribute of pandas DataFrame and Series objects and is used with square brackets to retrieve data by integer position.
With iloc you access specific rows and columns of a DataFrame by providing their integer positions, counted from 0. The selection depends only on where the data sits within the DataFrame, regardless of the labels or names associated with the rows and columns.
Syntax of iloc() Function in Python
The syntax of the iloc function in Python is as follows:
df.iloc[row_start:row_end, column_start:column_end]
In this syntax, “df” is the DataFrame that we want to select data from. The “row_start” and “row_end” arguments specify the starting and ending positions of the rows that we want to select. The “column_start” and “column_end” arguments specify the starting and ending positions of the columns that we want to select.
Parameters of iloc() Function in Python
iloc takes a row indexer and, optionally, a column indexer inside the square brackets. Each indexer can be a single integer, a list of integers, or a slice. In the slice form shown above, every bound is optional. Here’s an overview of the slice bounds:
- row_start: The integer position of the first row in the selection. If it is omitted, the slice starts at 0, which is the first row of the DataFrame.
- row_end: The integer position at which the row selection stops. If it is omitted, the slice runs through the last row of the DataFrame.
- column_start: The integer position of the first column in the selection. If it is omitted, the slice starts at 0, which is the first column of the DataFrame.
- column_end: The integer position at which the column selection stops. If it is omitted, the slice runs through the last column of the DataFrame.
Note that the row_end and column_end parameters are non-inclusive, meaning that the final row or column specified in the range is not included in the selection.
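The slice bounds described above can be tried out with a short sketch. The sample data is hypothetical and simply mirrors the DataFrame used in the examples of this article:

```python
import pandas as pd

# Hypothetical sample data in the style of this article's examples
df = pd.DataFrame({
    'name': ['John', 'Mary', 'Alex', 'Emma'],
    'age': [28, 35, 42, 25],
    'gender': ['M', 'F', 'M', 'F'],
})

# Rows 1 and 2, columns 0 and 1: the end positions (3 and 2) are excluded
middle = df.iloc[1:3, 0:2]
print(middle)

# Omitted bounds fall back to the start or the end of the axis
first_two_rows = df.iloc[:2]       # same rows as df.iloc[0:2, :]
last_two_columns = df.iloc[:, 1:]  # every row, column 1 through the last

print(first_two_rows.shape)    # (2, 3)
print(last_two_columns.shape)  # (4, 2)
```

Because the end position is excluded, the length of a slice is simply end minus start, which is the same convention Python lists use.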
Return value of iloc() Function in Python
The iloc indexer returns the selected data as a pandas object that can be read and processed like any other Series or DataFrame.
The type of the returned object depends on the kind of indexers used, not on how many rows or columns happen to match. An integer for both the row and the column returns a single scalar value. An integer on one axis with a slice or list on the other returns a pandas Series. Slices or lists on both axes return a pandas DataFrame, even if only one row or column is selected.
Note that the returned object should not be treated as a live view of the original DataFrame. Whether pandas hands back a view or a copy is an implementation detail, and assigning to the result (for example, selected['age'] = [30, 37]) does not reliably change the original and can trigger a SettingWithCopyWarning. With Copy-on-Write (optional in pandas 2.x and the default from pandas 3.0), such an assignment never changes the original. To modify the original DataFrame, assign through iloc on the DataFrame itself.
Here’s an example of using iloc to select data and then modify the original DataFrame by position:
import pandas as pd

# Create a sample DataFrame
data = {'name': ['John', 'Mary', 'Alex', 'Emma'], 'age': [28, 35, 42, 25], 'gender': ['M', 'F', 'M', 'F']}
df = pd.DataFrame(data)

# Use iloc to select the first two rows and all columns
selected = df.iloc[0:2, :]

# Print the selected rows
print(selected)

# Modify the original DataFrame by position: rows 0 and 1, column 1 ('age')
df.iloc[0:2, 1] = [30, 37]

# Print the original DataFrame to show the modification
print(df)
Output:
name age gender
0 John 28 M
1 Mary 35 F
name age gender
0 John 30 M
1 Mary 37 F
2 Alex 42 M
3 Emma 25 F
Explanation – In this example, iloc is used to select the first two rows and all columns of a DataFrame. The selection is then printed, showing the selected data. Next, the age values of the first two rows are changed by assigning through df.iloc on the original DataFrame, as demonstrated by printing the DataFrame again.
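As a rough illustration of how the return type follows from the indexers, the sketch below inspects a few selections. The DataFrame is the same hypothetical sample data, and the variable names are chosen only for this illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['John', 'Mary', 'Alex', 'Emma'],
    'age': [28, 35, 42, 25],
    'gender': ['M', 'F', 'M', 'F'],
})

cell = df.iloc[1, 1]          # integer row and integer column: a scalar
row = df.iloc[1]              # single integer: one row as a Series
column = df.iloc[:, 1]        # all rows, integer column: a Series
one_row_frame = df.iloc[[1]]  # list with one position: still a DataFrame
block = df.iloc[0:2, 0:2]     # slices on both axes: a DataFrame

for label, obj in [('cell', cell), ('row', row), ('column', column),
                   ('one_row_frame', one_row_frame), ('block', block)]:
    print(label, type(obj).__name__)
```

Wrapping a single position in a list, as in df.iloc[[1]], is a common way to keep a DataFrame when downstream code expects one.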
Examples of iloc() Function in Python
The iloc function is a powerful tool for selecting and manipulating data in Pandas DataFrames. Here are some examples of using the iloc function in Python, along with explanations of the code:
Example 1 – Selecting specific rows and columns
To select specific rows and columns from a DataFrame, you can use the iloc function with the row and column positions as arguments. For example:
import pandas as pd

# Create a sample DataFrame
data = {'name': ['John', 'Mary', 'Alex', 'Emma'], 'age': [28, 35, 42, 25], 'gender': ['M', 'F', 'M', 'F']}
df = pd.DataFrame(data)

# Select the second row and the age column
selected = df.iloc[1, 1]
print(selected)
Output:
35
Explanation: In this example, the iloc function is used to select the second row and the age column of a DataFrame. The selected value is then printed, showing the age of the second person in the DataFrame.
Example 2 – Selecting subsets of rows and columns
To select subsets of rows and columns from a DataFrame, you can use the iloc function with range values for the row and column positions. For example:
import pandas as pd

# Create a sample DataFrame
data = {'name': ['John', 'Mary', 'Alex', 'Emma'], 'age': [28, 35, 42, 25], 'gender': ['M', 'F', 'M', 'F']}
df = pd.DataFrame(data)

# Select the first two rows and all columns
selected = df.iloc[0:2, :]
print(selected)
Output:
name age gender
0 John 28 M
1 Mary 35 F
Explanation: In this example, the iloc function is used to select the first two rows and all columns of a DataFrame. The selected data is then printed, showing the first two people in the DataFrame.
Example 3 – Slicing rows and columns
To slice rows and columns from a DataFrame, you can use the iloc function with a step value for the row and column positions. For example:
import pandas as pd

# Create a sample DataFrame
data = {'name': ['John', 'Mary', 'Alex', 'Emma'], 'age': [28, 35, 42, 25], 'gender': ['M', 'F', 'M', 'F']}
df = pd.DataFrame(data)

# Select every other row and column
selected = df.iloc[::2, ::2]
print(selected)
Output:
name gender
0 John M
2 Alex M
Explanation: In this example, the iloc function is used to select every other row and column of a DataFrame. The selected data is then printed, showing the name and gender of the first and third people in the DataFrame.
Features and Capabilities of iloc function in Python
- Integer-based indexing: iloc allows you to locate and extract data based on its position within the DataFrame, using integer indices or ranges.
- Single element or subset extraction: You can use iloc to access individual elements by providing a single integer index or extract subsets of data by specifying multiple indices or ranges.
- Supports slicing: iloc supports slicing operations for both rows and columns, enabling you to extract contiguous sections of data from the DataFrame.
- Handles negative indices: iloc handles negative indices, allowing you to access elements or subsets counted from the end of the DataFrame.
- Flexibility with integer indexing: iloc accepts various indexers, including single integers, lists or arrays of integers, slices, and boolean arrays.
- Can be combined with other DataFrame operations: The result of iloc is an ordinary pandas object, so it can be chained with other operations such as filtering, aggregation, or computation.
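Several of the capabilities listed above can be sketched in a few lines. The data is the hypothetical sample used elsewhere in this article, and the boolean mask is written out by hand purely for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ['John', 'Mary', 'Alex', 'Emma'],
    'age': [28, 35, 42, 25],
    'gender': ['M', 'F', 'M', 'F'],
})

# Negative positions count from the end
last_row = df.iloc[-1]   # the row for Emma, as a Series
last_two = df.iloc[-2:]  # the rows for Alex and Emma

# Lists pick arbitrary positions on each axis
picked = df.iloc[[0, 3], [0, 2]]  # rows 0 and 3, columns 'name' and 'gender'

# A boolean array keeps the positions marked True
mask = np.array([True, False, True, False])
masked = df.iloc[mask]  # rows 0 and 2

# The result is a normal DataFrame, so it chains with other operations
mean_age_first_three = df.iloc[:3]['age'].mean()  # (28 + 35 + 42) / 3 = 35.0

print(last_row['name'], list(last_two['name']))
print(picked)
print(masked)
print(mean_age_first_three)
```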
Conclusion
In this article, we explored iloc in Python and its role in data analysis with the pandas library. We learned that iloc accesses and retrieves data from DataFrames using integer-based indexing. With it, data scientists and analysts can extract specific rows, columns, or subsets of data based on their positions, regardless of the labels or names associated with the data.
Throughout the article, we covered the basic syntax and usage of iloc, including how to access individual elements, select ranges of rows and columns, and slice with a step value. We also looked at what iloc returns and at the main features that make it useful alongside other DataFrame operations.
Summary
- The iloc function is a tool in the Pandas library for selecting and manipulating data in DataFrames and Series.
- It works by selecting rows and columns by their integer positions, rather than by their names.
- The syntax of the iloc function includes arguments for row start, row end, column start, and column end.
- The iloc function can be used to select specific rows and columns or to slice rows and columns from a DataFrame.
FAQs Related to iloc Function
Here are some frequently asked questions about iloc in Python:
Q1: What is the difference between iloc and loc in Pandas?
A: The iloc function is used to select rows and columns by their integer positions, while the loc function is used to select rows and columns by their labels or names.
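The difference is easiest to see on a DataFrame whose index is not the default 0, 1, 2, and so on. The sketch below assumes a hypothetical DataFrame indexed by name:

```python
import pandas as pd

# Hypothetical data with string labels as the row index
df = pd.DataFrame(
    {'age': [28, 35, 42, 25], 'gender': ['M', 'F', 'M', 'F']},
    index=['John', 'Mary', 'Alex', 'Emma'],
)

# The same cell, addressed by position and by label
by_position = df.iloc[1, 0]     # second row, first column
by_label = df.loc['Mary', 'age']
print(by_position, by_label)    # 35 35

# Position slices exclude the end; label slices include it
position_slice = df.iloc[1:3]        # positions 1 and 2: Mary, Alex
label_slice = df.loc['Mary':'Alex']  # labels Mary through Alex, inclusive
print(list(position_slice.index))    # ['Mary', 'Alex']
print(list(label_slice.index))       # ['Mary', 'Alex']
```

Note the different end-point rules: iloc[1:3] stops before position 3, while loc['Mary':'Alex'] includes the 'Alex' label.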
Q2: What are some use cases for the iloc function?
A: iloc is useful whenever the position of the data matters more than its labels. Some use cases include taking the first or last few rows, selecting columns by position when their names are unknown or duplicated, slicing a dataset into row ranges, and picking every n-th row with a step value.
Q3: How can I use the iloc function to select specific rows and columns?
A: To select specific rows and columns using the iloc function, you can specify the row and column positions as arguments. For example, you can use the syntax df.iloc[row_position, column_position] to select a specific cell in a DataFrame.
Q4: Can I use iloc to filter rows based on a condition?
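A: Not directly. iloc selects by position, so it does not evaluate conditions on its own. For filtering, you would typically use Boolean indexing in pandas, which involves creating a Boolean expression to select the rows that meet a certain condition, as in df[df['age'] > 30]. iloc does accept a Boolean array of the right length, but not a Boolean Series, so a mask built from a column has to be converted to an array first.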
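As a minimal sketch of both approaches, the example below filters the hypothetical sample data for people older than 30. The threshold and variable names are assumptions made for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['John', 'Mary', 'Alex', 'Emma'],
    'age': [28, 35, 42, 25],
    'gender': ['M', 'F', 'M', 'F'],
})

# Usual approach: Boolean indexing with a Boolean Series
older = df[df['age'] > 30]

# Positional variant: convert the Boolean Series to a NumPy array first,
# because iloc works with a plain Boolean array rather than a labeled Series
mask = (df['age'] > 30).to_numpy()
older_by_position = df.iloc[mask]

print(older)
print(older_by_position.equals(older))  # True
```

In most code the first form is the more readable choice; the iloc variant mainly helps when a mask is already available as a plain array.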
Q5: Is iloc the only way to select data in Pandas?
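A: No, there are several ways to select and manipulate data in pandas, including loc for label-based selection, at and iat for access to a single value, and plain [] indexing. The older ix indexer was deprecated and then removed in pandas 1.0, so it should not be used in new code. Each method has its own specific use cases and advantages.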
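For a single value, the alternatives can be compared side by side. This sketch uses the hypothetical sample data with its default integer index, so the row label and the row position happen to coincide:

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['John', 'Mary', 'Alex', 'Emma'],
    'age': [28, 35, 42, 25],
})

# Four ways to read the age in the third row
via_iloc = df.iloc[2, 1]    # by position
via_loc = df.loc[2, 'age']  # by label (default index labels are 0 to 3)
via_iat = df.iat[2, 1]      # scalar access by position
via_at = df.at[2, 'age']    # scalar access by label

print(via_iloc, via_loc, via_iat, via_at)  # 42 42 42 42
```

at and iat only return a single value, while loc and iloc also handle slices and lists.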