R Program to Drop Columns from a Dataframe

1. Introduction

Data manipulation is a fundamental step in data analysis. A dataframe often carries redundant or unnecessary columns that are easier to remove before working with the data. In R, columns can be dropped from a dataframe in several ways. This guide focuses on the select function from the dplyr package and also shows a base R alternative that needs no extra packages.

2. Program Overview

1. Create a sample dataframe.

2. Drop columns using negative selection with dplyr's select.

3. Drop a column by name using base R negative indexing.

3. Code Program

# Load necessary library
library(dplyr)

# Create a sample dataframe
df <- data.frame(
  Name = c('John', 'Jane', 'Doe'),
  Age = c(25, 28, 22),
  Gender = c('Male', 'Female', 'Male'),
  Score = c(85, 90, 78)
)

# Display the original dataframe
print("Original Dataframe:")
print(df)

# Drop the 'Gender' and 'Score' columns using negative selection
df1 <- df %>% select(-c(Gender, Score))

# Display the dataframe after dropping columns
print("Dataframe after Dropping 'Gender' and 'Score' Columns:")
print(df1)

# Another method (base R, no dplyr): look up the 'Age' column by name and exclude its position
df2 <- df[, -which(names(df) %in% c("Age"))]

# Display the dataframe after dropping the 'Age' column
print("Dataframe after Dropping 'Age' Column:")
print(df2)

Output:

[1] "Original Dataframe:"
  Name Age Gender Score
1 John  25   Male    85
2 Jane  28 Female    90
3  Doe  22   Male    78

[1] "Dataframe after Dropping 'Gender' and 'Score' Columns:"
  Name Age
1 John  25
2 Jane  28
3  Doe  22

[1] "Dataframe after Dropping 'Age' Column:"
  Name Gender Score
1 John   Male    85
2 Jane Female    90
3  Doe   Male    78
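
Beyond the -c(...) form used in the program, select accepts a few other ways to describe the columns to drop. The sketch below is illustrative rather than exhaustive. It assumes dplyr 1.0.0 or later (needed for the ! form and for any_of), and the column name "Missing" is made up to show how any_of treats names that are absent.

```r
library(dplyr)

# Same sample dataframe as in the program above
df <- data.frame(
  Name = c("John", "Jane", "Doe"),
  Age = c(25, 28, 22),
  Gender = c("Male", "Female", "Male"),
  Score = c(85, 90, 78)
)

# Drop a single column
no_gender <- df %>% select(-Gender)

# `!` negates a selection; here it matches -c(Gender, Score)
name_age <- df %>% select(!c(Gender, Score))

# Drop columns listed in a character vector.
# any_of() silently skips names that do not exist ("Missing" is hypothetical).
to_drop <- c("Score", "Missing")
no_score <- df %>% select(-any_of(to_drop))

# Drop every column whose name starts with "S"
no_s <- df %>% select(-starts_with("S"))

print(names(no_gender))
print(names(name_age))
print(names(no_score))
print(names(no_s))
```

any_of is convenient when the list of columns to drop comes from configuration or user input, since a name that is not in the dataframe does not raise an error.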

4. Step By Step Explanation

- We start by creating a sample dataframe df with four columns: Name, Age, Gender, and Score.

- To drop columns, we use the select function from the dplyr package. Placing a - in front of the column name(s) we wish to exclude tells select to keep all columns except those specified. Wrapping the names in c() lets us drop several columns in one call.

- The second method does not need dplyr. The expression names(df) %in% c("Age") marks the columns whose names match, which converts those matches to column positions, and the leading - tells base R indexing to exclude those positions.
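
One caveat with the which-based form is worth knowing: if none of the names match, which returns an empty vector, and negating an empty vector selects no columns at all instead of leaving the dataframe unchanged. A minimal sketch of a more defensive base R variant follows. It uses logical indexing with ! and adds drop = FALSE so the result stays a dataframe even when a single column remains. The variable names drop_cols, empty, and unchanged, and the column name "Missing", are illustrative and not part of the original program.

```r
# Same sample dataframe as in the program above
df <- data.frame(
  Name = c("John", "Jane", "Doe"),
  Age = c(25, 28, 22),
  Gender = c("Male", "Female", "Male"),
  Score = c(85, 90, 78)
)

drop_cols <- c("Age")

# Logical indexing: keep the columns whose names are NOT in drop_cols.
# drop = FALSE prevents R from simplifying a one-column result to a vector.
df2 <- df[, !(names(df) %in% drop_cols), drop = FALSE]

# Pitfall of the which() form: no match gives integer(0),
# and -integer(0) is still integer(0), so zero columns are selected.
empty <- df[, -which(names(df) %in% c("Missing"))]

# The logical form keeps every column when nothing matches.
unchanged <- df[, !(names(df) %in% c("Missing")), drop = FALSE]

print(names(df2))
print(ncol(empty))
print(ncol(unchanged))
```

For the sample data both forms give the same df2. The difference only shows up when a requested column is absent or when only one column is left.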
