R Program to Drop Columns from a Dataframe

1. Introduction

Data manipulation is a fundamental step in data analysis. At times, a dataframe contains redundant or unnecessary columns that we'd like to remove for clarity. In R, dropping columns from a dataframe can be achieved using a few different techniques. This guide focuses on the select function from the dplyr package and also shows a base R alternative.

2. Program Overview

1. Create a sample dataframe.

2. Drop columns using negative selection with dplyr's select function.

3. Drop a column by name using base R indexing.

3. Code Program

# Load necessary library
library(dplyr)

# Create a sample dataframe
df <- data.frame(
  Name = c('John', 'Jane', 'Doe'),
  Age = c(25, 28, 22),
  Gender = c('Male', 'Female', 'Male'),
  Score = c(85, 90, 78)
)

# Display the original dataframe
print("Original Dataframe:")
print(df)

# Drop the 'Gender' and 'Score' columns using negative selection
df1 <- df %>% select(-c(Gender, Score))

# Display the dataframe after dropping columns
print("Dataframe after Dropping 'Gender' and 'Score' Columns:")
print(df1)

# Another method: drop the 'Age' column by name using base R indexing
df2 <- df[, -which(names(df) %in% c("Age"))]

# Display the dataframe after dropping the 'Age' column
print("Dataframe after Dropping 'Age' Column:")
print(df2)

Output:

[1] "Original Dataframe:"
  Name Age Gender Score
1 John  25   Male    85
2 Jane  28 Female    90
3  Doe  22   Male    78

[1] "Dataframe after Dropping 'Gender' and 'Score' Columns:"
  Name Age
1 John  25
2 Jane  28
3  Doe  22

[1] "Dataframe after Dropping 'Age' Column:"
  Name Gender Score
1 John   Male    85
2 Jane Female    90
3  Doe   Male    78
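
The negative selection used in the program also works with dplyr's selection helpers. The following is a minimal sketch, assuming dplyr 1.0.0 or later is installed; the cols_to_drop vector and the non-existent "Height" column are hypothetical names introduced only for illustration.

```r
library(dplyr)

# Same sample dataframe as in the program above
df <- data.frame(
  Name = c("John", "Jane", "Doe"),
  Age = c(25, 28, 22),
  Gender = c("Male", "Female", "Male"),
  Score = c(85, 90, 78)
)

# Column names held in a character vector; "Height" does not exist in df
cols_to_drop <- c("Gender", "Score", "Height")

# any_of() skips names that are missing instead of raising an error
df_any <- df %>% select(-any_of(cols_to_drop))

# Drop every column whose name starts with "S"
df_prefix <- df %>% select(-starts_with("S"))

print(names(df_any))
print(names(df_prefix))
```

Wrapping the names in any_of is handy when the list of columns to drop comes from outside the script and some of them may not be present in the dataframe.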

4. Step By Step Explanation

- We start by creating a sample dataframe df with four columns: Name, Age, Gender, and Score.

- To drop columns, we use the select function from the dplyr package. Placing a - in front of the column name(s) we wish to exclude, as in select(-c(Gender, Score)), tells R to keep all columns except those specified.

- Alternatively, to drop columns without the dplyr package, you can use base R's negative indexing: names(df) %in% c("Age") marks the matching columns, which converts those matches to positions, and the leading minus sign excludes those positions. This form assumes at least one name matches; if none does, which returns an empty vector and the result has no columns.
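
The base R step above can also be written with a logical index, which behaves more predictably when no column name matches. The following is a sketch under that assumption rather than the program's own method; drop_cols and the non-existent "Height" column are hypothetical names used for illustration.

```r
# Same sample dataframe as in the program above
df <- data.frame(
  Name = c("John", "Jane", "Doe"),
  Age = c(25, 28, 22),
  Gender = c("Male", "Female", "Male"),
  Score = c(85, 90, 78)
)

# Logical index: keep the columns whose names are NOT in the drop list.
# drop = FALSE keeps the result a dataframe even if only one column remains.
drop_cols <- c("Age")
df_logical <- df[, !(names(df) %in% drop_cols), drop = FALSE]

# When no name matches, the logical index keeps every column
df_none <- df[, !(names(df) %in% c("Height")), drop = FALSE]

# The which() form selects zero columns in the same situation
df_empty <- df[, -which(names(df) %in% c("Height")), drop = FALSE]

print(names(df_logical))
print(ncol(df_none))
print(ncol(df_empty))
```

The logical form costs nothing extra to write and avoids silently returning an empty dataframe when a column name is misspelled or absent.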
