![]() |
VOOZH | about |
When importing data from Excel files in R, you often encounter situations where certain rows are irrelevant to your analysis and should be skipped. This could be due to header rows, metadata, or other non-data rows. R provides several methods to handle this using packages like readxl and openxlsx. This article covers how to skip rows while reading Excel files using readxl and openxlsx in R Programming Language.
readxlThereadxl package is a popular choice for reading Excel files into R. It supports both .xls and .xlsx file formats and provides a straightforward interface for reading Excel data.
First, you need to install and load the readxl package if you haven't already:
# Install the package if you haven't already
install.packages("readxl")
# Load the package
library(readxl)
To skip rows while reading an Excel file using readxl, you use the skip argument in the read_excel function. This argument specifies the number of rows to skip before reading the data. You can also specify a particular sheet to read from, especially if the relevant data is on a different sheet:
Here we are taking an simple dataset to perform these examples.
Dataset Link: Order Dataset
Output:
# A tibble: 6 × 12
order_no order_date buyer ship_city ship_state sku description quantity item_total
<chr><chr><chr><chr><chr><chr><chr><chr><chr>
1 405-976… Sun, 18 J… Mr. CHANDIGA… CHANDIGARH SKU:… 100% Leath… 1 ₹449.00
2 404-396… Tue, 19 O… Minam PASIGHAT, ARUNACHAL… SKU:… Women's Se… 1 ₹449.00
3 171-810… Sun, 28 N… yati… PASIGHAT, ARUNACHAL… SKU:… Women's Se… 1 ₹449.00
4 405-317… Wed, 28 J… aciya DEVARAKO… TELANGANA SKU:… Pure 100% … 1 NA
5 402-891… Tue, 28 S… Susm… MUMBAI, MAHARASHT… SKU:… Pure Leath… 1 ₹1,099.00
6 406-929… Thu, 17 J… Subi… HOWRAH, WEST BENG… SKU:… Women's Tr… 1 ₹200.00
# A tibble: 6 × 12
`404-3964908-7850720` Tue, 19 Oct, 2021, 6:0…¹ Minam `PASIGHAT,` `ARUNACHAL PRADESH`
<chr><chr><chr><chr><chr>
1 171-8103182-4289117 Sun, 28 Nov, 2021, 10:2… yati… PASIGHAT, ARUNACHAL PRADESH
2 405-3171677-9557154 Wed, 28 Jul, 2021, 4:06… aciya DEVARAKOND… TELANGANA
3 402-8910771-1215552 Tue, 28 Sept, 2021, 2:5… Susm… MUMBAI, MAHARASHTRA
4 406-9292208-6725123 Thu, 17 Jun, 2021, 9:12… Subi… HOWRAH, WEST BENGAL
5 404-5794317-7737924 Thu, 12 Aug, 2021, 8:03… shai… ORAI, UTTAR PRADESH
6 405-8702211-4054722 Wed, 29 Sept, 2021, 2:5… Prat… BAREILLY, UTTAR PRADESH
openxlsxThe openxlsx package is another powerful tool for reading and writing Excel files. It offers additional functionalities and customization options compared to readxl. In openxlsx, skipping rows is done by specifying the startRow argument in the read.xlsx function. This argument defines the row number from which to start reading.
Output:
404-3964908-7850720 Tue,.19.Oct,.2021,.6:05.pm.IST Minam PASIGHAT, ARUNACHAL.PRADESH SKU:.DN-0WDX-VYOT
1 171-8103182-4289117 Sun, 28 Nov, 2021, 10:20 pm IST yatipertin PASIGHAT, ARUNACHAL PRADESH SKU: DN-0WDX-VYOT
2 405-3171677-9557154 Wed, 28 Jul, 2021, 4:06 am IST aciya DEVARAKONDA, TELANGANA SKU: AH-J3AO-R7DN
3 402-8910771-1215552 Tue, 28 Sept, 2021, 2:50 pm IST Susmita MUMBAI, MAHARASHTRA SKU: KL-7WAA-Z82I
4 406-9292208-6725123 Thu, 17 Jun, 2021, 9:12 pm IST Subinita HOWRAH, WEST BENGAL SKU: HH-FOWV-5YWO
5 404-5794317-7737924 Thu, 12 Aug, 2021, 8:03 pm IST shailendra ORAI, UTTAR PRADESH SKU: TQ-OE6K-9DIK
6 405-8702211-4054722 Wed, 29 Sept, 2021, 2:55 pm IST Pratima BAREILLY, UTTAR PRADESH SKU: S1-A92Q-JU3X
404-3964908-7850720 Minam ARUNACHAL.PRADESH
1 171-8103182-4289117 yatipertin ARUNACHAL PRADESH
2 405-3171677-9557154 aciya TELANGANA
3 402-8910771-1215552 Susmita MAHARASHTRA
4 406-9292208-6725123 Subinita WEST BENGAL
5 404-5794317-7737924 shailendra UTTAR PRADESH
6 405-8702211-4054722 Pratima UTTAR PRADESH
Skipping rows while reading Excel files is essential for cleaning up data and focusing on the relevant information. Both readxl and openxlsx packages in R provide functionalities to skip rows efficiently. Use readxl for straightforward tasks and openxlsx for more advanced needs. Understanding these options allows for better control over data import processes and helps streamline your data analysis workflow.