Extracting data in the same cell locations from multiple Excel files into one single Excel file – Excel

by
Ali Hasan
excel llama-cpp-python pandas vba

The Problem:

You have multiple Excel files containing customer information (contact name, company name, phone number, email address) in the same cell locations. Your task is to extract this data from all the individual Excel files and consolidate it into a single Excel file called "Customers" under four headings: Contact Name, Company Name, Phone Number, and Email Address.

The Solutions:

Solution 1: Iterating through Excel files and copying specific data range

This solution involves iterating through multiple Excel files located in a designated folder and extracting a specific range of data from each file into a single “Customers” Excel file.

Here’s a breakdown of the steps involved in the solution:

  1. Initialize variables: Declare variables to represent the source folder, file collection, source file, source workbook, destination worksheet, and destination row.

  2. Disable screen updating and display alerts: This is done to improve performance while the code is running.

  3. Set source folder path: Specify the path to the folder where the Excel files with data are located.

  4. Set destination worksheet: Indicate the name of the worksheet in the "Customers" Excel file where the extracted data will be pasted.

  5. Initialize destination row: This variable will keep track of the row where the data will be placed in the destination worksheet.

  6. Create FileSystemObject: Create an instance of FileSystemObject to access and manipulate files in the specified folder.

  7. Loop through Excel files in the folder: Use a loop to iterate through each file in the source folder.

  8. Check file type: Check if each file is an Excel file by verifying its extension.

  9. Open source workbook: For each Excel file, open the workbook using Workbooks.Open method.

  10. Copy data range: Copy the specific range of data (e.g., "B4:B7") from the source workbook using the Range object.

  11. Paste data to destination worksheet: Paste the copied data into the destination worksheet, starting in column A of the current destination row. Because a range such as "B4:B7" is vertical, it needs to be transposed on paste so that the four values land in one row under the four headings.

  12. Update destination row: Increment the destination row counter to move to the next row for the next set of data.

  13. Close source workbook: Close the source workbook without saving changes.

  14. Clear clipboard: Ensure that the clipboard is cleared to prevent any potential data conflicts.

  15. Display message: Display a message box to indicate that the copying of customer information from files is complete.

  16. Re-enable screen updating and display alerts: Restore the default settings for screen updating and display alerts.

Solution 2: Importing Data from Multiple Files Using Power Query

Power Query, reached through “Get Data” on Excel’s Data tab, offers a low-code/no-code way to import data from multiple Excel files:

  1. Within Excel, navigate to the Data tab, select “Get Data” > “From File” > “From Folder.”
  2. Select the folder containing the Excel files you want to combine.
  3. In the <Folder path> dialog box, verify that the desired files are listed.
  4. Choose “Combine” > “Combine & Load.” This will open the Combine Files dialog box.
  5. In the Combine Files dialog box:
    • Select one of the files as the “sample data” used to create the queries.

Note: You can find more detailed instructions and additional resources by searching online for “import data from a folder with multiple files power query.”

Solution 3: Using openpyxl library

To extract data from the same cell locations in multiple Excel files and consolidate it into a single file, you can use the openpyxl library for Python.

Here’s a breakdown of the steps involved:

  1. Load the Excel files: Iterate through the files in the directory, load each spreadsheet using openpyxl, and select the first sheet.
  2. Extract data: For each file, create a dictionary to store the extracted data. Use specific cell references (e.g., “B4”, “B5”) to retrieve the contact name, company name, phone number, and email address.
  3. Append to records: Add the extracted data from each file to a list of dictionaries (records).
  4. Create new workbook and worksheet: Create a new workbook and an active worksheet (ws) within it.
  5. Write header row: Write the header row by appending the column names from the first record to the worksheet.
  6. Write records: Iterate through the records list and append each record’s values to the worksheet, maintaining the same order as the header.
  7. Save and launch the consolidated file: Save the workbook as “combined.xlsx” and launch it for viewing.
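
The steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the four values sit in B4:B7 of the first sheet of each .xlsx file, and the names CELL_MAP, read_record, and consolidate are hypothetical helpers introduced here for illustration. It takes the header row from the fixed mapping rather than from the first record, so it also works when the folder is empty, and it leaves out the final "launch" step because opening a file is platform-specific.

```python
from pathlib import Path

from openpyxl import Workbook, load_workbook

# Assumed layout: header name -> cell holding that value in every source file.
CELL_MAP = {
    "Contact Name": "B4",
    "Company Name": "B5",
    "Phone Number": "B6",
    "Email Address": "B7",
}


def read_record(path):
    """Read the mapped cells from the first sheet of one workbook."""
    # data_only=True returns cached formula results instead of formula text.
    wb = load_workbook(path, data_only=True)
    ws = wb.worksheets[0]
    record = {header: ws[cell].value for header, cell in CELL_MAP.items()}
    wb.close()
    return record


def consolidate(source_dir, output_path):
    """Collect one row per source file into a new workbook; return the row count."""
    source_dir = Path(source_dir)
    output_path = Path(output_path)

    records = []
    for path in sorted(source_dir.glob("*.xlsx")):
        # Skip Excel lock files (~$name.xlsx) and the output file itself,
        # in case it is saved into the same folder as the sources.
        if path.name.startswith("~$") or path.resolve() == output_path.resolve():
            continue
        records.append(read_record(path))

    out = Workbook()
    ws = out.active
    ws.title = "Customers"
    ws.append(list(CELL_MAP))  # header row
    for record in records:
        # Write values in the same order as the header row.
        ws.append([record[header] for header in CELL_MAP])
    out.save(output_path)
    return len(records)
```

A call such as consolidate("customer_files", "combined.xlsx") would then write one row per source file, where "customer_files" is a placeholder folder name.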

Solution 4: INDIRECT Function

To extract data from multiple Excel files into a single file based on specific cell locations without using Power Query or VBA, you can utilize the INDIRECT function:

  1. Obtain File List: Create a new Excel file to store the extracted data. List each target file’s folder path (without a trailing backslash, since the formula adds one) in column A and its filename in column B.
  2. INDIRECT Formula: In cell C1 of the new file, enter the following formula:
    =INDIRECT("'"&A1&"\["&B1&"]Sheet1'!$B$4")
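  3. Drag Formula: Drag the formula in C1 down so that there is one row per listed file. Each row then returns cell B4 from the corresponding file. To pull the other cells, repeat the formula in adjacent columns with the cell reference changed (e.g., $B$5, $B$6, $B$7).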

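This formula works by joining the path and filename from columns A and B with the sheet name and cell reference (in this example, Sheet1!$B$4) into a text string, which INDIRECT then evaluates as a reference. The result is the value from the specified cell in the target Excel file. Note that INDIRECT can only resolve references to workbooks that are currently open in Excel; for a closed workbook it returns a #REF! error.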
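
To make the string concatenation concrete, here is a small Python sketch of the text that the formula hands to INDIRECT. It only builds the reference string and does not talk to Excel; the function name external_ref and the example path and filename are hypothetical.

```python
def external_ref(folder, filename, sheet="Sheet1", cell="$B$4"):
    """Build an external reference of the form '<folder>\\[<file>]<sheet>'!<cell>."""
    # Mirrors the formula: "'" & A1 & "\[" & B1 & "]Sheet1'!$B$4"
    return f"'{folder}\\[{filename}]{sheet}'!{cell}"


# Hypothetical folder and filename, standing in for the values in A1 and B1.
print(external_ref(r"C:\Data", "customer1.xlsx"))
# prints: 'C:\Data\[customer1.xlsx]Sheet1'!$B$4
```

Changing the cell argument (for example to "$B$5") gives the reference for the next field, which corresponds to repeating the formula in another column.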

Video Explanation:

The following video, titled "Easiest way to COMBINE Multiple Excel Files into ONE (Append ...", provides additional insights and in-depth exploration related to the topics discussed in this post.

What if I have 22 Excel files, each containing multiple sheets with different headers? How do I combine all that data into one sheet?