When analyzing data, we often encounter the need to find information based on the value of another data point. For example, let's say you need to look up the price of a part based on the SKU, or the hometown of an employee based on the employee ID. You may be used to calling this a “VLOOKUP” if using a spreadsheet (such as Microsoft Excel or Google Sheets) or a “JOIN” if you are frequently using databases.
At Gigasheet, we’ve made this way easier for you with our Cross File VLookup function! You don't have to remember any syntax or type in a formula. Simply follow the easy prompts and voila, you will have the data that you are looking for. Plus, it can handle really huge files that are too big for Excel or Sheets and under normal circumstances would require a database.
The best way to illustrate how it works is to walk through an example. Below, we are going to use two files:
You can see both files in the first two rows of our Gigasheet library:
The Sales Leads file contains information on 1.7M leads, including first name, last name, corporation, and location. We think some of these leads probably work for companies that are in the Fortune 500, but a lot of them aren't. This is a common use case for sales or marketing teams, taking a list of leads and seeing how many match your account-based marketing or targeting list. In this case, we'll try to match companies in the Fortune 500.
The Fortune 500 file is a comma delimited file (CSV) that we found online lising the Fortune 500 from 2018. If we click to view the contents, we see that the file contains a bunch of data such as Twitter handles, names of CEOs, sector, industry, headquarter location, etc.
We are going to use Gigasheet to identify which sales leads are in the Fortune 500. For those that match between files using the company name, we are going to bring additional information back into the Sales Lead file.
Begin in the Sales Leads file by clicking on the file name. Once opened, we navigate to the Corp Name column that we are going to use to match against the Fortune 500 file. By clicking on the column heading, we can access a menu that contains Apply Functions and then selecting Cross File VLookup.
The lookup function already selected "Corp Name" since we chose it from the column menu. Let’s match this column with one in the List of Fortune 500 companies. Once that file is selected, Gigasheet will display all of the available columns and we will choose the column to match. Since we want to match the corporate name with the company name, we’ll choose "Company Name".
Once we've selected the columns to match, there will be additional options. We’re going to select “Near Match”, which ignores the punctuation and white space. The matching algorithm will now ignore dots, dashes, commas, or variations in capitalization in the company name, and treat it as an exact match.
Once a match has occurred, we need to indicate what we want returned from the Fortune 500 file and added to the Leads file. We can either choose to bring back a value (e.g. address) or a column that just indicates if a match occurred with the values True/False. In our example, a true would indicate that the company is in the Fortune 500.
We are interested in the sector and the industry for each matching Fortune 500 company. Of course, you could choose any additional columns in the Fortune 500 file. Then we'll hit apply. A new column will be added in the Sales Leads file for each column that we selected.
That's it! New columns have now been added to the Sales Leads file. They will be populated with values from the Fortune 500 file wherever there was a match on the company name. We can see that out of our list of 1.7 million leads, just over 91,000 are in the Fortune 500.
If we apply a grouping, using the group feature, we can see that "Financials" is the most matched sector. We can then apply a second grouping on the industry to see that "Commercial Banks" is the leading industry within financials.
And that is how to do Cross File VLookup in Gigasheet.