Feb 6, 2023

Identify Fortune 500 Email Domains Using Gigasheet

One of our brilliant customers threw a great request to the support@gigasheet.com line the other day:

“Hey Gigasheet! Love your product, and I’m hoping you can help me out. I’ve got a couple million email addresses as contacts in our CRM, and my boss wants to know how many Fortune 500 companies we are talking to. Can you help me answer?”

We do love a challenge! In this blog, we are going to take you through how we came up with this resource, but if you just want the file, you can open it for free in Gigasheet:

Get it here 👉 Fortune 500 Email Domains List: 3,312 domains associated with Fortune 500 email addresses

And in case you haven't heard of us, we are the free big data spreadsheet that lets you open and explore CSV files online when they are too big for Excel or Google Sheets.

Free Email Verification Using Enrichments

Extracting the Email Domain

The customer’s dataset had Name, Email, and SKU purchased. It looked something like this:

(This is not the customer’s real data; the rows shown here are representative examples.)

Email Address List From Salesforce

The first step is easy. How do you strip the domains out of a list of that many email addresses? Well, it just so happens that Gigasheet’s email verification enrichment does this very thing with the click of a button.

Email Enrichments

Select Email and "Email Format Check"

Email Validation

The email domain will appear to the right of the email column, highlighted in blue.

Email Domains of Fortune 500
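If you're curious what this step amounts to outside of Gigasheet, here is a minimal Python sketch of pulling the domain out of an email address. It is only an illustration, not how the Email Format Check enrichment is implemented; the function name `email_domain` and its deliberately loose validity check are our own assumptions.

```python
from typing import Optional


def email_domain(address: str) -> Optional[str]:
    """Return the lowercased domain of an email address, or None if it looks malformed."""
    address = address.strip()
    # Split on the LAST "@" so the domain is whatever follows it.
    local, sep, domain = address.rpartition("@")
    # Loose check: something before "@", a dotted domain after it, no spaces.
    if not sep or not local or "." not in domain or " " in address:
        return None
    return domain.lower()


if __name__ == "__main__":
    for addr in ["Jane.Doe@Example.COM", "not-an-email"]:
        print(addr, "->", email_domain(addr))
```

Lowercasing matters for the matching step later on, since domain names are case-insensitive.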

Gigasheet’s Email Format Check option, as part of the free Enrichments function, produced a list of the email domains from the email addresses in the Email column in the customer’s data. Now that we have a list of domains, how do we identify if they are part of the Fortune 500?

There are a number of places online to get the Fortune 500 list, and those lists often include the main domain associated with each company. Amazon is listed with amazon.com, Disney is paired with disney.com, Ford Motor has ford.com, and so on. But Amazon also owns amazon.ca, Disney owns disneylandparis.com, and Ford owns fordlincoln.com. The problem quickly presents itself: there is no definitive list associating a company with all the domains it owns.

Using GPT-3 to Search for Fortune 500 websites

The OpenAI GPT-3 large language model is, as so many have already said, an amazing achievement of our time. Ask a question, get an answer. So ask what domains a company owns, and you get this:

GPT-3 Email Enrichment

Since GPT-3 has an API, it’s trivial to ask this question of every company in the Fortune 500! After cycling through the 500 companies, we had a list of 25,000 potential domains.
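The loop itself is simple. The sketch below shows one way it could be structured; it is not the script we ran. The prompt wording, the function names, and the `ask_model` callable are all hypothetical: `ask_model` stands in for whatever client call sends a prompt to the model and returns its text reply, so no particular API signature is assumed here.

```python
import re
from typing import Callable, Dict, Iterable, Set

# Matches dotted host names such as "example.com" or "shop.example.co.uk".
DOMAIN_RE = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}\b", re.IGNORECASE
)


def build_prompt(company: str) -> str:
    # Hypothetical wording; the exact prompt is an assumption.
    return f"List the internet domains owned by {company}."


def parse_domains(reply: str) -> Set[str]:
    """Pull every domain-looking token out of a free-text reply."""
    found = set()
    for match in DOMAIN_RE.finditer(reply):
        domain = match.group(0).lower()
        if domain.startswith("www."):
            domain = domain[4:]  # treat www.example.com and example.com as one
        found.add(domain)
    return found


def collect_candidates(
    companies: Iterable[str], ask_model: Callable[[str], str]
) -> Dict[str, Set[str]]:
    """Ask the model about each company and collect candidate domains."""
    return {c: parse_domains(ask_model(build_prompt(c))) for c in companies}
```

Passing the model call in as a function also makes it easy to try the parsing on canned replies before spending any API credits.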

How Good is GPT-3? Mind-Blowing, and NOT Good Enough

In the few weeks since release, many have pointed out that AI language models do a better job of telling the questioner what they want to hear than of giving an accurate answer. The model knows that Delta Airlines owns delta.com and deltaairlines.com, and so assumes that Frontier Airlines owns frontier.com and frontierairlines.com. Or it knows that a company of a certain size owns its name across many country-code top-level domains, and assigns those same ccTLDs (.ca, .co.in, .co.uk, etc.) to companies of similar size. Essentially, it’s making a very educated guess.

These guesses can be wrong! Since we knew we were dealing with domains that had sent an email, we knew we could cull the list down to just those with an active mail server. Each domain’s DNS records show whether it has a mail exchange (MX) record, so it was simple enough to use the dns.resolver module from the dnspython Python library to check whether the domains suggested by GPT-3 had a mail server set up. The MX response held the answer.

MX record checking Fortune 500
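For readers who want to try the MX check themselves, here is a minimal sketch of what such a filter might look like. It assumes dnspython 2.x (`pip install dnspython`), where `dns.resolver.resolve` performs the query; the function names are illustrative and this is not the exact script we ran. The lookup is passed in as a parameter so the filtering logic can be exercised without touching the network.

```python
from typing import Callable, Iterable, List


def lookup_mx(domain: str) -> List[str]:
    """Return the MX exchange host names for a domain (empty list on any DNS error)."""
    # Imported here so the rest of the module loads even without dnspython installed.
    import dns.exception
    import dns.resolver

    try:
        answer = dns.resolver.resolve(domain, "MX")
    except dns.exception.DNSException:
        # Covers NXDOMAIN, no MX answer, timeouts, and unreachable name servers.
        return []
    return [str(record.exchange) for record in answer]


def has_mail_server(domain: str, lookup: Callable[[str], List[str]] = lookup_mx) -> bool:
    """True if the domain publishes at least one usable MX record."""
    # An exchange of "." is a "null MX" (RFC 7505): the domain accepts no mail.
    return any(host != "." for host in lookup(domain))


def keep_mail_domains(
    domains: Iterable[str], lookup: Callable[[str], List[str]] = lookup_mx
) -> List[str]:
    """Filter a candidate list down to domains with a mail server."""
    return [d for d in domains if has_mail_server(d, lookup)]
```

With tens of thousands of candidates, the lookups are the slow part, so in practice you may want to run them concurrently or set a resolver timeout.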

That culled the list down to about 3,500 domains.

To be perfectly clear: we do not believe this list represents the entire universe of domains owned by Fortune 500 companies. But as a first pass, involving a loop through GPT-3 and a loop through DNS records, we think it’s as good as anything out there. We’re putting the list on GitHub as a resource that we hope others will update and help maintain.

The Final Fortune 500 Email List in Gigasheet

In the meantime, we can take the list back to Gigasheet and use it in a Cross-File VLookup. You can find the list we used here:

Get it here 👉 Fortune 500 Email Domains List

Matching Domains to Build a Fortune 500 Email List

Now, let's use Cross-File Lookup to go through the original customer list and search for each domain in the Fortune 500 Domain List. If there is a match, the lookup returns the name of the Fortune 500 company. Pretty cool!
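Conceptually, the cross-file lookup is a join on the domain column. Here is a small Python sketch of the same idea, assuming the domain list has been loaded into a dictionary that maps each domain to a company name. The function and variable names are hypothetical, and this is an illustration of the concept rather than how Gigasheet implements the lookup.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def match_companies(
    contact_domains: Iterable[str], domain_to_company: Dict[str, str]
) -> List[Tuple[str, Optional[str]]]:
    """Pair each contact domain with its company name, or None when there is no match."""
    # Lowercase before the lookup; the dictionary keys are assumed to be lowercase too.
    return [(d, domain_to_company.get(d.lower())) for d in contact_domains]
```

Unmatched rows keep a `None` in the company column, which is what makes the filtering step afterwards straightforward.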

Comparing an email list against the Fortune 500 Email List

Here’s the data returned:

Resulting customer list with Fortune 500 email addresses identified

From here, it’s straightforward to filter to only the rows with a returned Fortune 500 Company:

Filtering the Fortune 500 Companies Email List

We see 695,539 of our 4.8 million records have email addresses associated with Fortune 500 domains:

Fortune 500 email List achieved by filtering on our added columns

We can do further groupings and analysis as we see fit:

Grouping the Fortune 500 Email List

Expanding a group reveals the individuals at each company.

Viewing email addresses from the Fortune 500 list
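The filter and group steps can also be approximated in a few lines of Python. This sketch assumes each row is an (email, company) pair in which company is `None` when the lookup found no match; the names are hypothetical and the code is only meant to show the shape of the operation.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def group_by_company(
    rows: Iterable[Tuple[str, Optional[str]]]
) -> Dict[str, List[str]]:
    """Drop unmatched rows, then collect the email addresses under each company."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for email, company in rows:
        if company is not None:  # the filter step
            groups[company].append(email)  # the group step
    return dict(groups)
```

Taking `len()` of each list gives the per-company counts, and the lists themselves correspond to what you see when you expand a group.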

We hope this lookup can be of great help to users with similar use cases. Save A Copy of that lookup sheet into your Gigasheet library and use it! And if other users have similar challenges they need help with: you know how to reach us.

And the best part? All of this is completely free! Sign up here.
