One of our brilliant customers threw a great request to the support@gigasheet.com line the other day:
“Hey Gigasheet! Love your product, and I’m hoping you can help me out. I’ve got a couple million email addresses as contacts in our CRM, and my boss wants to know how many Fortune 500 companies we are talking to. Can you help me answer?”
We do love a challenge! In this blog, we are going to take you through how we came up with this resource, but if you just want the file, you can open it for free in Gigasheet:
Get it here 👉 Fortune 500 Email Domains List 3,312 Domains Associated With Fortune 500 Email Addresses
And, in case you haven't heard of us, we are the free big data spreadsheet that lets you open and explore CSV files online when they are too big for Excel or Google Sheets.
The customer’s dataset had Name, Email, and SKU purchased. It looked something like this:
(This is not the customer’s real dataset; the data shown here is representative.)
The first step is easy. How do you extract the domains from a list of that many email addresses? Well, it just so happens Gigasheet’s email verification enrichment does this very thing with the click of a button.
Select Email and "Email Format Check"
The email domain will appear to the right of the email column, highlighted in blue.
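If you are curious what that step amounts to outside of Gigasheet, here is a minimal Python sketch of pulling the domain out of each address. This is not Gigasheet's implementation, just an illustration of the idea; the sample rows, the `email_domain` helper, and the "Email Domain" column name are all made up for the example.

```python
def email_domain(address):
    """Return the lowercased domain of an email address, or None if it looks malformed."""
    # Split on the last "@" so odd local parts do not confuse the split.
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or "." not in domain:
        return None
    return domain.lower()


# Hypothetical rows shaped like the dataset described above (Name, Email, SKU).
rows = [
    {"Name": "Ada Example", "Email": "ada@Amazon.com", "SKU": "A-100"},
    {"Name": "Bo Sample", "Email": "bo@fordlincoln.com", "SKU": "B-200"},
    {"Name": "Cy Broken", "Email": "not-an-email", "SKU": "C-300"},
]

for row in rows:
    # Add the extracted domain as a new column next to the email.
    row["Email Domain"] = email_domain(row["Email"])
```

Rows whose email cannot be split into a local part and a dotted domain simply get `None`, which makes them easy to filter out later.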
Gigasheet’s Email Format Check option, as part of the free Enrichments function, produced a list of the email domains from the email addresses in the Email column in the customer’s data. Now that we have a list of domains, how do we identify if they are part of the Fortune 500?
There are a number of places online to get the Fortune 500 list, and those lists often have the main domain associated with that company. Amazon is listed with amazon.com, Disney is paired with disney.com, Ford Motors has ford.com, and so on. But Amazon owns amazon.ca, and Disney owns disneylandparis.com, and Ford owns fordlincoln.com: the problem quickly presents itself that there’s no good definitive list associating a company to all the domains it owns.
The OpenAI GPT-3 language model is, as so many have already said, an amazing achievement of our time. Ask a question, get an answer. So ask what domains a company owns, and you get this:
Since GPT-3 has an API, it’s trivial to ask this question of every company in the Fortune 500! After cycling through the 500 companies, we had a list of 25,000 potential domains.
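As a rough sketch of that loop, assume you have some function that sends a prompt to the model and returns its text reply. We deliberately leave that function as a placeholder (`ask_model`) rather than show a specific API call, and the prompt wording and the domain-matching pattern below are our assumptions for illustration, not the exact ones used to build the list.

```python
import re

# Loose pattern for things that look like domain names inside free text.
DOMAIN_RE = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}\b", re.IGNORECASE
)


def collect_candidate_domains(companies, ask_model):
    """Ask the model about each company and collect the domains found in its reply.

    ask_model is a hypothetical callable: prompt string in, reply text out.
    """
    candidates = {}
    for company in companies:
        prompt = f"What internet domains does {company} own?"  # assumed wording
        reply = ask_model(prompt)
        found = {match.lower() for match in DOMAIN_RE.findall(reply)}
        candidates[company] = sorted(found)
    return candidates


def fake_model(prompt):
    """Stand-in for a real model call so the sketch runs offline."""
    return "It owns example.com, Example.org and example.co.uk."


result = collect_candidate_domains(["Example Corp"], fake_model)
```

Everything that comes back is only a candidate: the model's reply is free text, and nothing at this stage checks that a domain really exists or belongs to the company.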
In the few weeks since its release, many have pointed out that AI language models are better at telling the questioner what they want to hear than at giving an accurate answer. The model knows that Delta Airlines owns delta.com and deltaairlines.com, and so assumes that Frontier Airlines owns frontier.com and frontierairlines.com. Or it infers that a company of a certain size owns its name across many country-code top-level domains, and assigns those same ccTLDs (.ca, .co.in, .co.uk, etc.) to companies of similar size. Essentially, it’s making a very educated guess.
These guesses can be wrong! Since we knew we were dealing with domains that had sent an email, we knew we could cull the list down to just those with an active mail server. Each domain’s DNS information shows whether it has a mail exchange (MX) record. It was simple enough to use the dns.resolver module from the dnspython library for Python to check whether the domains suggested by GPT-3 had a mail server set up. The MX response held the answer.
That culled the list down to about 3,500 known domains.
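The MX check described above can be sketched roughly as follows. It assumes dnspython 2.x (where the call is `dns.resolver.resolve`; older releases named it `query`), and the helper names `lookup_mx` and `domains_with_mail` are ours. Treating every DNS error as "no mail server" is a simplification made for this sketch.

```python
def lookup_mx(domain):
    """Return the MX hostnames for a domain, or an empty list if none resolve."""
    # Imported here so the filter below can be tried with a stub
    # even where dnspython (pip install dnspython) is not installed.
    import dns.exception
    import dns.resolver

    try:
        answers = dns.resolver.resolve(domain, "MX")
    except dns.exception.DNSException:
        # NXDOMAIN, no MX answer, timeouts, etc. all count as "no mail server" here.
        return []
    hosts = [str(record.exchange).rstrip(".") for record in answers]
    # A "null MX" record has "." as its exchange, which becomes "" and is dropped.
    return [host for host in hosts if host]


def domains_with_mail(domains, lookup=lookup_mx):
    """Keep only the domains for which the lookup returns at least one MX host."""
    return [domain for domain in domains if lookup(domain)]
```

Calling `domains_with_mail(candidate_list)` with the default lookup performs live DNS queries, so for a list of tens of thousands of candidates you would likely want timeouts, retries, and some rate limiting around it.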
To be perfectly clear: we do not believe this list represents the entire universe of domains owned by Fortune 500 companies. But as a first pass, involving a loop through GPT-3 and a loop through DNS records, we think it’s as good as anything out there. We’re putting the list on GitHub as a resource that we hope others will update and help maintain.
In the meantime, we can take the list back to Gigasheet and use it in a Cross-File VLookup. You can find the list we used here:
Get it here 👉 Fortune 500 Email Domains List
Now, let's use Cross-File lookup to go through the original customer list and search for each domain in the Fortune 500 Domain List. If there is a match, the lookup returns the name of the Fortune 500 company. Pretty cool!
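Conceptually, the lookup works like a dictionary join on the domain. The sketch below shows that idea in plain Python; it is an illustration and not how Gigasheet implements the feature. The contact rows are invented, the domain table is a tiny hypothetical excerpt, and "Fortune 500 Company" is an assumed column name.

```python
# Tiny hypothetical excerpt of a domain-to-company table.
FORTUNE_DOMAINS = {
    "amazon.com": "Amazon",
    "amazon.ca": "Amazon",
    "fordlincoln.com": "Ford",
}


def add_company(rows, domain_to_company):
    """Return copies of the rows with a 'Fortune 500 Company' column (None if no match)."""
    matched = []
    for row in rows:
        # Everything after the last "@" is the domain; normalize case before matching.
        domain = row["Email"].rpartition("@")[2].strip().lower()
        matched.append({**row, "Fortune 500 Company": domain_to_company.get(domain)})
    return matched


contacts = [
    {"Name": "Ada Example", "Email": "ada@amazon.ca", "SKU": "A-100"},
    {"Name": "Bo Sample", "Email": "bo@example.org", "SKU": "B-200"},
]
enriched = add_company(contacts, FORTUNE_DOMAINS)
```

Because the match is exact on the full domain, a subdomain such as mail.amazon.com would not match amazon.com unless it is listed separately.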
Here’s the data returned:
From here, it’s straightforward to filter to only the rows with a returned Fortune 500 Company:
We see 695,539 of our 4.8 million records have email addresses associated with Fortune 500 domains:
We can do further groupings and analysis as we see fit:
Expanding a group reveals the individuals at each company.
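For readers who want to reproduce the filter-and-group step in code, here is a small sketch under the same caveats: the rows are invented, and the "Fortune 500 Company" column name is an assumption carried over from the lookup step.

```python
from collections import defaultdict


def group_matches(rows, column="Fortune 500 Company"):
    """Group contact names by matched company, skipping rows with no match."""
    groups = defaultdict(list)
    for row in rows:
        company = row.get(column)
        if company:  # the filter step: keep only rows where the lookup returned a name
            groups[company].append(row["Name"])
    return dict(groups)


# Hypothetical rows as they might look after the lookup.
rows = [
    {"Name": "Ada Example", "Fortune 500 Company": "Amazon"},
    {"Name": "Bo Sample", "Fortune 500 Company": None},
    {"Name": "Cy Person", "Fortune 500 Company": "Amazon"},
    {"Name": "Di Other", "Fortune 500 Company": "Ford"},
]

groups = group_matches(rows)
# Number of matched contacts per company.
counts = {company: len(names) for company, names in groups.items()}
```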
We hope this lookup can be of great help to users with similar use cases. Save A Copy of that lookup sheet into your Gigasheet library and use it! And if you have a similar challenge you need help with, you know how to reach us.
And the best part? All of this is completely free! Sign up here.