Firewall Lockdown Gory Details: The Nightmare Continues

In the previous blog post, we introduced a common problem facing firewall engineers: locking down a production firewall configured with an “allow-all-by-default” policy. We presented a simple firewall lockdown process (illustrated below) and discussed the challenge of using Microsoft Excel to open and analyze large spreadsheets containing hundreds of thousands to millions of rows.

In this blog post, we will continue on the discussion by presenting the next challenge: analyzing firewall logs to identify unique network connections to then translate into configurable rulesets.

When analyzing firewall logs, there really isn’t a single recipe that you can follow each and every time to identify unique network connections. In my opinion, analyzing firewall logs is more an art than it is a science, but repeatability is key. Whatever technique you use, it has to be repeatable to reduce the number of errors and inconsistencies. In its simplest form, a firewall rule contains four elements:

Source IP address/subnet
Destination IP address/subnet
Destination port
Protocol

Whatever technique you use must be able to identify all unique combinations of these four elements which you would then use to write the firewall rules.

In a future post, we’ll show you how you can complete this task in just a few clicks using Gigasheet across tens of millions or even hundreds of millions of rows. Want to help test it out? Join our beta!

What I am going to describe next is a technique that I have used in the past to analyze firewall logs and I will highlight some of my pain points along the way. In the interest of simplification, I will focus the discussion on analyzing firewall logs generated by connection-oriented protocols, such as TCP. I will also use a sample of Cisco ASA firewall logs generated in a test environment.

A Cisco ASA firewall generates at least two log messages per TCP connection: one when a connection is built, and another one when the same connection is torn down. An example of both logs can be seen below for a connection initiated by host 192.168.1.2 to host 192.168.2.1 on TCP port 443:

Built outbound TCP connection 2 for inside:192.168.1.2/31450 (192.168.1.2/31450) to outside:192.168.2.1/443 (192.168.2.1/443)
Teardown TCP connection 2 for inside:192.168.1.2/31450 (192.168.1.2/31450) to outside:192.168.2.1/443 (192.168.2.1/443) duration 0:03:00 bytes 819 TCP FINs from outside

Both logs contain similar information but the log generated when a connection is torn down contains the most valuable details for analysis because it reveals whether a connection is fully established (i.e., TCP three-way handshake) as well as the amount of data transmitted. Ultimately, the analysis should focus on identifying network connections where data is transmitted, and we can do that by analyzing the “bytes” and “TCP flags” fields in the “teardown” log, as illustrated below:

Teardown TCP connection 2 for inside:192.168.1.2/31450 (192.168.1.1/31450) to outside:192.168.2.1/443 (192.168.2.1/443) duration 0:03:00 bytes 819 TCP FINs from outside

The technique that I’ve used in the past includes the following steps:

Eliminate logs generated when TCP connections are built
From the remaining logs (or the logs generated when connections are torn down):
Eliminate logs generated by half-open connections or logs containing a TCP SYN flag, and
Eliminate logs for fully established connections where no data is transmitted, or logs with zero bytes
Identify all unique connections, or unique combinations of source IP, destination IP, and destination port, and protocol

To put this technique into practice, I start by opening a CSV export of the firewall logs in Microsoft Excel:

I then convert all the text to columns using a space and a forward slash (/) as delimiters (I use / to separate the IP address from the port and place each value in a different column):

Next, I enable the automatic filter and deselect the word “Built” from column A to remove all logs generated at connection-built time:

I then delete any columns containing irrelevant information, for example column C, E, J, O, S, and U seen in the previous screenshot:

Next, I filter out the word “SYN” from column O to remove half-opened connections. And lastly, I deselect the number “0” from column N to remove connections with no data transmitted, generating a list of network connections with data transmission.

The final step in the process involves identifying the unique network connections, or unique combinations of source IP, destination IP, and destination port, so that appropriate rulesets can be configured on the firewall. As you can see from the previous screenshot, the network connections in rows 51-52,55-57, and 63-64 would be allowed by the same firewall rule because the same two hosts are communicating over the same destination port. To complete the final step, I delete all columns except the ones containing the source IP, destination IP, and destination port (or columns D, H, and I in the previous screenshot). I then copy the remaining cells into a different Excel sheet and deduplicate all columns, the output resulting in all unique combinations of source IP, destination IP, and destination port.

Eight steps and a couple of hours later, I arrive at the unique network connections that I can now translate into firewall rulesets. Needless to say, this process is very lengthy, tedious, and prone to errors. But luckily, there is a simpler, quicker, and better way to arrive at the same results using Gigasheet.

Gigasheet’s unique data analysis and manipulation capabilities combined with its robust and fine-grained filters make analyzing and extracting valuable data from firewall logs very easy, requiring very little time and effort to complete. No more slicing and dicing spreadsheets, simply upload your CSV log extract to Gigasheet, select your filters, and let Gigasheet do the rest.

Big Data

Firewall Lockdown Gory Details: The Nightmare Continues

The ease of a spreadsheet with the power of a data warehouse.

Similar posts

Big Data

Data Teams - Work Smarter, Not Harder

Big Data

AI Meets Big Data: Leveraging Gigasheet for Retail Optimization

Big Data

Best Practices for Mastering Self-Service Analytics

Webinar: AI for Big Data Analysis
Aug 7 @ 1PM ET

Big Data

Firewall Lockdown Gory Details: The Nightmare Continues

Luciana Obregon

The ease of a spreadsheet with the power of a data warehouse.

Similar posts

Big Data

Data Teams - Work Smarter, Not Harder

Big Data

AI Meets Big Data: Leveraging Gigasheet for Retail Optimization

Big Data

Best Practices for Mastering Self-Service Analytics

Webinar: AI for Big Data AnalysisAug 7 @ 1PM ET

Webinar: AI for Big Data Analysis
Aug 7 @ 1PM ET