Using Linux grep and Windows findstr to Manipulate Files

Published: 2023-03-31
Last Updated: 2023-04-01 14:24:24 UTC
by Guy Bruneau (Version: 1)

Over the years I have found grep to be very versatile. Its most common use is to check whether logs contain a string that matches an IP address, a domain, a service or protocol, an application that was logged, etc.

Years ago, when I built my first DNS Sinkhole [1], I used several combinations of grep to parse and compare files to build the bind lists of domains to sinkhole. I now use Pi-Hole [2], which follows the same principles but is managed via its interface.

My early sinkhole used a series of grep commands to compare two files. The following example compares a wildcard list of country codes that were already blocked by the sinkhole against a list of known bad domains published on various websites [3]. To demonstrate how to use a list to compare and remove the blocked domains, I will use my Pi-Hole domain list.

The first step is to create the filter list file, called toremove. A similar list could be applied for individual domains that are already blocked (e.g. google.com, sans.isc). Here the file contains the following top-level domains that are already blocked (it could hold as many as the organization needs):

.*\.bazar
.*\.biz
.*\.xyz

Before we start, let's get a count of how many lines we have in the file list.2.pihole.xxxx.ca.domains with wc -l to establish a baseline:

The output shows there are 505196 records in this file. The options used with grep are as follows:

  • -w - Select only those lines containing matches that form whole words.
  • -h - Suppress the prefixing of file names on output.
  • -v - Invert the sense of matching, to select non-matching lines.
  • -f - Obtain patterns from FILE, one per line.

The next step is to compare the top-level domain list against a downloaded domain list:

grep -whvf /root/toremove list.2.pihole.xxxx.ca.domains 
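To make the combined effect of these options easier to follow, here is a minimal Python sketch of the same kind of filter. It is an illustration rather than a replacement for grep: the function name filter_domains and the sample domains are made up, the patterns are modeled on the toremove file (with the dot before xyz escaped like the other two entries), and the -w test is only approximated with lookarounds. The -h option has no counterpart here because only one input is read.

```python
import re

def filter_domains(patterns, lines):
    """Rough equivalent of `grep -w -v -f PATTERNS`: keep lines matching no pattern."""
    # -w: a match must not be preceded or followed by a word character
    # (letter, digit or underscore); the lookarounds approximate that test.
    compiled = [re.compile(r"(?<!\w)(?:" + p + r")(?!\w)", re.ASCII)
                for p in patterns]
    # -v: invert the match, so only lines that match none of the patterns survive.
    return [line for line in lines
            if not any(rx.search(line) for rx in compiled)]

# Patterns modeled on the toremove file (one per line in the real file).
patterns = [r".*\.bazar", r".*\.biz", r".*\.xyz"]
# Made-up sample data, not taken from the real domain list.
sample = ["example.com", "shop.biz", "bad.xyz", "market.bazar", "example.bizarre.net"]

kept = filter_domains(patterns, sample)
print(kept)
```

In this sketch example.bizarre.net survives because the text matched by .*\.biz is followed by another word character, which is the situation the whole-word option is meant to exclude.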

The first run of grep prints the filtered list. Re-running the same command and this time grepping its output for any domains ending with .xyz$ confirms that they have been removed from the list. The $ at the end of xyz indicates the 'end of the line'.
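The verification step (searching the filtered output for .xyz$) can be sketched in Python as follows. This is a hedged illustration with a hypothetical helper name and made-up data. One assumption worth flagging: in the pattern .xyz$ the unescaped dot matches any single character, so the sketch escapes it to match a literal dot.

```python
import re

# Escaped dot: match a literal ".xyz" only at the end of the line.
ENDS_WITH_XYZ = re.compile(r"\.xyz$")

def remaining_xyz(lines):
    """Return the lines that still end with .xyz (an empty list means none are left)."""
    return [line for line in lines
            if ENDS_WITH_XYZ.search(line.rstrip("\r\n"))]

# Made-up filtered output for illustration.
filtered = ["example.com", "xyz.example.org", "example.bizarre.net"]
leftovers = remaining_xyz(filtered)
print(len(leftovers))
```

A domain such as xyz.example.org is not reported, because the $ anchor only accepts .xyz at the very end of the line.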

Let’s recheck what we have left after removing the three top-level domains from the list:

We now have 375686 domains left in the list. The command removed 129510 records.
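The figures above are related by a simple subtraction. The short Python sketch below checks that arithmetic and includes a hypothetical helper, count_lines, that counts newline characters in the way wc -l does; the in-memory data is only a stand-in for a real file.

```python
import io

def count_lines(stream):
    """Count newline characters in a binary stream, as `wc -l` does."""
    return sum(chunk.count(b"\n")
               for chunk in iter(lambda: stream.read(65536), b""))

baseline = 505196   # records before filtering (from the text)
remaining = 375686  # records after filtering (from the text)
removed = baseline - remaining
print(removed)  # 129510

# Tiny in-memory stand-in for a domain list file.
demo = io.BytesIO(b"example.com\nexample.org\nexample.net\n")
print(count_lines(demo))  # 3
```

For a real file, the same helper could be called on a handle opened in binary mode, for example open("list.2.pihole.xxxx.ca.domains", "rb").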

Windows findstr

It is possible to repeat the same search using Windows findstr. Let's list the options used to filter the file:

  • /v - Prints only lines that don't contain a match.
  • /g:filename - Gets search strings from the specified file.

This is how to do it:

findstr /v /g:toremove list.2.pihole.xxxx.ca.domains | findstr .xyz$

These are the options used with find (find /?) to count the number of lines left:

  • /V - Displays all lines NOT containing the specified string.
  • /C - Displays only the count of lines containing the string.
  • "" - Specifies the text string to find.

Let’s recheck to confirm that findstr (findstr /?) removed the three top-level domains from the list:

findstr /v /g:toremove list.2.pihole.xxxx.ca.domains | find /v /c ""

This outputs the same result as grep: 375686 domains left in the list. The command removed 129510 records.

This highlights the versatility of both tools, which can work through a large amount of data quickly and still obtain the same result. This is another example of Living Off the Land Binaries (LOLBins) [4].

[1] https://www.sans.org/white-papers/33523/
[2] https://pi-hole.net/
[3] https://raw.githubusercontent.com/jonschipp/mal-dnssearch/master/mandiant_apt1.dns
[4] https://isc.sans.edu/diary/Linux+LOLBins+Applications+Available+in+Windows/29296

-----------
Guy Bruneau IPSS Inc.
My Handler Page
Twitter: GuyBruneau
gbruneau at isc dot sans dot edu


Use of X-Frame-Options and CSP frame-ancestors security headers on 1 million most popular domains

Published: 2023-03-31
Last Updated: 2023-03-31 12:57:26 UTC
by Jan Kopriva (Version: 1)

In my last Diary[1], I briefly mentioned the need for a correctly set Content Security Policy and/or the obsolete[2] X-Frame-Options HTTP security header, (not just) in order to prevent phishing pages that overlay a fake login prompt over a legitimate website from functioning correctly. Or, to be more specific, to prevent them from dynamically loading a legitimate page in an iframe under the fake login prompt, since without the legitimate page in the background, such phishing websites look much less like a legitimate login page and are thus much less effective.

Discussion of the aforementioned headers led me to the question of how common the use of these headers is and how they are usually set, which is what we will take a short look at today.

Although data about general trends in the use of these headers may be found online[3,4], I wanted to go a little bit more in-depth. I have therefore written a short Python script, which goes through the current Tranco list of the one million most popular domains[5] and gathers data about which HTTP security-related headers are used on each one (provided the domain points to an HTTP server).
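The script itself is not reproduced here, so the following is only a rough sketch of how such a collection step might look, not the code actually used. It assumes the list is a CSV of rank,domain rows, that each domain is queried once over HTTPS with a HEAD request, and that the function names and User-Agent string are placeholders chosen for illustration.

```python
import csv
import http.client
import io
from urllib.error import HTTPError
from urllib.request import Request, urlopen

def read_domains(csv_text, limit=None):
    """Parse 'rank,domain' rows (the layout assumed for the Tranco CSV)."""
    rows = csv.reader(io.StringIO(csv_text))
    domains = [row[1].strip() for row in rows if len(row) >= 2]
    return domains[:limit]

def fetch_headers(domain, timeout=5):
    """Return response headers with lower-cased names, or None if unreachable."""
    request = Request(f"https://{domain}/", method="HEAD",
                      headers={"User-Agent": "header-survey-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            items = response.headers.items()
    except HTTPError as error:
        # 4xx/5xx responses still carry headers worth recording.
        items = error.headers.items()
    except (OSError, http.client.HTTPException):
        # DNS failures, refused connections, timeouts, TLS or protocol errors.
        return None
    # Note: a repeated header name keeps only its last value in this dict.
    return {name.lower(): value for name, value in items}

# Made-up rows in the assumed CSV layout; no network access happens here.
sample_csv = "1,example.com\n2,example.org\n3,example.net\n"
print(read_domains(sample_csv, limit=2))
```

A survey of a million domains would realistically need concurrency, retries and a fallback to plain HTTP, all of which are left out to keep the sketch short.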

In total, the script gathered data about 21 different headers (e.g., X-XSS-Protection, Strict-Transport-Security, Cross-Origin-Resource-Policy, etc.) and their specific settings. Since results for the other headers might be interesting as well, I might write another diary discussing those once I’ve had more time to go over the data. For now, however, let us take a look at how common the use of the two headers which may be used to set restrictions for embedding a website in an iframe or other object is. Specifically, we will look at the use of the X-Frame-Options header and the use of CSP policies containing the frame-ancestors directive (since CSP doesn’t block the behavior we are interested in, the so-called “framing attacks”, without this directive in place, we will only focus on CSP headers in which the directive is present).
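As an illustration of the distinction drawn above, the following Python sketch inspects a set of response headers and reports the X-Frame-Options value and the frame-ancestors directive, if any. It is a simplified sketch under stated assumptions: the function name and sample headers are hypothetical, only a single enforced policy is considered (no comma-separated policy lists and no Content-Security-Policy-Report-Only), and values are not validated.

```python
def framing_protection(headers):
    """Report which anti-framing mechanisms a response's headers declare."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    xfo = normalized.get("x-frame-options", "").strip()
    csp = normalized.get("content-security-policy", "")

    frame_ancestors = None
    # Directives in a policy are separated by semicolons.
    for directive in csp.split(";"):
        parts = directive.split()
        if parts and parts[0].lower() == "frame-ancestors":
            frame_ancestors = " ".join(parts[1:])
            break  # only the first occurrence of a directive counts

    return {"x_frame_options": xfo.upper() or None,
            "frame_ancestors": frame_ancestors}

# Made-up headers for illustration.
example = {
    "X-Frame-Options": "sameorigin",
    "Content-Security-Policy":
        "default-src 'self'; frame-ancestors 'self' https://example.com",
}
result = framing_protection(example)
print(result)
```

A CSP header without frame-ancestors yields None for that field, which matches the decision to count only policies in which the directive is present.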

As you may see from the following chart, at the time of writing, over 27.1% of the top 1000 most popular domains on the Tranco list used one or both of the aforementioned headers to prevent embedding of their content on undesirable domains. Of the top 100k domains, however, only 20.6% did so, and of the top 1 million domains, only a little more than 14.4%.

On the following charts, you may see how different X-Frame-Options and CSP frame-ancestors directives were represented in different sample sizes.

From the available data, it appears that while the X-Frame-Options HTTP header is used on more than 13.84% of the top 1 million domains, CSP with the frame-ancestors directive is set by only 1.91% of web servers hosted on such domains.
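Percentages of this kind can be derived from per-domain results with a small tally. The Python sketch below is a guess at the shape of that calculation, not the author's code: the function name, the record layout and the sample records are made up, and it assumes the denominator is every domain in the sample, ordered by rank.

```python
def usage_share(results, top_n):
    """Percentage of the first top_n results using XFO, frame-ancestors, or either."""
    sample = results[:top_n]
    total = len(sample)
    if total == 0:
        return {"xfo": 0.0, "frame_ancestors": 0.0, "either": 0.0}

    xfo = sum(1 for r in sample if r["xfo"])
    fa = sum(1 for r in sample if r["frame_ancestors"])
    # A domain setting both headers is counted once in "either".
    either = sum(1 for r in sample if r["xfo"] or r["frame_ancestors"])

    def pct(count):
        return round(100.0 * count / total, 2)

    return {"xfo": pct(xfo), "frame_ancestors": pct(fa), "either": pct(either)}

# Made-up results ordered by rank; not real measurement data.
results = [
    {"xfo": True, "frame_ancestors": True},
    {"xfo": True, "frame_ancestors": False},
    {"xfo": False, "frame_ancestors": False},
    {"xfo": False, "frame_ancestors": False},
]
top2 = usage_share(results, 2)
top4 = usage_share(results, 4)
print(top2, top4)
```

Because a domain that sets both headers is counted once in the combined figure, the two individual shares can add up to more than the share of domains using either mechanism.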

Since one can reasonably assume that the domains listed on the Tranco list would have above-average, or at least average, security measures in place, it would seem that when it comes to protecting us from framing attacks (and from the aforementioned phishing pages which take advantage of them), the deprecated X-Frame-Options header is still the most commonly used mechanism on the internet. This is supported even by results from Shodan[6,7], which, at the time of writing, detected the X-Frame-Options header on more than 41 million public IPs and CSP with the frame-ancestors directive on only approximately 12 million IPs.

[1] https://isc.sans.edu/forums/diary/IPFS+phishing+and+the+need+for+correctly+set+HTTP+security+headers/29638/
[2] https://w3c.github.io/webappsec-csp/#frame-ancestors-and-frame-options
[3] https://trends.builtwith.com/docinfo/X-Frame-Options
[4] https://trends.builtwith.com/docinfo/Content-Security-Policy
[5] https://tranco-list.eu/
[6] https://www.shodan.io/search?query=HTTP+%22x-frame-options%22
[7] https://www.shodan.io/search?query=HTTP+%22content-security-policy%22+%22frame-ancestors%22

-----------
Jan Kopriva
@jk0pr
Nettles Consulting

ISC Stormcast For Friday, March 31st, 2023 https://isc.sans.edu/podcastdetail.html?id=8434
