Favicons are the small icons that you see in the browser tab next to the website title or in your bookmarks. For example, the Shodan logo on the left side of the browser tab is the favicon:
They typically contain the company's logo, which gives them 2 functions:
- An easy way to find the tab of a website when you have multiple open tabs.
- A sense of authenticity that the website you’re visiting belongs to the right company.
Shodan collects the favicons for websites and stores the information in the http.favicon property:
{
"data": "AAABAAIAEBAAAAAAIABoBAAAJgAAACAgAAAAACAAqBAAAI4EAAAoAAAAEAAAACAAAAABACAAAAAAnAEAEAAAAAAAAAAAAAAAAAAAAAAAA////Af///wH///8B////Af///wH///8B////ASWX/RUne/MXn////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////ASan/EEnnbvThJm/05yai/EP///8B////Af///wH///8B////Af///wH///8B////Af///wH///8BJqP8Ayajn/I0mi/v5J1Tt/SdT7Psmi/z5JqL8j////wH///8B////Af///wH///8B////...",
"hash": 516963061,
"location": "https://about.gitlab.com:443/ico/favicon.ico"
}
- data contains the image as a base64-encoded string.
- hash is the MurmurHash3 of the data property. The Shodan API has a search filter called http.favicon.hash to search based on this value.
- location lets you know where the favicon was found. Historically, the favicon.ico file was located at the root of the web server but it can be put in any arbitrary location by referencing it in the HTML. For example:
<link rel="icon" type="image/png" href="https://blog.shodan.io/assets/favicon-yellow-018213ceb87b472388095d0264be5b4319ef47471dacea03c83ecc233ced2fd5.png" />
At Shodan, we developed the technique of hashing the favicon to make it possible to search across the Internet for identical favicon images. We developed it nearly a decade ago to help with 2 use cases:
- Identify Phishing Websites: bad actors will commonly use the same favicon as the website they’re imitating. By searching for the favicon of a company you can identify potential phishing websites.
- Origin IP Disclosure: websites that are hosted behind a CDN (e.g. Cloudflare) should restrict access to their web server so that it only accepts connections from the CDN. By searching for the favicon of a website you can confirm that the website has been correctly configured and isn't responding to requests from its origin IP.
The favicon hash is calculated by applying the MurmurHash3 algorithm to the http.favicon.data property on the banner.
Why MMH3? The key considerations when we developed the technique were the speed of the hashing algorithm and the size of the resulting hash. We didn't need the cryptographic guarantees of MD5 etc.
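As a rough illustration of the calculation described above, the following Python sketch hashes the raw bytes of a favicon. It rests on assumptions that this article does not state: that the base64 text is produced the way Python's base64.encodebytes produces it (a newline after every 76 characters plus a trailing newline), that the 32-bit x86 variant of MurmurHash3 is used with seed 0, and that the result is reported as a signed 32-bit integer. The function names murmur3_32 and favicon_hash are made up for this example, and the hash is written in pure Python only so that the snippet has no dependencies.

```python
import base64

MASK32 = 0xFFFFFFFF


def _rotl32(value: int, shift: int) -> int:
    """Rotate a 32-bit value left by `shift` bits."""
    return ((value << shift) | (value >> (32 - shift))) & MASK32


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 (x86, 32-bit), returned as a signed 32-bit integer."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    n_blocks = len(data) // 4

    # Body: mix the input four bytes at a time (little-endian).
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK32

    # Tail: mix the remaining 1 to 3 bytes, if any.
    tail = data[4 * n_blocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k

    # Finalization: fold in the length and run the avalanche steps.
    h ^= len(data) & MASK32
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16

    # Assumption: the hash is reported as a signed 32-bit integer.
    return h - 0x100000000 if h & 0x80000000 else h


def favicon_hash(raw_image: bytes) -> int:
    """Hash the base64 text of a favicon (hypothetical helper)."""
    # Assumption: base64 text is wrapped at 76 characters with newlines.
    encoded = base64.encodebytes(raw_image)
    return murmur3_32(encoded)


if __name__ == "__main__":
    import sys

    # Usage: python favicon_hash.py favicon.ico
    with open(sys.argv[1], "rb") as handle:
        print(favicon_hash(handle.read()))
```

If the value printed for a known favicon differs from what favscan reports for the same file, one of the assumptions listed above is the most likely cause.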
favscan
We provide a simple tool called favscan that calculates the favicon hash given a URL, hostname or local file path.
$ favscan -h
Calculate the favicon hash of a local file, hostname or URL
Usage: favscan [OPTIONS] <LOCATION>
Arguments:
<LOCATION>
Options:
-v, --verbose
-h, --help Print help
-V, --version Print version
favscan will first look for the favicon in the common /favicon.ico path and if that fails it will check the frontpage for a shortcut icon link. The tool is available for download across many platforms:
For example, to get the favicon hash for google.com you would run:
favscan google.com
You can also specify ports as part of the URL:
favscan https://test.shodan.io:6993
Or calculate it for a local file:
favscan favicon.ico
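The lookup order described earlier in this section (the /favicon.ico path first, then an icon link on the front page) can be sketched in Python as follows. This is not favscan's actual code. It performs no network requests and only builds the ordered list of URLs a client could try, given the front page's HTML. The names IconLinkParser and candidate_favicon_urls are hypothetical, and treating any link tag whose rel attribute contains "icon" as a match is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IconLinkParser(HTMLParser):
    """Remember the href of the first <link> whose rel contains "icon"."""

    def __init__(self) -> None:
        super().__init__()
        self.icon_href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.icon_href is not None:
            return
        attributes = dict(attrs)
        # rel may hold several tokens, e.g. "shortcut icon".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "icon" in rel_tokens and attributes.get("href"):
            self.icon_href = attributes["href"]


def candidate_favicon_urls(page_url: str, html: str) -> list:
    """Return favicon URLs to try, in order of preference."""
    # 1. The conventional location at the root of the web server.
    candidates = [urljoin(page_url, "/favicon.ico")]

    # 2. Fallback: an icon referenced from the page's HTML.
    parser = IconLinkParser()
    parser.feed(html)
    if parser.icon_href:
        candidates.append(urljoin(page_url, parser.icon_href))
    return candidates


if __name__ == "__main__":
    sample = '<head><link rel="shortcut icon" href="/assets/fav.png" /></head>'
    for url in candidate_favicon_urls("https://example.com/blog/", sample):
        print(url)
```

A real client would request each candidate in turn and stop at the first one that returns an image.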
Example
Let's say we want to find public instances of Gitlab using favicons. We start off by grabbing the favicon hash of a known Gitlab instance:
$ favscan gitlab.com
1265477436
We then take that hash and use it in a search query of:
http.favicon.hash:1265477436
The search query can be used on the website, CLI or API. For now, let's just see how many instances there are based on the favicon:
$ shodan count http.favicon.hash:1265477436
29558
And this is what it looks like on the website:
https://www.shodan.io/search/report?query=http.favicon.hash%3A1265477436
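For scripting, the query string and the report URL shown above can be assembled from a hash with the Python standard library. This is a small sketch: the helper names favicon_query and report_url are made up for illustration, and the URL layout is simply copied from the link above.

```python
from urllib.parse import urlencode


def favicon_query(favicon_hash: int) -> str:
    """Build the search query for a given favicon hash."""
    return f"http.favicon.hash:{favicon_hash}"


def report_url(query: str) -> str:
    """Build the website report URL for a search query."""
    # urlencode percent-encodes the colon in the query as %3A.
    return "https://www.shodan.io/search/report?" + urlencode({"query": query})


if __name__ == "__main__":
    query = favicon_query(1265477436)
    print(query)
    print(report_url(query))
```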
Note: Shodan already fingerprints Gitlab services so you can search for product:gitlab instead of using favicons.