Open Source Intelligence
Open Source tools and techniques for information gathering
Background
Open Source Intelligence (OSINT) is the process of gathering information about a target from publicly available sources, such as websites, social media, government databases, and other sources, in order to aid with penetration testing.
Techniques
There are many techniques for gathering OSINT that include utilizing search engines, social media, public databases, forums, doument sharing websites, and the darkweb.
Search Engines
Google, Bing, Yahoo, and other search engines are some of the most commonly used tools for OSINT. They allow you to search for information on a particular topic or individual.
See Google Dorks for information on using Google for OSINT
Search Engine News aggregators: News aggregators such as Google News, Yahoo News, and the BBC News website can be used to search for news articles on a particular topic or individual.
Reverse image search engines: Reverse image search engines such as Google Images, TinEye, and Karma Decay allow you to search for other instances of a particular image on the Internet.
Online maps: Online maps such as Google Maps and Bing Maps can be used to gather information about a particular location or to visualize geographical data.
Social Media
Social media platforms such as Twitter, Facebook, Instagram, and LinkedIn provide a wealth of information about individuals and organizations.
Tools such as Social Mention, Brand24, and Hootsuite can be used to monitor social media for specific keywords and mentions.
Public Databases
Publicly available databases such as the Securities and Exchange Commission (SEC) EDGAR database, the United States Patent and Trademark Office (USPTO) database, and the General Services Administration (GSA) database can provide information about companies and individuals.
Forums
Online forums and discussion boards such as Reddit, Quora, and Yahoo Answers can provide valuable insights into a particular topic or individual.
Document Sharing Websites
*Websites such as Scribd, Slideshare, and Docstoc can be used to find publicly available documents such as presentations, reports, and white papers. GitHub can also be a source of valuable OSINT.
Darkweb
The dark web, also known as the deep web, is a portion of the internet that is not indexed by search engines and can only be accessed using special software such as Tor. While it is often associated with illegal activities, the dark web can also be a valuable source of information for OSINT.
Email Address Enumeration
site:organisation.com intext:@organisation.com
https://github.com/sherlock-project/sherlock
https://haveibeenpwned.com/
Tools
The following tools can be used to gather OSINT, including Google Dorks, Malteo, OSINT Framework, TheHarvester, Recon-ng, OWASP Amass, SpiderFoot, Shodan, Whois Lookup Tools, WhoisXML API, and URL Scanners.
Google Dorks
Google Dorks are advanced search queries that can be used to find specific types of information on the Internet.
Here is a simple list of Google Search Operators:
cache:
cache:www.google.com web
Shows the version of the web page that Google has in its cache
link:
link:www.google.com
List webpages with links pointing to the Google homepage
related:
related:www.google.com
List web pages that are similar to the Google homepage
info:
info:www.google.com
Shows information about the Google homepage
define:
define:google
Provides a definition of the words you enter after it
stocks:
stocks: intc yhoo
Will show information about Intel and Yahoo, use ticker
site:
help site:com
Restricts the results to those websites in the given domain
allintitle:
allintitle: google search
Restricts the results to those with all of the query words in the title
intitle:
intitle:google
Restricts the results to documents containing that word in the title
allinurl:
allinurl: foo/bar
Restricts the results to those with all of the query words in the url
inurl:
inurl:google
Restricts the results to documents containing that word in the url
Resources
Exploit-DB Google Hacking Database: This is a comprehensive database of Google Dorks, maintained by Offensive Security.
Google Dorking Tool: A free online tool that generates Google Dorks based on keywords and search parameters.
Maltego
Maltego is an OSINT tool that allows you to visualize and analyze relationships between people, companies, domains, IP addresses, and other entities.
The community edition of Maltego can be used for free by following this Maltego guide
OSINT Framework
OSINT Framework is a comprehensive list of OSINT resources, including websites, databases, and tools, organized by category.
TheHarvester
TheHarvester is an OSINT tool included in Kali that allows you to gather information about a target from various public sources such as search engines, social media, and other websites. The tool gathers emails, names, subdomains, IPs and URLs using multiple public data sources that include:
Here is an example of TheHarvester usage where
-d kali.org
specifies the domain to search,-l 500
limits the search to 500 results, &-b google
specifies the data source to use:
SimplyEmail
SimplyEmail can be used to enumerate email addreses. SimplyEmail can then verify the discovered email addresss after gathering. The tool works better when using proxies or long throttle times to avoid Captchas.
Recon-ng
Recon-ng is a web reconnaissance framework included in Kali that allows you to gather information about a target from various sources such as search engines, social media, and other websites. The Recon-ng wiki is found here: https://github.com/lanmaster53/recon-ng/wiki
OWASP Amass
OWASP Amass is an OSINT tool that allows you to gather information about a target's network, including subdomains, IP addresses, and other assets.
Resources
Installation Guide: https://github.com/OWASP/Amass/blob/master/doc/install.md
SpiderFoot
SpiderFoot is an OSINT tool that allows you to gather information about a target from various sources such as search engines, social media, and other websites.
Resources
SpiderFoot GitHub Repo: https://github.com/smicallef/spiderfoot
SpiderFoot Guide: https://nixintel.info/osint-tools/getting-started-with-spiderfoot/
Shodan
Shodan is a search engine for Internet-connected devices and can be used to gather information about a target's network, including IP addresses, open ports, and running services.
Shodan General Filters
Name
Description
Type
after
Only show results after the given date (dd/mm/yyyy) string
string
asn
Autonomous system number string
string
before
Only show results before the given date (dd/mm/yyyy) string
string
category
Available categories: ics, malware string
string
city
Name of the city string
string
country
2-letter country code string
string
geo
Accepts between 2 and 4 parameters. If 2 parameters: latitude,longitude. If 3 parameters: latitude,longitude,range. If 4 parameters: top left latitude, top left longitude, bottom right latitude, bottom right longitude.
string
hash
Hash of the data property integer
integer
has_ipv6
True/ False boolean
boolean
has_screenshot
True/ False boolean
boolean
hostname
Full hostname for the device string
string
ip
Alias for net filter string
string
isp
ISP managing the netblock string
string
net
Network range in CIDR notation (ex. 199.4.1.0/24) string
string
org
Organization assigned the netblock string
string
os
Operating system string
string
port
Port number for the service integer
string
postal
Postal code (US-only) string
string
product
Name of the software/ product providing the banner string
string
region
Name of the region/ state string
string
state
Alias for region string
string
version
Version for the product string
string
vuln
CVE ID for a vulnerability string
string
Shodan HTTP Filters
Name
Description
Type
http.component
Name of web technology used on the website
string
http.component_category
Category of web components used on the website
string
http.html
HTML of web banners
string
http.html_hash
Hash of the website HTML
integer
http.status
Response status code
integer
http.title
Title for the web banners website
string
Shodan NTP Filters
Name
Description
Type
ntp.ip
IP addresses returned by monlist
string
ntp.ip_count
Number of IPs returned by initial monlist
integer
ntp.more
True/ False; whether there are more IP addresses to be gathered from monlist
boolean
ntp.port
Port used by IP addresses in monlist
integer
Shodan SSL Filters
Name
Description
Type
has_ssl
True / False
boolean
ssl
Search all SSL data
string
ssl.alpn
Application layer protocols such as HTTP/2 ("h2")
string
ssl.chain_count
Number of certificates in the chain
integer
ssl.version
Possible values: SSLv2, SSLv3, TLSv1,TLSv1.1, TLSv1.2
string
ssl.cert.alg
Certificate algorithm
string
ssl.cert.expired
True / False
boolean
ssl.cert.extension
vNames of extensions in the certificate
string
ssl.cert.serial
Serial number as an integer or hexadecimal string
integer / string
ssl.cert.pubkey.bits
Number of bits in the public key
integer
ssl.cert.pubkey.type
Public key type
string
ssl.cipher.version
SSL version of the preferred cipher
string
ssl.cipher.bits
Number of bits in the preferred cipher
integer
ssl.cipher.name
Name of the preferred cipher
string
Shodan Telnet Filters
Name
Description
Type
telnet.option
Search all the options
string
telnet.do
The server requests the client do support these options
string
telnet.dont
The server requests the client to not support these options
string
telnet.will
The server supports these options
string
telnet.wont
The server doesnt support these options
string
Shodan Resources
Whois Lookup Tools
Whois lookup tools allow you to gather information about the owner of a domain name or IP address.
WhoisXML API
WhoisXML API is a database of information about domains, IP addresses, and other network assets.
URL Scanners
URL scanners such as VirusTotal and URLScan allow you to scan a website for potential security threats and to gather information about the website's content and structure.
GitLeaks
Gitleaks is a SAST tool for detecting and preventing hardcoded secrets like passwords, api keys, and tokens in git repos. Gitleaks is an easy-to-use, all-in-one solution for detecting secrets, past or present, in your code.
Installation:
For usage see GitLeak on GitHub
Metagoofil
Metagoofil is an information gathering tool designed for extracting metadata of public documents (pdf,doc,xls,ppt,docx,pptx,xlsx) belonging to a target.
Metagoofil is included in Kali Linux. The help menu can be launched running:
FOCA
FOCA (Fingerprinting Organizations with Collected Archives) is an application used mainly to find metadata and hidden information in the documents it scans. These documents may be on web pages, and can be downloaded and analysed with FOCA.
FOCA is available here.
Resources
Last updated
Was this helpful?