Hi, these are the notes I took while watching the “Practical recon techniques for bug hunters & pen testers” talk given by Bharath Kumar on LevelUp 0x02 / 2018.
Links
- Video
- Github repo: slides & scripts mentioned in this talk
About
This talk is about some practical recon techniques for bug hunters & pentesters. It’s a continuation of Bharath’s talk about niche subdomain enumeration techniques.
Demo environment
- Nameservers:
- ns1.insecuredns.com
- ns2.insecuredns.com
- Domains:
- totallylegit.in
- insecuredns.com
- Bharath authorizes running only the DNS & DNSSEC attacks mentioned later against them
What is recon?
Reconnaissance is the act of gathering preliminary data or intelligence on your target. The data is gathered in order to better plan for your attack. Reconnaissance can be performed actively or passively.
What to look for during recon?
- Info to increase attack surface (domains, net blocks)
- Credentials (email, passwords, API keys)
- Sensitive information
- Infrastructure details (technologies used)
Enumerating domains
- Goal: find/correlate all domain names owned by a single entity
- Types of domain correlation:
- Horizontal: Find domains related to a domain
- Vertical: Find subdomains
- Source: https://0xpatrik.com/asset-discovery/
Subdomain enumeration
Using popular search engines
- Use advanced search operators of Google & Bing:
  - site: for vertical correlation
    - Automated by tools like Sublist3r
  - ip: for horizontal correlation
    - Useful if the app is on shared hosting
Using 3rd party information aggregators
VirusTotal
- VirusTotal runs its own passive DNS replication service, built by storing DNS resolutions performed when visiting URLs submitted by users
- Can be queried using API: https://www.virustotal.com/ui/domains/example.com/subdomains?limit=40
- Tool to automate it: virustotal_subdomain_enum.py
python virustotal_subdomain_enum.py example.com 20
- Tip: Use shell functions to quickly perform some recon tasks (add them to ~/.bashrc):
find-subdomains-vt() { curl -s "https://www.virustotal.com/ui/domains/$1/subdomains?limit=$2" | jq '.data[].id'; }
find-subdomains-vt eff.org 20
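The jq one-liner above can also be done in Python. This is an illustrative parser assuming the response shape implied by the `.data[].id` filter (the endpoint is from the notes; the sample JSON is made up):

```python
import json

def extract_subdomains(vt_response: str):
    """Pull subdomain names out of a VirusTotal domain/subdomains
    response, whose items sit under data[].id (per the jq filter above)."""
    payload = json.loads(vt_response)
    return [item["id"] for item in payload.get("data", [])]

# Made-up sample shaped like the response the jq filter expects
sample = '{"data": [{"id": "www.example.com"}, {"id": "dev.example.com"}]}'
print(extract_subdomains(sample))  # ['www.example.com', 'dev.example.com']
```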
Viewdns.info
- Handy service for recon (DNS, WHOIS, reverse WHOIS…)
- Use emails addresses found for horizontal domain correlation
Certificate Transparency
- Created to mitigate attacks against CAs
- Under CT, a CA has to publish all SSL/TLS certificates they issue in a public log
- Anyone can look through the CT logs & find certificates issued for a domain
- Details of known CT log files
- Certificate Transparency Part 2 — The bright side
Certificate Transparency - Side effect
- CT logs by design contain all the certificates issued by a CA for any given domain
- They allow attackers to passively gather a lot of information about an organization’s infrastructure: domains including internal domains, subdomains, email addresses
- Certificate Transparency Part 3— The dark side
Searching through CT logs
- CT logs search engines:
  - https://crt.sh/
    - Search query: %eff.org
    - Script to extract subdomains from crt.sh: crtsh_enum_psql.py
python crtsh_enum_psql.py eff.org
  - https://censys.io/
  - https://developers.facebook.com/tools/ct/
    - Use the Certificate Transparency Monitoring functionality to keep track of an organization's subdomains via email notifications
  - https://google.com/transparencyreport/https/ct/
- Run your own CT logs monitor
- Certstream
- Real-time CT log update stream
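A sketch of what a Certstream consumer looks like: the matching helper below is testable offline, and the commented-out `listen_for_events` call shows how it would plug into the `certstream` pip package. The message layout (`message_type`, `data.leaf_cert.all_domains`) follows that package's README; treat it as an assumption:

```python
def matching_domains(all_domains, target):
    """Return cert SAN entries that are the target domain or fall under it."""
    suffix = "." + target
    return [d for d in all_domains if d == target or d.endswith(suffix)]

def callback(message, context):
    """Certstream callback; fires on every new certificate seen in CT logs."""
    if message["message_type"] == "certificate_update":
        hits = matching_domains(
            message["data"]["leaf_cert"]["all_domains"], "example.com")
        if hits:
            print(hits)

# Uncomment to watch the live CT update stream (needs `pip install certstream`):
# import certstream
# certstream.listen_for_events(callback, url="wss://certstream.calidog.io/")

print(matching_domains(["www.example.com", "example.org", "example.com"], "example.com"))
# ['www.example.com', 'example.com']
```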
Downside of CT for recon
- CT logs are append-only => There is no way to delete existing entries
- => Many false positives: Domain names found in CT logs may not exist anymore & thus can’t be resolved to an IP address
- Not all subdomains found will be accessible, some are internal domains. But they would be useful for internal pentests
- A penetration tester’s guide to subdomain enumeration
Solution: CT logs + Massdns
- Use Massdns with CT logs script to quickly identify resolvable domain names
python3 ct.py example.com | ./bin/massdns -r resolvers.txt -t A -a -o -w results.txt -
Using Certspotter
- Does vertical & horizontal domain correlation
find-cert() { curl -s "https://certspotter.com/api/v0/certs?domain=$1" | jq -c '.[].dns_names' | grep -o '"[^"]\+"'; }
Using Certdb.com
- Crt.sh gets data only from CT logs, where “legit” CAs submit the certs they issue
- https://certdb.com not only gets certificates from CT logs, but also scans the IPv4 address space, domains and “finds & analyzes” all the certificates
curl -L -sd "api_key=API-KEY&q=Organization:\"tesla\"&response_type=0"
Finding vulnerable CMS using CT
- CT logs can expose (in real time) HTTPS apps during their CMS (Wordpress, Joomla…) install process where the installer has no form of authentication
- => If you are fast enough, you could take over the server
- This is a known attack technique
- Abusing Certificate Transparency or How to hack Web applications before installation. by Hanno Böck at Defcon 25
- Modern Internet-Scale Network Reconnaissance by HD Moore at BSidesLV 2017
Censys.io
- Aggregates SSL certificates from CT logs & the results of SSL scans on IPv4 address space
- Good source of domains & email addresses
- censys_subdomain_enum.py
- Tool to extract domains/emails from SSL/TLS certs using Censys
python censys_subdomain_enum.py --verbose domains.txt
- Censys.io Guide: Discover SCADA and Phishing Sites
Content Security Policy (CSP)
- CSP HTTP headers allow devs to create a whitelist of sources of trusted content & instruct the browser to only execute or render resources from those sources
- => It’s basically a list of domains!
- Extract domains from CSP headers with domains-from-csp
python csp_parser.py https://flipkart.com
python csp_parser.py https://flipkart.com -r
(the -r flag also resolves the extracted domains)
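A minimal illustration of the idea behind domains-from-csp (not the actual tool): split the header into directives and keep the tokens that look like hosts:

```python
import re

def domains_from_csp(header: str):
    """Extract host names from a Content-Security-Policy header value.
    Illustrative re-implementation of the domains-from-csp idea."""
    hosts = set()
    for directive in header.split(";"):
        for token in directive.split()[1:]:  # skip the directive name
            # Strip scheme and path; keep only the host part
            host = re.sub(r"^[a-z+.-]+://", "", token).split("/")[0]
            if "." in host and not host.startswith("'"):  # drop 'self' etc.
                hosts.add(host)
    return sorted(hosts)

csp = "default-src 'self'; script-src 'self' https://cdn.example.com *.flipkart.com"
print(domains_from_csp(csp))  # ['*.flipkart.com', 'cdn.example.com']
```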
Sender Policy Framework (SPF)
- DNS record used to indicate to receiving mail exchanges which hosts are authorized to send mail for a given domain
- => Lists all hosts authorized to send email on behalf of a domain
- => Exposes subdomains & IP ranges
dig +short TXT icann.org | grep spf
- Extract net blocks/domains from SPF record with assets-from-spf
python assets_from_spf.py reddit.com
python assets_from_spf.py reddit.com --asn | jq .
=> --asn returns ASN info for all the assets
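The core of the technique is just splitting the SPF mechanisms into net blocks and included domains. A simplified sketch of the assets-from-spf idea (the real tool also handles a/mx mechanisms, redirects, etc.):

```python
def assets_from_spf(spf_record: str):
    """Split an SPF TXT record into net blocks (ip4:/ip6:) and
    included domains (include:)."""
    netblocks, domains = [], []
    for mech in spf_record.split():
        if mech.startswith(("ip4:", "ip6:")):
            netblocks.append(mech.split(":", 1)[1])
        elif mech.startswith("include:"):
            domains.append(mech.split(":", 1)[1])
    return netblocks, domains

# Record shape as returned by: dig +short TXT icann.org | grep spf
record = "v=spf1 ip4:192.0.43.0/24 include:_spf.google.com ~all"
print(assets_from_spf(record))  # (['192.0.43.0/24'], ['_spf.google.com'])
```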
DNSSEC
- DNS zone transfer (misconfiguration rarely found today)
dig AXFR @ns1.insecuredns.com totallylegit.in
- Authenticated Denial of Existence (RFC 7129)
- In DNS, when clients query for a non-existent domain, the server must deny its existence. This is harder to do in DNSSEC due to cryptographic signing. It can be done using NSEC or NSEC3 records
- => DNSSEC zone walking is the new zone transfer
- If DNSSEC is enabled on your target & NSEC records are used you’ll get all the domains:
- Install ldnsutils on Kali/Debian/Ubuntu:
sudo apt-get install ldnsutils
- Try zone walking NSEC with ldns-walk (part of ldnsutils), which can zone walk a DNSSEC-signed zone that uses NSEC:
ldns-walk iana.org
ldns-walk @ns1.insecuredns.com totallylegit.in
- NSEC3 records are like NSEC records but provide a signed gap of hashes of domain names, to prevent zone enumeration or make it expensive
- An attacker can still collect all the subdomain hashes & crack them offline using Nsec3walker & nsec3map
- Example of zone walking NSEC3 protected zone :
# Detect if DNSSEC NSEC or NSEC3 is used
ldns-walk icann.org
# Collect NSEC3 hashes of a domain
# Collect NSEC3 hashes of a domain
./collect insecuredns.com > insecuredns.com.collect
# Undo the hashing, expose the subdomain information
./unhash < insecuredns.com.collect > insecuredns.com.unhash
# Check the number of successfully cracked subdomain hashes
cat insecuredns.com.unhash | grep "insecuredns" | wc -l
# List only the subdomain part from the unhashed data
cat insecuredns.com.unhash | grep "insecuredns" | awk '{print $2;}'
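Offline cracking works because the NSEC3 hash function is public (RFC 5155): hash candidate names with the zone's salt and iteration count and compare against the collected hashes. A minimal Python version of that hash (salt and iteration values below are arbitrary examples):

```python
import base64
import hashlib

def nsec3_hash(name: str, salt_hex: str, iterations: int) -> str:
    """NSEC3 hash per RFC 5155: iterated, salted SHA-1 over the
    lowercased wire-format owner name, base32hex-encoded."""
    wire = b""
    for label in name.rstrip(".").lower().split("."):
        if label:
            wire += bytes([len(label)]) + label.encode()
    wire += b"\x00"  # root label terminates the wire-format name
    salt = bytes.fromhex(salt_hex)
    digest = hashlib.sha1(wire + salt).digest()
    for _ in range(iterations):  # `iterations` additional hash rounds
        digest = hashlib.sha1(digest + salt).digest()
    std = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"     # RFC 4648 base32
    hexalph = b"0123456789abcdefghijklmnopqrstuv"  # base32hex, lowercase
    return base64.b32encode(digest).translate(bytes.maketrans(std, hexalph)).decode()

# Cracking = hashing a candidate name and comparing with collected hashes
print(nsec3_hash("example.insecuredns.com", "aabbccdd", 12))
```

Tools like nsec3walker and nsec3map do exactly this at scale; RFC 5155 Appendix A has official test vectors to validate an implementation against.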
Few things that changed with the advent of APIs/Devops
- Storage
- Authentication
- API keys & token based authentication instead of username & password
- More & more code
- CI/CD pipelines
Cloud storage
- Cloud storage has become popular (inexpensive & easy to setup), especially object/block storage
- Object storage is ideal for storing static, unstructured data like audio, video, documents, images, logs & large amounts of text
- Popular object storage services:
- AWS S3 buckets
- Digital Ocean Spaces
What’s the catch with object storage?
- It’s a treasure trove of information
- Users store anything on 3rd-party services, including passwords in plain text files, log files, backup files…
Amazon S3 buckets
- AWS S3 is an object storage service by Amazon
- Buckets allow users to store & serve large amounts of data
- System Shock: How A Cloud Leak Exposed Accenture’s Business
Hunting for publicly accessible S3 buckets
- Users can store Files (Objects) in a Bucket
- Each Bucket will get a unique, predictable URL
- Just go to the Bucket’s URL to find out if it’s a public or private Bucket
- Each file in a Bucket will get a unique URL
- There are Access Control Mechanisms available at both Bucket & Object level
- Finding buckets
- Google Dorks
site:s3.amazonaws.com filetype:pdf
site:s3.amazonaws.com password
- Do a dictionary based attack (since Buckets have a predictable URL)
- AWSBucketDump
- Slurp
./slurp-linux-amd64 keyword -t paytm
./slurp-linux-amd64 certstream
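The dictionary attack relies on the predictable URL pattern. A toy generator in the spirit of Slurp's keyword permutations (the affix list here is an illustrative sample, not any tool's real wordlist):

```python
def bucket_candidates(keyword: str):
    """Generate candidate S3 bucket URLs from a keyword by permuting
    common affixes, the way dictionary tools like Slurp do."""
    affixes = ["backup", "dev", "prod", "staging", "assets", "logs"]
    names = {keyword}
    for a in affixes:
        names.update({f"{keyword}-{a}", f"{a}-{keyword}", f"{keyword}.{a}"})
    # Virtual-hosted-style URL pattern: https://<bucket>.s3.amazonaws.com
    return [f"https://{n}.s3.amazonaws.com" for n in sorted(names)]

for url in bucket_candidates("paytm")[:4]:
    print(url)
```

Requesting each URL then tells you the bucket's state: a public bucket returns an XML file listing, a private one returns an AccessDenied error.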
Digital Ocean Spaces
- Spaces is an object storage service by DigitalOcean, similar to AWS S3 buckets
- Spaces API aims to be interoperable with Amazon’s AWS S3 API
- => All tools & attacks used to test AWS S3 can be used against Spaces!
- Spaces URL pattern
- Users can store Files in a “Space”
- Each Space will get a unique, predictable URL
- Each file in a Space will get a unique URL
- Access Control Mechanisms are available at Space & file level
Hunting for publicly accessible Spaces
- A Space is typically considered:
- “public” if any user can list the contents of the Space
- “private” if the Space’s contents can only be listed or written by certain users (when accessed, it returns AccessDenied errors)
Spaces finder
- Spaces finder is a tool to look for publicly accessible Digital Ocean Spaces using a wordlist, list all the accessible files on a public Space & download the files
- It’s AWSBucketDump tweaked to work with DO Spaces, since Spaces API is interoperable with Amazon’s S3 API
python3 spaces_finder.py -l sample_spaces.txt -g interesting_keywords.txt -D -m 500000 -t 2
Authentication
- With almost every service exposing an API, keys have become critical in authenticating
- API keys are treated as keys to the kingdom
- For apps, API keys tend to be the Achilles heel: APIs are 2FA’s Achilles Heel
Code repos for recon
- Code repos (Github, GitLab, Bitbucket…) are a treasure trove during recon
- They can reveal a lot: credentials, potential vulnerabilities, infrastructure details
Github for recon
- Github has a powerful search feature with advanced operators
- It has a well designed REST API that can be used to automate your Github recon
- Github for Bug Bounty Hunters
Things to focus on in Github
- Repositories
- Code
- Commits (Bharath’s favorite!)
- Issues
- Examples:
- “Delete the private ssh” commit message
- “Multiple XSS vulnerabilities” issue
Mass cloning on Github
- Clone all the target organization’s repos & analyze them locally
- Use GithubCloner
python githubcloner.py --org organization -o /tmp/output
Static code analysis
Manual search
- Try to understand the code, language used & architecture
- Look for keywords & patterns:
- API and key (To get more endpoints & find API keys)
- token
- secret
- vulnerable
- http://
- Tools for finding secrets
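The keyword hunt can be automated with a few regexes. A toy scanner over a blob of code (the AWS access-key-ID prefix format is documented AWS behavior; the other patterns are loose heuristics, nothing like a real tool's entropy checks):

```python
import re

# Illustrative secret patterns
PATTERNS = {
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "generic_secret": re.compile(
        r"(?i)(api[_-]?key|secret|token)\s*[:=]\s*['\"][^'\"]{8,}['\"]"),
    "plain_http_url": re.compile(r"http://[^\s'\"]+"),
}

def scan_text(text: str):
    """Return (pattern_name, match) pairs for every hit in the text."""
    hits = []
    for name, rx in PATTERNS.items():
        hits += [(name, m.group(0)) for m in rx.finditer(text)]
    return hits

sample = 'aws_key = "AKIAIOSFODNN7EXAMPLE"\napi_key = "deadbeefcafe1234"'
for hit in scan_text(sample):
    print(hit)
```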
Github Dorks
- The new Google dorks
- Github search is powerful & can be used to find sensitive data on repos
- Collection of Github dorks
- Github-dorks: Tool to run Github dorks against a repo
Passive recon using public datasets
- Various projects gather Internet wide scan data & make it public, including: port scans, DNS data, SSL/TLS cert data, data breach dumps…
- Find your needle in the haystack
Why use public data sets for recon?
- To reduce dependency on 3rd party APIs & services (be paranoid!)
- To reduce active probing of target infrastructure
- The more sources, the better the coverage (better understanding of the organization’s history & how its infrastructure evolved)
- Build your own recon platforms
Data sources
Name | Description | Price |
---|---|---|
Sonar | FDNS, RDNS, UDP, TCP, TLS, HTTP, HTTPS scan data | FREE |
Censys.io | TCP, TLS, HTTP, HTTPS scan data | FREE (non-commercial) |
CT | TLS certificates | FREE |
CZDS | DNS zone files for “new” global TLDs | FREE |
ARIN | American IP registry information (ASN, Org, Net, Poc) | FREE |
CAIDA PFX2AS IPv4 | Daily snapshots of ASN to IPv4 mappings | FREE |
CAIDA PFX2AS IPv6 | Daily snapshots of ASN to IPv6 mappings | FREE |
US Gov | US government domain names | FREE |
UK Gov | UK government domain names | FREE |
RIR Delegations | Regional IP allocations | FREE |
PremiumDrops | DNS zone files for com/net/info/org/biz/xxx/sk/us TLDs | $24.95/mo |
WhoisXMLAPI.com | New domain whois data | $24.95/mo |
Source: https://github.com/hdm/inetdata
Rapid7 Forward DNS dataset
- FDNS is a massive dataset, 20+ GB compressed & 300+GB uncompressed
- Rapid7 publishes it on scans.io
- It aims to discover all domains found on the Internet
- There is also RDNS dataset
Hunting subdomains in the FDNS dataset
- The data format is a gzip-compressed JSON file
- Use jq to extract subdomains of a specific domain:
curl --silent -L https://scans.io/data/rapid7/sonar.fdns_v2/20170417-fdns.json.gz | pigz -dc | head -n 10 | jq .
cat 20170417-fdns.json.gz | pigz -dc | grep "\.example\.com"
- https://sonar.labs.rapid7.com
- Analyzing the results:
# Extract subdomain names for a given domain from FDNS data
cat 20170417-fdns.json.gz | pigz -dc | grep "\.example\.com" | jq .name > example.com.domains.fdns
# Display first 15 subdomains from all the unique subdomains gathered
cat example.com.domains.fdns | grep "\.example\.com" | uniq | head -n 15
# Total number of unique subdomains enumerated
cat example.com.domains.fdns | grep "\.example\.com" | uniq | wc -l
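The grep/jq pipeline above can also be written as a streaming Python filter. The record fields mirror the FDNS JSON shape shown by the `jq .` command; the sample data here is synthetic:

```python
import gzip
import io
import json

def subdomains_from_fdns(fileobj, domain: str):
    """Stream a gzipped FDNS dump (one JSON record per line with a
    'name' field) and yield unique names under the given domain."""
    seen = set()
    suffix = "." + domain
    for line in gzip.open(fileobj, "rt"):
        name = json.loads(line).get("name", "")
        if (name == domain or name.endswith(suffix)) and name not in seen:
            seen.add(name)
            yield name

# Synthetic sample in the FDNS record shape
records = [{"timestamp": "1492428000", "name": n, "type": "a", "value": "192.0.2.1"}
           for n in ["www.example.com", "mail.example.com", "other.net"]]
blob = gzip.compress("\n".join(json.dumps(r) for r in records).encode())
print(list(subdomains_from_fdns(io.BytesIO(blob), "example.com")))
# ['www.example.com', 'mail.example.com']
```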
- These methods are slow. See HD Moore’s talk on how to do faster lookups
Takeaways
- Subdomain enumeration cheat sheet
- The Art of Subdomain enumeration Gitbook
- Comparison on the number of unique, resolvable subdomains of icann.org found by each tool/technique:
Tool | Number of subdomains | What the tool finds |
---|---|---|
Sublist3r | 278 | Apps indexed by search engines |
CT logs (crt.sh) | 46 | Domains that have SSL/TLS certs in CT logs |
Zone walking NSEC3 | 182 | Depends on your computation power. Can’t find very unconventionally named domains |
FDNS dataset | 3681 | Doesn’t have these restrictions (app being hosted or having an SSL/TLS cert…) |
References
- https://www.certificate-transparency.org/
- https://www.cloudflare.com/dns/dnssec/how-dnssec-works/
- https://www.cloudflare.com/dns/dnssec/dnssec-complexities-and-considerations/
- https://info.menandmice.com/blog/bid/73645/Take-your-DNSSEC-with-a-grain-of-salt
- https://opendata.rapid7.com/sonar.fdns_v2/
See you next time!