Conference notes: Practical recon techniques for bug hunters & pen testers (LevelUp 0x02 / 2018)

Hi, these are the notes I took while watching the “Practical recon techniques for bug hunters & pen testers” talk given by Bharath Kumar at LevelUp 0x02 (2018).



This talk is about some practical recon techniques for bug hunters & pentesters. It’s a continuation of Bharath’s talk about niche subdomain enumeration techniques.

Demo environment

  • Nameservers:
  • Domains:
  • Bharath authorizes running only the DNS & DNSSEC attacks mentioned later against them

What is recon?

Reconnaissance is the act of gathering preliminary data or intelligence on your target. The data is gathered in order to better plan for your attack. Reconnaissance can be performed actively or passively.

What to look for during recon?

  • Info to increase attack surface (domains, net blocks)
  • Credentials (email, passwords, API keys)
  • Sensitive information
  • Infrastructure details (technologies used)

Enumerating domains

  • Goal: find/correlate all domain names owned by a single entity
  • Types of domain correlation:


Subdomain enumeration

  • Use advanced search operators of Google & Bing:
    • site: for vertical correlation
      • Automated with tools like sublist3r
    • ip: for horizontal correlation
      • Useful if the app is on shared hosting

Using 3rd party information aggregators


  • VirusTotal runs its own passive DNS replication service, built by storing DNS resolutions performed when visiting URLs submitted by users
  • Can be queried using API:
  • Tool to automate it:
    • python 20
  • Tip: Use shell functions to quickly perform some recon tasks (add them to ~/.bashrc)
    find-subdomains-vt() {
      # NOTE: the VirusTotal endpoint below is assumed (the URL was lost in the notes)
      curl -s "https://www.virustotal.com/ui/domains/$1/subdomains?limit=$2" | jq '.data[].id'
    }

# Usage (example.com is a placeholder domain):
find-subdomains-vt example.com 20

  • Handy service for recon (DNS, WHOIS, reverse WHOIS…)
  • Use email addresses found for horizontal domain correlation

Certificate Transparency

Certificate Transparency - Side effect

  • CT logs by design contain all the certificates issued by a CA for any given domain
  • They allow attackers to passively gather a lot of information about an organization’s infrastructure: domains including internal domains, subdomains, email addresses
  • Certificate Transparency Part 3 - The dark side

Searching through CT logs

Downside of CT for recon

  • CT logs are append-only => There is no way to delete existing entries
  • => Many false positives: domain names found in CT logs may no longer exist & thus can’t be resolved to an IP address
  • Not all subdomains found will be accessible; some are internal domains, but those are still useful for internal pentests
  • A penetration tester’s guide to subdomain enumeration

Solution: CT logs + Massdns

  • Use Massdns with CT logs script to quickly identify resolvable domain names
  • python3 | ./bin/massdns -r resolvers.txt -t A -a -o -w results.txt -
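The same resolve-and-filter idea can be sketched in Python with only the standard library, for small candidate lists when Massdns isn’t at hand (this is an illustrative sketch, not the talk’s pipeline):

```python
import socket

def resolvable(name: str) -> bool:
    """Return True if the name currently resolves to at least one address."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False

def filter_resolvable(candidates):
    """Drop CT-log entries that no longer resolve (stale or internal names)."""
    return [n for n in candidates if resolvable(n)]
```

For thousands of names this is far too slow; that is exactly the gap Massdns fills with its bulk, multi-resolver lookups.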

Using Certspotter

  • Does vertical & horizontal domain correlation
    find-cert() {
      # NOTE: the CertSpotter API endpoint is assumed; the URL was lost in the notes
      curl -s "https://certspotter.com/api/v0/certs?domain=$1" | jq -c '.[].dns_names' | grep -o '"[^"]\+"'
    }


  • gets data from CT logs only, where “legit” CAs submit the certs to a log
  • not only gets certificates from CT logs, but also scans the IPv4 address space & domains, and “finds & analyzes” all the certificates
  • curl -L -sd "api_key=API-KEY&q=Organization:\"tesla\"&response_type=0"

Finding vulnerable CMS using CT

Content Security Policy (CSP)

  • CSP HTTP headers allow devs to create a whitelist of sources of trusted content & instruct the browser to only execute or render resources from those sources
  • => It’s basically a list of domains!
  • Extract domains from CSP headers with domains-from-csp
    • python
    • python -r to resolve the domains
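The extraction itself is simple string processing. A minimal sketch (not the domains-from-csp tool itself) that pulls host names out of a CSP header value:

```python
import re

def domains_from_csp(csp: str) -> list:
    """Extract unique host names from a Content-Security-Policy header value."""
    domains = set()
    for directive in csp.split(";"):
        tokens = directive.split()
        for token in tokens[1:]:  # tokens[0] is the directive name (e.g. script-src)
            token = token.strip().lower()
            token = re.sub(r"^[a-z+.-]+://", "", token)  # drop the scheme
            token = token.split("/")[0].split(":")[0]    # drop path & port
            # keep host-like tokens, skip keywords such as 'self'/'unsafe-inline'
            if "." in token and not token.startswith("'"):
                domains.add(token)
    return sorted(domains)
```

Wildcard sources like `*.googleapis.com` come through as-is, which is useful: they hint at whole subdomain families.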

Sender Policy Framework (SPF)

  • DNS record used to indicate to receiving mail exchanges which hosts are authorized to send mail for a given domain
  • => Lists all hosts authorized to send email on behalf of a domain
  • => Exposes subdomains & IP ranges
  • dig +short TXT | grep spf
  • Extract net blocks/domains from SPF record with assets-from-spf
    • python
    • python --asn | jq . => returns ASN info of all the assets
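The parsing that assets-from-spf performs can be sketched roughly as follows (a simplified version, not the actual tool; it only handles the common ip4/ip6/include/a/mx/exists mechanisms):

```python
def assets_from_spf(spf: str):
    """Split an SPF TXT record into netblocks and host names.

    Simplified sketch: real SPF also allows macros, redirects, and
    mechanisms without an argument (bare 'a'/'mx').
    """
    netblocks, hosts = [], []
    for term in spf.split():
        term = term.lstrip("+-~?")  # drop the optional qualifier prefix
        if term.startswith(("ip4:", "ip6:")):
            netblocks.append(term.split(":", 1)[1])
        elif term.startswith(("include:", "a:", "mx:", "exists:")):
            hosts.append(term.split(":", 1)[1])
    return netblocks, hosts
```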


  • DNS zone transfer (misconfiguration rarely found today)
    • dig AXFR
  • Authenticated Denial of Existence (RFC 7129)
    • In DNS, when a client queries a non-existent domain, the server must deny its existence. This is harder to do in DNSSEC due to cryptographic signing; it is done using NSEC or NSEC3 records
  • => DNSSEC zone walking is the new zone transfer
  • If DNSSEC is enabled on your target & NSEC records are used you’ll get all the domains:
    • Install Ldnsutils on Kali/Debian/Ubuntu: sudo apt-get install ldnsutils
    • Try zone walking NSEC - LDNS
      • ldns-walk (part of ldnsutils) can be used to zone walk DNSSEC signed zone that uses NSEC
      • ldns-walk
      • ldns-walk
  • NSEC3 records are like NSEC records but provide a signed gap of hashes of domain names, to prevent zone enumeration or make it expensive
    • An attacker can still collect all the subdomain hashes & crack them offline using Nsec3walker & nsec3map
    • Example of zone walking an NSEC3-protected zone:
# Detect if DNSSEC NSEC or NSEC3 is used

# Collect NSEC3 hashes of a domain
$ ./collect >

# Undo the hashing, expose the subdomain information
$ ./unhash >

# Check the number of successfully cracked subdomain hashes
$ cat | grep "icann" | wc -l

# List only the subdomain part from the unhashed data
$ cat | grep "icann" | awk '{print $2;}'
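Offline cracking works because an NSEC3 hash is just an iterated, salted SHA-1 of the domain name, and both the salt and the iteration count are published in the zone itself. A minimal sketch of the hashing step per RFC 5155:

```python
import base64
import hashlib

_STD = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
_HEX = "0123456789ABCDEFGHIJKLMNOPQRSTUV"  # base32hex alphabet used by NSEC3

def nsec3_hash(name: str, salt_hex: str, iterations: int) -> str:
    """RFC 5155 NSEC3 hash (hash algorithm 1, SHA-1) of a domain name."""
    # canonical wire format: lowercase labels, each prefixed by its length
    wire = b""
    for label in name.rstrip(".").lower().split("."):
        wire += bytes([len(label)]) + label.encode("ascii")
    wire += b"\x00"  # root label terminator
    salt = bytes.fromhex(salt_hex)
    digest = hashlib.sha1(wire + salt).digest()
    for _ in range(iterations):  # re-hash 'iterations' more times
        digest = hashlib.sha1(digest + salt).digest()
    b32 = base64.b32encode(digest).decode("ascii")
    return b32.translate(str.maketrans(_STD, _HEX)).lower()
```

Given candidate names from a wordlist, an attacker simply computes this hash for each and compares against the hashes collected by walking the NSEC3 chain.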

A few things that changed with the advent of APIs/DevOps

  • Storage
  • Authentication
    • API keys & token based authentication instead of username & password
  • More & more code
  • CI/CD pipelines

Cloud storage

  • Cloud storage has become popular (inexpensive & easy to set up), especially object/block storage
  • Object storage is ideal for storing static, unstructured data like audio, video, documents, images, logs & large amounts of text
  • Popular object storage services:
    • AWS S3 buckets
    • Digital Ocean Spaces

What’s the catch with object storage?

  • It’s a treasure trove of information
  • Users store anything on 3rd-party services, including passwords in plain text files, log files, backup files…

Amazon S3 buckets

Hunting for publicly accessible S3 buckets

  • Users can store Files (Objects) in a Bucket
  • Each Bucket will get a unique, predictable URL
    • Just go to the Bucket’s URL to find out if it’s a public or private Bucket
  • Each file in a Bucket will get a unique URL
  • There are Access Control Mechanisms available at both Bucket & Object level
  • Finding buckets
    • Google Dorks
      • filetype:pdf
      • password
    • Do a dictionary based attack (since Buckets have a predictable URL)
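Because bucket URLs are predictable, a dictionary attack is just name generation plus an HTTP check. A rough sketch (the name patterns are common conventions, not an official list):

```python
import urllib.request

def candidate_buckets(company: str, words) -> list:
    """Generate bucket-name candidates from a company name and a wordlist."""
    names = [company]
    for w in words:
        names += [f"{company}-{w}", f"{company}.{w}", f"{w}-{company}", f"{company}{w}"]
    return names

def bucket_url(name: str) -> str:
    """Each bucket gets a predictable virtual-hosted-style URL of this form."""
    return f"https://{name}.s3.amazonaws.com/"

def is_listable(name: str) -> bool:
    """True if the bucket publicly lists its contents (HTTP 200 on the URL).

    A 403 means the bucket exists but is private (AccessDenied);
    a 404 means no such bucket.
    """
    try:
        with urllib.request.urlopen(bucket_url(name), timeout=5) as resp:
            return resp.status == 200
    except Exception:
        return False
```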

Digital Ocean Spaces

  • Spaces is an object storage service by DigitalOcean, similar to AWS S3 buckets
  • Spaces API aims to be interoperable with Amazon’s AWS S3 API
    • => All tools & attacks used to test AWS S3 can be used against Spaces!
  • Spaces URL pattern: https://<space-name>.<region>.digitaloceanspaces.com
    • Users can store Files in a “Space”
    • Each Space will get a unique, predictable URL
    • Each file in a Space will get a unique URL
    • Access Control Mechanisms are available at Space & file level

Hunting for publicly accessible Spaces

  • A Space is typically considered:
    • “public” if any user can list the contents of the Space
    • “private” if the Space’s contents can only be listed or written by certain users (when accessed, it returns AccessDenied errors)
Spaces finder
  • Spaces finder is a tool to look for publicly accessible Digital Ocean Spaces using a wordlist, list all the accessible files on a public Space & download the files
  • It’s AWSBucketDump tweaked to work with DO Spaces, since Spaces API is interoperable with Amazon’s S3 API
  • python3 -l sample_spaces.txt -g interesting_keywords.txt -D -m 500000 -t 2


  • With almost every service exposing an API, keys have become critical for authentication
  • API keys are treated as keys to the kingdom
  • For apps, API keys tend to be the Achilles heel: APIs are 2FA’s Achilles Heel

Code repos for recon

  • Code repos (Github, GitLab, Bitbucket…) are a treasure trove during recon
  • They can reveal a lot: credentials, potential vulnerabilities, infrastructure details

Github for recon

  • Github has a powerful search feature with advanced operators
  • It has a well designed REST API that can be used to automate your Github recon
  • Github for Bug Bounty Hunters
Things to focus on in Github
  • Repositories
  • Code
  • Commits (Bharath’s favorite!)
  • Issues

  • Examples:
    • “Delete the private ssh” commit message
    • “Multiple XSS vulnerabilities” issue
Mass cloning on Github
  • Clone all the target organization’s repos & analyze them locally
  • Use GithubCloner
    • python --org organization -o /tmp/output
Static code analysis
  • Try to understand the code, language used & architecture
  • Look for keywords & patterns:
    • API and key (To get more endpoints & find API keys)
    • token
    • secret
    • vulnerable
    • http://
  • Tools for finding secrets
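A toy scanner for the keywords above might look like this (the regexes are illustrative; dedicated secret-scanning tools use entropy checks and much larger pattern sets):

```python
import re
from pathlib import Path

# illustrative patterns for the keywords listed above
PATTERNS = {
    "api_key":  re.compile(r"api[_-]?key\s*[:=]\s*['\"]?[\w-]{8,}", re.I),
    "token":    re.compile(r"\btoken\s*[:=]\s*['\"]?[\w.-]{8,}", re.I),
    "secret":   re.compile(r"\bsecret\s*[:=]\s*['\"]?\S{8,}", re.I),
    "http_url": re.compile(r"http://\S+"),  # plain-HTTP endpoints
}

def scan_tree(root: str):
    """Yield (path, pattern_name, line_number) for every match under root."""
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        for lineno, line in enumerate(text.splitlines(), 1):
            for name, pattern in PATTERNS.items():
                if pattern.search(line):
                    yield str(path), name, lineno
```

Point it at the directory of repos produced by the mass-cloning step and triage the hits by hand.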
Github Dorks

Passive recon using public datasets

  • Various projects gather Internet wide scan data & make it public, including: port scans, DNS data, SSL/TLS cert data, data breach dumps
  • Find your needle in the haystack

Why use public data sets for recon?

  • To reduce dependency on 3rd party APIs & services (be paranoid!)
  • To reduce active probing of target infrastructure
  • The more sources, the better the coverage (better understanding of the history of the organization & how its infrastructure evolved)
  • Build your own recon platforms

Data sources

Name                Description                                              Price
Sonar               FDNS, RDNS, UDP, TCP, TLS, HTTP, HTTPS scan data         FREE
(name missing)      TCP, TLS, HTTP, HTTPS scan data                          FREE (non-commercial)
(name missing)      TLS                                                      FREE
CZDS                DNS zone files for “new” global TLDs                     FREE
ARIN                American IP registry information (ASN, Org, Net, Poc)    FREE
CAIDA PFX2AS IPv4   Daily snapshots of ASN to IPv4 mappings                  FREE
CAIDA PFX2AS IPv6   Daily snapshots of ASN to IPv6 mappings                  FREE
US Gov              US government domain names                               FREE
UK Gov              UK government domain names                               FREE
RIR Delegations     Regional IP allocations                                  FREE
PremiumDrops        DNS zone files for com/net/info/org/biz/xxx/sk/us TLDs   $24.95/mo
(name missing)      New domain whois data                                    $24.95/mo


Rapid7 Forward DNS dataset
  • FDNS is a massive dataset, 20+ GB compressed & 300+ GB uncompressed
  • Rapid7 publishes it on
  • It aims to discover all domains found on the Internet
  • There is also RDNS dataset
Hunting subdomains in the FDNS dataset
  • The data format is a gzip-compressed JSON file
  • Use jq to extract subdomains of a specific domain:
    • curl --silent -L | pigz -dc | head -n 10 | jq .
    • cat 20170417-fdns.json.gz | pigz -dc | grep "\.example\.com"
  • Analyzing the results:
# Extract subdomain names for a given domain from FDNS data
cat 20170417-fdns.json.gz | pigz -dc | grep "\.example\.com" | jq .name >

# Display first 15 subdomains from all the unique subdomains gathered
cat | grep "\.example\.com" | uniq | head -n 15

# Total number of unique subdomains enumerated
cat | grep "\.example\.com" | uniq | wc -l
  • These methods are slow. See HD Moore’s talk on how to do faster lookups
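The grep/jq pipeline above can also be done in a few lines of Python, streaming the gzipped dump line by line (each line is one JSON record with name, type, and value fields):

```python
import gzip
import json

def subdomains_from_fdns(path: str, domain: str):
    """Stream a gzipped FDNS dump (one JSON record per line) and yield
    unique names that fall under the target domain."""
    suffix = "." + domain.lstrip(".")
    seen = set()
    with gzip.open(path, "rt") as fh:
        for line in fh:
            try:
                record = json.loads(line)
            except ValueError:
                continue  # skip malformed lines
            name = record.get("name", "")
            if name.endswith(suffix) and name not in seen:
                seen.add(name)
                yield name
```

Streaming keeps memory flat even on the 300+ GB uncompressed dump, but it is still a full sequential scan; for repeated lookups, index the data first (see HD Moore’s talk).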


Comparison of subdomain enumeration techniques

Tool                 Subdomains found   What the tool finds
Sublist3r            278                Apps indexed by search engines
CT logs              46                 Domains that have SSL/TLS certs in CT logs
Zone walking NSEC3   182                Depends on your computation power; can’t find very unconventionally named domains
FDNS dataset         3681               Doesn’t have these restrictions (app being hosted, having an SSL/TLS cert…)


See you next time!