
Penetration Testing : NMAP.XML to TAB

Posted on by The Shell Shakespear in Main | 1 Comment

Following up on my last NMAP post: processing port scan data in a meaningful manner is essential to network penetration testing. For those who wish to skip the SQL stage and get quick results, the following one-liner uses xmlstarlet to parse an NMAP XML file:

cat nmap.xml | xmlstarlet sel -T -t -m "//state[@state='open']" -m ../../.. -v address/@addr -m hostnames/hostname -i @name -o ' (' -v @name -o ')' -b -b -b -o "	" -m .. -v @portid -o '/' -v @protocol -o "	" -m service -v @name -i "@tunnel='ssl'" -o 's' -b -o "	" -v @product -o ' ' -v @version -v @extrainfo -b -n -

The output is in the following tab-delimited format:

IP (HOST) \t PORT/PROTOCOL \t SERVICE \t EXTRAINFO
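
For readers who prefer a script to a one-liner, here is a minimal Python 3 sketch of the same extraction using only the standard library. The sample XML is a hypothetical, trimmed-down stand-in for real NMAP output, and `open_ports` is an illustrative name rather than part of any tool mentioned here. Unlike the xmlstarlet command, it joins product, version and extrainfo with single spaces.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down sample of NMAP XML output, for illustration only.
SAMPLE = """<nmaprun>
  <host>
    <address addr="192.0.2.10" addrtype="ipv4"/>
    <hostnames><hostname name="www.example.com"/></hostnames>
    <ports>
      <port protocol="tcp" portid="443">
        <state state="open"/>
        <service name="http" tunnel="ssl" product="nginx" version="1.0"/>
      </port>
      <port protocol="tcp" portid="25">
        <state state="closed"/>
        <service name="smtp"/>
      </port>
    </ports>
  </host>
</nmaprun>"""


def open_ports(xml_text):
    """Return one tab-delimited line per open port."""
    rows = []
    root = ET.fromstring(xml_text)
    for host in root.iter("host"):
        addr = host.find("address").get("addr")
        names = [h.get("name") for h in host.iter("hostname") if h.get("name")]
        label = addr + "".join(" (%s)" % n for n in names)
        for port in host.iter("port"):
            state = port.find("state")
            if state is None or state.get("state") != "open":
                continue  # keep open ports only
            svc = port.find("service")
            name = info = ""
            if svc is not None:
                name = svc.get("name", "")
                if svc.get("tunnel") == "ssl":
                    name += "s"  # e.g. http over SSL is shown as https
                details = (svc.get(k) for k in ("product", "version", "extrainfo"))
                info = " ".join(d for d in details if d)
            portspec = "%s/%s" % (port.get("portid"), port.get("protocol"))
            rows.append("\t".join([label, portspec, name, info]))
    return rows


for line in open_ports(SAMPLE):
    print(line)
```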

The xmlstarlet command lists the ports of each host in the right order, but it does not order the hosts themselves. To sort by host as well, pipe its output to the following:

sed 's_^\([^\t ]*\)\( ([^)]*)\)\?\t\([^\t ]*\)_\1.\3\2_' | sort -n -t. -k1,1 -k2,2 -k3,3 -k4,4 -k5,5 | sed 's_^\(\([0-9]\{1,3\}\.\)\{3\}[0-9]\{1,3\}\)\.\([^ \t]*\)\( ([^)]*)\)\?_\1\4\t\3_'

This command converts lines that look like IP (HOST) \t PORT to IP.PORT (HOST), sorts them numerically on each dot-separated field, and then converts them back to IP (HOST) \t PORT. From there, it is simply a matter of grepping for your favorite service. For example, if you wanted to focus on web penetration testing, all you would have to do is pipe the above to:

grep -i -e http

This gives you a list of services relevant to your testing. Happy Hacking.
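
The sort-then-filter steps above can also be sketched in Python 3. This is an illustration under assumptions, not a replacement for the sed and sort pipeline: the sample lines are made up, and `sort_key` and `sorted_services` are hypothetical helper names. It orders lines by numeric IPv4 address and then by port, and optionally keeps only lines containing a search string.

```python
import ipaddress

# Hypothetical sample lines in the IP (HOST) \t PORT/PROTOCOL \t SERVICE format.
LINES = [
    "10.0.0.10 (b.example.com)\t80/tcp\thttp",
    "10.0.0.9\t443/tcp\thttps",
    "10.0.0.9\t22/tcp\tssh",
    "10.0.0.10 (b.example.com)\t8080/tcp\thttp-proxy",
]


def sort_key(line):
    """Order by numeric IPv4 address, then by numeric port."""
    host, port = line.split("\t")[:2]
    ip = host.split(" ", 1)[0]  # drop any " (HOST)" suffix
    return (int(ipaddress.IPv4Address(ip)), int(port.split("/")[0]))


def sorted_services(lines, needle=None):
    """Sort lines by IP and port; optionally keep lines containing needle."""
    ordered = sorted(lines, key=sort_key)
    if needle is not None:
        ordered = [l for l in ordered if needle.lower() in l.lower()]
    return ordered


for line in sorted_services(LINES, "http"):
    print(line)
```

Sorting on the integer form of the address avoids the string-ordering problem where 10.0.0.10 would sort before 10.0.0.9.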

Burp Suite Professional to XML: BURP2XML

Posted on by The Shell Shakespear in Main | 2 Comments

With the incorporation of Burp Suite Professional into our audit processes, we discovered that there was no easy way to extract results from Burp’s session file without manually re-running Burp. To automate this process, we have developed a standalone Python script that converts Burp’s session files into XML, and have released it under the GPLv3 license here (burp2xml.py):

#!/usr/bin/env python
#Developed by Paul Haas, <phaas AT redspin DOT com> under Redspin, Inc.
#Licensed under the GNU General Public License version 3.0 (2008-2009)
'''Process Burp Suite Professional's output into a well-formed XML document.

Burp Suite Pro's session file is zipped and contains a combination of XML-like
tags with leading binary headers that give type and length definitions, followed
by the actual data.  The theory is that this allows the file to be read
sequentially rather than requiring tedious XML parsing.  However, without
re-writing Burp's internal parser, we have no way to extract results from its
files without loading the results in Burp.

This tool takes a zipped Burp file and outputs an XML document based upon the
provided arguments, which allows regular XPath queries and XSL transformations.
'''
import datetime, string, re, struct, zipfile, sys

TAG = re.compile(r'</?(\w*)>',re.M) # Match an XML tag
nvprint = string.printable.replace('\x0b','').replace('\x0c','') # Printables

def milliseconds_to_date(milliseconds):
	'''Convert milliseconds since Epoch (from Java) to Python date structure:
	See: http://java.sun.com/j2se/1.4.2/docs/api/java/util/Date.html

	There is no constructor that takes milliseconds since the Epoch directly,
	so we first convert the milliseconds to seconds as a POSIX timestamp, which
	can be used to get a valid date, and then use the parsed values from that
	object, along with converting milli -> micro seconds, in a new date object.'''
	try:
		d = datetime.datetime.fromtimestamp(milliseconds/1000)
		date = datetime.datetime(d.year,d.month,d.day,d.hour,d.minute,d.second,
			(milliseconds%1000)*1000)
	except ValueError: # Bad date, just return the milliseconds
		date = str(milliseconds)
	return date	

def burp_binary_field(field,i):
	'''Strip Burp Suite's binary format characters types from our data.
	The first character after the leading tag describes the type of the data.'''
	if len(field) <= i:
		return None,-1
	elif field[i] == '\x00': # 4 byte integer value
		return str(struct.unpack('>I',field[i+1:i+5])[0]),5
	elif field[i] == '\x01': # Two possible unsigned long long types
		if field[i+1] == '\x00': # (64bit) 8 Byte Java Date
			ms = struct.unpack('>Q',field[i+1:i+9])[0]
			date = milliseconds_to_date(ms)
			# A bad date comes back as a string, which has no ctime() method
			value = date if isinstance(date, str) else date.ctime()
		else: # Serial Number only used occasionally in Burp
			value = str(struct.unpack('>Q',field[i+1:i+9])[0])
		return value,9
	elif field[i] == '\x02': # Boolean Object True/False
		return str(struct.unpack('?',field[i+1:i+2])[0]),2
	elif field[i] == '\x03' or field[i] == '\x04': # 4 byte length + string
		length = struct.unpack('>I',field[i+1:i+5])[0]
		#print "Saw string of length",length,"at",i+5,i+5+length
		value = field[i+5:i+5+length]
		if '<' in value or '>' in value or '&' in value: # Sanitize HTML w/CDATA
			# Split any ']]>' across two CDATA sections so the data survives
			value = '<![CDATA[' + value.replace(']]>',']]]]><![CDATA[>') + ']]>'
		value = ''.join(c for c in value if c in nvprint) # Remove nonprintables
		return value,5+length # ** TODO: Verify length by matching end tag **
	print "Unknown binary format",repr(field[i])
	return None,-1

def burp_to_xml(filename):
	'''Unzip Burp's file, remove non-printable characters, CDATA any HTML,
	include a valid XML header and trailer, and return a valid XML string.'''

	xml = '' # Our output string
	z = zipfile.ZipFile(filename) # Open Burp's zip file
	burp = z.read('burp') # Read-in the main burp file
	m = TAG.match(burp,0) # Match a tag at the start of the string
	while m:
		xml += m.group()
		index = m.end()
		etag = m.group().replace('<','</') # Matching tag

		m = TAG.match(burp,index) # Attempt to get the next tag
		if not m: # Data follows
			# Read the type of data using Burp's binary data headers
			value, length = burp_binary_field(burp, index)
			if value is None: break

			xml += value
			xml += etag
			index += length + len(etag) # Point our index to the next tag
			m = TAG.match(burp,index) # And retrieve it

	xml = '<?xml version="1.0"?><burp>' + xml + '</burp>' # XMLify our string
	return xml # And return it

def main():
	'''Called if script is run from the command line.'''
	if (len(sys.argv) < 2):
		print __doc__
		print "Usage:",sys.argv[0],"burp_session_file {output XML name}"
		sys.exit(1)
	xml = burp_to_xml(sys.argv[1])
	# Write out file to an optional argument or provided file + xml extension
	out = sys.argv[2] if (len(sys.argv) > 2) else sys.argv[1]+'.xml'
	out = open(out, 'wb')
	out.write(xml)
	out.close()
	#sys.stdout.write("# Output written to %s.xml" % out)

if __name__ == '__main__':
	main()
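
To make the binary layout handled by burp_binary_field easier to follow, here is a small Python 3 sketch of the same type-byte scheme (the script above is written for Python 2). It is an illustration only: `decode_field` and `java_millis_to_datetime` are hypothetical names, the sample buffer is built by hand and packs fields back to back without the surrounding tags, and the date is returned in UTC rather than in local time as the script does.

```python
import datetime
import struct


def decode_field(buf, i=0):
    """Decode one typed field starting at buf[i]; return (value, bytes_used)."""
    kind = buf[i]
    if kind == 0x00:  # 4-byte big-endian unsigned integer
        return struct.unpack(">I", buf[i + 1:i + 5])[0], 5
    if kind == 0x01:  # 8-byte big-endian unsigned integer (date or serial)
        return struct.unpack(">Q", buf[i + 1:i + 9])[0], 9
    if kind == 0x02:  # 1-byte boolean
        return struct.unpack("?", buf[i + 1:i + 2])[0], 2
    if kind in (0x03, 0x04):  # 4-byte length, then that many bytes of data
        length = struct.unpack(">I", buf[i + 1:i + 5])[0]
        return buf[i + 5:i + 5 + length], 5 + length
    raise ValueError("unknown type byte %r" % kind)


def java_millis_to_datetime(ms):
    """Convert milliseconds since the Unix epoch (Java style) to a UTC datetime."""
    epoch = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
    return epoch + datetime.timedelta(milliseconds=ms)


# Hand-built sample: an integer, a 64-bit value, a boolean and a string.
sample = (
    b"\x00" + struct.pack(">I", 8080)
    + b"\x01" + struct.pack(">Q", 1234567890123)
    + b"\x02" + struct.pack("?", True)
    + b"\x03" + struct.pack(">I", 5) + b"hello"
)

pos = 0
values = []
while pos < len(sample):
    value, used = decode_field(sample, pos)
    values.append(value)
    pos += used

print(values)
print(java_millis_to_datetime(values[1]))
```

Adding a timedelta to the epoch keeps the millisecond precision without the two-step construction described in the milliseconds_to_date docstring.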

My next post will include some useful commands for parsing XML on the command line. Let us know if you have any questions about running the script, or if you include it in your projects.
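
One detail of the script worth illustrating is the CDATA wrapping applied to values that contain markup. The Python 3 sketch below is a standalone illustration, with `cdata` as a hypothetical helper name and a made-up input string. It shows one common way to keep a literal ]]> inside the data: split it across two adjacent CDATA sections, then parse the result to check that the text survives unchanged.

```python
import xml.etree.ElementTree as ET


def cdata(text):
    """Wrap text in CDATA, splitting any ']]>' so the section cannot end early."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


# Made-up value containing markup and a literal ']]>' sequence.
raw = "<script>if (a[b[0]]>1 && c) {}</script>"
doc = "<response>" + cdata(raw) + "</response>"

parsed = ET.fromstring(doc).text
print(doc)
print(parsed == raw)
```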

NMAP Database Output : XML TO SQL

Posted on by The Shell Shakespear 4 Comments

SQL support has been a much-requested feature of NMAP in the Redspin office. While a number of tools exist to support NMAP SQL output, their database formats leave much to be desired. Using SQLite, Perl’s DB and the NMAP Parser module, our tool extracts all supported fields from an NMAP XML file and creates the following database format:

TABLE nmap (
    sid INTEGER PRIMARY KEY AUTOINCREMENT,
    version TEXT,
    xmlversion TEXT,
    args TEXT,
    types TEXT,
    starttime INTEGER,
    startstr TEXT,
    endtime INTEGER,
    endstr TEXT,
    numservices INTEGER) 

TABLE hosts (
    sid INTEGER,
    hid INTEGER PRIMARY KEY AUTOINCREMENT,
    ip4 TEXT,
    ip4num INTEGER,
    hostname TEXT,
    status TEXT,
    tcpcount INTEGER,
    udpcount INTEGER,
    mac TEXT,
    vendor TEXT,
    ip6 TEXT,
    distance INTEGER,
    uptime TEXT,
    upstr TEXT) 

TABLE sequencing (
    hid INTEGER,
    tcpclass TEXT,
    tcpindex TEXT,
    tcpvalues TEXT,
    ipclass TEXT,
    ipvalues TEXT,
    tcptclass TEXT,
    tcptvalues TEXT) 

TABLE ports (
    hid INTEGER,
    port INTEGER,
    type TEXT,
    state TEXT,
    name TEXT,
    tunnel TEXT,
    product TEXT,
    version TEXT,
    extra TEXT,
    confidence INTEGER,
    method TEXT,
    proto TEXT,
    owner TEXT,
    rpcnum TEXT,
    fingerprint TEXT) 

TABLE os (
    hid INTEGER,
    name TEXT,
    family TEXT,
    generation TEXT,
    type TEXT,
    vendor TEXT,
    accuracy INTEGER)

The resulting database can then be queried directly using SQLite to extract the relevant information. The tool also prints output in the following format, sorted by IP and PORT, as shown below:

$ nmap -A -T4 scanme.nmap.org -oX scanme >/dev/null
$ nmap_xml2sql.pl scanme | grep -v "^#"
64.13.134.52 (scanme.nmap.org)	53/tcp	domain
64.13.134.52 (scanme.nmap.org)	80/tcp	http
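
As a sketch of what querying such a database might look like, the Python 3 snippet below builds an in-memory SQLite database with a subset of the hosts and ports columns listed above and prints open web ports in the same tab-delimited layout. The rows are made up, `add_host` is a hypothetical helper, and two points are assumptions rather than documented behavior of the tool: that ip4num holds the integer form of the IPv4 address, and that the ports.type column holds tcp or udp.

```python
import ipaddress
import sqlite3

conn = sqlite3.connect(":memory:")
# Only a subset of the columns from the schema, to keep the sketch short.
conn.executescript("""
CREATE TABLE hosts (
    hid INTEGER PRIMARY KEY AUTOINCREMENT,
    ip4 TEXT, ip4num INTEGER, hostname TEXT);
CREATE TABLE ports (
    hid INTEGER, port INTEGER, type TEXT, state TEXT, name TEXT);
""")


def add_host(ip, hostname, ports):
    """Insert a host and its (port, type, state, name) rows; the data is made up."""
    cur = conn.execute(
        "INSERT INTO hosts (ip4, ip4num, hostname) VALUES (?, ?, ?)",
        (ip, int(ipaddress.IPv4Address(ip)), hostname))
    conn.executemany(
        "INSERT INTO ports (hid, port, type, state, name) VALUES (?, ?, ?, ?, ?)",
        [(cur.lastrowid,) + p for p in ports])


add_host("10.0.0.10", "b.example.com", [(80, "tcp", "open", "http")])
add_host("10.0.0.9", "a.example.com",
         [(443, "tcp", "open", "https"), (22, "tcp", "closed", "ssh")])

# Open HTTP-like services, ordered by numeric IP and then by port.
rows = conn.execute("""
    SELECT h.ip4, h.hostname, p.port, p.type, p.name
    FROM ports AS p JOIN hosts AS h ON h.hid = p.hid
    WHERE p.state = 'open' AND p.name LIKE '%http%'
    ORDER BY h.ip4num, p.port
""").fetchall()

for ip, host, port, proto, name in rows:
    print("%s (%s)\t%d/%s\t%s" % (ip, host, port, proto, name))
```

Ordering on the integer column is what makes 10.0.0.9 sort ahead of 10.0.0.10, which a plain text sort on ip4 would not do.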

Compare this database structure with other, similar XML2SQL approaches:

NMAP-SQL: http://sourceforge.net/projects/nmapsql/ (outdated; only supports NMAP 3.75 and MySQL):

table portstat, table runlist, table targets

NMAP-Parser (nmap2db.pl): http://search.cpan.org/dist/Nmap-Parser/ (MySQL and SQLite support):

table hosts(
  ip              VARCHAR(15) PRIMARY KEY NOT NULL,
  mac             VARCHAR(17),
  status          VARCHAR(7) DEFAULT 'down',
  hostname        TEXT,
  open_ports      TEXT,
  filtered_ports  TEXT,
  osname          TEXT,
  osfamily        TEXT,
  osgen           TEXT,
  last_scanned    TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  UNIQUE (ip))

PBNJ -x option: http://pbnj.sourceforge.net/ (MySQL and SQLite support):

table machines(
  mid             PRIMARY KEY AUTOINCREMENT,
  ip              TEXT,
  host            TEXT,
  localh          INTEGER,
  os              TEXT,
  machine_created TEXT,
  created_on      TEXT)

table services(
  mid             INTEGER,
  service         TEXT,
  state           TEXT,
  port            INTEGER,
  protocol        TEXT,
  version         TEXT,
  banner          TEXT,
  machine_updated TEXT,
  updated_on      TEXT)

The tool has been released under an NMAP-equivalent license as well as a Fyodor may-use-as-he-pleases license, and can be downloaded here: nmap_xml2sql.pl

Requirements: