HostIP for geolocation - Noah.org

HostIP for geolocation

From Noah.org

Jump to: navigation, search


The hostip.info geolocation database is free and accurate. You can use this to turn an IP address into a geographic coordinates. It also includes city, state, and country information.

When I downloaded the bz2 version it appeared to be gzipped multiple times. I don't know why that is, but if you have trouble opening it trying using the `file` command to see what format it is in. Keep gunzipping it until your get ASCII text. Here is what I had to do:

 $ bunzip2 hostip_current.sql.bz2
 $ mv hostip_current.sql hostip_current.sql.gz
 $ gunzip hostip_current.sql.gz
 $ mv hostip_current.sql hostip_current.sql.gz # yes, do it again
 $ gunzip hostip_current.sql.gz

I knew I was done when `file` gave me this results:

 $ file hostip_current.sql 
 hostip_current.sql: ASCII text, with very long line

Now, to actually use this data here is what I to did. Create a database for hostip and load the data:

 $ mysql -u root -p ROOT_PASSWORD -e "create database hostip;"
 $ mysql -u root -p ROOT_PASSWORD hostip < hostip_current.sql

That gave me a nice database. Each A block of an IP address is stored in a different table. It was pretty easy to write a Python script that would decode an IP address into {city, state, country, country_code lat, lon}. This is a short script that does the bare essentials as an example. I have included a longer script at the end which has more features.

#!/usr/bin/env python
# hostipgeo.py
# HostIP.info geolocation lookup.
# Run this script with an IP address as an argument and it will return
# the following information: {city, state, country, country_code lat, lon}.
# Noah Spurrier 2007
import MySQLdb
import os, sys, urllib
ip = sys.argv[1].split('.')
con = MySQLdb.Connect(host="127.0.0.1", port=3306, user="root", passwd="", db="hostip")
cursor = con.cursor()
table_a = "ip4_" + ip[0]
net_b = ip[1]
net_c = ip[2]
sql = """SELECT cityByCountry.name as city, cityByCountry.state as state,
    countries.name as country, countries.code as country_code,
    cityByCountry.lat as lat, cityByCountry.lng as lon
FROM %(table_a)s, countries, cityByCountry
WHERE %(table_a)s.city=cityByCountry.city AND
    %(table_a)s.country=cityByCountry.country AND
    %(table_a)s.country=countries.id AND
    b=%(net_b)s AND c=%(net_c)s;"""
sql = sql % locals()
if cursor.execute(sql) == 0:
    print "That IP address was not found in the database."
    sys.exit(1)
con.close()
results = list(cursor.fetchall()[0])
results[0] = urllib.unquote(results[0])
keys = ['city','state','county','country_code','lat','lon']
results_dict = dict(zip(keys, results))
for key in keys:
    print "%12s: %s" %(key, results_dict[key])

Running this script on the IP address 69.80.208.1 gives the following:

$ ./hostipgeo.py 69.80.208.1
        city: San Francisco, CA
       state: California
      county: UNITED STATES
country_code: US
         lat: 37.8133
         lon: -122.50


Full script

This version adds several options, error checking, and multiple output formats.

Click to download: hostipgeo.py

#!/usr/bin/env python
"""
SYNOPSIS
 
    hostipgeo.py IP_ADDRESS
 
DESCRIPTION
 
    HostIP.info geolocation lookup script
 
    Run this script with an IP address as an argument and it will return the
    following information:
 
        ip, city, state, country, country_code, lat, lon
 
    Download mysql data from http://www.hostip.info/. This datafile may work
    in another database with a little tweaking. Create a database for hostip
    and load the data:
 
        $ mysql -u root -p ROOT_PASSWORD -e "create database hostip;"
        $ mysql -u root -p ROOT_PASSWORD hostip < hostip_current.sql
 
EXAMPLES
 
    ./hostipgeo.py 69.80.212.15
    69.80.212.15,"San Francisco, CA",California,UNITED STATES,US,37.8133,-122.505
 
    ./hostipgeo.py -v 69.80.212.15
              ip: 69.80.212.15
            city: San Francisco, CA
           state: California
         country: UNITED STATES
    country_code: US
             lat: 37.8133
             lon: -122.505
    # TOTAL TIME IN SECONDS: 0.0152049064636
 
 
EXIT STATUS
 
    returns 0 if the IP address is found in the database.
    returns 1 if the IP address is not found.
 
AUTHOR
 
    Noah Spurrier <noah@noah.org>
 
LICENSE
 
    This script is in the public domain, free from copyrights or restrictions.
 
VERSION
 
    $Id: hostipgeo.py 118 2007-12-05 02:12:17Z noah $
"""
 
# TODO: Add database configuration from the command-line:
#    --host=HOSTNAME (default is 127.0.0.1, localhost)
#    --user=USERNAME (default is root)
#    --password=PASSWORD (default is blank)
#    --database=DB_NAME (default is hostip)
#    --port=PORT_NUM (default is 3306)
#    --dsn=DATASOURCENAME (alternate)
#
# TODO: Add batch file support of Apache log processing
#   if no argument is given for IP address then assume stdin.
#    --file=FILENAME : Process all lines in the given file. Pass - to process stdin.
 
import sys, os, traceback, optparse, urllib
import time
import MySQLdb
 
def main ():
 
    global options, args
 
    ipstr = args[0]
    ip = ipstr.split('.')
 
    # reject 10.0.0.0 - 10.255.255.255
    # reject 192.168.0.0 - 192.168.255.255
    # reject 172.16.0.0 - 172.31.255.255
    if ip[0]=='10' or (ip[0]=='192' and ip[1]=='168') or (ip[0]=='172' and (int(ip[1])>=16 and int(ip[1])<=31)):
        print "# ERROR: reserved LAN IP address."
        return 1
 
    con = MySQLdb.Connect(host="127.0.0.1", port=3306, user="root", passwd="", db="hostip") #, unix_socket='/var/lib/mysql/mysql.sock')
    cursor = con.cursor()
    table_a = "ip4_" + ip[0]
    net_b = ip[1]
    net_c = ip[2]
    sql = """SELECT cityByCountry.name as city, cityByCountry.state as state,
        countries.name as country, countries.code as country_code,
        cityByCountry.lat as lat, cityByCountry.lng as lon
    FROM %(table_a)s, countries, cityByCountry
    WHERE %(table_a)s.city=cityByCountry.city AND
        %(table_a)s.country=cityByCountry.country AND
        %(table_a)s.country=countries.id AND
        b=%(net_b)s AND c=%(net_c)s;"""
    sql = sql % locals()
    if cursor.execute(sql) == 0:
        con.close()
        print "# %s IP address was not found in the database." % ipstr
        return 1
    con.close()
    results = list(cursor.fetchall()[0])
    results[0] = urllib.unquote(results[0])
    results = [ipstr] + results
    keys = ['ip','city','state','country','country_code','lat','lon']
    results_dict = dict(zip(keys, results))
    if options.verbose:
        for key in keys:
            print "%12s: %s" %(key, results_dict[key])
    else:
        results_dict['city'] = '"' + results_dict['city'] + '"'
        print "%(ip)s,%(city)s,%(state)s,%(country)s,%(country_code)s,%(lat)s,%(lon)s" % results_dict
    return 0
 
if __name__ == '__main__':
    try:
        start_time = time.time()
        parser = optparse.OptionParser(formatter=optparse.TitledHelpFormatter(), usage=globals()['__doc__'], version='$Id: hostipgeo.py 118 2007-12-05 02:12:17Z noah $')
        parser.add_option ('-v', '--verbose', action='store_true', default=False, help='verbose output')
        (options, args) = parser.parse_args()
        if len(args) != 1:
            parser.error ('missing IP address argument')
        exit_val = main()
        if options.verbose: print '# TOTAL TIME IN SECONDS:',
        if options.verbose: print (time.time() - start_time)
        sys.exit(exit_val)
    except KeyboardInterrupt, e: # Ctrl-C
        raise e
    except SystemExit, e: # sys.exit()
        raise e
    except Exception, e:
        print 'ERROR, UNEXPECTED EXCEPTION'
        print str(e)
        traceback.print_exc()
        os._exit(1)
-->