I’ve wondered for quite some time how sites like https://ipinfo.io get their data. My second question was: could I gather or build out the data used by these sorts of services myself? Looking at the data, I assumed the ownership information was stored somewhere I could access, either publicly or behind a paywall, but where? So I did some digging around, and below are some of the datasets I discovered.


ICANN runs IANA, which allocates IP addresses globally. IANA allocates blocks of IPs to RIRs (Regional Internet Registries), which manage the blocks allocated to them. You can view the list of ranges and how they are allocated here: https://www.iana.org/assignments/ipv4-address-space/ipv4-address-space.xhtml. It seems they allocate /8 ranges, which you can also see here: https://en.wikipedia.org/wiki/List_of_assigned_/8_IPv4_address_blocks
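To make the /8 idea concrete, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `slash8_block` and the sample address (taken from a documentation range) are my own choices for illustration, not anything IANA publishes.

```python
import ipaddress


def slash8_block(address: str) -> ipaddress.IPv4Network:
    """Return the /8 block that contains the given IPv4 address."""
    # strict=False masks off the host bits instead of raising ValueError.
    return ipaddress.ip_network(f"{address}/8", strict=False)


# 192.0.2.1 is a documentation address, used here only as an example.
block = slash8_block("192.0.2.1")
print(block)                # 192.0.0.0/8
print(block.num_addresses)  # 16777216, i.e. 2**24
```

Looking up the first octet of that block in the IANA table tells you which RIR (or legacy holder) the /8 was handed to.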


An RIR is a regional registry. There are five of them (RIPE NCC, APNIC, ARIN, LACNIC, AFRINIC), coordinated through the NRO, and you can see the map here. Each RIR further allocates addresses to LIRs (Local Internet Registries).


So with that knowledge, I dug into each RIR to see where I could mine the allocated addresses. I found data dumps, and it took me some time to look through them and work out which datasets would be helpful. While looking at the ARIN datasets I ran across the term ASN (Autonomous System Number). IANA assigns blocks of ASNs to each RIR, which in turn assigns individual ASNs to network operators. You can see all the assignments here: https://www.iana.org/assignments/as-numbers/as-numbers.xhtml

So you can get the list of ASN details from each RIR. I did notice RIPE has the details for every RIR (https://ftp.ripe.net/pub/stats/), so you don’t necessarily have to go to every RIR to get that data. Using the data, I could figure out which ranges and ASNs belonged together.

To verify that the data was good, I looked up some addresses on ipinfo.io to see if things checked out. Everything looked good, but there were some caveats: some of the ranges I looked up were noted as inactive. How would I know that a given range was “inactive” when there were no clear indicators in the ARIN datasets?

Another challenge was associating ranges with a specific ASN. You can use the reg-id column to tie ASNs and IPs under the same org, but it isn’t clear how to tie an ASN to a specific IP allocation.
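As a rough sketch of that reg-id grouping: the snippet below assumes the pipe-delimited delegated-extended layout with reg-id as the eighth field. The sample rows, including the addresses, ASN, and reg-id values, are made up from documentation ranges, and `group_by_reg_id` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical rows; every value here is invented for illustration.
SAMPLE_ROWS = [
    "arin|US|asn|64496|1|20200101|assigned|org-a",
    "arin|US|ipv4|192.0.2.0|256|20200101|assigned|org-a",
    "arin|US|ipv4|198.51.100.0|256|20200101|assigned|org-b",
]


def group_by_reg_id(rows):
    """Map each reg-id to the ASNs and IP ranges registered under it."""
    orgs = defaultdict(lambda: {"asn": [], "ipv4": [], "ipv6": []})
    for row in rows:
        fields = row.strip().split("|")
        # Skip comments and any line without the eight record fields
        # (such as the version header and the summary lines).
        if row.startswith("#") or len(fields) < 8:
            continue
        rtype, start, value, reg_id = fields[2], fields[3], fields[4], fields[7]
        if not reg_id or rtype not in ("asn", "ipv4", "ipv6"):
            continue
        orgs[reg_id][rtype].append((start, int(value)))
    return dict(orgs)


grouped = group_by_reg_id(SAMPLE_ROWS)
print(grouped["org-a"])
```

This only tells you that an ASN and a range share an owner. It does not say which ASN announces which range, which is exactly the gap described above.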

Stats delegated format

The delegated extended datasets look like this.


It took me a while to decipher the format, especially the last column, but this is the gist of it (I finally found a doc on the format). The last column is essentially a key that helps you associate rows with a given org.
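Here is a minimal sketch of reading one record, assuming the column order registry, country code, type, start, value, date, status, reg-id. For ipv4 rows the value column is a count of addresses, while for ipv6 rows it is a prefix length, so the two need different handling to become CIDR blocks. The `Record` type, the function names, and the sample rows are all invented for illustration.

```python
import ipaddress
from typing import NamedTuple


class Record(NamedTuple):
    registry: str
    cc: str
    rtype: str   # "asn", "ipv4", or "ipv6"
    start: str
    value: int
    date: str
    status: str
    reg_id: str


def parse_record(line: str) -> Record:
    """Split one pipe-delimited record line into named fields."""
    f = line.strip().split("|")
    return Record(f[0], f[1], f[2], f[3], int(f[4]), f[5], f[6], f[7])


def to_networks(rec: Record):
    """Convert an ipv4/ipv6 record into a list of CIDR networks."""
    if rec.rtype == "ipv4":
        # The count need not be a power of two, so a single record
        # can expand into several CIDR blocks.
        first = ipaddress.IPv4Address(rec.start)
        last = first + (rec.value - 1)
        return list(ipaddress.summarize_address_range(first, last))
    if rec.rtype == "ipv6":
        # For ipv6 the value column is already the prefix length.
        return [ipaddress.ip_network(f"{rec.start}/{rec.value}")]
    raise ValueError(f"not an address record: {rec.rtype}")


# Made-up rows built from documentation ranges.
v4 = parse_record("arin|US|ipv4|203.0.113.0|384|20200101|assigned|org-a")
v6 = parse_record("ripencc|NL|ipv6|2001:db8::|32|20200101|allocated|org-c")
print(to_networks(v4))  # [IPv4Network('203.0.113.0/24'), IPv4Network('203.0.114.0/25')]
print(to_networks(v6))  # [IPv6Network('2001:db8::/32')]
```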



During my research, while verifying information about ranges and ownership, I ran into https://bgp.he.net/AS3356. BGP (which I won’t dig into because I haven’t taken the time to understand it yet) is a protocol for exchanging routing information. I found you can download dumps of the exchanges and parse through them. Inside those dumps (http://archive.routeviews.org/bgpdata/), you can find ASNs and ranges. Aha! This is where I can sort out whether a range is inactive or not!

The data looks roughly like this when you dump it out with bgpdump:

TABLE_DUMP2|1593640801|B||7018||7018 15169|IGP||0|0|7018:2500 7018:37232|NAG||
TABLE_DUMP2|1593640801|B||1239||1239 15169|IGP||0|80||NAG||
TABLE_DUMP2|1593640801|B||3549||3549 3356 15169|IGP||0|2504|3356:2 3356:86 3356:500 3356:666 3356:2064 3356:11078 3549:2352 3549:31826|NAG||
TABLE_DUMP2|1593640801|B||2497||2497 15169|IGP||0|0||NAG||

# Get the peer IP and peer ASN (fields 4 and 5 of the -m output)
bgpdump -m rib.20200701.2200.bz2  | cut -d'|' -f 4,5

# Find matches for a specific ASN
bgpdump updates.20200428.1815 | grep 'AS6447'

By parsing a RIB file you can figure out which ASNs are associated with which ranges and are still active.
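A rough sketch of that idea follows, assuming the one-line `bgpdump -m` layout where field 6 is the prefix and field 7 is the AS path, and treating the last ASN in the path as the origin. The sample lines are made up from documentation prefixes and ASNs, and `origins_by_prefix` and `is_announced` are hypothetical helper names. Treating "not announced in this RIB" as "inactive" is my working assumption, not an established definition.

```python
import ipaddress
from collections import defaultdict

# Invented lines in the `bgpdump -m` layout (documentation values only).
SAMPLE = [
    "TABLE_DUMP2|1593640801|B|203.0.113.1|64496|192.0.2.0/24|64496 64500|IGP|203.0.113.1|0|0||NAG||",
    "TABLE_DUMP2|1593640801|B|203.0.113.2|64497|192.0.2.0/24|64497 64499 64500|IGP|203.0.113.2|0|0||NAG||",
    "TABLE_DUMP2|1593640801|B|203.0.113.1|64496|198.51.100.0/24|64496 64501|IGP|203.0.113.1|0|0||NAG||",
]


def origins_by_prefix(lines):
    """Map each announced prefix to the set of origin ASNs seen for it."""
    table = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 7 or fields[0] != "TABLE_DUMP2":
            continue
        prefix, as_path = fields[5], fields[6].split()
        if not prefix or not as_path:
            continue
        origin = as_path[-1]
        # AS_SET origins look like "{64500,64501}"; this sketch skips them.
        if origin.startswith("{"):
            continue
        table[ipaddress.ip_network(prefix)].add(int(origin))
    return dict(table)


def is_announced(network, table):
    """True if any announced prefix overlaps the given range."""
    return any(
        p.version == network.version and p.overlaps(network) for p in table
    )


table = origins_by_prefix(SAMPLE)
print(table[ipaddress.ip_network("192.0.2.0/24")])                 # {64500}
print(is_announced(ipaddress.ip_network("203.0.113.0/24"), table))  # False
```

The linear scan in `is_announced` is fine for a handful of lookups. A full RIB has hundreds of thousands of prefixes, so checking many ranges would call for a prefix tree or a sorted index instead.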


So I haven’t reached a conclusion yet, but you can start to see how the pieces are coming together. I’m still not clear on how you get specific states or cities, which is why this remains unfinished.

During my search for data I found useful links that directly contributed to my discoveries or are related.