Extracting links from a URL to Maltego
Another recipe in this book shows how to use the BeautifulSoup library to get domain names programmatically. This recipe shows you how to create a local Maltego transform, which you can then use within Maltego itself to present that information in an easy-to-use, graphical way. The links gathered by this transform can also feed a larger spidering or crawling solution.
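As a rough illustration of the idea, the following is a minimal Python 3 sketch that uses only the standard library (html.parser in place of urllib2 and BeautifulSoup), so it can be tried without a network connection. The names LinkCollector, extract_links, and to_maltego_xml are hypothetical helpers invented for this example, and the maltego.Domain entity type is an assumption; adjust it to whichever entity type your transform should return.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from xml.sax.saxutils import escape


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip anchors that have no href attribute
                self.hrefs.append(href)


def extract_links(html, base_url):
    """Return absolute http(s) links from html, resolved against base_url."""
    collector = LinkCollector()
    collector.feed(html)
    absolute = (urljoin(base_url, href) for href in collector.hrefs)
    return [u for u in absolute if u.startswith(("http://", "https://"))]


def to_maltego_xml(links, entity_type="maltego.Domain"):
    """Wrap each link in a Maltego response envelope.

    entity_type is an assumption made for this sketch.
    """
    lines = ["<MaltegoMessage>", "<MaltegoTransformResponseMessage>", "<Entities>"]
    for link in links:
        lines.append('<Entity Type="%s">' % entity_type)
        lines.append("<Value>%s</Value>" % escape(link))  # escape &, <, >
        lines.append("</Entity>")
    lines += ["</Entities>", "</MaltegoTransformResponseMessage>", "</MaltegoMessage>"]
    return "\n".join(lines)


if __name__ == "__main__":
    sample = '<a href="http://a.example/x">one</a> <a href="/about">two</a>'
    print(to_maltego_xml(extract_links(sample, "http://example.com")))
```

Keeping link extraction separate from XML generation means the same list of links can be handed to a spider or crawler as well as to Maltego.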
How to do it…
The following code shows how you can create a script that will output the enumerated information into the correct format for Maltego:
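import urllib2
from bs4 import BeautifulSoup
import sys

# The target URL is passed as the first argument; strip any trailing slash
tarurl = sys.argv[1]
if tarurl[-1] == "/":
    tarurl = tarurl[:-1]

# Open the Maltego response envelope
print "<MaltegoMessage>"
print "<MaltegoTransformResponseMessage>"
print " <Entities>"

# Fetch the page and parse it for anchor tags
url = urllib2.urlopen(tarurl).read()
soup = BeautifulSoup(url)
for line in soup.find_all('a'):
    newline = line.get('href')
    # Skip anchors with no href, which would otherwise raise a TypeError
    if newline is not None and newline[:4] == "http":
        print "<Entity...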