Target - Abstract Challenge Data

The Target class abstracts away interactions with raw target data by first evaluating what kind of target the data is, and then providing convenience methods for accessing raw data, files, or URLs from the data.

class katana.target.Target(manager: katana.manager.Manager, upstream: bytes, parent: Optional[katana.unit.Unit] = None, config: Optional[configparser.ConfigParser] = None)

A Target has two main parts:

  • Upstream
  • Raw Data

The upstream is whatever was passed to the Target constructor. For raw data, upstream and raw are the same object. If a URL was passed to the constructor, Katana automatically attempts to fetch the page, and raw holds the content of the web page. Similarly, if the upstream was a path, raw returns the content of the file.

If you don’t rely on external tools, you should mostly deal with raw or stream. raw will be either a bytes object or a memory-mapped file (which acts like a bytes object in most situations). stream will be either an open file handle (for file upstreams) or a BytesIO object, which acts like a file. This lets you reference the data in an abstract way no matter what the upstream target was. Other useful properties that describe the data are listed below.
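The raw/stream split described above can be sketched with the standard library alone. This is a minimal illustration, not Katana's implementation: open_raw and open_stream are hypothetical helper names, and treating the upstream as a file whenever it names an existing path is an assumption.

```python
import io
import mmap
import os
import tempfile


def open_raw(upstream: bytes):
    """Hypothetical helper: bytes-like view of the data behind `upstream`."""
    if os.path.isfile(upstream):
        with open(upstream, "rb") as fh:
            # The mapping stays valid after the file handle is closed.
            # Note: mapping an empty file raises ValueError.
            return mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
    # Not a path: the upstream bytes are the data itself.
    return upstream


def open_stream(upstream: bytes):
    """Hypothetical helper: file-like view of the data behind `upstream`."""
    if os.path.isfile(upstream):
        return open(upstream, "rb")
    return io.BytesIO(upstream)


# The same consumer code works for in-memory data and for a file on disk.
data = b"flag{example}"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
path = os.fsencode(tmp.name)

for upstream in (data, path):
    raw = open_raw(upstream)
    stream = open_stream(upstream)
    assert raw[:5] == b"flag{"   # slicing works on bytes and mmap alike
    assert raw.find(b"{") == 4   # so does find()
    assert stream.read() == data
    stream.close()
    if isinstance(raw, mmap.mmap):
        raw.close()

os.unlink(tmp.name)
```

Code that only slices, searches, or reads the data never needs to know which kind of upstream it was given.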

Property upstream:
 A bytes object holding the original target data.
Property parent:
 A Unit object describing how this target was created (or None for root targets).
Property is_printable:
 Whether the data is mostly printable text
Property is_english:
 Whether the data appears to be mostly English
Property is_image:
 Whether the data is an image
Property is_base64:
 Whether the data looks like base64
Property path:
 The path to a file-backed target (URLs are also file-backed by an artifact)
Property completed:
 Whether we are done processing this target
Property url_pieces:
 A regex Match object containing the URL pieces, if this is a URL.
Property is_url:
 True if this appears to be a valid URL
Property is_file:
 True if this appears to be a valid file path. This is also true if manager[download] is True and we were able to download the file as an artifact.
Property magic:
 libmagic result for the data
Property hash:
 A hashlib.md5 object representing the hash of the data
Property start_time:
 The time in seconds that this target was started
Property end_time:
 When this target completed
Property units_evaluated:
 The total number of units evaluated under this target (only root targets)

add_unit()

Add a unit for tracking. This is called by Manager.queue.

build_target()

This method does the resource-intensive part of building the target. It runs in a separate thread to reduce the time it takes to return from the Manager.queue_target method (e.g. when running with a REPL).
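The pattern described here, a cheap constructor that hands the expensive work to a background thread, might look roughly like the sketch below. LazyTarget, _build, and wait are hypothetical names used only for illustration; they are not part of Katana's API, and hashing stands in for whatever work build_target actually performs.

```python
import hashlib
import threading


class LazyTarget:
    """Hypothetical stand-in: returns quickly, analyzes the data in a thread."""

    def __init__(self, upstream: bytes):
        self.upstream = upstream
        self.hash = None
        self._built = threading.Event()
        # Start the expensive work without blocking the caller.
        self._thread = threading.Thread(target=self._build, daemon=True)
        self._thread.start()

    def _build(self):
        # Placeholder for the resource-intensive part (hashing, magic, etc.).
        self.hash = hashlib.md5(self.upstream)
        self._built.set()

    def wait(self, timeout=None) -> bool:
        """Block until the build finishes; returns False on timeout."""
        return self._built.wait(timeout)


target = LazyTarget(b"some challenge data")  # returns immediately
if target.wait(timeout=5):
    print(target.hash.hexdigest())
```

The caller (a REPL, for instance) gets control back right away and only blocks if it asks for results before the build has finished.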

is_webpage

Opposite of is_website_root?

is_website_root

If this is a URL, return whether we are at the root of the URL.

raw

Return a bytes-like object for any given target type:

  • Files/content already in memory: return self.content
  • Files already written to disk: return a mmap object
  • For all other unknown data: return self.upstream directly

rem_unit()

Remove a unit for tracking. Also sets completed if all units are done.
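The add_unit/rem_unit bookkeeping can be pictured as a guarded set of outstanding units. The sketch below is an assumption about the general shape only: UnitTracker is a hypothetical class, and the lock and the reset of completed in add_unit are illustrative choices rather than documented behavior.

```python
import threading


class UnitTracker:
    """Hypothetical sketch of per-target unit tracking."""

    def __init__(self):
        self._lock = threading.Lock()
        self._units = set()
        self.completed = False

    def add_unit(self, unit):
        # A new outstanding unit means the target is not finished yet.
        with self._lock:
            self._units.add(unit)
            self.completed = False

    def rem_unit(self, unit):
        # Drop the unit; once none remain, mark the target completed.
        with self._lock:
            self._units.discard(unit)
            if not self._units:
                self.completed = True


tracker = UnitTracker()
tracker.add_unit("base64")
tracker.add_unit("strings")
tracker.rem_unit("base64")
assert not tracker.completed   # "strings" is still outstanding
tracker.rem_unit("strings")
assert tracker.completed       # all units are done
```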

stream

Return a file-like object for any given target type:

  • Files/content already in memory: return a BytesIO object
  • Files already written to disk: return a binary file handle
  • For all other unknown data: return a BytesIO object of upstream

web_host

If this is a URL, return the hostname.

web_port

If this is a URL, return the port number.

web_protocol

If this is a URL, return the protocol.

web_query

If this is a URL, return the query string.

web_uri

If this is a URL, return the URI.

website_root

If this is a URL, return the root of the URL (without any URI).
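To make the web_* properties concrete, the sketch below splits a URL into comparable pieces. It uses urllib.parse from the standard library rather than the regex Match that url_pieces holds, so it is a stand-in technique, not Katana's code. url_parts is a hypothetical helper, and the exact values Katana returns (for example, default ports or whether the query keeps its leading "?") may differ.

```python
from urllib.parse import urlsplit


def url_parts(url: str) -> dict:
    """Hypothetical helper mirroring the web_* property names."""
    parts = urlsplit(url)
    default_ports = {"http": 80, "https": 443}  # assumed fallback ports
    uri = parts.path or "/"
    return {
        "web_protocol": parts.scheme,
        "web_host": parts.hostname,
        "web_port": parts.port or default_ports.get(parts.scheme),
        "web_uri": uri,
        "web_query": parts.query,
        "website_root": f"{parts.scheme}://{parts.netloc}",
        # Assumption: "root" means no path beyond "/" and no query string.
        "is_website_root": uri == "/" and not parts.query,
    }


info = url_parts("http://example.com:8080/index.php?id=1")
assert info["web_protocol"] == "http"
assert info["web_host"] == "example.com"
assert info["web_port"] == 8080
assert info["web_uri"] == "/index.php"
assert info["web_query"] == "id=1"
assert info["website_root"] == "http://example.com:8080"
assert info["is_website_root"] is False
```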