Target - Abstract Challenge Data
The Target class abstracts away interactions with raw target data by first evaluating what kind of target the data is, and then providing convenience methods for accessing raw data, files, or URLs from the data.
-
class katana.target.Target(manager: katana.manager.Manager, upstream: bytes, parent: Optional[katana.unit.Unit] = None, config: Optional[configparser.ConfigParser] = None)
A Target has two main parts:
- Upstream
- Raw Data
The Upstream is what was passed to the target constructor. In the case of raw data, upstream and raw will be identical objects. If a URL was passed to the constructor, raw will take the form of the content of the web page; Katana will automatically attempt to fetch the page. In a similar fashion, raw will return the content of a file if the upstream was a path.

If you don’t rely on external tools, you should mostly deal with raw or stream. raw will either be a bytes object or a memory-mapped file (which acts like a bytes object in most situations). stream will either be an open file handle for file upstreams, or a BytesIO object which will act like a file. This allows you to reference the data in an abstract way no matter what the upstream target was. Other useful properties are also available which describe the data and are listed below.

- Property upstream: A bytes object holding the original target data.
- Property parent: A Unit object describing how this target was created (or None for root targets).
- Property is_printable: Whether the data is mostly printable text.
- Property is_english: Whether the data appears to be mostly English.
- Property is_image: Whether the data is an image.
- Property is_base64: Whether the data looks like base64.
- Property path: The path to a file-backed target (URLs are also file-backed by an artifact).
- Property completed: Whether we are done processing this target.
- Property url_pieces: A regex Match object containing the URL pieces, if this is a URL.
- Property is_url: True if this appears to be a valid URL.
- Property is_file: True if this appears to be a valid file path. This is also true if manager[download] is True and we were able to download the file as an artifact.
- Property magic: libmagic result for the data.
- Property hash: A hashlib.md5 object representing the hash of the data.
- Property start_time: The time in seconds that this target was started.
- Property end_time: When this target completed.
- Property units_evaluated: The total number of units evaluated under this target (only root targets).
-
add_unit()
Add a unit for tracking. This is called by Manager.queue.
-
build_target()
This method does the resource-intensive part of building the target. It is done in a separate thread to decrease the time to return from the Manager.queue_target method (e.g. when running with a REPL).
-
is_webpage
Opposite of is_website_root?
-
is_website_root
If this is a URL, return whether we are at the root of the URL.
-
raw
Return a bytes-like object for any given target type:
- Files/content already in memory: return self.content
- Files already written to disk: return a mmap object
- For all other unknown data: return self.upstream directly
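The three cases above can be illustrated with a minimal sketch. This is not Katana's implementation: `raw_view` and `demo_raw_view` are hypothetical helpers that only show how a memory-mapped file and an in-memory bytes object can be handled through the same bytes-like operations (slicing, `find`).

```python
import mmap
import os
import tempfile
from typing import Optional, Union


def raw_view(upstream: bytes, path: Optional[str] = None) -> Union[bytes, mmap.mmap]:
    """Hypothetical helper (not part of Katana): pick a bytes-like view of the data."""
    if path is not None:
        # File on disk: map it read-only instead of reading it all into memory.
        # Note that mmap refuses to map an empty file.
        with open(path, "rb") as handle:
            return mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ)
    # Anything else: hand back the upstream bytes unchanged.
    return upstream


def demo_raw_view() -> tuple:
    # Write a small file so there is something on disk to map.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"flag{mapped}")
        name = tmp.name
    try:
        mapped = raw_view(name.encode(), path=name)
        in_memory = raw_view(b"flag{memory}")
        # Both objects support slicing; the mmap also supports find().
        result = (bytes(mapped[:5]), mapped.find(b"mapped"), in_memory[:5])
        mapped.close()
        return result
    finally:
        os.unlink(name)
```

Code that only slices or searches the data does not need to know which of the two objects it received, which is the point of the abstraction described above.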
-
rem_unit()
Remove a unit from tracking. Also sets completed if all units are done.
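As a rough illustration of the add_unit/rem_unit bookkeeping, the sketch below counts active units and marks the target completed once the count returns to zero. The `UnitTracker` class, its lock, and its attribute layout are assumptions made for this example; Katana's actual implementation may differ.

```python
import threading


class UnitTracker:
    """Hypothetical stand-in for a target's unit bookkeeping (not Katana's code)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._active = 0           # units currently being tracked
        self.units_evaluated = 0   # total units ever added
        self.completed = False

    def add_unit(self) -> None:
        # A new unit means the target is (again) not finished.
        with self._lock:
            self._active += 1
            self.units_evaluated += 1
            self.completed = False

    def rem_unit(self) -> None:
        # Dropping the last active unit marks the target as completed.
        with self._lock:
            if self._active == 0:
                raise RuntimeError("rem_unit called with no units being tracked")
            self._active -= 1
            self.completed = self._active == 0
```

The lock is included because units may finish on different threads; whether Katana guards its counter this way is not stated in the text above.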
-
stream
Return a file-like object for any given target type:
- Files/content already in memory: return a BytesIO object
- Files already written to disk: return a binary file handle
- For all other unknown data: return a BytesIO object of upstream
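The same idea can be sketched for file-like access. The helpers `open_stream`, `read_header`, and `demo_stream` below are hypothetical and only mirror the behaviour described above: a real file handle for data on disk, and a BytesIO wrapper for everything else, both consumed through the same `seek`/`read` calls.

```python
import io
import os
import tempfile
from typing import BinaryIO, Optional


def open_stream(upstream: bytes, path: Optional[str] = None) -> BinaryIO:
    """Hypothetical helper (not part of Katana): pick a file-like view of the data."""
    if path is not None:
        # Data on disk: a plain binary file handle.
        return open(path, "rb")
    # Data in memory: wrap the bytes so they behave like a file.
    return io.BytesIO(upstream)


def read_header(stream: BinaryIO, size: int = 4) -> bytes:
    # Works for either kind of stream because only seek() and read() are used.
    stream.seek(0)
    return stream.read(size)


def demo_stream() -> tuple:
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\x89PNG\r\n\x1a\n")
        name = tmp.name
    try:
        with open_stream(name.encode(), path=name) as on_disk:
            disk_header = read_header(on_disk)
        with open_stream(b"PK\x03\x04rest-of-archive") as in_memory:
            memory_header = read_header(in_memory)
        return disk_header, memory_header
    finally:
        os.unlink(name)
```

A consumer such as `read_header` never needs to check where the data came from, which matches the advice to work with stream when no external tool needs a real path.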
-
web_host
If this is a URL, return the hostname.
-
web_port
If this is a URL, return the port number.
-
web_protocol
If this is a URL, return the protocol.
-
web_query
If this is a URL, return the query string.
-
web_uri
If this is a URL, return the URI.
-
website_root
If this is a URL, return the root of the URL (without any URI).
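To make the URL-related properties more concrete, the sketch below splits a URL into comparable pieces. It uses the standard library's `urllib.parse.urlsplit` rather than the regex Match object that url_pieces is described as holding, and the `split_url` function, the default-port table, and the exact form of each value (for example, whether the URI keeps its leading slash) are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> dict:
    """Hypothetical helper (not part of Katana): break a URL into its pieces."""
    parts = urlsplit(url)
    # Assumption: fall back to the conventional port when none is given.
    default_ports = {"http": 80, "https": 443}
    port = parts.port or default_ports.get(parts.scheme)
    # Assumption: an empty path is treated as the root path "/".
    uri = parts.path or "/"
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": port,
        "uri": uri,
        "query": parts.query,
        "website_root": f"{parts.scheme}://{parts.netloc}",
        "is_website_root": uri == "/" and not parts.query,
    }
```

For example, `split_url("http://example.com:8080/ctf/index.php?id=1")` yields the host `example.com`, the port `8080`, the URI `/ctf/index.php`, and the query `id=1`, with `is_website_root` set to `False`.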