katana.units.pdf.pdfimages — pdfimages - Extract Images
Extract PDF images
This unit retrieves the images included in a PDF document,
using the pdfimages command-line tool. The syntax is:
pdfimages -png <target_path> <pdfimages_directory>
The unit inherits from katana.unit.FileUnit to ensure the target
is a PDF file.
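The command line above can be sketched in Python as follows. This is a minimal illustration, not the unit's actual implementation: the helper names `build_pdfimages_command` and `extract_images` are hypothetical, and so is the `image` file-name prefix. The prefix is there because `pdfimages` treats its last argument as a root for the output file names rather than as a bare directory.

```python
import os
import shutil
import subprocess
from typing import List


def build_pdfimages_command(target_path: str, output_dir: str) -> List[str]:
    """Build the argument list for: pdfimages -png <target> <output root>."""
    # "image" is a hypothetical prefix; pdfimages appends its own
    # numbering and extension to whatever root it is given.
    return ["pdfimages", "-png", target_path, os.path.join(output_dir, "image")]


def extract_images(target_path: str, output_dir: str) -> List[str]:
    """Run pdfimages and return the paths of the files it wrote."""
    if shutil.which("pdfimages") is None:
        raise FileNotFoundError("pdfimages is not installed or not on PATH")
    os.makedirs(output_dir, exist_ok=True)
    # check=True raises CalledProcessError if pdfimages exits non-zero.
    subprocess.run(build_pdfimages_command(target_path, output_dir), check=True)
    return sorted(os.path.join(output_dir, name) for name in os.listdir(output_dir))
```

Passing the arguments as a list avoids shell quoting problems when the target path contains spaces.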
class katana.units.pdf.pdfimages.Unit(*args, **kwargs)
Bases: katana.unit.FileUnit

BLOCKED_GROUPS = ['pdf']
PDFs should not come out of this unit, so there is no reason to look for them.

GROUPS = ['pdf', 'pdfimages']
These are the “tags” for the unit: “pdf”, because it is a PDF unit, and “pdfimages”, the name of the unit itself.

PRIORITY = 25
Priorities range from 0 (highest) to 100 (lowest); the default priority is 50. This unit has a high priority if this is detected…

RECURSE_SELF = False
Again, no PDFs come out of this unit, so recursing into itself would be pointless.

evaluate(case: Any) → None
Evaluate the target. Run pdfimages on the target and recurse on any newly found files.
Parameters: case – A case returned by enumerate. For this unit, the enumerate function is not used.
Returns: None. This function should not return any data.
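
The attributes and the `evaluate` flow described above might fit together roughly as in the sketch below. It does not import Katana: `FileUnitStandIn`, `PdfImagesSketch`, and the `on_new_file` callback are hypothetical stand-ins for `katana.unit.FileUnit`, the real unit, and Katana's recursion mechanism, so the real class will differ in its details.

```python
import os
import shutil
import subprocess
import tempfile
from typing import Any, Callable


class FileUnitStandIn:
    """Hypothetical stand-in for katana.unit.FileUnit (not the real class)."""

    def __init__(self, target_path: str, on_new_file: Callable[[str], None]):
        self.target_path = target_path
        # Hypothetical hook standing in for "recurse on a new file".
        self.on_new_file = on_new_file


class PdfImagesSketch(FileUnitStandIn):
    GROUPS = ["pdf", "pdfimages"]   # tags for this unit
    BLOCKED_GROUPS = ["pdf"]        # no PDFs are expected in the output
    PRIORITY = 25                   # 0 is highest, 100 is lowest, 50 is default
    RECURSE_SELF = False            # the output is images, not PDFs

    def evaluate(self, case: Any) -> None:
        """Run pdfimages on the target and hand every new file to the hook."""
        if shutil.which("pdfimages") is None:
            return  # tool not available; nothing to do in this sketch
        out_dir = tempfile.mkdtemp(prefix="pdfimages-")
        root = os.path.join(out_dir, "image")  # hypothetical file-name prefix
        subprocess.run(
            ["pdfimages", "-png", self.target_path, root],
            check=False,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        for name in sorted(os.listdir(out_dir)):
            self.on_new_file(os.path.join(out_dir, name))
```

In this sketch, `evaluate` returns nothing; any extracted image is reported only through the callback, which mirrors the documented contract that the function should not return data.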