armadito-pdf's Introduction

ARMADITO PDF ANALYZER

Copyright (C) Teclib', 2015, 2016

Project home: http://www.teclib-edition.com/teclib-products/armadito-antivirus

Online documentation: http://armadito-av.readthedocs.io/en/latest/

What is it?

Armadito PDF analyzer is a module for scanning PDF documents. It includes:

  • a PDF parser

  • a heuristic analyzer that computes the document's confidence level

Licensing

Armadito PDF analyzer is licensed under the GPLv3: https://www.gnu.org/licenses/license-list.html#GNUGPLv3

Dependencies

miniz.c

FEATURES

==> Parsing <==

  • Remove PostScript comments from the content of the document.
  • Get the PDF version from the header (e.g. %PDF-1.7).
  • Get the trailers and the xref table or xref objects.
  • Get information about the objects described in the document (reference, dictionary, type, stream, filters, etc.).
  • Extract objects embedded in stream objects.
  • Decode object streams encoded with the following filters: FlateDecode, ASCIIHexDecode, ASCII85Decode, LZWDecode, CCITTFaxDecode.
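
To illustrate the filter decoding step listed above, here is a minimal Python sketch. It is not the module's own code (which relies on miniz.c); it uses the Python standard library instead, and the names `decode_stream`, `ascii_hex_decode`, `ascii85_decode` and `DECODERS` are hypothetical. LZWDecode and CCITTFaxDecode are left out because the standard library has no decoder for them.

```python
import base64
import binascii
import zlib


def ascii_hex_decode(data: bytes) -> bytes:
    """Decode ASCIIHexDecode data: hex digits, optional whitespace, '>' terminator."""
    # Drop whitespace and keep only what precedes the '>' end-of-data marker.
    digits = b"".join(data.split()).split(b">", 1)[0]
    # An odd number of digits is completed with a trailing '0'.
    if len(digits) % 2:
        digits += b"0"
    return binascii.unhexlify(digits)


def ascii85_decode(data: bytes) -> bytes:
    """Decode ASCII85Decode data, which ends with '~>' and has no '<~' prefix."""
    body = b"".join(data.split())
    if body.endswith(b"~>"):
        body = body[:-2]
    return base64.a85decode(body)


# Hypothetical dispatch table: filter name -> decoder function.
DECODERS = {
    "FlateDecode": zlib.decompress,
    "ASCIIHexDecode": ascii_hex_decode,
    "ASCII85Decode": ascii85_decode,
}


def decode_stream(raw: bytes, filters) -> bytes:
    """Apply the filters in the order they appear in the stream's /Filter entry."""
    for name in filters:
        raw = DECODERS[name](raw)
    return raw
```

A stream declared with `/Filter [/ASCIIHexDecode /FlateDecode]` would be decoded with `decode_stream(raw, ["ASCIIHexDecode", "FlateDecode"])`. Filter parameters (`/DecodeParms`, for example predictors) are ignored in this sketch.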

==> Analysis <==

  • Tests based on PDF document structure (according to the PDF specification):

    • Check the PDF header version (from version 1.1 to 1.7).
    • Check if the content of the document is encrypted.
    • Check that the document contains non-empty pages.
    • Check object collision in object declaration.
    • Check trailers format.
    • Check xref table and xref object.
    • Check the presence of malicious PostScript comments (which could cause parsing errors).
  • Tests based on PDF objects content:

    • Get potentially malicious active content (JavaScript, embedded files, forms, URIs, etc.).
    • JavaScript content analysis (malicious keywords, pattern repetition, unicode strings, etc.).
    • Info object content analysis (search for potentially malicious strings).
    • Check if the object dictionary is hex-obfuscated.
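
Two of the object content tests above can be sketched in Python as follows. This is an illustration under assumptions, not the analyzer's actual logic: the function names and the `ACTIVE_NAMES` list are hypothetical, and the sketch scans the raw dictionary bytes without separating names from string literals. It relies on the PDF rule that a name may encode any character as `#` followed by two hexadecimal digits, so `/J#61vaScript` is the same name as `/JavaScript`.

```python
import re

# A '#xx' escape inside a PDF name stands for the byte with hex value xx.
HEX_ESCAPE_RE = re.compile(rb"#([0-9A-Fa-f]{2})")

# Hypothetical list of names that signal active content.
ACTIVE_NAMES = (b"/JavaScript", b"/JS", b"/EmbeddedFile", b"/AcroForm", b"/URI")


def normalize_names(dictionary: bytes) -> bytes:
    """Replace every #xx escape with the byte it encodes."""
    return HEX_ESCAPE_RE.sub(lambda m: bytes([int(m.group(1), 16)]), dictionary)


def is_hex_obfuscated(dictionary: bytes) -> bool:
    """True if a #xx escape hides an ASCII letter or digit that needs no escaping."""
    return any(
        bytes([int(code, 16)]).isalnum()
        for code in HEX_ESCAPE_RE.findall(dictionary)
    )


def active_content(dictionary: bytes):
    """Return the active-content names found once #xx escapes are undone."""
    clean = normalize_names(dictionary)
    return [
        name.decode()
        for name in ACTIVE_NAMES
        # The lookahead avoids matching a longer name that merely starts the same way.
        if re.search(re.escape(name) + rb"(?![A-Za-z0-9])", clean)
    ]
```

For the dictionary `<< /S /J#61vaScript /JS (app.alert(1)) >>`, `is_hex_obfuscated` reports the escape and `active_content` still finds both `/JavaScript` and `/JS`.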

==> Scoring <==

  • A suspicion coefficient is assigned to each test.
  • Compute the overall suspicion coefficient of the PDF document from the test results.
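
The README does not say how the per-test coefficients are combined or what their values are. The Python sketch below assumes a plain sum over the tests that triggered; the `CheckResult` type, the test names and the weights are all hypothetical and chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    name: str          # hypothetical test identifier
    coefficient: int   # hypothetical weight assigned to the test
    triggered: bool    # True if the test flagged the document


def document_coefficient(results) -> int:
    """Assumed aggregation: add up the coefficients of the tests that triggered."""
    return sum(r.coefficient for r in results if r.triggered)


# Example with made-up weights.
results = [
    CheckResult("empty_pages", 30, False),
    CheckResult("hex_obfuscated_dictionary", 50, True),
    CheckResult("javascript_keywords", 40, True),
]
total = document_coefficient(results)
```

With these made-up weights, only the two triggered tests contribute to `total`.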

LIMITATIONS

  • Supported PDF versions are: %PDF-1.1 to %PDF-1.7.
  • PDF documents with encrypted content are not supported.
  • Comment removal is skipped for documents larger than 2 MB.
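
The three limitations above can be read as a pre-check run before analysis. The Python sketch below is one possible reading, not the module's code: `precheck` and `MAX_COMMENT_STRIP_SIZE` are hypothetical names, 2 MB is assumed to mean 2 * 1024 * 1024 bytes, and encryption is detected with a simple search for the `/Encrypt` key rather than by parsing the trailer.

```python
import re

# Assumption: "2MB" is interpreted as 2 * 1024 * 1024 bytes.
MAX_COMMENT_STRIP_SIZE = 2 * 1024 * 1024


def precheck(data: bytes) -> dict:
    """Summarize whether a document falls within the documented limitations."""
    # Supported headers are %PDF-1.1 through %PDF-1.7.
    header = re.search(rb"%PDF-1\.([1-7])(?!\d)", data[:1024])
    return {
        "supported_version": header is not None,
        # Rough heuristic: an /Encrypt key anywhere in the file.
        "encrypted": b"/Encrypt" in data,
        # Comment removal is skipped above the size threshold.
        "strip_comments": len(data) <= MAX_COMMENT_STRIP_SIZE,
    }
```

A caller could refuse to analyze a document when `supported_version` is false or `encrypted` is true, and skip the comment removal pass when `strip_comments` is false.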

Contributors

fdechelle
