Giter Club home page Giter Club logo

bigdataaudit's Introduction

#Big Data Audit By Kotobukki.

Build Status ![Support] (https://camo.githubusercontent.com/4a42460f88f172b10e916fec11857648a8a2f2c8/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f706c6174666f726d2d6f73782532466c696e757825324677696e646f77732d677265656e2e737667) ![Supportpython] (https://camo.githubusercontent.com/352488c0cbba0e8f6da11ae0761444dd0c93489c/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f707974686f6e2d322e372d626c75652e737667)

##使用说明 程序的入口文件是main.py, 可以使用下面的命令查看怎么使用

$python main.py -h
usage: main.py [-h] {hadoop,spark} ...

This is a tool for detectiong the security problem of hadoop!

positional arguments:
  {hadoop,spark}  commands
    hadoop        check security of hadoop
    spark         check security of spark

optional arguments:
  -h, --help      show this help message and exit

主要有两个子功能,Hadoop和Spark检测 ##Hadoop检测 同样可以使用-h查看使用说明

python main.py hadoop -h
usage: main.py hadoop [-h] confFolder

positional arguments:
  confFolder  the dir of hadoop configuration files

optional arguments:
  -h, --help  show this help message and exit

confFolder是放置hadoop配置文件的文件夹,可以是你安装hadoop时候的conf文件夹,如/usr/local/hadoop/conf/hadoop/,也可以将需要的文件copy到指定的目录,copy时候文件的名字要和hadoop.json里的一样,比如下面是系统默认配置好的用于检测已知安全选项的文件,使用json格式

{
  "authentication": {
    "core-site": [
      {
        "hadoop.security.authentication": "kerberos",
        "reason": "Suggest to authenticate user by using kerberos!"
      }
    ]
  },
  "authorization": {
    "core-site": [
      {
        "fs.permissions.enabled": "true",
        "reason": "Suggest to enable the permission control for fs!"
      },
      {
        "hadoop.security.authorization": "true",
        "reason": "Suggest to enable authorization for every user!"
      }
    ]
  },
  "acl": {
    "hdfs-site": [
      {
        "dfs.namenode.acls.enabled": "true",
        "reason": "Suggest to enable acl for user!"
      }
    ]
  },
  "encry": {
    "hdfs-site": [
      {
        "dfs.encryption.key.provider.uri": "*",
        "reason": "Suggest to encrypt data!"
      }
    ]
  }
}

json中的core-site等字段,就是要检测的文件名字,下一级则是要检测的配置项,它的父级像encry意思是检测加密的安全问题。该文件使用者可以自己进行扩展。

以当前默认的系统配置,测试如下:

$ python main.py hadoop ./hadoop
[Info]: Begining to check security: authentication
[Info]: >> Check file: core-site.xml
[Pass]: Your hadoop.security.authentication setting is safe!
[Info]: Begining to check security: encry
[Info]: >> Check file: hdfs-site.xml
[Warning]: Suggest to encrypt data! Set: dfs.encryption.key.provider.uri=*
[Info]: Begining to check security: authorization
[Info]: >> Check file: core-site.xml
[Warning]: Suggest to enable authorization for every user! Set: hadoop.security.authorization=true
[Info]: Begining to check security: acl
[Info]: >> Check file: hdfs-site.xml
[Warning]: Suggest to enable acl for user! Set: dfs.namenode.acls.enabled=true

##Spark检测 和hadoop检测类似,同样指定配置文件的文件夹路径,不同的是,spark默认的配置文件只有一个spark-defaults.conf,不要改名复制到特定的目录里。

和hadoop检测类似提供了一个可以配置的需要检测配置项,在根目录的spark文件夹下,名字为security.ini,该文件的section名字是要检测的配置项输入什么问题,key-value以空格隔开。默认的配置,列出了必要的安全配置项。

例子:

$ python main.py spark ./spark/
[Info]: Start to check the security of spark...
[Warning]: Suggest to add option spark.authenticate = true if your spark runs on standalone mode
[Warning]: Suggest to add option spark.authenticate.secret if your spark runs on yarn mode
[Warning]: Suggest to set option spark.ssl.enable = true
[Warning]: Suggest to set option spark.eventlog.enabled = true

bigdataaudit's People

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.