vakt's Introduction

Attribute-based access control (ABAC) SDK for Python.

CI Status codecov.io PyPI version Apache 2.0 licensed


Documentation

Description

Vakt is an attribute-based access control (ABAC) toolkit built around policies. ABAC stands apart from the RBAC and ACL models: it gives you fine-grained control over the rules that restrict access to resources, and it is generally considered a "next generation" authorization model. In form, Vakt resembles IAM Policies, but offers much nicer attribute management.

See concepts section for more details.

Concepts

Given a set of resources, you can define policies that describe access to them by answering the following questions:

  1. What resources (resource) are being requested?
  2. Who is requesting the resource?
  3. What actions (action) are being requested on those resources?
  4. What rules must be satisfied in the context of the request itself?
  5. What is the resulting effect of the answers to the questions above?

The overall diagram of vakt workflow is:

Vakt diagram

Vakt allows you to gain:

  • Policy Based Access Control (vakt is based on Policies that describe access rules and strategies for your resources)
  • Fine-Grained Authorization (vakt Policies give you fine-grained control over the attributes of resources, subjects, actions and context)
  • Dynamic Authorization Management (you can add Policies and change their attributes)
  • Externalized Authorization Management (you can build your own external AuthZ server with vakt, see examples)

Install

Vakt runs on Python >= 3.6. The PyPy implementation is supported as well.

Bare-bones installation with in-memory storage:

pip install vakt

For MongoDB storage:

pip install vakt[mongo]

For SQL storage:

pip install vakt[sql]
pip install $ANY_DB_DRIVER_OF_YOUR_CHOICE_SUPPORTED_BY_SQLALCHEMY

For Redis storage:

pip install vakt[redis]

Also see the redis-py docs. For example, if hiredis is found on the system, it will be used as a faster parser. However, vakt doesn't enforce this dependency.

Usage

A quick dive-in:

import vakt
from vakt.rules import Eq, Any, StartsWith, And, Greater, Less

policy = vakt.Policy(
    123456,
    actions=[Eq('fork'), Eq('clone')],
    resources=[StartsWith('repos/Google', ci=True)],
    subjects=[{'name': Any(), 'stars': And(Greater(50), Less(999))}],
    effect=vakt.ALLOW_ACCESS,
    context={'referer': Eq('https://github.com')},
    description="""
    Allow to fork or clone any Google repository for
    users that have > 50 and < 999 stars and came from Github
    """
)
storage = vakt.MemoryStorage()
storage.add(policy)
guard = vakt.Guard(storage, vakt.RulesChecker())

inq = vakt.Inquiry(action='fork',
                   resource='repos/google/tensorflow',
                   subject={'name': 'larry', 'stars': 80},
                   context={'referer': 'https://github.com'})

assert guard.is_allowed(inq)

Or if you prefer Amazon IAM Policies style:

import vakt
from vakt.rules import CIDR

policy = vakt.Policy(
    123457,
    effect=vakt.ALLOW_ACCESS,
    subjects=[r'<[a-zA-Z]+ M[a-z]+>'],
    resources=['library:books:<.+>', 'office:magazines:<.+>'],
    actions=['read', 'get'],
    context={
        'ip': CIDR('192.168.0.0/24'),
    },
    description="""
    Allow all readers of the book library whose surnames start with M get and read any book or magazine,
    but only when they connect from local library's computer
    """,
)
storage = vakt.MemoryStorage()
storage.add(policy)
guard = vakt.Guard(storage, vakt.RegexChecker())

inq = vakt.Inquiry(action='read',
                   resource='library:books:Hobbit',
                   subject='Jim Morrison',
                   context={'ip': '192.168.0.220'})

assert guard.is_allowed(inq)

For more examples see here.

Components

Policy

Policy is the main object for defining rules for accessing resources. Its main parts reflect the questions described in the Concepts section:

  • resources - a list of resources. Answers: what is asked?
  • subjects - a list of subjects. Answers: who asks access to resources?
  • actions - a list of actions. Answers: what actions are asked to be performed on resources?
  • context - rules that should be satisfied by the given inquiry's context.
  • effect - if the policy matches all the above conditions, what effect does it imply? Can be either vakt.ALLOW_ACCESS or vakt.DENY_ACCESS.

All resources, subjects and actions are described with a list containing strings, regexes, Rules or dictionaries that map strings (attributes) to Rules. Each element in the list acts as a logical OR. Each key in a dictionary of Rules acts as a logical AND. context can be described only with a dictionary of Rules.
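
The OR/AND semantics can be pictured with a small standalone sketch. It does not use vakt: plain Python predicates stand in for Rules, and the helper names element_matches and field_matches are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: plain predicates stand in for vakt Rules.
from typing import Any, Callable, Dict, List, Union

Predicate = Callable[[Any], bool]
Element = Union[Predicate, Dict[str, Predicate]]


def element_matches(element: Element, value: Any) -> bool:
    """A dictionary matches only if every listed attribute passes its predicate (AND)."""
    if isinstance(element, dict):
        if not isinstance(value, dict):
            return False
        return all(key in value and check(value[key]) for key, check in element.items())
    return element(value)


def field_matches(elements: List[Element], value: Any) -> bool:
    """A list matches when at least one of its elements matches (OR)."""
    return any(element_matches(element, value) for element in elements)


subjects = [
    {'name': lambda v: True, 'stars': lambda v: 50 < v < 999},  # AND across keys
    lambda v: v == 'admin',                                     # OR with the dict above
]

ok = field_matches(subjects, {'name': 'larry', 'stars': 80})
too_few = field_matches(subjects, {'name': 'larry', 'stars': 10})
plain = field_matches(subjects, 'admin')
```

Here the dictionary element requires both attributes to pass, while the list as a whole passes as soon as any single element does.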

Depending on how resources, subjects and actions are described, a Policy has either the String-based or the Rule-based type. The type can be inspected via policy.type. It determines which concrete Checker implementation must be used. See Checker for more.

from vakt import Policy, ALLOW_ACCESS
from vakt.rules import CIDR, Any, Eq, NotEq, In

# Rule-based policy (defined with Rules and dictionaries of Rules)
Policy(
    1,
    description="""
    Allow access to administration interface subcategories: 'panel', 'switch' if user is not
    a developer and came from local IP address.
    """,
    actions=[Any()],
    resources=[{'category': Eq('administration'), 'sub': In('panel', 'switch')}],
    subjects=[{'name': Any(), 'role': NotEq('developer')}],
    effect=ALLOW_ACCESS,
    context={'ip': CIDR('127.0.0.1/32')}
)

# String-based policy (defined with regular expressions)
Policy(
    2,
    description="""
    Allow all readers of the book library whose surnames start with M get and read any book or magazine,
    but only when they connect from local library's computer
    """,
    effect=ALLOW_ACCESS,
    subjects=[r'<[\w]+ M[\w]+>'],
    resources=('library:books:<.+>', 'office:magazines:<.+>'),
    actions=['<read|get>'],
    context={'ip': CIDR('192.168.2.0/24')}
)

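In practice, you create a set of Policies that encompass the access rules for your domain and store them so that the Guard component can use them for future decisions.

from vakt import MemoryStorage

st = MemoryStorage()
# `policies` is assumed to be an iterable of Policy objects defined elsewhere
for p in policies:
    st.add(p)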

Additionally you can create Policies with predefined effect classes:

from vakt import PolicyAllow, PolicyDeny, ALLOW_ACCESS, DENY_ACCESS

p = PolicyAllow(1, actions=['<read|get>'], resources=['library:books:<.+>'], subjects=[r'<[\w]+ M[\w]+>'])
assert ALLOW_ACCESS == p.effect

p = PolicyDeny(2, actions=['<read|get>'], resources=['library:books:<.+>'], subjects=[r'<[\w]+ M[\w]+>'])
assert DENY_ACCESS == p.effect

Inquiry

Inquiry is an object that serves as a mediator between Vakt and an outside-world request for resource access. All you need to do is take any kind of incoming request (REST request, SOAP, etc.) and build an Inquiry out of it in order to feed it to Vakt. There are no concrete builders that create an Inquiry from various request types: it's a very meticulous process, and doing it yourself gives you hands-on control. Let's see an example:

from vakt import Inquiry
from flask import request, session

...

# if policies are defined on some subject's and resource's attributes with dictionaries of Rules:
inquiry2 = Inquiry(subject={'login': request.form['username'], 'role': request.form['user_role']},
                   action=request.form['action'],
                   resource={'book': session.get('book'), 'chapter': request.form['chapter']},
                   context={'ip': request.remote_addr})

# if policies are defined with strings or regular expressions:
inquiry = Inquiry(subject=request.form['username'],
                  action=request.form['action'],
                  resource=request.form['page'],
                  context={'ip': request.remote_addr})

Here we take form params and additional request information from the Flask request, then transform them into an Inquiry. That's it.

Inquiry has several constructor arguments:

  • resource - any | dictionary of str -> any. What resource is being asked to be accessed?
  • action - any | dictionary of str -> any. What is being asked to be done on the resource?
  • subject - any | dictionary of str -> any. Who asks for it?
  • context - dictionary of str -> any. What is the context of the request?

If you were observant enough, you might have noticed that Inquiry resembles Policy: a Policy describes multiple variants of resource access from the owner's side, and an Inquiry describes a concrete access scenario from the consumer's side.

Rules

Rules allow you to describe conditions directly on action, subject, resource and context, or on their attributes. If at least one Rule in the Rule-set is not satisfied, the Inquiry is rejected by the given Policy.

Attaching a Rule-set to a Policy is simple. Here are some examples:

from vakt import Policy, rules

Policy(
    ...,
    subjects=[{'name': rules.Eq('Tommy')}],
)

Policy(
    ...,
    actions=[rules.Eq('get'), rules.Eq('list'), rules.Eq('read')],
)

Policy(
    ...,
    context={
        'secret': rules.string.Equal('.KIMZihH0gsrc'),
        'ip': rules.net.CIDR('192.168.0.15/24')
    },
)

There are a number of different Rule types, see below.

If the existing Rules are not enough for you, feel free to define your own.

Comparison-related

| Rule | Example in Policy | Example in Inquiry | Notes |
|------|-------------------|--------------------|-------|
| Eq | 'age': Eq(40) | 'age': 40 | |
| NotEq | 'age': NotEq(40) | 'age': 40 | |
| Greater | 'height': Greater(6.2) | 'height': 5.8 | |
| Less | 'height': Less(6.2) | 'height': 5.8 | |
| GreaterOrEqual | 'stars': GreaterOrEqual(300) | 'stars': 77 | |
| LessOrEqual | 'stars': LessOrEqual(300) | 'stars': 300 | |

Logic-related

| Rule | Example in Policy | Example in Inquiry | Notes |
|------|-------------------|--------------------|-------|
| Truthy | 'admin': Truthy() | 'admin': user.is_admin() | Evaluates on Inquiry creation |
| Falsy | 'admin': Falsy() | 'admin': lambda x: x.is_admin() | Evaluates on Inquiry creation |
| Not | 'age': Not(Greater(90)) | 'age': 40 | |
| And | 'stars': And(Greater(50), Less(89)) | 'stars': 78 | Also, attributes in a dictionary of Rules act as AND logic |
| Or | 'stars': Or(Greater(50), Less(120), Eq(8888)) | 'stars': 78 | Also, rules in a list of, say, actions act as OR logic |
| Any | actions=[Any()] | action='get', action='foo' | Placeholder that fits any value |
| Neither | subjects=[Neither()] | subject='Max', subject='Joe' | Not very useful, left only as a counterpart of Any |

List-related

| Rule | Example in Policy | Example in Inquiry | Notes |
|------|-------------------|--------------------|-------|
| In | 'method': In('get', 'post') | 'method': 'get' | |
| NotIn | 'method': NotIn('get', 'post') | 'method': 'get' | |
| AllIn | 'name': AllIn('Max', 'Joe') | 'name': ['Max', 'Joe'] | |
| AllNotIn | 'name': AllNotIn('Max', 'Joe') | 'name': ['Max', 'Joe'] | |
| AnyIn | 'height': AnyIn(5.9, 7.5, 4.9) | 'height': [7.55] | |
| AnyNotIn | 'height': AnyNotIn(5.9, 7.5, 4.9) | 'height': [7.55] | |

Network-related

| Rule | Example in Policy | Example in Inquiry | Notes |
|------|-------------------|--------------------|-------|
| CIDR | 'ip': CIDR('192.168.2.0/24') | 'ip': '192.168.2.4' | |

String-related

| Rule | Example in Policy | Example in Inquiry | Notes |
|------|-------------------|--------------------|-------|
| Equal | 'name': Equal('max', ci=True) | 'name': 'Max' | Aliased as StrEqual. Use instead of Eq if you want a string-type check and case-insensitivity |
| PairsEqual | 'names': PairsEqual() | 'names': ['Bob', 'Bob'] | Aliased as StrPairsEqual |
| RegexMatch | 'file': RegexMatch(r'\.rb$') | 'file': 'test.rb' | |
| StartsWith | 'file': StartsWith('logs-') | 'file': 'logs-data-101967.log' | Supports case-insensitivity |
| EndsWith | 'file': EndsWith('.log') | 'file': 'logs-data-101967.log' | Supports case-insensitivity |
| Contains | 'file': Contains('sun') | 'file': 'observations-sunny-days.csv' | Supports case-insensitivity |

Inquiry-related

Inquiry-related rules are useful if you want to express an equality relation between inquiry elements or their attributes.

| Rule | Example in Policy | Example in Inquiry | Notes |
|------|-------------------|--------------------|-------|
| SubjectMatch | resources=[{'id': SubjectMatch()}] | Inquiry(subject='Max', resource={'id': 'Max'}) | Works for the whole subject value or one of its attributes |
| ActionMatch | subjects=[ActionMatch('id')] | Inquiry(subject='Max', action={'method': 'get', 'id': 'Max'}) | Works for the whole action value or one of its attributes |
| ResourceMatch | subjects=[ResourceMatch('id')] | Inquiry(subject='Max', resource={'res': 'book', 'id': 'Max'}) | Works for the whole resource value or one of its attributes |
| SubjectEqual | 'data': SubjectEqual() | Inquiry(subject='Max') | Works only for strings. Favor SubjectMatch |
| ActionEqual | 'data': ActionEqual() | Inquiry(action='get') | Works only for strings. Favor ActionMatch |
| ResourceIn | 'data': ResourceIn() | Inquiry(resource='/books/') | Works only for strings. Favor ResourceMatch |
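
To make the network-related CIDR rule more concrete, here is a minimal sketch of what a CIDR membership test amounts to, written with Python's standard ipaddress module. It is not vakt's implementation; the function name in_cidr and the choice of strict=False (which tolerates a network written with host bits set, such as '192.168.0.15/24') are assumptions made for this illustration.

```python
import ipaddress


def in_cidr(ip: str, cidr: str) -> bool:
    """Return True if `ip` lies inside the `cidr` block, False on any malformed input."""
    try:
        network = ipaddress.ip_network(cidr, strict=False)
        return ipaddress.ip_address(ip) in network
    except ValueError:
        return False


inside = in_cidr('192.168.2.4', '192.168.2.0/24')
outside = in_cidr('10.0.0.1', '192.168.2.0/24')
malformed = in_cidr('not-an-ip', '192.168.2.0/24')
```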

Checker

Checker allows you to check whether a Policy matches an Inquiry on a concrete field (subject, action, etc.). It's used internally by Guard, but you should be aware of the Checker types:

  • RulesChecker - a universal type used to check Policies defined with Rules or dictionaries of Rules (Rule-based Policy type). It gives you the highest flexibility. Most of the time you will use this type of Policies and thus this type of Checker. Besides, it's much more performant than RegexChecker. See benchmark for more details.

from vakt import RulesChecker

ch = RulesChecker()
# etc.

  • RegexChecker - checks for a match by regex test for Policies defined with strings and regexps (String-based Policy type). This means that all your Policies can be defined in regex syntax (if a Policy defines no regex, the check falls back to a simple string equality test). It gives you better flexibility compared to simple strings, but carries the burden of relatively slow performance. You can configure an LRU cache size to adjust performance to your needs:

from vakt import RegexChecker

ch = RegexChecker(2048)
ch2 = RegexChecker(512)
# etc.

See benchmark for more details.

Syntax for description of Policy fields is:

 '<foo.*>'
 'foo<[abc]{2}>bar'
 'foo<\w+>'
 'foo'

Here < and > delimit the regular-expression part of the string. A custom Policy can redefine them by overriding the start_tag and end_tag properties. Generally you always want to use the first variant: <foo.*>.
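
The delimiter syntax can be pictured with a small standalone sketch that converts such a string into an ordinary Python regular expression: text outside the tags is treated literally, text inside is treated as a regex. This is not vakt's parser. The function name compile_tagged is hypothetical, and the sketch deliberately ignores harder cases such as a literal end tag inside the regex part.

```python
import re


def compile_tagged(pattern, start_tag='<', end_tag='>'):
    """Compile 'foo<[abc]{2}>bar' so that only the tagged part is a regex."""
    parts = []
    pos = 0
    while pos < len(pattern):
        start = pattern.find(start_tag, pos)
        if start == -1:
            # No more regex parts: the rest is literal text.
            parts.append(re.escape(pattern[pos:]))
            break
        end = pattern.find(end_tag, start + len(start_tag))
        if end == -1:
            raise ValueError('unclosed regex part in %r' % pattern)
        parts.append(re.escape(pattern[pos:start]))
        parts.append('(?:' + pattern[start + len(start_tag):end] + ')')
        pos = end + len(end_tag)
    return re.compile(''.join(parts))


checks = {
    '<foo.*>': 'foobar',
    'foo<[abc]{2}>bar': 'fooabbar',
    r'foo<\w+>': 'foo_1',
    'foo': 'foo',
}
results = {p: compile_tagged(p).fullmatch(v) is not None for p, v in checks.items()}
```

Using fullmatch means a plain string such as 'foo' behaves like an equality test, which mirrors the fallback to string equality described above.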

Because regular expressions are relatively slow to execute, we recommend defining your policies in regex syntax only when you really need it; in other cases use simple strings. Both will work perfectly (and now swiftly!) with RegexChecker.

NOTE. All regex checks are case-sensitive by default. Even though some storages (e.g. MemoryStorage) allow you to specify regex modifiers within the regex string, we do not translate regex modifiers to all storages (e.g. SQLStorage). Also see the warning below.

WARNING. Please note that storages have varying levels of regexp support. For example, most SQL databases allow POSIX metacharacters, whereas the Python re module, and thus MemoryStorage, does not. So, while defining policies, you're safe and sound as long as you understand how the storage of your choice handles the regexps you specified.

  • StringExactChecker - the quickest checker. It uses exact string equality and is case-sensitive.

E.g. 'sun' in 'sunny' - False
     'sun' in 'sun' - True

  • StringFuzzyChecker - a quick checker with some degree of flexibility. It uses fuzzy substring equality and is case-sensitive.

E.g. 'sun' in 'sunny' - True
     'sun' in 'sun' - True
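
The difference between the two string checkers boils down to equality versus substring containment. A minimal standalone sketch, with hypothetical function names and neutral argument names (the text above does not say which side of the comparison comes from the Policy and which from the Inquiry):

```python
def exact_match(needle: str, haystack: str) -> bool:
    """Case-sensitive equality of the whole string."""
    return needle == haystack


def fuzzy_match(needle: str, haystack: str) -> bool:
    """Case-sensitive substring containment."""
    return needle in haystack


exact_results = (exact_match('sun', 'sunny'), exact_match('sun', 'sun'))
fuzzy_results = (fuzzy_match('sun', 'sunny'), fuzzy_match('sun', 'sun'))
```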

Note that some Storage handlers can already check whether a Policy fits an Inquiry in the find_for_inquiry() method by performing queries specific to that storage. A Storage can (and generally should) decide on the type of actions based on the checker class passed to the Guard constructor (or to find_for_inquiry() directly).

Regardless of the results returned by a Storage, the Checker is always the last line of control before Vakt makes a decision.

Guard

Guard is the main entry point for Vakt to make a decision. It has one method, is_allowed, which takes an Inquiry and gives you a boolean answer: is that Inquiry allowed or not?

Guard is constructed with a Storage and a Checker.

Policies of the String-based type won't match if RulesChecker is used, and vice versa.

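from vakt import Guard, MemoryStorage, RulesChecker

st = MemoryStorage()
# Persist all our Policies before we start serving our library.
# `policies` is assumed to be an iterable of Policy objects defined elsewhere.
for p in policies:
    st.add(p)

guard = Guard(st, RulesChecker())

# Inside a request handler (e.g. a Flask view), with `inquiry` built from the request:
if guard.is_allowed(inquiry):
    return "You've been logged-in", 200
else:
    return "Go away, you violator!", 401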

For best performance, read the Caching section.

Storage

Storage is a component that provides an interface for persisting and manipulating Policies in various backends.

It provides the following methods:

add(policy)                 # Store a Policy
get(uid)                    # Retrieve a Policy by its ID
get_all(limit, offset)      # Retrieve all stored Policies (with pagination)
retrieve_all(batch)         # Retrieve all existing stored Policies (without pagination)
update(policy)              # Store an updated Policy
delete(uid)                 # Delete Policy from storage by its ID
find_for_inquiry(inquiry)   # Retrieve Policies that match the given Inquiry

Storage may have various backend implementations (RDBMS, NoSQL databases, etc.). They may also vary in performance characteristics, so see the Caching and Benchmark sections.

Vakt ships some Storage implementations out of the box. See below:

Memory

An implementation that stores Policies in memory. It's not backed by a file or any other persistent store, so every restart of your application will wipe out everything that was stored. Useful for testing.

from vakt import MemoryStorage

storage = MemoryStorage()

MongoDB

MongoDB is chosen as the most popular and widespread NoSQL database.

from pymongo import MongoClient
from vakt.storage.mongo import MongoStorage

client = MongoClient('localhost', 27017)
storage = MongoStorage(client, 'database-name', collection='optional-collection-name')

Default collection name is 'vakt_policies'.

Actions are the same as for any Storage that conforms to the interface of the vakt.storage.abc.Storage base class.

Beware that MongoStorage currently supports indexed and filtered-out find_for_inquiry() only for the StringExact, StringFuzzy and Regex (from MongoDB version 4.2 onwards) checkers. When used with the RulesChecker, it simply returns all the Policies from the database.

SQL

SQL storage is backed by SQLAlchemy, thus it should support any RDBMS available for it: MySQL, Postgres, Oracle, MSSQL, Sqlite, etc.

Given that we support various SQL databases via SQLAlchemy, we don't specify any DB-specific drivers in the vakt dependencies. It's up to the user to provide a desired one. For example: psycopg2 or PyMySQL.

Example for MySQL:

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, scoped_session
from vakt.storage.sql import SQLStorage

engine = create_engine('mysql://root:root@localhost/vakt_db')
storage = SQLStorage(scoped_session=scoped_session(sessionmaker(bind=engine)))

# Don't forget to run migrations here (especially for the first time)
...

Beware that currently SQLStorage supports indexed and filtered-out find_for_inquiry() only for StringExact, StringFuzzy and Regex checkers. When used with the RulesChecker it simply returns all the Policies from the database.

Note that vakt focuses on testing SQLStorage functionality only for the two most popular open-source databases: MySQL and Postgres. Support for other databases may have worse performance characteristics and/or bugs. Feel free to report any issues.

Redis

Redis storage.

RedisStorage stores all Policies in a hash whose key is the collection name and whose key-value pairs are Policy UID -> serialized Policy representation.

Default collection name is "vakt_policies".

You can use different Serializers: either a custom one or one of vakt's native ones. Just pass it to the RedisStorage constructor.

Vakt is shipped with:

  • JSONSerializer
  • PickleSerializer - the fastest. Used as the default one.

Due to serialization/deserialization Redis is not as fast as simple MemoryStorage. You can run the benchmark and check performance for your use-case.
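
The storage layout described above can be pictured without a Redis server. In the sketch below a nested dictionary stands in for Redis (outer key = collection name, inner mapping = the hash), and a simplified dictionary stands in for a Policy. Both stand-ins are assumptions for illustration only, and no redis-py or vakt calls are used.

```python
import json
import pickle

# Simplified stand-in for a Policy object.
policy = {'uid': '1', 'effect': 'allow', 'actions': ['get']}

# One hash per collection: field = Policy UID, value = serialized Policy.
fake_redis = {'vakt_policies': {}}

# JSON round trip
fake_redis['vakt_policies'][policy['uid']] = json.dumps(policy)
from_json = json.loads(fake_redis['vakt_policies']['1'])

# Pickle round trip
fake_redis['vakt_policies'][policy['uid']] = pickle.dumps(policy)
from_pickle = pickle.loads(fake_redis['vakt_policies']['1'])
```

As a general Python caveat, unpickling data from a source you do not trust is unsafe, which is worth keeping in mind when choosing a serializer for a shared Redis instance.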

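from redis import Redis
from vakt.storage.redis import RedisStorage

client = Redis('127.0.0.1', 6379)
storage = RedisStorage(client, collection='optional-policies-collection-name')

...  # work with the storage

# Cleanup, e.g. after a test run. Beware: flushdb() deletes ALL keys in the current Redis database.
client.flushdb()
client.close()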

Migration

vakt.migration is a set of components that are useful from the perspective of the Storage. It's recommended to favor it over manual changes to the DB schema/data, since it's aware of Vakt's requirements for Policies data, but it's not mandatory. It's up to a particular Storage to decide whether it needs migrations or not. It consists of 3 components:

  • Migration
  • MigrationSet
  • Migrator

Migration allows you to describe data modifications between versions. Each storage can have a number of Migration classes to address different releases, with the order of the migration specified in the order property. A Migration should be located inside the particular storage module and implement vakt.storage.migration.Migration. Migration has 2 main methods (as you might guess) and 1 property:

  • up - runs db "schema" upwards
  • down - runs db "schema" downwards (rolls back the actions of up)
  • order - tells the number of the current migration in a row

MigrationSet is a component that represents a collection of Migrations for a Storage. You should define your own migration set. It should be located inside the particular storage module and implement vakt.storage.migration.MigrationSet. It has 3 methods that are left unimplemented:

  • migrations - should return all initialized Migration objects
  • save_applied_number - saves the number of the last applied up migration in the Storage for later reference
  • last_applied - returns the number of the last applied up migration from the Storage

Migrator is the executor of migrations. It can execute all migrations up or down, or execute a particular migration if the number argument is provided.

Example usage:

from pymongo import MongoClient
from vakt.storage.mongo import MongoStorage, MongoMigrationSet
from vakt.storage.migration import Migrator

client = MongoClient('localhost', 27017)
storage = MongoStorage(client, 'database-name', collection='optional-collection-name')

migrator = Migrator(MongoMigrationSet(storage))
migrator.up()
...
migrator.down()
...
migrator.up(number=2)
...
migrator.down(number=2)
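
To illustrate how ordered up/down migrations and a "last applied" marker fit together, here is a standalone sketch. It does not use vakt.storage.migration: the classes Step and Runner are hypothetical, and the exact bookkeeping (for example, what a targeted run does to the marker) is an assumption that may differ from vakt's Migrator.

```python
class Step:
    """Hypothetical migration: records its actions in a shared log."""

    def __init__(self, order, log):
        self.order = order
        self.log = log

    def up(self):
        self.log.append('up %d' % self.order)

    def down(self):
        self.log.append('down %d' % self.order)


class Runner:
    """Hypothetical executor that runs steps in order and remembers the last one applied."""

    def __init__(self, steps):
        self.steps = sorted(steps, key=lambda s: s.order)
        self.last_applied = 0

    def up(self, number=None):
        for step in self.steps:
            if number is not None and step.order != number:
                continue  # a specific migration was requested
            if number is None and step.order <= self.last_applied:
                continue  # already applied
            step.up()
            self.last_applied = step.order

    def down(self, number=None):
        for step in reversed(self.steps):
            if number is not None and step.order != number:
                continue
            if number is None and step.order > self.last_applied:
                continue  # was never applied
            step.down()
            self.last_applied = step.order - 1


log = []
runner = Runner([Step(2, log), Step(1, log)])
runner.up()
runner.down()
```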

Caching

Vakt has several layers of caching that serve a single purpose: to speed up policy enforcement decisions. In most situations and use cases you might want to use them all, so they are designed not to interact with each other but rather to work in tandem (nonetheless, you are free to use any single layer alone or any combination of them). That said, let's look at all those layers.

Caching RegexChecker

It's relevant only for RegexChecker and lets you cache the parsing and execution of regex-defined Policies, which can be very expensive due to the inherently slow computational performance of regular expressions and vakt's parsing. When creating a RegexChecker you can specify a cache size for an in-memory LRU (least recently used) cache. Currently only Python's native LRU cache is supported.

# preferably size is a power of 2
chk = RegexChecker(cache_size=2048)

# or simply
chk = RegexChecker(2048)

# or 512 by default
chk = RegexChecker()
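
For readers unfamiliar with Python's native LRU cache, the following standalone sketch shows the general idea using functools.lru_cache from the standard library. It is not RegexChecker's code: the factory make_cached_matcher is hypothetical and only demonstrates how repeated (pattern, value) checks can be answered from a bounded cache.

```python
import re
from functools import lru_cache


def make_cached_matcher(cache_size=512):
    """Build a matcher whose results are memoized in an LRU cache of the given size."""

    @lru_cache(maxsize=cache_size)
    def matches(pattern, value):
        return re.fullmatch(pattern, value) is not None

    return matches


matches = make_cached_matcher(2048)
first = matches(r'repos/.+', 'repos/google/tensorflow')   # computed, stored in the cache
second = matches(r'repos/.+', 'repos/google/tensorflow')  # served from the cache
info = matches.cache_info()
```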

Caching the entire Storage backend

Some of vakt's Storages may not be very clever at filtering Policies in find_for_inquiry, especially when dealing with Rule-based policies. In this case they return the whole set of existing policies stored in the external storage. Needless to say, this makes your application heavily I/O-bound and drastically decreases performance for large policy sets. See benchmark for more details and exact numbers.

In such a case you can use EnfoldCache, which wraps your main storage (e.g. MongoStorage) into another one (meant to be some in-memory Storage). It returns a Storage that behind the scenes routes all read calls (get, get_all, find_for_inquiry, ...) to the in-memory one and all modifying calls (add, update, delete) to your main Storage (don't worry, the in-memory Storage is kept up to date with the main Storage). If a requested policy is not found in the in-memory Storage, it's considered a cache miss and the request is routed to the main Storage.

Also, in order to keep the Storages in sync, when you initialize EnfoldCache the in-memory Storage will fetch all the existing Policies from the main one, so be forewarned that it might take some time depending on the size of the policy set. Optionally you can call the populate method after initialization, but in this case never call any modifying methods of the EnfoldCache'd storage before populate(); otherwise the Storages will be out of sync and Guard functionality will be broken.

from vakt import EnfoldCache, MemoryStorage, Policy, Guard, RegexChecker
from vakt.storage.mongo import MongoStorage

storage = EnfoldCache(MongoStorage(...), cache=MemoryStorage())
storage.add(Policy(1, actions=['get']))

...

guard = Guard(storage, RegexChecker())
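
The routing idea behind such a wrapper can be sketched in a few lines of plain Python. The classes below are hypothetical stand-ins (DictStorage for both the main and the in-memory storage, WrappedStorage for the wrapper) and deliberately cover only add, get and delete. They illustrate the pattern, not EnfoldCache's actual code.

```python
class DictStorage:
    """Hypothetical minimal storage keyed by policy UID."""

    def __init__(self):
        self.items = {}

    def add(self, uid, policy):
        self.items[uid] = policy

    def get(self, uid):
        return self.items.get(uid)

    def delete(self, uid):
        self.items.pop(uid, None)

    def retrieve_all(self):
        return list(self.items.items())


class WrappedStorage:
    """Reads go to the cache first; writes go to both the main storage and the cache."""

    def __init__(self, main, cache):
        self.main = main
        self.cache = cache
        # Initial population: copy everything that already exists in the main storage.
        for uid, policy in main.retrieve_all():
            cache.add(uid, policy)

    def add(self, uid, policy):
        self.main.add(uid, policy)
        self.cache.add(uid, policy)

    def delete(self, uid):
        self.main.delete(uid)
        self.cache.delete(uid)

    def get(self, uid):
        found = self.cache.get(uid)
        if found is None:
            found = self.main.get(uid)  # cache miss: fall back to the main storage
        return found


main = DictStorage()
main.add('p1', 'existing policy')
wrapped = WrappedStorage(main, DictStorage())
wrapped.add('p2', 'new policy')
main.add('p3', 'written behind the wrapper')  # bypasses the cache on purpose
```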

Caching the Guard

Guard.is_allowed is the centerpiece of vakt, so it makes perfect sense to cache it. The create_cached_guard() function allows you to do exactly that. You need to pass it a Storage, a Checker and a maximum cache size. It returns a tuple of a Guard, a Storage and an AllowanceCache instance:

  • You must perform all policy operations with the returned storage (which is a slightly enhanced version of the Storage you provided to the function).
  • The returned Guard is a normal vakt Guard, but its is_allowed is cached with AllowanceCache.
  • The returned cache is an instance of AllowanceCache and has a handy info method that reports the current state of the cache.

How does it work?

Only the first Inquiry will be passed to is_allowed; all subsequent answers for similar Inquiries will be taken from the cache. AllowanceCache is rather coarse-grained: if you call the Storage's add, update or delete, the whole cache will be invalidated because the policy set has changed. For stable policy sets, however, it is a good performance boost.

By default AllowanceCache uses an in-memory LRU cache, and the maxsize param is its size. If for some reason it does not satisfy your needs, you can pass your own implementation of a cache backend, a subclass of vakt.cache.AllowanceCacheBackend, to create_cached_guard as the cache keyword argument.

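# Import paths below are assumed from the module names mentioned in this document
from vakt import ALLOW_ACCESS, Inquiry, Policy, RulesChecker
from vakt.cache import create_cached_guard
from vakt.rules import Eq
from vakt.storage.mongo import MongoStorage

guard, storage, cache = create_cached_guard(MongoStorage(...), RulesChecker(), maxsize=256)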

p1 = Policy(1, actions=[Eq('get')], resources=[Eq('book')], subjects=[Eq('Max')], effect=ALLOW_ACCESS)
storage.add(p1)

# Given we have some inquiries that tend to repeat
inq1 = Inquiry(action='get', resource='book', subject='Max')
inq2 = Inquiry(action='get', resource='book', subject='Jamey')

assert guard.is_allowed(inq1)
assert guard.is_allowed(inq1)
assert guard.is_allowed(inq1)
assert not guard.is_allowed(inq2)
assert guard.is_allowed(inq1)
assert guard.is_allowed(inq1)

# You can check cache state
assert 4 == cache.info().hits
assert 2 == cache.info().misses
assert 2 == cache.info().currsize
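
Why the whole cache has to be dropped when the policy set changes can be seen in a standalone sketch. The class CachedDecisions is hypothetical and built on functools.lru_cache; a set of allowed (action, resource, subject) tuples stands in for the policy storage. It is an illustration under those assumptions, not AllowanceCache itself.

```python
from functools import lru_cache


class CachedDecisions:
    """Hypothetical decision cache; inquiries must be hashable to serve as cache keys."""

    def __init__(self, decide, maxsize=256):
        self._cached = lru_cache(maxsize=maxsize)(decide)

    def is_allowed(self, inquiry_key):
        return self._cached(inquiry_key)

    def invalidate(self):
        self._cached.cache_clear()

    def info(self):
        return self._cached.cache_info()


allowed = {('get', 'book', 'Max')}  # stand-in for the policy set
decisions = CachedDecisions(lambda key: key in allowed)

jamey = ('get', 'book', 'Jamey')
before = decisions.is_allowed(jamey)   # computed: not allowed
allowed.add(jamey)                     # the "policy set" changes
stale = decisions.is_allowed(jamey)    # still the old cached answer
decisions.invalidate()                 # what a modifying storage call has to trigger
fresh = decisions.is_allowed(jamey)    # recomputed against the new set
```

Without the invalidate() call the second answer stays stale, which is why a coarse "clear everything on add, update or delete" policy is a simple way to stay correct.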

JSON

All Policies, Inquiries and Rules can be JSON-serialized and deserialized.

For example, for a Policy all you need to do is run:

from vakt.policy import Policy

policy = Policy('1')

json_policy = policy.to_json()
print(json_policy)
# {"actions": [], "description": null, "effect": "deny", "uid": "1",
# "resources": [], "context": {}, "subjects": []}

policy = Policy.from_json(json_policy)
print(policy)
# <vakt.policy.Policy object at 0x1023ca198>

The same goes for Rules and Inquiries. All custom classes derived from them support this functionality as well. If you do not derive from Vakt's classes but want this option, you can mix in the vakt.util.JsonSerializer class.

from vakt.util import JsonSerializer

class CustomInquiry(JsonSerializer):
    pass

Logging

Vakt follows a common logging pattern for libraries:

Its corresponding modules log all the events that happen but the log messages by default are handled by NullHandler. It's up to the outer code/application to provide desired log handlers, filters, levels, etc.

For example:

import logging

root = logging.getLogger()
root.setLevel(logging.INFO)
root.addHandler(logging.StreamHandler())

... # here go all the Vakt calls.

Vakt logs fall into 2 basic levels:

  1. Error/Exception - informs about exceptions and errors that occur while Vakt works.
  2. Info - informs about incoming inquiries, their resolution and the policies responsible for these decisions ('vakt.guard' and 'vakt.audit' streams).

Audit

Vakt allows you to not only watch the incoming inquiries and their resolution, but also keep track of the policies that were responsible for allowing or rejecting the inquiry. It's done via audit logging.

Audit logging is implemented within a standard Python logging framework. You can enable it by subscribing to an audit ('vakt.audit') logging "stream".

Example of configuration in the code:

import logging

logger = logging.getLogger('vakt.audit')
logger.setLevel(logging.INFO)

fmt = 'msg: %(message)s | effect: %(effect)s | deciders: %(deciders)s | candidates: %(candidates)s | inquiry: %(inquiry)s'
fileHandler = logging.FileHandler('test.log')
fileHandler.setFormatter(logging.Formatter(fmt))
fileHandler.setLevel(logging.INFO)
logger.addHandler(fileHandler)

... # here go all the Vakt calls.

Vakt logs all audit records at the INFO level.

The formatter supports the following fields:

  • message - the message that tells what happened in the audit and why.
  • effect - effect that this decision has: 'allow' or 'deny'.
  • candidates - potential policies that were filtered by storage and checkers and may be responsible for the decision.
  • deciders - policies that are responsible for the final decision.
  • inquiry - the inquiry in question.
  • all the standard Python logging fields like time, level, module name, etc.
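
Fields such as effect or deciders are not standard LogRecord attributes. In Python's logging module, custom fields like these are typically attached to a record through the extra argument, and a Formatter can then reference them by name. The sketch below shows that standard-library mechanism with a hypothetical logger name ('audit_sketch'); that vakt populates its audit records this way is an assumption.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(
    'msg: %(message)s | effect: %(effect)s | deciders: %(deciders)s'
))

audit = logging.getLogger('audit_sketch')  # hypothetical name, not 'vakt.audit'
audit.setLevel(logging.INFO)
audit.propagate = False  # keep the demo output out of the root logger
audit.addHandler(handler)

# Custom record attributes are supplied through `extra`.
audit.info('Inquiry was allowed', extra={'effect': 'allow', 'deciders': [1, 2]})
line = stream.getvalue().strip()
```

One practical consequence: a formatter that references such custom fields should only be attached to loggers whose records actually carry them, since a record without the field cannot be formatted.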

The deciders and candidates fields can be logged in various ways depending on the audit_policies_cls, which can be passed to the Guard constructor.

Vakt has the following Audit Policies message classes out of the box:

  • PoliciesNopMsg
  • PoliciesUidMsg (is the default one)
  • PoliciesDescriptionMsg
  • PoliciesCountMsg

Refer to their documentation on how they represent the policies.

WARNING. Please note that if you have Guard caching enabled, audit records for repeated identical inquiries won't be logged because the calls are cached. However, the log records from the 'vakt.guard' stream will always be logged; they only tell whether the inquiry was allowed or not.

Milestones

Most valuable features to be implemented in the order of importance:

  • SQL Storage
  • Rules that reference Inquiry data for Rule-based policies
  • Caching mechanisms (for Storage and Guard)
  • YAML-based language for declarative policy definitions
  • Enhanced audit logging
  • Redis Storage

Benchmark

You can see how much time it takes for a single Inquiry to be processed given a number of unique Policies in a Storage. For MemoryStorage it measures the runtime of the decision-making process over all the existing Policies, when Guard's code iterates the whole list of Policies to decide whether the Inquiry is allowed or not. For other Storages the mileage may vary, since they may return a smaller subset of Policies that fit the given Inquiry. Don't forget that most external Storages add some time penalty for I/O operations. The runtime also depends on the Policy type used (and thus the checker): RulesChecker performs much better than RegexChecker.

Example:

python3 benchmark.py --checker regex --storage memory -n 1000

Output is:

Populating MemoryStorage with Policies
......................
START BENCHMARK!
Number of unique Policies in DB: 1,000
Among them Policies with the same regexp pattern: 0
Checker used: RegexChecker
Storage used: MemoryStorage
Number of concurrent threads: 1
Decision for Inquiry took (mean: 0.2062 seconds. stdev: 0.0000)
Inquiry passed the guard? False

Script usage:

usage: benchmark.py [-h] [-n [POLICIES_NUMBER]] [-s {mongo,memory,sql,redis}] [-d [SQL_DSN]] [-c {regex,rules,exact,fuzzy}]
                    [-t [THREADS]] [--regexp] [--same SAME] [--cache CACHE] [--serializer {json,pickle}]

Run vakt benchmark.

optional arguments:
  -h, --help            show this help message and exit
  -n [POLICIES_NUMBER], --number [POLICIES_NUMBER]
                        number of policies to create in DB (default: 100000)
  -s {mongo,memory,sql,redis}, --storage {mongo,memory,sql,redis}
                        type of storage (default: memory)
  -d [SQL_DSN], --dsn [SQL_DSN]
                        DSN connection string for sql storage (default: sqlite:///:memory:)
  -c {regex,rules,exact,fuzzy}, --checker {regex,rules,exact,fuzzy}
                        type of checker (default: regex)
  -t [THREADS], --threads [THREADS]
                        number of concurrent requests (default: 1)

regex policy related:
  --regexp              should Policies be defined without Regex syntax? (default: True)
  --same SAME           number of similar regexps in Policy
  --cache CACHE         number of LRU-cache for RegexChecker (default: RegexChecker's default cache-size)

Redis Storage related:
  --serializer {json,pickle}
                        type of serializer for policies stored in Redis (default: json)

Back to top

Acknowledgements

Initial code ideas of Vakt are based on Amazon IAM Policies and Ladon Policies SDK as its reference implementation.

Back to top

Development

To hack Vakt locally run:

$ ...                              # activate virtual environment w/ preferred method (optional)
$ pip install -e .[dev,mongo,sql,redis]  # to install all dependencies
$ pytest -m "not integration"      # to run non-integration tests with coverage report
$ pytest --cov=vakt tests/         # to get coverage report
$ pylint vakt                      # to check code quality with PyLint

To run only integration tests (for Storage adapters other than MemoryStorage):

$ docker run --rm -d -p 27017:27017 mongo
$ # run sql and Redis database here as well...
$ pytest -m integration
$ pytest -m sql_integration

Optionally you can use make to perform development tasks.

Back to top

License

The source code is licensed under Apache License Version 2.0

Back to top

vakt's People

Contributors

kgoyal1988, kolotaev, mouslimmouden


vakt's Issues

SQL storage implementation

Since the majority of applications use an RDBMS, we need to implement SQLStorage to support integrating Vakt into existing apps.
One obvious wrapper for that is, of course, SQLAlchemy.

Add List rules

We need to add a rule for List-related checks. Proposed checks are:

  • InList
  • NotInListRule
  • AllInListRule
  • AllNotInListRule
  • AnyInListRule
  • AnyNotInListRule
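One possible reading of these checks, sketched as plain functions. The function names and the exact semantics of the negated variants are assumptions made for illustration, not the rules as implemented in vakt.

```python
def in_list(allowed, value):
    """The value is one of the allowed items."""
    return value in allowed


def not_in_list(allowed, value):
    """The value is not among the allowed items."""
    return value not in allowed


def all_in_list(allowed, values):
    """Every given value is among the allowed items."""
    return all(v in allowed for v in values)


def all_not_in_list(allowed, values):
    """Assumed reading: none of the given values is allowed."""
    return all(v not in allowed for v in values)


def any_in_list(allowed, values):
    """At least one given value is among the allowed items."""
    return any(v in allowed for v in values)


def any_not_in_list(allowed, values):
    """Assumed reading: at least one given value is not allowed."""
    return any(v not in allowed for v in values)


roles = ['admin', 'dev']
print(in_list(roles, 'dev'), all_in_list(roles, ['dev', 'qa']))
```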

Support a `filter_by` option in Guard

I am not sure whether the following fits the future vision of the package, but I wanted to see if it can help resolve the issue of all policies being retrieved from Storage for the Regex- and Rule-based checkers.

Provide a filter_by option in Guard initialization or in the Guard.is_allowed method. The Storage can then use this value to filter policies in the DB. In fact, if the MongoEngine back end is used to implement MongoStorage, the structure of filter_by can be the same as the one used in the MongoEngine package. The other option is to let it be a MongoDB query JSON.

Since the filter_by is DB dependent, a unified interface to create these filters that can be used for all storage types might be useful and part of a future feature. Some indexing strategy will also be useful.

Overall the approach has the following pros and cons:

Pros:
DB-level filtering of policies before evaluation, resulting in faster performance.

Cons:
A user may supply a filter that causes some policies to be missed during evaluation.

I don't think the con is a big issue, since the filter_by option can be reserved for advanced usage with a user warning in the docs.
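A small in-memory sketch of the trade-off. Everything here is hypothetical (the `tags` field, `find_candidates`, and the data are not vakt API); it only shows how a pre-filter narrows the candidate set and how it can hide a policy that should have been evaluated.

```python
# Hypothetical stand-ins for stored policies; not vakt objects.
policies = [
    {'uid': 1, 'tags': {'team': 'billing'}, 'actions': ['read']},
    {'uid': 2, 'tags': {'team': 'search'}, 'actions': ['read']},
    {'uid': 3, 'tags': {}, 'actions': ['read']},  # untagged, meant for everyone
]


def find_candidates(filter_by=None):
    """Return policies whose tags contain every key/value in filter_by."""
    if not filter_by:
        return list(policies)
    return [
        p for p in policies
        if all(p['tags'].get(k) == v for k, v in filter_by.items())
    ]


everything = find_candidates()
narrowed = find_candidates({'team': 'billing'})
print([p['uid'] for p in everything])  # [1, 2, 3]
print([p['uid'] for p in narrowed])    # [1]
```

The narrowed result skips policy 3 even though it was meant to apply to everyone, which is exactly the con described above.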

Is role management supported?

Hi there,
I've been looking through this library to solve a complex authorization problem I have to handle. Looking at the examples, there is one where you use the "role" attribute of the subject to decide whether to give access to a developer. In the RBAC system I currently have, users have a role (e.g. "developer"). I query the authorization service with just the username; the service resolves that the user is a developer and then decides whether to grant access (the request never knows what the user's role might be).

Is this kind of capability supported? Thank you!

Convenient attribute-based checker

Regexes and strings are sometimes not very handy for attribute manipulation as envisioned by classic ABAC implementations.

We need to add a new Checker type that will give the ability to:

  • add/delete attributes to policies
  • support various comparison operators for attribute match

Setting objects' attributes in vakt

Hi Egor,

I'm looking into your library, and I would like to understand whether it is possible to attach attributes to entities as per the ABAC model (https://en.wikipedia.org/wiki/Attribute-based_access_control#Attributes). So far, I see that I can use contextual attributes in Roles, but I can't see how I can add subject, object and resource attributes.

For example, how do I implement the following simple scenario with vakt: users register continuously; a user has an attribute "collaborator" with a list of dataset ids; when a new id is added to this list, the user can update the dataset with this id.

Broken readme link

The README contains this section:

Description

Vakt is an attribute-based and policy-based access control (ABAC) toolkit that is based on policies. ABAC stands aside of RBAC and ACL models, giving you a fine-grained control on definition of the rules that restrict an access to resources and is generally considered a "next generation" authorization model. In its form Vakt resembles IAM Policies, but has a way nicer attribute managing.

However the link: https://github.com/awsdocs/iam-user-guide/blob/master/doc_source/access_policies.md is broken.

Error when using vakt.rules.string.RegexMatchRule

Hello!
I want to write this kind of policy:

Policy(
    uid=str(8),
    effect=DENY_ACCESS,
    subjects=['<.*>'],
    resources=['vms'],
    actions=['pause'],
    rules={
        'vm': vakt.rules.string.RegexMatchRule('m.*'),
    },
)

And I get this kind of error:


Traceback (most recent call last):
  File "creator.py", line 54, in <module>
    st.add(p)
  File "/home/skotti/attribute_conf/vakt/vakt/storage/mongo.py", line 36, in add
    self.collection.insert_one(self.__prepare_doc(policy))
  File "/home/skotti/attribute_conf/vakt/vakt/storage/mongo.py", line 110, in __prepare_doc
    doc = b_json.loads(policy.to_json())
  File "/home/skotti/attribute_conf/vakt/vakt/util.py", line 33, in to_json
    default=lambda o: o.to_json() if isinstance(o, JsonSerializer) else vars(o))
  File "/usr/lib/python3.5/json/__init__.py", line 237, in dumps
    **kw).encode(obj)
  File "/usr/lib/python3.5/json/encoder.py", line 198, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.5/json/encoder.py", line 256, in iterencode
    return _iterencode(o, 0)
  File "/home/skotti/attribute_conf/vakt/vakt/util.py", line 33, in <lambda>
    default=lambda o: o.to_json() if isinstance(o, JsonSerializer) else vars(o))
  File "/home/skotti/attribute_conf/vakt/vakt/util.py", line 33, in to_json
    default=lambda o: o.to_json() if isinstance(o, JsonSerializer) else vars(o))
  File "/usr/lib/python3.5/json/__init__.py", line 237, in dumps
    **kw).encode(obj)
  File "/usr/lib/python3.5/json/encoder.py", line 198, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.5/json/encoder.py", line 256, in iterencode
    return _iterencode(o, 0)
  File "/home/skotti/attribute_conf/vakt/vakt/util.py", line 33, in <lambda>
    default=lambda o: o.to_json() if isinstance(o, JsonSerializer) else vars(o))

This happens only with this regular-expression rule; with StringEqualRule, for example, everything is fine.

How to use ResourceIn

Hi!
This is a question.
I'm finding ABAC hard to understand, as this is my first time implementing it.
I need help creating a policy to "Allow a store user to read the open orders of its store only". I have looked for tests or examples using ResourceIn, but found only a very simple rules test, so I'm not sure how to use it. This is what I have:

policy = PolicyAllow(
    1,
    description="Allow a store owner to read orders only for their store",
    subjects=["manager"],
    resources=["order:<.+>"],
    actions=["read"],
    context={"store": ResourceIn()},
)

storage = MemoryStorage()
storage.add(policy)
guard = Guard(storage, RegexChecker())

i = Inquiry(
    subject="manager",
    resource="order:store1",
    action="read",
    context={"store": "order:store1"},
)

print(guard.is_allowed(i))  # False

How can I filter orders by store ID and check the open state of each order?

Possible high severity issue which exposes the Werkzeug debugger and allows the execution of arbitrary code

➜  vakt git:(master) bandit -r ./ -lll
[main]	INFO	profile include tests: None
[main]	INFO	profile exclude tests: None
[main]	INFO	cli include tests: None
[main]	INFO	cli exclude tests: None
[main]	INFO	running on Python 3.10.8
Working... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
Run started:2022-12-25 20:21:25.524801

Test results:
>> Issue: [B201:flask_debug_true] A Flask app appears to be run with debug=True, which exposes the Werkzeug debugger and allows the execution of arbitrary code.
   Severity: High   Confidence: Medium
   CWE: CWE-94 (https://cwe.mitre.org/data/definitions/94.html)
   More Info: https://bandit.readthedocs.io/en/1.7.5/plugins/b201_flask_debug_true.html
   Location: ./examples/regex-policies/server.py:158:4
157	    init()
158	    app.run(debug=True)

--------------------------------------------------

Code scanned:
	Total lines of code: 8214
	Total lines skipped (#nosec): 0

Run metrics:
	Total issues (by severity):
		Undefined: 0
		Low: 773
		Medium: 20
		High: 1
	Total issues (by confidence):
		Undefined: 0
		Low: 13
		Medium: 5
		High: 776
Files skipped (0):

The output above is the result of running Bandit (https://github.com/PyCQA/bandit).

Mongodb storage more selective filter query for `_create_filter`

Hi,

As per the current implementation, the MongoDB storage returns all documents for RegexChecker. This will lead to a performance hit when there is a large number of policies in the database. Maybe a temporary solution is to create a more selective query using the following

        operator = "$regex"
        conditions = []
        for field in ['subjects', 'resources', 'actions']:
            conditions.append(
                {
                    field: {
                        '$elemMatch': {
                            operator: get_value(field.rstrip('s'))
                        }
                    }
                }
            )
        return {"$or": conditions}  # note the use of $or instead of $and

until the issue https://jira.mongodb.org/browse/SERVER-11947 is fixed? This way, better performance can be achieved at scale.
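For reference, the query document this snippet would produce can be seen with a standalone sketch. Here `build_or_filter` and the plain-dict inquiry are hypothetical stand-ins for the storage method and its `get_value` helper, and no MongoDB connection is involved.

```python
def build_or_filter(inquiry):
    """Build the proposed filter from a dict of inquiry values.

    `inquiry` is a plain dict here, standing in for get_value().
    """
    conditions = []
    for field in ['subjects', 'resources', 'actions']:
        value = inquiry[field.rstrip('s')]  # 'subjects' -> 'subject'
        conditions.append({field: {'$elemMatch': {'$regex': value}}})
    return {'$or': conditions}


query = build_or_filter(
    {'subject': 'Max', 'resource': 'repos/google', 'action': 'fork'}
)
print(query)
```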

Thanks!

use inquiries data in policies

Hi,

Is it possible to use inquiry parameters in an element of the policy?

The use case is to limit an API call to only the rows that belong to a particular user
(stored in the example below in 'route_instance_id').

I want to authorise the request only if the route_instance_id in the inquiry resource matches the user_id in the inquiry subject.

The only way I've found so far is to add a dynamic policy on each request, but that is not very efficient and is hard to maintain.

ex:

inquiry = vakt.Inquiry(
    subject={"user_id": 42, "roles": ["user"]},
    action="GET",
    resource={
        'route_endpoint': 'test_endpoint',
        'route_instance_id': 42,
        'query_sort': '-id',
        'query_foo': 'bar',
    },
    context={"ip": "127.0.0.1"},
)

against policy:

policy_user_test = vakt.Policy(
        uuid.uuid4().hex,
        resources=[
            {
                "route_endpoint": Eq("test_endpoint"),
                "route_instance_id": Inquiry('subject', 'user_id'),
            }
        ],  # uri
        actions=[Eq("GET")],
        subjects=[{"roles": AnyIn("user")}],
        effect=vakt.ALLOW_ACCESS,
        description="""
        Allow get for only its own instance
        """,
    )

I've seen there was initial support before 1.2 for strings with SubjectEqual & co., but it has been dropped and did not support dicts. Is there a reason for dropping it?

I think the use case is quite common (the above is one example, but I have plenty of use cases like this one for the API).

Or perhaps there is another way to do the same thing that I've overlooked?

Thanks,
Regards,
Thomas.

Rule based on foreign key relationship

hi,

I have a series of relationships set up like this:

User
id
customer_id

Customer
id
distributor_id

Distributor
id

I have an endpoint like /user/, and I want to restrict users so that they can only see other users whose user.customer.distributor_id is the same as their own user.customer.distributor_id.

How would I go about setting up a rule for a scenario like this? Is it even possible to abstract these kinds of relationships into an ABAC schema?

Using Vakt with Pandas

Hi, I would like to implement access control for a pandas DataFrame. I have set up all the policies...

I have a data frame called df, to which I have attached some attributes...

I would like to restrict query functions. I would like to restrict a user from using the following command for example:

df.info or df.loc[1:10] (those are basic examples).

I do not see how I can apply vakt for this use case.

Thank you so much for your answer.

GraphDB support

Hello,
Do you have any plans to support graph databases, for instance Neo4j?

Usage example on README file doesn't work

The code sample fails to add a rule for the context in the policy, i.e.
instead of context={'referer': 'https://github.com'}, it should be context={'referer': Eq('https://github.com')}.

[Feature][Performance] Use object instead of dict

While using vakt, I found that it's hard to build a common Inquiry. Since vakt does not support dynamic attribute retrieval, I have to provide all the information up front to make an Inquiry, and that is where the waste comes from: some information is not needed but is computed anyway.

If vakt built the Inquiry from an object, we could delay some computations, or never execute them at all, by defining them with @functools.cached_property.

For example, policy A requires the attribute subject.is_friend_of_my_friends (assume it is heavy to compute):

  • policy A is missed: the computation is never executed
  • policy A is hit: the attribute is computed as usual
    • if other policies require subject.is_friend_of_my_friends too, the cache returns the value with no extra computation

And this makes vakt more like Attribute-Based Access Control, doesn't it?
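The lazy behaviour described here can be sketched with the standard library alone. The `Subject` class and the `calls` list are hypothetical and exist only to show when the computation runs; note that `functools.cached_property` requires Python 3.8 or newer.

```python
from functools import cached_property  # available since Python 3.8

calls = []  # records every real computation, for demonstration only


class Subject:
    """Hypothetical subject object whose expensive attribute is lazy."""

    def __init__(self, name):
        self.name = name

    @cached_property
    def is_friend_of_my_friends(self):
        calls.append(self.name)
        return self.name in {'Max', 'Jim'}  # stand-in for heavy work


subject = Subject('Max')
computed_before_access = len(calls)       # nothing has run yet
first = subject.is_friend_of_my_friends   # computed on first access
second = subject.is_friend_of_my_friends  # served from the cache
print(first, second, calls)
```

If no policy ever reads the attribute, the property body never runs, which is the saving the proposal is after.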

Where to find stored policies?

Hi,

First of all, thank you so much for your great job.

I would like to know where all the previously created policies are stored.

Thank you in advance.

Create caching mechanism for Storages

Since external persistent storages may be heavily I/O-bound for Rule-based Policies (they return the whole set of existing policies), we need a caching mechanism that will alleviate slow performance with in-memory solutions.

For example: MemoryStorage shows 3 seconds per decision for a 1 million policy set on PyPy and 12 seconds on Python 3.7, which is an order of magnitude faster than MongoStorage.
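As a rough illustration of the idea (not the mechanism vakt ends up with), an in-memory cache can sit in front of a slow lookup. The function names and the returned policy ids below are hypothetical; only `functools.lru_cache` is real.

```python
from functools import lru_cache

backend_reads = []  # records every call that reaches the slow backend


def slow_find_policies(subject, resource, action):
    """Hypothetical stand-in for an I/O-bound storage lookup."""
    backend_reads.append((subject, resource, action))
    return ('policy-1', 'policy-2')  # a tuple, so the cached value is immutable


@lru_cache(maxsize=1024)
def cached_find_policies(subject, resource, action):
    return slow_find_policies(subject, resource, action)


cached_find_policies('Max', 'repos/google', 'fork')
cached_find_policies('Max', 'repos/google', 'fork')  # cache hit, no I/O
print(len(backend_reads))
print(cached_find_policies.cache_info().hits)
```

Any such cache has to be invalidated (here with `cached_find_policies.cache_clear()`) whenever policies are added, updated, or deleted, otherwise decisions are made on stale data.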

Proposal for new features

@kolotaev Wanted to propose a couple of features which can enhance the usability and power of the project.

  1. Have a user-friendly JSON schema for marshaling the policy objects. Currently, the jsonpickle package is used to load and convert policies from JSON. However, this serializer inserts py/object fields to keep track of the classes for deserialization. As an example, the policy
vakt.Policy(
    str(uuid.uuid4()),
    actions=[Eq('fork'), Eq('clone')],
    resources=[StartsWith('repos/Google', ci=True)],
    subjects=[{'name': Any(), 'stars': And(Greater(50), Less(999))}],
    effect=vakt.ALLOW_ACCESS,
    context={'referer': Eq('https://github.com')},
    description="""
    Allow to fork or clone any Google repository for
    users that have > 50 and < 999 stars and came from Github
    """
)

has the following JSON form:

{
  "actions": [
    {
      "py/object": "vakt.rules.operator.Eq",
      "val": "fork"
    },
    {
      "py/object": "vakt.rules.operator.Eq",
      "val": "clone"
    }
  ],
  "context": {
    "referer": {
      "py/object": "vakt.rules.operator.Eq",
      "val": "https://github.com"
    }
  },
  "description": "\\n    Allow to fork or clone any Google repository for\\n    users that have > 50 and < 999 stars and came from Github\\n    ",
  "effect": "allow",
  "resources": [
    {
      "py/object": "vakt.rules.string.StartsWith",
      "ci": true,
      "val": "repos/Google"
    }
  ],
  "subjects": [
    {
      "name": {
        "py/object": "vakt.rules.logic.Any"
      },
      "stars": {
        "py/object": "vakt.rules.logic.And",
        "rules": {
          "py/tuple": [
            {
              "py/object": "vakt.rules.operator.Greater",
              "val": 50
            },
            {
              "py/object": "vakt.rules.operator.Less",
              "val": 999
            }
          ]
        }
      }
    }
  ],
  "type": 2,
  "uid": "4d7f9d40-0ef7-41e4-a649-4450cc5be9a8"
}

This JSON has fields which are either unclear (like "ci") or not user-friendly (like "py/object"). I think this can be resolved by using a better marshalling package like marshmallow. I created such an implementation in this forked version of the code --> https://github.com/ketgo/pyabac.

  2. Use of an objectPath format for attributes in Policy. This object path can be used to extract the value of the attribute from the Inquiry. In this way we can support nested attribute-based access control. For example, if we have the following inquiry
vakt.Inquiry(
  subject={"name": "Max", "address": {"city": "Boston", "state": "MA"}},
  resource={"url": "/api/v1.0/users"},
  action={"method": "GET"}
)

and want to set a policy that includes a rule on the city in the address field, we can do so as follows

vakt.Policy(
  subjects=[{"$.address.city": Eq("Boston")}],
  action=[{"$.method": Eq("GET")}],
  effect=ALLOW_ACCESS 
)

Here the string $.address.city is in object path format. Again, I have a working implementation in the repo --> https://github.com/ketgo/pyabac.
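To illustrate what resolving such a path involves, here is a minimal sketch. `resolve_path` is a hypothetical helper written for this example; it is not taken from vakt or from the linked repository, and it handles only dotted dict keys.

```python
def resolve_path(data, path):
    """Resolve a '$.a.b' style path against nested dicts.

    Returns None when any segment is missing. Only dict keys are
    supported in this sketch (no list indexes or wildcards).
    """
    if not path.startswith('$.'):
        raise ValueError('path must start with "$."')
    current = data
    for key in path[2:].split('.'):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


subject = {'name': 'Max', 'address': {'city': 'Boston', 'state': 'MA'}}
city = resolve_path(subject, '$.address.city')
missing = resolve_path(subject, '$.address.zip')
print(city, missing)
```

A rule such as Eq("Boston") would then be evaluated against the resolved value rather than against the whole subject dict.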
