accuknox / discovery-engine
Discover least permissive security posture, network microsegmentation, and application behaviour based on visibility/observability data emitted from policy engines.
In addition to reading the Cilium traffic information from the database,
we should have an option to connect to the Cilium Hubble relay and fetch the traffic information at each time interval.
Right now, if we apply the Cilium network policies generated by our libraries, some of the network flows get dropped.
Specifically, the dropped flows correspond to egress communication from the pods to external IPs, as we can see in the screenshot below.
We could add a function that checks whether a packet is sent to an IP that, after DNS resolution, does not map to any pod. If so, we drop the egress rules from the Cilium network policy altogether and keep only the ingress rules for that pod. Such traffic can mean that the pod communicates with external entities, and we cannot possibly have rules that whitelist every possible CIDR.
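The check described above could look roughly like the following minimal sketch. The flow and policy shapes (a `dst_ip` key, a policy dict with `ingress` and `egress` keys) and both function names are hypothetical, and the sketch assumes destinations have already been resolved to IP addresses.

```python
import ipaddress


def has_external_egress(flows, pod_ips):
    """Return True if any flow targets an IP that is not a known pod IP.

    `flows` is a list of dicts with a hypothetical "dst_ip" key.
    """
    known = {ipaddress.ip_address(ip) for ip in pod_ips}
    return any(ipaddress.ip_address(f["dst_ip"]) not in known for f in flows)


def strip_egress_if_external(policy, flows, pod_ips):
    """Return a copy of `policy` without egress rules if external egress was seen."""
    if has_external_egress(flows, pod_ips):
        return {k: v for k, v in policy.items() if k != "egress"}
    return dict(policy)


flows = [{"dst_ip": "10.0.0.5"}, {"dst_ip": "142.250.1.1"}]
policy = {"selector": {"app": "web"}, "ingress": ["r1"], "egress": ["r2"]}
result = strip_egress_if_external(policy, flows, pod_ips=["10.0.0.5"])
```

Here the second flow goes to an address that is not in the pod list, so `result` keeps only the selector and the ingress rules.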
Policy aggregation level
Selector side: label aggregation level
Target side: label aggregation level
Target side: port aggregation
Ignore flows
Ignore policies
Add 8 test cases for system policy discovery
matchPaths (file operation with fromSource)
matchDirectories (file operation with fromSource)
matchPaths (file operation w/o fromSource)
matchDirectories (file operation w/o fromSource)
matchPaths (process operation with fromSource)
matchDirectories (process operation with fromSource)
matchPaths (process operation w/o fromSource)
matchDirectories (process operation w/o fromSource)
Update configuration manager for system policy operations (i.e., file and process)
What happens if we have policies having overlapping rules?
Let's say we have a discovered policy { name: policy_1, label: xyz, rule1, rule2 } which gets added to the policy group. At a later point in time, another policy is discovered: { name: policy_2, label: xyz, rule1, rule2, rule3 }.
policy_2 renders policy_1 useless, i.e., policy_2 has the same labels and all the rule sets of policy_1, plus more. However, with respect to enforcement, both policies can be applied at the same time in the backend. This would not cause any problems.
It is possible to find redundant policies in all the groups by running a policy trace simulation engine. However, this would be an optimization and not a basic requirement. This issue handles this point.
In the future, we can have an icon alongside a policy that signals it to be a redundant policy and on clicking that icon we show a list of policies against which it matches.
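As a rough illustration of the redundancy check (not the policy trace simulation mentioned above), a plain set comparison already catches the policy_1/policy_2 case. The dict layout and the `find_redundant` name are assumptions made for this sketch, and rules are treated as opaque hashable values.

```python
def find_redundant(policies):
    """Map each redundant policy name to the names of the policies covering it.

    A policy is treated as redundant when another policy has the same labels
    and a strict superset of its rules.
    """
    redundant = {}
    for p in policies:
        covers = [
            q["name"]
            for q in policies
            if q is not p
            and q["labels"] == p["labels"]
            and set(p["rules"]) < set(q["rules"])  # strict subset
        ]
        if covers:
            redundant[p["name"]] = covers
    return redundant


policies = [
    {"name": "policy_1", "labels": {"label": "xyz"}, "rules": ["rule1", "rule2"]},
    {"name": "policy_2", "labels": {"label": "xyz"}, "rules": ["rule1", "rule2", "rule3"]},
]
redundant = find_redundant(policies)
```

The returned mapping is the kind of data the proposed icon could show: for each redundant policy, the list of policies it matches against.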
src/core unit-tests
src/libs unit-tests
As of now, if at least one KubeArmorPolicy is applied, KubeArmor no longer generates system logs.
Instead, it generates system alert events.
Thus, we also need to discover system policies from the system alert events so that we do not miss any system policy.
Tasks involved:
knoxAutoPolicy depends on other modules such as Network flow, MongoDB, and knoxServicePolicy. Configure the Helm charts so that all the services are deployed appropriately on a new cluster and the information is shared correctly.
Provide an option to save and show the network logs (flows) from which a network policy was derived.
For example:
discovered policy (web->db:3066/TCP) ---> based on the network logs (flows): network log A, network log B, ... and so on
Rule-sets could be based on protocol, port, HTTP attributes, FQDNs etc. For detailed rule-sets, please check:
https://docs.google.com/spreadsheets/d/1ty2ZPWCalCGoDsEqB6-2w2H9f08RVG7k9xqSWG1N37E/edit#gid=1740895243
The module should check whether the newly detected policy is already present in the database/table. The matching has to be done based on Selector Labels and the RuleSets. It is a strict match, i.e., all the labels and all the rules have to match.
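The strict match could be sketched as below. It assumes policies are dicts with a `labels` mapping and a `rules` list of hashable values; `policy_key` and `is_known` are illustrative names, not the module's actual API.

```python
def policy_key(policy):
    """Build an order-independent key from selector labels and rule sets."""
    return (frozenset(policy["labels"].items()), frozenset(policy["rules"]))


def is_known(policy, stored):
    """Return True only if a stored policy matches on all labels and all rules."""
    key = policy_key(policy)
    return any(policy_key(s) == key for s in stored)


stored = [{"labels": {"app": "web"}, "rules": ["tcp/80", "tcp/443"]}]
same = {"labels": {"app": "web"}, "rules": ["tcp/443", "tcp/80"]}
extra_rule = {"labels": {"app": "web"}, "rules": ["tcp/80", "tcp/443", "udp/53"]}
```

With these inputs, `is_known(same, stored)` holds regardless of rule order, while `is_known(extra_rule, stored)` does not, because one additional rule breaks the strict match.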
Testing the daemon in cilium cluster deployed within GKE to see
Update github document for system policy discovery parts
Then, we can test a test case by pushing inputs (flow) and outputs (expected policies).
provide an option to select a time range for a one-time discovery
- from X to Y
- before X minutes/hours/days/weeks from now
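Both forms of the time range can be reduced to a (start, end) pair before querying flows. The sketch below is one possible shape; the function names are hypothetical and times are assumed to be timezone-aware `datetime` values.

```python
from datetime import datetime, timedelta, timezone

UNITS = {"minutes", "hours", "days", "weeks"}


def absolute_range(start, end):
    """'from X to Y': validate and return the pair unchanged."""
    if start >= end:
        raise ValueError("start must be earlier than end")
    return start, end


def relative_range(amount, unit, now=None):
    """'before X minutes/hours/days/weeks from now' as a (start, end) pair."""
    if unit not in UNITS:
        raise ValueError(f"unsupported unit: {unit}")
    now = now or datetime.now(timezone.utc)
    return now - timedelta(**{unit: amount}), now


fixed_now = datetime(2021, 1, 8, 12, 0, tzinfo=timezone.utc)
start, end = relative_range(2, "weeks", now=fixed_now)
```

Passing `now` explicitly keeps the function easy to test; in normal use it can be omitted.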
To see the payload of packets in Cilium Hubble, we need to annotate the pod with io.cilium.proxy-visibility=<{Traffic Direction}/{L4 Port}/{L4 Protocol}/{L7 Protocol}>.
For example: kubectl annotate pod foo -n bar io.cilium.proxy-visibility="<Egress/53/UDP/DNS>,<Egress/80/TCP/HTTP>"
Then, Cilium forwards the packets that match the annotation to the Envoy proxy to obtain their payload.
Finally, we can generate L7 network policies based on that information.
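If the annotation value has to be generated for many pods, a small helper can assemble it from the format given above. The `visibility_annotation` name and the tuple layout are assumptions for this sketch.

```python
def visibility_annotation(entries):
    """Build the io.cilium.proxy-visibility value.

    `entries` is a list of (direction, l4_port, l4_protocol, l7_protocol) tuples.
    """
    return ",".join(f"<{d}/{port}/{l4}/{l7}>" for d, port, l4, l7 in entries)


value = visibility_annotation([
    ("Egress", 53, "UDP", "DNS"),
    ("Egress", 80, "TCP", "HTTP"),
])
# Arguments for the kubectl command from the example (pod "foo", namespace "bar").
command = ["kubectl", "annotate", "pod", "foo", "-n", "bar",
           f"io.cilium.proxy-visibility={value}"]
```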
Here, what is our strategy to monitor the port number/protocol for discovering network policies?
Use-case:
For flexibility, we should be able to discover network policies in three different modes.
Currently, database access uses a user/password passed through environment variables. This needs to be done through a secrets manager.
Currently, if an internal pod accesses an external service, every flow may go to a different IP address. In the flow information, we see each of these as an external IP access, and thus a toCIDR egress policy is discovered for every such flow. The external service may be hosted on hundreds of different IP addresses, which is a problem because it results in a different toCIDR policy every time.
We need to use reverse DNS lookup to convert the IP address to a domain name, and then aggregate the policies over a period of time.
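A minimal sketch of the reverse-lookup step, using the standard library's `socket.gethostbyaddr`. The `aggregate_by_domain` helper and its injectable `resolve` parameter are assumptions for illustration. Note that PTR records of hosted services often do not point back to the service's own domain, so the result should be treated as a hint rather than ground truth.

```python
import socket
from collections import defaultdict


def reverse_lookup(ip):
    """Return the primary host name for `ip`, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def aggregate_by_domain(dst_ips, resolve=reverse_lookup):
    """Group destination IPs by resolved domain name.

    Returns (domain -> set of IPs, set of IPs that could not be resolved).
    """
    domains = defaultdict(set)
    unresolved = set()
    for ip in dst_ips:
        name = resolve(ip)
        if name:
            domains[name].add(ip)
        else:
            unresolved.add(ip)
    return dict(domains), unresolved


# Hypothetical lookup table standing in for real DNS, for demonstration only.
fake_dns = {"203.0.113.1": "api.example.com", "203.0.113.2": "api.example.com"}
domains, unresolved = aggregate_by_domain(
    ["203.0.113.1", "203.0.113.2", "198.51.100.7"], resolve=fake_dns.get
)
```

Each domain in the result could then become one domain-based egress rule, while unresolved addresses fall back to toCIDR.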
For now, the HTTP rule (method/path) is handled by exact matching. So, for example, if each product page is identified by a product ID, there can be many paths, as follows.
apiVersion: v1
kind: KnoxNetworkPolicy
metadata:
  name: autopol-egress-thbttgjfepzbadb
  namespace: hipster
  rule: matchLabels+toHTTPs+toPorts
  status: latest
  type: egress
spec:
  selector:
    matchLabels:
      app: loadgenerator
  egress:
  - matchLabels:
      app: frontend
      k8s:io.kubernetes.pod.namespace: hipster
    toPorts:
    - port: "8080"
      protocol: tcp
    toHTTPs:
    - method: GET
      path: /product/6E92ZMYYFZ
    - method: GET
      path: /product/9SIQT8TOJO
    - method: GET
      path: /cart
    - method: GET
      path: /product/0PUK6V6EV0
    - method: GET
      path: /product/OLJCESPC7Z
    - method: GET
      path: /product/1YMWWN1N4O
    - method: GET
      path: /product/2ZYFJ3GM2N
    - method: GET
      path: /product/66VCHSJNUP
  action: allow
generatedTime: 1608101559
So we need to handle the multiple paths of an HTTP rule by aggregating them. The open questions are how to merge them, and on what grounds.
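One possible merge strategy, sketched under assumptions: paths that share a method and a parent directory are collapsed into a single wildcard once there are more than `threshold` distinct siblings. The threshold value, the `/.*` suffix (which assumes the enforcing engine interprets paths as regular expressions), and the `aggregate_paths` name are all illustrative choices, not the project's decided approach.

```python
from collections import defaultdict


def aggregate_paths(rules, threshold=3):
    """Collapse sibling paths into '<parent>/.*' when they exceed `threshold`.

    `rules` is a list of (method, path) tuples. Top-level paths such as
    '/cart' are never collapsed.
    """
    groups = defaultdict(set)
    for method, path in rules:
        parent, _, _leaf = path.rpartition("/")
        groups[(method, parent)].add(path)

    merged = []
    for (method, parent), paths in groups.items():
        if parent and len(paths) > threshold:
            merged.append((method, parent + "/.*"))
        else:
            merged.extend((method, p) for p in paths)
    return sorted(merged)


rules = [
    ("GET", "/product/6E92ZMYYFZ"), ("GET", "/product/9SIQT8TOJO"),
    ("GET", "/cart"), ("GET", "/product/0PUK6V6EV0"),
    ("GET", "/product/OLJCESPC7Z"), ("GET", "/product/1YMWWN1N4O"),
    ("GET", "/product/2ZYFJ3GM2N"), ("GET", "/product/66VCHSJNUP"),
]
merged = aggregate_paths(rules)
```

Applied to the policy above, the seven product paths collapse into one rule and `/cart` stays exact. The trade-off is that the wildcard also allows product IDs that were never observed.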
Provide a CLI
After a certain point, the set of discovered policies should saturate.
If so, check how much time (or how many network flows) we need for the Google hipster app.
Performance test with Cilium L7 visibility
base (no L7 visibility)
HTTP visibility
will be tested by Apache Bench
A library that takes a set of flows as input and discovers rules. The library is generic and can be used equally for flows from Cilium or Sysdig.
Use MySQL rather than MongoDB to interact with other services
Install prerequisite software
In the cron-job operation,
if we discover a new policy that has the same selector as a previous one,
we should merge the old policy into the new policy.
When merging those policies, we should also take the file/process path aggregation into account.
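A simplified sketch of the merge step, assuming policies are dicts with a `selector` mapping and a `rules` list of path strings. It only unions the rules; the file/process path aggregation mentioned above is deliberately left out, and `merge_policies` is a hypothetical name.

```python
def merge_policies(old, new):
    """Merge `old` into `new` when both share the same selector.

    The result keeps the metadata of `new` and the union of both rule lists.
    """
    if old["selector"] != new["selector"]:
        raise ValueError("selectors differ; policies cannot be merged")
    merged = dict(new)
    merged["rules"] = sorted(set(old["rules"]) | set(new["rules"]))
    return merged


old = {"name": "p-old", "selector": {"app": "web"}, "rules": ["/bin/sh", "/etc/passwd"]}
new = {"name": "p-new", "selector": {"app": "web"}, "rules": ["/etc/passwd", "/usr/bin/curl"]}
merged = merge_policies(old, new)
```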
Another option was to keep the auto-discovered policies in a separate folder in the same git-server repo. But the knoxAutoPolicy daemon might then have to periodically check whether a newly discovered policy was already discovered previously. Secondly, version control cannot be applied to the discovered policies. Hence, keeping them in a separate DB makes sense.
Verify that the discovered policy allows only the specified traffic (or the specified behavior).
If we discover a policy that only allows role=frontend to communicate with role=backend, knoxAutoPolicy should verify that only those flows are impacted by going through all the flows in the database (for example, using cilium policy trace).