mitre-attack / attack-datasources

This content is analysis and research of the data sources currently listed in ATT&CK.

License: Apache License 2.0

Jupyter Notebook 100.00%

attack-datasources's Issues

Questions about prior art and specific mappings

Thank you for this! We love ATT&CK, but the data sources sections have always felt a bit "loose" and left mostly as an exercise for the reader. The blog series and this repo prompted a couple questions I hoped you could discuss:

Why not use/extend an existing schema for the abstractions?

For example, STIX Cyber-observable Objects (SCO) cover some of the same ground, and link nicely with STIX-formatted intel ... like ATT&CK itself. The spec for the objects and their relationships reads a bit like your yaml data sources, and they can be reified with real data. Seems like STIX SCO is a natural fit, plus it has a well-thought-out relationship model, serialization format, extensions, etc.

The Elastic Common Schema (ECS) is great too: it's permissively licensed, available for collaboration on GitHub, has abstractions for many of the examples you provide (users, processes, etc.), and is already powering a lot of searches, visualizations, and analytics. We see it more in ops contexts, and it's perhaps a bit more flexible than SCOs. For example, you frequently see it merged with existing event data, so you get the benefit of the abstractions without sacrificing the specificity of the original event.

One of the beautiful things about ATT&CK is that it reduced bike-shedding over terminology and helped the infosec community focus. STIX and ECS have put in a lot of similar work, so it seems good to stand on the shoulders of giants. Naming things is hard, and it takes time to overcome intuitions (even at the top level: e.g., to my ear the phrase "data source" connotes the place you get the data, rather than an abstraction of the observable, but I'm just one guy 🙂).

In any case, if ATT&CK leveraged one of these for the abstract entities, seems you could save energy for more ATT&CK-specific work like mapping those to (sub-)techniques or the actual concrete logs/artifacts.

Are there plans to be more specific about mappings to artifacts?

Presumably the idea is that (sub-)techniques would eventually use these new abstract data sources to replace or augment the text in the current "Data Sources" section. Unfortunately, unless I'm missing something, the proposed model doesn't seem to have a way to capture links to the concrete logs/artifacts.

For example, the mapping example in figure 13 in part 2 of the blog series illustrates this last step:

That is, it shows links from the data components to specific event logs on the right, and that last step is really useful ... but it doesn't actually live anywhere in this repo's proposed approach. For many teams that last leg is the hard part! If we took your schema, for example, maybe added something like:
```
- name: Service
  definition: Information about software programs that run in the background ...
  example_artifacts:
    - {os: windows, artifact: Security Audit Event 4688}
    - {os: windows, artifact: Sysmon Event 1}
    - {os: windows, artifact: Prefetch file}
    - {os: linux, artifact: auditd SYSCALL event}
    - {os: linux, artifact: auditd EXECVE event}
    # etc
```
Perhaps this is considered out of scope, but hopefully not; it'd be great to see something as authoritative as ATT&CK pointing folks to specific useful artifacts rather than just the abstraction. I'd love to hear your thoughts.

Thanks again for your hard work on this and all the related projects, I look forward to learning more!

Support NIDS and WAF via new 'network traffic content' relationship

Hello.

With the new data source (DS) structure, NIDS and WAF are no longer available as data sources. A new relationship could be added to improve the mapping to alert-related events:

Data source: Network Traffic
Data component: network traffic content
Relationship:

  - source_data_element: network traffic
    relationship: triggered
    target_data_element: alert

Thanks in advance.

.nan appears in techniques_to_components_mapping.yaml file when there is no data source

The code that is generating the techniques_to_components_mapping.yaml seems to be writing .nan when there is no data source. Perhaps these should be left blank or omitted.
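
One way this could be handled, as a minimal sketch: a float NaN (which is how pandas represents a missing value) is serialized by PyYAML as `.nan`, so stripping NaN values from each record before dumping would omit the key entirely. The `drop_missing` helper and the record layout below are assumptions for illustration, not the repository's actual generator code.

```python
import math


def drop_missing(value):
    """Recursively remove NaN floats (and None) from dicts and lists."""
    if isinstance(value, dict):
        cleaned = {key: drop_missing(item) for key, item in value.items()}
        return {key: item for key, item in cleaned.items() if item is not None}
    if isinstance(value, list):
        cleaned = [drop_missing(item) for item in value]
        return [item for item in cleaned if item is not None]
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


# Hypothetical record shaped like one entry of the mapping file;
# "TXXXX" is a placeholder, not a real technique ID.
record = {
    "technique_id": "TXXXX",
    "data_source": float("nan"),
    "data_component": float("nan"),
}
print(drop_missing(record))  # {'technique_id': 'TXXXX'}
```

Running each record through such a helper before it is written out would leave the missing fields absent rather than emitting `.nan`.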

Fix Definition for Module

attack-datasources/contribution/module.yml

Line 2 in 0cde745

 definition: Information about module files such as executable, dynamic link library (dll), executable and linkiable format (elf), and Mach-o consisting of one or more classes and interfaces. 

There's a minor typo (linkiable) in the definition text, and I think the overall definition can be modified a little for accuracy since PE/ELF/Mach-O encompass both executables and libraries. I would suggest:

Information about module files consisting of one or more classes and interfaces, such as portable executable (PE) format executables/dynamic link libraries (DLL), executable and linkable format (ELF) executables/shared libraries, and Mach-O format executables/shared libraries.

Permanent UUID or ID in attack-datasources

Thanks for the project. It's a very good idea.

Will you add a fixed/permanent UUID or ID in the sources?

It could be useful for many projects to reuse the same data source description or create relationships on a permanent basis (just like we do in CyCAT.org).
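
As an illustration of what a permanent identifier could look like, here is a minimal sketch that derives a name-based (version 5) UUID from a data source name. The namespace value and the `data_source_id` helper are hypothetical; the project has not defined either.

```python
import uuid

# Hypothetical namespace for this sketch. A real scheme would publish one
# fixed namespace UUID and never change it.
DATA_SOURCE_NAMESPACE = uuid.uuid5(
    uuid.NAMESPACE_URL, "https://github.com/mitre-attack/attack-datasources"
)


def data_source_id(name: str) -> str:
    """Return a deterministic UUIDv5 for a data source name."""
    # Normalize case and whitespace so trivial spelling variants agree.
    normalized = " ".join(name.lower().split())
    return str(uuid.uuid5(DATA_SOURCE_NAMESPACE, normalized))


print(data_source_id("Logon Session") == data_source_id("logon  session"))
```

A name-derived ID changes if the data source is renamed. If IDs must survive renames, an alternative is to generate a random UUID once (for example with `uuid.uuid4()`) and store it in the YAML file.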

Nonexistent data component references and duplicate sources

Hello,

While working with the new data sources you made, I found that there are some duplicates and some non-referenced types in the technique descriptions.

In the following example:

File: File Content does not exist here
File: File Creation is referenced multiple times.

If this is not on purpose, I can try and find all occurrences and report them to you through a PR for missing data components and a list for duplicates.
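
A check along these lines could be sketched as follows. The `audit_references` helper and the sample data are illustrative assumptions; the real set of defined components would come from the repository's YAML definitions, and the references from a technique's data sources list.

```python
from collections import Counter


def audit_references(defined, referenced):
    """Return (missing, duplicates) for a list of 'Source: Component' strings."""
    counts = Counter(referenced)
    missing = sorted(ref for ref in counts if ref not in defined)
    duplicates = sorted(ref for ref, n in counts.items() if n > 1)
    return missing, duplicates


# Illustrative data only, mirroring the two cases described above.
defined = {"File: File Creation", "File: File Modification"}
referenced = ["File: File Content", "File: File Creation", "File: File Creation"]

missing, duplicates = audit_references(defined, referenced)
print(missing)     # ['File: File Content']
print(duplicates)  # ['File: File Creation']
```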

Thanks,

Small loading error

When I tried to load logon_session.yml, I got the following error:

mapping values are not allowed here
at line 22, column 72.

The offending line is
description: Data and information that describe a logon session (ex: logon type) and activity within it.

It can be solved by removing the space between the colon and "logon":
description: Data and information that describe a logon session (ex:logon type) and activity within it.
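
An alternative that keeps the original wording is to quote the value, because a colon followed by a space inside an unquoted YAML scalar is read as a mapping indicator. The sketch below is a rough line-based helper for spotting and quoting such values. `quote_if_ambiguous` is a hypothetical name, and the regular expression only covers simple single-line `key: value` pairs.

```python
import json
import re

# Matches "key: value" where the value is an unquoted (plain) scalar.
_KEY_VALUE = re.compile(r"""^(\s*(?:-\s+)?[\w.-]+):\s+(?!['"|>\[{])(.*\S)\s*$""")


def quote_if_ambiguous(line: str) -> str:
    """Double-quote a plain scalar value that contains ': '."""
    match = _KEY_VALUE.match(line)
    if match and ": " in match.group(2):
        # json.dumps emits a double-quoted string, which YAML also accepts.
        quoted = json.dumps(match.group(2), ensure_ascii=False)
        return f"{match.group(1)}: {quoted}"
    return line


line = (
    "description: Data and information that describe a logon session "
    "(ex: logon type) and activity within it."
)
print(quote_if_ambiguous(line))
```

With the value quoted, the text "(ex: logon type)" can stay exactly as written.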

Best regards,
Sascha90

KeyError: "['x_mitre_is_subtechnique'] not in index"

This error occurs in the get_attack_dataframe function of notebook_functions.py.

The following commands in the .ipynb notebook reproduce this error:
attack = get_attack_dataframe()
attack.head()

Output:

KeyError Traceback (most recent call last)
Input In [32], in
----> 1 attack = get_attack_dataframe()
2 attack.head()

File D:\Dec-\attack-datasources-main\docs\scripts\notebook_functions.py:57, in get_attack_dataframe(matrix)
53 attck = json_normalize(attck)
54 # view available columns - my line
55 #print(attck.columns)
56 # selecting columns
---> 57 attck = attck[['technique_id','x_mitre_is_subtechnique','technique','tactic','platform','data_sources']]
59 # Splitting data_sources field
60 attck = attck.explode('data_sources').reset_index(drop=True)

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\frame.py:3464, in DataFrame.__getitem__(self, key)
3462 if is_iterator(key):
3463 key = list(key)
-> 3464 indexer = self.loc._get_listlike_indexer(key, axis=1)[1]
3466 # take() does not accept boolean indexers
3467 if getattr(indexer, "dtype", None) == bool:

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\indexing.py:1314, in _LocIndexer._get_listlike_indexer(self, key, axis)
1311 else:
1312 keyarr, indexer, new_indexer = ax._reindex_non_unique(keyarr)
-> 1314 self._validate_read_indexer(keyarr, indexer, axis)
1316 if needs_i8_conversion(ax.dtype) or isinstance(
1317 ax, (IntervalIndex, CategoricalIndex)
1318 ):
1319 # For CategoricalIndex take instead of reindex to preserve dtype.
1320 # For IntervalIndex this is to map integers to the Intervals they match to.
1321 keyarr = ax.take(indexer)

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\indexing.py:1377, in _LocIndexer._validate_read_indexer(self, key, indexer, axis)
1374 raise KeyError(f"None of [{key}] are in the [{axis_name}]")
1376 not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())
-> 1377 raise KeyError(f"{not_found} not in index")

KeyError: "['x_mitre_is_subtechnique'] not in index"
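
The traceback shows the column selection on line 57 failing because `x_mitre_is_subtechnique` is absent from the normalized frame. One possible workaround, sketched here on the assumption that the records are plain dictionaries before `json_normalize` is called, is to fill in the missing key with a default first. `ensure_keys` and the sample record are hypothetical, not part of notebook_functions.py.

```python
def ensure_keys(records, defaults):
    """Return copies of the records with any missing keys filled from defaults."""
    return [{**defaults, **record} for record in records]


# Hypothetical record lacking the field named in the KeyError;
# "TXXXX" is a placeholder, not a real technique ID.
records = [
    {
        "technique_id": "TXXXX",
        "technique": "Example",
        "data_sources": ["File: File Creation"],
    },
]

filled = ensure_keys(records, {"x_mitre_is_subtechnique": False})
print(filled[0]["x_mitre_is_subtechnique"])  # False
```

On the pandas side, `DataFrame.reindex(columns=[...])` is another option: it returns the requested columns and fills any that are absent with NaN instead of raising.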

Questions about data format

I find these new data sources very promising, as someone coming from the ATT&CK matrix world who is looking to reduce the gap between events and CTI.

This is more a design question than an issue:

Why did you choose YAML over JSON, which is widely used in the cti repo?
Why did you not follow the STIX format, which would make it easier to connect to the (sub-)techniques in the same cti repo?

Update Mappings After Initial Release

The mappings file (https://github.com/mitre-attack/attack-datasources/blob/main/sub_techniques_research_reference/DataSources_Techniques_Mapping.yaml) needs to be updated following the release (https://twitter.com/MITREattack/status/1379864257697869828).
