aws-solutions / network-orchestration-for-aws-transit-gateway

The Network Orchestration for AWS Transit Gateway solution automates the process of setting up and managing transit networks in distributed AWS environments. It creates a web interface to help control, audit, and approve (transit) network changes.

Home Page: https://aws.amazon.com/solutions/implementations/serverless-transit-network-orchestrator/

License: Apache License 2.0

Shell 2.59% Python 82.05% JavaScript 0.69% HTML 0.45% CSS 0.08% TypeScript 14.15%
aws-organizations aws-transit-gateway networking

network-orchestration-for-aws-transit-gateway's Introduction

Network Orchestration for AWS Transit Gateway

Formerly known as: Serverless Transit Network Orchestrator (STNO)

🚀Solution Landing Page | 🚧Feature request| 🐛Bug Report | 📜Documentation Improvement

Note: For relevant information outside the scope of this readme, refer to the solution landing page and implementation guide.

Solution overview

The Network Orchestration for AWS Transit Gateway solution adds automation to AWS Transit Gateway. This solution provides the tools necessary to automate the process of setting up and managing transit networks in multi-account and multi-Region AWS environments. The solution deploys a web interface to help you control, audit, and approve transit network changes. This solution supports both AWS Organizations and standalone AWS account types.

This Network Orchestration for AWS Transit Gateway version supports Transit Gateway inter-Region peering and Amazon Virtual Private Cloud (Amazon VPC) prefix lists. Customers can establish peering connections between transit gateways to extend connectivity and build global networks spanning multiple AWS Regions. You can also automatically register Transit Gateway with Network Manager. This helps customers visualize and monitor their global network from a single dashboard rather than toggling between Regions from the AWS Management Console.

Architecture

The solution follows a hub-and-spoke deployment model and uses the following workflow:

  1. An Amazon EventBridge rule monitors specific VPC and subnet tag changes.
  2. An EventBridge rule in the spoke account sends the tags to the EventBridge bus in the hub account.
  3. The rules associated with the EventBridge bus invoke an AWS Lambda function to start the solution workflow.
  4. AWS Step Functions (solution state machine) processes network requests from the spoke accounts.
  5. The state machine workflow attaches a VPC to the transit gateway.
  6. The state machine workflow updates the VPC route table associated with the tagged subnet.
  7. The state machine workflow updates the transit gateway route table with association and propagation changes.
  8. (Optional) The state machine workflow updates the attachment name with the VPC name and the Organizational Unit (OU) name for the spoke account (retrieved from the Org Management account).
  9. The solution updates Amazon DynamoDB with the information extracted from the event and resources created, updated, or deleted in the workflow.
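
The workflow above starts from tag changes on VPCs and subnets. The following is a minimal sketch of how such tags might be assembled and applied with boto3. It assumes the tag keys `Attach-to-tgw`, `Associate-with`, and `Propagate-to` that are mentioned in the issues further down this page. The helper names, the empty value for `Attach-to-tgw`, and the comma delimiter are illustrative assumptions, so check the implementation guide for the values your deployment expects.

```python
from typing import Dict, List, Sequence


def build_vpc_tags(associate_with: str, propagate_to: Sequence[str]) -> List[Dict[str, str]]:
    """Build EC2 tag dicts for a spoke VPC.

    The tag keys are taken from this page; joining route table names
    with "," is an assumption made for this sketch.
    """
    if not associate_with:
        raise ValueError("associate_with must name a transit gateway route table")
    return [
        {"Key": "Associate-with", "Value": associate_with},
        {"Key": "Propagate-to", "Value": ",".join(propagate_to)},
    ]


def build_subnet_tags() -> List[Dict[str, str]]:
    """Build the tag that marks a subnet for attachment (empty value assumed)."""
    return [{"Key": "Attach-to-tgw", "Value": ""}]


def apply_tags(ec2_client, resource_id: str, tags: List[Dict[str, str]]) -> None:
    """Apply tags through the EC2 CreateTags API; ec2_client is a boto3 EC2 client."""
    ec2_client.create_tags(Resources=[resource_id], Tags=tags)
```

With a real client this could be called as `apply_tags(boto3.client("ec2"), "<vpc-id>", build_vpc_tags("Flat", ["Flat"]))`, where `Flat` is only an example route table name.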

Installing pre-packaged solution template

Note: All templates need to be deployed in the same preferred Region.


Customization

Use the following steps if you want to customize the solution or extend the solution with newer capabilities.

Setup

  • Python Prerequisite: python=3.11 | pip3=23.2.1
  • JavaScript Prerequisite: node=v18.16.0 | npm=9.5.1

Clone the repository and make desired code changes.

git clone https://github.com/aws-solutions/network-orchestration-for-aws-transit-gateway.git

Note: The following steps have been tested under the preceding pre-requisites.

Unit Test

Run unit tests to ensure that your added customization passes the tests.

cd ./source
chmod +x ./run-unit-tests.sh
./run-unit-tests.sh
cd ..

✅ Ensure that all unit tests pass. Review the generated coverage report.

Build

Use the following steps to build your customized distributable.

Note: For PROFILE_NAME, substitute the name of an AWS CLI profile that contains appropriate credentials for deploying in your preferred Region.

  • Create an Amazon Simple Storage Service (Amazon S3) bucket with the format 'MY-BUCKET-<aws_region>'. The solution's CloudFormation template will expect the source code to be located in this bucket. <aws_region> is where you are testing the customized solution.

You can use the following commands to create this bucket:

ACCOUNT_ID=$(aws sts get-caller-identity --output text --query Account --profile <PROFILE_NAME>)
REGION=$(aws configure get region --profile <PROFILE_NAME>)
BUCKET_NAME=stno-$ACCOUNT_ID-$REGION
aws s3 mb s3://$BUCKET_NAME/

# Default encryption:
aws s3api put-bucket-encryption \
  --bucket $BUCKET_NAME \
  --server-side-encryption-configuration '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}'

# Enable public access block:
aws s3api put-public-access-block \
  --bucket $BUCKET_NAME \
  --public-access-block-configuration "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true"

  • Configure the solution name, version number, and bucket name:

SOLUTION_NAME=network-orchestration-for-aws-transit-gateway
DIST_OUTPUT_BUCKET=stno-$ACCOUNT_ID
VERSION=custom001

  • Build the distributable using build-s3-dist.sh:

cd ./deployment
chmod +x ./build-s3-dist.sh
./build-s3-dist.sh $DIST_OUTPUT_BUCKET $SOLUTION_NAME $VERSION

✅ All assets are now built. You should see templates under deployment/global-s3-assets and other artifacts (console and lambda binaries) under deployment/regional-s3-assets.

Deploy

Deploy the distributable to an S3 bucket in your account:

aws s3 ls s3://$BUCKET_NAME  # should not give an error
cd ./deployment  # skip if you are already in ./deployment after the build step
aws s3 cp global-s3-assets/ s3://$BUCKET_NAME/$SOLUTION_NAME/$VERSION/ --recursive --expected-bucket-owner $ACCOUNT_ID --profile <PROFILE_NAME>
aws s3 cp regional-s3-assets/ s3://$BUCKET_NAME/$SOLUTION_NAME/$VERSION/ --recursive --expected-bucket-owner $ACCOUNT_ID --profile <PROFILE_NAME>

✅ All assets are now staged on your S3 bucket. You or any user can use S3 links for deployments.
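
The two `aws s3 cp --recursive` commands above copy every built file under the key prefix `$SOLUTION_NAME/$VERSION/` in the Region-suffixed bucket. The sketch below restates that layout in Python for readers who script the staging step with boto3. All function names are hypothetical, and the `stno` prefix simply mirrors the bucket name used in the commands above.

```python
from pathlib import Path
from typing import Iterable, Iterator, Tuple


def regional_bucket(account_id: str, region: str, prefix: str = "stno") -> str:
    """Name of the staging bucket: '<prefix>-<account_id>-<region>'."""
    return f"{prefix}-{account_id}-{region}"


def upload_plan(asset_dir: Path, solution_name: str, version: str) -> Iterator[Tuple[Path, str]]:
    """Yield (local file, S3 key) pairs mirroring 'aws s3 cp --recursive'."""
    for path in sorted(asset_dir.rglob("*")):
        if path.is_file():
            relative = path.relative_to(asset_dir).as_posix()
            yield path, f"{solution_name}/{version}/{relative}"


def upload_assets(s3_client, bucket: str, account_id: str,
                  plan: Iterable[Tuple[Path, str]]) -> int:
    """Upload each planned file with a boto3 S3 client and return the file count."""
    count = 0
    for path, key in plan:
        s3_client.upload_file(
            str(path),
            bucket,
            key,
            # Same safeguard as --expected-bucket-owner in the CLI commands.
            ExtraArgs={"ExpectedBucketOwner": account_id},
        )
        count += 1
    return count
```

Separating the plan from the upload makes it possible to print or review the key layout before anything is written to the bucket.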

File structure

The Network Orchestration for AWS Transit Gateway solution consists of:

  • Solution templates to provision the required AWS resources
  • Lambda microservices to implement the solution's functional logic
    • custom_resource: Handles CloudFormation custom resource CRUD operations
    • state_machine: Handles the solution's core state machine
    • tgw_peering: Handles the solution's transit gateway peering functionality
  • UI to deploy the solution's UI components
|-.github
|-architecture.png                                     [ architecture diagram ]
|-deployment/    
  |-manifest-generator                                        [ generates manifest files for solution ui ]
  |-network-orchestration-hub.template                        [ hub template ]
  |-network-orchestration-hub-service-linked-roles.template   [ hub template, deploys service linked roles]
  |-network-orchestration-spoke.template                      [ spoke template]  
  |-network-orchestration-spoke-service-linked-roles.template [ spoke template, deploys only service linked roles ]  
  |-network-orchestration-organization-role.template          [ role template, deploys in management account ]
  |-build-s3-dist.sh                                          [ script to build solution microservices ]
|-source/
  |-cognito-trigger                   [ manage new user creation in the cognito user pool ]
  |-lambda/                           [ solution microservices ]
    |-custom_resource                 [ CloudFormation Custom Resources ]
    |-tgw_peering_attachment          [ Manage Transit Gateway Peering Attachments]
    |-tgw_vpc_attachment              [ Manage VPC to Transit Gateway Attachments ]
  |-ui                                [ solution ui components ] 
  |-run-unit-test.sh                  [ script to run unit tests ]
|-additional_files                    [ CODE_OF_CONDUCT, NOTICE, LICENSE, sonar-project.properties etc.]

License

See license here.

Collection of operational metrics

This solution collects anonymous operational metrics to help AWS improve the quality and features of the solution. For more information, including how to disable this capability, please see the implementation guide.


Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.

Licensed under the Apache License Version 2.0 (the "License"). You may not use this file except in compliance with the License. A copy of the License is located at:

http://www.apache.org/licenses/LICENSE-2.0

or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied. See the License for the specific language governing permissions and limitations under the License.

network-orchestration-for-aws-transit-gateway's People

Contributors

abhinay-reddy-asi, aijunpeng, amazon-auto, georgebearden, groverlalit, gsingh04, hnishar, jrgaray27, tabdunabi, tbelmega

network-orchestration-for-aws-transit-gateway's Issues

Orchestration for multiple TGWs and multiple regions

There does not seem to be a clear solution for how to use STNO in an environment with multiple TGWs and/or multiple regions. The documentation points out the regional nature of some of the services, and notes that the templates all need to be deployed in the same region. Is the expectation, then, that a separate, full STNO deployment would be required for every region with a TGW to manage? And, if there are multiple TGWs in the same region, can a separate STNO be deployed for each? Is this even possible? Even if so, it would likely be unwieldy and cumbersome.

I would like to see STNO natively support multiple TGW hubs and regions. For example, when there are multiple TGWs in the same region, the attachment orchestration could perhaps be done by specifying a TGW ID in the value of the 'Attach-to-tgw' tag. Or, maybe each TGW could have an optional prefix in order to distinguish the route tables, e.g. TGW1_Flat vs TGW2_Flat. These should be able to interface with a single web interface for approvals and audit history, with filtering to optionally show all records or only those related to a single TGW.

For multiple region support, I would like to be able to deploy just one global template for the hub components, with a single web interface and set of rules etc. Specifying the regions to deploy to could be done via parameters (allowing for an environment to grow over time). Alternatively, perhaps the regional components could be split out into a separate template that could be deployed as a stack set.

VPC RT change

VPC route table modification only works for subnets that are tagged with "Attach-to-tgw". In our case, we use a dedicated service subnet for the TGW ENIs in order to preserve IP addresses in our real subnets, and that subnet is associated with the default route table. The route tables associated with the other subnets are not changed. If we tag those subnets with "Attach-to-tgw", their route tables are updated, but STNO reports a DuplicateSubnetsinSameZoneError, which is expected because no further ENIs can be created for the TGW attachment in the same zone.
The question and feature request is: what is the proper way to have STNO update the subnet route tables in this case?

Hub template ListOfCustomCidrBlocks parameter does not match the AllowedValue pattern

Describe the bug
When using the Hub template, the ListOfCustomCidrBlocks parameter gives an example with a space in the comma-delimited list. The AllowedPattern does not permit this, so the deployment fails. Removing the AllowedPattern unblocks the deployment but leads to a failure later, when the CreateRoute CRUD operation runs.
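
As an illustration of the bug described above, the following sketch normalizes a comma-delimited CIDR string before it is used as a parameter value: it trims the spaces that trip the pattern check and validates each entry with the standard library. The function name is hypothetical, and the code does not reproduce the template's actual AllowedPattern.

```python
import ipaddress
from typing import List


def normalize_cidr_list(raw: str) -> List[str]:
    """Split a comma-delimited CIDR string, trim spaces, and validate entries."""
    cidrs = []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # ignore empty entries such as a trailing comma
        # strict=True raises ValueError for entries with host bits set,
        # for example 10.0.0.1/8.
        network = ipaddress.ip_network(item, strict=True)
        cidrs.append(str(network))
    return cidrs
```

Joining the result with `",".join(...)` yields a list without spaces, which is the form the issue reports as accepted.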

To Reproduce
Deploying the current template from the repository fails because of the AllowedPattern on this parameter.

Expected behavior
When VPC is created, it should wait for the attachment to be created and then try to create the association and propagation

Please complete the following information about the solution:

  • Version: 3.2.0
  • Region: eu-west-2
  • Was the solution modified from the version published on this repository? No
  • Have you checked your service quotas for the services this solution uses? Yes
  • Were there any errors in the CloudWatch Logs? No

Error When Updating ConsoleLoginInformationEmail Variable

Describe the bug
When updating the email address for Cognito, ConsoleLoginInformationEmail, CloudFormation throws the following errors.

2022-05-26 16:04:26 UTC-0500 UserPoolUserReadOnly UPDATE_FAILED CloudFormation cannot update a stack when a custom-named resource requires replacing. Rename us-west-2_K6oTYrTti|readonlyuser and update the stack again.
2022-05-26 16:04:26 UTC-0500 UserPoolUserAdmin UPDATE_FAILED CloudFormation cannot update a stack when a custom-named resource requires replacing. Rename us-west-2_K6oTYrTti|adminuser and update the stack again.

To Reproduce

Deploy v3.0.1 with CloudFormation. Update the stack and set ConsoleLoginInformationEmail to a different email address.

Expected behavior

The email address in Cognito is updated to the new email address.

Please complete the following information about the solution:

Version: v3.0.1

Allow STNO to update the main route table

We have a number of accounts that we connect to a central VPC with STNO. We use the main route table for the spoke accounts, with a default route to our central VPC. STNO should be able to add this route when accepting the peering request via the console.

After testing out a number of options, it appears that STNO will not add any routes into the main route table. We have tested using various DefaultRoute parameters in the stackset: All-Traffic (0/0), RFC-1918, and Custom Destinations (e.g. 0.0.0.0/0). STNO will only add the requested route into a custom subnet route table.

Creating and managing a large number of route tables is not a great option to get this working.

A boolean parameter for UpdateDefaultRouteTable in the hub stackset, and associated step logic seems like it should be able to fix this.

There is an older ticket, #10, that describes the same behaviour I am seeing. I can confirm that the subnets we create do not have an existing default route in the main route table.
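
Until the solution handles the main route table, one possible manual workaround is to add the route yourself. The sketch below is not part of STNO. It assumes a boto3 EC2 client in the spoke account and an existing, available transit gateway attachment for the VPC, and the function names are hypothetical.

```python
from typing import Optional


def find_main_route_table_id(ec2_client, vpc_id: str) -> Optional[str]:
    """Return the ID of the VPC's main route table, or None if none is found."""
    response = ec2_client.describe_route_tables(
        Filters=[
            {"Name": "vpc-id", "Values": [vpc_id]},
            {"Name": "association.main", "Values": ["true"]},
        ]
    )
    tables = response.get("RouteTables", [])
    return tables[0]["RouteTableId"] if tables else None


def add_tgw_route(ec2_client, route_table_id: str, tgw_id: str,
                  destination_cidr: str = "0.0.0.0/0") -> None:
    """Create a route that sends destination_cidr to the transit gateway."""
    ec2_client.create_route(
        RouteTableId=route_table_id,
        DestinationCidrBlock=destination_cidr,
        TransitGatewayId=tgw_id,
    )
```

Routes added this way are outside the solution's bookkeeping, so they would also need to be removed manually if the attachment is later deleted.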

Default Route (All-Traffic (0/0)) not propagating to respective attached subnet's route table

Hi,

I have set up the hub and spoke successfully. However, when attaching an existing subnet of an associated VPC to the transit gateway (using the Attach-to-tgw tag), the route table of that subnet is not updated to point all traffic to the TGW, as set in the parameters on hub stack creation (i.e. All-Traffic (0/0)).

Am I missing something here?

I would appreciate it if someone could assist with this.

Extend Association and Propagation to Site-to-site VPN Transit Gateway Attachments

Currently, CloudFormation has an open issue, aws-cloudformation/cloudformation-coverage-roadmap#308, where the Transit Gateway attachment ID for a VPN is not exposed on the AWS::EC2::VPNConnection resource attached to the Transit Gateway.

This creates an issue in automating association and propagation of VPN attachments using CloudFormation.

It would be really helpful if STNO could handle VPN attachments as well, similar to subnets (support for the Associate-with and Propagate-to tags and EventBridge events).

Update route tables for multiple subnets in one availability zone

Is your feature request related to a problem? Please describe.
We have a 3x3 VPC with 3 subnet tiers across 3 Availability Zones, e.g. 3 public subnets and 6 private subnets, all using separate route tables. If we tag multiple subnets with "Attach-to-tgw", the solution errors out.

Describe the feature you'd like
A method to tag a subnet in each AZ for the transit gateway attachment and another method to tag every subnet that should have its route table updated by the STNO spoke.
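
A transit gateway VPC attachment accepts at most one subnet per Availability Zone, which is why tagging several subnets in the same zone fails. The following sketch shows a pre-check that could be run before tagging. It assumes subnet records shaped like the entries returned by the EC2 DescribeSubnets API (keys `SubnetId` and `AvailabilityZone`), and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def subnets_by_zone(subnets: Iterable[Mapping[str, str]]) -> Dict[str, List[str]]:
    """Group subnet IDs by Availability Zone."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for subnet in subnets:
        grouped[subnet["AvailabilityZone"]].append(subnet["SubnetId"])
    return dict(grouped)


def zones_with_duplicates(subnets: Iterable[Mapping[str, str]]) -> Dict[str, List[str]]:
    """Return only the zones where more than one subnet would be attached."""
    return {
        zone: ids
        for zone, ids in subnets_by_zone(subnets).items()
        if len(ids) > 1
    }
```

An empty result means the candidate subnets can be tagged for attachment without hitting the one-subnet-per-zone limit.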

Missing HUB stack update procedure description for v3.0.0 to v3.0.1

What were you initially searching for in the docs?
We updated the HUB stack from v2 to v3.0.0 without problems, and the documentation helped with that. For the new version v3.0.1, however, the update description is missing. I guess it is the same procedure as for the SPOKE stack: "Choose Update, then choose Replace current template."

Is this related to an existing part of the documentation? Please share a link
https://docs.aws.amazon.com/solutions/latest/serverless-transit-network-orchestrator/update-the-stack.html
https://github.com/aws-solutions/serverless-transit-network-orchestrator/releases

Describe how we could make it clearer
A separate description would be helpful to update from v3.0.0 to v3.0.1.
It would also be important to mention the bridge name change ("STNO-EventBridge-v3"), perhaps in the release documentation.

If you have a proposed update, please share it here

Make web frontend optional

Is your feature request related to a problem? Please describe.
Some customers do not need the web interface, but the other automation components (i.e. tag-based attachment, association, and propagation) are very valuable. Additionally, several of the services that are used in the web interface aren't supported in GovCloud (CloudFront, AppSync).

Describe the feature you'd like
Make the web interface an optional component so that customers that don't require that functionality can opt out of deploying it. This should also help make this solution usable in GovCloud.

Additional context

Transit gateway attachment fails due to missing IAM service role

Seen when using the latest STNO version. In a new AWS account where no transit gateway attachment has previously been created via the console, the IAM service-linked role 'AWSServiceRoleForVPCTransitGateway' does not exist. When attempting to attach a VPC/subnet to the transit gateway by tagging it with 'Attach-to-tgw', the attachment stays in a pending state for about 10 minutes and then goes to a failed state. As a workaround, you can manually create a transit gateway attachment in the console. This action creates the required role 'AWSServiceRoleForVPCTransitGateway', and subsequent STNO-initiated attachments in the account succeed.
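
Instead of creating a throwaway attachment in the console, the role can also be created directly. A minimal sketch, assuming a boto3 IAM client with permission to call GetRole and CreateServiceLinkedRole; the function name is hypothetical, and `transitgateway.amazonaws.com` is the service principal this sketch assumes for the role.

```python
ROLE_NAME = "AWSServiceRoleForVPCTransitGateway"
SERVICE_NAME = "transitgateway.amazonaws.com"  # assumed service principal


def ensure_tgw_service_linked_role(iam_client) -> bool:
    """Create the service-linked role if it is missing.

    Returns True if the role was created and False if it already existed.
    """
    try:
        iam_client.get_role(RoleName=ROLE_NAME)
        return False
    except iam_client.exceptions.NoSuchEntityException:
        iam_client.create_service_linked_role(AWSServiceName=SERVICE_NAME)
        return True
```

Checking for the role first keeps the call idempotent, so it can be run in every new spoke account before the first subnet is tagged.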

v2 to v3 upgrade and providing existing global network id

What were you initially searching for in the docs?

I'm working through upgrading from v2 to v3. The documentation says to copy over the global network ID and the existing transit gateway ID. However, since I have the stack deployed in multiple regions as part of a Custom Control Tower Configuration [1], I have multiple global network IDs and transit gateway IDs. It is unclear whether I now need to split the stackset deployment into one per region or add a mapping to the CloudFormation template to pick the right global network for each region.

Provide the existing global network id: copy the global network created by v2 deployment (leave blank if v2 did not create global network).

Is this related to an existing part of the documentation? Please share a link
https://docs.aws.amazon.com/solutions/latest/serverless-transit-network-orchestrator/update-the-stack.html

Describe how we could make it clearer
Cover the multi-region scenario for the STNO upgrade path.

If you have a proposed update, please share it here

[1] https://controltower.aws-management.tools/automation/cfct/#setup-central-networking-using-serverless-transit-network-orchestrator-stno-advanced-lab

Update Hub Cloudformation Template to support Disabling External Principals for Resource Share

Currently, when a Resource Share is created for the Transit Gateway, it is configured with the default setting AllowExternalPrincipals: True.

This can be a security issue. A good enhancement (which we have made ourselves by amending the CloudFormation template) would be to let users disable external principals through a parameter in the Hub CloudFormation template, and then use a condition to set AllowExternalPrincipals to false so that the Transit Gateway can only be shared with AWS accounts within the organization.

An example configuration to support this (I set the default to No for our needs, but for general public use Yes may still be appropriate):

Add to Parameters:

    "AllowExternalPrincipals": {
        "Type": "String",
        "AllowedValues": [
            "Yes",
            "No"
        ],
        "Default": "No"
    },

Add to Conditions:

    "NoExternalPrincipals": {
        "Fn::Equals": [
            {
                "Ref": "AllowExternalPrincipals"
            },
            "No"
        ]
    },

Add Property to TGW Resource Share:

    "AllowExternalPrincipals": {
        "Fn::If": [
            "NoExternalPrincipals",
            false,
            true
        ]
    },

The spoke is not deployable on accounts that already have a transit gateway attachment

Making a transit gateway attachment seems to automatically create the AWSVPCTransitGatewayServiceRolePolicy service linked role which will conflict with this line: https://github.com/awslabs/serverless-transit-network-orchestrator/blob/8a2f7ff/deployment/aws-transit-network-orchestrator-spoke.template#L47

I can't remove the AWSVPCTransitGatewayServiceRolePolicy service linked role because the transit gateway is using it, so the only way to get it working seems to be removing that resource from the template.

Which begs the question, since the service linked role is actually created automatically by the transit gateway attachment, why do we need it in the template?

Using Customizations for Control Tower to Deploy newest STNO v3

Hello team,

Hopefully looking to get an answer for this soon. I'm consistently getting errors with my CodePipeline for Customizations for Control Tower (CfCT) to deploy this new solution for STNO. The error that I'm getting from CloudFormation:

"ResourceLogicalId:TgwPeeringLambda, ResourceType:AWS::Lambda::Function, ResourceStatusReason:Properties validation failed for resource TgwPeeringLambda with message: #/Code/S3Bucket: failed validation constraint for keyword [pattern]."

There are no parameters that I've configured for this particular resource. I'm really out of guesses how to fix this error. Could someone help me fix this problem?

Failing Unit tests - missing lib.s3

Hey guys. Love the project, but I noticed a few of the unit tests are failing because of the missing lib.s3:

_______________ ERROR collecting tests/test_cfn_cr_secure_ssm.py _______________
ImportError while importing test module '/home/circleci/project/source/tests/test_cfn_cr_secure_ssm.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
tests/test_cfn_cr_secure_ssm.py:3: in <module>
    import lambda_custom_resource
lambda_custom_resource.py:18: in <module>
    from custom_resource_handler import StepFunctions, SecureSSMParameters, CWEventPermissions, S3ConsoleDeploy, CFNMetrics
custom_resource_handler.py:24: in <module>
    from lib.s3 import S3
E   ModuleNotFoundError: No module named 'lib.s3'
______________ ERROR collecting tests/test_cfn_cr_sm_execution.py ______________
ImportError while importing test module '/home/circleci/project/source/tests/test_cfn_cr_sm_execution.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
tests/test_cfn_cr_sm_execution.py:1: in <module>
    import lambda_custom_resource
lambda_custom_resource.py:18: in <module>
    from custom_resource_handler import StepFunctions, SecureSSMParameters, CWEventPermissions, S3ConsoleDeploy, CFNMetrics
custom_resource_handler.py:24: in <module>
    from lib.s3 import S3
E   ModuleNotFoundError: No module named 'lib.s3'

It doesn't break any functionality it seems, but I thought I would let you guys know.

It looks like lib.s3 is used by custom_resource_handler.py for class S3ConsoleDeploy.

Upgrade path for STNO v2.0.0 to v3.0.x will orphan transit gateway and global network from stack

What were you initially searching for in the docs?
We're in the process of working out how to migrate from STNO v2.0.0 to STNO v3.0.x and have run into a problem. The documentation lists the following:

Deploy the v3 template along side v2 template and delete v2 template later.

This implies that the new stack will simply reference the transit gateway and global network by their IDs but no longer manage them as resources of a stack.

Is this related to an existing part of the documentation? Please share a link
https://docs.aws.amazon.com/solutions/latest/serverless-transit-network-orchestrator/update-the-stack.html

Describe how we could make it clearer
Could the documentation be improved to make the fact that resources will be orphaned clear, and could they be improved to include steps to re-integrate the orphaned resources back into the hub stack?

Also, is there any way that the v3.0.x stack can be deployed over the top of the v2.0.0 stack instead? We used the following lab for our deployments: https://controltower.aws-management.tools/automation/cfct/#setup-central-networking-using-serverless-transit-network-orchestrator-stno-advanced-lab. This means our stacks are deployed with stacksets, so a side-by-side deployment is not easy, and it looks like we are going to be stuck on v2.0.0 indefinitely.

Misc
Additionally, for users stuck on previous versions (prior to v3.0.0), could security patches be released (in a version 2.0.1, for example) for the changes made in v3.0.1 (https://github.com/aws-solutions/serverless-transit-network-orchestrator/releases/tag/v3.0.1), so that we have a path forward?

manifest-generator: No such file or directory

Description:
I am trying to make some changes to the code (custom_resource_handler.py) and build it. I get the error below when the build tries to generate console-manifest.json:

./build-s3-dist.sh: line 99: cd: /Users/<>/<>/repo/serverless-transit-network-orchestrator/deployment/manifest-generator: No such file or directory

As a result of the above issue, the following command produces a warning:

cd /Users/<>/<>/repo/serverless-transit-network-orchestrator/deployment/regional-s3-assets && zip -rv ./aws-transit-network-orchestrator-cr.zip ./console-manifest.json
zip warning: name not matched: ./console-manifest.json

The final package aws-transit-network-orchestrator-cr.zip doesn't contain console-manifest.json.

Could you please tell me whether I am doing anything wrong?

S3 interface endpoint to manage STNO page privately would boost security of the solution.

Is your feature request related to a problem? Please describe.
I find the STNO console page provided through CloudFront insecure. It doesn't use MFA or WAF, and the page is "public", protected only by a password.

Describe the feature you'd like
An S3 interface endpoint could be used and linked with a private Route 53 zone so that STNO can be managed from a VPC or from on premises.

Additional context
A feature to choose between public and private management would also add value.

empty PREFIX_LISTS is not supported by Custom-Destinations, invalid value for parameter destination-cidr-block:

Describe the bug

After the approval step in the Cognito GUI, the state machine fails with an error:
"An error occurred (InvalidParameterValue) when calling the CreateRoute operation: invalid value for parameter destination-cidr-block:"
This means the Lambda function is trying to add a route with an empty CIDR parameter.
The cause is that PREFIX_LISTS is read even though the list is empty. The step that uses CIDR_BLOCKS works well.

To Reproduce

This environment is used:

variable       value
CIDR_BLOCKS    10.0.0.0/8
DEFAULT_ROUTE  Custom-Destinations
PREFIX_LISTS   (empty)

With this workaround it's running well:
PREFIX_LISTS=10.0.0.0/8

Expected behavior

Request can be approved.

Please complete the following information about the solution:

  • Version: v3.2.1
  • Region: eu-central-1
  • Was the solution modified from the version published on this repository? No
  • Have you checked your service quotas for the services this solution uses? Yes
  • Were there any errors in the CloudWatch Logs? Yes

Additional context

*** Source Code

    def _update_route_table_with_prefix_lists(self, ec2, existing_routes):
        # Reporter's note: "".split(",") returns [""], so the list has length 1
        # (not 0) when PREFIX_LISTS exists but is empty.
        prefix_lists = environ.get("PREFIX_LISTS").split(",")
        if len(prefix_lists) > 0:
            for prefix_list_id in prefix_lists:
                self.logger.info(f"Adding prefix list id: {prefix_list_id}")
                self._find_existing_default_route(existing_routes, prefix_list_id)
                # Reporter's note: this call cannot work with an empty prefix list ID.
                self._update_route_table(ec2, prefix_list_id)
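
The behaviour reported above follows from how `str.split` treats an empty string. One way to avoid it, shown here as a standalone sketch rather than as the project's actual fix, is to drop blank entries while parsing. The helper name `parse_list_env` is hypothetical.

```python
import os
from typing import List, Mapping


def parse_list_env(name: str, env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the comma-separated values of env[name], ignoring blank entries.

    A missing variable and an empty variable both yield an empty list,
    so a loop over the result never sees an empty ID.
    """
    raw = env.get(name, "")
    return [item.strip() for item in raw.split(",") if item.strip()]
```

With this helper, a check such as `if prefix_lists:` is false for an empty `PREFIX_LISTS`, so no route with an empty destination would be attempted.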

*** CloudWatch

"error-info": {
            "Error": "ClientError",
            "Cause": "{\"errorMessage\":\"An error occurred (InvalidParameterValue) when calling the CreateRoute operation: invalid value for parameter destination-cidr-block:\",\"errorType\":\"ClientError\",\"requestId\":\"de5388cf-b914-4fa3-8c2a-4ca32fac2a96\",\"stackTrace\":[\"  File \\\"/var/task/state_machine/index.py\\\", line 45, in lambda_handler\\n    return vpc(event, function_name)\\n\",\"  File \\\"/var/task/state_machine/index.py\\\", line 116, in vpc\\n    response = vpc.default_route_crud_operations()\\n\",\"  File \\\"/var/task/state_machine/lib/handlers/vpc_handler.py\\\", line 649, in default_route_crud_operations\\n    self._update_route_table_with_prefix_lists(ec2, existing_routes)\\n\",\"  File \\\"/var/task/state_machine/lib/handlers/vpc_handler.py\\\", line 677, in _update_route_table_with_prefix_lists\\n    self._update_route_table(ec2, prefix_list_id)\\n\",\"  File \\\"/var/task/state_machine/lib/handlers/vpc_handler.py\\\", line 600, in _update_route_table\\n    self._create_route(ec2, route)\\n\",\"  File \\\"/var/task/state_machine/lib/handlers/vpc_handler.py\\\", line 520, in _create_route\\n    ec2.create_route_cidr_block(\\n\",\"  File \\\"/var/task/state_machine/lib/clients/ec2.py\\\", line 95, in create_route_cidr_block\\n    response = self.ec2_client.create_route(\\n\",\"  File \\\"/var/task/botocore/client.py\\\", line 530, in _api_call\\n    return self._make_api_call(operation_name, kwargs)\\n\",\"  File \\\"/var/task/botocore/client.py\\\", line 960, in _make_api_call\\n    raise error_class(parsed_response, operation_name)\\n\"]}"
        }

CreateTransitGatewayVpcAttachment failing with Client.UnauthorizedOperation in spoke account

TGW attachment creation is failing in the spoke account due to an access denied error. I haven't changed any templates, and all the roles seem to have the required permissions and trust relationships.

{
......
"sessionContext": {
"sessionIssuer": {
"type": "Role",
"principalId": "",
"arn": "arn:aws:iam::*:role/TransitNetworkExecutionRole-us-east-1",
"accountId": "",
"userName": "TransitNetworkExecutionRole-us-east-1"
},
.....,
"eventTime": "",
"eventSource": "ec2.amazonaws.com",
"eventName": "CreateTransitGatewayVpcAttachment",
"awsRegion": "us-east-1",
"sourceIPAddress": "",
"userAgent": "",
"errorCode": "Client.UnauthorizedOperation",
"errorMessage": "You are not authorized to perform this operation. Encoded authorization failure message: *,
.....
}

AccessDenied error preventing STNO actions

This is an account that had its aws-transit-network-orchestrator-hub stack deleted. I removed the old resources (TGW with all attachments) and recreated the stack.

Now I'm getting the following error:

An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::123456789012:assumed-role/TransitNetworkOrchestratorSMLambdaRole-eu-west-1/TransitNetworkOrchestratorSMLambda is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::123456789012:role/TransitNetworkExecutionRole-eu-west-1

The TransitNetworkOrchestratorSMLambda policy inside arn:aws:iam::123456789012:role/TransitNetworkOrchestratorSMLambdaRole-eu-west-1 is unchanged, see below. It explicitly includes the sts:AssumeRole action that it is complaining about.
Any ideas on how to fix a broken STNO?

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "arn:aws:logs:eu-west-1:123456789012:log-group:/aws/lambda/*"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "xray:PutTraceSegments",
                "xray:PutTelemetryRecords"
            ],
            "Resource": "*",
            "Effect": "Allow"
        },
        {
            "Action": [
                "iam:GetRole"
            ],
            "Resource": "arn:aws:iam::eu-west-1:role/*",
            "Effect": "Allow"
        },
        {
            "Action": [
                "ssm:PutParameter",
                "ssm:GetParameter",
                "ssm:GetParameters",
                "ssm:DeleteParameter",
                "ssm:GetParametersByPath"
            ],
            "Resource": "arn:aws:ssm:eu-west-1:123456789012:parameter/*",
            "Effect": "Allow"
        },
        {
            "Action": [
                "ssm:DescribeParameters"
            ],
            "Resource": "*",
            "Effect": "Allow"
        },
        {
            "Action": [
                "ec2:CreateTransitGatewayRoute",
                "ec2:DeleteTransitGatewayRoute",
                "ec2:ModifyTransitGatewayVpcAttachment",
                "ec2:CreateTransitGatewayVpcAttachment",
                "ec2:DeleteTransitGatewayVpcAttachment",
                "ec2:AssociateTransitGatewayRouteTable",
                "ec2:DisableTransitGatewayRouteTablePropagation",
                "ec2:DisassociateTransitGatewayRouteTable",
                "ec2:EnableTransitGatewayRouteTablePropagation"
            ],
            "Resource": [
                "arn:aws:ec2:eu-west-1:*:transit-gateway-route-table/*",
                "arn:aws:ec2:eu-west-1:123456789012:transit-gateway/*",
                "arn:aws:ec2:eu-west-1:*:transit-gateway-attachment/*",
                "arn:aws:ec2:eu-west-1:*:vpc/*",
                "arn:aws:ec2:eu-west-1:*:subnet/*",
                "arn:aws:ec2:eu-west-1:*:route-table/*"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "ec2:DescribeTransitGatewayVpcAttachments",
                "ec2:DescribeTransitGatewayAttachments",
                "ec2:DescribeTransitGatewayRouteTables",
                "ec2:GetTransitGatewayAttachmentPropagations",
                "ec2:GetTransitGatewayRouteTableAssociations",
                "ec2:GetTransitGatewayRouteTablePropagations",
                "ec2:SearchTransitGatewayRoutes"
            ],
            "Resource": "*",
            "Effect": "Allow"
        },
        {
            "Action": [
                "sts:AssumeRole"
            ],
            "Resource": "arn:aws:iam::*:role/TransitNetworkExecutionRole-eu-west-1",
            "Effect": "Allow"
        },
        {
            "Action": [
                "lambda:InvokeFunction"
            ],
            "Resource": [
                "arn:aws:lambda:eu-west-1:123456789012:function:TransitNetworkOrchestratorSMLambda"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "dynamodb:PutItem"
            ],
            "Resource": "arn:aws:dynamodb:eu-west-1:123456789012:table/StackSet-CustomControlTower-stno-hub-development-6aab4035-584b-4b00-93ae-7f6ea75a7706-DynamoDbTable-R9U8XGY0HRNO",
            "Effect": "Allow"
        },
        {
            "Action": [
                "sns:Publish"
            ],
            "Resource": "arn:aws:sns:eu-west-1:123456789012:AWS-Transit-Network-Approval-Notifications",
            "Effect": "Allow"
        }
    ]
}
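
The identity policy above is only half of the check: the trust policy on TransitNetworkExecutionRole-eu-west-1 must also allow the caller. IAM stores a role principal in a trust policy by its unique ID, so a trust policy saved before the hub role was deleted and recreated can be left pointing at the old ID, which then shows up as a bare string starting with AROA instead of an ARN. Whether that is what happened here is not confirmed in this thread. The following is a minimal sketch for spotting such entries; the function name `stale_principals` is hypothetical, and it assumes the trust policy document has already been fetched as a dict.

```python
def stale_principals(trust_policy):
    """Return AWS principals that look like orphaned IAM unique IDs."""
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):
        # A policy may carry a single statement object instead of a list.
        statements = [statements]

    stale = []
    for statement in statements:
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # e.g. the wildcard principal "*"
        aws = principal.get("AWS", [])
        if isinstance(aws, str):
            aws = [aws]
        # Unique IDs of roles start with AROA, those of users with AIDA.
        stale.extend(p for p in aws if p.startswith(("AROA", "AIDA")))
    return stale
```

If this returns anything for the execution role, re-saving the trust policy with the current role ARN (or redeploying the stack that owns the execution role) would be the thing to try.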

Allow option to add OU as a Principal Type

Currently STNO only allows sharing with a list of accounts or an Organization. It would be good to add an option to share with just an OU by passing the OU ARN.

The date in changelog for v2 is wrong

Hi,

Unless you guys somehow timetravelled this date doesn't seem correct: https://github.com/awslabs/serverless-transit-network-orchestrator/blame/master/CHANGELOG.md#L10

Meanwhile, I wanted to let you know that seeing this new release really made me happy, because everything in that changelog is something we actually needed.

The funny thing is that yesterday I was trying to deploy this but kept getting permission errors from S3, probably because you guys were deploying the new version to S3 at the time. Now I'm happy that I couldn't deploy yesterday, since I get to use the new features (especially the support for prefix lists).

Empty PREFIX_LISTS environment variable causes Step Function to fail in case of "Custom-Destinations" value for DEFAULT_ROUTE environment variable

Describe the bug

The PREFIX_LISTS environment variable in the STNO-State-Machine Lambda function is, or should be, optional when "Custom-Destinations" is specified for the DEFAULT_ROUTE variable and CIDR_BLOCKS is provided (as per the CustomerManagedPrefixListIds parameter in the network-orchestration-hub.yaml CloudFormation template).

However, keeping PREFIX_LISTS empty while specifying CIDR_BLOCKS and setting "Custom-Destinations" as the value for DEFAULT_ROUTE causes the Step Function to fail, and the respective TGW VPC attachment doesn't get associated with or propagated to the specified TGW route table.

To Reproduce

Specify ListOfCustomCidrBlocks in the network-orchestration-hub.yaml CloudFormation template, set "Custom-Destinations" as the value for DEFAULT_ROUTE, and leave CustomerManagedPrefixListIds empty.

Once the STNO hub and spoke stacks are deployed, create a VPC with populated "Associate-with" and/or "Propagate-to" tags, and then create subnets with the "Attach-to-tgw" tag.

This causes the Step Function to fail, because the underlying Lambda code calls the following two functions when DEFAULT_ROUTE is Custom-Destinations:

  • _update_route_table_with_cidr_blocks(...)

This works fine because the CIDR_BLOCKS environment variable has a value.

  • _update_route_table_with_prefix_lists(...)

This fails because the PREFIX_LISTS environment variable arrives as an empty string. Splitting an empty string on "," yields a list containing one empty string, so its length is 1 and it passes the > 0 check. _update_route_table(...) then fails because prefix_list_id is empty.

Expected behavior

An empty PREFIX_LISTS should be handled in the Lambda code so that the Step Function succeeds with "Custom-Destinations" for DEFAULT_ROUTE when only the CIDR_BLOCKS environment variable is provided.

For this, the code block should first check that PREFIX_LISTS is a non-empty string, like the following:
if environ.get("PREFIX_LISTS") != "": or simply if environ.get("PREFIX_LISTS"):
Only then should PREFIX_LISTS be split, which avoids the issue above.

The same can be done in the code block of _update_route_table_with_cidr_blocks(...) for the case where only PREFIX_LISTS is required and CIDR_BLOCKS is not.
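
To make the pitfall and the suggested guard concrete, here is a minimal sketch. The helper name `parse_csv_env` is hypothetical and not part of the solution's code; it only illustrates parsing a comma-separated environment variable so that an empty or unset value yields an empty list.

```python
from os import environ


def parse_csv_env(name, env=None):
    """Return the non-empty, comma-separated values of an environment variable."""
    env = environ if env is None else env
    raw = env.get(name) or ""
    # "".split(",") yields [""], so blank entries are filtered out explicitly.
    return [item.strip() for item in raw.split(",") if item.strip()]


# The pitfall described above: an empty string still splits into one element.
assert len("".split(",")) == 1
```

With a helper like this, both `_update_route_table_with_prefix_lists(...)` and `_update_route_table_with_cidr_blocks(...)` could loop over the returned list directly, and an empty variable would simply result in zero iterations.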

Please complete the following information about the solution:

  • Version: v3.2.1

  • Region: eu-west-1

  • Was the solution modified from the version published on this repository?

  • If the answer to the previous question was yes, are the changes available on GitHub?

  • Have you checked your service quotas for the services this solution uses?

  • Were there any errors in the CloudWatch Logs?

Yes: "exception": "Traceback (most recent call last):\n File "/var/task/state_machine/lib/clients/ec2.py", line 95, in create_route_cidr_block\n response = self.ec2_client.create_route(\n File "/var/task/botocore/client.py", line 530, in _api_call\n return self._make_api_call(operation_name, kwargs)\n File "/var/task/botocore/client.py", line 960, in _make_api_call\n raise error_class(parsed_response, operation_name)\nbotocore.exceptions.ClientError: An error occurred (InvalidParameterValue) when calling the CreateRoute operation: invalid value for parameter destination-cidr-block:",
"exception_name": "ClientError"

Screenshots
If applicable, add screenshots to help explain your problem (please DO NOT include sensitive information).

Additional context
Add any other context about the problem here.

Provide SSO options for the portal instead of username/password

Having another password to share and rotate keeps me awake at night. Ideally, hub account administrators should be able to view the portal without any additional passwords, or authentication via SAML or other providers could be offered.
The fact that the portal is exposed to the world via CloudFront doesn't make things any easier.

DuplicateTransitGatewayAttachment error received on 1 of 3 subnets

I create three subnets in exactly the same way through CloudFormation, and consistently 1 of the 3 is not attached even though the Attach-to-tgw tag is present. The subnets are in 3 different AZs. The error tag shows the following:

STNOStatus-Subnet-Error | 2020-01-31T03:22:19Z: An error occurred (DuplicateTransitGatewayAttachment) when calling the CreateTransitGatewayVpcAttachment operation: tgw-04e93f5458d6a0662 has non-deleted Transit Gateway Attachments with same VPC ID.

I am able to correct the issue manually and attach to the subnet without issue.

Attaching CloudFormation template to produce the error.

Thanks for taking a look!
k8s-stno-error-test-vpc.template.json.txt

STNO Hub Upgrade with Landing Zone

Hello,

I have a customer who is trying to upgrade their STNO hub to version 2. They currently use STNO as a core resource in their network account. Sample manifest below:

core_accounts:
  # Network account
  - name: network
    email: [email protected]
    ssm_parameters:
      - name: /org/member/network/account_id
        value: $[AccountId]
    core_resources:
      - name: STNO-Hub
        template_file: templates/core_accounts/aws-transit-network-orchestrator-hub.template
        parameter_file: parameters/core_accounts/aws-transit-network-orchestrator-hub.json
        deploy_method: stack_set
        regions:
          - us-east-1
          - us-west-2
          - ca-central-1
          - ap-southeast-1
          - eu-central-1
          - eu-west-2
          - ap-southeast-2

The problem is that with V2 they want to pass different prefix list IDs and custom CIDRs to the different Regions, but Landing Zone core resources do not allow parameter overrides (as baseline_resources do). The customer is also using a single JSON parameter file.

Any guidance on how to work around this or restructure?

Shared NAT Gateway

Please up-vote if you want this feature. Do not submit a new feature request.

Allow Deploying Spoke into Hub Account

When trying to deploy the Spoke template into the same account as the Hub, we receive the following error:

Logical ID: CustomServiceLinkedRole
SLR [AWSServiceRoleForVPCTransitGateway] already exists but does not have a description. Please verify your SLR use case. If you are sure the use case is correct please modify your CloudFormation template and keep SLR description consistent.

I believe this is because the Service Linked Role is created in both the Hub and Spoke templates https://github.com/aws-solutions/serverless-transit-network-orchestrator/search?q=CustomServiceLinkedRole&type=code

Could the role creation be put behind a Conditional check for a parameter, something like the following?

Parameters:
  SpokeInHubOverride:
    Type: String
    Description: Override the default action of not allowing the spoke template to be deployed into the hub account

...

Conditions:
  NoSpokeInHub:  !Not 
    - !Equals  
      - !Ref SpokeInHubOverride
      - 'true'

...

Resources:
  # The following description enables the idempotency and CFN template will not rollback if the role
  # already exist. Do not change the description below.
  CustomServiceLinkedRole:
    Type: "AWS::IAM::ServiceLinkedRole"
    Condition: NoSpokeInHub
    Properties:
      AWSServiceName: 'transitgateway.amazonaws.com'
      Description: Allows TGW and VPC Attachment operations.

This would be handy for users who have other networking infrastructure, such as client VPN configurations, that might require VPCs and attachments in the networking account.

STNO Static Routes Creation in TransitGateway RouteTables

Is your feature request related to a problem? Please describe.

When we enable transit gateway attachment propagation in the TGW route tables, all secondary CIDRs attached to the VPC in the spoke account are propagated by default. This is causing networking issues in our Landing Zone environment. So instead of TGW propagation, we want to create static routes for the spoke VPC's list of primary CIDRs while excluding a list of secondary CIDRs.

Describe the feature you'd like

We have the same secondary CIDRs allocated for each spoke account VPC. For example, the spoke VPC CIDRs are: 10.240.0.0/25, 10.239.0.0/28 and 100.64.0.0/16, 100.65.0.0/16, 100.66.0.0/16, 100.67.0.0/16.

Primary CIDRs: 10.240.0.0/25, 10.239.0.0/28 (these vary for each account, with a list of two CIDRs)
Secondary CIDRs: 100.64.0.0/16, 100.65.0.0/16, 100.66.0.0/16, 100.67.0.0/16 (these are the same for every spoke account)

I want to create static routes in the TGW route table in the hub account for the spoke VPC's primary CIDRs (10.240.0.0/25, 10.239.0.0/28) and exclude the secondary CIDRs (100.64.0.0/16, 100.65.0.0/16, 100.66.0.0/16, 100.67.0.0/16).
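
The selection step requested above could look roughly like the following. This is a sketch only: the function name `static_route_cidrs` and the idea of passing the exclusion list as a parameter are assumptions for illustration, not an existing STNO feature.

```python
import ipaddress


def static_route_cidrs(vpc_cidrs, excluded_cidrs):
    """Return the VPC CIDRs that do not fall inside any excluded block."""
    excluded = [ipaddress.ip_network(c.strip()) for c in excluded_cidrs]
    keep = []
    for cidr in vpc_cidrs:
        network = ipaddress.ip_network(cidr.strip())
        # subnet_of() requires both networks to be the same IP version.
        covered = any(
            network.version == block.version and network.subnet_of(block)
            for block in excluded
        )
        if not covered:
            keep.append(str(network))
    return keep
```

For the example above, the result would be the two primary CIDRs, which are the only ones that would then get static routes in the TGW route table.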

Additional context

Provide a way to have approval via invoking a Lambda function workflow

We want to approve any VPC whose CIDR is registered in our IPAM system and that has a certain tag. We would like to approve any VPC in our organisation, but if someone tries to attach a VPC with a CIDR that doesn't belong to them, this might ruin connectivity for other teams, and we would like to avoid that.
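
As an illustration of the decision such an approval Lambda function might make, here is a minimal sketch. Everything in it is hypothetical: the function name `should_approve`, the shape of the inputs, and the assumption that the registered CIDRs have already been fetched from the IPAM system.

```python
import ipaddress


def should_approve(vpc_cidr, vpc_tags, registered_cidrs, required_tag_key):
    """Approve only a tagged VPC whose CIDR lies inside a registered block."""
    if required_tag_key not in vpc_tags:
        return False
    vpc_network = ipaddress.ip_network(vpc_cidr)
    for cidr in registered_cidrs:
        allowed = ipaddress.ip_network(cidr)
        # subnet_of() requires both networks to be the same IP version.
        if vpc_network.version == allowed.version and vpc_network.subnet_of(allowed):
            return True
    return False
```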

Allow option to mix Principal Types

Currently STNO only allows sharing with a list of account IDs or an Organization ARN. It would be great to allow a mixture of both, for example when using the Organization ARN but also wanting to add one account from a different AWS Organizations organization.

Using existing TGW already registered to existing global network fails

The solution allows an existing TGW and an existing global network to be set. It looks like the solution assumes the TGW is not already registered with the global network. In my case it already is. It would be nice if it were possible to state that the TGW is already registered with the global network so that this step is skipped.

Removing Subnet from TGW Attachment results in VPC default route being removed

Describe the bug

If a VPC has multiple subnets attached to the TGW via the Attach-to-tgw tag and one of the tags is removed to detach a single subnet, the default route for the VPC is removed even though other subnets are still attached to the TGW.

To Reproduce

  1. Create a VPC with subnets in multiple AZs
  2. Attach multiple subnets to the TGW via the Attach-to-tgw tag
  3. Verify there is a default route of 0.0.0.0/0 in the VPC route table and multiple subnets are attached to the TGW.
  4. Remove the Attach-to-tgw tag from a single subnet
  5. Notice there are still subnet(s) attached to the TGW, but there is no default route in the VPC route table and the subnets attached could not route to the TGW.

Expected behavior

Default Route is only removed when all subnets are detached.
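
The expected behavior can be sketched as a small predicate. This is not the solution's actual code: the function name and parameters are hypothetical, and only the "RemoveSubnet" action value is taken from this report.

```python
def should_remove_default_route(action, attached_subnet_ids, removed_subnet_id):
    """Remove the default route only when the last attached subnet goes away."""
    remaining = [s for s in attached_subnet_ids if s != removed_subnet_id]
    return action == "RemoveSubnet" and not remaining
```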

Please complete the following information about the solution:

  • [ v3.1 ] Version: [e.g. v1.0.0]

To get the version of the solution, you can look at the description of the created CloudFormation stack. For example, "(SO0009) - The AWS CloudFormation template for deployment of the aws-centralized-logging. Version v1.0.0". You can also find the version from releases

  • [ us-east-2 ] Region: [e.g. us-east-1]
  • [ No ] Was the solution modified from the version published on this repository?
  • If the answer to the previous question was yes, are the changes available on GitHub?
  • [ Yes ] Have you checked your service quotas for the services this solution uses?
  • [ No ] Were there any errors in the CloudWatch Logs? How to enable debug mode?

Screenshots
If applicable, add screenshots to help explain your problem (please DO NOT include sensitive information).

Additional context
The error is being caused here in the vpc_handler. I'm wondering if self.event.get("Action") == "RemoveSubnet" should not be there.

How to add an Additional Group as Admin

We have STNO integrated with an Azure AD group. The group is created in Cognito with the users, but I cannot see any way to grant this group admin access. By default it is a read-only user. I tried adding the group to the AppSync schema where AdminGroup was defined, but it didn't seem to make any difference.

I don't want to have to add the users from this group to the AdminGroup, but rather have the group itself assigned as admin.
