terraform-nixos's Introduction

terraform-nixos

This repository contains a set of Terraform Modules designed to deploy NixOS machines. These modules are designed to work together and support different deployment scenarios.

What is Terraform?

Terraform is a tool for declaring infrastructure as code.

What is Nix, nixpkgs and NixOS?

Nix is a build system and package manager that makes it possible to manage whole system configurations as code. nixpkgs is a set of 20k+ packages built with Nix. NixOS is a Linux distribution built on top of nixpkgs.

What is a Terraform Module?

A Terraform Module refers to a self-contained package of Terraform configurations that are managed as a group. This repo contains a collection of Terraform Modules which can be composed together to create useful infrastructure patterns.

Terraform + Nix vs NixOps

NixOps is a great tool for personal deployments. It handles a lot of things like cloud resource creation, machine NixOS bootstrapping and deployment.

The difficulty arises when the cloud resources are not supported by NixOps. It takes a lot of work to map all the cloud APIs. Terraform, by contrast, has become an industry standard and has thousands of people contributing new cloud API mappings all the time.

Another issue arises when sharing the configuration as code among multiple developers. Both NixOps and Terraform maintain a state file of "known applied" configuration. Unlike NixOps, Terraform provides facilities to sync and lock the state file so it's available to other users.
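
As a rough illustration of the state sharing described above, a remote backend can be declared in the Terraform configuration. This sketch assumes a Google Cloud Storage bucket; the bucket name and prefix are hypothetical placeholders, and other backends are configured along similar lines.

```hcl
terraform {
  # Keep the state in a shared location so every developer works from the
  # same copy. The gcs backend also locks the state during operations.
  backend "gcs" {
    bucket = "example-terraform-state" # hypothetical bucket name
    prefix = "terraform-nixos/demo"    # hypothetical path inside the bucket
  }
}
```

After adding or changing a backend block, terraform init has to be run again so the state is moved to the new location.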

The approach here is to use Terraform to create all the cloud resources. With the google_image_nixos_custom module it's possible to pre-build images for auto-scaling scenarios. Alternatively, the generic deploy_nixos module provides a push model similar to NixOps.

Overall, Terraform + Nix is more flexible and scales better. But it's also more cumbersome to use, as it requires learning two languages instead of one, and the integration between the two is a bit clunky.

Terraform Modules

The list of modules provided by this project:

Using these modules from your terraform configuration

Terraform supports importing modules directly from a GitHub repository.

For example, to use the deploy_nixos module:

module "deploy_nixos" {
  source = "github.com/tweag/terraform-nixos//deploy_nixos?ref=ced68729b6a0382dda02401c8f663c9b29c29368"

  … module-specific fields …
}

Note the double //, which separates the GitHub repository URL from the subdirectory that contains the module. ?ref= selects a specific git ref of the repository, in this case the commit ced687….

Examples

To better understand how these modules can be used together, look into the ./examples folder.

Related projects

Future

  • Support other cloud providers.
  • Support nixos-infect bootstrapping method.

Contributions are welcome!

Thanks

Thanks to Digital Asset for generously sponsoring this work!

Thanks to Tweag for enabling this work and the continuous support!

License

This code is released under the Apache 2.0 License. Please see LICENSE for more details.

Copyright © 2018 Tweag I/O.

terraform-nixos's People

Contributors

adrian-gierakowski, asymmetric, betaboon, exarkun, fpletz, friede80, ixmatus, knl, lucianu, nlewo, onsails, phfroidmont, pingiun, profpatsch, roberth, zimbatm

terraform-nixos's Issues

How to deploy a new NixOS version?

Hey!

I'm trying to deploy a new NixOS version to a GCE VM.

I've both modules, but they come with versions.
It would be cool to have a complete example for a minimal deployment.

I'll tag the maintainer just in case: @adrian-gierakowski
Thanks!

Error when using config and config_pwd option

Hello,

When I use the config and config_pwd options to pass a dynamic NixOS configuration, I get the error:

error: cannot coerce a set to a string, at (string):6:14

Looking at the code, I can see that the configuration variable is used with the import function, which only works if the type is a string.

https://github.com/tweag/terraform-nixos/blob/646cacb12439ca477c05315a7bfd49e9832bc4e3/deploy_nixos/nixos-instantiate.sh#L30

Am I using the config parameter wrong?

Google Cloud services fail to start

Describe the bug
After running terraform apply, I receive this error message:
https://defuse.ca/b/QnVKBcVO
Note that I enabled build_on_target in case that's relevant, because I was facing invalid signatures for one of the packages mentioned here:

error: cannot add path '/nix/store/0km4ablsx26i1755jq4vq49d21q7p5vp-unit-google-clock-skew-daemon.service' because it lacks a valid signature

To Reproduce
Relevant snippet of main.tf:

module "nixos_image_1809" {
  source = "github.com/tweag/terraform-nixos/google_image_nixos"
  nixos_version = "latest"
}

module "deploy_nixos" {
    source = "git::https://github.com/tweag/terraform-nixos.git//deploy_nixos?ref=646cacb12439ca477c05315a7bfd49e9832bc4e3"
    nixos_config = "${path.module}/configuration.nix"
    target_host = google_compute_instance.example.network_interface.0.access_config.0.nat_ip 
    target_user = "USER_NAME"
    ssh_agent = false
    ssh_private_key_file = "/home/USER/.ssh/SSHKEY"
    build_on_target = "true"
}

resource "google_compute_instance" "example" {
  name         = "example"
  machine_type = "e2-micro"

  boot_disk {
    initialize_params {
      image = module.nixos_image_1809.self_link
      size = 30
    }
  }

  network_interface {
    network       = "default"
    access_config {
    }
  }

  metadata = {
    enable-oslogin = "TRUE"
  }
}

and configuration.nix is the default setup:

{ modulesPath, ... }:
{
  imports = [
    "${toString modulesPath}/virtualisation/google-compute-image.nix"
  ];
}

I then ran terraform init and terraform apply.

Expected behavior
I expected the deployment to complete successfully.

Environment

  • OS name + version: Building on NixOS Unstable with Nix 2.4pre20210802_47e96bb, VM is running Nix 2.3.15
  • Version of the code: Latest Git commit

Managing kernel upgrades

Currently, kernel upgrades are not well supported: after terraform apply deploys a new kernel, the old kernel keeps running.
So, when the kernel on a host is upgraded, the host should be rebooted or the new kernel should be loaded with kexec.

This is hard for Terraform to handle, since we don't want to reboot the machine on every configuration change. Maybe this should be handled by the nixos-rebuild script itself.
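
One possible workaround, sketched here as an illustration rather than something this repository provides, is a separate resource that runs after each deployment and schedules a reboot only when the booted kernel differs from the kernel of the currently activated system. /run/booted-system and /run/current-system are the standard NixOS symlinks. The resource name, the instance reference, and the use of a module.deploy_nixos.id output as a trigger are all assumptions.

```hcl
resource "null_resource" "reboot_on_kernel_change" {
  triggers = {
    # Assumption: the deploy module exposes a value that changes on every
    # deployment. Substitute whatever tracks deployments in your setup.
    deploy_id = module.deploy_nixos.id
  }

  connection {
    type = "ssh"
    user = "root"
    # Hypothetical instance reference; use the same host you deploy to.
    host = google_compute_instance.example.network_interface.0.access_config.0.nat_ip
  }

  provisioner "remote-exec" {
    inline = [
      # Compare the kernel that was booted with the kernel of the activated
      # system, and schedule a reboot one minute out if they differ.
      "if [ \"$(readlink -f /run/booted-system/kernel)\" != \"$(readlink -f /run/current-system/kernel)\" ]; then shutdown -r +1; fi",
    ]
  }
}
```

Scheduling the reboot a minute ahead, instead of rebooting immediately, gives the provisioner's SSH session time to exit cleanly.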

Tooling broken on 19.09

This project doesn't pin a known good version of nixpkgs for its tooling, breaking the formatter, pre-commit etc.
The main problem is that terraform has been updated from 0.11 to 0.12.

error: unknown flag '--builders '''

Describe the bug
While trying to deploy, I get this error:

module.deploy_nixos.null_resource.deploy_nixos (local-exec): Executing: [".terraform/modules/deploy_nixos/deploy_nixos/nixos-deploy.sh" "/nix/store/5dgmlkwk2kyqm6lvbvfanqqcb5v0841m-nixos-system-unnamed-20.03post-git.drv" "/nix/store/6g6fcqv01ryarf204bhx8w5vcmrilgky-nixos-system-unnamed-20.03post-git" "[email protected]" "22" "false" "./id_rsa.pem" "switch" "--option" "substituters" "https://cache.nixos.org/" "--option" "trusted-public-keys" "cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY=" "--builders ''" "ignoreme"]
module.deploy_nixos.null_resource.deploy_nixos (local-exec): --- building on deployer
module.deploy_nixos.null_resource.deploy_nixos (local-exec): error: unknown flag '--builders '''

To Reproduce

I'm not sure, to be honest. I just followed the example...

Expected behavior

A successful deployment.

  • OS name + version:
  • Version of the code:

Google storage bucket md5hash changes on every run of Terraform apply in GitHub actions

Describe the bug
When running terraform apply using a GitHub actions workflow with google_image_nixos_custom as shown in the example config:

resource "random_id" "bucket" {
  byte_length = 8
}

# create a bucket to upload the image into
resource "google_storage_bucket" "nixos-images" {
  name     = "nixos-images-${random_id.bucket.hex}"
  location = "US"
}

# create a custom nixos base image the deployer can SSH into
#
# this could also include much more configuration and be used to feed the
# auto-scaler with system images
module "nixos_image_custom" {
  source      = "github.com/tweag/terraform-nixos/google_image_nixos_custom"
  bucket_name = google_storage_bucket.nixos-images.name
  nixos_config = "${path.module}/image_nixos_custom.nix"
}

Terraform detects that the md5hash of the google_storage_bucket_object has changed, even when I haven't made any changes to the repo. This is the output:

Terraform will perform the following actions:

  # module.nixos_image_custom.google_storage_bucket_object.nixos must be replaced
+/- resource "google_storage_bucket_object" "nixos" ***
      ~ crc32c           = "hptang==" -> (known after apply)
      ~ detect_md5hash   = "okiQS+ha88pmBeqKlPAG1Q==" -> "different hash" # forces replacement
      - event_based_hold = false -> null
      ~ id               = "nixos-images-2a682647b7c45337-images/m8ky02n1ik2gfyf7wsmjv0saiczb4r54-nixos-image-23.05pre-git-x86_64-linux.raw.tar.gz" -> (known after apply)
      + kms_key_name     = (known after apply)
      ~ md5hash          = "okiQS+ha88pmBeqKlPAG1Q==" -> (known after apply)
      ~ media_link       = "https://storage.googleapis.com/download/storage/v1/b/nixos-images-2a682647b7c45337/o/images%2Fm8ky02n1ik2gfyf7wsmjv0saiczb4r54-nixos-image-23.05pre-git-x86_64-linux.raw.tar.gz?generation=1673285901733547&alt=media" -> (known after apply)
      - metadata         = *** -> null
        name             = "images/m8ky02n1ik2gfyf7wsmjv0saiczb4r54-nixos-image-23.05pre-git-x86_64-linux.raw.tar.gz"
      ~ output_name      = "images/m8ky02n1ik2gfyf7wsmjv0saiczb4r54-nixos-image-23.05pre-git-x86_64-linux.raw.tar.gz" -> (known after apply)
      ~ self_link        = "https://www.googleapis.com/storage/v1/b/nixos-images-2a682647b7c45337/o/images%2Fm8ky02n1ik2gfyf7wsmjv0saiczb4r54-nixos-image-23.05pre-git-x86_64-linux.raw.tar.gz" -> (known after apply)
      ~ storage_class    = "STANDARD" -> (known after apply)
      - temporary_hold   = false -> null
        # (3 unchanged attributes hidden)
    ***

Plan: 1 to add, 0 to change, 1 to destroy.
module.nixos_image_custom.google_storage_bucket_object.nixos: Creating...
module.nixos_image_custom.google_storage_bucket_object.nixos: Still creating... [10s elapsed]
module.nixos_image_custom.google_storage_bucket_object.nixos: Creation complete after 12s [id=nixos-images-2a682647b7c45337-images/m8ky02n1ik2gfyf7wsmjv0saiczb4r54-nixos-image-23.05pre-git-x86_64-linux.raw.tar.gz]
module.nixos_image_custom.google_storage_bucket_object.nixos (deposed object 32048425): Destroying... [id=nixos-images-2a682647b7c45337-images/m8ky02n1ik2gfyf7wsmjv0saiczb4r54-nixos-image-23.05pre-git-x86_64-linux.raw.tar.gz]
module.nixos_image_custom.google_storage_bucket_object.nixos: Destruction complete after 0s

Apply complete! Resources: 1 added, 0 changed, 1 destroyed.
::debug::Terraform exited with code 0.

The image stored in the bucket isn't modified, so I'm not sure why the bucket hash is changing.

To Reproduce
Use the above config for main.tf and use GitHub actions to run terraform apply -auto-approve -input=false

Expected behavior
Terraform should detect that nothing has changed and not make any changes.

Environment

  • OS name + version: Runner is on ubuntu-latest with Terraform 1.3.6
  • Version of the code: Latest commit

Additional context
When running terraform apply on my local machine, the md5 hash does not change, and terraform correctly detects that no changes are needed.

Random file provisioner error and SSH authentication failure with AWS EC2

Describe the bug
When provisioning a new instance, it will sometimes (usually, but not always) fail with a "file provisioner error" caused by an SSH authentication failure.

To Reproduce
terraform init and terraform apply with the following configuration (main.tf is placed in terraform/main.tf, and the .nix files are nixos/configuration.nix and nixos/git-server.nix): https://gist.github.com/spearman/58db5a31afd88c8962d9a5b3da78ac00

Expected behavior
I would expect it to be reproducible and not fail randomly.

Environment

  • OS name + version: NixOS 21.11
  • Version of the code: rev 646cacb

Additional context
Here is the full output when running terraform apply:

https://gist.github.com/spearman/5f19ffb4c80791f0444c4a2a3b88afab

This was after it had been successfully deployed and I was trying to change the configuration. Usually when it occurs during creation I can log in as root with the generated .pem file, but the nixos configuration has not been applied.

I thought maybe it was a problem with the particular AMI I was using, but I have experienced the problem with 20.09, 21.05, and 21.11.

[WARN] no schema for provisioner … so provisioner block references cannot be detected

2020/03/27 15:43:55 [WARN] no schema for provisioner "file" is attached to module.staging.module.nixos-staging.module.deploy_nixos.null_resource.deploy_nixos, so provisioner block references cannot be detected
2020/03/27 15:43:55 [WARN] no schema for provisioner "file" is attached to module.staging.module.nixos-staging.module.deploy_nixos.null_resource.deploy_nixos, so provisioner block references cannot be detected
2020/03/27 15:43:55 [WARN] no schema for provisioner "file" is attached to module.staging.module.nixos-staging.module.deploy_nixos.null_resource.deploy_nixos, so provisioner block references cannot be detected
2020/03/27 15:43:55 [WARN] no schema for provisioner "remote-exec" is attached to module.staging.module.nixos-staging.module.deploy_nixos.null_resource.deploy_nixos, so provisioner block references cannot be detected
2020/03/27 15:43:55 [WARN] no schema for provisioner "local-exec" is attached to module.staging.module.nixos-staging.module.deploy_nixos.null_resource.deploy_nixos, so provisioner block references cannot be detected
2020/03/27 15:43:55 [WARN] no schema for provisioner "file" is attached to module.staging.module.nixos-staging.module.deploy_nixos.null_resource.deploy_nixos, so provisioner block references cannot be detected
2020/03/27 15:43:55 [WARN] no schema for provisioner "file" is attached to module.staging.module.nixos-staging.module.deploy_nixos.null_resource.deploy_nixos, so provisioner block references cannot be detected
2020/03/27 15:43:55 [WARN] no schema for provisioner "file" is attached to module.staging.module.nixos-staging.module.deploy_nixos.null_resource.deploy_nixos, so provisioner block references cannot be detected
2020/03/27 15:43:55 [WARN] no schema for provisioner "remote-exec" is attached to module.staging.module.nixos-staging.module.deploy_nixos.null_resource.deploy_nixos, so provisioner block references cannot be detected
2020/03/27 15:43:55 [WARN] no schema for provisioner "local-exec" is attached to module.staging.module.nixos-staging.module.deploy_nixos.null_resource.deploy_nixos, so provisioner block references cannot be detected
> terraform version
Terraform v0.12.8

Don’t know what that means, but it might be a bug.

Pass arguments to nixos_config derivation

Is your feature request related to a problem? Please describe.
I want to parameterize my NixOS config based on some resource attributes (e.g. aws_instance.xxx.public_dns), but this module doesn't have a good way to pass inputs to the configuration.

Describe the solution you'd like

  • This module should provide an extra argument for derivation parameters, e.g. derivation_args or similar.
  • The NixOS configuration file should be a function (e.g. { myAddress }: ... ) when arguments are passed.
  • The derivation should rebuild and redeploy every time input changes cause a different output.

Describe alternatives you've considered
It gets really annoying to use a raw string instead of a .nix file, especially when I need string interpolation, so I want to keep it in a separate file.

Additional context
I made a really primitive branch that does this here. I can then do extra_eval_args = [ "--arg" "configArgs" "..." ]; but the inputs don't update on subsequent runs of terraform plan. Changes to the configuration are detected and applied correctly, but with an old value of configArgs that doesn't apply anymore. This is probably because I'm using Terraform string interpolation in the argument, and I want it to reevaluate when the result of interpolation changes.

keys in /var/keys are not in the `keys` group

# l /var/keys
total 24K
drwxr-x---  2 root keys 4.0K Jun 19 13:14 .
drwxr-xr-x 10 root root 4.0K Jun 19 13:04 ..
-rw-r-----  1 root root 1.9K Jun 19 13:14 certificate_pem

/var/keys itself is root:keys, but the keys in it are root:root and readable only by their owning user and group, so they are unreadable by the keys group.

When nix_image_custom is updated, instances are not re-created correctly

I use nixos_image_custom to create a NixOS image from a given NixOS configuration and put it into a bucket. Then I use this image in google_compute_instance_template (via disk.source_image). The problem is that when I change the configuration I get the following error:

* google_compute_instance_template.buildkite_nixos: 1 error occurred:
* google_compute_instance_template.buildkite_nixos: reading body EOF

So I have to go delete the instance group first, then delete the template, and only after that I can run terraform plan and terraform apply. What should be happening instead is that terraform should be able to figure out that it should drop the old object in the bucket (containing the old image) and create a new object with the new image, then it should re-create template and instances accordingly.
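
A common Terraform pattern for replacing resources that are still referenced elsewhere is to combine a generated name with create_before_destroy. The sketch below shows that pattern for the template described above; it is not confirmed as the fix for this report, and the machine type and name prefix are hypothetical.

```hcl
resource "google_compute_instance_template" "buildkite_nixos" {
  # A name prefix lets Terraform generate a fresh unique name for each
  # replacement, so the new template can exist alongside the old one.
  name_prefix  = "buildkite-nixos-"
  machine_type = "n1-standard-1" # hypothetical machine type

  disk {
    source_image = module.nixos_image_custom.self_link
    boot         = true
  }

  network_interface {
    network = "default"
  }

  lifecycle {
    # Create the replacement before destroying the old template, so the
    # instance group can be switched to the new one first.
    create_before_destroy = true
  }
}
```

With this ordering the instance group manager can be updated in place to reference the new template before the old template is deleted.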

SSH authentication fails

Hello! First off, thank you for the project!

I'm not too familiar with Terraform so maybe I'm doing something dumb. I've essentially copied example to another directory and modified the deploy_nixos.tf file.

Prior to this, I ran eval "$(ssh-agent -s)" and ssh-add ~/.ssh/id_rsa.

Error: Error applying plan:

1 error occurred:
    * module.deploy_nixos.null_resource.deploy_nixos: timeout - last error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain

data "google_compute_network" "default" {
  name = "default"
}

resource "google_compute_firewall" "deploy-nixos" {
  name    = "deploy-nixos"
  network = "${data.google_compute_network.default.name}"

  allow {
    protocol = "icmp"
  }

  // Allow SSH access
  allow {
    protocol = "tcp"
    ports    = ["22", "80", "443"]
  }

  source_tags = ["nixos"]
}

resource "google_compute_instance" "deploy-nixos" {
  name         = "deploy-nixos-example"
  machine_type = "g1-small"
  zone         = "europe-west2-a"
  # region      = "eu-west2"

  // Bind the firewall rules
  tags = ["nixos"]

  boot_disk {
    initialize_params {
      // Start with an image the deployer can SSH into
      image = "${module.nixos_image_custom.self_link}"
      size  = "25"
    }
  }

  network_interface {
    network = "default"

    // Give it a public IP
    access_config {}
  }

  lifecycle {
    // No need to re-deploy the machine if the image changed
    // NixOS is already immutable
    ignore_changes = ["boot_disk"]
  }
}

module "deploy_nixos" {
  source = "../../deploy_nixos"

  // Deploy the given NixOS configuration. In this case it's the same as the
  // original image. So if the configuration is changed later it will be
  // deployed here.
  nixos_config = "${path.module}/image_nixos_custom.nix"

  target_user = "root"
  target_host = "${google_compute_instance.deploy-nixos.network_interface.0.access_config.0.nat_ip}"

  triggers = {
    // Also re-deploy whenever the VM is re-created
    instance_id = "${google_compute_instance.deploy-nixos.id}"
  }
}

If I manually try to SSH I get the same authentication error. I've mounted the disk on another instance and inspecting /root shows no .ssh directory:

root@instance-2:/home/chris/mount/root# ls -altr
total 12
drwx------  3 root root 4096 Jun  8 20:47 .
drwx------  2 root root 4096 Jun  8 20:47 .nix-defexpr
drwxr-xr-x 16 root root 4096 Jun  8 21:09 ..

Speed up the process of copying drv files to the remote host

Is your feature request related to a problem? Please describe.

When deploying changes with deploy_nixos a large portion of the deploy time is spent copying .drv files to the remote host. These files are very small, there are often very many of them, and round-trip latency ends up accounting for most of the time spent.

Describe the solution you'd like

According to #nixos on Matrix, the nix-copy-closure protocol is "very chatty" (i.e., it has many round-trips) because it tries not to send any objects that the remote already has and it tries not to send anything that the remote doesn't already have all dependencies for.

When buildOnTarget is set, deploy_nixos runs nix-copy-closure in nixos-deploy.sh specifically to copy derivations only. Instead of using nix-copy-closure in this case, it could perform an export, transfer the export, then perform an import. This would be a much less "chatty" exchange and largely eliminate the effects of round-trip latency. Since drvs are small, the penalty of possibly transferring some which do not need to be transferred is minimal. An entire NixOS system's drvs might amount to 30MB or 40MB.

This option could be guarded by a configuration toggle if there's some desire to put the choice about this trade-off into the end-user's hands.

Describe alternatives you've considered

None

Additional context

None

Consider Terranix Integration

This would probably be better as a discussion, but discussions weren't enabled on this repo at time of creation

Overview

Terranix is "a NixOS way to create terraform json files." It leverages the NixOS module system to generate a terraform config file.

By providing the utilities in this package as terranix module(s) in addition to (or instead of) the base terraform module, we could potentially simplify the implementation and provide a more flexible interface for consumers of this module.

Pros

  • NixOS modules are far more flexible than Terraform modules, allowing overriding of values created by the modules.
  • Tighter coupling to the Nix language simplifies the implementation of deploy_nixos (I believe the nixos-instantiate.sh script could be removed entirely, as all that info could be computed directly in the Nix expression).

Cons

  • Must continue to maintain the terraform HCL version of the module to allow non-terranix users to consume the module.
  • Duplicating logic between the Terranix and HCL versions substantially increases maintenance cost and the chance of introducing bugs.
  • Resources/data/etc. created by Terranix modules are not namespaced as they are with native Terraform modules. This could potentially cause naming collisions.

deploy_nixos leaks files from the working directory into the world-readable nix store

Describe the bug
deploy_nixos evaluates an expression that has ./. as src and leaks the contents of the working directory into the world-readable nix store. The working directory may contain (.gitignored) secrets, so this is a security issue.
https://github.com/tweag/terraform-nixos/blob/646cacb12439ca477c05315a7bfd49e9832bc4e3/deploy_nixos/nixos-instantiate.sh#L22

To Reproduce
Use deploy_nixos module

Expected behavior
Don't leak files from working directory.

Environment

  • OS name + version: NixOS unstable
  • Version of the code: 646cacb

terraform-nixos doesn't work on Terraform Cloud

Describe the bug

Terraform v0.14.4
Configuring remote state backend...
Initializing Terraform configuration...
tls_private_key.state_ssh_key: Refreshing state... [id=73c4bc5aee756477cd0e5329a0217a5d5538bed7]
local_file.machine_ssh_key: Refreshing state... [id=403f29bde18a7b04fbb51d277140357369628b1f]
aws_key_pair.generated_key: Refreshing state... [id=generated-key-597bc4e3ec93b09f8543849173beca0a55dd2c5ce00ad482b4ca79bde84c7732]
aws_security_group.ssh_and_egress: Refreshing state... [id=sg-0ead032fef275ba39]
aws_instance.machine: Refreshing state... [id=i-063cd3c8a22644746]

Error: failed to execute ".terraform/modules/deploy_nixos/deploy_nixos/nixos-instantiate.sh": running (instantiating):  'nix-instantiate' '--show-trace' '--expr' $'\n  { system, configuration, ... }:\n  let\n    os = import <nixpkgs/nixos> { inherit system configuration; };\n    inherit (import <nixpkgs/lib>) concatStringsSep;\n  in {\n    substituters = concatStringsSep " " os.config.nix.binaryCaches;\n    trusted-public-keys = concatStringsSep " " os.config.nix.binaryCachePublicKeys;\n    drv_path = os.system.drvPath;\n    out_path = os.system;\n    inherit (builtins) currentSystem;\n  }' '--argstr' 'configuration' '/terraform/configuration.nix' '--argstr' 'system' 'x86_64-linux' -A out_path
.terraform/modules/deploy_nixos/deploy_nixos/nixos-instantiate.sh: line 44: nix-instantiate: command not found

To Reproduce

Follow the guide at https://nixos.org/guides/deploying-nixos-using-terraform.html (it's at the part with the configuration.nix file).

Expected behavior

Environment

  • system: "x86_64-linux"
  • host os: Linux 5.10.1-zen1, NixOS, 21.03.20210109.257cbbc (Okapi)
  • multi-user?: yes
  • sandbox: yes
  • version: nix-env (Nix) 2.4pre20201205_a5d85d0
  • channels(root): "nixos-21.03pre260232.733e537a8ad"
  • channels(bbigras): "home-manager-20.09"
  • nixpkgs: /nix/var/nix/profiles/per-user/root/channels/nixos

Terraform v0.14.4

I tried with 5f5a040 and f0f6232.

Additional context

How to pass parameters to Nix script from Terraform?

Hi!
Let's say I want to pass PublicIP and other parameters from Terraform to Nix file, how should I do that?
I tried using

extra_eval_args = ["--argstr", "some_arg", aws_instance.node.public_ip]
or
extra_build_args = ["--argstr", "some_arg", aws_instance.node.public_ip]

but none of those are shown in the Nix file as arguments.

What other approach should I use?

Thanks
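
One approach that might work, sketched here under assumptions, is to render the Nix configuration from a template with Terraform's templatefile function and hand the result to the module through the config and config_pwd options that another report on this page mentions. The template file name and the public_ip variable are hypothetical, and the exact semantics of config and config_pwd should be checked against the module's variable definitions.

```hcl
module "deploy_nixos" {
  source = "github.com/tweag/terraform-nixos//deploy_nixos?ref=ced68729b6a0382dda02401c8f663c9b29c29368"

  # Render a Nix expression from a template, substituting Terraform values.
  # "configuration.nix.tpl" is a hypothetical file that uses ${public_ip}.
  config = templatefile("${path.module}/configuration.nix.tpl", {
    public_ip = aws_instance.node.public_ip
  })

  # Assumption: the directory in which the rendered expression is evaluated.
  config_pwd = path.module

  target_host = aws_instance.node.public_ip
  target_user = "root"
}
```

Because Nix and Terraform templates both use the ${...} syntax, any Nix interpolation inside the template has to be written as $${...} so that Terraform emits it literally.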

Use the nixos-rebuild script

As noted by zimbatm in #25, the deploy script is starting to look like nixos-rebuild.
It seems like my use case from #25 is actually subsumed by nixos-rebuild, unlike my earlier quick interpretation of the docs. Clearly I didn't find the parenthesized bit from --target-host:

(and no build artifacts will be copied to the local machine)

challenges

nixos-rebuild does need some bootstrapping. nixos-rebuild.sh has a build process that substitutes some dependencies into the script. Replicating this is a step back from the current script, which takes dependencies from the environment, allowing it to run without a proper /nix/store. This is useful when deploying from restrictive environments without root access or mount namespace capability. So instead of substituting dependencies from the nix store into the script, we should allow values from the environment.

Note: static nix is not officially supported yet, but here's a branch that makes it work on x86_64-linux deployer machines. https://github.com/tweag/terraform-nixos/compare/master...hercules-ci:install-static-user-nix?expand=1

nixos-instantiate.sh produces invalid output

Describe the bug
Running terraform plan using the latest commit (e3cfe7c) gives the following error:

Error: command ".terraform/modules/development.deploy_nixos/deploy_nixos/nixos-instantiate.sh" produced invalid JSON: invalid character 'r' looking for beginning of value

  on .terraform/modules/development.deploy_nixos/deploy_nixos/main.tf line 123, in data "external" "nixos-instantiate":
 123: data "external" "nixos-instantiate" {

The newly added check for the type of readlink on the system gives the following output to stdout:

> readlink --version | grep GNU
readlink (GNU coreutils) 8.29
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.

Terraform sees this output, tries to interpret it as the JSON output of the program, and fails.

To Reproduce
Run terraform plan on a Terraform project using the latest deploy_nixos module, on a machine with GNU readlink.

Expected behavior
terraform plan to complete without error.

Environment

  • OS name + version: 20.04.1 Ubuntu
  • Version of the code: Terraform v0.12.20

deploy_nixos: hash changes when files change in the working directory, even .gitignored files with flakes

Describe the bug
deploy_nixos evaluates an expression which has ./. as source. The hash of this derivation changes when local files change, even .gitignored files, even when using a flake. This causes unnecessary reevaluation.
https://github.com/tweag/terraform-nixos/blob/646cacb12439ca477c05315a7bfd49e9832bc4e3/deploy_nixos/nixos-instantiate.sh#L22

To Reproduce
terraform apply a deploy_nixos flake config. Change a .gitignored file. apply again. The evaluation is slow and the hash changes, requiring a new deploy.

Expected behavior
The evaluation should be fast and no changes should be detected.

Environment

  • OS name + version: NixOS unstable
  • Version of the code: 646cacb

`ssh_key_file` bootstrap problem

Context

I am attempting to migrate my infrastructure from NixOps to Terraform to be able to use a more mature deployment system. I have been loosely following this tutorial on nix.dev, with the addition of using terranix to generate my terraform config.

Problem

It does not appear to be possible to use ssh_key_file with a file generated by Terraform (i.e. tls_private_key + local_sensitive_file) due to limitations of the file() function. In my attempts, I always get the following:

│ Error: Invalid function argument
│
│   on .terraform/modules/deploy_nixos/main.tf line 91, in locals:
│   91:   ssh_private_key      = local.ssh_private_key_file == "-" ? var.ssh_private_key : file(local.ssh_private_key_file)
│     ├────────────────
│     │ while calling file(path)
│     │ local.ssh_private_key_file is "./id_rsa"
│
│ Invalid value for "path" parameter: no file exists at "./id_rsa"; this function works only with files that are distributed as part
│ of the configuration source code, so if this file will be created by a resource in this configuration you must instead obtain this
│ result from an attribute of that resource.

Workarounds

  1. It is possible to work around this by running terraform apply without the deploy_nixos module first, to generate the file, and then running terraform apply a second time. This is not ideal for CI/CD workflows, as it would require maintaining multiple Terraform config files.
  2. Alternatively, one could use ssh_key instead, which does work properly with Terraform's dependency system. This is also problematic, because the output from deploy_nixos is then suppressed by default, since it would otherwise print the contents of the SSH private key to stdout.

Questions

  1. I seem to be the only one having issues with this; is there a better approach that would allow me to have terraform manage my ssh key used for deployment? If not it seems like workaround (1) is probably my best option.
  2. Should I just not even be attempting to manage the ssh key used for deployment with IaC and use some sort of out-of-band method for distributing ssh keys to my deployer hosts instead?

NixOS upgrades may break due to lack of stateVersion

Describe the bug

Installations of NixOS should set the stateVersion option, such that NixOS can take legacy filesystem state locations and such into account.

To Reproduce

(hypothetical but bound to happen)

  1. Deploy NixOS, say 19.09. This creates a system state compatible with 19.09.
  2. Deploy NixOS, say 20.03. This now expects a system state that is like a fresh 20.03 install. It does not apply its compatibility measures, because it doesn't know that the system is still in a 19.09-like state.
  3. A database is down because its files are in the 19.09 location rather than the fresh 20.03 location, and another service is misconfigured because the default values for its options have changed.

Expected behavior
terraform-nixos saves the stateVersion on first deployment and sets it until the machine is destroyed.

Environment

  • OS name + version: n/a
  • Version of the code: master as of reporting

Additional context

From the docs

Every once in a while, a new NixOS release may change configuration defaults in a way incompatible with stateful data. For instance, if the default version of PostgreSQL changes, the new version will probably be unable to read your existing databases. To prevent such breakage, you should set the value of this option to the NixOS release with which you want to be compatible. The effect is that NixOS will use defaults corresponding to the specified release (such as using an older version of PostgreSQL). It's perfectly fine and recommended to leave this value at the release version of the first install of this system. Changing this option will not upgrade your system. In fact it is meant to stay constant exactly when you upgrade your system. You should only bump this option, if you are sure that you can or have migrated all state on your system which is affected by this option.
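
Until the module records this value itself, it can be pinned by hand in the deployed configuration. The fragment below is a minimal sketch that assumes the machine was first installed with 19.09, as in the steps above. It shows the manual setting only, not the automatic behavior requested in this issue.

```nix
{ config, pkgs, ... }:
{
  # Release the machine was first installed with (19.09 is assumed here,
  # matching the reproduction steps). Keep this constant across upgrades.
  system.stateVersion = "19.09";
}
```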

deploy_nixos: ssh ControlPersist keeps connection open until timeout

module.deploy_nixos.null_resource.deploy_nixos (local-exec): 103 store paths deleted, 226.86 MiB freed
module.deploy_nixos.null_resource.deploy_nixos (local-exec): debug1: client_input_channel_req: channel 2 rtype exit-status reply 0
module.deploy_nixos.null_resource.deploy_nixos (local-exec): debug1: channel 2: free: client-session, nchannels 3
module.deploy_nixos.null_resource.deploy_nixos (local-exec): debug1: channel 1: free: mux-control, nchannels 2
module.deploy_nixos.null_resource.deploy_nixos: Still creating... [30s elapsed]
module.deploy_nixos.null_resource.deploy_nixos: Still creating... [40s elapsed]
module.deploy_nixos.null_resource.deploy_nixos: Still creating... [50s elapsed]
module.deploy_nixos.null_resource.deploy_nixos: Still creating... [1m0s elapsed]
module.deploy_nixos.null_resource.deploy_nixos: Still creating... [1m10s elapsed]
module.deploy_nixos.null_resource.deploy_nixos: Still creating... [1m20s elapsed]

The timeout is ~100s.

Copy .nix files to /etc/nixos as well?

First, apologies for asking a noob question. I am playing around with deploy_nixos on DigitalOcean droplets and everything works great so far. I use a templatefile to generate configuration.nix (for IPs and such) and have some imports à la

  imports = [
    <nixpkgs/nixos/modules/virtualisation/digital-ocean-image.nix>
    ./common.nix
    ./private.nix
  ];

What I would like to achieve, besides applying the configuration, is to also copy the closure of all Nix files to /etc/nixos on the machine (you might imagine that common.nix also imports some blahblah/one.nix, which should then become /etc/nixos/blahblah/one.nix, and so on). The server was converted from Ubuntu 20.04 with nixos-infect, so /etc/nixos still contains the files left over from nixos-infect. Of course the usual way to deploy should be terraform apply from my workstation, but I am worried that I might run nixos-rebuild switch directly on the server by mistake, which could lead to problems. Or should I just rm -rf /etc/nixos on the server?

Transfer request to nix-community

Is your feature request related to a problem? Please describe.
This project is mature now and could benefit from shared maintenance. I'm also doing most of the maintenance right now but am not associated with Tweag anymore.

Describe the solution you'd like
Move the repo to https://github.com/nix-community

Describe alternatives you've considered
Stop maintaining the project, or fork it.

Additional context
/cc @aspiwack

It is difficult to propagate key changes to other parts of a system

Is your feature request related to a problem? Please describe.

When using the keys feature to put secret values into /var/keys it is difficult to cause other parts of the system to be updated with the new key values.

My understanding is that the old keys are first deleted from the remote system and then the new keys are uploaded and written to the correct location. After this, the build and instantiate steps are performed.

This means that the old system configuration is briefly exposed to the new keys but then (arbitrarily quickly) it is interrupted to be replaced by the new system. When the new system is fully in place, the new keys have already been written and it's difficult or impossible to know if they are new or not.

A systemd.path unit watching the keys doesn't help because it triggers too early in the process (when the old system and the new keys are in place).

The specific motivating use-case I have is that I am trying to deploy Matomo using the NixOS Matomo service. The NixOS service doesn't have a lot of configuration affordances. Instead, there's just a config.ini.php file that needs to be rewritten when configuration/secrets change. My goal is to rewrite this file whenever the Matomo secrets I'm deploying with deploy_nixos and the keys feature change.

Describe the solution you'd like

I'm not going to be very picky. Anything that provides a straightforward way to run some code on a system that has "settled" (i.e., has completely switched to the new configuration) after keys have changed should resolve this issue. Ideally, using this feature wouldn't involve writing a lot of custom systemd units or shell scripts.

One idea is that there could be a "keys-in-place" unit that is (re-)started after the new configuration has settled whenever the keys were rewritten earlier in the same deploy. I'm not exactly sure how the rest of this idea goes, though. Other units declare they're required by and run after this unit?

Another possibility entirely is that it's actually already easy to do this if you know One Weird Systemd Trick™. In this case, perhaps this trick could just be added near the keys documentation so non-systemd wizards can use it.
