google-cloud-ops-agents-ansible's Introduction

Ansible Role for Cloud Ops

This Ansible role installs the Cloud Ops agents.

Install the Role

  • [Recommended] To use Ansible Galaxy to handle dependencies, use this command to install:

    ansible-galaxy install googlecloudplatform.google_cloud_ops_agents

  • To use GitHub submodules to handle dependencies, install this directory in your roles path (usually in a roles directory alongside your playbook) under the name googlecloudplatform.google_cloud_ops_agents:

    git clone <this-git-repo> roles/googlecloudplatform.google_cloud_ops_agents
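
For repeatable installs, the Galaxy dependency can also be declared in a requirements file. The fragment below is a minimal sketch; the file name requirements.yml is a common convention rather than something this role mandates, and the commented-out version line is a placeholder.

```yaml
# requirements.yml (conventional name; any path passed to -r works)
roles:
  - name: googlecloudplatform.google_cloud_ops_agents
    # Optional: pin a released tag instead of tracking the default branch.
    # version: <tag>
```

Install it with ansible-galaxy install -r requirements.yml.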

There is a detailed tutorial as well.

Requirements

The target VMs need permission to call the Google Cloud APIs. If you are running old Compute Engine instances, or instances created without the default credentials, you must first complete the steps at https://cloud.google.com/monitoring/agent/authorization#before_you_begin.

Role Variables

The agent_type is a required variable used to specify which agent is being configured. The available options are monitoring, logging and ops-agent.

The package_state variable can be used to specify the desired state of the agent. The allowed values are present (default) and absent.
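
As an illustration of the absent state, a play along these lines should remove a previously installed agent. This is a sketch built only from the variables described above.

```yaml
# Uninstalling the Ops Agent
- hosts: all
  become: true
  roles:
    - role: googlecloudplatform.google_cloud_ops_agents
      vars:
        agent_type: ops-agent
        package_state: absent
```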

The version variable can be used to specify which version of the agent to install. The allowed values are latest (default), MAJOR_VERSION.*.* and MAJOR_VERSION.MINOR_VERSION.PATCH_VERSION, which are described in detail below.

version=latest This setting makes it easier to keep the agent version up to date, but it comes with a potential risk. When a new major version is released, the role may install the latest version of the agent from the new major release, which may introduce breaking changes. For production environments, consider using the version=MAJOR_VERSION.*.* setting below for safer agent deployments.

version=MAJOR_VERSION.*.* When a new major release is out, this setting ensures that only the latest version from the specified major version is installed, which avoids accidentally introducing breaking changes. This is recommended for production environments to ensure safer agent deployments.

version=MAJOR_VERSION.MINOR_VERSION.PATCH_VERSION This setting is not recommended, since it prevents upgrades to new versions of the agent that include bug fixes and other improvements.
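
A minimal sketch of the recommended production setting follows. The major version 2 is taken from the compatibility matrix in this README and is only an example value.

```yaml
# Pinning the Ops Agent to a major version
- hosts: all
  become: true
  roles:
    - role: googlecloudplatform.google_cloud_ops_agents
      vars:
        agent_type: ops-agent
        # Quoting is optional here but keeps the value unambiguously a string.
        version: "2.*.*"
```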

The main_config_file variable can be used to supply an absolute or relative path to a custom configuration file. This file will overwrite the configuration file on the target VM.

For more information, please see Configuring the Monitoring Agent, Configuring the Logging Agent or Configuring the Ops Agent.
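
For orientation, a custom Ops Agent configuration file passed through main_config_file might look like the sketch below. The receiver and pipeline names are illustrative assumptions; consult the Ops Agent configuration documentation for the options your agent version supports.

```yaml
# ops_agent.yaml (illustrative; names below are assumptions, not role defaults)
logging:
  receivers:
    syslog:
      type: files
      include_paths:
        - /var/log/messages
        - /var/log/syslog
  service:
    pipelines:
      default_pipeline:
        receivers: [syslog]
metrics:
  receivers:
    hostmetrics:
      type: hostmetrics
      collection_interval: 60s
  service:
    pipelines:
      default_pipeline:
        receivers: [hostmetrics]
```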

By default, the agent only monitors and logs system resources such as CPU, memory, and disk. Third-party application monitoring and logging can be configured by setting the variable additional_config_dir to the path of a directory containing plugin configuration files. All .conf files under this directory will be deployed to the agent's plugin directory on the target VM. The main config file should have a line that includes this directory. Note that this variable can only be specified when configuring the monitoring or logging agents.

For more information, please see Monitoring third-party applications.

Example Playbooks

# Installing the Monitoring and Logging agents
- hosts: all
  become: true
  roles:
    - role: googlecloudplatform.google_cloud_ops_agents
      vars:
        agent_type: monitoring

    - role: googlecloudplatform.google_cloud_ops_agents
      vars:
        agent_type: logging
# Installing the Monitoring and Logging agents with custom configurations
- hosts: all
  become: true
  roles:
    - role: googlecloudplatform.google_cloud_ops_agents
      vars:
        agent_type: monitoring
        version: latest
        main_config_file: monitoring_agent.conf
        additional_config_dir: monitoring_agent_dir/

    - role: googlecloudplatform.google_cloud_ops_agents
      vars:
        agent_type: logging
        version: 1.*.*
        main_config_file: logging_agent.conf
        additional_config_dir: logging_agent_dir/
# Installing the Ops-Agent
- hosts: all
  become: true
  roles:
    - role: googlecloudplatform.google_cloud_ops_agents
      vars:
        agent_type: ops-agent
# Installing the Ops-Agent with custom configuration
- hosts: all
  become: true
  roles:
    - role: googlecloudplatform.google_cloud_ops_agents
      vars:
        agent_type: ops-agent
        version: 1.0.1
        main_config_file: ops_agent.yaml

Compatibility

The matrix below lists the versions of this Ansible role and the agent versions it supports.

Ansible Role Version | Compatible Ops Agent Version(s) | Compatible Logging Agent Version(s) | Compatible Monitoring Agent Version(s)
1.x.x                | 2.x.x                           | 1.x.x                               | 6.x.x

Bug report and feature request

Please file a case via https://cloud.google.com/support-hub to get official support that follows SLOs.

License

Copyright 2020 Google Inc. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use
this file except in compliance with the License.  You may obtain a copy of the
License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed
under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
CONDITIONS OF ANY KIND, either express or implied.  See the License for the
specific language governing permissions and limitations under the License.

google-cloud-ops-agents-ansible's People

Contributors

bdesbiolles, cyclenerd, ehmo, igorpeshansky, ittkm, martijnvans, qingling128, ridwanmsharif, rimey, rmoriar1, shuuji3


google-cloud-ops-agents-ansible's Issues

Add contributors.md file

We need to add contributors.md so that we can maintain the list of contributors for this repo.

I have the list ready; let me know if I can work on it.

Please delete branch "1.0.6" (HTTP Error 300: Multiple Choices)

Description

I wanted to use 1.0.6, but I get an HTTP error:
[ERROR]: failed to download the file: HTTP Error 300: Multiple Choices

Reproduction

  1. Have these requirements in requirements.yml:
roles:
 - name: googlecloudplatform.google_cloud_ops_agents
   version: 1.0.6
  2. Run the command ansible-galaxy install -r requirements.yml --force
  3. Get the following error:
Starting galaxy role install process
- downloading role 'google_cloud_ops_agents', owned by googlecloudplatform
- downloading role from https://github.com/GoogleCloudPlatform/stackdriver-ansible-role/archive/1.0.6.tar.gz
 [ERROR]: failed to download the file: HTTP Error 300: Multiple Choices
[WARNING]: - googlecloudplatform.google_cloud_ops_agents was NOT installed successfully.
ERROR! - you can use --ignore-errors to skip failed roles and finish processing the list.
  4. If you use curl, you see that there are two candidates for the reference 1.0.6 at this URL:
~$ curl "https://github.com/GoogleCloudPlatform/google-cloud-ops-agents-ansible/archive/1.0.6.tar.gz"
the given path has multiple possibilities: #<Git::Ref:0x00007f805cda8a98>, #<Git::Ref:0x00007f805cda85c0>
~$ 

Resolution

I guess this is caused by the existence of a git tag 1.0.6 and a git branch 1.0.6. As the branch won't be needed in my opinion, could you please delete it?

Workaround

  • Use ansible-galaxy install googlecloudplatform.google_cloud_ops_agents as it will download master.
  • Use version: master in requirements.yml
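
The second workaround amounts to a requirements.yml like the following sketch, which mirrors the file from the reproduction steps with only the version changed.

```yaml
# requirements.yml: track the master branch instead of the ambiguous 1.0.6 ref
roles:
  - name: googlecloudplatform.google_cloud_ops_agents
    version: master
```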

Task Add repo and install agent or remove repo and uninstall agent failed

Sample playbook running on RHEL 8.5:
roles:
  - role: googlecloudplatform.google_cloud_ops_agents
    vars:
      agent_type: ops-agent
      version: latest

Error:
FAILED - RETRYING: [ansible-test-vm1]: Add repo and install agent or remove repo and uninstall agent (5 retries left).
FAILED - RETRYING: [ansible-test-vm1]: Add repo and install agent or remove repo and uninstall agent (4 retries left).
FAILED - RETRYING: [ansible-test-vm1]: Add repo and install agent or remove repo and uninstall agent (3 retries left).
FAILED - RETRYING: [ansible-test-vm1]: Add repo and install agent or remove repo and uninstall agent (2 retries left).
FAILED - RETRYING: [ansible-test-vm1]: Add repo and install agent or remove repo and uninstall agent (1 retries left).
fatal: [ansible-test-vm1]: FAILED! => {"attempts": 5, "changed": true, "cmd": ["bash", "add-google-cloud-ops-agent-repo.sh", "--also-install", "--version=latest"], "delta": "0:00:03.308186", "end": "2023-08-15 01:19:02.902590", "msg": "non-zero return code", "rc": 1, "start": "2023-08-15 01:18:59.594404", "stderr": "Repository google-cloud-ops-agent is listed more than once in the configuration\nErrors during downloading metadata for repository 'google-cloud-ops-agent':\n - Status code: 404 for https://packages.cloud.google.com/yum/repos/google-cloud-ops-agent-Ootpa-x86_64-all/repodata/repomd.xml (IP: 172.217.13.110)\nError: Failed to download metadata for repo 'google-cloud-ops-agent': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried\nAttempt 1 of 3 failed: yum -y list updates\nRepository google-cloud-ops-agent is listed more than once in the configuration\nErrors during downloading metadata for repository 'google-cloud-ops-agent':\n - Status code: 404 for https://packages.cloud.google.com/yum/repos/google-cloud-ops-agent-Ootpa-x86_64-all/repodata/repomd.xml (IP: 172.217.13.110)\nError: Failed to download metadata for repo 'google-cloud-ops-agent': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried\nAttempt 2 of 3 failed: yum -y list updates\nRepository google-cloud-ops-agent is listed more than once in the configuration\nErrors during downloading metadata for repository 'google-cloud-ops-agent':\n - Status code: 404 for https://packages.cloud.google.com/yum/repos/google-cloud-ops-agent-Ootpa-x86_64-all/repodata/repomd.xml (IP: 172.217.13.110)\nError: Failed to download metadata for repo 'google-cloud-ops-agent': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried\nAttempt 3 of 3 failed: yum -y list updates\nCommand: yum -y list updates failed\n[2023-08-15T01:19:02+0000] Could not refresh the google-cloud-ops-agent yum repositories.\nPlease check your 
network connectivity and make sure you are running a supported\nrhel distribution. See https://cloud.google.com/stackdriver/docs/solutions/ops-agent/#supported_operating_systems\nfor a list of supported platforms.", "stderr_lines": ["Repository google-cloud-ops-agent is listed more than once in the configuration", "Errors during downloading metadata for repository 'google-cloud-ops-agent':", " - Status code: 404 for https://packages.cloud.google.com/yum/repos/google-cloud-ops-agent-Ootpa-x86_64-all/repodata/repomd.xml (IP: 172.217.13.110)", "Error: Failed to download metadata for repo 'google-cloud-ops-agent': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried", "Attempt 1 of 3 failed: yum -y list updates", "Repository google-cloud-ops-agent is listed more than once in the configuration", "Errors during downloading metadata for repository 'google-cloud-ops-agent':", " - Status code: 404 for https://packages.cloud.google.com/yum/repos/google-cloud-ops-agent-Ootpa-x86_64-all/repodata/repomd.xml (IP: 172.217.13.110)", "Error: Failed to download metadata for repo 'google-cloud-ops-agent': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried", "Attempt 2 of 3 failed: yum -y list updates", "Repository google-cloud-ops-agent is listed more than once in the configuration", "Errors during downloading metadata for repository 'google-cloud-ops-agent':", " - Status code: 404 for https://packages.cloud.google.com/yum/repos/google-cloud-ops-agent-Ootpa-x86_64-all/repodata/repomd.xml (IP: 172.217.13.110)", "Error: Failed to download metadata for repo 'google-cloud-ops-agent': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried", "Attempt 3 of 3 failed: yum -y list updates", "Command: yum -y list updates failed", "[2023-08-15T01:19:02+0000] Could not refresh the google-cloud-ops-agent yum repositories.", "Please check your network connectivity and make sure you are running a 
supported", "rhel distribution. See https://cloud.google.com/stackdriver/docs/solutions/ops-agent/#supported_operating_systems", "for a list of supported platforms."], "stdout": "Google Cloud Ops Agent Repository 6.3 kB/s | 1.4 kB 00:00 \nGoogle Cloud Ops Agent Repository 6.3 kB/s | 1.4 kB 00:00 \nGoogle Cloud Ops Agent Repository 6.3 kB/s | 1.4 kB 00:00 ", "stdout_lines": ["Google Cloud Ops Agent Repository 6.3 kB/s | 1.4 kB 00:00 ", "Google Cloud Ops Agent Repository 6.3 kB/s | 1.4 kB 00:00 ", "Google Cloud Ops Agent Repository 6.3 kB/s | 1.4 kB 00:00 "]}

google-cloud-monitoring-bullseye-all does not have a Release file

The google-cloud-monitoring-bullseye-all repository does not have a Release file under https://packages.cloud.google.com/apt/dists/

TASK [google-cloud-ops-agents-ansible : Add repo and install agent or remove repo and uninstall agent] ***************************************************************************************************************************************
task path: /Users/DavidMorp/Repos/infrastructure/galaxy/google-cloud-ops-agents-ansible/tasks/linux.yml:19
Wednesday 15 September 2021  15:02:03 +0200 (0:00:00.255)       0:00:20.107 ***
FAILED - RETRYING: Add repo and install agent or remove repo and uninstall agent (5 retries left).
FAILED - RETRYING: Add repo and install agent or remove repo and uninstall agent (4 retries left).
FAILED - RETRYING: Add repo and install agent or remove repo and uninstall agent (3 retries left).
FAILED - RETRYING: Add repo and install agent or remove repo and uninstall agent (2 retries left).
FAILED - RETRYING: Add repo and install agent or remove repo and uninstall agent (1 retries left).
fatal: [hostname]: FAILED! => changed=true
  attempts: 5
  cmd:
  - bash
  - add-monitoring-agent-repo.sh
  - --also-install
  - --version=latest
  delta: '0:00:01.047746'
  end: '2021-09-15 15:03:00.063662'
  msg: non-zero return code
  rc: 1
  start: '2021-09-15 15:02:59.015916'
  stderr: |-
    E: The repository 'https://packages.cloud.google.com/apt google-cloud-monitoring-bullseye-all Release' does not have a Release file.
    [2021-09-15T15:03:00+0200] Could not refresh the google-cloud-monitoring apt repositories.
    Please check your network connectivity and make sure you are running a supported
    debian distribution. See https://cloud.google.com/monitoring/agent/#supported_operating_systems
    for a list of supported platforms.
  stderr_lines: <omitted>
  stdout: |-
    Hit:1 http://security.debian.org/debian-security bullseye-security InRelease
    Hit:2 http://deb.debian.org/debian bullseye InRelease
    Hit:3 http://deb.debian.org/debian bullseye-updates InRelease
    Hit:4 http://deb.debian.org/debian bullseye-backports InRelease
    Hit:5 http://packages.cloud.google.com/apt cloud-sdk-bullseye InRelease
    Hit:6 http://packages.cloud.google.com/apt google-cloud-packages-archive-keyring-bullseye InRelease
    Hit:7 http://packages.cloud.google.com/apt google-compute-engine-bullseye-stable InRelease
    Ign:8 https://packages.cloud.google.com/apt google-cloud-monitoring-bullseye-all InRelease
    Err:9 https://packages.cloud.google.com/apt google-cloud-monitoring-bullseye-all Release
      404  Not Found [IP: 142.250.185.174 443]
    Reading package lists...
  stdout_lines: <omitted>

Mac OSX lacks realpath

The gcloud scripts depend on realpath to locate the function library. Without it, you get:

+++ dirname /path-to-my-tfdir/modules/agent-policy/scripts/create-update-script.sh
module.admin_agent_policy.module.gcloud-upsert.null_resource.run_command[0] (local-exec): ++ realpath /path-to-my-tfdir/modules/agent-policy/scripts
module.admin_agent_policy.module.gcloud-upsert.null_resource.run_command[0] (local-exec): /path-to-my-tfdir/modules/agent-policy/scripts/create-update-script.sh: line 38: realpath: command not found
module.admin_agent_policy.module.gcloud-upsert.null_resource.run_command[0] (local-exec): + SCRIPT_DIR=
module.admin_agent_policy.module.gcloud-upsert.null_resource.run_command[0] (local-exec): + UTILS_ABS_PATH=/script-utils.sh
module.admin_agent_policy.module.gcloud-upsert.null_resource.run_command[0] (local-exec): + source /script-utils.sh
module.admin_agent_policy.module.gcloud-upsert.null_resource.run_command[0] (local-exec): /path-to-my-tfdir/modules/agent-policy/scripts/create-update-script.sh: line 41: /script-utils.sh: No such file or directory

Interestingly, the scripts don't propagate that error up to Terraform, which thinks terraform apply worked just fine but leaves an empty state file and no actual policies.

A workaround is to use Homebrew to install coreutils, which provides realpath, but the scripts really should either not depend on it or detect that it is missing and fail.

Restart of Ops Agent fails

Hi everyone,

executing the role, I experienced the following error:

  • Error: When using the Ansible role to set up the Ops Agent, the restart of the service fails because the service with the name "google-cloud-ops-agent.target" cannot be found.

  • Experienced on System: Compute Engine instance running Ubuntu 20.04 LTS (Minimal); Installation of latest version of Ops Agent.

  • Solution: Change the value of the variable "ops-agent_service_name" in "vars/main.yml" to "google-cloud-ops-agent".
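
The proposed change would look roughly like this in the role's variables file. This is a sketch of the reporter's suggestion; every other key in vars/main.yml is omitted and should stay as it is.

```yaml
# vars/main.yml (fragment; other keys omitted)
# Reporter's suggested value, replacing "google-cloud-ops-agent.target".
ops-agent_service_name: google-cloud-ops-agent
```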

Best regards,
Flo

Use template module

Use the Ansible template module to place the file at the desired destination.

For example, in google-cloud-ops-agents-ansible/tasks/linux.yml, on line 35, we have:

- when: package_state == 'present'
  block:
    - name: Copy main config file onto the remote machine
      copy:
        src: "{{ main_config_file }}"
        dest: "{{ vars[agent_type + '_config_path'] }}"
        force: true
        mode: 0644

The copy module only copies the file exactly as it exists on the host machine.

The template module takes the file from the host, substitutes any variables set by the end user, and then copies the result to the remote location, provided the file is present in the templates directory beside the tasks directory.

Solution:

- when: package_state == 'present'
  block:
    - name: Copy main config file onto the remote machine
      template:
        src: "{{ main_config_file }}"
        dest: "{{ vars[agent_type + '_config_path'] }}"
        force: true
        mode: 0644

Ubuntu14 Install requires sudo to be explicitly specified

When I tried to install to my Ubuntu14 EC2 instance, I was getting:

TASK: [rimey.stackdriver | [Debian] Enable the Stackdriver apt repository.] ***
failed: [52.1.6.25] => {"failed": true}
msg: [Errno 13] Permission denied: '/etc/apt/sources.list.d/.repo_stackdriver_com_apt.list-KMRJAL'

FATAL: all hosts have already failed -- aborting

Adding 'sudo: true' when declaring the role fixed the issue. May want to consider updating the documentation to reflect this.
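
In current Ansible releases the sudo keyword has been replaced by become, so the equivalent declaration today would be along these lines. This is a sketch; the role name is the one used in this report.

```yaml
- hosts: all
  roles:
    - role: rimey.stackdriver
      # Modern replacement for the older "sudo: true" keyword.
      become: true
```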

Alma Linux Not Supported

Alma Linux, as provided on GCP, is not supported by the playbook:

TASK [googlecloudplatform.google_cloud_ops_agents : Validate Operating System] ****************************************************************************************************************
fatal: [127.0.0.1]: FAILED! => {
"assertion": "ansible_os_family == 'Windows' or ansible_distribution in ['Debian', 'Ubuntu', 'RedHat', 'CentOS', 'Amazon', 'SLES', 'openSUSE', 'SuSE', 'SLES_SAP', 'Windows']\n",
"changed": false,
"evaluated_to": false,
"msg": "Received invalid Operating System: 'AlmaLinux'. The Cloud Ops Ansible role supports the following OSs: 'Debian', 'Ubuntu', 'RedHat', 'CentOS', 'Amazon', 'SLES', 'openSUSE', 'SuSE', 'SLES_SAP' and 'Windows'.\n"
}

PLAY RECAP ************************************************************************************************************************************************************************************
127.0.0.1 : ok=1 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
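
One conceivable local patch is to extend the list of accepted distributions, as sketched below. The task layout and the supported_distributions variable are assumptions rather than the role's actual code, and the agent's repository script would also need to recognize the distribution for the install to succeed.

```yaml
# Hypothetical variant of the role's OS validation task.
- name: Validate Operating System
  ansible.builtin.assert:
    that:
      - >-
        ansible_os_family == 'Windows' or
        ansible_distribution in supported_distributions
    fail_msg: "Received invalid Operating System: '{{ ansible_distribution }}'."
  vars:
    supported_distributions:
      - Debian
      - Ubuntu
      - RedHat
      - CentOS
      - Amazon
      - SLES
      - openSUSE
      - SuSE
      - SLES_SAP
      - AlmaLinux  # added for illustration
```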

Stackdriver agent deployment on Debian is failing.

Hi there,
Until last week or so, the Stackdriver installation was working fine without any issues, but today I am getting an error saying the package stackdriver-agent cannot be found.
Has it been moved to a different location, or has the package name changed? Can you please help here?
I did a fresh clone of this repo and tried again, but I hit the same issue.


Logging and Monitoring not available for Ubuntu 22.04 LTS Jammy

On a ubuntu:22.04 Docker image and on a GCE instance running 22.04, installing the logging agent with this role fails with the following output:

TASK [google-cloud-ops-agents-ansible : Add repo and install agent or remove repo and uninstall agent] ***
FAILED - RETRYING: Add repo and install agent or remove repo and uninstall agent (5 retries left).
FAILED - RETRYING: Add repo and install agent or remove repo and uninstall agent (4 retries left).
FAILED - RETRYING: Add repo and install agent or remove repo and uninstall agent (3 retries left).
FAILED - RETRYING: Add repo and install agent or remove repo and uninstall agent (2 retries left).
FAILED - RETRYING: Add repo and install agent or remove repo and uninstall agent (1 retries left).
fatal: [localhost]: FAILED! => {"attempts": 5, "changed": true, "cmd": ["bash", "add-logging-agent-repo.sh", "--also-install", "--version=latest"], "delta": "0:00:01.648203", "end": "2022-06-30 18:49:05.379330", "msg": "non-zero return code", "rc": 1, "start": "2022-06-30 18:49:03.731127", "stderr": "E: The repository 'https://packages.cloud.google.com/apt google-cloud-logging-jammy-all Release' does not have a Release file.\n[2022-06-30T18:49:05+0000] Could not refresh the google-cloud-logging apt repositories.\nPlease check your network connectivity and make sure you are running a supported\nubuntu distribution. See https://cloud.google.com/logging/docs/agent/#agent-os-list\nfor a list of supported platforms.", "stderr_lines": ["E: The repository 'https://packages.cloud.google.com/apt google-cloud-logging-jammy-all Release' does not have a Release file.", "[2022-06-30T18:49:05+0000] Could not refresh the google-cloud-logging apt repositories.", "Please check your network connectivity and make sure you are running a supported", "ubuntu distribution. 
See https://cloud.google.com/logging/docs/agent/#agent-os-list", "for a list of supported platforms."], "stdout": "Hit:1 http://packages.cloud.google.com/apt cloud-sdk InRelease\nIgn:2 https://packages.cloud.google.com/apt google-cloud-logging-jammy-all InRelease\nErr:3 https://packages.cloud.google.com/apt google-cloud-logging-jammy-all Release\n  404  Not Found [IP: 142.251.32.46 443]\nHit:4 http://security.ubuntu.com/ubuntu jammy-security InRelease\nHit:5 http://archive.ubuntu.com/ubuntu jammy InRelease\nHit:6 http://archive.ubuntu.com/ubuntu jammy-updates InRelease\nHit:7 http://archive.ubuntu.com/ubuntu jammy-backports InRelease\nReading package lists...", "stdout_lines": ["Hit:1 http://packages.cloud.google.com/apt cloud-sdk InRelease", "Ign:2 https://packages.cloud.google.com/apt google-cloud-logging-jammy-all InRelease", "Err:3 https://packages.cloud.google.com/apt google-cloud-logging-jammy-all Release", "  404  Not Found [IP: 142.251.32.46 443]", "Hit:4 http://security.ubuntu.com/ubuntu jammy-security InRelease", "Hit:5 http://archive.ubuntu.com/ubuntu jammy InRelease", "Hit:6 http://archive.ubuntu.com/ubuntu jammy-updates InRelease", "Hit:7 http://archive.ubuntu.com/ubuntu jammy-backports InRelease", "Reading package lists..."]}

In particular: The repository 'https://packages.cloud.google.com/apt google-cloud-logging-jammy-all Release' does not have a Release file.

If we check the listing of the packages, we can see that google-cloud-network-logging-jammy and google-cloud-network-monitoring-jammy are missing, while google-cloud-ops-agent-jammy-all exists. I expect both of these packages to exist.

Package listing: https://packages.cloud.google.com/apt/dists

Role does not check whether instance has perms to access GCM

When the instance lacks permission to access GCM, everything installs perfectly, but the stackdriver-agent (namely collectd) does not come up and the role does not complain.

When you start the service manually on the machine, it complains that COLLECTD_ENDPOINT is not defined.
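
A check of this kind could be approximated with tasks run after the role, as in the sketch below. The systemd unit name stackdriver-agent.service is an assumption; on non-systemd hosts the key reported by service_facts may differ.

```yaml
- name: Collect the state of services on the target
  ansible.builtin.service_facts:

- name: Fail loudly if the monitoring agent is not running
  ansible.builtin.assert:
    that:
      # Assumed unit name; adjust to match the target's init system.
      - "'stackdriver-agent.service' in ansible_facts.services"
      - ansible_facts.services['stackdriver-agent.service'].state == 'running'
    fail_msg: >-
      stackdriver-agent is not running; check that the instance has
      permission to write to Cloud Monitoring.
```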

Support non cloud servers

Please support non-cloud servers. This is the hack I have used; there is probably a better way (identify when there is no cloud id found and only then use the gen...):
command: /opt/stackdriver/stack-config --api-key {{ stackdriver_api_key }} --genhostid

Template variable does not work

The play:

- name: Installing the Google Cloud Ops-Agent
  hosts: cache
  become: yes
  roles:
    - role: googlecloudplatform.google_cloud_ops_agents
      vars:
        agent_type: ops-agent
        version: latest
        main_config_file: templates/google-ops-agent-redis-conf.j2

# The template file:
metrics:
  receivers:
    redis:
      type: redis
      address: "{{ ansible_default_ipv4.address }}:{{ redis_nodes[0]['port'] }}"
  service:
    pipelines:
      redis:
        receivers:
          - redis

The role does not render the variables when it deploys the config file.
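
Until the role renders templates itself, one possible workaround is to render the file on the control machine first and hand the result to the role. The sketch below assumes the role copies main_config_file verbatim; the /tmp destination path is hypothetical.

```yaml
- name: Install the Ops Agent with a locally rendered config
  hosts: cache
  become: true
  pre_tasks:
    - name: Render the config template on the control machine, one file per host
      ansible.builtin.template:
        src: templates/google-ops-agent-redis-conf.j2
        dest: "/tmp/ops-agent-{{ inventory_hostname }}.yaml"  # hypothetical path
        mode: "0644"
      delegate_to: localhost
      become: false
  roles:
    - role: googlecloudplatform.google_cloud_ops_agents
      vars:
        agent_type: ops-agent
        version: latest
        main_config_file: "/tmp/ops-agent-{{ inventory_hostname }}.yaml"
```

Because pre_tasks run before roles, each host's rendered file exists by the time the role copies it.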

Playbook fails when run with Ansible 2.10

Ansible version: 2.10
OS: Ubuntu 20.04
Ansible role using Galaxy

Playbook:

---
- hosts: all
  become: true
  roles:
    - role: googlecloudplatform.google_cloud_ops_agents
      vars:
        agent_type: ops-agent

Error:

ERROR! couldn't resolve module/action 'win_shell'. This often indicates a misspelling, missing collection, or incorrect module path.
The error appears to be in '/root/.ansible/roles/googlecloudplatform.google_cloud_ops_agents/handlers/main.yml': line 8, column 3, but may
be elsewhere in the file depending on the exact syntax problem.
The offending line appears to be:
- name: "restart windows {{ agent_type }} agent"
  ^ here
We could be wrong, but this one looks like it might be an issue with
missing quotes. Always quote template expression brackets when they
start a value. For instance:
    with_items:
      - {{ foo }}
Should be written as:
    with_items:
      - "{{ foo }}"
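
A likely cause is that, from Ansible 2.10 onward, win_shell ships in the ansible.windows collection rather than in the core package, so a minimal ansible-base install cannot resolve it. Assuming that is the case here, declaring the collection as below and installing it with ansible-galaxy collection install -r requirements.yml should make the module resolvable.

```yaml
# requirements.yml (collections section)
collections:
  - name: ansible.windows
```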

Opentelemetry configuration

This is a feature request, more than anything.

The generated OpenTelemetry configuration uses port 8888. It would be nice to be able to configure that, since this port is widely used; it is, for instance, the default port of tinyproxy.

Support Ops Agent 2.x.x versions

Testing the role against Ops Agent 2.0.0 fails with:

fatal: [10.33.104.160]: FAILED! => {"changed": false, "msg": "Could not find the requested service google-cloud-ops-agent.target: host"}

When using an older version, 1.0.5, the role completes without error.

Stackdriver deployment fails at configure.yml on debian on GCP

I have cloned this repo to my roles folder:

git clone https://github.com/GoogleCloudPlatform/stackdriver-ansible-role.git roles/stackdriver

and I have started a new playbook gce_cloud_monitor.yml:


---
- hosts: logging
  vars:
    stackdriver_api_key: "BLEEPBLOORP"
  roles:
  - ../roles/stackdriver

and then when I run the playbook I get a failure at the stackdriver configure.yml:

TASK [../roles/stackdriver : Determine the Stackdriver host ID for reporting.] *
fatal: [myhost.example.com]: FAILED! => {"changed": false, "failed": true, "msg": "", "parsed": false}

"Include OS-specific variables." step in tasks/main.yml failing

The "Include OS-specific variables." step in tasks/main.yml fails with the error "No source file given".

I'm using Ansible 1.9.3. I'm installing the role globally using ansible-galaxy install rimey.stackdriver, which puts it under /etc/ansible/roles.

Is it something that I'm doing wrong?

Relevant syslog entries:

Oct  6 12:59:23 play-4 startupscript: TASK: [rimey.stackdriver | Check that the Stackdriver API key has been specified.] *** 
Oct  6 12:59:23 play-4 startupscript: skipping: [127.0.0.1]
Oct  6 12:59:23 play-4 startupscript: 
Oct  6 12:59:23 play-4 startupscript: TASK: [rimey.stackdriver | Include OS-specific variables.] ******************** 
Oct  6 12:59:23 play-4 startupscript: failed: [127.0.0.1] => {"failed": true, "item": null}
Oct  6 12:59:23 play-4 startupscript: msg: No source file given
Oct  6 12:59:23 play-4 startupscript: 
Oct  6 12:59:23 play-4 startupscript: FATAL: all hosts have already failed -- aborting

The repository 'https://packages.cloud.google.com/apt google-cloud-monitoring-bullseye-all Release' does not have a Release file.

The playbook is failing at this step: bash add-google-cloud-ops-agent-repo.sh --also-install --version=latest, for every agent type (ops-agent, monitoring, logging).

Error:

        "E: The repository 'https://packages.cloud.google.com/apt google-cloud-monitoring-bullseye-all Release' does not have a Release file.",
        "Could not refresh the google-cloud-ops-agent apt repositories.",
        "Please check your network connectivity and make sure you are running a supported",
        "debian distribution. See https://cloud.google.com/stackdriver/docs/solutions/ops-agent/#supported_operating_systems",
        "for a list of supported platforms."

OS info: Debian GNU/Linux 11 (bullseye), which is supported per https://cloud.google.com/stackdriver/docs/solutions/agents/ops-agent#supported_operating_systems.

msg: no service or tool found for: stackdriver-extractor

I have this error under docker / ubuntu 14.04. Any ideas why it happens?

TASK: [../../ansible-role-stackdriver | Check that the Stackdriver API key has been specified.] ***
skipping: [tag_Role_ansible-webservers]

TASK: [../../ansible-role-stackdriver | Include OS-specific variables.] *******
ok: [tag_Role_ansible-webservers] => (item=/ansible/ansible-role-stackdriver/vars/Debian.yml) => {"ansible_facts": {"stackdriver_package_repo": "http://repo.stackdriver.com/apt", "stackdriver_sysconfig": "/etc/default/stackdriver-agent"}, "item": "/ansible/ansible-role-stackdriver/vars/Debian.yml"}

TASK: [../../ansible-role-stackdriver | [Debian] Enable the Stackdriver apt repository.] ***
<tag_Role_ansible-webservers> REMOTE_MODULE apt_repository state=present repo='deb http://repo.stackdriver.com/apt trusty main'
<tag_Role_ansible-webservers> EXEC ['/bin/sh', '-c', 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1453468316.92-187709019389372 && echo $HOME/.ansible/tmp/ansible-tmp-1453468316.92-187709019389372']
<tag_Role_ansible-webservers> PUT /tmp/tmpwv3WgG TO /root/.ansible/tmp/ansible-tmp-1453468316.92-187709019389372/apt_repository
<tag_Role_ansible-webservers> EXEC /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=ocowpfypefxyxczpxcblgxbqfitguxhv] password: " -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-ocowpfypefxyxczpxcblgxbqfitguxhv; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1453468316.92-187709019389372/apt_repository; rm -rf /root/.ansible/tmp/ansible-tmp-1453468316.92-187709019389372/ >/dev/null 2>&1'"'"''
changed: [tag_Role_ansible-webservers] => {"changed": true, "repo": "deb http://repo.stackdriver.com/apt trusty main", "state": "present"}

TASK: [../../ansible-role-stackdriver | [Debian] Ensure Stackdriver's GPG key is available.] ***
<tag_Role_ansible-webservers> REMOTE_MODULE apt_key url="https://app.stackdriver.com/RPM-GPG-KEY-stackdriver" state=present
<tag_Role_ansible-webservers> EXEC ['/bin/sh', '-c', 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1453468325.01-20842688512790 && echo $HOME/.ansible/tmp/ansible-tmp-1453468325.01-20842688512790']
<tag_Role_ansible-webservers> PUT /tmp/tmppit2XM TO /root/.ansible/tmp/ansible-tmp-1453468325.01-20842688512790/apt_key
<tag_Role_ansible-webservers> EXEC /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=gbtedbuqzfdbmcibbbvtofghdzrzumgq] password: " -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-gbtedbuqzfdbmcibbbvtofghdzrzumgq; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1453468325.01-20842688512790/apt_key; rm -rf /root/.ansible/tmp/ansible-tmp-1453468325.01-20842688512790/ >/dev/null 2>&1'"'"''
changed: [tag_Role_ansible-webservers] => {"changed": true}

TASK: [../../ansible-role-stackdriver | [Debian] Install the Stackdriver agent.] ***
<tag_Role_ansible-webservers> REMOTE_MODULE apt name=stackdriver-agent state=present update_cache=yes
<tag_Role_ansible-webservers> EXEC ['/bin/sh', '-c', 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1453468329.85-25458835102960 && echo $HOME/.ansible/tmp/ansible-tmp-1453468329.85-25458835102960']
<tag_Role_ansible-webservers> PUT /tmp/tmplVcKah TO /root/.ansible/tmp/ansible-tmp-1453468329.85-25458835102960/apt
<tag_Role_ansible-webservers> EXEC /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=hguhzfkbhyyezgjrbodrudkzuybsmehy] password: " -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-hguhzfkbhyyezgjrbodrudkzuybsmehy; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1453468329.85-25458835102960/apt; rm -rf /root/.ansible/tmp/ansible-tmp-1453468329.85-25458835102960/ >/dev/null 2>&1'"'"''
changed: [tag_Role_ansible-webservers] => {"changed": true, "stderr": "invoke-rc.d: policy-rc.d denied execution of start.\n", "stdout": "Reading package lists...\nBuilding dependency tree...\nReading state information...\nThe following packages were automatically installed and are no longer required:\n  g++ g++-4.8 libstdc++-4.8-dev\nUse 'apt-get autoremove' to remove them.\nSuggested packages:\n  libmysqlclient18 libhiredis0.10 default-jre\nRecommended packages:\n  stackdriver-extractor libyajl2\nThe following NEW packages will be installed:\n  stackdriver-agent\n0 upgraded, 1 newly installed, 0 to remove and 17 not upgraded.\nNeed to get 951 kB of archives.\nAfter this operation, 4118 kB of additional disk space will be used.\nGet:1 http://repo.stackdriver.com/apt/ trusty/main stackdriver-agent amd64 5.5.0-257.trusty [951 kB]\nPreconfiguring packages ...\nFetched 951 kB in 1s (772 kB/s)\nSelecting previously unselected package stackdriver-agent.\n(Reading database ... 23649 files and directories currently installed.)\nPreparing to unpack .../stackdriver-agent_5.5.0-257.trusty_amd64.deb ...\nUnpacking stackdriver-agent (5.5.0-257.trusty) ...\nProcessing triggers for ureadahead (0.100.0-16) ...\nSetting up stackdriver-agent (5.5.0-257.trusty) ...\nProcessing triggers for ureadahead (0.100.0-16) ...\nProcessing triggers for libc-bin (2.19-0ubuntu6.6) ...\n"}

TASK: [../../ansible-role-stackdriver | [Debian] Install YAJL if needed for curl_json.] ***
skipping: [tag_Role_ansible-webservers]

TASK: [../../ansible-role-stackdriver | [Debian] Install hiredis library if needed for redis plugin.] ***
skipping: [tag_Role_ansible-webservers]

TASK: [../../ansible-role-stackdriver | [RedHat] Install the Python bindings for SELinux, for Ansible.] ***
skipping: [tag_Role_ansible-webservers]

TASK: [../../ansible-role-stackdriver | [RedHat] Add the Stackdriver yum repository.] ***
skipping: [tag_Role_ansible-webservers]

TASK: [../../ansible-role-stackdriver | [RedHat] Install the Stackdriver agent.] ***
skipping: [tag_Role_ansible-webservers]

TASK: [../../ansible-role-stackdriver | [RedHat] Install YAJL if needed for curl_json.] ***
skipping: [tag_Role_ansible-webservers]

TASK: [../../ansible-role-stackdriver | [RedHat] Install hiredis library if needed for redis plugin.] ***
skipping: [tag_Role_ansible-webservers]

TASK: [../../ansible-role-stackdriver | [Windows] Check if the agent is already installed.] ***
skipping: [tag_Role_ansible-webservers]

TASK: [../../ansible-role-stackdriver | [Windows] Download installer from Stackdriver.] ***
skipping: [tag_Role_ansible-webservers]

TASK: [../../ansible-role-stackdriver | [Windows] Install the Stackdriver agent.] ***
skipping: [tag_Role_ansible-webservers]

TASK: [../../ansible-role-stackdriver | Configure the agent with the Stackdriver API key.] ***
<tag_Role_ansible-webservers> EXEC ['/bin/sh', '-c', 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1453468474.73-84518553236288 && echo $HOME/.ansible/tmp/ansible-tmp-1453468474.73-84518553236288']
<tag_Role_ansible-webservers> EXEC ['/bin/sh', '-c', u'rc=flag; [ -r /etc/default/stackdriver-agent ] || rc=2; [ -f /etc/default/stackdriver-agent ] || rc=1; [ -d /etc/default/stackdriver-agent ] && rc=3; python -V 2>/dev/null || rc=4; [ x"$rc" != "xflag" ] && echo "${rc} "/etc/default/stackdriver-agent && exit 0; (python -c \'import hashlib; BLOCKSIZE = 65536; hasher = hashlib.sha1();\nafile = open("\'/etc/default/stackdriver-agent\'", "rb")\nbuf = afile.read(BLOCKSIZE)\nwhile len(buf) > 0:\n\thasher.update(buf)\n\tbuf = afile.read(BLOCKSIZE)\nafile.close()\nprint(hasher.hexdigest())\' 2>/dev/null) || (python -c \'import sha; BLOCKSIZE = 65536; hasher = sha.sha();\nafile = open("\'/etc/default/stackdriver-agent\'", "rb")\nbuf = afile.read(BLOCKSIZE)\nwhile len(buf) > 0:\n\thasher.update(buf)\n\tbuf = afile.read(BLOCKSIZE)\nafile.close()\nprint(hasher.hexdigest())\' 2>/dev/null) || (echo \'0 \'/etc/default/stackdriver-agent)']
<tag_Role_ansible-webservers> PUT /tmp/tmpi_FKU4 TO /root/.ansible/tmp/ansible-tmp-1453468474.73-84518553236288/source
<tag_Role_ansible-webservers> PUT /tmp/tmp2Q1LYI TO /root/.ansible/tmp/ansible-tmp-1453468474.73-84518553236288/copy
<tag_Role_ansible-webservers> EXEC /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=lqrbxuwlqlvsyyhrgnvzhezlqxiapdmv] password: " -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-lqrbxuwlqlvsyyhrgnvzhezlqxiapdmv; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1453468474.73-84518553236288/copy; rm -rf /root/.ansible/tmp/ansible-tmp-1453468474.73-84518553236288/ >/dev/null 2>&1'"'"''
changed: [tag_Role_ansible-webservers] => {"backup_file": "/etc/default/stackdriver-agent.2016-01-22@14:14:34~", "changed": true, "checksum": "b0bf94dc4fbe8c764a02074b7ab4c4323b1e5d60", "dest": "/etc/default/stackdriver-agent", "gid": 0, "group": "root", "md5sum": "97ccb607224712a1f6da9e6bd6ceaa99", "mode": "0644", "owner": "root", "size": 238, "src": "/root/.ansible/tmp/ansible-tmp-1453468474.73-84518553236288/source", "state": "file", "uid": 0}

TASK: [../../ansible-role-stackdriver | Ensure the collectd configuration directory exists.] ***
<tag_Role_ansible-webservers> REMOTE_MODULE file path=/opt/stackdriver/collectd/etc/collectd.d/managed state=directory
<tag_Role_ansible-webservers> EXEC ['/bin/sh', '-c', 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1453468474.84-74471559010160 && echo $HOME/.ansible/tmp/ansible-tmp-1453468474.84-74471559010160']
<tag_Role_ansible-webservers> PUT /tmp/tmpvbvWV0 TO /root/.ansible/tmp/ansible-tmp-1453468474.84-74471559010160/file
<tag_Role_ansible-webservers> EXEC /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=fkaydecfiapcndlewceydagyzacqpavk] password: " -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-fkaydecfiapcndlewceydagyzacqpavk; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1453468474.84-74471559010160/file; rm -rf /root/.ansible/tmp/ansible-tmp-1453468474.84-74471559010160/ >/dev/null 2>&1'"'"''
changed: [tag_Role_ansible-webservers] => {"changed": true, "gid": 0, "group": "root", "mode": "0755", "owner": "root", "path": "/opt/stackdriver/collectd/etc/collectd.d/managed", "size": 4096, "state": "directory", "uid": 0}

TASK: [../../ansible-role-stackdriver | Render the configuration directory from templates.] ***
skipping: [tag_Role_ansible-webservers]
skipping: [tag_Role_ansible-webservers]
skipping: [tag_Role_ansible-webservers]
skipping: [tag_Role_ansible-webservers]
skipping: [tag_Role_ansible-webservers]
skipping: [tag_Role_ansible-webservers]
skipping: [tag_Role_ansible-webservers]
skipping: [tag_Role_ansible-webservers]
skipping: [tag_Role_ansible-webservers]
skipping: [tag_Role_ansible-webservers]

TASK: [../../ansible-role-stackdriver | List the contents of the configuration directory.] ***
<tag_Role_ansible-webservers> REMOTE_MODULE command ls -1 /opt/stackdriver/collectd/etc/collectd.d/managed
<tag_Role_ansible-webservers> EXEC ['/bin/sh', '-c', 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1453468474.97-97273845227727 && echo $HOME/.ansible/tmp/ansible-tmp-1453468474.97-97273845227727']
<tag_Role_ansible-webservers> PUT /tmp/tmpZYX_lu TO /root/.ansible/tmp/ansible-tmp-1453468474.97-97273845227727/command
<tag_Role_ansible-webservers> EXEC /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=mhajnamoupjpelmdzfacadxpnqnqhwrw] password: " -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-mhajnamoupjpelmdzfacadxpnqnqhwrw; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1453468474.97-97273845227727/command; rm -rf /root/.ansible/tmp/ansible-tmp-1453468474.97-97273845227727/ >/dev/null 2>&1'"'"''
ok: [tag_Role_ansible-webservers] => {"changed": false, "cmd": ["ls", "-1", "/opt/stackdriver/collectd/etc/collectd.d/managed"], "delta": "0:00:00.060044", "end": "2016-01-22 14:14:35.100026", "rc": 0, "start": "2016-01-22 14:14:35.039982", "stderr": "", "stdout": "", "stdout_lines": [], "warnings": []}

TASK: [../../ansible-role-stackdriver | Remove unmanaged files if requested.] ***
skipping: [tag_Role_ansible-webservers]

TASK: [../../ansible-role-stackdriver | Ensure collectd is running.] **********
<tag_Role_ansible-webservers> REMOTE_MODULE service name=stackdriver-agent state=started enabled=yes
<tag_Role_ansible-webservers> EXEC ['/bin/sh', '-c', 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1453468475.14-136384380416709 && echo $HOME/.ansible/tmp/ansible-tmp-1453468475.14-136384380416709']
<tag_Role_ansible-webservers> PUT /tmp/tmpJQaoZe TO /root/.ansible/tmp/ansible-tmp-1453468475.14-136384380416709/service
<tag_Role_ansible-webservers> EXEC /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=slhnyqwnvqobzhaeixmiupdwwvergmky] password: " -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-slhnyqwnvqobzhaeixmiupdwwvergmky; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1453468475.14-136384380416709/service; rm -rf /root/.ansible/tmp/ansible-tmp-1453468475.14-136384380416709/ >/dev/null 2>&1'"'"''
changed: [tag_Role_ansible-webservers] => {"changed": true, "enabled": true, "name": "stackdriver-agent", "state": "started"}

TASK: [../../ansible-role-stackdriver | Ensure extractor is running.] *********
<tag_Role_ansible-webservers> REMOTE_MODULE service name=stackdriver-extractor state=started enabled=yes
<tag_Role_ansible-webservers> EXEC ['/bin/sh', '-c', 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1453468857.43-180534333717703 && echo $HOME/.ansible/tmp/ansible-tmp-1453468857.43-180534333717703']
<tag_Role_ansible-webservers> PUT /tmp/tmpvnT4tI TO /root/.ansible/tmp/ansible-tmp-1453468857.43-180534333717703/service
<tag_Role_ansible-webservers> EXEC /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=qgawbuypnwmbquefwtlnpouvpwdccppy] password: " -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-qgawbuypnwmbquefwtlnpouvpwdccppy; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1453468857.43-180534333717703/service; rm -rf /root/.ansible/tmp/ansible-tmp-1453468857.43-180534333717703/ >/dev/null 2>&1'"'"''
failed: [tag_Role_ansible-webservers] => {"failed": true}
msg: no service or tool found for: stackdriver-extractor
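
In the `apt` output above, `stackdriver-extractor` is listed only under "Recommended packages" and not under the packages actually installed, which would explain why no such service is found. A minimal sketch of a task that installs it before the service is started, assuming the package is available from the same repository (this task is hypothetical and not part of the role):

```yaml
# Hypothetical task, not taken from the role: install the extractor package
# that apt reports as "Recommended" but does not install in this run.
- name: "[Debian] Install the Stackdriver extractor (assumed package name)."
  apt:
    name: stackdriver-extractor
    state: present
    update_cache: yes
```

The `policy-rc.d denied execution of start` message in the same output suggests the container also blocks service starts at install time, so this may not be sufficient on its own under Docker.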

Get "msg: Unable to send msg: HTTP Error 401: Unauthorized" when running the role for the first time

Hi,
I'm running the role as part of a playbook on a GCE instance.

  - { role: stackdriver-ansible-role, stackdriver_api_key: '{{api_key}}', stackdriver_mongodb_enabled: yes }

I already have MongoDB installed and running.
The first time I run the role, it fails with this error:

NOTIFIED: [stackdriver | restart collectd] ************************************
changed: [yg-test] => {"changed": true, "name": "stackdriver-agent", "state": "started"}

NOTIFIED: [stackdriver | restart extractor] ***********************************
changed: [yg-test] => {"changed": true, "name": "stackdriver-extractor", "state": "started"}

NOTIFIED: [stackdriver | stackdriver updated] *********************************
failed: [yg-test -> 127.0.0.1] => {"changed": false, "failed": true}
msg: Unable to send msg: HTTP Error 401: Unauthorized

FATAL: all hosts have already failed -- aborting

On the second run I don't see any errors, probably because there is no update to apply.

Thanks.
Keep up the good work!

ops-agent.service keeps getting masked which fails the restart and overall installation

RUNNING HANDLER [googlecloudplatform.google_cloud_ops_agents : restart ops-agent agent] *****************************************
fatal: []: FAILED! => {"changed": false, "msg": "Unable to start service google-cloud-ops-agent: Failed to start google-cloud-ops-agent.service: Unit google-cloud-ops-agent.service is masked.\n"}

Ansible version: 7.1.0
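
One possible workaround is to unmask the unit in a `pre_tasks` step before the role runs. The sketch below assumes nothing else on the host re-masks the unit afterwards; the task is hypothetical and not part of the role, and the unit name is taken from the error message above. How the module behaves when the unit does not exist yet has not been checked here.

```yaml
# Hypothetical pre-task, not part of the role: make sure the unit is not
# masked before the role's restart handler tries to start it.
- name: Ensure google-cloud-ops-agent is not masked
  ansible.builtin.systemd:
    name: google-cloud-ops-agent
    masked: false
  become: true
```

This only clears the symptom. If the unit keeps getting masked, whatever masks it (another role, a package script, or a hardening policy) still needs to be identified.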

Support both legacy and new stackdriver agents

I'm currently working out how to migrate from the legacy Stackdriver agent to the new agent built on Google's platform. I'm doing it manually for now, but it would be great if this role could be configured to run:

  • legacy agent
  • 'new'/'current' agent
  • both simultaneously

Can't install on Ubuntu 15.04

When installing on Ubuntu 15.04, Ansible was reporting the following error: No package matching 'stackdriver-agent' is available

I changed Line 17 of tasks/Debian.yml to read: repo: "deb {{ stackdriver_package_repo }} trusty main" which worked.

I believe that Stackdriver does not have an apt repo for Ubuntu 15.04.

Any ideas how to fix this?
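
A less invasive variant of the change described above would be to make the release codename overridable instead of hard-coding `trusty`. This is only a sketch: `stackdriver_apt_codename` is a hypothetical variable that the role does not define, and the task layout is assumed from the log output of the repository task rather than copied from `tasks/Debian.yml`.

```yaml
# Sketch of the repository task with an overridable codename.
# `stackdriver_apt_codename` is a hypothetical variable, not defined by the role;
# when unset, the host's own release codename is used.
- name: "[Debian] Enable the Stackdriver apt repository."
  apt_repository:
    repo: "deb {{ stackdriver_package_repo }} {{ stackdriver_apt_codename | default(ansible_distribution_release) }} main"
    state: present
```

Hosts on Ubuntu 15.04 could then set `stackdriver_apt_codename: trusty` in their host or group variables, while other releases keep their own codename.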

Installation is failing on SLES 15 SP4

Using this role to deploy the google-cloud-ops agent on sles15-sp4 fails with the error below.

However, executing the script file at /tmp//add-google-cloud-ops-agent-repo.sh works without any issue.

TASK [gcp-ops-agent-install : Add repo and install agent or remove repo and uninstall agent] ***
FAILED - RETRYING: Add repo and install agent or remove repo and uninstall agent (5 retries left).
FAILED - RETRYING: Add repo and install agent or remove repo and uninstall agent (4 retries left).
FAILED - RETRYING: Add repo and install agent or remove repo and uninstall agent (3 retries left).
FAILED - RETRYING: Add repo and install agent or remove repo and uninstall agent (2 retries left).
FAILED - RETRYING: Add repo and install agent or remove repo and uninstall agent (1 retries left).
fatal: [10.5.0.179]: FAILED! => {"attempts": 5, "changed": true, "cmd": ["bash", "add-google-cloud-ops-agent-repo.sh", "--also-install", "--version=latest"], "delta": "0:00:55.060665", "end": "2023-08-14 15:54:30.222948", "msg": "non-zero return code", "rc": 1, "start": "2023-08-14 15:53:35.162283", "stderr": "Repository 'Google Cloud Ops Agent Repository' is invalid.\n[google-cloud-ops-agent|https://packages.cloud.google.com/yum/repos/google-cloud-ops-agent-4-x86_64-all] Valid metadata not found at specified URL\nHistory:\n - [google-cloud-ops-agent|https://packages.cloud.google.com/yum/repos/google-cloud-ops-agent-4-x86_64-all] Repository type can't be determined.\n\nPlease check if the URIs defined for this repository are pointing to a valid repository.\nSkipping repository 'Google Cloud Ops Agent Repository' because of the above error.\nSome of the repositories have not been refreshed because of an error.\nCould not refresh zypper repositories.\nThis is not necessarily a fatal error; proceeding...\nRepository 'Google Cloud Ops Agent Repository' is invalid.\n[google-cloud-ops-agent|https://packages.cloud.google.com/yum/repos/google-cloud-ops-agent-4-x86_64-all] Valid metadata not found at specified URL\nHistory:\n - [google-cloud-ops-agent|https://packages.cloud.google.com/yum/repos/google-cloud-ops-agent-4-x86_64-all] Repository type can't be determined.\n\nPlease check if the URIs defined for this repository are pointing to a valid repository.\nSkipping repository 'Google Cloud Ops Agent Repository' because of the above error.\nCould not refresh the repositories because of errors.\n[2023-08-14T15:54:28+0000] Could not refresh the google-cloud-ops-agent zypper repositories.\nPlease check your network connectivity and make sure you are running a supported\nsles distribution. 
See https://cloud.google.com/stackdriver/docs/solutions/ops-agent/#supported_operating_systems\nfor a list of supported platforms.\nError building the cache:\n[google-cloud-ops-agent|https://packages.cloud.google.com/yum/repos/google-cloud-ops-agent-4-x86_64-all] Valid metadata not found at specified URL\nHistory:\n - [google-cloud-ops-agent|https://packages.cloud.google.com/yum/repos/google-cloud-ops-agent-4-x86_64-all] Repository type can't be determined.\n\nSome of the repositories have not been refreshed because of an error.\nNo provider of 'google-cloud-ops-agent' found.\n[2023-08-14T15:54:30+0000] google-cloud-ops-agent installation failed.", "stderr_lines": ["Repository 'Google Cloud Ops Agent Repository' is invalid.", "[google-cloud-ops-agent|https://packages.cloud.google.com/yum/repos/google-cloud-ops-agent-4-x86_64-all] Valid metadata not found at specified URL", "History:", " - [google-cloud-ops-agent|https://packages.cloud.google.com/yum/repos/google-cloud-ops-agent-4-x86_64-all] Repository type can't be determined.", "", "Please check if the URIs defined for this repository are pointing to a valid repository.", "Skipping repository 'Google Cloud Ops Agent Repository' because of the above error.", "Some of the repositories have not been refreshed because of an error.", "Could not refresh zypper repositories.", "This is not necessarily a fatal error; proceeding...", "Repository 'Google Cloud Ops Agent Repository' is invalid.", "[google-cloud-ops-agent|https://packages.cloud.google.com/yum/repos/google-cloud-ops-agent-4-x86_64-all] Valid metadata not found at specified URL", "History:", " - [google-cloud-ops-agent|https://packages.cloud.google.com/yum/repos/google-cloud-ops-agent-4-x86_64-all] Repository type can't be determined.", "", "Please check if the URIs defined for this repository are pointing to a valid repository.", "Skipping repository 'Google Cloud Ops Agent Repository' because of the above error.", "Could not refresh the repositories because 
of errors.", "[2023-08-14T15:54:28+0000] Could not refresh the google-cloud-ops-agent zypper repositories.", "Please check your network connectivity and make sure you are running a supported", "sles distribution. See https://cloud.google.com/stackdriver/docs/solutions/ops-agent/#supported_operating_systems", "for a list of supported platforms.", "Error building the cache:", "[google-cloud-ops-agent|https://packages.cloud.google.com/yum/repos/google-cloud-ops-agent-4-x86_64-all] Valid metadata not found at specified URL", "History:", " - [google-cloud-ops-agent|https://packages.cloud.google.com/yum/repos/google-cloud-ops-agent-4-x86_64-all] Repository type can't be determined.", "", "Some of the repositories have not been refreshed because of an error.", "No provider of 'google-cloud-ops-agent' found.", "[2023-08-14T15:54:30+0000] google-cloud-ops-agent installation failed."], "stdout": "Repository 'SLE-Module-Basesystem15-SP4-Pool' is up to date.\nRepository 'SLE-Module-Basesystem15-SP4-Updates' is up to date.\nRepository 'SLE-Module-Containers15-SP4-Pool' is up to date.\nRepository 'SLE-Module-Containers15-SP4-Updates' is up to date.\nRepository 'SLE-Module-Desktop-Applications15-SP4-Pool' is up to date.\nRepository 'SLE-Module-Desktop-Applications15-SP4-Updates' is up to date.\nRepository 'SLE-Module-DevTools15-SP4-Pool' is up to date.\nRepository 'SLE-Module-DevTools15-SP4-Updates' is up to date.\nRepository 'SLE-Module-Public-Cloud15-SP4-Pool' is up to date.\nRepository 'SLE-Module-Public-Cloud15-SP4-Updates' is up to date.\nRepository 'SLE-Module-Python3-15-SP4-Pool' is up to date.\nRepository 'SLE-Module-Python3-15-SP4-Updates' is up to date.\nRepository 'SLE-Module-SAP-Applications15-SP4-Pool' is up to date.\nRepository 'SLE-Module-SAP-Applications15-SP4-Updates' is up to date.\nRepository 'SLE-Product-HA15-SP4-Pool' is up to date.\nRepository 'SLE-Product-HA15-SP4-Updates' is up to date.\nRepository 'SLE-Module-Live-Patching15-SP4-Pool' is up to 
date.\nRepository 'SLE-Module-Live-Patching15-SP4-Updates' is up to date.\nRepository 'SLE-Product-SLES_SAP15-SP4-Pool' is up to date.\nRepository 'SLE-Product-SLES_SAP15-SP4-Updates' is up to date.\nRepository 'SLE-Module-Server-Applications15-SP4-Pool' is up to date.\nRepository 'SLE-Module-Server-Applications15-SP4-Updates' is up to date.\nRepository 'SLE-Module-Web-Scripting15-SP4-Pool' is up to date.\nRepository 'SLE-Module-Web-Scripting15-SP4-Updates' is up to date.\nRetrieving repository 'Google Cloud Ops Agent Repository' metadata [.error]\nRetrieving repository 'Google Cloud Ops Agent Repository' metadata [.error]\nRefreshing service 'Basesystem_Module_x86_64'.\nRefreshing service 'Containers_Module_x86_64'.\nRefreshing service 'Desktop_Applications_Module_x86_64'.\nRefreshing service 'Development_Tools_Module_x86_64'.\nRefreshing service 'Public_Cloud_Module_x86_64'.\nRefreshing service 'Python_3_Module_x86_64'.\nRefreshing service 'SAP_Applications_Module_x86_64'.\nRefreshing service 'SUSE_Linux_Enterprise_High_Availability_Extension_x86_64'.\nRefreshing service 'SUSE_Linux_Enterprise_Live_Patching_x86_64'.\nRefreshing service 'SUSE_Linux_Enterprise_Server_for_SAP_Applications_x86_64'.\nRefreshing service 'Server_Applications_Module_x86_64'.\nRefreshing service 'Web_and_Scripting_Module_x86_64'.\nWarning: Skipping repository 'Google Cloud Ops Agent Repository' because of the above error.\nLoading repository data...\nReading installed packages...\n'google-cloud-ops-agent' not found in package names. 
Trying capabilities.", "stdout_lines": ["Repository 'SLE-Module-Basesystem15-SP4-Pool' is up to date.", "Repository 'SLE-Module-Basesystem15-SP4-Updates' is up to date.", "Repository 'SLE-Module-Containers15-SP4-Pool' is up to date.", "Repository 'SLE-Module-Containers15-SP4-Updates' is up to date.", "Repository 'SLE-Module-Desktop-Applications15-SP4-Pool' is up to date.", "Repository 'SLE-Module-Desktop-Applications15-SP4-Updates' is up to date.", "Repository 'SLE-Module-DevTools15-SP4-Pool' is up to date.", "Repository 'SLE-Module-DevTools15-SP4-Updates' is up to date.", "Repository 'SLE-Module-Public-Cloud15-SP4-Pool' is up to date.", "Repository 'SLE-Module-Public-Cloud15-SP4-Updates' is up to date.", "Repository 'SLE-Module-Python3-15-SP4-Pool' is up to date.", "Repository 'SLE-Module-Python3-15-SP4-Updates' is up to date.", "Repository 'SLE-Module-SAP-Applications15-SP4-Pool' is up to date.", "Repository 'SLE-Module-SAP-Applications15-SP4-Updates' is up to date.", "Repository 'SLE-Product-HA15-SP4-Pool' is up to date.", "Repository 'SLE-Product-HA15-SP4-Updates' is up to date.", "Repository 'SLE-Module-Live-Patching15-SP4-Pool' is up to date.", "Repository 'SLE-Module-Live-Patching15-SP4-Updates' is up to date.", "Repository 'SLE-Product-SLES_SAP15-SP4-Pool' is up to date.", "Repository 'SLE-Product-SLES_SAP15-SP4-Updates' is up to date.", "Repository 'SLE-Module-Server-Applications15-SP4-Pool' is up to date.", "Repository 'SLE-Module-Server-Applications15-SP4-Updates' is up to date.", "Repository 'SLE-Module-Web-Scripting15-SP4-Pool' is up to date.", "Repository 'SLE-Module-Web-Scripting15-SP4-Updates' is up to date.", "Retrieving repository 'Google Cloud Ops Agent Repository' metadata [.error]", "Retrieving repository 'Google Cloud Ops Agent Repository' metadata [.error]", "Refreshing service 'Basesystem_Module_x86_64'.", "Refreshing service 'Containers_Module_x86_64'.", "Refreshing service 'Desktop_Applications_Module_x86_64'.", "Refreshing service 
'Development_Tools_Module_x86_64'.", "Refreshing service 'Public_Cloud_Module_x86_64'.", "Refreshing service 'Python_3_Module_x86_64'.", "Refreshing service 'SAP_Applications_Module_x86_64'.", "Refreshing service 'SUSE_Linux_Enterprise_High_Availability_Extension_x86_64'.", "Refreshing service 'SUSE_Linux_Enterprise_Live_Patching_x86_64'.", "Refreshing service 'SUSE_Linux_Enterprise_Server_for_SAP_Applications_x86_64'.", "Refreshing service 'Server_Applications_Module_x86_64'.", "Refreshing service 'Web_and_Scripting_Module_x86_64'.", "Warning: Skipping repository 'Google Cloud Ops Agent Repository' because of the above error.", "Loading repository data...", "Reading installed packages...", "'google-cloud-ops-agent' not found in package names. Trying capabilities."]}

"Validate Operating Systen" task fails on Rocky Linux

TASK [google_cloud_ops_agents : Validate Operating Systen] ***********************************************************************************************************************************************
fatal: []: FAILED! => {
"assertion": "ansible_os_family == 'Windows' or ansible_distribution in ['Debian', 'Ubuntu', 'RedHat', 'CentOS', 'Amazon', 'SLES', 'openSUSE', 'SuSE', 'SLES_SAP', 'Windows']\n",

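The assertion above does not list Rocky Linux, so the check fails before anything is installed. A minimal sketch of the same condition with Rocky added, assuming Rocky Linux reports `ansible_distribution` as `Rocky` (worth confirming with the `setup` module on the target host) and that the task uses the `assert` module:

```yaml
# Sketch only: the distribution list is copied from the failing assertion,
# with 'Rocky' added. The 'Rocky' value is an assumption to verify per host.
- name: Validate operating system
  ansible.builtin.assert:
    that:
      - >-
        ansible_os_family == 'Windows' or ansible_distribution in
        ['Debian', 'Ubuntu', 'RedHat', 'CentOS', 'Rocky', 'Amazon',
        'SLES', 'openSUSE', 'SuSE', 'SLES_SAP', 'Windows']
```

Passing the assertion does not guarantee the rest of the role handles Rocky Linux, since later tasks may branch on the distribution name as well.
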
Support the --write-gcm installation option.

I think this is just a matter of adding DETECT_GCM="yes" to stackdriver.sysconfig and dropping the API key, but for EC2 we will also need to support uploading of the credentials JSON file to the host.
