efs-utils's Issues

`make rpm` fails on CentOS 8

I realize CentOS 8 is not listed as a supported distribution; this issue serves to log the missing CentOS 8 support and includes the log output of the build error. A PR has been opened to add CentOS 8 support.

# make rpm
./mangle-shebangs.sh
rm -rf build/rpmbuild
rm -rf amazon-efs-utils
rm -f amazon-efs-utils.tar.gz
rm -f amazon-efs-utils.spec
mkdir -p amazon-efs-utils
mkdir -p amazon-efs-utils/dist
cp -p dist/amazon-efs-mount-watchdog.conf amazon-efs-utils/dist
cp -p dist/amazon-efs-mount-watchdog.service amazon-efs-utils/dist
cp -p dist/efs-utils.conf amazon-efs-utils/dist
cp -p dist/efs-utils.crt amazon-efs-utils/dist
mkdir -p amazon-efs-utils/src
cp -rp src/mount_efs amazon-efs-utils/src
cp -rp src/watchdog amazon-efs-utils/src
mkdir -p amazon-efs-utils/man
cp -rp man/mount.efs.8 amazon-efs-utils/man
tar -czf amazon-efs-utils.tar.gz amazon-efs-utils/*
ln -sf dist/amazon-efs-utils.spec amazon-efs-utils.spec
mkdir -p build/rpmbuild/{SPECS,COORD_SOURCES,DATA_SOURCES,BUILD,RPMS,SOURCES,SRPMS}
cp amazon-efs-utils.spec build/rpmbuild/SPECS
cp amazon-efs-utils.tar.gz build/rpmbuild/SOURCES
rpmbuild -ba --define "_topdir `pwd`/build/rpmbuild" build/rpmbuild/SPECS/amazon-efs-utils.spec
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.29JJ7K
+ umask 022
+ cd /opt/efs-utils/build/rpmbuild/BUILD
+ cd /opt/efs-utils/build/rpmbuild/BUILD
+ rm -rf amazon-efs-utils
+ /usr/bin/gzip -dc /opt/efs-utils/build/rpmbuild/SOURCES/amazon-efs-utils.tar.gz
+ /usr/bin/tar -xvvof -
drwxr-xr-x root/root         0 2020-04-15 07:19 amazon-efs-utils/dist/
drwxr-xr-x                  Creating directory: amazon-efs-utils
-rw-r--r-- root/root       571 2020-04-15 06:05 amazon-efs-utils/dist/amazon-efs-mount-watchdog.conf
-rw-r--r-- root/root       481 2020-04-15 06:05 amazon-efs-utils/dist/amazon-efs-mount-watchdog.service
-rw-r--r-- root/root      1510 2020-04-15 06:05 amazon-efs-utils/dist/efs-utils.conf
-rw-r--r-- root/root      4789 2020-04-15 06:05 amazon-efs-utils/dist/efs-utils.crt
drwxr-xr-x root/root         0 2020-04-15 07:19 amazon-efs-utils/man/
-rw-r--r-- root/root      7068 2020-04-15 06:05 amazon-efs-utils/man/mount.efs.8
drwxr-xr-x root/root         0 2020-04-15 07:19 amazon-efs-utils/src/
drwxr-xr-x root/root         0 2020-04-15 07:19 amazon-efs-utils/src/mount_efs/
-rwxr-xr-x root/root     57653 2020-04-15 07:19 amazon-efs-utils/src/mount_efs/__init__.py
drwxr-xr-x root/root         0 2020-04-15 07:19 amazon-efs-utils/src/watchdog/
-rwxr-xr-x root/root     38580 2020-04-15 07:19 amazon-efs-utils/src/watchdog/__init__.py
+ STATUS=0
+ '[' 0 -ne 0 ']'
+ cd amazon-efs-utils
+ /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w .
+ exit 0
Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.tkSnoG
+ umask 022
+ cd /opt/efs-utils/build/rpmbuild/BUILD
+ '[' /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64 '!=' / ']'
+ rm -rf /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64
++ dirname /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64
+ mkdir -p /opt/efs-utils/build/rpmbuild/BUILDROOT
+ mkdir /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64
+ cd amazon-efs-utils
+ mkdir -p /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/etc/amazon/efs
+ mkdir -p /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/usr/lib/systemd/system
+ install -p -m 644 /opt/efs-utils/build/rpmbuild/BUILD/amazon-efs-utils/dist/amazon-efs-mount-watchdog.service /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/usr/lib/systemd/system
+ mkdir -p /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/sbin
+ mkdir -p /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/usr/bin
+ mkdir -p /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/var/log/amazon/efs
+ mkdir -p /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/usr/share/man/man8
+ install -p -m 644 /opt/efs-utils/build/rpmbuild/BUILD/amazon-efs-utils/dist/efs-utils.conf /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/etc/amazon/efs
+ install -p -m 444 /opt/efs-utils/build/rpmbuild/BUILD/amazon-efs-utils/dist/efs-utils.crt /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/etc/amazon/efs
+ install -p -m 755 /opt/efs-utils/build/rpmbuild/BUILD/amazon-efs-utils/src/mount_efs/__init__.py /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/sbin/mount.efs
+ install -p -m 755 /opt/efs-utils/build/rpmbuild/BUILD/amazon-efs-utils/src/watchdog/__init__.py /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/usr/bin/amazon-efs-mount-watchdog
+ install -p -m 644 /opt/efs-utils/build/rpmbuild/BUILD/amazon-efs-utils/man/mount.efs.8 /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/usr/share/man/man8
+ /usr/lib/rpm/check-buildroot
+ /usr/lib/rpm/redhat/brp-ldconfig
/sbin/ldconfig: Warning: ignoring configuration file that cannot be opened: /etc/ld.so.conf: No such file or directory
+ /usr/lib/rpm/brp-compress
+ /usr/lib/rpm/brp-strip /usr/bin/strip
+ /usr/lib/rpm/brp-strip-comment-note /usr/bin/strip /usr/bin/objdump
+ /usr/lib/rpm/brp-strip-static-archive /usr/bin/strip
+ /usr/lib/rpm/brp-python-bytecompile 1
+ /usr/lib/rpm/brp-python-hardlink
+ PYTHON3=/usr/libexec/platform-python
+ /usr/lib/rpm/redhat/brp-mangle-shebangs
*** ERROR: ambiguous python shebang in /usr/bin/amazon-efs-mount-watchdog: #!/usr/bin/env python. Change it to python3 (or python2) explicitly.
*** ERROR: ambiguous python shebang in /sbin/mount.efs: #!/usr/bin/env python. Change it to python3 (or python2) explicitly.
error: Bad exit status from /var/tmp/rpm-tmp.tkSnoG (%install)


RPM build errors:
    Bad exit status from /var/tmp/rpm-tmp.tkSnoG (%install)
make: *** [Makefile:57: rpm-only] Error 1

Test OS detail:

# cat /etc/os-release 
NAME="CentOS Linux"
VERSION="8 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Linux 8 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-8"
CENTOS_MANTISBT_PROJECT_VERSION="8"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="8"
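For reference, RHEL 8's brp-mangle-shebangs check rejects "#!/usr/bin/env python" because it cannot tell which interpreter is meant. A minimal sketch of the kind of rewrite that satisfies it (the function name is illustrative; the repository's mangle-shebangs.sh presumably does something along these lines for the packaged scripts):

```python
import re

def mangle_shebang(text, interpreter="python3"):
    """Rewrite an ambiguous `#!/usr/bin/env python` shebang to an explicit one.

    The negative lookahead leaves already-explicit shebangs
    (python2, python3) untouched.
    """
    return re.sub(r"^#!\s*/usr/bin/env python(?!\d)",
                  "#!/usr/bin/env " + interpreter,
                  text, count=1)
```
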

Adding 'which' dependency breaks build on Ubuntu 18.04 (possibly others)

Commit 6242b0a just broke my build on Ubuntu 18.04. Here is the tail end of the output from apt-get install -y ./build/amazon-efs-utils*deb:

Reading state information...
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
amazon-efs-utils : Depends: which but it is not installable

efs-utils seems to be unable to deal with IMDSv2

Hi all, in my organization we have switched to IMDSv2 by setting http-tokens to required in the instance metadata options for every instance. efs-utils fails to mount once this has been set, reporting 'Error retrieving region'.

Any help would be appreciated. (I'm using a build based on commit d692ffe currently)
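For context, IMDSv2 requires first fetching a session token with a PUT request and then presenting that token on every metadata GET; a plain GET (which efs-utils apparently issues) is rejected when http-tokens is required. A sketch of the two requests involved, using the documented IMDS endpoints and headers (the helper names are illustrative, not efs-utils code):

```python
from urllib.request import Request

TOKEN_URL = "http://169.254.169.254/latest/api/token"
DOC_URL = "http://169.254.169.254/latest/dynamic/instance-identity/document/"

def build_token_request(ttl_seconds=21600):
    # IMDSv2 step 1: PUT to the token endpoint with a TTL header
    return Request(TOKEN_URL, method="PUT",
                   headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)})

def build_document_request(token):
    # IMDSv2 step 2: GET the instance-identity document with the token attached
    return Request(DOC_URL, headers={"X-aws-ec2-metadata-token": token})
```

Passing each request to urllib's urlopen in turn (reading the token body first) yields the same region information that the legacy IMDSv1 GET returned.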

Problems running efs-utils with Python3

I have packaged efs-utils for openSUSE and SLE and patched the default Python interpreter to be Python3.

Unfortunately, trying to run mount.efs on Python3 fails with an error:

ip-XXX-XXX-XXX-XXX:~ # python3 /sbin/mount.efs
Traceback (most recent call last):
  File "/sbin/mount.efs", line 674, in <module>
    main()
  File "/sbin/mount.efs", line 654, in main
    config = read_config()
  File "/sbin/mount.efs", line 520, in read_config
    p = ConfigParser.SafeConfigParser()
AttributeError: type object 'ConfigParser' has no attribute 'SafeConfigParser'
ip-XXX-XXX-XXX-XXX:~ #

Is Python 3 supposed to work, or is efs-utils currently Python 2 only?
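The traceback above is a known Python 2/3 incompatibility: in Python 3 the module was renamed to configparser, and SafeConfigParser first became a deprecated alias of ConfigParser and was later removed. A minimal compatibility sketch (the path below mirrors the config file this package installs, but is illustrative here):

```python
try:
    from configparser import ConfigParser  # Python 3
except ImportError:
    from ConfigParser import SafeConfigParser as ConfigParser  # Python 2

def read_config(path="/etc/amazon/efs/efs-utils.conf"):
    # ConfigParser.read() silently skips missing files on both interpreters
    parser = ConfigParser()
    parser.read(path)
    return parser
```
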

Mounting efs fails on Fedora26 even with _netdev

We are using Fedora 26. I'm aware it's EOL.

I have an entry in /etc/fstab. When I execute mount -a, I get this error:
Failed to mount fs-xxxxxxxx because the network was not yet available, add "_netdev" to your mount options

#
# /etc/fstab
# Created by anaconda on Wed Jul  5 21:45:12 2017
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=45738e43-xxxx-xxxx-xxxx-6ac29c4c2329 /                       ext4    defaults        1 1
fs-xxxxxxxx.efs.us-east-1.amazonaws.com:/ /mnt/efs efs defaults,_netdev 0 0

Mounting it from the command line yields the same error result.
mount -t efs fs-xxxxxxxx.efs.us-east-1.amazonaws.com /mnt/efs

The same command works in Debian.

More information below

[root@apiproxywebapps system]# rpm -qa |grep amazon-efs-utils
amazon-efs-utils-1.4-1.fc26.noarch

[root@apiproxywebapps system]# rpm -qa |grep nfs-utils
nfs-utils-2.2.1-4.rc2.fc26.x86_64

timeout...

After installing amazon-efs-mount-watchdog, I executed the following command to mount the file system, but it timed out. Why?

[ec2-user@ip-172-31-25-116 ~]$ sudo mount -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport fs-8139eac0.efs.ap-southeast-1.amazonaws.com:/test-efs1/ /mnt/efs1
mount.nfs: Connection timed out

incorrect version of debian-package

the release 1.25-2 - https://github.com/aws/efs-utils/releases/tag/v1.25-2
and the release 1.25-1 - https://github.com/aws/efs-utils/releases/tag/v1.25-1

have the exact same version (1.25) for the debian-package in dist/amazon-efs-utils.control, see:
https://github.com/aws/efs-utils/blob/master/dist/amazon-efs-utils.control

When you build a Debian package with build-deb.sh, both will end up with the exact same version (1.25).

So it is not possible to update amazon-efs-utils-1.25-1.deb to amazon-efs-utils-1.25-2.deb, because apt assumes the version is already installed.

The following files are also affected: "config.ini", "dist/amazon-efs-utils.spec", "src/watchdog/__init__.py" and "src/mount_efs/__init__.py"

Possible fixes for the future may be:

  • new versioning with: Major.Minor-Release
    or
  • old versioning with: Major.Minor (even for small releases/fixes)

I would be happy to see fixed versioning, thanks a lot.
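The effect of dropping the Debian revision can be illustrated with a toy comparison (this is not dpkg's full version-comparison algorithm, just the Major.Minor-Release shape proposed above):

```python
def parse_version(v):
    """Split 'MAJOR.MINOR-RELEASE' into a comparable tuple; release defaults to 0."""
    upstream, _, release = v.partition("-")
    major, minor = upstream.split(".")
    return (int(major), int(minor), int(release or 0))
```

With the release component kept, 1.25-2 sorts above 1.25-1, so apt would see the newer package as an upgrade; with it dropped, both collapse to 1.25 and apt considers the version already installed.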

Mount fails when using mount -t efs -o tls on RHEL 7.6

AWS Region: Dublin, Ireland

Created 2 ASGs and 1 EFS across 3 subnets (spread over the 3 AZs in Dublin). The 3 subnets are private, with appropriate route table entries pointing to a NAT for internet access. The 2 ASGs have desired, max, and min set to 1 right now, and therefore each have 1 t2.medium EC2 instance running RHEL 7.6 (ami-036affea69a1101c9). I am able to confirm internet access. The RHEL 7.6 instances use a custom AMI built from ami-036affea69a1101c9 that has additional packages installed, all from existing repos, in addition to JDK 1.x and Oracle 11.1 packages.

Spun up a separate t2.micro running the custom AMI and followed the instructions to create the aws-efs-utils v1.22 rpm. Installed the RPM on the 2 EC2 instances from above. Also installed nfs-utils on the 2 RHEL 7.6 EC2 instances above.

Setup security groups for the EC2 instances and EFS with rules allowing TCP traffic on port 2049 between the SGs.

Ran the following command,
mount -t efs -o tls fs-xxxxxxxx:/ /mnt/efs

and got the following output,
2020-02-13 08:38:25,683 - INFO - version=1.22 options={'tls': None, '_netdev': None, 'rw': None}
2020-02-13 08:38:25,708 - INFO - Starting TLS tunnel: "stunnel /var/run/efs/stunnel-config.fs-xxxxxxxx.u04.20359"
2020-02-13 08:38:25,710 - INFO - Started TLS tunnel, pid: 25324
2020-02-13 08:38:25,711 - INFO - Executing: "/sbin/mount.nfs4 127.0.0.1:/ /u04 -o rw,noresvport,nfsvers=4.1,retrans=2,_netdev,hard,wsize=1048576,timeo=600,rsize=1048576,port=20359"
2020-02-13 08:38:26,211 - ERROR - Failed to start TLS tunnel (errno=1). stdout="" stderr="[ ] Clients allowed=500
[.] stunnel 5.44 on x86_64-pc-linux-gnu platform
[.] Compiled/running with OpenSSL 1.0.2k-fips 26 Jan 2017
[.] Threading:PTHREAD Sockets:POLL,IPv6 TLS:ENGINE,FIPS,OCSP,PSK,SNI
[ ] errno: (*__errno_location ())
[.] Reading configuration from file /run/efs/stunnel-config.fs-ab0e5d60.u04.20359
[.] UTF-8 byte order mark not detected
[.] FIPS mode disabled
[ ] Compression disabled
[ ] Snagged 64 random bytes from /dev/urandom
[ ] PRNG seeded successfully
[!] /run/efs/stunnel-config.fs-xxxxxxxx.u04.20359:18: "libwrap = no": Specified option name is not valid here"

I tried installing stunnel versions 5.44, 5.50 and 5.56 and got the same response with all of them.

Finally, on a hunch, I commented out the following lines in /sbin/mount.efs:
if RHEL8_RELEASE_NAME not in get_system_release_version():
    efs_config['libwrap'] = 'no'

and ran the mount command again and that fixed the issue. I was able to confirm from the logs that TLS was enabled and the EFS is usable.

I am not sure if there is any way to disable the libwrap=no option in the config file created on the fly by mount.efs, or whether there are additional steps I can take to work around this problem.
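One hypothetical alternative to keying the workaround off the distribution name would be to probe the installed stunnel itself: its startup banner lists compiled-in features, and the "Auth:LIBWRAP" token (visible in logs from stunnel builds that support it) is absent in the RHEL banner above. A sketch of such a capability check (not what efs-utils actually does):

```python
def build_stunnel_options(version_banner):
    """Hypothetical: emit 'libwrap = no' only when this stunnel build understands it.

    stunnel's banner lists compiled-in features; 'Auth:LIBWRAP' appears
    only when it was built against libwrap.
    """
    options = {}
    if "LIBWRAP" in version_banner:
        options["libwrap"] = "no"
    return options
```
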

Using efs-utils to mount EFS volume with systemd.automount without requiring reboot

I can do this using nfs4 with the following steps:

  1. Add the following line to /etc/fstab (note the systemd fields at the end):
    fs-c9280f81.efs.us-east-1.amazonaws.com:/ /home/ubuntu nfs4 nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,_netdev,noresvport,x-systemd.automount,x-systemd.device-timeout=10,x-systemd.requires=network-online.target 0 0
  2. run systemctl daemon-reload
  3. run systemctl restart remote-fs.target

Can we do this same thing using efs-utils? In other words, with efs-utils can we mount an EFS volume with systemd.automount without requiring reboot?
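It's worth noting that the x-systemd.* options in /etc/fstab are processed by systemd's fstab generator, not by the filesystem-specific mount helper, so in principle they should work with the efs type too. An untested sketch of the equivalent fstab entry (reusing the file system ID from the nfs4 example above; mount.efs supplies the NFS options itself):

```
fs-c9280f81:/ /home/ubuntu efs _netdev,x-systemd.automount,x-systemd.device-timeout=10,x-systemd.requires=network-online.target 0 0
```

followed by the same systemctl daemon-reload and systemctl restart remote-fs.target steps.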

mount.efs --version show incorrect version number

I just packaged efs-utils 1.16 for openSUSE/SLE and while performing some basic tests I noticed that mount.efs --version returns the version number 1.13:

suse-laptop:~ # mount.efs --version
/sbin/mount.efs Version: 1.13
suse-laptop:~ #

OSError: [Errno 2] No such file or directory

Description

I am unable to run the mount command on my ec2 which is using amazon-linux

Details

My EC2 instance is actually an EKS worker node. I was trying to use a persistent volume (EFS) with a pod, which resulted in the mount error. After some reading, I installed amazon-efs-utils, which most people pointed out would be missing from the worker node. Then I manually tried to mount the EFS on my worker node, which resulted in this error.

$ sudo yum install -y amazon-efs-utils
$ sudo mkdir efs
$ sudo mount -t efs fs-xxxexxax:/ efs
Traceback (most recent call last):
  File "/sbin/mount.efs", line 694, in <module>
    main()
  File "/sbin/mount.efs", line 690, in main
    mount_nfs(dns_name, path, mountpoint, options)
  File "/sbin/mount.efs", line 480, in mount_nfs
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, close_fds=True)
  File "/usr/lib64/python2.7/subprocess.py", line 390, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.7/subprocess.py", line 1025, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

Then I tried the other instruction given at the EFS console,

$ sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport fs-xxxxxx.efs.xx-xx-xx.amazonaws.com:/ efs
mount: /home/ec2-user/efs: bad option; for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount.<type> helper program.

According to my understanding of the second error, we need a helper program for the filesystem we are using; since we are using nfs4, there should be a /sbin/mount.nfs4 file. But it turns out there isn't any file at the /sbin/mount.nfs4 location.

$ ls /sbin/ | grep mount
mount.cifs
mount.efs

The same thing applies to the Python stack trace: https://github.com/aws/efs-utils/blob/master/src/mount_efs/__init__.py#L476 is also looking for the /sbin/mount.nfs4 file, which is not there.

OS Details:

OS - Amazon Linux
AMI - eks-worker-v20 (ami-73a6e20b)

$ cat /etc/system-release
Amazon Linux release 2 (Karoo)

I am stuck on this issue, so please point me in the right direction. I mean, is something wrong with my setup, or is this a bug?

Not able to mount efs using efs mount helper on ec2-instance

I have just created an EFS file system, launched an EC2 instance in the same AZ, and am trying to mount the EFS on the EC2 instance using the EFS mount helper.

Steps to reproduce the error

  1. Create EFS
  2. Launch EC2 instance in same AZ and share the same SG with EFS
  3. sudo yum install -y amazon-efs-utils
  4. sudo mkdir efs
  5. sudo mount -t efs FILE_SYSTEM_ID:/ efs

Error received:
Failed to resolve "fs-xxxxxxx.efs.us-east-1.amazonaws.com" - check that your file system ID is correct.

Not able to mount the EFS file system using ansible with elevated privilege

Description:
We installed the EFS utility and configured the EFS file system with mount targets within the VPC.
We added an entry to /etc/fstab for a permanent mount, like below:

echo "mount fs-xxxxxxx /mnt/efs efs tls,_netdev 0 0" >> /etc/fstab
After this, when I manually run mount -a -t efs, it works fine; the file system mounts successfully without any issue.

But when i try to invoke the same thing from ansible mount module like below

- name: Mount up efs
  mount:
    path: /mnt/efs
    src: fs-xxxxxxxx
    fstype: efs
    opts: tls
    state: mounted
  become: true
  become_method: pbrun
  become_user: root

Note: Ansible is running as a root-privileged user on the target host.

Expected Result:
EFS filesystem should get mounted without any issue.

Actual Result:
We get an error from Ansible like the following:

Error:
only root can run mount.efs

When I started debugging the issue, I found the relevant check in __init__.py for efs:
https://github.com/aws/efs-utils/blob/555154b79572cd2a9f63782cac4c1062eb9b1ebd/src/mount_efs/init.py

The user is validated with the getpass Python module, but somehow, even though I am using become in Ansible, it does not help me get rid of this error.

Could anyone please help me resolve this issue?
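A likely explanation: getpass.getuser() consults environment variables such as LOGNAME and USER, which pbrun-style privilege escalation may leave unchanged, so the helper still sees the unprivileged login name. Checking the effective UID sidesteps that; a sketch of such a check (illustrative, not the actual mount.efs code):

```python
import os

def is_root(euid=None):
    """True when running with root privileges, regardless of the login name."""
    if euid is None:
        euid = os.geteuid()
    return euid == 0
```
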

Mount failed due to timing out to IMDS to fetch region

/kind bug

What happened?
If we mount many times in a container, we see occasional failures due to timing out when calling IMDS to fetch the region.

# fetch driver logs
$kubectl logs $(kubectl get po -l app=efs-csi-node -n kube-system -o jsonpath='{.items[0].metadata.name}') -n kube-system efs-plugin

Mount failed: exit status 1
Mounting command: mount
Mounting arguments: -t efs -o tls fs-b6654c1c:/dir1 /var/lib/kubelet/pods/de6a9057-ca0d-4503-ba64-bc842ba93aeb/volumes/kubernetes.io~csi/efs-pv1/mount
Output: Error retrieving region. Please set the "region" parameter in the efs-utils configuration file.
# fetch mount.log
$kubectl exec $(kubectl get po -l app=efs-csi-node -n kube-system -o jsonpath='{.items[0].metadata.name}') -n kube-system -it efs-plugin -- cat /var/log/amazon/efs/mount.log

2020-06-03 01:14:42,675 - WARNING - Region not found in config file and metadata service call failed, falling back to legacy "dns_name_format" check
2020-06-03 01:14:42,676 - WARNING - Legacy check for region in "dns_name_format" failed
2020-06-03 01:14:42,676 - ERROR - Unable to reach instance metadata service at http://169.254.169.254/latest/dynamic/instance-identity/document/, reason is timed out

What did you expect to happen?
Mount succeeds all the time.

How to reproduce it (as minimally and precisely as possible)? Try creating and deleting the following spec a couple of times.
Try create and delete the following spec for a couple of times.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv1
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  mountOptions:
    - tls
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-b6654c1c:/dir1
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim1
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv2
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  mountOptions:
    - tls
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-b6654c1c:/dir2
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim2
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv3
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  mountOptions:
    - tls
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-b6654c1c:/dir3
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim3
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: efs-app
spec:
  containers:
  - name: app
    image: centos
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo $(date -u) | tee --append /data-dir1/out.txt /data-dir2/out.txt /data-dir1/out.txt;  sleep 5; done"]
    volumeMounts:
    - name: efs-volume-1
      mountPath: /data-dir1
    - name: efs-volume-2
      mountPath: /data-dir2
    - name: efs-volume-3
      mountPath: /data-dir3
  volumes:
  - name: efs-volume-1
    persistentVolumeClaim:
      claimName: efs-claim1
  - name: efs-volume-2
    persistentVolumeClaim:
      claimName: efs-claim2
  - name: efs-volume-3
    persistentVolumeClaim:
      claimName: efs-claim3

Anything else we need to know?:
It shouldn't be an IMDSv2 issue, since we are hitting the link-local address directly. So maybe throttling?
The actual issue is that IMDSv2 doesn't work by default in container environments (the allowed hop limit needs to be bumped up). aws/aws-sdk-go#2972

Can we get the timeout limit increased?

response = urlopen(request, timeout=1)
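A one-second timeout with no retry is fragile in containers, where the extra network hop makes IMDS responses slower. A generic retry sketch (names illustrative; the urlopen call would be wrapped in the callable passed in, and urllib timeouts surface as socket.timeout, an OSError subclass):

```python
import time

def call_with_retries(fn, attempts=3, delay=0.0):
    """Call fn(), retrying on OSError (including socket timeouts) up to `attempts` times."""
    last_error = None
    for _ in range(attempts):
        try:
            return fn()
        except OSError as e:
            last_error = e
            time.sleep(delay)
    raise last_error
```
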


Is amazon-ssm-agent required for efs-utils to work?

I'm getting the following error message:

INFO Entering SSM Agent hibernate - AccessDeniedException: User: arn:aws:sts::ABCDEFGHIJK:assumed-role/group/i-00a457c44cf7c1e42 is not authorized to perform: ssm:UpdateInstanceInformation on resource: arn:aws:ec2:us-west-2:ABCDEFGHIJK:instance/iadsfadsfa

I see that the amazon-ssm-agent is installed by default. Do I need to attach the required role to the EC2 instance so that I stop getting this error message and the app works fine?

I could not find any reference to amazon-ssm-agent in the README.md file.

RHEL8 workaround should be applied to CentOS 8

There's a workaround guarding an unsupported option:

if RHEL8_RELEASE_NAME not in get_system_release_version():
    efs_config['libwrap'] = 'no'

This happens on CentOS 8 because the condition apparently evaluates to true, so the option gets passed in and causes the TLS connection setup to fail. The log mentions stunnel not supporting libwrap, based on what I'm seeing. Commenting out the condition entirely on CentOS 8 results in success, so the fix is probably just to expand that conditional.
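One way to expand the conditional without enumerating distribution names is to key off PLATFORM_ID in /etc/os-release, which is "platform:el8" on both RHEL 8 and CentOS 8 (visible in the os-release dump in the CentOS 8 build issue above). A sketch of such a check (illustrative, not the actual efs-utils fix):

```python
def is_el8_platform(os_release_text):
    """True for any EL8 platform (RHEL 8, CentOS 8, ...) based on PLATFORM_ID."""
    for line in os_release_text.splitlines():
        if line.startswith("PLATFORM_ID="):
            return line.split("=", 1)[1].strip('"') == "platform:el8"
    return False
```
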

Support alpine linux

Add installation support for Alpine Linux. This would be very helpful when Docker containers need to be used in a CI environment.

Missing Trusted Certificate

I attempted to recreate the behavior of this tool by running a local stunnel and found that I was unable to verify each of the provided certs when connecting to my Amazon EFS endpoint. stunnel was reporting the following:

2018.11.30 13:16:24 LOG4[17547:140558419277568]: CERT: Verification error: unable to get local issuer certificate
2018.11.30 13:16:24 LOG4[17547:140558419277568]: Certificate check failed: depth=3, /C=US/ST=Arizona/L=Scottsdale/O=Starfield Technologies, Inc./CN=Starfield Services Root Certificate Authority - G2
2018.11.30 13:16:24 LOG7[17547:140558419277568]: SSL alert (write): fatal: unknown CA
2018.11.30 13:16:24 LOG3[17547:140558419277568]: SSL_connect: 14090086: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
2018.11.30 13:16:24 LOG5[17547:140558419277568]: Connection reset: 0 byte(s) sent to SSL, 0 byte(s) sent to socket

I copied the provided efs-utils.crt to my stunnel configuration and found it was missing the Starfield Services Root Certificate Authority - G2 cert. I was forced to add https://certs.secureserver.net/repository/sf-class2-root.crt in order for stunnel to trust the provided CA certs and actually start using the stunnel verify = 2 option.

I also set up efs-utils completely and found this cert was not needed for its trust chain.

Please add this cert to your efs-utils.crt, or help explain why the efs-utils tool doesn't need this.

Allow connection via IP address when DNS isn't enabled

Due to our enterprise's VPC configuration, we do not have the ability to use the DNS name for our EFS shares. We are able to look up the IP address fairly easily, however, and it would be great if we could leverage the mount helper utility to manage TLS encryption using the IP address of our EFS instance.

Given that the typical mount command supports using an IP endpoint, it seems like mount.efs should support this as well with similar syntax: mount -t efs -o tls 10.1.2.3:/ /myefsvolume.

I've made these changes internally and validated that it works in our installation, and I am interested in submitting a PR with the changes. Prior to that, I want to ensure that the usage syntax above makes sense with any future plans for the utility.

Support custom DNS names and mount target IP address in the mount option

Currently, mount only supports the target formats "fs_id:/" or "efs_fqdn:/". This restricts the use of custom DNS servers. If the mount helper supported custom DNS names or mount target IP addresses, customers could either set up an A record in their DNS server to resolve the mount target IP addresses or specify the IP address directly in the mount command.

Expected options:

mount -t efs myefs.mydomain.com:/ /mnt
mount -t efs 172.31.40.212:/ /mnt

df not reporting file system usage correctly

OS: CentOS Linux release 7.4.1708 (Core)
EC2: m5large

After mounting an EFS file system using efs-utils with -o tls, df no longer reports the file system usage correctly.

127.0.0.1:/ 8.0E 121M 8.0E 1% /mnt/efs

du -sh /mnt/efs
11G /mnt/efs

Using python3 subprocess doesn't seem to pass options (tls,iam) to underlying mount commands

Hi guys,

I've been trying to automate the mount command with subprocess and I've had some interesting results.

This is Amazon Linux 2: amzn2-ami-hvm-2.0.20190823.1-x86_64-gp2 (ami-03ed5bd63ba378bd8)

efs-utils has been installed via yum.
Python 3 has been installed with venv.

The relevant snippet

args = ['mount', f'-t {self.type}', f'-o {self.options}', f'{source_arg}', f'{self.mount_point}']
completed_process = subprocess.run(args, capture_output=True, text=True)
if completed_process.returncode != 0:
    message = f'Failed to mount {completed_process.returncode}\n'
    message += 'Attempted to run in subprocess :\n'
    message += completed_process.args
    message += 'StdErr\n'
    message += completed_process.stderr
    logger.error(message)

When I run my script -

Failed to mount 1
Attempted to run in subprocess:
mount -t efs -o tls,iam fs-xxxx /mnt/cross_account
mount: unsupported option format:  tls,iam

I can run the constructed command directly fine, and /var/log/amazon/efs/mount.log shows

2020-03-19 09:25:33,211 - INFO - version=1.22 options={'tls': None, 'iam': None, 'rw': None}
2020-03-19 09:25:33,264 - INFO - Starting TLS tunnel: "stunnel /var/run/efs/stunnel-config.fs-c65954ff.mnt.cross_account.20224"
2020-03-19 09:25:33,266 - INFO - Started TLS tunnel, pid: 22080
2020-03-19 09:25:33,267 - INFO - Executing: "/sbin/mount.nfs4 127.0.0.1:/ /mnt/cross_account -o rw,noresvport,nfsvers=4.1,retrans=2,hard,wsize=1048576,timeo=600,rsize=1048576,port=20224"
2020-03-19 09:25:33,910 - INFO - Successfully mounted fs-xxxx.efs.ap-southeast-2.amazonaws.com at /mnt/cross_account

A second attempt trying to call mount.efs directly:

args = ['mount.efs',f'{source_arg}', f'{self.mount_point}', f'-o {self.options}']
mount.efs fs-xxxx /mnt/cross_account -o tls,iam
b'mount.nfs4: access denied by server while mounting fs-xxxx.efs.ap-southeast-2.amazonaws.com:/'

/var/log/amazon/efs/mount.log

2020-03-19 09:33:10,921 - INFO - version=1.22 options={}
2020-03-19 09:33:10,932 - INFO - Executing: "/sbin/mount.nfs4 fs-xxxx.efs.ap-southeast-2.amazonaws.com:/ /mnt/cross_account -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"
2020-03-19 09:33:11,261 - ERROR - Failed to mount fs-xxxxx.efs.ap-southeast-2.amazonaws.com at /mnt/cross_account: returncode=32, stderr="b'mount.nfs4: access denied by server while mounting fs-xxxx.efs.ap-southeast-2.amazonaws.com:/'"

Is there some weird interaction between the shell and Python going on?
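This isn't shell-related: each list element passed to subprocess becomes exactly one argv entry, so f'-o {self.options}' hands mount the single argument "-o tls,iam", whose option string begins with a space — hence "unsupported option format:  tls,iam" (and in the mount.efs attempt, options={} in the log). Keeping each flag and its value as separate elements fixes it; a sketch (helper name illustrative):

```python
def build_mount_args(fstype, options, source, mount_point):
    # Each flag and its value must be its own argv element:
    # ['-o', 'tls,iam'], not ['-o tls,iam'].
    return ["mount", "-t", fstype, "-o", options, source, mount_point]
```

The resulting list can be passed straight to subprocess.run without shell=True.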

Failed to initialize TLS tunnel for [AWS EFS ID] on Ubuntu 18 FIPS kernel

I am running an Ubuntu 18.04.4 server with the certified FIPS 140-2 kernel, and have created a script to automate mounting my encrypted EFS:

efsHost=AWS_FS_ID
sudo apt install -y git make binutils jq
git clone https://github.com/aws/efs-utils
cd efs-utils
./build-deb.sh
sudo apt-get -y install ./build/amazon-efs-utils*deb


# mount EFS
sudo mkdir -p /mnt/efs
sudo mount -t efs -o tls ${efsHost}:/ /mnt/efs

The response that I get is Failed to initialize TLS tunnel for AWS_FS_ID.

Could the FIPS 140-2 kernel be causing the issue? I did not compile stunnel separately as this is obviously quite a recent version of the OS.

If it helps:

$ uname -r
4.15.0-1011-fips

$ openssl version
OpenSSL 1.1.1  11 Sep 2018

... and if I attach to my EFS at the point of spinning up the instance (using the AWS console), it connects just fine (although I don't believe that uses transport encryption, which I do require). That leads me to think it's not the kernel, but I thought I'd mention it all the same...

From the /var/log/amazon/efs/mount.log file:

2020-06-03 22:19:16,428 - ERROR - Failed to start TLS tunnel (errno=1). stdout="b''" stderr="b'[ ] Clients allowed=500\n[.] stunnel 5.44 on x86_64-pc-linux-gnu platform\n[.] Compiled with OpenSSL 1.1.0g  2 Nov 2017\n[.] Running  with OpenSSL 1.1.1  11 Sep 2018\n[.] Update OpenSSL shared libraries or rebuild stunnel\n[.] Threading:PTHREAD Sockets:POLL,IPv6,SYSTEMD TLS:ENGINE,FIPS,OCSP,PSK,SNI Auth:LIBWRAP\n[ ] errno: (*__errno_location ())\n[.] Reading configuration from file /run/efs/stunnel-config.fs-d0a344d5.mnt.efs.20403\n[.] UTF-8 byte order mark not detected\n[.] FIPS mode disabled\n[ ] Compression disabled\n[ ] PRNG seeded successfully\n[ ] Initializing service [efs]\n[!] SSL_CTX_new: 140A90F2: error:140A90F2:SSL routines:SSL_CTX_new:unable to load ssl3 md5 routines\n[!] Service [efs]: Failed to initialize TLS context'"

Using EFS-UTILS on non ec2 machine for local development

I am dockerizing an application that uses efs-utils to mount a directory.

Can you only use this on an EC2 machine that has access to an IAM role, or can you provide efs-utils with access keys in your local environment?

Failed to initialize TLS tunnel

Doesn't work under Ubuntu 18.04.

Even compiled latest stunnel and your efs utils.

Can you at least put some effort into supporting the biggest distros in your cloud and not just Amazon AMI.

That would be fantastic.

Oh, and update your docs. They are out of date and you wasted approximately 5 hours of my time.

Releases are only tagged but not published

It would be nice if the releases of efs-utils were published, so that they are reachable via the GitHub API. At the moment an API call for the latest release:

curl -s https://api.github.com/repos/aws/efs-utils/releases/latest

will only show an error-message

{
"message": "Not Found",
"documentation_url": "https://developer.github.com/v3/repos/releases/#get-the-latest-release"
}

Other aws-projects (e.g. aws-toolkit-vscode) support the releases via API:

curl -s https://api.github.com/repos/aws/aws-toolkit-vscode/releases/latest

If the releases get published, the latest release would also be flagged on the web UI with a green label saying "Latest release".


Would be happy to see this fixed in the repository, thanks a lot.

Drop mandatory use of Python2

Python2 is already EOL, and Python2 libs are already starting to disappear from, e.g., Debian Bullseye. I realize that the code in this project supports Python3, but please consider adding packaging support for Python 3, at least for some versions of distributions.

I'm not sure the best way to go about this, but here are some places that I think would need to be changed:

  • The code here explicitly forces Python2 for all versions of Fedora.
  • The code here forces Python2 as a dependency for any Debian packages that are built.
  • The code here forces Python2 as a dependency for any RPMs packages that are built.

Occasional "Connection reset by peer" errors when mounting

Hi guys,

We are using efs-utils from within Docker containers spawned from AWS Batch. It works great, but occasionally we receive this error about 26 seconds after attempting to mount EFS over TLS:

mount.nfs4: Connection reset by peer

We are using the recommended mount command:

mount -t efs -o tls [EFS file system ID]:/ /mnt

This happens in ~0.2% of all mount attempts from all our VPCs. It's a particularly nasty issue because it seems to prevent the mount process from being killed cleanly. Since Apache Commons Exec 1.3's basic ExecuteWatchdog is not able to destroy it, the only remedy I have found is to terminate the EC2 instance.

Any ideas or insights would be greatly appreciated.

Thanks!

Cheers,
-Jon

Not able to mount with TLS from another region

Hi there,

Following instructions to configure /etc/hosts, VPC peering, etc., I've been able to successfully mount from Paris to an EFS residing in Ireland by issuing sudo mount -t efs fs-fXXXXXXX:/ /efs .

However when I add the TLS option, Stunnel fails:

[ec2-user@ip-X-X-X-X ~]$ sudo mount -t efs -o tls fs-fXXXXXXX:/ /efs
mount.nfs4: Connection reset by peer
Failed to initialize TLS tunnel for fs-fXXXXXXX

Is it possible to use TLS in EFS between Regions?

Cheers

Manuel

efs helper fails with "no such device" on Amazon Linux 2 docker volume, yet nfs works

Hi. I've isolated a scenario where the EFS mount type helper fails to create a docker volume, yet using the NFS mount type directly is successful. The error is no such device. I suspect this is related to the addr= NFS mount option.

Setup

  • Amazon Linux 2 currently with kernel 4.19.75-28.73.amzn2.x86_64
  • ECS optimized AMI having docker 18.09.9-ce
  • ECS agent/init 1.36.1
  • amazon-efs-utils 1.18

Repro of Failure

  1. Create your VPC, security groups, NACLs, etc.
  2. Create an EFS filesystem, encrypted with default key, general purpose, bursting.
  3. Note the EFS fs id. For this repro, I will use fs-12345678
  4. Create EC2 instance from an Amazon ECS-optimized AMI
  5. Create your ECS cluster with that instance. (this step is probably not needed)
  6. SSH into the EC2 instance
  7. Create a docker volume
    docker volume create \
        -d local \
        -o "type=efs" \
        -o "device=fs-12345678:/" \
        -o "o=tls" \
        myefs
  8. Note there is no error.
  9. Run a docker container that mounts this volume into the container's filesystem
    docker run -ti -v myefs:/mnt/one alpine sh

Result

The container does not run, and the no such device error below is given:

docker: Error response from daemon: error while mounting volume '/var/lib/docker/volumes/myefs/_data': failed to mount local volume: mount fs-12345678:/:/var/lib/docker/volumes/myefs/_data, data: tls: no such device.

Expected

No error. Running container. And the EFS filesystem is mounted into the container's /mnt/one.


Repro of Success with NFS

  1. Run docker container prune and answer yes
  2. docker volume rm myefs
  3. Run
    docker volume create \
        -d local \
        -o "type=nfs" \
        -o "device=:/" \
        -o "o=addr=fs-12345678.efs.us-east-1.amazonaws.com,nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport" \
        myefs
  4. Note no error.
  5. Run a docker container that mounts this volume into the container's filesystem
    docker run -ti -v myefs:/mnt/one alpine sh

Successful Result with NFS

Yay! You are now in the running Alpine container and the EFS filesystem was successfully mounted to the container's /mnt/one

Please note the params on the volume creation. I used the addr= param instead of putting the DNS on the device.


Repro of FAILURE with NFS

Below is a failure scenario using the typical NFS parameters, where the filesystem DNS is on the device and does not use the addr= parameter. This approach is what appears in the EFS console itself.

  1. Run docker container prune and answer yes
  2. docker volume rm myefs
  3. Run
    docker volume create \
        -d local \
        -o "type=nfs" \
        -o "device=fs-12345678.efs.us-east-1.amazonaws.com:/" \
        -o "o=nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport" \
        myefs
  4. Note no error.
  5. Run a docker container that mounts this volume into the container's filesystem
    docker run -ti -v myefs:/mnt/one alpine sh

Failure Result with NFS

The container does not run, and the invalid argument error below is given:

docker: Error response from daemon: error while mounting volume '/var/lib/docker/volumes/myefs/_data': failed to mount local volume: mount fs-12345678.efs.us-east-1.amazonaws.com:/:/var/lib/docker/volumes/myefs/_data, data: nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport: invalid argument.

Expected Result with NFS

No error. Running container. And the EFS filesystem is mounted into the container's /mnt/one.

fails on codebuild containers with "error retrieving region"

Hi,

I followed the documentation[1] and configured CodeBuild to mount an EFS drive. The error I get is "Error retrieving region". I located that in the efs-utils code base[2] and noticed that it tries to get the current region from EC2 instance metadata. That won't work in CodeBuild containers, will it?

[Container] 2019/12/17 05:03:23 Running command mount -t efs fs-11112222.efs.ap-south-1.amazonaws.com:/ /efs 
Error retrieving region 

P.S.: I have additionally tried with just the file system ID instead of the full hostname above:

[Container] 2019/12/17 05:03:23 Running command mount -t efs fs-11112222:/ /efs 
Error retrieving region 

I have attempted mounting on an EC2 instance and that works fine. My CodeBuild container is an Amazon Linux 2 1.0 image.

I have created a forum post as well[3]

[1] https://docs.aws.amazon.com/codebuild/latest/userguide/sample-efs.html#sample-efs-create-acb
[2] https://github.com/aws/efs-utils/blob/master/src/mount_efs/__init__.py#L130
[3] https://forums.aws.amazon.com/thread.jspa?threadID=314322&tstart=0

Unclear versioning

Please add git tags for releases of the efs-utils package. Right now the only way to find the commit for a certain version is to search through the file history. This is error-prone and not best practice.

git tags and/or github release feature are available to track this.

Invalid file system name

Hello,

I've created an Encrypted EFS and I want to mount it on a Debian Instance using TLS.

I followed the instructions to build the deb package and installed it successfully (amazon-efs-utils-1.2-1.deb)

Now I'm stuck on mounting the EFS...

Here is my efs-utils.conf file:

#
# Copyright 2017-2018 Amazon.com, Inc. and its affiliates. All Rights Reserved.
#
# Licensed under the MIT License. See the LICENSE accompanying this file
# for the specific language governing permissions and limitations under
# the License.
#
 
[DEFAULT]
logging_level = INFO
logging_max_bytes = 1048576
logging_file_count = 10
 
[mount]
dns_name_format = fs-******.efs.*****.amazonaws.com
stunnel_debug_enabled = true
 
# Validate the certificate hostname on mount. This option is not supported by certain stunnel versions.
stunnel_check_cert_hostname = true
 
# Use OCSP to check certificate validity. This option is not supported by certain stunnel versions.
stunnel_check_cert_validity = true
 
# Define the port range that the TLS tunnel will choose from
port_range_lower_bound = 20049
port_range_upper_bound = 20449
 
[mount-watchdog]
enabled = true
poll_interval_sec = 1
unmount_grace_period_sec = 30

Here is the content of the /etc/fstab:

fs-******.efs.*****.amazonaws.com:/ /mnt/Encrypted-NAS efs defaults,_netdev,tls 0 0

Running

mount -a

results in

Invalid file system name: fs-******.efs.*****.amazonaws.com:/ 
ERROR:root:Invalid file system name: fs-******.efs.*****.amazonaws.com:/ 

Could you help me please?

Thank you

TLS with OCSP when efs-utils running behind proxy

Trying to mount an EFS filesystem with encryption in transit. Mounting the filesystem fails because stunnel cannot establish a connection due to a failure when validating the certificate (OCSP).

From the stunnel debug log:

2019.05.29 12:43:45 LOG5[31598:139665592420096]: OCSP: Connecting the AIA responder "http://ocsp.rootca1.amazontrust.com"
2019.05.29 12:43:45 LOG6[31598:139665592420096]: connect_blocking: connecting 52.222.146.227:80
2019.05.29 12:43:45 LOG7[31598:139665592420096]: connect_blocking: s_poll_wait 52.222.146.227:80: waiting 10 seconds
2019.05.29 12:43:55 LOG3[31598:139665592420096]: connect_blocking: s_poll_wait 52.222.146.227:80: TIMEOUTconnect exceeded
2019.05.29 12:43:55 LOG4[31598:139665592420096]: OCSP check failed: depth=1, /C=US/O=Amazon/OU=Server CA 1B/CN=Amazon
2019.05.29 12:43:55 LOG7[31598:139665592420096]: SSL alert (write): fatal: handshake failure
2019.05.29 12:43:55 LOG3[31598:139665592420096]: SSL_connect: 14090086: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed
2019.05.29 12:43:55 LOG5[31598:139665592420096]: Connection reset: 0 byte(s) sent to SSL, 0 byte(s) sent to socket

The HTTP/HTTPS proxy allows requests to ocsp.rootca1.amazontrust.com. But it seems like stunnel is ignoring any proxy settings.

Any ideas?

Additional Command Line Arguments Silently Break Encryption

Additional command-line arguments break the expected argument structure.

sudo mount -v -t efs -o tls fs-25d214ac /efs

yields the following sys.argv

['/sbin/mount.efs', 'fs-25d214ac', '/efs', '-v', '-o', 'rw,tls']

The current code only looks for '-o' at index 3, so it completely ignores the tls option and silently falls back to an unencrypted mount.
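A more robust approach would scan the whole argument vector for '-o' instead of assuming a fixed position. The sketch below is illustrative, not the helper's actual code; get_mount_options is a hypothetical name:

```python
def get_mount_options(args):
    """Scan the full argument list for '-o' and merge all option groups,
    instead of assuming '-o' sits at a fixed index."""
    options = {}
    for i, arg in enumerate(args):
        if arg == '-o' and i + 1 < len(args):
            for opt in args[i + 1].split(','):
                name, _, value = opt.partition('=')
                options[name] = value or None
    return options

# 'mount -v' shifts '-o' to index 4, but the scan still finds it:
argv = ['/sbin/mount.efs', 'fs-25d214ac', '/efs', '-v', '-o', 'rw,tls']
print('tls' in get_mount_options(argv))
```

With this, passing -v (or any other extra flag) can no longer silently disable encryption in transit.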

Mounting EFS with TLS option from within a Docker container

Hi guys,

I would like to mount an EFS volume over a TLS tunnel from a Java process running within a privileged Docker container. It does seem to work fine using the recommended -o tls option, but the amazon-efs-mount-watchdog fails to start with the following message:

Could not start amazon-efs-mount-watchdog, unrecognized init system "java"

I noticed during my initial testing that Java was unable to kill/destroy the mount process and hung forever (when I forgot to add the --privileged flag to docker run), which I suspect is also related to the watchdog not being launched.

Any tips/advice/help would be greatly appreciated.

Thank you.

[Bug] stunnel process is not killed after unmount

Numerous stunnel processes are left behind after a mount point has been mounted and unmounted many times.
This causes a process leak and a port leak.

The following is htop output of the leaking stunnel process:

  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
    1 root       20   0  113M 14064 10172 S  0.0  0.2  0:01.27 /bin/aws-efs-csi-driver --endpoint=unix:/csi/csi.sock --logtostderr --v=5
  103 root       20   0  111M  5684  4824 S  0.0  0.1  0:00.04 `- stunnel /var/run/efs/stunnel-config.fs-e8a95a42.var.lib.kubelet.pods.49697c3c-123e-11ea-84e4-02e886441bde.volumes.k
   96 root       20   0  111M  5732  4872 S  0.0  0.1  0:00.22 `- stunnel /var/run/efs/stunnel-config.fs-e8a95a42.var.lib.kubelet.pods.49697c3c-123e-11ea-84e4-02e886441bde.volumes.k
   82 root       20   0  111M  5940  5080 S  0.0  0.1  0:00.03 `- stunnel /var/run/efs/stunnel-config.fs-e8a95a42.var.lib.kubelet.pods.c836f47c-123d-11ea-bb3d-0a95942502dc.volumes.k
   75 root       20   0  111M  6064  5204 S  0.0  0.1  0:00.04 `- stunnel /var/run/efs/stunnel-config.fs-e8a95a42.var.lib.kubelet.pods.c836f47c-123d-11ea-bb3d-0a95942502dc.volumes.k
   63 root       20   0  111M  6152  5292 S  0.0  0.1  0:00.03 `- stunnel /var/run/efs/stunnel-config.fs-e8a95a42.var.lib.kubelet.pods.cc6f0b25-123c-11ea-84e4-02e886441bde.volumes.k
   56 root       20   0  111M  5920  5060 S  0.0  0.1  0:00.04 `- stunnel /var/run/efs/stunnel-config.fs-e8a95a42.var.lib.kubelet.pods.cc6f0b25-123c-11ea-84e4-02e886441bde.volumes.k
   28 root       20   0  111M  6164  5300 S  0.0  0.1  0:00.03 `- stunnel /var/run/efs/stunnel-config.fs-e8a95a42.var.lib.kubelet.pods.4bf0b8f8-123c-11ea-84e4-02e886441bde.volumes.k
   20 root       20   0  111M  6096  5236 S  0.0  0.1  0:00.03 `- stunnel /var/run/efs/stunnel-config.fs-e8a95a42.var.lib.kubelet.pods.4bf0b8f8-123c-11ea-84e4-02e886441bde.volumes.k
  129 root       20   0 11752  3072  2720 S  0.0  0.0  0:00.00 bash
  135 root       20   0 17128  3712  2744 R  0.0  0.0  0:00.03 `- htop

linux distro: amazonlinux:2 container image
kernel version: 4.14.133
efs-utils version: 1.9

Disable stunnel dns-caching

We found during testing that by default stunnel resolves DNS names only once and caches the resulting IP forever.
In the rare case where an EFS mount target for an AZ gets recreated, stunnel is not able to reconnect if the IP of the mount target changed.

The behavior is controlled by stunnel's delay = yes | no option (see the stunnel docs).

Setting this option to yes in the efs mount helper would make the tunnel more resilient in the face of dynamic IP addresses.
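For illustration, the relevant change in the generated per-mount stunnel configuration (written under /run/efs/) might look roughly like this; the file system ID, port, and region below are placeholders:

```ini
[efs]
client = yes
accept = 127.0.0.1:20049
connect = fs-12345678.efs.us-east-1.amazonaws.com:2049
; re-resolve the DNS name on each reconnect instead of caching the IP
delay = yes
```

The delay option defers DNS resolution until connect time, so a recreated mount target with a new IP would be picked up on reconnect.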

Mount hang indefinitely after previously mounting/unmounting successfully on the same instance

Hello,

This is the AWS EKS team. The customer issue that was reported to us is interesting. Let me try to summarize it here.

All mounts work against the same EFS file system.

20:49:55 mount volume a-1.    # a is a k8s pod and each pod mounts three volumes with different subpath. 
20:49:55 mount volume a-2
20:49:56 mount volume a-3

20:51:52 unmount volume a-1
20:51:52 unmount volume a-2
20:51:52 unmount volume a-3

# so far so good 

20:52:47 mount volume b-1   # hang indefinitely 
20:52:48 mount volume b-2   # hang indefinitely 
20:52:48 mount volume b-3   

20:57:23 unmount volume b-3 

21:00:51 mount volume c-1.  # hang indefinitely 
21:00:51 mount volume c-2  # hang indefinitely 
21:01:25 mount volume c-3  # hang indefinitely 

By hanging, I mean that if we check the running processes we can see:

root     32307 32205  0 20:52 ?        00:00:00 stunnel <equivalent of b1>
root     32473 32205  0 20:52 ?        00:00:00 /sbin/mount.nfs4 127.0.0.1:<b1> 
root     32307 32205  0 20:52 ?        00:00:00 stunnel <b2>
root     32473 32205  0 20:52 ?        00:00:00 /sbin/mount.nfs4 127.0.0.1:<b2> 
root     32307 32205  0 20:52 ?        00:00:00 stunnel <c1>
root     32473 32205  0 20:52 ?        00:00:00 /sbin/mount.nfs4 127.0.0.1:<c1> 
...

After customer manually killed those hanging processes, it is able to successfully mount the volumes again for some time, before the problem eventually reappears.

dmesg:

[Sun Mar  8 21:00:04 2020] INFO: task mount.nfs4:32473 blocked for more than 120 seconds.
[Sun Mar  8 21:00:04 2020]       Not tainted 4.9.0-11-amd64 #1 Debian 4.9.189-3+deb9u1
[Sun Mar  8 21:00:04 2020] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sun Mar  8 21:00:04 2020] mount.nfs4      D    0 32473  32205 0x00000000
[Sun Mar  8 21:00:04 2020]  0000000000000086 ffff8d6d58069800 0000000000000000 ffff8d6d5638a9c0
[Sun Mar  8 21:00:04 2020]  ffff8d6d6d118980 ffff8d6d5b39e300 ffffa6320732b9c8 ffffffffb6417609
[Sun Mar  8 21:00:04 2020]  ffff8d6d6d3fbcc0 0000000200000000 ffff8d6d6d118980 21174dd02544d08b
[Sun Mar  8 21:00:04 2020] Call Trace:
[Sun Mar  8 21:00:04 2020]  [<ffffffffb6417609>] ? __schedule+0x239/0x6f0
[Sun Mar  8 21:00:04 2020]  [<ffffffffb6417af2>] ? schedule+0x32/0x80
[Sun Mar  8 21:00:04 2020]  [<ffffffffb6417daa>] ? schedule_preempt_disabled+0xa/0x10
[Sun Mar  8 21:00:04 2020]  [<ffffffffb6419804>] ? __mutex_lock_slowpath+0xb4/0x130
[Sun Mar  8 21:00:04 2020]  [<ffffffffb641989b>] ? mutex_lock+0x1b/0x30
[Sun Mar  8 21:00:04 2020]  [<ffffffffc0782b04>] ? nfs4_discover_server_trunking+0x44/0x2a0 [nfsv4]
[Sun Mar  8 21:00:04 2020]  [<ffffffffc0786792>] ? nfs_callback_up+0x182/0x470 [nfsv4]
[Sun Mar  8 21:00:04 2020]  [<ffffffffc078ad10>] ? nfs4_init_client+0x120/0x2a0 [nfsv4]
[Sun Mar  8 21:00:04 2020]  [<ffffffffc06ef361>] ? __fscache_acquire_cookie+0x61/0x150 [fscache]
[Sun Mar  8 21:00:04 2020]  [<ffffffffc053f38b>] ? __rpc_init_priority_wait_queue+0x7b/0xb0 [sunrpc]
[Sun Mar  8 21:00:04 2020]  [<ffffffffc0724575>] ? nfs_get_client+0x2c5/0x3b0 [nfs]
[Sun Mar  8 21:00:04 2020]  [<ffffffffc078a301>] ? nfs4_set_client+0xb1/0x140 [nfsv4]
[Sun Mar  8 21:00:04 2020]  [<ffffffffb5fa908f>] ? wb_init+0x18f/0x220
[Sun Mar  8 21:00:04 2020]  [<ffffffffc078b7fa>] ? nfs4_create_server+0x12a/0x360 [nfsv4]
[Sun Mar  8 21:00:04 2020]  [<ffffffffc07834b8>] ? nfs4_remote_mount+0x28/0x50 [nfsv4]
[Sun Mar  8 21:00:04 2020]  [<ffffffffb601078b>] ? mount_fs+0x3b/0x160
[Sun Mar  8 21:00:04 2020]  [<ffffffffb602e192>] ? vfs_kern_mount+0x62/0x100

More details can be found at kubernetes-sigs/aws-efs-csi-driver#141 .

efs-utils version=1.21

Have we seen patterns like this before?

Cryptic error message when openssl binary is missing

Traceback (most recent call last):
  File "/sbin/mount.efs", line 1369, in <module>
    main()
  File "/sbin/mount.efs", line 1363, in main
    mount_tls(config, init_system, dns_name, path, fs_id, ap_id, mountpoint, options)
  File "/sbin/mount.efs", line 1295, in mount_tls
    with bootstrap_tls(config, init_system, dns_name, fs_id, ap_id, mountpoint, options) as tunnel_proc:
  File "/lib64/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/sbin/mount.efs", line 697, in bootstrap_tls
    base_path=state_file_dir)
  File "/sbin/mount.efs", line 842, in create_certificate
    private_key = check_and_create_private_key(base_path)
  File "/sbin/mount.efs", line 905, in check_and_create_private_key
    do_with_lock(generate_key)
  File "/sbin/mount.efs", line 888, in do_with_lock
    return function()
  File "/sbin/mount.efs", line 901, in generate_key
    subprocess_call(cmd, 'Failed to create private key')
  File "/sbin/mount.efs", line 963, in subprocess_call
    process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE, close_fds=True)
  File "/lib64/python2.7/subprocess.py", line 394, in __init__
    errread, errwrite)
  File "/lib64/python2.7/subprocess.py", line 1047, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

Ideally, the script should abort with an error telling the user that openssl is required for the tls option, and how to install it.
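A minimal pre-flight check (a sketch, not the helper's actual code; it uses Python 3's shutil.which, while the traceback above is from Python 2.7) that turns the bare OSError into an actionable message:

```python
import shutil
import sys

def check_openssl_available():
    """Fail fast with a clear message if the openssl binary is missing,
    instead of letting subprocess raise a bare OSError."""
    if shutil.which('openssl') is None:
        sys.stderr.write(
            'ERROR: the "tls" mount option requires the openssl binary. '
            'Install it with e.g. "yum install openssl" or '
            '"apt-get install openssl" and retry.\n')
        return False
    return True

print(check_openssl_available())
```

Calling this before generate_key would surface the missing dependency at the top of the mount attempt rather than deep in a traceback.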

Allow mounting specific AZ mount target

In a similar vein to #9, we cannot rely on the Amazon DNS servers resolving the correct mount target IP for our AZ as we're using Directory Service for our VPC DNS.

The dns_name_format option in the configuration file seems like an ideal place to add a {az} replacement, to allow mounting {az}.{fs_id}.efs.{region}.amazonaws.com.

I'm happy to raise a PR for this, but I'm not sure how long the per-AZ mount points are going to be around for, based on the docs (which state they're only there for 'backwards compatibility').
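For example, with a hypothetical {az} placeholder (the {fs_id} and {region} substitutions exist today; {az} is the proposal), the config could look like:

```ini
[mount]
# proposed: {az} would be replaced with the instance's availability zone,
# e.g. us-east-1a.fs-12345678.efs.us-east-1.amazonaws.com
dns_name_format = {az}.{fs_id}.efs.{region}.amazonaws.com
```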

Incorrect DNS name for EFS filesystem in China region

EFS mount helper uses an incorrect DNS name for filesystem and leads to error when mounting filesystem:

[ec2-user@ip-172-31-40-70 ~]$ sudo mount -t efs fs-bd02e558:/ efs 
Failed to resolve "fs-bd02e558.efs.cn-northwest-1.amazonaws.com (http://fs-bd02e558.efs.cn-northwest-1.amazonaws.com/)" - check that your file system ID is correct. 
See https://docs.aws.amazon.com/console/efs/mount-dns-name for more detail. 

[ec2-user@ip-172-31-40-70 ~]$ sudo mount -t efs -o tls fs-bd02e558:/ efs 
Failed to resolve "fs-bd02e558.efs.cn-northwest-1.amazonaws.com (http://fs-bd02e558.efs.cn-northwest-1.amazonaws.com/)" - check that your file system ID is correct. 
See https://docs.aws.amazon.com/console/efs/mount-dns-name for more detail.

The correct DNS name should end with efs.<region>.amazonaws.com.cn.
Currently we can only mount the EFS filesystem using the NFS client, not the EFS mount helper from efs-utils.

With the corrected DNS name, we can use the NFS client to mount the filesystem:

[ec2-user@ip-172-31-40-70 ~]$ sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport fs-bd02e558.efs.cn-northwest-1.amazonaws.com.cn:/ efs

I checked the source file, and maybe the helper should determine whether it's running in a commercial region or a China region so that it can use the correct DNS name:

https://github.com/aws/efs-utils/blob/master/src/mount_efs/__init__.py
line 133:

EFS_FQDN_RE = re.compile(r'^(?P<fs_id>fs-[0-9a-f]+)\.efs\.(?P<region>[a-z0-9-]+)\.amazonaws.com$')
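A regex that accepts both the commercial and China partition suffixes might look like the sketch below (which also escapes the dots in amazonaws.com, unescaped in the original pattern):

```python
import re

# Accept both fs-xxx.efs.<region>.amazonaws.com and
# fs-xxx.efs.<region>.amazonaws.com.cn (China partition)
EFS_FQDN_RE = re.compile(
    r'^(?P<fs_id>fs-[0-9a-f]+)\.efs\.'
    r'(?P<region>[a-z0-9-]+)\.amazonaws\.com(\.cn)?$')

for name in ('fs-bd02e558.efs.cn-northwest-1.amazonaws.com.cn',
             'fs-bd02e558.efs.us-east-1.amazonaws.com'):
    print(bool(EFS_FQDN_RE.match(name)))
```

The building of the DNS name from dns_name_format would need the matching change, so the helper appends .cn when running in a China region.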

Could you please kindly help to resolve this issue? Thanks a lot!
