aws / efs-utils
Utilities for Amazon Elastic File System (EFS)
License: MIT License
I realize CentOS 8 is not listed as a supported distribution; this issue serves to log the missing CentOS 8 support and include the log output of the build error. A PR has been opened to add CentOS 8 support.
# make rpm
./mangle-shebangs.sh
rm -rf build/rpmbuild
rm -rf amazon-efs-utils
rm -f amazon-efs-utils.tar.gz
rm -f amazon-efs-utils.spec
mkdir -p amazon-efs-utils
mkdir -p amazon-efs-utils/dist
cp -p dist/amazon-efs-mount-watchdog.conf amazon-efs-utils/dist
cp -p dist/amazon-efs-mount-watchdog.service amazon-efs-utils/dist
cp -p dist/efs-utils.conf amazon-efs-utils/dist
cp -p dist/efs-utils.crt amazon-efs-utils/dist
mkdir -p amazon-efs-utils/src
cp -rp src/mount_efs amazon-efs-utils/src
cp -rp src/watchdog amazon-efs-utils/src
mkdir -p amazon-efs-utils/man
cp -rp man/mount.efs.8 amazon-efs-utils/man
tar -czf amazon-efs-utils.tar.gz amazon-efs-utils/*
ln -sf dist/amazon-efs-utils.spec amazon-efs-utils.spec
mkdir -p build/rpmbuild/{SPECS,COORD_SOURCES,DATA_SOURCES,BUILD,RPMS,SOURCES,SRPMS}
cp amazon-efs-utils.spec build/rpmbuild/SPECS
cp amazon-efs-utils.tar.gz build/rpmbuild/SOURCES
rpmbuild -ba --define "_topdir `pwd`/build/rpmbuild" build/rpmbuild/SPECS/amazon-efs-utils.spec
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.29JJ7K
+ umask 022
+ cd /opt/efs-utils/build/rpmbuild/BUILD
+ cd /opt/efs-utils/build/rpmbuild/BUILD
+ rm -rf amazon-efs-utils
+ /usr/bin/gzip -dc /opt/efs-utils/build/rpmbuild/SOURCES/amazon-efs-utils.tar.gz
+ /usr/bin/tar -xvvof -
drwxr-xr-x root/root 0 2020-04-15 07:19 amazon-efs-utils/dist/
-rw-r--r-- root/root 571 2020-04-15 06:05 amazon-efs-utils/dist/amazon-efs-mount-watchdog.conf
-rw-r--r-- root/root 481 2020-04-15 06:05 amazon-efs-utils/dist/amazon-efs-mount-watchdog.service
-rw-r--r-- root/root 1510 2020-04-15 06:05 amazon-efs-utils/dist/efs-utils.conf
-rw-r--r-- root/root 4789 2020-04-15 06:05 amazon-efs-utils/dist/efs-utils.crt
drwxr-xr-x root/root 0 2020-04-15 07:19 amazon-efs-utils/man/
-rw-r--r-- root/root 7068 2020-04-15 06:05 amazon-efs-utils/man/mount.efs.8
drwxr-xr-x root/root 0 2020-04-15 07:19 amazon-efs-utils/src/
drwxr-xr-x root/root 0 2020-04-15 07:19 amazon-efs-utils/src/mount_efs/
-rwxr-xr-x root/root 57653 2020-04-15 07:19 amazon-efs-utils/src/mount_efs/__init__.py
drwxr-xr-x root/root 0 2020-04-15 07:19 amazon-efs-utils/src/watchdog/
-rwxr-xr-x root/root 38580 2020-04-15 07:19 amazon-efs-utils/src/watchdog/__init__.py
+ STATUS=0
+ '[' 0 -ne 0 ']'
+ cd amazon-efs-utils
+ /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w .
+ exit 0
Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.tkSnoG
+ umask 022
+ cd /opt/efs-utils/build/rpmbuild/BUILD
+ '[' /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64 '!=' / ']'
+ rm -rf /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64
++ dirname /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64
+ mkdir -p /opt/efs-utils/build/rpmbuild/BUILDROOT
+ mkdir /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64
+ cd amazon-efs-utils
+ mkdir -p /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/etc/amazon/efs
+ mkdir -p /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/usr/lib/systemd/system
+ install -p -m 644 /opt/efs-utils/build/rpmbuild/BUILD/amazon-efs-utils/dist/amazon-efs-mount-watchdog.service /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/usr/lib/systemd/system
+ mkdir -p /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/sbin
+ mkdir -p /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/usr/bin
+ mkdir -p /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/var/log/amazon/efs
+ mkdir -p /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/usr/share/man/man8
+ install -p -m 644 /opt/efs-utils/build/rpmbuild/BUILD/amazon-efs-utils/dist/efs-utils.conf /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/etc/amazon/efs
+ install -p -m 444 /opt/efs-utils/build/rpmbuild/BUILD/amazon-efs-utils/dist/efs-utils.crt /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/etc/amazon/efs
+ install -p -m 755 /opt/efs-utils/build/rpmbuild/BUILD/amazon-efs-utils/src/mount_efs/__init__.py /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/sbin/mount.efs
+ install -p -m 755 /opt/efs-utils/build/rpmbuild/BUILD/amazon-efs-utils/src/watchdog/__init__.py /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/usr/bin/amazon-efs-mount-watchdog
+ install -p -m 644 /opt/efs-utils/build/rpmbuild/BUILD/amazon-efs-utils/man/mount.efs.8 /opt/efs-utils/build/rpmbuild/BUILDROOT/amazon-efs-utils-1.24-4.el8.x86_64/usr/share/man/man8
+ /usr/lib/rpm/check-buildroot
+ /usr/lib/rpm/redhat/brp-ldconfig
/sbin/ldconfig: Warning: ignoring configuration file that cannot be opened: /etc/ld.so.conf: No such file or directory
+ /usr/lib/rpm/brp-compress
+ /usr/lib/rpm/brp-strip /usr/bin/strip
+ /usr/lib/rpm/brp-strip-comment-note /usr/bin/strip /usr/bin/objdump
+ /usr/lib/rpm/brp-strip-static-archive /usr/bin/strip
+ /usr/lib/rpm/brp-python-bytecompile 1
+ /usr/lib/rpm/brp-python-hardlink
+ PYTHON3=/usr/libexec/platform-python
+ /usr/lib/rpm/redhat/brp-mangle-shebangs
*** ERROR: ambiguous python shebang in /usr/bin/amazon-efs-mount-watchdog: #!/usr/bin/env python. Change it to python3 (or python2) explicitly.
*** ERROR: ambiguous python shebang in /sbin/mount.efs: #!/usr/bin/env python. Change it to python3 (or python2) explicitly.
error: Bad exit status from /var/tmp/rpm-tmp.tkSnoG (%install)
RPM build errors:
Bad exit status from /var/tmp/rpm-tmp.tkSnoG (%install)
make: *** [Makefile:57: rpm-only] Error 1
Test OS detail:
# cat /etc/os-release
NAME="CentOS Linux"
VERSION="8 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Linux 8 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-8"
CENTOS_MANTISBT_PROJECT_VERSION="8"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="8"
Commit 6242b0a just broke my build on Ubuntu 18.04. Here is the tail end of the output from apt-get install -y ./build/amazon-efs-utils*deb:
Reading state information...
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
amazon-efs-utils : Depends: which but it is not installable
Hi all, in my organization we have switched to IMDSv2 by setting http-tokens to required in the instance metadata options for every instance. efs-utils fails to mount once this is set, reporting 'Error retrieving region'.
Any help would be appreciated. (I'm using a build based on commit d692ffe currently)
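For reference, IMDSv2 replaces the plain metadata GET with a two-step token exchange; a minimal sketch using only urllib (endpoint paths as documented for the EC2 instance metadata service, TTL value illustrative):

```python
import urllib.request

IMDS_BASE = 'http://169.254.169.254'

def build_token_request(ttl=21600):
    # IMDSv2 step 1: PUT a session-token request with a TTL header.
    return urllib.request.Request(
        IMDS_BASE + '/latest/api/token',
        headers={'X-aws-ec2-metadata-token-ttl-seconds': str(ttl)},
        method='PUT')

def build_identity_request(token):
    # IMDSv2 step 2: pass the token on the actual metadata GET.
    return urllib.request.Request(
        IMDS_BASE + '/latest/dynamic/instance-identity/document',
        headers={'X-aws-ec2-metadata-token': token})
```

A helper that only ever issues the bare GET will get an HTTP 401 once http-tokens is required, which would surface as exactly this kind of region-lookup failure.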
NFS supports these options, so it seems natural that efs-utils should also support them. Is there a reason why it doesn't?
--Jamie
I have packaged efs-utils for openSUSE and SLE and patched the default Python interpreter to be Python 3. Unfortunately, trying to run mount.efs on Python 3 fails with an error:
ip-XXX-XXX-XXX-XXX:~ # python3 /sbin/mount.efs
Traceback (most recent call last):
  File "/sbin/mount.efs", line 674, in <module>
    main()
  File "/sbin/mount.efs", line 654, in main
    config = read_config()
  File "/sbin/mount.efs", line 520, in read_config
    p = ConfigParser.SafeConfigParser()
AttributeError: type object 'ConfigParser' has no attribute 'SafeConfigParser'
ip-XXX-XXX-XXX-XXX:~ #
Is Python 3 supposed to work, or is efs-utils currently Python 2 only?
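For what it's worth, SafeConfigParser was a deprecated alias in Python 3 and was dropped entirely in newer releases; a small compatibility shim (a sketch, not the project's actual code) that works on both interpreters:

```python
try:
    # Python 3: plain ConfigParser replaces SafeConfigParser.
    from configparser import ConfigParser
except ImportError:
    # Python 2 fallback: alias the old class under the new name.
    from ConfigParser import SafeConfigParser as ConfigParser

def read_config(path):
    # Same shape as the call in the traceback above, minus the
    # removed SafeConfigParser name.
    p = ConfigParser()
    p.read(path)
    return p
```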
We are using Fedora 26. I'm aware it's EOL.
I have an entry in /etc/fstab. When I execute mount -a, I get this error:
Failed to mount fs-xxxxxxxx because the network was not yet available, add "_netdev" to your mount options
#
# /etc/fstab
# Created by anaconda on Wed Jul 5 21:45:12 2017
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=45738e43-xxxx-xxxx-xxxx-6ac29c4c2329 / ext4 defaults 1 1
fs-xxxxxxxx.efs.us-east-1.amazonaws.com:/ /mnt/efs efs defaults,_netdev 0 0
Mounting it from the command line yields the same error result.
mount -t efs fs-xxxxxxxx.efs.us-east-1.amazonaws.com /mnt/efs
The same command works in Debian.
More information below
[root@apiproxywebapps system]# rpm -qa |grep amazon-efs-utils
amazon-efs-utils-1.4-1.fc26.noarch
[root@apiproxywebapps system]# rpm -qa |grep nfs-utils
nfs-utils-2.2.1-4.rc2.fc26.x86_64
After installing amazon-efs-mount-watchdog, I execute the following command to mount, and the result is a timeout. Why?
[ec2-user@ip-172-31-25-116 ~]$ sudo mount -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport fs-8139eac0.efs.ap-southeast-1.amazonaws.com:/test-efs1/ /mnt/efs1
mount.nfs: Connection timed out
The release 1.25-2 (https://github.com/aws/efs-utils/releases/tag/v1.25-2) and the release 1.25-1 (https://github.com/aws/efs-utils/releases/tag/v1.25-1) declare the exact same version (1.25) for the Debian package in dist/amazon-efs-utils.control, see:
https://github.com/aws/efs-utils/blob/master/dist/amazon-efs-utils.control
When you build a Debian package with build-deb.sh, both end up with the exact same version (1.25).
So it is not possible to upgrade amazon-efs-utils-1.25-1.deb to amazon-efs-utils-1.25-2.deb, because apt assumes the version is already installed.
The following files are also affected: "config.ini", "dist/amazon-efs-utils.spec", "src/watchdog/__init__.py" and "src/mount_efs/__init__.py"
Possible fixes for the future may be:
Would be happy about fixed versioning, thanks a lot.
AWS Region: Dublin, Ireland
Created 2 ASGs and 1 EFS across 3 subnets (spread over the 3 AZs in Dublin). The 3 subnets are private subnets with appropriate route tables entries pointing to a NAT for internet access. The 2 ASGs have expected, max and min set to 1 right now, and therefore have 1 t2.medium EC2 instance in each ASG running RHEL 7.6 (ami-036affea69a1101c9). I am able to confirm internet access. The RHEL 7.6 use a custom AMI built from ami-036affea69a1101c9, that has additional packages installed, all from existing repos in addition to JDK 1.x, Oracle 11.1 packages.
Spun up a separate t2.micro running the custom AMI and followed the instructions to create the aws-efs-utils v1.22 rpm. Installed the RPM on the 2 EC2 instances from above. Also installed nfs-utils on the 2 RHEL 7.6 EC2 instances above.
Setup security groups for the EC2 instances and EFS with rules allowing TCP traffic on port 2049 between the SGs.
Ran the following command,
mount -t efs -o tls fs-xxxxxxxx:/ /mnt/efs
and got the following output,
2020-02-13 08:38:25,683 - INFO - version=1.22 options={'tls': None, '_netdev': None, 'rw': None}
2020-02-13 08:38:25,708 - INFO - Starting TLS tunnel: "stunnel /var/run/efs/stunnel-config.fs-xxxxxxxx.u04.20359"
2020-02-13 08:38:25,710 - INFO - Started TLS tunnel, pid: 25324
2020-02-13 08:38:25,711 - INFO - Executing: "/sbin/mount.nfs4 127.0.0.1:/ /u04 -o rw,noresvport,nfsvers=4.1,retrans=2,_netdev,hard,wsize=1048576,timeo=600,rsize=1048576,port=20359"
2020-02-13 08:38:26,211 - ERROR - Failed to start TLS tunnel (errno=1). stdout="" stderr="[ ] Clients allowed=500
[.] stunnel 5.44 on x86_64-pc-linux-gnu platform
[.] Compiled/running with OpenSSL 1.0.2k-fips 26 Jan 2017
[.] Threading:PTHREAD Sockets:POLL,IPv6 TLS:ENGINE,FIPS,OCSP,PSK,SNI
[ ] errno: (*__errno_location ())
[.] Reading configuration from file /run/efs/stunnel-config.fs-ab0e5d60.u04.20359
[.] UTF-8 byte order mark not detected
[.] FIPS mode disabled
[ ] Compression disabled
[ ] Snagged 64 random bytes from /dev/urandom
[ ] PRNG seeded successfully
[!] /run/efs/stunnel-config.fs-xxxxxxxx.u04.20359:18: "libwrap = no": Specified option name is not valid here"
I tried installing stunnel versions 5.44, 5.50 and 5.56 and got the same response with all of them.
Finally, on a hunch, I commented out the following lines in /sbin/mount.efs:
if RHEL8_RELEASE_NAME not in get_system_release_version():
    efs_config['libwrap'] = 'no'
and ran the mount command again, and that fixed the issue. I was able to confirm from the logs that TLS was enabled and the EFS is usable.
I am not sure if there is any way to disable the libwrap = no option in the config file created on the fly by mount.efs, or any additional steps I can take to work around this problem?
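One possible shape for a fix, as a sketch only (this is not an efs-utils option): probe the stunnel build for libwrap support before emitting the option. stunnel's startup banner lists compiled-in features, and builds with libwrap show an Auth:LIBWRAP token, as seen in some of the logs in this thread:

```python
import subprocess

def banner_mentions_libwrap(banner):
    # Feature tokens like "Auth:LIBWRAP" appear in stunnel's banner;
    # their absence suggests "libwrap = no" would be rejected.
    return 'LIBWRAP' in banner

def stunnel_supports_libwrap(stunnel_bin='stunnel'):
    # `stunnel -version` prints the banner; treat any failure to run
    # the binary as "no libwrap support".
    try:
        out = subprocess.run([stunnel_bin, '-version'],
                             capture_output=True, text=True)
    except OSError:
        return False
    return banner_mentions_libwrap(out.stdout + out.stderr)
```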
I can do this using nfs4 with the following steps:
fs-c9280f81.efs.us-east-1.amazonaws.com:/ /home/ubuntu nfs4 nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,_netdev,noresvport,x-systemd.automount,x-systemd.device-timeout=10,x-systemd.requires=network-online.target 0 0
systemctl daemon-reload
systemctl restart remote-fs.target
Can we do this same thing using efs-utils? In other words, with efs-utils can we mount an EFS volume with systemd.automount without requiring a reboot?
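Untested sketch of an equivalent fstab entry using the efs type; the assumption here is that the x-systemd.* options are consumed by systemd's fstab generator rather than the mount helper, so they should combine with the helper in principle (file system ID and mount point are placeholders):

```
fs-xxxxxxxx:/ /home/ubuntu efs _netdev,tls,x-systemd.automount,x-systemd.device-timeout=10,x-systemd.requires=network-online.target 0 0
```

Followed by the same systemctl daemon-reload / restart remote-fs.target steps as in the nfs4 variant above.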
I just packaged efs-utils 1.16 for openSUSE/SLE, and while performing some basic tests I noticed that mount.efs --version returns the version number 1.13:
suse-laptop:~ # mount.efs --version
/sbin/mount.efs Version: 1.13
suse-laptop:~ #
I am unable to run the mount command on my EC2 instance, which is running Amazon Linux.
My EC2 instance is actually an EKS worker node. I was trying to use a persistent volume (EFS) with a pod, which resulted in a mount error. After some reading, I installed amazon-efs-utils, which most people pointed out would be missing from the worker node. Then I manually tried to mount the EFS on my worker node, which resulted in this error:
$ sudo yum install -y amazon-efs-utils
$ sudo mkdir efs
$ sudo mount -t efs fs-xxxexxax:/ efs
Traceback (most recent call last):
  File "/sbin/mount.efs", line 694, in <module>
    main()
  File "/sbin/mount.efs", line 690, in main
    mount_nfs(dns_name, path, mountpoint, options)
  File "/sbin/mount.efs", line 480, in mount_nfs
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, close_fds=True)
  File "/usr/lib64/python2.7/subprocess.py", line 390, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.7/subprocess.py", line 1025, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
Then I tried the other instruction given at the EFS console,
$ sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport fs-xxxxxx.efs.xx-xx-xx.amazonaws.com:/ efs
mount: /home/ec2-user/efs: bad option; for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount.<type> helper program.
According to my understanding of the second error, we need a helper program for the filesystem we are using; since we are using nfs4, there should be a /sbin/mount.nfs4 file. But it turns out there isn't any file at /sbin/mount.nfs4.
$ ls /sbin/ | grep mount
mount.cifs
mount.efs
The same thing shows in the Python stack trace: https://github.com/aws/efs-utils/blob/master/src/mount_efs/__init__.py#L476 is also looking for the /sbin/mount.nfs4 file, which is not there.
OS - Amazon Linux
AMI - eks-worker-v20 (ami-73a6e20b)
$ cat /etc/system-release
Amazon Linux release 2 (Karoo)
I am stuck on this issue, so please point me in the right direction. I mean, is something wrong with my setup, or is this a bug?
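For anyone hitting the same OSError: the helper ultimately execs /sbin/mount.nfs4, which the nfs-utils package provides; a quick diagnostic sketch (the search path list is an assumption, not what efs-utils itself does):

```python
import shutil

def find_nfs_helper():
    # mount.efs hands off to mount.nfs4; if nfs-utils is not installed,
    # the Popen call fails with "[Errno 2] No such file or directory".
    return shutil.which('mount.nfs4',
                        path='/sbin:/usr/sbin:/usr/bin:/bin')

if find_nfs_helper() is None:
    print('mount.nfs4 not found - install the nfs-utils package')
```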
I have just created EFS and launched the ec2 instance in the same AZ and trying to mount EFS on EC2 instance using EFS mount helper.
Steps to reproduce the error
Error I'm getting:
Failed to resolve "fs-xxxxxxx.efs.us-east-1.amazonaws.com" - check that your file system ID is correct.
Description:
We installed the EFS utility and configured the EFS filesystem with EFS mount targets within the VPC.
Added an entry in /etc/fstab for a permanent mount, like below:
echo "fs-xxxxxxx /mnt/efs efs tls,_netdev 0 0" >> /etc/fstab
After this, when I manually run mount -a -t efs, it works fine; the file system gets mounted successfully without any issue.
But when I try to invoke the same thing from the Ansible mount module like below:
- name: Mount up efs
  mount:
    path: /mnt/efs
    src: fs-xxxxxxxx
    fstype: efs
    opts: tls
    state: mounted
  become: true
  become_method: pbrun
  become_user: root
Note: Ansible is running as a root-privileged user on the target host.
Expected Result:
EFS filesystem should get mounted without any issue.
Actual Result:
We are getting error in ansible saying like
Error:
only root can run mount.efs
When I started debugging the issue, I saw the check in __init__.py for efs:
https://github.com/aws/efs-utils/blob/555154b79572cd2a9f63782cac4c1062eb9b1ebd/src/mount_efs/__init__.py
The user is validated with the getpass Python module, but somehow, even though I am using become in Ansible, it does not help me get rid of this error.
Could anyone please help me resolve this issue?
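A sketch of the kind of privilege check involved (the helper's actual code may differ; the report above says it uses getpass): what typically matters is the effective UID, so become must genuinely escalate to euid 0, and pbrun setups that only change the login name can still trip a check like this.

```python
import os

def assert_root():
    # Illustrative check, not the project's exact code: refuse to run
    # unless the effective UID is 0, regardless of the login name.
    if os.geteuid() != 0:
        raise SystemExit('only root can run mount.efs')
```

Running `id -u` from the same Ansible task is a quick way to confirm what euid the become configuration actually yields.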
The latest version was tagged as "v.17", should be "v1.7" :).
/kind bug
What happened?
If we mount many times in a container, we see occasional failures due to timing out when calling IMDS to fetch the region.
# fetch driver logs
$kubectl logs $(kubectl get po -l app=efs-csi-node -n kube-system -o jsonpath='{.items[0].metadata.name}') -n kube-system efs-plugin
Mount failed: exit status 1
Mounting command: mount
Mounting arguments: -t efs -o tls fs-b6654c1c:/dir1 /var/lib/kubelet/pods/de6a9057-ca0d-4503-ba64-bc842ba93aeb/volumes/kubernetes.io~csi/efs-pv1/mount
Output: Error retrieving region. Please set the "region" parameter in the efs-utils configuration file.
# fetch mount.log
$kubectl exec $(kubectl get po -l app=efs-csi-node -n kube-system -o jsonpath='{.items[0].metadata.name}') -n kube-system -it efs-plugin -- cat /var/log/amazon/efs/mount.log
2020-06-03 01:14:42,675 - WARNING - Region not found in config file and metadata service call failed, falling back to legacy "dns_name_format" check
2020-06-03 01:14:42,676 - WARNING - Legacy check for region in "dns_name_format" failed
2020-06-03 01:14:42,676 - ERROR - Unable to reach instance metadata service at http://169.254.169.254/latest/dynamic/instance-identity/document/, reason is timed out
What you expected to happen?
Mount succeeds all the time.
How to reproduce it (as minimally and precisely as possible)?
Create and delete the following spec a couple of times.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv1
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  mountOptions:
    - tls
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-b6654c1c:/dir1
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim1
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv2
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  mountOptions:
    - tls
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-b6654c1c:/dir2
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim2
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv3
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  mountOptions:
    - tls
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-b6654c1c:/dir3
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim3
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: efs-app
spec:
  containers:
    - name: app
      image: centos
      command: ["/bin/sh"]
      args: ["-c", "while true; do echo $(date -u) | tee --append /data-dir1/out.txt /data-dir2/out.txt /data-dir3/out.txt; sleep 5; done"]
      volumeMounts:
        - name: efs-volume-1
          mountPath: /data-dir1
        - name: efs-volume-2
          mountPath: /data-dir2
        - name: efs-volume-3
          mountPath: /data-dir3
  volumes:
    - name: efs-volume-1
      persistentVolumeClaim:
        claimName: efs-claim1
    - name: efs-volume-2
      persistentVolumeClaim:
        claimName: efs-claim2
    - name: efs-volume-3
      persistentVolumeClaim:
        claimName: efs-claim3
Anything else we need to know?:
It shouldn't be an issue with IMDSv2, since we are hitting the link-local address directly. So maybe throttling?
The issue is that IMDSv2 doesn't work by default in container environments (the allowed max hop count needs to be bumped up). aws/aws-sdk-go#2972
Can we get the timeout limit increased?
See efs-utils/src/mount_efs/__init__.py, line 299 in 7b1f2d0.
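One way the timeout could be softened, as a sketch only (parameter names and defaults are illustrative, not efs-utils configuration): retry the metadata call with exponential backoff instead of failing on the first timed-out request.

```python
import socket
import time
import urllib.error
import urllib.request

def get_metadata(url, timeout=1.0, retries=3, backoff=0.5):
    # Retry transient IMDS failures (timeouts, connection errors)
    # with exponential backoff before giving up.
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.read()
        except (urllib.error.URLError, socket.timeout):
            if attempt == retries - 1:
                raise
            time.sleep(backoff * (2 ** attempt))
```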
I'm getting the following error message:
INFO Entering SSM Agent hibernate - AccessDeniedException: User: arn:aws:sts::ABCDEFGHIJK:assumed-role/group/i-00a457c44cf7c1e42 is not authorized to perform: ssm:UpdateInstanceInformation on resource: arn:aws:ec2:us-west-2:ABCDEFGHIJK:instance/iadsfadsfa
I see that the amazon-ssm-agent is installed by default. Do I need to add the required role to the EC2 so I don't get this error message to make the app work fine?
I could not find any reference about amazon-ssm-agent
in the README.md file
There's a workaround guarding an unsupported option:
if RHEL8_RELEASE_NAME not in get_system_release_version():
    efs_config['libwrap'] = 'no'
This is happening on CentOS 8 because the condition evaluates to true there, so the option gets passed in and causes the TLS connection setup to fail. The log shows stunnel rejecting the libwrap option. Commenting out the condition entirely on CentOS 8 results in success, so the fix is probably just to expand that conditional.
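A sketch of what expanding the conditional could look like, matching on any EL8-family release string rather than RHEL alone (the release-name strings here are illustrative, not the project's actual constants):

```python
# Illustrative release-name tokens; the project's real constants may differ.
EL8_RELEASE_NAMES = [
    'Red Hat Enterprise Linux release 8',
    'CentOS Linux release 8',
]

def should_set_libwrap_no(system_release):
    # Emit "libwrap = no" only on platforms whose stunnel build still
    # accepts the option, i.e. anything outside the EL8 family.
    return not any(name in system_release for name in EL8_RELEASE_NAMES)
```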
Add installation support for Alpine linux. This would be very helpful when docker containers need to be used in a CI environment.
I attempted to recreate the behavior of this tool by running a local stunnel and found that I was unable to verify each of the provided certs when connecting to my Amazon EFS endpoint. stunnel was reporting the following:
2018.11.30 13:16:24 LOG4[17547:140558419277568]: CERT: Verification error: unable to get local issuer certificate
2018.11.30 13:16:24 LOG4[17547:140558419277568]: Certificate check failed: depth=3, /C=US/ST=Arizona/L=Scottsdale/O=Starfield Technologies, Inc./CN=Starfield Services Root Certificate Authority - G2
2018.11.30 13:16:24 LOG7[17547:140558419277568]: SSL alert (write): fatal: unknown CA
2018.11.30 13:16:24 LOG3[17547:140558419277568]: SSL_connect: 14090086: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
2018.11.30 13:16:24 LOG5[17547:140558419277568]: Connection reset: 0 byte(s) sent to SSL, 0 byte(s) sent to socket
I copied the provided efs-utils.crt to my stunnel configuration and found it was missing the Starfield Services Root Certificate Authority - G2 cert. I was forced to add https://certs.secureserver.net/repository/sf-class2-root.crt in order for stunnel to trust the provided CA certs and actually start using the stunnel verify = 2 option.
I also set up efs-utils completely and found this cert was not needed for its trust chain.
Please add this cert to your efs-utils.crt, or help explain why the efs-utils tool doesn't need it.
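A trivial sanity-check sketch for a bundle like efs-utils.crt: a CA bundle is just concatenated PEM blocks, so counting the BEGIN markers shows how many roots are present, and a missing root means one fewer block than expected (the expected count is situation-specific):

```python
PEM_MARKER = '-----BEGIN CERTIFICATE-----'

def pem_cert_count(bundle_text):
    # Count concatenated PEM certificate blocks in a CA bundle.
    return bundle_text.count(PEM_MARKER)
```

`openssl crl2pkcs7 -nocrl -certfile efs-utils.crt | openssl pkcs7 -print_certs -noout` is the usual way to then inspect each block's subject.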
Due to configurations that are established by our enterprise's VPC configuration, we do not have the ability to use the DNS name for our EFS shares. We are able to look up the IP address fairly easily, however, and it would be great if we could leverage the mount helper utility to manage TLS encryption using the IP address of our EFS instance.
Given that the typical mount command supports using an IP endpoint, it seems like mount.efs should support this as well with similar syntax: mount -t efs -o tls 10.1.2.3:/ /myefsvolume.
I've made these changes internally and validated that it works in our installation, and I am interested in submitting a PR with the changes. Prior to that, I want to ensure that the usage syntax above makes sense with any future plans for the utility.
Currently mount only supports the target formats "fs_id:/" or "efs_fqdn:/". This is restricting the use of custom DNS servers. If the mount helper support custom DNS names or mount target IP addresses, customers can either setup a A record in their DNS server to resolve the mount target IP addresses or specify the IP address directly in the mount command.
Expected options:
mount -t efs myefs.mydomain.com:/ /mnt
mount -t efs 172.31.40.212:/ /mnt
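The requested behavior implies classifying the mount target before the helper constructs a DNS name; a rough sketch (the regexes are illustrative and deliberately loose, e.g. IPv6 is ignored):

```python
import re

def classify_target(target):
    # Split off the export path: "fs-id:/", "host:/", "1.2.3.4:/".
    host = target.split(':', 1)[0]
    if re.fullmatch(r'fs-[0-9a-f]+', host):
        return 'fs_id'   # construct the regional EFS DNS name
    if re.fullmatch(r'\d{1,3}(\.\d{1,3}){3}', host):
        return 'ip'      # use the address as-is
    return 'dns'         # custom DNS name, resolve normally
```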
OS: CentOS Linux release 7.4.1708 (Core)
EC2: m5large
After mounting an EFS filesystem using efs-utils with -o tls, df no longer reports the filesystem usage correctly:
127.0.0.1:/ 8.0E 121M 8.0E 1% /mnt/efs
du -sh /mnt/efs
11G /mnt/efs
Hi guys,
I've been trying to automate the mount command with subprocess and I've had some interesting results.
This is Amazon Linux 2: amzn2-ami-hvm-2.0.20190823.1-x86_64-gp2 (ami-03ed5bd63ba378bd8).
efs-utils has been installed with yum.
Python 3 has been installed with venv.
The relevant snippet:
args = ['mount', f'-t {self.type}', f'-o {self.options}', f'{source_arg}', f'{self.mount_point}']
completed_process = subprocess.run(args, capture_output=True, text=True)
if completed_process.returncode != 0:
    message = f'Failed to mount {completed_process.returncode}\n'
    message += 'Attempted to run in subprocess :\n'
    message += completed_process.args
    message += 'StdErr\n'
    message += completed_process.stderr
    logger.error(message)
When I run my script -
Failed to mount 1
Attempted to run in subprocess:
mount -t efs -o tls,iam fs-xxxx /mnt/cross_account
mount: unsupported option format: tls,iam
I can run the constructed command directly fine, and /var/log/amazon/efs/mount.log shows
2020-03-19 09:25:33,211 - INFO - version=1.22 options={'tls': None, 'iam': None, 'rw': None}
2020-03-19 09:25:33,264 - INFO - Starting TLS tunnel: "stunnel /var/run/efs/stunnel-config.fs-c65954ff.mnt.cross_account.20224"
2020-03-19 09:25:33,266 - INFO - Started TLS tunnel, pid: 22080
2020-03-19 09:25:33,267 - INFO - Executing: "/sbin/mount.nfs4 127.0.0.1:/ /mnt/cross_account -o rw,noresvport,nfsvers=4.1,retrans=2,hard,wsize=1048576,timeo=600,rsize=1048576,port=20224"
2020-03-19 09:25:33,910 - INFO - Successfully mounted fs-xxxx.efs.ap-southeast-2.amazonaws.com at /mnt/cross_account
A second attempt trying to call mount.efs directly:
args = ['mount.efs',f'{source_arg}', f'{self.mount_point}', f'-o {self.options}']
mount.efs fs-xxxx /mnt/cross_account -o tls,iam
b'mount.nfs4: access denied by server while mounting fs-xxxx.efs.ap-southeast-2.amazonaws.com:/'
/var/log/amazon/efs/mount.log
2020-03-19 09:33:10,921 - INFO - version=1.22 options={}
2020-03-19 09:33:10,932 - INFO - Executing: "/sbin/mount.nfs4 fs-xxxx.efs.ap-southeast-2.amazonaws.com:/ /mnt/cross_account -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"
2020-03-19 09:33:11,261 - ERROR - Failed to mount fs-xxxxx.efs.ap-southeast-2.amazonaws.com at /mnt/cross_account: returncode=32, stderr="b'mount.nfs4: access denied by server while mounting fs-xxxx.efs.ap-southeast-2.amazonaws.com:/'"
Is there some weird interaction between shells and Python going on?
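The first failure looks like argv splitting rather than a shell interaction: with the list form of subprocess, '-t efs' as a single list element reaches mount as one argument, hence "unsupported option format". Each token needs its own element (the names below are placeholders for the snippet's attributes):

```python
import subprocess

def build_mount_args(fstype, options, source, mount_point):
    # Each flag and its value must be separate argv elements; there is
    # no shell to split '-t efs' for you when passing a list.
    return ['mount', '-t', fstype, '-o', options, source, mount_point]

args = build_mount_args('efs', 'tls,iam', 'fs-xxxx', '/mnt/cross_account')
# completed = subprocess.run(args, capture_output=True, text=True)
```

The second attempt shows the same symptom from the other side: the log records options={}, meaning the fused '-o tls,iam' token was never parsed as options, so the mount went out without tls/iam and the server denied access.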
I am running Ubuntu 18.04.4 server with the certified FIPS 140-2 kernel and have created a script to automate mounting my encrypted EFS:
efsHost=AWS_FS_ID
sudo apt install -y git make binutils jq
git clone https://github.com/aws/efs-utils
cd efs-utils
./build-deb.sh
sudo apt-get -y install ./build/amazon-efs-utils*deb
# mount EFS
sudo mkdir -p /mnt/efs
sudo mount -t efs -o tls ${efsHost}:/ /mnt/efs
The response that I get is Failed to initialize TLS tunnel for AWS_FS_ID.
Could the FIPS 140-2 kernel be causing the issue? I did not compile stunnel separately, as this is obviously quite a recent version of the OS.
If it helps:
$ uname -r
4.15.0-1011-fips
$ openssl version
OpenSSL 1.1.1 11 Sep 2018
... and if I attach to my EFS at the point of spinning up the instance (using the AWS console) it connects just fine (although I don't believe that uses transport encryption which I do require), which leads me to think that it's not the kernel, but thought I'd mention it all the same...
From the /var/log/amazon/efs/mount.log file:
2020-06-03 22:19:16,428 - ERROR - Failed to start TLS tunnel (errno=1). stdout="b''" stderr="b'
[ ] Clients allowed=500
[.] stunnel 5.44 on x86_64-pc-linux-gnu platform
[.] Compiled with OpenSSL 1.1.0g 2 Nov 2017
[.] Running with OpenSSL 1.1.1 11 Sep 2018
[.] Update OpenSSL shared libraries or rebuild stunnel
[.] Threading:PTHREAD Sockets:POLL,IPv6,SYSTEMD TLS:ENGINE,FIPS,OCSP,PSK,SNI Auth:LIBWRAP
[ ] errno: (*__errno_location ())
[.] Reading configuration from file /run/efs/stunnel-config.fs-d0a344d5.mnt.efs.20403
[.] UTF-8 byte order mark not detected
[.] FIPS mode disabled
[ ] Compression disabled
[ ] PRNG seeded successfully
[ ] Initializing service [efs]
[!] SSL_CTX_new: 140A90F2: error:140A90F2:SSL routines:SSL_CTX_new:unable to load ssl3 md5 routines
[!] Service [efs]: Failed to initialize TLS context'"
I am dockerizing an application that uses efs-utils to mount a directory.
Can you only use this on an ec2 machine that has access to an IAM role? Or can you provide the efs-utils with access keys on your local environment ?
Doesn't work under Ubuntu 18.04.
Even compiled latest stunnel and your efs utils.
Can you at least put some effort into supporting the biggest distros in your cloud and not just Amazon AMI.
That would be fantastic.
Oh, and update your docs. They are out of date and you wasted approximately 5 hours of my time.
It would be nice if the releases of efs-utils were published so that they are reachable via the GitHub API. At the moment an API call for the latest release:
curl -s https://api.github.com/repos/aws/efs-utils/releases/latest
only returns an error message:
{
  "message": "Not Found",
  "documentation_url": "https://developer.github.com/v3/repos/releases/#get-the-latest-release"
}
Other aws-projects (e.g. aws-toolkit-vscode) support the releases via API:
curl -s https://api.github.com/repos/aws/aws-toolkit-vscode/releases/latest
If the releases get published, the latest release would also be flagged in the web UI with a green "Latest release" label.
Would be happy about a fixed repository, thanks a lot.
Python 2 is already EOL, and Python 2 libs are already starting to disappear from, e.g., Debian Bullseye. I realize that the code in this project supports Python 3, but please consider adding Python 3 packaging support, at least for some versions of distributions.
I'm not sure the best way to go about this, but here are some places that I think would need to be changed:
Hi guys,
We are using efs-utils from within Docker containers spawned from AWS Batch. It works great, but occasionally we receive this error about 26 seconds after attempting to mount EFS over TLS:
mount.nfs4: Connection reset by peer
We are using the recommended mount command:
mount -t efs -o tls [EFS file system ID]:/ /mnt
This happens in ~0.2% of all mount attempts from all our VPCs. It's a particularly nasty issue because it seems to prevent the mount process from being killed cleanly. Since Apache Commons Exec 1.3's basic ExecuteWatchdog is not able to destroy it, the only remedy I have found is to terminate the EC2 instance.
Any ideas or insights would be greatly appreciated.
Thanks!
Cheers,
-Jon
Greetings!
Could you add support for Container Linux?
Thanks!
--Christian
Hi there,
Following instructions to configure /etc/hosts, VPC peering, etc., I've been able to successfully mount from Paris to an EFS residing in Ireland by issuing sudo mount -t efs fs-fXXXXXXX:/ /efs.
However when I add the TLS option, Stunnel fails:
[ec2-user@ip-X-X-X-X ~]$ sudo mount -t efs -o tls fs-fXXXXXXX:/ /efs
mount.nfs4: Connection reset by peer
Failed to initialize TLS tunnel for fs-fXXXXXXX
Is it possible to use TLS in EFS between Regions?
Cheers
Manuel
Hi. I've isolated a scenario where the EFS mount type helper fails to create a Docker volume, yet using the NFS mount type directly is successful. The error is "no such device". I suspect this is related to the addr= NFS mount option.
fs-12345678
docker volume create \
-d local \
-o "type=efs" \
-o "device=fs-12345678:/" \
-o "o=tls" \
myefs
docker run -ti -v myefs:/mnt/one alpine sh
The container does not run, and the "no such device" error below is given:
docker: Error response from daemon: error while mounting volume '/var/lib/docker/volumes/myefs/_data': failed to mount local volume: mount fs-12345678:/:/var/lib/docker/volumes/myefs/_data, data: tls: no such device.
Expected: no error, a running container, and the EFS filesystem mounted into the container's /mnt/one.
Clean up first: docker container prune (answer yes), then docker volume rm myefs.
docker volume create \
-d local \
-o "type=nfs" \
-o "device=:/" \
-o "o=addr=fs-12345678.efs.us-east-1.amazonaws.com,nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport" \
myefs
docker run -ti -v myefs:/mnt/one alpine sh
Yay! You are now in the running Alpine container, and the EFS filesystem was successfully mounted to the container's /mnt/one.
Please note the params on the volume creation: I used the addr= param instead of putting the DNS name on the device.
Below is a failure scenario using the typical NFS parameters, where the filesystem DNS name is on the device and the addr= parameter is not used. This approach is what appears in the EFS console itself.
Clean up first: docker container prune (answer yes), then docker volume rm myefs.
docker volume create \
-d local \
-o "type=nfs" \
-o "device=fs-12345678.efs.us-east-1.amazonaws.com:/" \
-o "o=nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport" \
myefs
docker run -ti -v myefs:/mnt/one alpine sh
The container does not run, and the "invalid argument" error below is given:
docker: Error response from daemon: error while mounting volume '/var/lib/docker/volumes/myefs/_data': failed to mount local volume: mount fs-12345678.efs.us-east-1.amazonaws.com:/:/var/lib/docker/volumes/myefs/_data, data: nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport: invalid argument.
Expected: no error, a running container, and the EFS filesystem mounted into the container's /mnt/one.
I executed
mount -t efs $EFS_DOMAIN:/ /mnt/efs
with version 1.23-2 and it hung.
With version 1.21-2 the mount succeeded.
I tried the commands below in the container. One is OK, but
curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600"
does not respond.
I think this is the cause of the issue.
Hello,
Could this be a feature request? We want to be able to mount the EFS volume using a Route 53 CNAME for the volume instead of the filesystem ID.
Thanks!
Hi,
I followed the documentation[1] and configured CodeBuild to mount an EFS drive. The error I get is "Error retrieving region". I located that in the efs-utils code base[2] and noticed that it tries to get the current region from EC2 instance metadata. That won't work in CodeBuild containers, will it?
[Container] 2019/12/17 05:03:23 Running command mount -t efs fs-11112222.efs.ap-south-1.amazonaws.com:/ /efs
Error retrieving region
P.S.: I have additionally tried with just the file system ID instead of the full hostname:
[Container] 2019/12/17 05:03:23 Running command mount -t efs fs-11112222:/ /efs
Error retrieving region
I have attempted mounting on an EC2 instance and that works fine. My CodeBuild container uses the Amazon Linux 2 1.0 image.
I have created a forum post as well[3].
[1] https://docs.aws.amazon.com/codebuild/latest/userguide/sample-efs.html#sample-efs-create-acb
[2] https://github.com/aws/efs-utils/blob/master/src/mount_efs/__init__.py#L130
[3] https://forums.aws.amazon.com/thread.jspa?threadID=314322&tstart=0
Hi,
Would it be possible to add openSUSE/SLES as a verified distribution?
All the tests run successfully and usage seems to work fine.
Thanks.
Please add git tags for releases of the efs-utils package. Right now the only way to find the commit for a given version is to search through file history, which is error-prone and not best practice.
Git tags and/or the GitHub release feature are available to track this.
Hello,
I've created an encrypted EFS and I want to mount it on a Debian instance using TLS.
I followed the instructions to build the deb package and installed it successfully (amazon-efs-utils-1.2-1.deb).
Now I'm stuck on mounting the EFS...
Here is my efs-utils.conf file:
#
# Copyright 2017-2018 Amazon.com, Inc. and its affiliates. All Rights Reserved.
#
# Licensed under the MIT License. See the LICENSE accompanying this file
# for the specific language governing permissions and limitations under
# the License.
#
[DEFAULT]
logging_level = INFO
logging_max_bytes = 1048576
logging_file_count = 10
[mount]
dns_name_format = fs-******.efs.*****.amazonaws.com
stunnel_debug_enabled = true
# Validate the certificate hostname on mount. This option is not supported by certain stunnel versions.
stunnel_check_cert_hostname = true
# Use OCSP to check certificate validity. This option is not supported by certain stunnel versions.
stunnel_check_cert_validity = true
# Define the port range that the TLS tunnel will choose from
port_range_lower_bound = 20049
port_range_upper_bound = 20449
[mount-watchdog]
enabled = true
poll_interval_sec = 1
unmount_grace_period_sec = 30
Here is the content of the /etc/fstab:
fs-******.efs.*****.amazonaws.com:/ /mnt/Encrypted-NAS efs defaults,_netdev,tls 0 0
Running
mount -a
results in
Invalid file system name: fs-******.efs.*****.amazonaws.com:/
ERROR:root:Invalid file system name: fs-******.efs.*****.amazonaws.com:/
Could you help me please?
Thank you
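One hedged possibility: the dns_name_format in the posted efs-utils.conf has been replaced with a literal (masked) hostname, so the helper can no longer substitute the file system ID and region. A sketch of the templated default, assuming the placeholder names used by efs-utils of that era:

```ini
# /etc/amazon/efs/efs-utils.conf -- [mount] section (sketch; the
# placeholders are assumed from the efs-utils defaults, not taken
# from this report)
[mount]
# The helper fills in {fs_id} and {region} itself; hard-coding a
# literal DNS name here can break name resolution and validation.
dns_name_format = {fs_id}.efs.{region}.amazonaws.com
```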
Trying to mount an EFS filesystem with encryption in transit. Mounting the filesystem fails because stunnel cannot establish a connection due to a failure when validating the certificate (OCSP).
From the stunnel debug log:
2019.05.29 12:43:45 LOG5[31598:139665592420096]: OCSP: Connecting the AIA responder "http://ocsp.rootca1.amazontrust.com"
2019.05.29 12:43:45 LOG6[31598:139665592420096]: connect_blocking: connecting 52.222.146.227:80
2019.05.29 12:43:45 LOG7[31598:139665592420096]: connect_blocking: s_poll_wait 52.222.146.227:80: waiting 10 seconds
2019.05.29 12:43:55 LOG3[31598:139665592420096]: connect_blocking: s_poll_wait 52.222.146.227:80: TIMEOUTconnect exceeded
2019.05.29 12:43:55 LOG4[31598:139665592420096]: OCSP check failed: depth=1, /C=US/O=Amazon/OU=Server CA 1B/CN=Amazon
2019.05.29 12:43:55 LOG7[31598:139665592420096]: SSL alert (write): fatal: handshake failure
2019.05.29 12:43:55 LOG3[31598:139665592420096]: SSL_connect: 14090086: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed
2019.05.29 12:43:55 LOG5[31598:139665592420096]: Connection reset: 0 byte(s) sent to SSL, 0 byte(s) sent to socket
The HTTP/HTTPS proxy allows requests to ocsp.rootca1.amazontrust.com, but it seems like stunnel is ignoring any proxy settings.
Any ideas?
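If the OCSP responder is only reachable through the proxy, one possible workaround (at the cost of skipping revocation checking) is to disable the validity check that efs-utils passes to stunnel, using the option already shown in the efs-utils.conf earlier in this thread:

```ini
# /etc/amazon/efs/efs-utils.conf -- [mount] section (workaround sketch:
# disables OCSP revocation checking; weigh the security trade-off)
[mount]
stunnel_check_cert_validity = false
```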
Additional command-line arguments break the expected argument structure.
sudo mount -v -t efs -o tls fs-25d214ac /efs
yields the following sys.argv:
['/sbin/mount.efs', 'fs-25d214ac', '/efs', '-v', '-o', 'rw,tls']
The current code only looks for '-o' at index 3, so it completely ignores the tls option and silently fails to use encryption in transit.
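The fix the reporter implies can be sketched by scanning the argument list for '-o' instead of assuming a fixed position. This is an illustration (function name is hypothetical, not the actual efs-utils code):

```python
def parse_mount_options(argv):
    """Return the option string following the last '-o' flag, or ''.

    Scanning the whole argv means extra flags such as '-v' can appear
    anywhere without hiding the options from the helper.
    """
    options = ''
    i = 0
    while i < len(argv):
        if argv[i] == '-o' and i + 1 < len(argv):
            options = argv[i + 1]
            i += 2
        else:
            i += 1
    return options

# With '-v' inserted before '-o', a fixed-index lookup would miss 'tls';
# scanning finds it regardless of position.
argv = ['/sbin/mount.efs', 'fs-25d214ac', '/efs', '-v', '-o', 'rw,tls']
assert 'tls' in parse_mount_options(argv).split(',')
```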
Hi guys,
I would like to mount an EFS volume over a TLS tunnel from a Java process running within a privileged Docker container. It does seem to work fine using the recommended -o tls option, but amazon-efs-mount-watchdog fails to start with the following message:
Could not start amazon-efs-mount-watchdog, unrecognized init system "java"
I noticed during my initial testing that Java was unable to kill/destroy the mount process, which hung forever (when I forgot to add the --privileged flag to docker run); I suspect this is also related to the watchdog not being launched.
Any tips/advice/help would be greatly appreciated.
Thank you.
Numerous stunnel processes are left behind after a mount point has been unmounted many times.
This causes a process leak and a port leak.
The following is htop output showing the leaking stunnel processes:
PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
1 root 20 0 113M 14064 10172 S 0.0 0.2 0:01.27 /bin/aws-efs-csi-driver --endpoint=unix:/csi/csi.sock --logtostderr --v=5
103 root 20 0 111M 5684 4824 S 0.0 0.1 0:00.04 `- stunnel /var/run/efs/stunnel-config.fs-e8a95a42.var.lib.kubelet.pods.49697c3c-123e-11ea-84e4-02e886441bde.volumes.k
96 root 20 0 111M 5732 4872 S 0.0 0.1 0:00.22 `- stunnel /var/run/efs/stunnel-config.fs-e8a95a42.var.lib.kubelet.pods.49697c3c-123e-11ea-84e4-02e886441bde.volumes.k
82 root 20 0 111M 5940 5080 S 0.0 0.1 0:00.03 `- stunnel /var/run/efs/stunnel-config.fs-e8a95a42.var.lib.kubelet.pods.c836f47c-123d-11ea-bb3d-0a95942502dc.volumes.k
75 root 20 0 111M 6064 5204 S 0.0 0.1 0:00.04 `- stunnel /var/run/efs/stunnel-config.fs-e8a95a42.var.lib.kubelet.pods.c836f47c-123d-11ea-bb3d-0a95942502dc.volumes.k
63 root 20 0 111M 6152 5292 S 0.0 0.1 0:00.03 `- stunnel /var/run/efs/stunnel-config.fs-e8a95a42.var.lib.kubelet.pods.cc6f0b25-123c-11ea-84e4-02e886441bde.volumes.k
56 root 20 0 111M 5920 5060 S 0.0 0.1 0:00.04 `- stunnel /var/run/efs/stunnel-config.fs-e8a95a42.var.lib.kubelet.pods.cc6f0b25-123c-11ea-84e4-02e886441bde.volumes.k
28 root 20 0 111M 6164 5300 S 0.0 0.1 0:00.03 `- stunnel /var/run/efs/stunnel-config.fs-e8a95a42.var.lib.kubelet.pods.4bf0b8f8-123c-11ea-84e4-02e886441bde.volumes.k
20 root 20 0 111M 6096 5236 S 0.0 0.1 0:00.03 `- stunnel /var/run/efs/stunnel-config.fs-e8a95a42.var.lib.kubelet.pods.4bf0b8f8-123c-11ea-84e4-02e886441bde.volumes.k
129 root 20 0 11752 3072 2720 S 0.0 0.0 0:00.00 bash
135 root 20 0 17128 3712 2744 R 0.0 0.0 0:00.03 `- htop
linux distro: amazonlinux:2 container image
kernel version: 4.14.133
efs-utils version: 1.9
We found during testing that, by default, stunnel resolves DNS names only once and caches the resulting IP forever.
In the rare case where an EFS mount target for an AZ gets recreated, stunnel is not able to reconnect if the IP of the mount target changed.
The behavior is controlled by stunnel's delay = yes | no option (see the docs).
Setting this option to yes in the efs_mount helper would make the tunnel more resilient in the face of dynamic IP addresses.
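A minimal sketch of the proposed change, assuming the mount helper writes one stunnel config file per mount: add the documented delay option so DNS is re-resolved at (re)connect time rather than once at startup.

```ini
; per-mount stunnel config fragment (sketch)
; delay = yes defers DNS resolution to connect time, so a recreated
; mount target with a new IP can still be reached on reconnect.
delay = yes
```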
We are trying to use the EFS provisioner on our k8s cluster, and it seems that we cannot use web-identity-based tokens with it:
https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/
Is this something that is planned for efs-utils?
many thanks
Hello,
This is the AWS EKS team. The customer issue that got reported to us is interesting. Let me try to summarize it here.
All mounts work against the same EFS file system.
20:49:55 mount volume a-1 # a is a k8s pod and each pod mounts three volumes with different subpaths.
20:49:55 mount volume a-2
20:49:56 mount volume a-3
20:51:52 unmount volume a-1
20:51:52 unmount volume a-2
20:51:52 unmount volume a-3
# so far so good
20:52:47 mount volume b-1 # hang indefinitely
20:52:48 mount volume b-2 # hang indefinitely
20:52:48 mount volume b-3
20:57:23 unmount volume b-3
21:00:51 mount volume c-1. # hang indefinitely
21:00:51 mount volume c-2 # hang indefinitely
21:01:25 mount volume c-3 # hang indefinitely
By hanging, I mean that if we check the running processes we can see:
root 32307 32205 0 20:52 ? 00:00:00 stunnel <equivalent of b1>
root 32473 32205 0 20:52 ? 00:00:00 /sbin/mount.nfs4 127.0.0.1:<b1>
root 32307 32205 0 20:52 ? 00:00:00 stunnel <b2>
root 32473 32205 0 20:52 ? 00:00:00 /sbin/mount.nfs4 127.0.0.1:<b2>
root 32307 32205 0 20:52 ? 00:00:00 stunnel <c1>
root 32473 32205 0 20:52 ? 00:00:00 /sbin/mount.nfs4 127.0.0.1:<c1>
...
After the customer manually killed those hanging processes, they were able to successfully mount the volumes again for some time, before the problem eventually reappeared.
dmesg:
[Sun Mar 8 21:00:04 2020] INFO: task mount.nfs4:32473 blocked for more than 120 seconds.
[Sun Mar 8 21:00:04 2020] Not tainted 4.9.0-11-amd64 #1 Debian 4.9.189-3+deb9u1
[Sun Mar 8 21:00:04 2020] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sun Mar 8 21:00:04 2020] mount.nfs4 D 0 32473 32205 0x00000000
[Sun Mar 8 21:00:04 2020] 0000000000000086 ffff8d6d58069800 0000000000000000 ffff8d6d5638a9c0
[Sun Mar 8 21:00:04 2020] ffff8d6d6d118980 ffff8d6d5b39e300 ffffa6320732b9c8 ffffffffb6417609
[Sun Mar 8 21:00:04 2020] ffff8d6d6d3fbcc0 0000000200000000 ffff8d6d6d118980 21174dd02544d08b
[Sun Mar 8 21:00:04 2020] Call Trace:
[Sun Mar 8 21:00:04 2020] [<ffffffffb6417609>] ? __schedule+0x239/0x6f0
[Sun Mar 8 21:00:04 2020] [<ffffffffb6417af2>] ? schedule+0x32/0x80
[Sun Mar 8 21:00:04 2020] [<ffffffffb6417daa>] ? schedule_preempt_disabled+0xa/0x10
[Sun Mar 8 21:00:04 2020] [<ffffffffb6419804>] ? __mutex_lock_slowpath+0xb4/0x130
[Sun Mar 8 21:00:04 2020] [<ffffffffb641989b>] ? mutex_lock+0x1b/0x30
[Sun Mar 8 21:00:04 2020] [<ffffffffc0782b04>] ? nfs4_discover_server_trunking+0x44/0x2a0 [nfsv4]
[Sun Mar 8 21:00:04 2020] [<ffffffffc0786792>] ? nfs_callback_up+0x182/0x470 [nfsv4]
[Sun Mar 8 21:00:04 2020] [<ffffffffc078ad10>] ? nfs4_init_client+0x120/0x2a0 [nfsv4]
[Sun Mar 8 21:00:04 2020] [<ffffffffc06ef361>] ? __fscache_acquire_cookie+0x61/0x150 [fscache]
[Sun Mar 8 21:00:04 2020] [<ffffffffc053f38b>] ? __rpc_init_priority_wait_queue+0x7b/0xb0 [sunrpc]
[Sun Mar 8 21:00:04 2020] [<ffffffffc0724575>] ? nfs_get_client+0x2c5/0x3b0 [nfs]
[Sun Mar 8 21:00:04 2020] [<ffffffffc078a301>] ? nfs4_set_client+0xb1/0x140 [nfsv4]
[Sun Mar 8 21:00:04 2020] [<ffffffffb5fa908f>] ? wb_init+0x18f/0x220
[Sun Mar 8 21:00:04 2020] [<ffffffffc078b7fa>] ? nfs4_create_server+0x12a/0x360 [nfsv4]
[Sun Mar 8 21:00:04 2020] [<ffffffffc07834b8>] ? nfs4_remote_mount+0x28/0x50 [nfsv4]
[Sun Mar 8 21:00:04 2020] [<ffffffffb601078b>] ? mount_fs+0x3b/0x160
[Sun Mar 8 21:00:04 2020] [<ffffffffb602e192>] ? vfs_kern_mount+0x62/0x100
More details can be found at kubernetes-sigs/aws-efs-csi-driver#141 .
efs-utils version=1.21
Have we seen patterns like this before?
Traceback (most recent call last):
File "/sbin/mount.efs", line 1369, in <module>
main()
File "/sbin/mount.efs", line 1363, in main
mount_tls(config, init_system, dns_name, path, fs_id, ap_id, mountpoint, options)
File "/sbin/mount.efs", line 1295, in mount_tls
with bootstrap_tls(config, init_system, dns_name, fs_id, ap_id, mountpoint, options) as tunnel_proc:
File "/lib64/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/sbin/mount.efs", line 697, in bootstrap_tls
base_path=state_file_dir)
File "/sbin/mount.efs", line 842, in create_certificate
private_key = check_and_create_private_key(base_path)
File "/sbin/mount.efs", line 905, in check_and_create_private_key
do_with_lock(generate_key)
File "/sbin/mount.efs", line 888, in do_with_lock
return function()
File "/sbin/mount.efs", line 901, in generate_key
subprocess_call(cmd, 'Failed to create private key')
File "/sbin/mount.efs", line 963, in subprocess_call
process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE, close_fds=True)
File "/lib64/python2.7/subprocess.py", line 394, in __init__
errread, errwrite)
File "/lib64/python2.7/subprocess.py", line 1047, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
Ideally, the script should abort with an error telling the user that openssl is required for the tls option, and how to install it.
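A hedged sketch of such a pre-flight check (the function name and message are illustrative, not from the efs-utils code base):

```python
import shutil

def require_binary(binary, install_hint):
    """Abort with a clear message if a required tool (e.g. openssl for
    the 'tls' option) is missing, instead of surfacing a raw OSError
    from subprocess."""
    path = shutil.which(binary)
    if path is None:
        raise SystemExit('%s is required for the "tls" mount option. %s'
                         % (binary, install_hint))
    return path
```

Called before the TLS bootstrap path, this turns the opaque "OSError: [Errno 2] No such file or directory" into an actionable message.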
In a similar vein to #9, we cannot rely on the Amazon DNS servers resolving the correct mount target IP for our AZ as we're using Directory Service for our VPC DNS.
The dns_name_format option in the configuration file seems like an ideal place to add an {az} replacement, to allow mounting {az}.{fs_id}.efs.{region}.amazonaws.com.
I'm happy to raise a PR for this, but I'm not sure how long the per-AZ mount points are going to be around, based on the docs (which state they're only there for 'backwards compatibility').
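Illustrative only, assuming the proposed {az} placeholder would be a straightforward str.format substitution alongside the existing {fs_id} and {region} ones:

```python
# Hypothetical expansion of a per-AZ dns_name_format (names illustrative).
dns_name_format = '{az}.{fs_id}.efs.{region}.amazonaws.com'

def build_dns_name(fmt, az, fs_id, region):
    """Expand the per-AZ DNS name format for a mount target."""
    return fmt.format(az=az, fs_id=fs_id, region=region)

print(build_dns_name(dns_name_format, 'eu-west-1a', 'fs-12345678', 'eu-west-1'))
# eu-west-1a.fs-12345678.efs.eu-west-1.amazonaws.com
```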
To fix issues like #61, a customer customized the mount helper to add a wait timeout of 15 seconds; if the mount doesn't respond in that time, it kills it and tries again.
Shall we consider doing something like that in the mount helper?
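The customer's workaround can be sketched as a timeout-and-retry wrapper around the mount command (a sketch of the described behavior, not the actual mount helper):

```python
import subprocess
import sys

def mount_with_retry(cmd, timeout_sec=15, attempts=3):
    """Run a (mount) command; if it does not finish within timeout_sec,
    kill it and try again."""
    for _ in range(attempts):
        try:
            # subprocess.run kills the child itself when the timeout expires.
            return subprocess.run(cmd, timeout=timeout_sec, check=True)
        except subprocess.TimeoutExpired:
            continue
    raise RuntimeError('command did not complete after %d attempts' % attempts)

# Demo with a harmless command instead of a real mount:
result = mount_with_retry([sys.executable, '-c', 'pass'], timeout_sec=10)
assert result.returncode == 0
```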
The EFS mount helper uses an incorrect DNS name for the filesystem, which leads to an error when mounting:
[ec2-user@ip-172-31-40-70 ~]$ sudo mount -t efs fs-bd02e558:/ efs
Failed to resolve "fs-bd02e558.efs.cn-northwest-1.amazonaws.com" - check that your file system ID is correct.
See https://docs.aws.amazon.com/console/efs/mount-dns-name for more detail.
[ec2-user@ip-172-31-40-70 ~]$ sudo mount -t efs -o tls fs-bd02e558:/ efs
Failed to resolve "fs-bd02e558.efs.cn-northwest-1.amazonaws.com" - check that your file system ID is correct.
See https://docs.aws.amazon.com/console/efs/mount-dns-name for more detail.
The correct DNS name should end with efs.<region>.amazonaws.com.cn.
Currently we can only mount the EFS filesystem using the NFS client, not the EFS mount helper from efs-utils.
After correcting the DNS name, we could use the NFS client to mount the filesystem:
[ec2-user@ip-172-31-40-70 ~]$ sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport fs-bd02e558.efs.cn-northwest-1.amazonaws.com.cn:/ efs
I checked the source file, and maybe we should determine whether it's running in a commercial region or a China region so that the mount helper can use the correct DNS name:
https://github.com/aws/efs-utils/blob/master/src/mount_efs/__init__.py
line 133:
EFS_FQDN_RE = re.compile(r'^(?P<fs_id>fs-[0-9a-f]+)\.efs\.(?P<region>[a-z0-9-]+)\.amazonaws.com$')
Could you please help resolve this issue? Thanks a lot!
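A hedged sketch of the suggested change: extending the pattern so it also accepts the amazonaws.com.cn suffix used by China regions (an illustration, not the actual fix adopted by the project):

```python
import re

# Hypothetical extension of EFS_FQDN_RE with an optional .cn suffix
# so China-partition DNS names are recognized as well.
EFS_FQDN_RE = re.compile(
    r'^(?P<fs_id>fs-[0-9a-f]+)\.efs\.'
    r'(?P<region>[a-z0-9-]+)\.(?P<dns_suffix>amazonaws\.com(\.cn)?)$')

# China-partition name now matches alongside the commercial one.
assert EFS_FQDN_RE.match('fs-bd02e558.efs.cn-northwest-1.amazonaws.com.cn')
assert EFS_FQDN_RE.match('fs-deadbeef.efs.us-east-1.amazonaws.com')
```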