Comments (5)
Hi guys,
I'm using the Serverless Framework (https://www.serverless.com/) with the serverless-python-requirements plugin in my project. I finally found a way to build native dependencies for AWS Lambda.
Use Docker to create a package with the native dependencies.
Dockerfile:
FROM lambci/lambda:build-python3.6
RUN pip install --upgrade pip
RUN yum -y remove cmake
RUN pip install cmake --upgrade
RUN yum -y install poppler-cpp-devel
serverless.yml:
frameworkVersion: ">=1.3.4 <2.11.2"
plugins:
  - serverless-python-requirements
provider:
  name: aws
  runtime: python3.6
custom:
  pythonRequirements:
    useDownloadCache: false
    useStaticCache: false
    invalidateCaches: true
    dockerizePip: true
    dockerFile: Dockerfile
    dockerExtraFiles:
      - /usr/lib64/libpoppler-cpp.so.10
      - /usr/lib64/libpoppler.so.46
      - /usr/lib64/libopenjpeg.so.2
The magic happens in 'dockerExtraFiles', which copies the native libraries into your Lambda package.
from pdftotext.
For pdftotext==2.1.5, the complete list of native libraries:
- /usr/lib64/libpoppler-cpp.so.10
- /usr/lib64/libpoppler.so.46
- /usr/lib64/libopenjpeg.so.1
- /usr/lib64/libpng15.so.15
- /usr/lib64/liblcms2.so.2
- /usr/lib64/libtiff.so.5
- /usr/lib64/libtiffxx.so.5
- /usr/lib64/libfontconfig.so.1
- /usr/lib64/libfreetype.so.6
- /usr/lib64/libjpeg.so.62
- /usr/lib64/libjbig.so.2.0
- /usr/lib64/libexpat.so.1
- /usr/lib64/libuuid.so.1
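A list like the one above does not have to be collected by hand. One possible way to derive it, sketched below, is to run `ldd` on the compiled extension inside the build container and keep the paths it resolves. The helper names `parse_ldd` and `native_deps` and the `/usr/lib64/` prefix filter are assumptions made for this illustration, not part of pdftotext or the Serverless plugin.

```python
import re
import subprocess

# Matches ldd lines such as:
#   libfoo.so.1 => /usr/lib64/libfoo.so.1 (0x00007f...)
# Lines without "=>" or with "not found" do not match.
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(/\S+)")


def parse_ldd(output, prefix="/usr/lib64/"):
    """Return the sorted, de-duplicated library paths in ldd output
    that start with the given prefix (hypothetical helper)."""
    paths = set()
    for line in output.splitlines():
        match = _LDD_LINE.match(line)
        if match and match.group(2).startswith(prefix):
            paths.add(match.group(2))
    return sorted(paths)


def native_deps(shared_object):
    """Run ldd on a compiled extension module and list the libraries
    it resolves under /usr/lib64 (hypothetical helper)."""
    result = subprocess.run(
        ["ldd", shared_object], capture_output=True, text=True, check=True
    )
    return parse_ldd(result.stdout)
```

The result will probably also contain libraries that the Lambda runtime already provides, so treat it as a starting point to prune rather than a final list.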
Complete Dockerfile
FROM lambci/lambda:build-python3.8
# Absolute path, so the venv stays on PATH after the WORKDIR change below
ENV VIRTUAL_ENV=/var/task/venv
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
RUN python3 -m venv $VIRTUAL_ENV
RUN pip install --upgrade pip
# Replace the system cmake with a newer one from pip
RUN yum -y remove cmake
RUN pip install cmake --upgrade
# Poppler C++ headers and libraries, needed to build pdftotext
RUN yum -y install poppler-cpp-devel
COPY requirements.txt .
RUN pip install -r requirements.txt
WORKDIR /var/task/venv/lib/python3.8/site-packages
# Copy your source code into the image
COPY lambda_function.py .
RUN cp /usr/lib64/libpoppler-cpp.so.10 .
RUN cp /usr/lib64/libpoppler.so.46 .
RUN cp /usr/lib64/libopenjpeg.so.1 .
RUN cp /usr/lib64/libpng15.so.15 .
RUN cp /usr/lib64/liblcms2.so.2 .
RUN cp /usr/lib64/libtiff.so.5 .
RUN cp /usr/lib64/libtiffxx.so.5 .
RUN cp /usr/lib64/libfontconfig.so.1 .
RUN cp /usr/lib64/libfreetype.so.6 .
RUN cp /usr/lib64/libjpeg.so.62 .
RUN cp /usr/lib64/libjbig.so.2.0 .
RUN cp /usr/lib64/libexpat.so.1 .
RUN cp /usr/lib64/libuuid.so.1 .
RUN zip -9qr upload-to-s3.zip .
RUN echo "upload-to-s3.zip created"
How do you get the package file out of Docker?
IMAGE_NAME=what-you-want
docker build -t ${IMAGE_NAME} .
docker run --rm --entrypoint cat ${IMAGE_NAME} upload-to-s3.zip > function-deployment-package.zip
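Before uploading, it can be worth checking that the native libraries really ended up in the zip, since a missing one only shows up as an import error at invocation time. A minimal sketch using only the standard library follows; the function name `missing_from_package` is made up for this example, and it compares file names only, not the directory inside the archive.

```python
import os
import zipfile


def missing_from_package(zip_path, required_libs):
    """Return the entries of required_libs whose file name does not
    appear anywhere in the zip archive (hypothetical helper)."""
    with zipfile.ZipFile(zip_path) as archive:
        packaged = {os.path.basename(name) for name in archive.namelist()}
    return [lib for lib in required_libs if os.path.basename(lib) not in packaged]
```

Calling it with `function-deployment-package.zip` and the library paths listed earlier should return an empty list if every library was copied.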
from pdftotext.
@svaidyans, did you find a solution? Were you able to work it out?
from pdftotext.
It seems this is solved. Closing, since there's no issue on my side here.
from pdftotext.
This is what worked for me:
FROM public.ecr.aws/lambda/python:3.9-x86_64
RUN pip install --upgrade pip
RUN yum install -y gcc gcc-c++ make poppler-cpp-devel
COPY requirements.txt .
RUN pip3 install -r requirements.txt --target "${LAMBDA_TASK_ROOT}"
COPY *.py ${LAMBDA_TASK_ROOT}/
from pdftotext.