
Comments (57)

tlk3 avatar tlk3 commented on September 18, 2024 51

Or you could use:

pdf_template = pdfrw.PdfReader(infile)
pdf_template.Root.AcroForm.update(pdfrw.PdfDict(NeedAppearances=pdfrw.PdfObject('true')))
pdfrw.PdfWriter().write(outfile, pdf_template)

This adds the NeedAppearances key/value to the AcroForm dict, if I'm understanding your problem correctly.

  • Updated 07/15/2020 to fix the formatting. I'm glad this has helped so many.

from pdfrw.

davidmacneil avatar davidmacneil commented on September 18, 2024 29

As a workaround, I was able to have the fields show up by setting the appearance dictionary (AP) to an empty string:

form = pdfrw.PdfReader(fname)
annotations = form.pages[0]['/Annots']
for annotation in annotations:
    # ... validate / update fields here
    annotation.update(pdfrw.PdfDict(AP=''))

The fields are then visible in Preview (macOS 10.13.4), but not in Acrobat Reader DC. I suspect that Preview detects the invalid appearance dictionary and sets it to a default value.


PeterSlezak avatar PeterSlezak commented on September 18, 2024 5

You're correct, the error occurs because no appearance stream is associated with the field, but you've created it in the wrong way. You've just assigned a stream to the AP dictionary. What you need to do is assign an indirect XObject to /N in the /AP dictionary, and you need to create the XObject from scratch.
The code should be something like the following, but I haven't tested it, as I don't have any such PDF file with me right now and no time to create one. You can post an example PDF:

from pdfrw import PdfWriter, PdfReader, IndirectPdfDict, PdfName, PdfDict

INVOICE_TEMPLATE_PATH = 'untitled.pdf'
INVOICE_OUTPUT_PATH = 'untitled-output.pdf'

field1value = 'im field_1 value'

template_pdf = PdfReader(INVOICE_TEMPLATE_PATH)
template_pdf.Root.AcroForm.Fields[0].V = field1value

# this depends on page orientation
rct = template_pdf.Root.AcroForm.Fields[0].Rect
height = round(float(rct[3]) - float(rct[1]), 2)
width = round(float(rct[2]) - float(rct[0]), 2)

# create XObject
xobj = IndirectPdfDict(
            BBox = [0, 0, width, height],
            FormType = 1,
            Resources = PdfDict(ProcSet = [PdfName.PDF, PdfName.Text]),
            Subtype = PdfName.Form,
            Type = PdfName.XObject
            )

#assign a stream to it
xobj.stream = '''/Tx BMC
BT
 /Helvetica 8.0 Tf
 1.0 5.0 Td
 0 g
 (''' + field1value + ''') Tj
ET EMC'''

#put all together
template_pdf.Root.AcroForm.Fields[0].AP = PdfDict(N = xobj)

#output to new file
PdfWriter().write(INVOICE_OUTPUT_PATH, template_pdf)

FYI: /Type, /FormType and /Resources are optional (/Resources is strongly recommended).
I'm not going to explain the code, but if anything is unclear just ask or check the PDF Reference (all the info is there :))
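
Because the snippet above splices field1value directly into the stream, a value containing parentheses or a backslash would produce a malformed literal string. Below is a minimal sketch of the two calculations involved (sizing the BBox from /Rect and building the stream text), using only the standard library. The helper names are hypothetical and not part of pdfrw, and the font and offsets are simply the values used above:

```python
def escape_pdf_literal(text):
    """Escape backslashes and parentheses for use inside a PDF literal string."""
    # Backslashes must be doubled first so the escapes added next are not re-escaped.
    return (text.replace('\\', '\\\\')
                .replace('(', '\\(')
                .replace(')', '\\)'))


def bbox_from_rect(rect):
    """Return [0, 0, width, height] for a /Rect given as [llx, lly, urx, ury]."""
    llx, lly, urx, ury = (float(v) for v in rect)
    return [0, 0, round(urx - llx, 2), round(ury - lly, 2)]


def text_appearance_stream(value, font='/Helvetica', size=8.0, x=1.0, y=5.0):
    """Build the content stream text for a single-line text field appearance."""
    return '\n'.join([
        '/Tx BMC',
        'BT',
        ' {} {} Tf'.format(font, size),
        ' {} {} Td'.format(x, y),
        ' 0 g',
        ' ({}) Tj'.format(escape_pdf_literal(value)),
        'ET EMC',
    ])


# /Rect entries are assumed to be numeric strings, as a PDF parser would return them.
print(bbox_from_rect(['100', '700', '250.5', '718']))
print(text_appearance_stream('Total (net)'))
```

The returned string could be assigned to xobj.stream, and the list to BBox, in the code above.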


alexgarciaguilera avatar alexgarciaguilera commented on September 18, 2024 5

Putting all your help into a simple script, this works for me on Windows 10:

#!/usr/bin/env python3

import os
import pdfrw

def writeFillablePDF(input_pdf_path, output_pdf_path, data_dict):
    # Read Input PDF
    template_pdf = pdfrw.PdfReader(input_pdf_path)

    # Set Appearances (make text fields visible)
    template_pdf.Root.AcroForm.update(pdfrw.PdfDict(NeedAppearances=pdfrw.PdfObject('true')))

    # Loop all Annotations
    for annotation in template_pdf.pages[0]['/Annots']:
        # Only widget annotations that have a field name
        if annotation['/Subtype'] == '/Widget' and annotation['/T']: 
            key = annotation['/T'][1:-1] # Remove parentheses
            if key in data_dict.keys():
                annotation.update( pdfrw.PdfDict(V=f'{data_dict[key]}') )
                #print(f'={key}={data_dict[key]}=')
    pdfrw.PdfWriter().write(output_pdf_path, template_pdf)

if __name__ == '__main__':

    TEMPLATE_PATH = 'C:/tmp/OrigDoc.pdf'
    OUTPUT_PATH = 'C:/tmp/FilledDoc.pdf'

    # Assuming you know the text field names in the document,
    # build a dictionary with names and values
    data_dict = {
        'CustomerName': 'Big Company Name',
        'PartNumber': 'PN12345',
        'Revision': '333',
    }

    writeFillablePDF(TEMPLATE_PATH, OUTPUT_PATH, data_dict)


Eddiedigits avatar Eddiedigits commented on September 18, 2024 2

If I then open B (with a value added and saved in Acrobat Reader) in the interpreter, change the value of the field, and output the PDF, then when I open it in Acrobat Reader the original value is still shown, but when I click on the field the NEW value is shown.
I can't find the original value in the Python interpreter, but it seems that changing the .V attribute is not the correct approach.
Something I don't understand: when I access the value saved in Acrobat from the interpreter, it prints with round brackets.

>>> field = doc.Root.AcroForm.Fields[0]
>>> field.V
'(777)'
>>> field.update(pdfrw.PdfDict(V=pdfrw.PdfString('444')))
>>> field.V
'444'

When I change the value, making sure to use the pdfrw.PdfString object, there are no round brackets. If I try to add the round brackets when creating the value, they are escaped and included in the field.

Does someone who knows more about pdfrw than I do know what these brackets mean?


PeterSlezak avatar PeterSlezak commented on September 18, 2024 2

I used the example PDF from #132. The code below adds "im field_1 value" to the first text field. Please note that it's just a proof of concept rather than anything else:

from pdfrw import PdfWriter, PdfReader

INVOICE_TEMPLATE_PATH = 'sample-template.pdf'
INVOICE_OUTPUT_PATH = 'sample-output.pdf'

field1value = 'im field_1 value'

template_pdf = PdfReader(INVOICE_TEMPLATE_PATH)
# update the first field; it's assumed to be a text field
template_pdf.Root.AcroForm.Fields[0].V = field1value
# add an appearance stream to display it
template_pdf.Root.AcroForm.Fields[0].AP.N.stream = '''/Tx BMC
BT
 /Helvetica 8.0 Tf
 1.0 5.0 Td
 0 g
 (''' + field1value + ''') Tj
ET EMC'''

PdfWriter().write(INVOICE_OUTPUT_PATH, template_pdf)

See section 5 of the PDF reference manual for more text formatting/painting options.
When I open sample-output.pdf I can see the field 1 text in Foxit Reader, Adobe Acrobat 11, and Chrome. Tested on Windows 10.


Keenpachi avatar Keenpachi commented on September 18, 2024 1

All the data for testing are at this link:
https://bostata.com/post/how_to_populate_fillable_pdfs_with_python/

Which field in the PDF array needs to be changed to get the updated value to appear in the new PDF?


RuellePaul avatar RuellePaul commented on September 18, 2024 1

@tlk3 It works for me too, thank you so much !!


fiapps avatar fiapps commented on September 18, 2024 1

@tlk3 it works with Adobe Reader, but not with Preview. To get field values to appear in Preview, use the solution above of setting the appearance dictionary for each modified field to an empty string.


DimitrisAthanasiadis avatar DimitrisAthanasiadis commented on September 18, 2024 1

I had a rendering problem with my fields and spent many hours trying to solve it. I used your help from here and the holy Stack Overflow, but the problem remained. I decided to leave the AP blank (AP='') when it was not present in the file, just to see what happens. I opened the file in Foxit Reader and everything was perfect. I even printed the pages on paper and they were correct. The same goes for the browser PDF reader. BUT Adobe Acrobat did not render the text until I clicked the field, and when I previewed the pages for printing, the fields were blank. Does anyone know what doesn't work well with Acrobat? Is something special needed to work properly with Adobe?


misokol-earthlink avatar misokol-earthlink commented on September 18, 2024 1

I have a PDF form filler using pdfrw and it almost works. I can fill out the form from my custom dictionary, replacing blank text fields in the template form, which happens to be IRS F941. But the saved form does not display the saved entries, even though I have used the NeedAppearances update suggested by many. My script concludes by reopening the saved file and dumping the values, which were saved but are invisible, so the substitution code worked. Further, when I open the form with a PDF editor, in addition to not being able to see any of the field values, clicking on any field gives me a message saying I cannot make any changes and resave the file, which I was intending to do to handle the check boxes that I have not yet coded.


pmaupin avatar pmaupin commented on September 18, 2024

Yeah, I'd like to have code for that, too. I haven't really looked at how that works yet.


praveen049 avatar praveen049 commented on September 18, 2024

Hi
@pmaupin, if you can give me some introduction to how this could be implemented, I can try to implement it. Thanks


Sousaplex avatar Sousaplex commented on September 18, 2024

I don't know if there's been any movement on this, but this would be fantastic. I'll see if I can find any relevant information in the spec. FYI this is the one I'm looking at: https://wwwimages2.adobe.com/content/dam/acom/en/devnet/pdf/PDF32000_2008.pdf


tbbooher avatar tbbooher commented on September 18, 2024

I have the same problem. Same experience with preview and the appearance dictionary.


sevetseh28 avatar sevetseh28 commented on September 18, 2024

Did anyone find a solution for this?


bartmika avatar bartmika commented on September 18, 2024

+1


jancoow avatar jancoow commented on September 18, 2024

I still have this problem. I'm trying to populate a few annotation fields. Some readers display the new annotation values correctly; however, Adobe Reader leaves them blank.


Eddiedigits avatar Eddiedigits commented on September 18, 2024

I have 2 documents that are copies of each other. One is a blank form (A). In the other (B), I filled in the first field with a number and saved it in Acrobat Reader. When I open B again, the number shows in the field.
If I open both documents in the Python interpreter, I can see that B.Root.Pages.Kids[0].Annots[0].V has the value.
If I copy the value of the first annotation from B to A and write it out with PdfWriter, it is only visible when the field has focus.
If I copy the whole annotation from B to A and write it out with PdfWriter, the value is visible, as we all want.
I have compared the 2 versions of the annotation, and the only difference I have found is that Annot.AP.N.BBox is a bit different, but copying this over to A doesn't help.
The only thing I haven't carefully compared is Annot.P, because it seems to be just circular references to the page information.
The bottom line is that I don't think pdfrw is the problem. There is something else in the PDF that needs to be programmatically updated to make this work.


PeterSlezak avatar PeterSlezak commented on September 18, 2024

Characters enclosed in parentheses denote a literal string (a type of PDF object).
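
To illustrate the point about parentheses: in the file, a literal string is written between ( and ), and any parentheses or backslashes inside it are escaped with a backslash. Here is a small standard-library sketch of that wrapping and unwrapping. The function names are made up for illustration, and other escape sequences (such as octal codes) are deliberately left untouched:

```python
def wrap_literal(text):
    """Return text as a PDF literal string, e.g. 777 becomes (777)."""
    escaped = (text.replace('\\', '\\\\')
                   .replace('(', '\\(')
                   .replace(')', '\\)'))
    return '(' + escaped + ')'


def unwrap_literal(raw):
    """Strip the delimiting parentheses and undo the escapes added by wrap_literal."""
    if not (raw.startswith('(') and raw.endswith(')')):
        raise ValueError('not a literal string: {!r}'.format(raw))
    body = raw[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        # A backslash followed by (, ) or another backslash stands for that character.
        if ch == '\\' and i + 1 < len(body) and body[i + 1] in '()\\':
            out.append(body[i + 1])
            i += 2
        else:
            out.append(ch)
            i += 1
    return ''.join(out)


print(unwrap_literal('(777)'))
print(wrap_literal('444'))
```

This would explain why a value read back from a saved file shows up as '(777)' while a freshly assigned Python string has no brackets until it is encoded.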


Eddiedigits avatar Eddiedigits commented on September 18, 2024

Thanks Peter. If I do pdfrw.PdfString.encode() then I get the brackets. Unfortunately this still doesn't make the value visible.
My best guess at the moment is that Acrobat Reader is moving/copying the value into the PDF text on defocus. This may be why I can't find the value, as pdfrw doesn't really give access to the PDF text.
I'm going to try to dump the text from document B with another library and see if I can find a way forward.
Unless someone knows how to decode the string of bytes (not a byte string) that comes out of the content.stream?


jancoow avatar jancoow commented on September 18, 2024

@Eddiedigits I'm having the exact same issue. PDFs created and filled with pdfrw cannot be opened correctly in Adobe Reader, while other PDF readers display them fine. The fields only appear when focus is put on them. See my other issue, #158. Even if I just read a PDF file and write it directly to a new file, without editing anything, all the annotate keys are added recursively. So I believe there is something wrong with the writing process of pdfrw.


DrLou avatar DrLou commented on September 18, 2024

@Eddiedigits As am I. Opening the written file in Acrobat, I can only see the written fields - they are there - when focus is placed on them with mouse. Also, this only works for 2 of the 3 fields written. The 3rd, an email address, is apparently not written at all. Weird!
Reader/Form Editor is Acrobat Pro 11.0.3 on macOS.


PeterSlezak avatar PeterSlezak commented on September 18, 2024

You need to modify /V and also the appearance stream (the indirect object referenced by /AP). /V contains the value of the field and /AP specifies how to present it.

PDF reference 1.7 page 692

The field’s text is held in a text string (or, beginning with PDF 1.5, a stream) in the V (value) entry of the field dictionary. The contents of this text string or stream are used to construct an appearance stream for displaying the field, as described under “Variable Text” on page 677.

See the "Tj" lines in example 8.18; they contain the text that will be displayed by default when you open the PDF document (since the /AP dictionary contains /N, the annotation's normal appearance).

I don't have time right now to investigate if it is possible to easily update appearance stream XObject using pdfrw.


jancoow avatar jancoow commented on September 18, 2024

I'm trying to update the appearance stream with your code. However, I get an error:
" AttributeError: 'NoneType' object has no attribute 'N' ". I assume that there is no appearance stream available in my field, so I tried creating it with:

            annotation.AP = pdfrw.PdfDict(N=pdfrw.PdfDict(stream='''/Tx BMC
                    BT
                     /Helvetica 8.0 Tf
                     3.0 5.0 Td
                     0 g
                     (''' + value + ''') Tj
                    ET EMC'''))

However this results in disappearing fields in all pdf readers...


Eddiedigits avatar Eddiedigits commented on September 18, 2024

@PeterSlezak This works for me. I just changed the font to /TiRo because that's what is already used in my PDF and changed the stream to 1.0 1.0 Td because the number was appearing too high in the Form Field and cutting off the top half of the number.
Thank you very much!!!


jancoow avatar jancoow commented on September 18, 2024

@PeterSlezak
Hi. I tried your solution. However, Adobe Acrobat Reader is crashing directly after opening the PDF. Also in any other PDF viewer the values aren't displayed anymore. I've tried exactly your code but it seems not to be working unfortunately.


PeterSlezak avatar PeterSlezak commented on September 18, 2024

Hi @jancoow,
Share your code and pdf file if possible. Otherwise I cannot help you.


Efk3 avatar Efk3 commented on September 18, 2024

Hi @PeterSlezak,
thank you for your script, it works great. I have only one problem: I need to write Latin-2 characters into the input. I attached a font to the PDF that supports characters like Ő and Ű, and I used this font for rendering, but I don't know how to write the value into the input.

I got this error:
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 308-311: ordinal not in range(256)

I'm using this:

xobj.stream = '''/Tx BMC
BT
 /LiberationSerif 12.0 Tf
 1.0 5.0 Td
 0 g
 (''' + value + ''') Tj
ET EMC'''

with value = "ÍŐŰ"


gpontesss avatar gpontesss commented on September 18, 2024

I've got the same problem. I think the pdfrw library only deals with single-byte (Latin-1) characters, judging by the message "ordinal not in range(256)". It probably can't write Unicode text, even though that is possible by typing manually. A solution for now may be to use reportlab. If someone has something better using pdfrw, it would be much appreciated, I believe.

I see that you're not using a unicode string either. Try using the following:

xobj.stream = u'''/Tx BMC
BT
 /LiberationSerif 12.0 Tf
 1.0 5.0 Td
 0 g
 ({}) Tj
ET EMC'''.format(value)
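
For what it's worth, in Python 3 the u prefix changes nothing, and the traceback above points at a Latin-1 encoding step, so the failure comes from the characters themselves. The sketch below (standard library only, hypothetical helper names) shows which characters of the sample value cannot be encoded as Latin-1, and how the same text looks as a UTF-16BE hexadecimal string with a byte order mark. That is a form PDF text strings such as a field's /V may take. Whether a Tj operand in an appearance stream renders correctly depends on the font's own encoding, so treat this as an illustration rather than a fix:

```python
def latin1_failures(text):
    """Return the characters in text that the Latin-1 codec cannot encode."""
    bad = []
    for ch in text:
        try:
            ch.encode('latin-1')
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


def utf16_hex_string(text):
    """Encode text as a PDF hexadecimal string in UTF-16BE with a leading BOM."""
    data = b'\xfe\xff' + text.encode('utf-16-be')
    return '<' + data.hex().upper() + '>'


sample = '\u00cd\u0150\u0170'  # the value "ÍŐŰ" from the comment above

# Print code points rather than the characters to stay console-encoding neutral.
print(['U+%04X' % ord(ch) for ch in latin1_failures(sample)])
print(utf16_hex_string(sample))
```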


PeterSlezak avatar PeterSlezak commented on September 18, 2024

All the data for testing are at this link:
https://bostata.com/post/how_to_populate_fillable_pdfs_with_python/

Which field in the PDF array needs to be changed to get the updated value to appear in the new PDF?

@ZarakiiKenpachi
I don't know why it doesn't work on your pdf. I can populate and display value on few fields but not all.


PeterSlezak avatar PeterSlezak commented on September 18, 2024

Hi @Efk3
I never needed non-ASCII characters, but my suggestion would be to use a \ddd sequence in the literal string, where ddd is the octal character code, or you can try a hexadecimal string instead of a literal string.
The original xobj.stream code snippet would change to:

xobj.stream = '''/Tx BMC
BT
 /Helv 8.0 Tf
 1.0 5.0 Td
 0 g
 <696D206669656C645f312076616C7565> Tj
ET EMC'''

It should display "im field_1 value"
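
A sketch of both suggestions, assuming the font's encoding maps single bytes the way Latin-1 does (which may not hold for a given font). The helper names are hypothetical:

```python
def hex_string(text, encoding='latin-1'):
    """Return text as a PDF hexadecimal string, e.g. <696D...>."""
    return '<' + text.encode(encoding).hex().upper() + '>'


def octal_escaped(text, encoding='latin-1'):
    """Return text as a literal string with octal escapes for non-ASCII bytes."""
    parts = []
    for byte in text.encode(encoding):
        ch = chr(byte)
        if ch in '()\\':
            # Delimiters and the backslash itself need a backslash escape.
            parts.append('\\' + ch)
        elif 32 <= byte < 127:
            parts.append(ch)
        else:
            # Everything else is written as a three-digit octal code.
            parts.append('\\{:03o}'.format(byte))
    return '(' + ''.join(parts) + ')'


print(hex_string('im field_1 value'))
print(octal_escaped('caf\u00e9'))
```

Hexadecimal digits are case-insensitive, so the upper-case output here is equivalent to the mixed-case string in the snippet above.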


stbth01 avatar stbth01 commented on September 18, 2024

@PeterSlezak Thanks so much for the code snippet, it really helped me!! This is similar to the ASCII question above: I have an address field and would like to have an EOL character at the usual address line break. I have tried several (e.g. \n, \r, \015, \012, <br>) and none seem to show in the stream, but they show correctly when the field is focused (template_pdf.Root.AcroForm.Fields[0].V = field1value). Do you have any suggestions?


ltsat avatar ltsat commented on September 18, 2024

@PeterSlezak your code seems spot on, although I haven't been able to make it work.
I don't have any problem with Unicode; could it be a Python issue?
And just because I haven't seen it mentioned: as a temporary workaround, MS Edge does display field values (both Unicode and ASCII) without a problem.


PeterSlezak avatar PeterSlezak commented on September 18, 2024

@PeterSlezak Thanks so much for code snippet really helped me!! This is similar to ASCII question above but I have an address field and would like to have a EOL character in the normal address break line spot. I have tried several (ex \n, \r, \015, \012, <br>) and none seem to show in the stream but they will show correctly when focused (template_pdf.Root.AcroForm.Fields[0].V = field1value). Do you have any suggestions?

Hi @stbth01
Change the Appearance stream as follows:

template_pdf.Root.AcroForm.Fields[0].AP.N.stream = '''/Tx BMC
BT
 /Helvetica 8.0 Tf
 0.0 10.0 Td
 0 g
 (Line one) Tj
 0.0 -7.0 Td
 (Line two) Tj
ET EMC'''

Just replace "Line one" with the first line of text and "Line two" with the second line of text, and adjust the Td values as appropriate to fit both lines in your text box. The values depend on the box and font size.

You should also update/add the form field dictionary entry /Ff with bit 13 set (the Multiline flag, i.e. a value of 4096 if no other flags are set) to indicate that it's a multi-line field. (When /Ff is omitted entirely, it indicates a single-line text field.) It should work even without /Ff, but it's better to follow the PDF reference document.
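
Field flags in /Ff form a bit mask, so a bit position has to be converted into an integer before it is stored: position 13 (Multiline) corresponds to 4096. Below is a rough standard-library sketch of that conversion, together with a generator for the multi-line stream shown above. The function names and the default offsets are illustrative assumptions, and the lines are assumed to contain no parentheses or backslashes:

```python
MULTILINE_BIT = 13  # 1-based bit position of the Multiline flag for text fields


def flag_value(bit_position):
    """Convert a 1-based bit position into the integer mask stored in /Ff."""
    return 1 << (bit_position - 1)


def multiline_stream(lines, font='/Helvetica', size=8.0, top=10.0, leading=7.0):
    """Build an appearance stream that paints each entry of lines on its own row."""
    ops = ['/Tx BMC', 'BT', ' {} {} Tf'.format(font, size), ' 0 g']
    for index, line in enumerate(lines):
        # The first Td positions the first baseline; later ones move down by `leading`.
        offset = top if index == 0 else -leading
        ops.append(' 0.0 {} Td'.format(offset))
        ops.append(' ({}) Tj'.format(line))
    ops.append('ET EMC')
    return '\n'.join(ops)


print(flag_value(MULTILINE_BIT))
print(multiline_stream(['Line one', 'Line two']))
```

Splitting the field value on newlines and passing the pieces to multiline_stream would be one way to keep /V and the appearance in step.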


vincentaudoire avatar vincentaudoire commented on September 18, 2024

@tlk3 Worked for me, thanks!


dwasyl avatar dwasyl commented on September 18, 2024

@tlk3 That did the trick for me having the same problem with Adobe not showing the fields.


tonimarie avatar tonimarie commented on September 18, 2024

@tlk3 that totally saved my day. Thank you!


tbbooher avatar tbbooher commented on September 18, 2024

@tlk3 boom! works great


vasmedvedev avatar vasmedvedev commented on September 18, 2024

@tlk3 your solution helps very much, thanks! It also works for PyPDF2 in a similar way. However in my case I still have some fields (date field and checkboxes) that remain empty (not rendered). It seems to be a general PDF problem, not pdfrw one.


Pikafu avatar Pikafu commented on September 18, 2024

@tlk3 It works! Thank you!


l47y avatar l47y commented on September 18, 2024

@tlk3
this also saved my day :-) Thanks a lot


rau avatar rau commented on September 18, 2024

@tlk3 Thank you! Any clue why, in a big dict of items, some of the filled fields show up, but in every tenth form or so some of them just randomly don't appear?


chdsbd avatar chdsbd commented on September 18, 2024

TLK3's solution works with Acrobat and macOS Preview, but it doesn't work with PDFjs. If I open a file created this way with Acrobat and save it from there, it will then show the field values in PDFjs.


pmilano1 avatar pmilano1 commented on September 18, 2024

Below is something that I threw together quick, I was able to iterate through and produce individual PDFs just fine, fields seemed visible (slightly different code).

When I added the merge code in order to produce a multi-page PDF containing results of objects in data, it seems to no longer work. Can someone take a quick look to see if I'm handling the merge and setting the appearance workaround properly, based on your experience? It's down low in __main__

Many thanks.

import csv
import pdfrw

IN_FILE = "awards.csv"
TEMPLATE_FILE = "template.pdf"
ANNOT_KEY = '/Annots'
ANNOT_FIELD_KEY = '/T'
ANNOT_VAL_KEY = '/V'
ANNOT_RECT_KEY = '/Rect'
SUBTYPE_KEY = '/Subtype'
WIDGET_SUBTYPE_KEY = '/Widget'
FIELDS = ["Certificate Category", "Certificate Rank"]
N = 1


# Updates single instance of template pdf, increment form field suffix
def modify_form(input_pdf_path, data_dict):
    global N  # need to get rid of this
    template_pdf = pdfrw.PdfReader(input_pdf_path)
    annotations = template_pdf.pages[0][ANNOT_KEY]
    for annotation in annotations:
        if annotation[SUBTYPE_KEY] == WIDGET_SUBTYPE_KEY:
            if annotation[ANNOT_FIELD_KEY]:
                key = annotation[ANNOT_FIELD_KEY][1:-1]
                if key in data_dict.keys():
                    annotation.update(
                        pdfrw.PdfDict(T="{}".format(key + str(N)))
                    )
                    annotation.update(
                        pdfrw.PdfDict(V="{}".format(data_dict[key]))
                    )
                    annotation.update(pdfrw.PdfDict(Ff=1))
    N += 1
    return template_pdf


def build_datadict(in_file):
    o = []
    with open(in_file) as file:
        reader = csv.DictReader(file, delimiter=',')
        for row in reader:
            m = {}
            for f in FIELDS:
                if row[f] and not row[f].isspace() and not row[f] is None:
                    m[f] = row[f]
            if m:
                m['Date'] = "January 25th, 2020"
                o.append(m)
    return o


if __name__ == '__main__':
    data = build_datadict(IN_FILE)
    writer = pdfrw.PdfWriter()
    writer.trailer.Info = pdfrw.IndirectPdfDict(
        Title='Combined PDF'
    )
    # Iterate array of 'data_dict's
    for d in data:
        this_pages = modify_form(TEMPLATE_FILE, d)  # fill the form
        this_pages.Root.AcroForm.update(pdfrw.PdfDict(NeedAppearances=pdfrw.PdfObject('true')))  # maintain appearances
        writer.addpages(this_pages.pages)  # merge into single pdf
    writer.write(IN_FILE.split(".")[0] + ".pdf")


dementedhedgehog avatar dementedhedgehog commented on September 18, 2024

This is the second time I've bounced off pdfrw because of this issue :( The fixes above don't work for me. I've had to go back to pdftk.


2Nipun avatar 2Nipun commented on September 18, 2024

Seeing the same issue. The PDF form has the values, but it's not displaying them until I click on each field in a viewer. The moment I click out, the value goes away. I'm viewing the PDF on a Mac in both Preview and Acrobat Reader & Pro. In Pro, the form field still shows as unfilled (i.e. it has the blue indicator of an unfilled form field).

So I guess I need to look at pdftk or some other solution beyond pdfrw?


starlabs007 avatar starlabs007 commented on September 18, 2024

@pmilano1 Yours is a slightly different issue (see here: #171) and it's regarding merging PDFs.

For anyone else reading this and finding that setting AcroForm / NeedAppearances doesn't work in Acrobat, verify that you're not merging PDF files. It seems the AcroForm node is lost during the merging process when the concatenated PDF is written out. The issue linked above points to a Stack Overflow answer with working code that addresses this.


cemoga avatar cemoga commented on September 18, 2024

@tlk3 you are the best. It worked for Acrobat


cemoga avatar cemoga commented on September 18, 2024

@davidmacneil Your solution works perfectly for preview in Mac. Thank you!


sazedulhaque avatar sazedulhaque commented on September 18, 2024

@tlk3 Thank you buddy



TyrGo avatar TyrGo commented on September 18, 2024

I'm having the same problem as others here. Everything appears fine in Preview. But Adobe doesn't display the fields till clicked. The solutions above don't seem to me to fix that. Anyone solved that yet?


summerswallow-whi avatar summerswallow-whi commented on September 18, 2024

I found this blog: (https://medium.com/@vivsvaan/filling-editable-pdf-in-python-76712c3ce99) and corresponding repo (https://github.com/vivsvaan/filling_editable_pdf_python). It seems to work at least on Reader DC and chrome. I did notice that some fields don't appear filled in on preview, but that could just be me. I only tried the code an hour ago.


vijeshkpaei avatar vijeshkpaei commented on September 18, 2024

Please make sure the input PDF is flattened.


pablo1strange avatar pablo1strange commented on September 18, 2024

EUFGJ.TXT

What's wrong with this code?


pablo1strange avatar pablo1strange commented on September 18, 2024

import openpyxl
import pdfrw
from PyPDF2.generic import TextStringObject, NameObject

def read_excel_data(excel_file):
    try:
        workbook = openpyxl.load_workbook(excel_file)
        sheet = workbook.active
        data = []

        # Read the data from the Excel file
        for row in sheet.iter_rows(min_row=2, values_only=True):
            row_data = {}
            for idx, value in enumerate(row, start=1):
                header = sheet.cell(row=1, column=idx).value
                row_data[header] = value
            data.append(row_data)

        return data
    except Exception as e:
        print("Error reading data from the Excel file:", e)
        return []

def fill_pdf_form(pdf_template, excel_file):
    try:
        # Read the data from the Excel file
        data = read_excel_data(excel_file)

        # Open the PDF template
        template_pdf = pdfrw.PdfReader(pdf_template)

        # Iterate over the pages of the PDF
        for page in template_pdf.pages:
            annotations = page.get('/Annots')
            if annotations:
                for annotation in annotations:
                    if annotation.get('/Subtype') == '/Widget':
                        field_name = annotation.get('/T')
                        if field_name:
                            for row_data in data:
                                if field_name in row_data:
                                    annotation.update({
                                        NameObject('/V'): TextStringObject(str(row_data[field_name]))
                                    })
                                    break

        # Write the filled PDF to an output file
        output_pdf = 'pdfrellenado.pdf'
        writer = pdfrw.PdfWriter()
        writer.write(output_pdf, template_pdf)

        print("Form filled successfully.")
    except Exception as e:
        print("An error occurred:", e)

fill_pdf_form('swap.pdf', 'datos.xlsx')
