Help with PDFtk function


#1

So I found some PHP code and tweaked it. The code’s author claims it can be used to get the field names of a PDF document that I intend to fill in with data.

Also, could someone please recommend a good basic free code editor, web-based or downloadable that they trust, preferably something that will allow you to run the code and give you error feedback, thanks.

I know I need to add an [include("fields.php");] line of code somewhere to link the PHP [public function fields($pretty = false)] to the HTML for everything to work correctly, and that is one of my issues: where to put it and how to write it.

The other is having the [return $pretty == true ? nl2br($con) : $con;] post its results to the [<div id="results">&nbsp;</div>] line in the HTML page instead of on the fields.php page.

I think those are my only issues, but I can’t be sure, as I have no way of testing it until I get these bugs worked out. Let me say thanks in advance for any help that comes along.

Here is what I’ve gotten so far.

HTML Code_

Get PDF Field Names

Enter PDF URL Below:

 

__PHP Code for fields.php file

<?php

// ...

public function fields($pretty = false)
{
    $tmp = $this->tmpfile();

    exec("pdftk {$this->pdfurl} dump_data_fields > {$tmp}");
    $con = file_get_contents($tmp);

    unlink($tmp);
    return $pretty == true ? nl2br($con) : $con;
}

?>

Any help I can get in getting this to work properly would be deeply appreciated.

Also, here is a link to the page the HTML is on and the PHP code is for, just in case:
http://losttitleconnection.net/PDFURL


#2

Just realized another issue: linking the input in the form to the public function fields($pretty = false) in the fields.php file.

here is the complete code again
html code________________________
```
<form action="" id="datadump" onsubmit="return false;">
<fieldset><legend>Get PDF Field Names</legend>

<p><label for="PDF_URL">Enter PDF URL Below:</label></p>

<p><input id="pdfurl" name="pdfurl" type="text" /></p>
</fieldset>
</form>
```
field.php code____________________________

<?php

// ...

public function fields($pretty = false)
{
    $tmp = $this->tmpfile();

    exec("pdftk {$this->pdfurl} dump_data_fields > {$tmp}");
    $con = file_get_contents($tmp);

    unlink($tmp);
    return $pretty == true ? nl2br($con) : $con;
}
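
For reference, `pdftk ... dump_data_fields` prints one block per form field, separated by `---` lines, with entries such as `FieldType:` and `FieldName:`. A minimal sketch of turning that text into a plain list of field names might look like the following. The function name `parseFieldNames` and the sample text are assumptions for illustration and are not part of the tutorial code.

```php
<?php
/**
 * Hypothetical helper: pull the field names out of the text that
 * `pdftk form.pdf dump_data_fields` prints.
 */
function parseFieldNames(string $dump): array
{
    $names = [];
    foreach (preg_split('/\R/', $dump) as $line) {
        // Each field block carries one "FieldName: ..." line.
        if (strpos($line, 'FieldName:') === 0) {
            $names[] = trim(substr($line, strlen('FieldName:')));
        }
    }
    return $names;
}

// Made-up sample in the shape pdftk prints.
$sample = "---\nFieldType: Text\nFieldName: VehVin\nFieldFlags: 0\n"
        . "---\nFieldType: Text\nFieldName: VehYr\nFieldFlags: 0\n";

print_r(parseFieldNames($sample));
```

The string returned by `fields()` could be passed straight into such a helper, which gives the exact names to use as keys when filling the form later.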

#3

So I’ve done a little more research…and here is what I’ve learned

HTML Side

<form action="fields.php" class="datadump" method="POST">
<fieldset><legend>Get PDF Field Names</legend>

<p><label for="PDF_URL">Enter PDF URL Below:</label></p>

<p><input name="pdfurl" type="text" /></p>

<input type="submit" value="Submit" name="submit">

</fieldset>

</form>

<div id="results">&nbsp;</div>

PHP Side


<?php

$('.myforms').on('submit', function (fields) {
    event.preventDefault();

public function fields($pretty = false)
{

    $tmp = $this->tmpfile();
       var purl = $("input[name='pdfurl']", this).val();

    exec("pdftk {$this->purl} dump_data_fields > {$tmp}");

    $con = file_get_contents($tmp);

    unlink($tmp);

    //still need help posting "return" to "<div id = "results"></div> on html page
    return $pretty == true ? nl2br($con) : $con;

#4

Can someone please tell me if this is right or what I need to change…thanks


<?php

$('.myforms').on('submit', function (fields) {
    event.preventDefault();

public function fields($pretty = false)
{

    $tmp = $this->tmpfile();
       var purl = $("input[name='pdfurl']", this).val();
       var $results = results.innerHTML; 
   
    exec("pdftk {$this->purl} dump_data_fields > {$tmp}"); results.innerHTML = "List of Form Fields $" + fields;

    $con = file_get_contents($tmp);

    unlink($tmp);

    //still need help posting "return" to "<div id = "results"></div> on html page... May have just found an 
    //answer, can someone please verify

    // this is part of the original code i found in a tutorial that posts results to the same .php page
    return $pretty == true ? nl2br($con) : $con;

    //This is what I think I need to be using to post results to my <div id = "results"></div> line of my html page
   return  $results
 }
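
One way to untangle the snippet above: the jQuery parts (`$('.myforms').on(...)`, `var`, `innerHTML`) are JavaScript that runs in the browser, while `exec()` and `file_get_contents()` are PHP that runs on the server, so they cannot share one function. A minimal server-side sketch of fields.php, assuming the form posts to it with method="POST", might look like the following. The helper names `buildDumpCommand` and `renderResults`, and the `pdfs` folder, are assumptions for illustration and are not part of the tutorial code.

```php
<?php
// Sketch of fields.php: everything here runs on the server.

/** Build the pdftk command with the file path safely quoted for the shell. */
function buildDumpCommand(string $pdfPath): string
{
    return 'pdftk ' . escapeshellarg($pdfPath) . ' dump_data_fields';
}

/** Wrap the pdftk text in the results <div>, escaped for HTML. */
function renderResults(string $dump): string
{
    return '<div id="results">' . nl2br(htmlspecialchars($dump)) . '</div>';
}

// Runs only when the form is submitted with method="POST".
if (($_SERVER['REQUEST_METHOD'] ?? '') === 'POST') {
    // Assumption: the PDFs live in a "pdfs" folder next to this script and
    // the "pdfurl" form field holds a file name such as "130u.pdf".
    $file = __DIR__ . '/pdfs/' . basename((string) ($_POST['pdfurl'] ?? ''));

    if (is_file($file)) {
        // Without an output file, pdftk prints the field dump to stdout.
        $dump = (string) shell_exec(buildDumpCommand($file));
        echo renderResults($dump);
    } else {
        echo renderResults('File not found.');
    }
}
```

With this approach the results appear on the page that fields.php sends back. Showing them inside the original HTML page without a reload would need a separate JavaScript request, which is a different piece of code from the PHP.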

#5

Is this routine going to be used to edit PDF’s set up by you?

The question is how do you know if it is an “editable” PDF? Most are not!

If so, create a test page, such as a Contact Us form, and save it as a PDF. Then give it to us along with your code and see if it finds the fields. Then we have something to help you debug.


#6

Would this work if I just gave you the URL to the form that will be used?
As it is a government-issued document that already allows users to enter info into it, does that count as editable?


#7

Yes, it would help. If you are worried about posting it here, send the link to me in a private message…


#8

I have been looking around and found several ways to do this. There are lots of libraries around, and some are much easier than others. One issue is that this will let you take a downloaded PDF and fill it in, but posting it back might not work, as most of the sites you download them from require login info to be handled.

Do you have all of that solved, too?


#9

I would prefer to work with the actual docs as they are the ones that will need to actually be tested for this process as opposed to me creating something…

this is the link to one of the actual docs:
https://drive.google.com/open?id=1gN-CwF6hHl_wX-0OLBRzAZJWC5YQADjD

I was going to host my own copies of the pdf’s on my server and have the code request them from there as opposed to having to request them from a site I have no control of. Does that sound right?

As for the code and which is easier, obviously I’m willing to listen to those smarter and more experienced than me. But more importantly, I want whatever gets my customers what they need without issues.


#10

As for the actual code to put the FDF into the PDF to download, save or print… I’ve got code I’m working with, but since I’m really new to PHP I’m not really confident that I could put it together correctly. I seem to have a handle on reading what I see, but it is still a little blurry when it comes to putting it together from scratch. Although, I think I’m doing pretty well for having just started working with it this past week.


#11

Well, as they say around here, you can’t get there from here… L O L

So, the doc you showed me is NOT an FDF file. It is a PDF file. PDF stands for “Portable Document Format”…
What this means is that it is designed to let you see a printable output on a webpage. It can be created using Adobe’s Acrobat program and then read, or actually “viewed”, in a browser using Acrobat Reader. It can NOT be edited! Now, with that said, the PDFtk library lets you strip out parts of a PDF, merge parts of them together, and convert them to a couple of other formats.

So, using the sample you posted, you can not do what you want to do directly. If you OWN the full Acrobat package, you might be able to edit it, but, not using PHP as far as I know of.

But, you can convert it to HTML, edit the HTML directly, place your data into the correct areas, and then convert it back to a PDF. The two-way conversions are very easy to do. Just a line or two at most. The editing of the data is trickier, as you have to figure out where to place it on the form. If the form seldom changes, you can create an HTML “template” and just fill in the data as needed using PHP. I have done this several times for various forms from various sites similar to your sample. You would need to basically build a form in HTML and then insert your data.

Or, you could keep the PDF as an image and fill in the data with text overlays, placing the text and X’s where needed on the image using fairly simple graphics functions. Then, turn the image back into a PDF. (I do not like this version as it is much harder to handle.)

As you see this form has not been changed since 2015. Therefore, it is quite easy to convert to HTML, create a template from it and insert your data where it needs to be placed. And, simple PHP libraries can create a PDF from that with ease.

There are tons of free sites online that will convert PDFs to HTML for you. Some give better output than others. I tried this one: PDF-to-HTML … I downloaded that live form from the Texas site (130u) and put it into this site’s converter. It took a very long time, but it gave me back an HTML version which can be the start of a template for you to use as a base for your system.

To go from HTML to PDF, I use a great library called DOMpdf. It is found here: DOMpdf. Here is sample code to convert HTML and send out a PDF of it. It is simple code and can send the PDF out to the browser or save it as a PDF file.

// reference the Dompdf namespace
use Dompdf\Dompdf;
use Dompdf\Options;
require_once 'lib/dompdf/autoload.inc.php';

// instantiate and use the dompdf class
// ($form1html holds the filled-in HTML template as a string)
$dompdf = new Dompdf();
$dompdf->loadHtml($form1html);

// (Optional) Setup the paper size and orientation
$dompdf->setPaper('A4', 'portrait');

// Render the HTML as PDF
$dompdf->render();

// Output the generated PDF to Browser
//$dompdf->stream();

// Save to file
$pdf_gen = $dompdf->output();
file_put_contents("form1test.pdf", $pdf_gen);

// Display PDF in browser (the header tells the browser this is a PDF)
header('Content-Type: application/pdf');
echo $pdf_gen;

So, your project can be done, but, not the way you are attempting to do it.


#12

So, I got a little explaining to do Lucy…LOL

As I’m still learning and researching this pdftk lib, I have found a file called PdfForm.php and a readme.md file that seem to explain this far better than I could attempt to.

It seems to have what you are explaining, especially about creating an FDF temp file and then filling in the PDF with that. The ReadMe.md file has other pieces of code that are used to do the actual filling of the PDF and to extract form fields from the PDF form itself with the use of PdfForm.php. I have kind of started tweaking each of these for my use but will definitely need guidance on making it work correctly. These files can be found here:

Let me know if this is what you are speaking to and if it is doable with this. I like this option because, for one, it is something I’ve been studying. Two, it looks like this method actually takes the form fields from the PDF and correlates them to the MySQL form fields in a fashion that I can follow. Unless I’m wrong, and if I am, please tell me so. You’re the expert here; like I said, I want whatever gives the customer what they need in the easiest form possible.

Please understand there are still at least 5 to 8 other pages that I will have to adapt this same process to, so I am looking for the easiest way possible.

Now I will say, the code you’ve given looks shorter, but I don’t see where the data gets posted to the PDF, so there will be more code for that. Creating the HTML form template like you suggested will take more coding, plus the time to adjust it to the PDF form so it prints correctly.

The code I’m sharing with you seems to take care of all this by extracting the field names and then associating them through variables. The only missing details I see, or would assume to see, are the connect.php file and the code for requesting the reference_code and the proper data to correlate the variables with column names.

Please look at it and tell me what you find and think. I really appreciate all your time and help.


#13

Just saw your zip file response. You are saying I would have to add {some-tag} like {VehVin}, then in the php code, create a connection, then copy that data from the finaldetails table of the thetitl1_Livesite745 MySQL database into these tags.

I think I’ve already started to do that with the other code that I presented you with. I guess it’s a matter of posting it to see if what I am doing is making sense.

Again, thanks


#14

Here is my version of the PdfForm.php file. I have named it 130uPdfForm.php, as that is the form I plan to do first. If it looks like this file will be needed in exactly the same fashion for every PDF form I plan to complete, I may decide to change the name back to PdfForm.php, but we’ll see how this goes.

<?php
class PdfForm
{
    
    // this will reflect the actual PDF URL
    /*
    * Path to raw PDF form
    * @var string
    */
    private $pdfurl;

    // this will reflect the data request
    /*
    * Form data
    * @var array
    */
    private $data;

    // this will reflect where the final filled .pdf is stored on the server
    /*
    * Path to filled PDF form
    * @var string
    */
    private $output;

    // simply flattens the file
    /*
    * Flag for flattening the file
    * @var string
    */
    private $flatten;
    /**
    * Class Constructor
    *
    * @param string $pdfurl
    * @param array  $data
    */
    public function __construct($pdfurl, $data)
    {
        $this->pdfurl = $pdfurl;
        $this->data   = $data;
    }
    /**
    * Generate a filled PDF file
    *
    */
    private function generate()
    {
        $fdf = $this->makeFdf($this->data);
        $this->output = $this->tmpfile();
        exec("pdftk {$this->pdfurl} fill_form {$fdf} output {$this->output}{$this->flatten}");
        unlink($fdf);
    }
    /**
    * Extract fields information
    *
    * @param boolean $pretty
    * @return string
    */
    public function fields($pretty = false)
    {
        $tmp = $this->tmpfile();
        exec("pdftk {$this->pdfurl} dump_data_fields > {$tmp}");
        $con = file_get_contents($tmp);
        unlink($tmp);
        return $pretty == true ? nl2br($con) : $con;
    }
    /**
    * Generate FDF file
    * @param array $data
    * @return string
    */
    public function makeFdf($data)
    {
        $fdf = '%FDF-1.2
        1 0 obj<</FDF<< /Fields[';
        foreach ($data as $key => $value) {
            $fdf .= '<</T(' . $key . ')/V(' . $value . ')>>';
        }
        $fdf .= "] >> >>
        endobj
        trailer
        <</Root 1 0 R>>
        %%EOF";
        $fdf_file = $this->tmpfile();
        file_put_contents($fdf_file, $fdf);
        return $fdf_file;
    }
    /**
    * Set the flatten flag
    *
    * @return PdfForm
    */
    public function flatten()
    {
        $this->flatten = ' flatten';
        return $this;
    }
    /**
    * Save the file
    *
    * @param string $path
    */
    public function save($path = null)
    {
        if (is_null($path)) {
            return $this;
        }
        if (!$this->output) {
            $this->generate();
        }
        $dest = pathinfo($path, PATHINFO_DIRNAME);
        if (!file_exists($dest)) {
            mkdir($dest, 0775, true);
        }
        copy($this->output, $path);
        unlink($this->output);
        $this->output = $path;
        return $this;
    }
    /**
    * Force-download the filled PDF file
    *
    */
    public function download()
    {
        if (!$this->output) {
            $this->generate();
        }
        $filepath = $this->output;
        if (file_exists($filepath)) {
            header('Content-Description: File Transfer');
            header('Content-Type: application/pdf');
            header('Content-Disposition: attachment; filename=' . uniqid(gethostname()) . '.pdf');
            header('Expires: 0');
            header('Cache-Control: must-revalidate');
            header('Pragma: public');
            header('Content-Length: ' . filesize($filepath));
            readfile($filepath);
            exit;
        }
    }
    /**
    * Create a temporary file and return the name
    *
    * @return string
    */
    private function tmpfile()
    {
        return tempnam(sys_get_temp_dir(), gethostname());
    }
}
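
One detail worth noting in `makeFdf()`: the values are placed inside `(...)` as-is, and in an FDF/PDF literal string the characters `(`, `)` and `\` are special, so a value like "Smith (Jr)" could produce a malformed file. A minimal sketch of an escaping helper, assuming it would be called on each value before it is written, could look like this. The name `escapeFdfString` is hypothetical and is not part of the class above.

```php
<?php
/**
 * Hypothetical helper: escape a value for use inside an FDF/PDF literal
 * string by putting a backslash in front of "\", "(" and ")".
 */
function escapeFdfString(string $value): string
{
    // strtr() makes a single pass, so the backslashes it adds are not
    // escaped a second time.
    return strtr($value, [
        '\\' => '\\\\',
        '('  => '\\(',
        ')'  => '\\)',
    ]);
}

echo escapeFdfString('Smith (Jr)'), PHP_EOL;
```

Inside `makeFdf()` the loop line would then use `escapeFdfString($value)` in place of `$value`.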

Here is my version of the wrapper code that they said was needed to fill the PDF form:

<?php

// I understand there is a MySQL server connection that needs to be made here too
require 'connect.php';

// this is the file needed to create the fdf and new pdf form
require 'PdfForm.php';

// I understand that all of these fields will need to be made var as they will reflect the data to be drawn from the finaldetails table and 
// then used in the $data to fill the FDF and PDF form
    var $reference_code = 'reference_code',
    var $VehVin6 = 'VehVin6',
    var $VehVin  = 'VehVin',
    var $VehPlt  = 'VehPlt',
    var $VehYr   = 'VehYr',
    var $VehMk   = 'VehMk',
    var $VehMod  = 'VehMod',
    var $VehBdy  = 'VehBdy',
    var $VehClr  = 'VehClr',
    var $VehMil  = 'VehMil',
    var $VehWght = 'VehWght',
    var $VehCry  = 'VehCry',
    var $ID#     = 'ID#',
    var $IDOrgn  = 'IDOrgn',
    var $PassOrgn = 'PassOrgn',
    var $Own1st  = 'Own1st',
    var $OwnMid  = 'OwnMid',
    var $OwnLst  = 'OwnLst',
    var $OwnEnty = 'OwnEnty',
    var $OwnCnty = 'OwnCnty',
    var $OwnAdd  = 'OwnAdd',
    var $OwnCty  = 'OwnCty',
    var $OwnSt   = 'OwnSt',
    var $OwnZp   = 'OwnZp',
    var $PreOwn  = 'PreOwn',
    var $PreCty  = 'PreCty',
    var $PreSt   = 'PreSt',
    var $Dealer  = 'Dealer#',
    var $LienDt  = 'LienDt',
    var $LienHld = 'LienHld',
    var $LienAdd = 'LienAdd',
    var $LienCty = 'LienCty',
    var $LienSt  = 'LienSt',
    var $LienZp  = 'LienZp',
    var $Sales   = 'Sales$',
    var $Tax     = 'Tax$' 


// then used in the $data to fill the FDF and PDF form, of course, form_name will need to be changed
$data = [
    'form_name'  => '$VehVin6',
    'form_name'  => '$VehVin',
    'form_name'  => '$VehPlt',
    'form_name'  => '$VehYr',
    'form_name'  => '$VehMk',
    'form_name'  => '$VehMod',
    'form_name'  => '$VehBdy',
    'form_name'  => '$VehClr',
    'form_name'  => '$VehMil',
    'form_name'  => '$VehWght',
    'form_name'  => '$VehCry',
    'form_name'  => '$ID#',
    'form_name'  => '$IDOrgn',
    'form_name'  => '$PassOrgn',
    'form_name'  => '$Own1st',
    'form_name'  => '$OwnMid',
    'form_name'  => '$OwnLst',
    'form_name'  => '$OwnEnty',
    'form_name'  => '$OwnCnty',
    'form_name'  => '$OwnAdd',
    'form_name'  => '$OwnCty',
    'form_name'  => '$OwnSt',
    'form_name'  => '$OwnZp',
    'form_name'  => '$PreOwn',
    'form_name'  => '$PreCty',
    'form_name'  => '$PreSt',
    'form_name'  => '$Dealer#',
    'form_name'  => '$LienDt',
    'form_name'  => '$LienHld',
    'form_name'  => '$LienAdd',
    'form_name'  => '$LienCty',
    'form_name'  => '$LienSt',
    'form_name'  => '$LienZp',
    'first_name' => '$Sales$',
    'form_name'  => '$Tax$'
];

// Data can be fetched from different sources like a database table, a JSON object or just an array as 
// we did in the above snippet.
//Data to be fetched from database:thetitl1_Livesite754 and table:finaldetails

$pdf = new PdfForm('/130u.pdf', $data);

$pdf->flatten()
    ->save('/130u-output.pdf')
    ->download();

Not sure if what I’m doing is right, but I’m trying to follow what I’ve gleaned so far. Please forgive me if my coding makes you cringe…LOL
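
Two things in the wrapper above would get in the way. A PHP array can hold each key only once, so repeating 'form_name' leaves a single entry, and a variable inside single quotes, like '$VehVin', is kept as literal text instead of its value. A minimal sketch of building `$data` from a map of PDF field names to database columns might look like the following. The PDF field names on the left are hypothetical placeholders (the real ones would come from `fields()`), the row is made-up sample data, and `buildFormData` is a name chosen for illustration.

```php
<?php
/**
 * Build the array PdfForm expects: PDF field name => value.
 * $fieldMap is PDF field name => database column name.
 */
function buildFormData(array $fieldMap, array $row): array
{
    $data = [];
    foreach ($fieldMap as $pdfField => $column) {
        // A missing column becomes an empty string rather than an error.
        $data[$pdfField] = (string) ($row[$column] ?? '');
    }
    return $data;
}

// Hypothetical PDF field names on the left; finaldetails columns on the right.
$fieldMap = [
    'Vehicle Identification Number' => 'VehVin',
    'Year'                          => 'VehYr',
    'First Name'                    => 'Own1st',
];

// Made-up row standing in for a record fetched from the database.
$row = ['VehVin' => '1HGCM82633A004352', 'VehYr' => '2003'];

print_r(buildFormData($fieldMap, $row));
```

The resulting array could then be handed to `new PdfForm('/130u.pdf', $data)` in place of the hand-written list.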


#15

I am still very confused about how you expect this to work. The PDF form that you posted is just a PDF and nothing more. It is a printable form and nothing more! It does NOT have fields in place anywhere. Converting it to an FDF file is easy to do, but changing the format of the file does NOT create fields for you. You still have to pull the file apart and guess where the fields are. They do not magically appear in the FDF file for you. Just not possible.

With that said, I think you are wasting your time on this project in the manner you imagine it will happen. You just can not create a field from nothing. If you spend $500 and buy the full version of Adobe Acrobat, you can add those fields to the PDF and then use your newly created FDF file with the fields in place and this project can continue. But, you can’t search for a field to fill in if it does not exist at all.

In my simple template version, you can place the fields in the HTML template file and then you are done. You would need to add a small loop to replace the fields in the template before you recreate the new PDF file with the live data in place. But it will not happen the way you are thinking. Sorry to dampen your ideas, but it just cannot happen that way!


#16

I did some further reading of your routines. It is an odd process. It does identify certain areas of the file using string codes to figure out where check boxes and other items are, but they are hard to “fill in” with data. And the data might not fit, and then the entire page would be off. It’s like changing a face in a picture: it can be done, but why waste the time…

I think the idea behind this is creative, but it does not work as well as a simple template. It wastes server processing time and cannot be as accurate as a template system. With a template, you need one small query to get the live data, one “extract” command to create the variables, and one small one-line replacement loop to replace the items in the template with the live data, and you are done. Much simpler.
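
The replacement step described here can be sketched roughly as follows, assuming the HTML template marks each spot with a tag such as {VehVin}. The function name `fillTemplate`, the tiny template, and the sample row are all made up for illustration.

```php
<?php
/**
 * Replace {Placeholder} tags in an HTML template with live values.
 */
function fillTemplate(string $template, array $values): string
{
    $pairs = [];
    foreach ($values as $name => $value) {
        // Escape the data so a stray "<" or "&" cannot break the HTML.
        $pairs['{' . $name . '}'] = htmlspecialchars((string) $value);
    }
    return strtr($template, $pairs);
}

// Made-up template fragment and row; the real ones would be the converted
// form and a record from the database.
$template = '<td>VIN: {VehVin}</td><td>Year: {VehYr}</td>';
$row      = ['VehVin' => '1HGCM82633A004352', 'VehYr' => '2003'];

echo fillTemplate($template, $row), PHP_EOL;
```

The filled string is what would then be handed to the HTML-to-PDF step.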


#17

Good Morning ErnieAlex,
I see that, like me, you are up and at it… chasing the dreams.

It’s funny that this file isn’t showing the fields. When I opened it just yesterday in Google Drive, all the fields appeared and could be filled. I took that same file and used the “Open with” feature to open it in Acrobat Reader just now, and the fields appeared. I have a script for Adobe Acrobat Pro that I registered for when I started looking into the process about a month ago…

However, since doing it your way sounds far less complicated, that’s going to be the better way to go. If it means that all that has to be done is converting each page to an HTML template and then using PHP to retrieve data and fill the areas, without the complication of first creating an FDF temp file, that would eliminate the hassle of ensuring that the FDF part of the code works right for each page. I would just have to worry about adapting this process to this first PDF and then repeating it for the other pages.

With that said, I need to first find a converter and have an HTML template created. Gonna work on doing that this morning and since I have about 8 pages that are going to need to be done I might as well do it to all of them.

As for coding, can you help me with what that needs to look like, so I can use what you give me as an example to write the rest for each field or PDF myself? It doesn’t need to be line for line; maybe an example of something like this that you’ve done for another page. I’m sure I can decipher my way through it to some extent. I’m here to follow your lead, but I do want to learn and understand the nuts and bolts of the process too.

Also, if you take a look at the final instruction page (user: temp@abc.com / password: temporary), I have created buttons for each PDF that I would like customers to click in order to generate each PDF. I have also provided a text box at the top of the page where customers can input a reference_code that was assigned when their info was submitted, to assist with pulling the right data as opposed to having to write lines of code to search it out. Please tell me what you think, as I suspect that the PHP code will have to deal with both of these aspects.

What I envision is creating PHP code for each file template. When the customer clicks, the PHP code acquires the reference code, makes a MySQL connection, retrieves the data, populates the form, and renders a completed PDF with options to download or print. I believe this will need to be done separately for each PDF required, and it would make it easier when I adjust the instruction file to fit other services.
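
The "acquire the reference code, retrieve the data" part of that flow could be sketched with PDO and a prepared statement, roughly as below. This is only an illustration under stated assumptions: the function name `fetchDetails` is made up, the demo uses an in-memory SQLite database (which needs the pdo_sqlite extension) so it can run without a server, and the sample row is invented. On the live site the PDO object would point at the MySQL database instead.

```php
<?php
/**
 * Fetch one customer's row from finaldetails by reference code.
 * Returns null when no row matches.
 */
function fetchDetails(PDO $pdo, string $referenceCode): ?array
{
    // The "?" placeholder keeps user input out of the SQL text itself.
    $stmt = $pdo->prepare('SELECT * FROM finaldetails WHERE reference_code = ?');
    $stmt->execute([$referenceCode]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}

// Demo only: an in-memory SQLite database stands in for the MySQL server.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE finaldetails (reference_code TEXT, VehVin TEXT, VehYr TEXT)');
$pdo->exec("INSERT INTO finaldetails VALUES ('ABC123', '1HGCM82633A004352', '2003')");

print_r(fetchDetails($pdo, 'ABC123'));   // the matching row as an array
var_dump(fetchDetails($pdo, 'NOPE'));    // no match
```

The returned row is a plain array keyed by column name, which is the shape needed for filling either an FDF file or an HTML template.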

Thanks for all your insight and I’m excited to work the process.


#18

I have found what seems to be a good converter, called Icecream PDF Converter: https://icecreamapps.com/Download-PDF-Converter/

My question now is, do I need to convert the PDF into an HTML format, and if not, what format is best for what I’m wanting to do? It looks like it supports Doc and Docx. So which is the best?

My other question is, if these PDFs are password protected, then I’m assuming my only option would be to take a PDF, open it in Photoshop, save it as a JPEG, and then convert that to HTML or another preferred format, which this converter seems to be able to do. Please advise, thanks.


#19

Well, let’s step back for a moment and review the possibilities. Your version is better if you actually do have the forms already in a format where others can enter the fields. But if you have that option, just use that one form and let the users enter the data and submit it to you. Then you can pull their entries out and save them. But if, as you said early on, you have the data already and want to enter it using a program, you can do it the way you are working it now, but only if you have the editable version of the PDF file. I did not understand that you had that available. The one you posted was a standard PDF from the Texas State site. That is where I got my copy, since I could not edit it when I viewed it. BUT, I viewed it as a PDF and therefore viewed it in a browser. If you meant that you opened the PDF inside of Acrobat, most people do not have that option. They usually only have Acrobat Reader, which is free.

On the other hand, the way I proposed is so very simple, except that you have to take the converted HTML version and add in all the template fields. This, of course, is only done one time. Each field must be assigned a code like {vinNumber} or whatever inside the HTML code so it can be replaced by the live data.

Both ways have issues. Both have pre-processing that needs to be done. In your version, you need to pull out the locations of each possible field and create a data-template where you would replace each field with the live data. In my proposed version, you have to mark the HTML with a similar code. Both ways need to process the live data and replace pre-built layouts with the live data.

Since the forms seldom change, I would expect either way would require little set up. The code to populate the templates is less complicated in my version. I can create a sample for you with an adjusted HTML file for you. Then, you can make a decision which is better for you. I will post it shortly. BUT, for some reason, this site will not let me stick in a file in the live posts. So, I will send it to you in a private message instead.


#20

Very interesting! I found one of the conversions I had done in testing your PDF actually gave me an editable version of the PDF. Just found it. Therefore, I need to look into your PDFtk functions and use it as my base instead of the one you posted. Will update you in a bit… Still not sure which is the better process…