A week and a half ago I started looking into PDF generation with PHP. As my friends pointed out, this can be a somewhat arduous process. Since I was working on a deadline [1], I decided the best solution would be to create a nice printable view that I could create PDFs from, then come back to the actual PDF generation if I had time.
Getting the printable view done was pretty easy, and once I had it ready I thought “surely [2] someone has to have created an HTML to PDF generator”. A little looking around and I discovered dompdf, an LGPL PHP5 library which does exactly that.
The simple example page worked great, but then I ran into a couple of stumbling blocks early when trying to create a PDF of my printable HTML page:
- dompdf doesn’t gracefully handle `IMG` tags with empty src attributes. I hadn’t gotten around to setting up the images yet, so I just removed the `IMG` tag and that led me to…
- dompdf doesn’t seem to handle `TH` (table heading) tags properly. I got a DOMText::getAttribute() error that I was unable to find reports of online. Finally I tried removing the `TH` tags [3] and everything worked great.
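The two workarounds above could also be applied automatically before the HTML reaches dompdf. What follows is only a rough sketch: `prepare_html_for_pdf()` is a hypothetical helper of my own naming, not part of dompdf, and it uses simple regular expressions rather than a real HTML parser, so it assumes reasonably tidy markup.

```php
<?php
// Hypothetical helper (not part of dompdf): pre-process printable HTML
// to sidestep the two problems described above.
function prepare_html_for_pdf($html)
{
    // Remove IMG tags whose src attribute is present but empty.
    // Note: this simple pattern would also catch an empty data-src="".
    $html = preg_replace('/<img\b[^>]*\bsrc\s*=\s*(["\'])\s*\1[^>]*>/i', '', $html);

    // Turn <th ...> and </th> into <td ...> and </td>.
    // The \b keeps <thead> untouched.
    $html = preg_replace('/<(\/?)th\b/i', '<$1td', $html);

    return $html;
}

$html = '<table><thead><tr><th class="n">Name</th></tr></thead></table><img src="" alt="">';
echo prepare_html_for_pdf($html), "\n";
```

Doing this on the way to the PDF generator, rather than in the templates, leaves the on-screen HTML with its proper `TH` tags.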
After that, it was just a matter of tweaking the CSS and HTML until I was happy with it. Total time to get from start to application generated PDFs: 3 days. Day 1: create printable HTML view. Day 2: integrate dompdf and get it all working. Day 3: add the polish of breaking tables at pages, etc.
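For reference, the "integrate dompdf" step can be quite small. This is a minimal sketch, not the code from my project: it assumes the dompdf API of that era (a `DOMPDF` class with `load_html()`, `render()` and `output()`), that the library has already been loaded (the config include path in the comment is a guess), and `html_to_pdf()` and `report.pdf` are made-up names.

```php
<?php
// Sketch only: assumes the older dompdf API (class DOMPDF), loaded
// beforehand with something like:
//   require_once 'dompdf/dompdf_config.inc.php';   // path is an assumption
function html_to_pdf($html)
{
    if (!class_exists('DOMPDF')) {
        return null; // dompdf is not loaded, so there is nothing to do
    }

    $dompdf = new DOMPDF();
    $dompdf->load_html($html); // feed it the printable HTML view
    $dompdf->render();         // lay out the pages
    return $dompdf->output();  // the PDF as a binary string
}

$pdf = html_to_pdf('<html><body><p>Hello</p></body></html>');
if ($pdf !== null) {
    file_put_contents('report.pdf', $pdf); // hypothetical output file
}
```

Later dompdf releases changed the class and method names, so check the documentation for the version you install.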
There are some quirks, but the FAQ and forums have a lot of answers and after a short time digging around you’ll find answers to most of your questions. Here are a couple of tips that should be useful:
- The page canvas that dompdf uses seems to be 600px wide. I set my `BODY` tag to `width: 560px` and `padding: 0 20px;` and that has worked well.
- dompdf doesn’t like all CSS shortcuts. In particular, `font: 12pt Helvetica;` and `background: #999;` didn’t work as well as explicitly setting the `font-family` and `font-size` separately, and setting the `background-color`.
- If you want to create page breaks, this is your huckleberry (also in the FAQ, I missed it at first glance).
- Adding page numbers is pretty easy.
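Put together, the first three tips might look something like the sketch below. `build_print_view()` is a hypothetical helper and the class names are invented for illustration. The `page-break-before: always` rule is my assumption about which CSS property handles page breaks, so verify it against the FAQ. Nothing here calls dompdf itself. It only builds the HTML string you would hand to it.

```php
<?php
// Hypothetical helper: wraps chunks of markup in a print view that
// follows the tips above. Each chunk after the first starts a new page.
function build_print_view(array $sections)
{
    // Nowdoc, so nothing inside the CSS is interpolated by PHP.
    $css = <<<'CSS'
body {
    width: 560px;            /* 600px canvas minus 2 x 20px padding */
    padding: 0 20px;
    font-family: Helvetica;  /* longhand instead of font: 12pt Helvetica; */
    font-size: 12pt;
}
.shaded { background-color: #999; }        /* instead of background: #999; */
.page-break { page-break-before: always; } /* assumed page-break rule */
CSS;

    $body = '';
    foreach (array_values($sections) as $i => $section) {
        $class = $i > 0 ? ' class="page-break"' : '';
        $body .= "<div{$class}>{$section}</div>\n";
    }

    return "<html><head><style>\n{$css}\n</style></head><body>\n{$body}</body></html>";
}

echo build_print_view(array('<h1>Summary</h1>', '<h1>Details</h1>'));
```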
This has been a huge time saver for me, many thanks to Benj Carson and the other contributors to the project.
1. I needed to be able to generate PDFs of this data by the end of the month.
2. …and don’t call me Shirley.
3. I replaced them with `TD` tags – this HTML is only to print or generate a PDF so validation and semantic HTML goodness take a backseat. Perhaps I’ll see how difficult it is to add `TH` support and contribute it back (in my copious spare time).
Welcome back to the world of adding value. 😉 New shit has come to light!
Good call on sacrificing the validation and semantics to getting it done. I bet you’ll get around to adding the code back for TH’s, too, because you’re just that kind of itch-scratcher.
You can also take a look at FPDF or UFPDF which adds UTF-8 support. Yet another option is TCPDF also based on FPDF.
Personally I had more success with those since they didn’t have the added funny business of a DOM parser posing as an HTML/CSS parser. HTML/CSS are rather complicated, so there’s a really large margin of error. Not to mention the transformation of that into PDF seemed a bit slow to me.
Haven’t done PDF in a while though.
Robert, you’re pointing me to the very thing I was trying to avoid. With dompdf I can create a PDF by running my printable HTML view through it instead of having to code up all the PDF stuff.
I was solving this very problem a while ago. mozilla2ps came to the rescue. It renders with the Mozilla engine via xulrunner, so all CSS and layout are like what you see. The small caveat is that you have to run Xvfb, since it requires an X server but doesn’t actually use it.
Unfortunately all PHP PDF libraries (except the $1000 libpdf) are pretty useless with UTF-8 input data.
Even though UFPDF and TCPDF promise to handle UTF-8, they do it by just embedding a full TTF font in the PDF. If you have just one language to handle this is fine, but when you need full Unicode range, embedding a 20MB font isn’t feasible.
The correct answer to this problem would be font subsetting, which would only embed those glyphs which are actually used in the document. But so far no free PHP lib I know of supports this.
I’d be happy to be proved wrong though.
[…] Just a link for my own future knowledge. Thanks Alex. […]
I personally use FPDF for PDF creation due to its ease of use. I used to have some problems with the text-wrapping in PDF documents, and FPDF was the one which appeared to have the simplest solution and so I’ve stuck with it ever since!
Good, I want to test this. Thanks for the review.
Is there a way you can find out the set of Unicode fonts which “cover” a set of given characters? That way, one could embed only those fonts instead of adding the whole huge set.
I’ve just added a few notes about dompdf on my blog:
http://www.deepvoidlog.com/?p=129
I think that dompdf is a nice piece of software, though perhaps lacking in official support. I had a lot of difficulty getting it to show images correctly in procedurally generated headers, and the official docs are not exhaustive on this matter.
I recently embedded dompdf into a personal project and ran into trouble with PHP’s memory limits. There’s nothing about it in the docs, and contacting the author is a somewhat useless exercise. Is dompdf’s development suspended?
We have been using dompdf successfully for about 9 months now in our commercial product. We like it for all the reasons that the author expressed, and have found it easy to use and fairly reliable in translating the HTML + CSS.
We knew the conversion was very memory-intensive, but our server seemed to handle this OK until recently. At the moment, though, we are having problems generating either larger or sequential PDFs. The font metrics require a lot of memory, and are processed through an eval. When we try to process multiple documents as PDF, whether in a single document or in multiple sequential documents, we get a PHP crash due to memory limits.
I agree with you Alex. It’s easy to implement dompdf but I am having some problems with table layout.
I have a table with 4 columns and each column has width=25%. It’s displayed nicely except when the table is right at the top of the page – then the cells’ widths are anything but 25%.
Have you ever experienced this problem?
Hey, I found this post about dompdf and was curious if you were successful adding images to your PDF files. I have successfully gotten them (via normal HTML img tags) to display in the PDF, but haven’t figured out how to position them where I want, or overlay them on top. It seems to always want to push the text down. I haven’t been able to make the img a background-image via CSS (it doesn’t show up in the PDF), and I haven’t been able to figure out how to implement dompdf’s image() function, with Canvas::open_object() or whatever…
Any clue?
Shwaa,
Did you ever find an answer to this? I’m having the same problems…
benjaminkeith@gmail.com
Thx
Hi Shwaa & Benjamin
Use the CSS the way it is used in the file background_images.html in the folder dompdf/www/test/
If you still have problems, you can contact me at:
singhrvijay@gmail.com
[…] Some tips on generating PDFs with DOMPDF […]
For the benefit of anyone who comes across this post from a search, as I just did, dompdf does now appear to support the tag – I use it in my projects and it works fine (centres the text and puts it in bold as I would expect).
That should be ‘support the th tag’ – HTML stripped without any warning…
[…] both but couldn’t get either to work satisfactorily. Then I found out about DomPDF, which Alex King has discussed previously. It’s a nice application with some good documentation and has been […]
Hello, if anyone is interested, dompdf is now back under active development, and the official project page is now on Google Code:
http://code.google.com/p/dompdf/
The code in the SVN trunk has lots of new features and fixes. A new 0.6 beta will come in the next few weeks.
I’m having problems creating long tables whose data spans multiple pages. thead doesn’t work, and I’m not able to keep the table header on each page.
Any help?