Why Linux Prints Such Small Tax Forms

Note: IRS tax forms no longer seem to suffer from the problem with their MediaBox that they did when this article was written in April of 2006, but this article will remain online for reference.

My desktop Debian Linux system was printing miniature tax forms. Instead of allowing each form to comfortably take up an entire 8½×11 page, the forms were issuing from the printer reduced in size to the point of being unusable:


Miniature Taxes. On the left is the miniature tax form which my Linux machine printed when I sent the PDF file from the IRS web site directly to the printer; on the right is the same form printed correctly thanks to the procedure outlined in this article.

The Internal Revenue Service performs a wonderful service by offering all of their forms and publications on the web in PDF format. Their consistent use of this standard document format means that the documents can be retrieved, read online, and printed by a wide variety of operating systems and applications. But it was discouraging on my particular desktop to download a tax form from the IRS and have it come out so small on the printer.

My Debian desktop runs the Common Unix Printing System (CUPS), which in turn runs the manufacturer’s printer drivers that Samsung included with my ML-1740 laser printer. What was it about the tax documents that encouraged the drivers to print them at a reduced size? The forms looked quite normal when viewed on the screen with xpdf but, I noticed, had much wider and more awkward margins when viewed with the PostScript viewer gv:


Schedule D in xpdf. The tax form looks quite attractive when rendered by xpdf, and is framed by reasonable margins.

Schedule D in gv. The Ghostscript-based viewer gv surrounds the document with ridiculously expansive margins.

The fact that the status boxes along the top of gv display the document size not as “Letter” but as “y1043×842” suggests that it is not correctly determining the document size. Checking the page size ourselves with the pdfinfo command reveals what is going on:

$ pdfinfo -box f1040sd.pdf
Title:          2005 Form 1040 (Schedule D)
Subject:        Capital Gains and Losses
Keywords:       Fillable
Author:         SE:W:CAR:MP
Creator:        OneForm Designer Plus
Producer:       Acrobat Distiller 6.0.1 (Windows)
CreationDate:   Thu Oct 20 08:33:42 2005
ModDate:        Fri Nov  4 11:37:25 2005
Tagged:         no
Pages:          2
Encrypted:      no
Page size:      842 x 1043 pts
MediaBox:           0.00     0.00   842.00  1043.00
CropBox:          114.98    20.59   727.02   812.60
BleedBox:         114.98    20.59   727.02   812.60
TrimBox:          114.98    20.59   727.02   812.60
ArtBox:           114.98    20.59   727.02   812.60
File size:      71366 bytes
Optimized:      no
PDF version:    1.4

The problem is that Ghostscript, the PostScript interpreter that drives both gv and my printer drivers, is using the large MediaBox, roughly 11.7×14.5 inches, as the page size rather than the smaller 8½×11-inch CropBox. So when forced to print to letter-sized paper, it helpfully reduces the document magnification in order to fit content that it thinks is almost twelve inches across within the eight and a half inches of a letter-sized page. See section 10.10.1 of the PDF Reference, entitled “Page Boundaries”, if you are interested in the details of what the various boxes mean; here I will only note that the IRS document, by having a MediaBox whose dimensions are different from its CropBox, appears to violate the recommendation of implementation note 161 in Appendix H of the Reference.
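To check the arithmetic, the box coordinates reported by pdfinfo can be converted from points to inches (72 points to the inch). The following is only a minimal sketch: the function name box_inches is my own invention for illustration, and it assumes a POSIX shell with awk available.

```shell
#!/bin/sh
# box_inches LLX LLY URX URY   (hypothetical helper, not part of any tool)
# Print the width and height, in inches, of a PDF box whose lower-left and
# upper-right corners are given in points (1 inch = 72 points).
box_inches() {
    LC_ALL=C awk -v llx="$1" -v lly="$2" -v urx="$3" -v ury="$4" \
        'BEGIN { printf "%.2f x %.2f in\n", (urx - llx) / 72, (ury - lly) / 72 }'
}

box_inches 0.00 0.00 842.00 1043.00      # MediaBox: prints 11.69 x 14.49 in
box_inches 114.98 20.59 727.02 812.60    # CropBox:  prints 8.50 x 11.00 in
```

The CropBox comes out at exactly letter size, while the MediaBox is several inches larger in each direction.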

The solution is to convert the PDF tax form into PostScript ourselves so we can manually specify that we want the CropBox used as the dimensions of the resulting page. This is easily accomplished with the pdftops command, which will produce a PostScript file ending with .ps suitable for printing:

$ pdftops -pagecrop f1040sd.pdf
$ lp f1040sd.ps
request id is lp-170 (1 file(s))

The result is a correctly printed tax form.
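If you have several forms to print, the check and the conversion can be combined in a small script. This is a sketch under stated assumptions rather than a polished tool: the function name boxes_differ is hypothetical, it only looks at the MediaBox and CropBox lines that pdfinfo -box prints (as in the output above), and it assumes your pdftops accepts the -pagecrop flag used in this article, which may not be true of every build; check its manual page first.

```shell
#!/bin/sh
# boxes_differ (hypothetical helper): read `pdfinfo -box` output on stdin
# and succeed (exit status 0) only when both a MediaBox and a CropBox line
# are present and their coordinates differ.
boxes_differ() {
    awk '
        $1 == "MediaBox:" { media = $2 " " $3 " " $4 " " $5 }
        $1 == "CropBox:"  { crop  = $2 " " $3 " " $4 " " $5 }
        END { exit !(media != "" && crop != "" && media != crop) }
    '
}

# Convert each PDF named on the command line whose boxes disagree.
# With no arguments the script does nothing.
if [ "$#" -gt 0 ]; then
    for pdf in "$@"; do
        if pdfinfo -box "$pdf" | boxes_differ; then
            # Assumes pdftops supports -pagecrop, as in the article.
            pdftops -pagecrop "$pdf"
        else
            echo "$pdf: MediaBox and CropBox agree; print it directly" >&2
        fi
    done
fi
```

Saved under a name of your choosing, the script would be run with the PDF file names as arguments, and the resulting .ps files sent to the printer with lp as shown above.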
