15.8.07

PDF tools

Some time ago I scanned a book of more than 500 pages. It took quite a while, but since it was a working day that was OK; the boss didn't mind. ;)

Later I discovered that I had made a mistake: the scanned pages (mostly text) were at a higher quality than a book needs, and the file was too big as a result.

So my idea was to resize the pages and convert them to a lower quality. Here is the whole procedure I followed:

1) I installed the deb package "pdftk" (available in Ubuntu's repositories).
2) ImageMagick, the program I usually use for image conversions, gave too poor a quality when working on PDF files directly. The solution was to extract the page images with the program "pdfimages" (-f and -l are the first and last page; the scanned pages come out as foo-NNN.jpg):
> pdfimages -j -f 1 -l 546 foo.pdf foo
3) The next step is to rescale the fresh JPEGs to a smaller size (1280 px wide in my case). The pattern must not match foo.pdf itself:
> mogrify -resize 1280 foo-*.jpg
4) And then let's convert them back to PDF. For that I used a simple loop:
> for i in foo-*.jpg; do convert "$i" "$i.pdf"; echo "converted file $i to $i.pdf"; done
I used the "echo [...]" to follow the progress of the conversion.
5) OK, and the last step is to merge everything back:
> pdftk foo-*.jpg.pdf cat output foo_result.pdf
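
The steps above can also be put into one small shell function. This is only a sketch under a few assumptions: the name shrink_pdf, its arguments and the temporary working directory are my own invention, and it expects the PDF to contain JPEG-encoded scans (otherwise pdfimages writes other formats and the foo-*.jpg style pattern matches nothing).

```shell
#!/bin/sh
# Sketch: extract, resize, convert and merge in one go.
# shrink_pdf is a hypothetical helper name, not part of any package.
# Usage: shrink_pdf input.pdf output.pdf [width]
shrink_pdf() {
    in=$1
    out=$2
    width=${3:-1280}                 # default width as used in step 3

    # Work in a temporary directory so the original files stay untouched.
    work=$(mktemp -d) || return 1

    # Step 2: extract the page images (all pages) as page-NNN.jpg.
    pdfimages -j "$in" "$work/page" || return 1

    # Step 3: rescale the JPEGs in place.
    mogrify -resize "$width" "$work"/page-*.jpg || return 1

    # Step 4: convert every JPEG to a one-page PDF.
    for img in "$work"/page-*.jpg; do
        convert "$img" "$img.pdf" || return 1
        echo "converted file $img to $img.pdf"
    done

    # Step 5: merge the one-page PDFs in name order.
    pdftk "$work"/page-*.jpg.pdf cat output "$out"
    status=$?

    rm -rf "$work"
    return $status
}
```

With the function loaded in the shell, "shrink_pdf foo.pdf foo_result.pdf" should do the same as steps 2 to 5, and a third argument such as 1024 picks a different width.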

Hooray! The new file is about 4 times smaller (in my case). It's not that I was short of disk space, but of RAM (there is never enough of it): the file is much faster to browse now, while the quality is definitely good enough.
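
To check how many times smaller the result is, the byte counts of the two files can be compared. A minimal sketch; the function name size_ratio is made up for this example, and the result is rounded down to a whole number:

```shell
# Print how many times smaller the second file is than the first
# (integer division, so the result is rounded down).
# size_ratio is a hypothetical helper name.
size_ratio() {
    [ -f "$1" ] && [ -f "$2" ] || return 1
    before=$(wc -c < "$1")
    after=$(wc -c < "$2")
    # $((...)) also strips any padding some wc versions print.
    [ $((after)) -gt 0 ] || return 1
    echo $((before / after))
}
```

For the files from this post that would be "size_ratio foo.pdf foo_result.pdf".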

I used four programs for this task: pdftk, pdfimages, and ImageMagick's mogrify and convert; they are all available in the official Ubuntu repositories.


