HolySmoke
Posts: 2
Joined: Sat Jan 18, 2020 12:36 pm

How can I repair corrupted pdf files on raspberry pi?

Sat Jan 18, 2020 3:03 pm

Hi,

I have several PDF files that are corrupted/damaged.
The app "GoodReader" on my iPad can repair corrupted PDFs. (It does so automatically if a PDF can't be opened.)

I have been trying to open the corrupted/damaged files in Okular, E-book viewer, and PDF Viewer on my Raspberry Pi 4, with no luck.

My question is:
what software can automatically repair corrupted/damaged PDFs on a Raspberry Pi 4?

scruss
Posts: 3138
Joined: Sat Jun 09, 2012 12:25 pm
Location: Toronto, ON

Re: How can I repair corrupted pdf files on raspberry pi?

Sat Jan 18, 2020 4:19 pm

Depends on how corrupted the file is. mupdf will try to view broken files, though it probably won't be as capable as GoodReader, which is likely using a commercial PDF library.

I know of several manual tools, but those aren't what you asked for. PDF is a somewhat irritating format to fix: everything relies on hardcoded byte offsets in the data stream. It's pretty amazing they survive at all.
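
To make the byte-offset point concrete, here is a minimal Python sketch (not something posted in the thread) that reads the offset following the last `startxref` keyword and checks whether it still points at a cross-reference section. The function names `check_startxref` and `build_sample` are made up for illustration, and the sample file is a tiny synthetic PDF, so treat this as a rough diagnostic rather than a validator.

```python
import re


def check_startxref(data: bytes) -> str:
    """Report whether the trailing startxref offset points at xref data."""
    # The last "startxref" keyword is followed by the byte offset of the
    # cross-reference section.
    matches = re.findall(rb"startxref\s+(\d+)", data)
    if not matches:
        return "no startxref found"
    offset = int(matches[-1])
    if offset >= len(data):
        return f"offset {offset} is past end of file"
    # A classic table starts with the keyword "xref"; a cross-reference
    # stream (PDF 1.5+) starts with an object header such as "12 0 obj".
    target = data[offset:offset + 20]
    if target.startswith(b"xref") or re.match(rb"\d+\s+\d+\s+obj", target):
        return f"offset {offset} looks valid"
    return f"offset {offset} does not point at an xref section"


def build_sample() -> bytes:
    """Build a tiny synthetic PDF skeleton (hypothetical test data)."""
    body = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
    xref_offset = len(body)  # the xref table starts right after the body
    tail = b"xref\n0 1\n0000000000 65535 f \ntrailer\n<< /Root 1 0 R >>\n"
    tail += b"startxref\n" + str(xref_offset).encode() + b"\n%%EOF\n"
    return body + tail


if __name__ == "__main__":
    sample = build_sample()
    print(check_startxref(sample))
    # Shifting the content by a few bytes is enough to break the pointer.
    print(check_startxref(b"junk" + sample))
```

Prepending just four stray bytes leaves every object intact but makes the stored offset wrong, which is the kind of damage that stops strict viewers from opening a file.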

There is someone on this board who knows a lot about PDF but I haven't seen them around for a while.
‘Remember the Golden Rule of Selling: “Do not resort to violence.”’ — McGlashan.
Pronouns: he/him

HolySmoke
Posts: 2
Joined: Sat Jan 18, 2020 12:36 pm

Re: How can I repair corrupted pdf files on raspberry pi?

Mon Jan 20, 2020 3:40 pm

Thanks for the answer.

MuPDF did open some of the corrupted files, which made it easier to decide whether to keep each PDF or throw it away.

What alternatives do I have if I want to try the manual way?

scruss
Posts: 3138
Joined: Sat Jun 09, 2012 12:25 pm
Location: Toronto, ON

Re: How can I repair corrupted pdf files on raspberry pi?

Mon Jan 20, 2020 7:24 pm

qpdf, pdftk (which may not be available on Raspbian any more), or even ghostscript (gs), using its pdfwrite device to rewrite one PDF into another. Some of the suggestions on the PDF page of the Forensics Wiki may also help. If you need to get the image objects out of a PDF, pdfimages from the Poppler tools is quite good.
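
As a rough illustration of how these tools might be driven, the Python sketch below builds the command lines for qpdf, gs and pdfimages and tries each installed repair tool in turn. The helper names (`repair_commands`, `image_extract_command`, `try_repairs`) and the file names are assumptions made for this example. A clean exit status does not prove the output is usable, so the result still needs checking in a viewer.

```python
import shutil
import subprocess
from typing import Dict, List, Optional


def repair_commands(src: str, dst: str) -> Dict[str, List[str]]:
    """Return candidate rewrite commands, keyed by tool name."""
    return {
        # qpdf attempts to rebuild a damaged cross-reference table while
        # copying the input to the output.
        "qpdf": ["qpdf", src, dst],
        # Ghostscript re-interprets the file and writes a fresh PDF
        # through its pdfwrite device.
        "gs": ["gs", "-o", dst, "-sDEVICE=pdfwrite", src],
    }


def image_extract_command(src: str, prefix: str) -> List[str]:
    """Command for pulling embedded images out with Poppler's pdfimages."""
    return ["pdfimages", "-all", src, prefix]


def try_repairs(src: str, dst: str) -> Optional[str]:
    """Run each installed tool; return the first one that reports success."""
    for name, cmd in repair_commands(src, dst).items():
        if shutil.which(cmd[0]) is None:
            continue  # tool is not installed, skip it
        result = subprocess.run(cmd, capture_output=True)
        # qpdf exits with status 3 when it wrote output but issued warnings,
        # which is the usual outcome for a file it had to recover.
        if result.returncode == 0 or (name == "qpdf" and result.returncode == 3):
            return name
    return None


if __name__ == "__main__":
    # "corrupted.pdf" and "repaired.pdf" are placeholder file names.
    print(try_repairs("corrupted.pdf", "repaired.pdf"))
```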

All of these are cryptic command-line tools. Some are more robust than others. I can't think of one general fixer-upper tool I'd recommend over anything else: I discard corrupted PDFs as they're usually useless to me.

gkaiseril
Posts: 679
Joined: Mon Aug 08, 2016 9:27 pm
Location: Chicago, IL

Re: How can I repair corrupted pdf files on raspberry pi?

Mon Jan 20, 2020 10:02 pm

Both are in the standard Raspbian repositories, and one should be able to install either or both. Be forewarned: if data is missing, recovery might not be possible. If only the internal pointers to data blocks have been corrupted, then recovery should be possible. The internal PDF format is far more complex than Word's doc format.
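
One reason damaged pointers tend to be recoverable is that each PDF object begins with its own `N G obj` header, so the offsets can be rediscovered by scanning the file. The Python sketch below shows that idea under simplifying assumptions: the name `scan_object_offsets` is hypothetical, and the scan ignores compressed object streams and can be fooled by binary stream data that happens to look like a header.

```python
import re
from typing import Dict, Tuple

# Matches headers such as "12 0 obj"; the lookbehind avoids starting
# in the middle of a longer number.
OBJ_HEADER = re.compile(rb"(?<!\d)(\d+)\s+(\d+)\s+obj\b")


def scan_object_offsets(data: bytes) -> Dict[Tuple[int, int], int]:
    """Map (object number, generation) to the byte offset of its header."""
    offsets: Dict[Tuple[int, int], int] = {}
    for match in OBJ_HEADER.finditer(data):
        key = (int(match.group(1)), int(match.group(2)))
        # If an object is defined more than once, the later definition wins,
        # in line with how incremental updates override earlier objects.
        offsets[key] = match.start()
    return offsets


if __name__ == "__main__":
    sample = (
        b"%PDF-1.4\n"
        b"1 0 obj\n<< /Type /Catalog >>\nendobj\n"
        b"2 0 obj\n<< >>\nendobj\n"
    )
    for (num, gen), offset in sorted(scan_object_offsets(sample).items()):
        print(f"object {num} {gen} starts at byte {offset}")
```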
f u cn rd ths, u cn gt a gd jb n cmptr prgrmmng.

ksharindam
Posts: 162
Joined: Sat Jan 09, 2016 4:16 pm

Re: How can I repair corrupted pdf files on raspberry pi?

Wed Jan 22, 2020 1:57 pm

You can use the gs command from the ghostscript package.

gs -o outfile.pdf -sDEVICE=pdfwrite corrupted.pdf
This can fix errors in PDFs.