This is a little story that happened a few days ago, it explains well how I usually get involved into ports in OpenBSD.
1 - Lurking into ports/graphics/
At first, I was looking in various ports there are in the graphics category, searching for an image editor that would run correctly on my offline laptop. Grafx2 is laggy when using the zoom mode and GIMP won’t run, so I just open ports randomly to read their pkg/DESCR file.
This way, I often find gems I reuse later, sometimes I have less luck and I only tried 20 ports which are useless to me. It happens I find issues in ports looking randomly like this…
2 - Find the port « comix »
Then, the second or third port I look at is « comix », here is the DESCR file.
Comix is a user-friendly, customizable image viewer. It is specifically designed to handle comic books, but also serves as a generic viewer. It reads images in ZIP, RAR or tar archives (also gzip or bzip2 compressed) as well as plain image files.
That looked awesome, I have lot of books as PDF I want to read but it’s not convenient in a “normal” PDF reader, so maybe comix would help!
3 - Using comix
Once comix was compiled (a mix of python and gtk), I start it and I get errors opening PDFs… I start it again from console, and in the output I get the explanation that PDF files are not usable in comix.
Then I read about the CBZ or CBT files, they are archives (zip or tar) containing pictures, definitely not what a PDF is.
4 - mcomix > comix
After a few searches on the Internet, I find that comix last release is from 2009 and it never supported PDF, so nothing wrong here, but I also found comix had a fork named mcomix.
mcomix forked a long time ago from comix to fix issues and add support for new features (like PDF support), while last release is from 2016, it works and still receive commits (last is from late 2019). I’m going for using comix!
5 - Installing mcomix from ports
Best way to install a program on OpenBSD is to make a port, so it’s correctly packaged, can be deinstalled and submit to ports@ mailing list later.
I did copy comix folder into mcomix, use a brain dead sed command to replace all occurrence of comix by mcomix, and it mostly worked! I won’t explain little details, but I got mcomix to work within a few minutes and I was quite happy! Fun fact is that comix port Makefile was mentioning mcomix as a suggestion for upgrade.
6 - Enjoying a CBR reader
With mcomix installed, I was able to read some PDF, it was a good experience and I was pretty happy with it. I’ve spent a few hours reading, a few moments after mcomix was installed.
7 - mcomix works but not all the time
After reading 2 longs PDFs, I got issues with the third, some pages were not rendered and not displayed. After digging this issue a bit, I found about mcomix internals. Reading PDF is done by rendering every page of the PDF using mutool binary from mupdf software, this is quite CPU intensive, and for some reason in mcomix the command execution fails while I can do the exact same command a hundred time with no failure. Worse, the issue is not reproducible in mcomix, sometimes some pages will fail to be rendered, sometimes not!
8 - Time to debug some python
I really want to read those PDF so I take my favorite editor and start debugging some python, adding more debug output (mcomix has a -W parameter to enable debug output, which is very nice), to try to understand why it fails at getting output of a working command.
Sadly, my python foo is too low and I wasn’t able to pinpoint the issue. I just found it fail, sometimes, but I wasn’t able to understand why.
9 - mcomix on PowerPC
While mcomix is clunky with PDF, I wanted to check if it was working on PowerPC, it took some times to get all the dependencies installed on my old computer but finally I got mcomix displayed on the screen… and dying on PDF loading! Crash seems related to GTK and I don’t want to touch that, nobody will want to patch GTK for that anyway so I’ve lost hope there.
10 - Looking for alternative
Once I knew about mcomix, I was able to search the Internet for alternatives of it and also CBR readers. A program named zathura seems well known here and we have it in the OpenBSD ports tree.
Weird thing is that it comes with two different PDF plugins, one named mupdf and the other one poppler. I did try quickly on my amd64 machine and zathura was working.
11 - Zathura on PowerPC
As Zathura was working nice on my main computer, I installed it on the PowerPC, first with the poppler plugin, I was able to view PDF, but installing this plugin did pull so many packages dependencies it was a bit sad. I deinstalled the poppler PDF plugin and installed mupdf plugin.
I opened a PDF and… error. I tried again but starting zathura from the terminal, and I got the message that PDF is not a supported format, with a lot of lines related to mupdf.so file not being usable. The mupdf plugin work on amd64 but is not usable on powerpc, this is a bug I need to report, I don’t understand why this issue happens but it’s here.
12 - Back to square one
It seems that reading PDF is a mess, so why couldn’t I convert the PDF to CBT files and then use any CBT reader out there and not having to deal with that PDF madness!!
13 - Use big calibre for the job
I have found on the Internet that Calibre is the most used tool to convert a PDF into CBT files (or into something else but I don’t really care here). I installed calibre, which is not lightweight, started it and wanted to change the default library path, the software did hang when it displayed the file dialog. This won’t stop me, I restart calibre and keep the default path, I click on « Add a book » and then it hang again on file dialog. I did report this issue on ports@ mailing list, but it didn’t solve the issue and this mean calibre is not usable.
14 - Using the command line
After all, CBT files are images in a tar file, it should be easy to reproduce the mcomix process involving mutool to render pictures and make a tar of that.
I found two ways to proceed, one is extremely fast but may not make pages in the correct order, the second requires CPU time.
Making CBT files - easiest process
The first way is super easy, it requires mutool (from mupdf package) and it will extract the pictures from the PDF, given it’s not a vector PDF, not sure what would happen on those. The issue is that in the PDF, the embedded pictures have a name (which is a number from the few examples I found), and it’s not necessarily in the correct order. I guess this depend how the PDF is made.
$ mutool extract The_PDF_file.pdf $ tar cvf The_PDF_file.tar *jpg
That’s all you need to have your CBT file. In my PDF there was jpg files in it, but it may be png in others, I’m not sure.
Making CBT files - safest process (slow)
The other way of making pictures out of the PDF is the one used in mcomix, call mutool for rendering each page as a PNG file using width/height/DPI you want. That’s the tricky part, you may not want to produce pictures with larger resolution than the original pictures (and mutool won’t automatically help you for this) because you won’t get any benefit. This is the same for the DPI. I think this could be done automatically using a correct script checking each PDF page resolution and using mutool to render the page with the exact same resolution.
As a rule of thumb, it seems that rendering using the same width as your screen is enough to produce picture of the correct size. If you use large values, it’s not really an issue, but it will create bigger files and take more time for rendering.
$ mutool draw -w 1920 -o page%d.png The_PDF_file.pdf $ tar cvf The_PDF_file.tar page*.png
You will get PNG files for each page, correctly numbered, with a width of 1920 pixels. Note that instead of tar, you can use zip to create a zip file.
15 - Finally reading books again
After all this LONG process, I was finally able to read my PDF with any CBR reader out there (even on phone), and once the process is done, it uses no cpu for viewing files at the opposite of mcomix rendering all the pages when you open a file.
I have to use zathura on PowerPC, even if I like it less due to the continuous pages display (can’t be turned off), but mcomix definitely work great when not dealing with PDF. I’m still unsure it’s worth committing mcomix to the ports tree if it fails randomly on random pages with PDF.
16 - Being an open source activist is exhausting
All I wanted was to read a PDF book with a warm cup of tea at hand. It ended into learning new things, debugging code, making ports, submitting bugs and writing a story about all of this.