Fixing OCR support in gscan2pdf on Ubuntu 14.04 & derivatives

Posted on May 11, 2014 by pseudomorph

Edit: Jeffrey Ratcliffe, the very active developer of gscan2pdf, has released an update that fixes this bug. Ubuntu users can get it from his PPA (see below).

In this post the other day I talked about my relatively painless experience upgrading to Xubuntu 14.04. Since then, I have discovered a couple of bugs in some OCR software I use fairly regularly.

Here is a solution to a slightly annoying regression in gscan2pdf, an otherwise great little PDF scanning, clean-up and OCR solution.

In Ubuntu 14.04, gscan2pdf has a bug in its tesseract OCR support: it appears to OCR the document, but once it has finished no text has been added to the OCR layer. Although the bug does not affect the gocr OCR engine, tesseract (which was originally developed by HP Labs) is a much better engine and the one I prefer to use.

My first attempt at rectifying the problem was to upgrade gscan2pdf from 1.2.3-1 to the latest version, 1.2.4. That version doesn’t seem to have made it into the Ubuntu 14.04 repos, which is a shame considering Trusty is an LTS release. On the upside, Jeffrey Ratcliffe, gscan2pdf’s developer, has a PPA that contains the latest version, so upgrading was relatively painless. The process is well documented here on the RCLUBLINUX blog.

Unfortunately, the bug is not fixed in gscan2pdf 1.2.4 so the upgrade didn’t fix my problem.

A little poking about on the gscan2pdf Sourceforge page, however, turned up this bug report and also a patch to fix the problem, contributed by user tzieg (Thomas Zieg?).

After applying the patch and firing up gscan2pdf, I was glad to see tesseract working as expected again. Thanks, Thomas!

Problem: After upgrading to Xubuntu 14.04 the tesseract OCR engine no longer worked in gscan2pdf.

Solution: Patch gscan2pdf using the patch supplied by Thomas Zieg.

Procedure: Download a copy of the patch from gscan2pdf’s Sourceforge bugtracker.

Copy the patch to the gscan2pdf directory.

sudo cp Tesseract.pm.patch /usr/share/perl5/Gscan2pdf/

Change to the gscan2pdf directory.

cd /usr/share/perl5/Gscan2pdf/

Apply the patch.

sudo patch -p0 < Tesseract.pm.patch

OCR with tesseract should now work as expected. Easy.
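Before patching a file that belongs to a package, it can be worth checking that the patch applies cleanly and keeping a copy of the original. The sketch below wraps those two steps in a small shell function. The name apply_patch_checked is my own and not part of gscan2pdf, and the function assumes GNU patch, which provides --dry-run and -b.

```shell
# Hypothetical helper (not part of gscan2pdf): dry-run a patch inside a
# directory and only apply it for real if the dry run succeeds.
# Assumes GNU patch, which provides --dry-run and -b (keep a .orig backup).
# $1: directory to patch in
# $2: patch file, absolute or relative to that directory
apply_patch_checked() {
    dir="$1"
    patch_file="$2"
    (
        cd "$dir" || exit 1
        # First pass: report what would happen without touching any file.
        patch --dry-run -p0 < "$patch_file" || exit 1
        # Second pass: apply for real, saving each original as <name>.orig.
        patch -b -p0 < "$patch_file"
    )
}
```

On the real system the same two patch invocations would be run with sudo from /usr/share/perl5/Gscan2pdf/, as in the steps above. If something goes wrong, the .orig copy can simply be moved back into place.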

Right, now to figure out why OCRFeeder crashes when exporting to PDF.

Configuring Synaptics Touchpad on a Macbook Under Linux

Posted on May 3, 2012 by pseudomorph

It has been a while since I’ve posted anything up but today I’ve got a couple of little touchpad tips.

My current everyday machine is an old Macbook 2,1 that I’ve had for quite a long time, just over 4 years if memory serves me correctly. Most of the time I really enjoy the Apple hardware in this machine, if not the software. There is, however, the more than occasional time that I need to make the hardware play nice with the software I choose to use. Usually it’s not a case of a piece of hardware not working at all but (because it is Apple, after all) of it working in a way I don’t want, or mostly working, just not perfectly. The Synaptics touchpad is one of these devices.

Today, after finally getting the shits with accidentally getting somewhat near the touchpad with my palm and again finding myself editing the wrong damn sentence of my PhD thesis, I got my google fu out to find a solution.

And what a solution I found. So good, in fact, that I need do nothing but direct you towards it as all I did was to pretty much take the configuration provided and drop it in the correct place. Job done.

However! Before you all get too carried away with Synaptics goodness, let’s just really quickly solve a problem I’ve found under Linux Mint once or twice. Namely, how does one configure touchpad options that Gnome Shell/Linux Mint does not reveal in the GUI? Shame this nice little solution will be well and truly superseded by what follows. Oh well.

Problem: In Linux Mint 12 with Gnome Shell, the GUI installed by default provides access to only a few of the settings that the Synaptics touchpad supports.

Solution: Installing the package gpointing-device-settings adds the ‘Pointing Devices’ settings GUI.

Procedure:
$ sudo aptitude install gpointing-device-settings

Right, now that’s out of the way. If you came here to really configure your Synaptics touchpad, you’re going to want to get to many more settings than are revealed through gpointing-device-settings, and for that you’ll need a command line and synclient. What you really want to do, however, is head over to the useless use of cat blog and read the excellent post on ‘Tuning the Macbook touchpad in Linux’.
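For the accidental palm touches described above, synclient’s palm-detection parameters are one place to start. What follows is a minimal sketch, assuming the synaptics X driver is in use. The function names and the threshold values are illustrative choices of mine, not taken from the linked post, and whether palm detection helps at all depends on what the touchpad hardware reports.

```shell
# Hypothetical helper: print synclient arguments that enable palm detection.
# PalmDetect, PalmMinWidth and PalmMinZ are parameters of the synaptics
# driver; the fallback values used here are illustrative, not recommendations.
palm_detect_args() {
    min_width="${1:-8}"
    min_z="${2:-100}"
    printf 'PalmDetect=1 PalmMinWidth=%s PalmMinZ=%s\n' "$min_width" "$min_z"
}

# Hypothetical helper: apply the settings to the running X session.
# Only useful when synclient is installed and the synaptics driver is in use.
apply_palm_detect() {
    # Word splitting of the unquoted substitution is intentional here:
    # synclient expects each NAME=value pair as a separate argument.
    synclient $(palm_detect_args "$@")
}
```

Running synclient -l first lists the current values, which gives a baseline to tune from. Something like apply_palm_detect 10 200 then tries a different pair of thresholds. Settings made through synclient only last for the current X session, so anything worth keeping needs to go into a configuration file afterwards.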

Shotwell – unable to upgrade library.

Posted on March 16, 2012 by pseudomorph

Yesterday I wrote a post about a permissions error I came across when trying to move items to the trash in Linux Mint. Today I’d like to do a follow-up post on a very similar issue I came across when I fired up Shotwell the other day.

As the last post was fairly involved, I’ll keep this one brief. Do check that post if you need a little more detail, though, as it covers very similar ground.

Problem: When attempting to start Shotwell the application gives the following error:

“Shotwell was unable to upgrade your photo library from version 0.9.3 (schema 12) to 0.11.6 (schema 14).”

Solution: Ensure that the user has read/write permissions to Shotwell’s configuration directory.

Procedure:

Open a terminal, and check the permissions to ~/.shotwell

ls -al ~/.shotwell

If you find, like I did, that Shotwell’s data and thumbs directories are owned by root rather than your user, then:

sudo chown -R "$USER":"$USER" ~/.shotwell/

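It may also be a permissions error, where your user owns these directories but doesn’t have permission to read from or write to them. In that case the following should fix it (the capital X adds execute permission only to directories and to files that are already executable):

chmod -R u+rwX ~/.shotwell/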

Fire up Shotwell and all should be dandy.
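To see exactly which files are affected before changing anything, find can list everything that is not owned by the current user. A small sketch follows. The function name list_not_mine is my own, and the default path assumes Shotwell keeps its data in ~/.shotwell as described above.

```shell
# Hypothetical helper: list everything under a directory that is not owned
# by the current user. No output means ownership is not the problem.
# $1: directory to check (defaults to ~/.shotwell)
list_not_mine() {
    dir="${1:-$HOME/.shotwell}"
    # find's -user treats a decimal number as a user ID when no user has
    # that name, so the output of `id -u` can be passed directly.
    find "$dir" ! -user "$(id -u)" -exec ls -ld {} +
}
```

Calling list_not_mine with no arguments checks ~/.shotwell. If it prints nothing, the chown step is unnecessary and the permissions themselves are the more likely culprit.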

