Fixing OCR support in gscan2pdf on Ubuntu 14.04 & derivatives

Edit: Jeffrey Ratcliffe, the very active developer of gscan2pdf, has released an update that fixes this bug. Ubuntu users can access it via his PPA (see below).

In this post the other day I talked about my relatively painless experience upgrading to Xubuntu 14.04. Since then, I have discovered a couple of bugs in some OCR software I use fairly regularly.

Here is a solution to a slightly annoying regression in gscan2pdf, an otherwise great little PDF scanning, clean-up and OCR solution.

In Ubuntu 14.04 gscan2pdf has a bug in its tesseract OCR support: it appears to OCR the document, but once it has finished no text is added to the OCR layer. Although the bug does not affect the gocr OCR engine, tesseract (which was originally developed at HP Labs) is a much better engine and the one I prefer to use.

My first attempt at rectifying the problem was to upgrade gscan2pdf to the latest version (from 1.2.3-1 to 1.2.4) which doesn’t seem to have made it into the Ubuntu 14.04 repos, a shame considering Trusty is an LTS release. On the upside Jeffrey Ratcliffe, gscan2pdf’s developer, has a PPA that contains the latest version, so upgrading was relatively painless. The process is well documented here on the RCLUBLINUX blog.

Unfortunately, the bug is not fixed in gscan2pdf 1.2.4 so the upgrade didn’t fix my problem.

A little poking about on the gscan2pdf Sourceforge page, however, turned up this bug report, and also a patch to fix the problem contributed by user tzieg (Thomas Zieg?).

After applying the patch and firing up gscan2pdf I was glad to see tesseract working as expected again. Thanks, Thomas!

Problem: After upgrading to Xubuntu 14.04 the tesseract OCR engine no longer worked in gscan2pdf.

Solution: Patch gscan2pdf using the patch supplied by Thomas Zieg.

Procedure: Download a copy of the patch from gscan2pdf’s Sourceforge bugtracker.

Copy the patch to the gscan2pdf directory.

sudo cp Tesseract.pm.patch /usr/share/perl5/Gscan2pdf/

Change to the gscan2pdf directory.

cd /usr/share/perl5/Gscan2pdf/

Apply the patch.

sudo patch -p0 < Tesseract.pm.patch
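
Since this changes a file that belongs to a package, it can be worth testing the patch first and keeping a backup. What follows is only a minimal sketch and not part of the original fix: the function name apply_patch_checked is my own, and it assumes GNU patch, whose --dry-run and -b options it relies on.

```shell
# apply_patch_checked DIR PATCHFILE
# Hypothetical helper (not from the bug report): test the patch first,
# then apply it while keeping a backup of each file it changes.
# Assumes GNU patch; --dry-run and -b are GNU options.
# PATCHFILE is read after changing into DIR, so give either an absolute
# path or a path relative to DIR.
apply_patch_checked() {
    dir=$1
    patchfile=$2
    (
        cd "$dir" || exit 1
        # Dry run: reports problems without touching any file.
        if ! patch --dry-run -p0 < "$patchfile" > /dev/null; then
            echo "Patch does not apply cleanly, nothing was changed." >&2
            exit 1
        fi
        # Real run: -b keeps a backup copy of each patched file.
        patch -b -p0 < "$patchfile"
    )
}
```

In a root shell (the directory is not writable by a normal user), something like `apply_patch_checked /usr/share/perl5/Gscan2pdf Tesseract.pm.patch` should then do the same job as the three commands above, but refuse to touch anything if the patch does not fit.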

OCR with tesseract should now work as expected, easy.

 

Right, now to figure out why OCRFeeder crashes when exporting to PDF.

Xubuntu 14.04 – Notification Area Missing Icons

Yesterday I bit the bullet and upgraded my fairly stable Xubuntu install from 13.10 Saucy Salamander to 14.04 Trusty Tahr.

I had no pressing need to upgrade (aside from the occasional reminder when I logged in that a new release was available) but since Trusty had been out for a few weeks I figured any show stopping bugs would be ironed out by now.

First, I have to comment on how painless the upgrade procedure has become: a couple of clicks and it was away. After about an hour or so spent downloading and installing updates, a reboot and a slightly extended initial login, everything seemed to be right where I left it. No longer are we faced with fixing a bunch of small things that go awry during the upgrade process.

I did, however, find one minor annoyance. No longer did all my running apps (the ones that I want to anyway) show up in the notification area I have in the top left of my screen.

Missing Icons

Conspicuously missing were Network Manager, Dropbox, Spideroak, KeePass and perhaps a few more, leaving me with just the volume control and power indicator icons showing. This was true even though each of my apps appeared to be running after being correctly started at login.

Indicator Plugin

After a bit of poking around in the XFCE panel preferences I found that replacing the Notification Area applet with the Indicator Plugin applet restored all my application icons.

This, however, left me with another dilemma, as Indicator Plugin also includes a bunch of icons for mail, bluetooth and keyboard that, although I could hide, I couldn’t easily remove. What I really wanted was for Notification Area to work the way it did before the upgrade.

Notification Area

After further investigation and a little google-fu, I found that by killing indicator-application-service my icons would reappear. A quick delve into the ‘Session and Startup’ settings in XFCE’s Settings Manager showed that this service (listed as Indicator Application) was started on login, and that by unticking the box next to it I could tell it not to start. Problem solved. Now my notification area looks the way I like it, with grey and black icons showing and the more out-of-place-looking coloured icons nicely hidden away.

Session and Startup Properties

Problem: After upgrading to Xubuntu 14.04 some application icons no longer show in the notification area.

Solution: Stop indicator-application-service from starting at login.

Procedure:

  • Open XFCE Settings Manager and navigate to Session and Startup preferences.
  • Click on the Application Autostart tab and scroll down to Indicator Application.
  • Untick the tickbox.
  • Click close, log out and log back in again.
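
For anyone who would rather do this from a terminal, here is a rough sketch based on the XDG autostart convention, where a per-user .desktop file of the same name with Hidden=true overrides the system-wide entry. The helper name disable_autostart is my own, and the entry name indicator-application is an assumption about what the system-wide file is called (a look in /etc/xdg/autostart should confirm it). I have not checked whether this is exactly what the XFCE tickbox writes.

```shell
# disable_autostart NAME [AUTOSTART_DIR]
# Hypothetical helper: write a per-user override that hides the
# autostart entry NAME.desktop. AUTOSTART_DIR defaults to the user's
# XDG autostart directory.
disable_autostart() {
    name=$1
    dir=${2:-${XDG_CONFIG_HOME:-$HOME/.config}/autostart}
    mkdir -p "$dir" || return 1
    # Hidden=true asks the desktop to ignore the entry of the same name.
    cat > "$dir/$name.desktop" <<EOF
[Desktop Entry]
Type=Application
Name=$name
Hidden=true
EOF
}
```

Running `disable_autostart indicator-application` and then logging out and back in should have the same effect as the steps above, assuming the entry really is named that way. Deleting the file that was written undoes the change.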

 

Annotating PDF with Okular

Native PDF annotating under Linux has long been a bugbear of mine and something I’d almost given up hope of ever being properly supported. Until, that is, I stumbled across this post describing the process in new versions of Okular.

Discovering this also led me to look deeper, and to discover that Evince also supports PDF annotations, and has done for quite some time! See this post for more information on Evince.

With luck, we’ll soon see the ability to simply add annotations and save, rather than requiring saving annotated PDFs as new documents in order for changes to remain.

groak@{subjects of research}

Once in a while I am looking around if there is finally a way to properly annotate PDF in Linux. The answer was no until a couple of months ago. But I think it is still little known.

Even this post, whose comments made me take a close look again, did not see the option of embedding annotations into PDF. The comments, however, point to Okular, which has been a very good reader for quite some time, and to a more or less recent version of poppler, the PDF library.

The way to go is to make annotations with Okular (use the review tool (F6)) and then save the PDF with “save as”. Now the annotations are embedded into the pdf file. I tested the annotations with the Adobe Android reader and I can view them and alter them with it.

Unfortunately this information is hidden in the Okular handbook and…

Comments on everything: The incivility of internet communication

I wanted to write something that was triggered by a brief Google+ conversation and an even more brief Twitter conversation that I’ve recently had. What follows are my unrefined thoughts on the topic of the incivility of internet communications.

Do tell me – civilly because I do, and will, moderate your comments – what you think below.

Nasty internet comments and conversation is something I’ve been thinking about for a while and I’d like to try to get some of my thoughts in writing. I guess here is as good a place as any to do that. I apologise in advance if this is a bit long or sounds a bit ranty.

In response to a post I made on Google+, Lars wrote:

I think you are on the point, the main problem is the sense of anonymity you have online, which makes it easier to ignore the social norms.
I fear this will probably never be solved as it’s almost impossible to, online, recreate the “being watched” feeling that usually keeps people in line with the social norms.
The only internal guides left are empathy and the fundamental respect for others which, sadly, a lot of people, especially the young, seem to be lacking.

I don’t think I’m as pessimistic about the future of online relationships as is Lars, though there is plenty of reason to be concerned.

Let’s look at the concern briefly. There is much evidence to show that online communication, by and large, has lost much of the civility that we expect in face to face communication. All one need do is read the comments on just about any online article, especially, in these days of hyper-partisanship, those to do with politics.

A recent example from my country was some of the terrible slander our (recently deposed) prime minister was subject to – being our first female PM, much of it was very very sexist and absolutely not anything anyone would dare say in public. Anne Summers has a good run down on it here: http://annesummers.com.au/speeches/her-rights-at-work-r-rated/

In our own community, all we need do is look at some of the fanboyism that sparked this very conversation which all too easily turns from a difference of opinion to outright attack.

More worrying (to me at least), is that many people seem happy to put their real names and images to these awful comments. A browse of some Facebook hate pages that spring up periodically is all one needs to see truly awful comments alongside people’s names and photos. To me, this says the problem runs deeper than perceived online anonymity.

I would, however, like to consider the issue from a different perspective.

I mentioned in my last post that online communication is still relatively new. And, despite the internet being around for some time now, I actually believe this to be the case when we hold it up against other forms of communication.

Take me for instance, I’ve been chatting online since the early(ish) days of the internet in the mid 1990s, and had been chatting on local BBS systems for a number of years before that. Compared to some of these young whippersnappers (get off my lawn) I could almost be considered an old hand.

Except I don’t represent a generation, or perhaps even half a generation.

When we talk about kids these days having no respect for social norms, what we’re talking about is a generation that is, by and large, finding their way in a communication medium that their parents haven’t even experienced.

Stay with me now, I know this is long but there is a payoff at the end. I promise.

My point is, despite its ubiquitous nature, the internet remains a new frontier for communication.

In this world, many of the social pressures we use to enforce norms of polite communication don’t exist, or don’t seem to exist, and people feel free to flout them.

What I do not think is that this is a cause for too much alarm. After all, theories of social decline have been with us for generations, and we are yet to completely implode as a species.

What I want to propose is that before online communication becomes both normalised, and beholden to strict social norms, it will take at least a generation and a half, probably two.

In my view, what it will take for new social norms around internet communication to take hold is for the current generation (those of any age who feel free to make nasty comments) to begin to feel the ramifications of such actions.

For people to lose their jobs and their livelihoods because they thought they were anonymous; for teenagers to find that they cannot simply delete their comments and be done with it, and for people who feel free to make hurtful comments to feel what it is like to be on the receiving end.

This generation will then be equipped to ensure those mistakes are not repeated, creating in the process a new social pressure to ensure peaceful communication.

And here’s the payoff.

There is hope, we are all humans and deep down, we all want the same thing. Getting there just may take a little more time than we’d like.

Recommended Listening: Cory Doctorow on The Command Line Podcast

Today’s recommended listening is an episode of Thomas Gideon’s podcast The Command Line.

Whilst The Command Line is an excellent podcast and comes highly recommended in its own right, this episode is particularly recommended listening.

On this feature cast Thomas’ guest is writer, geek and activist Cory Doctorow, speaking to an audience in Washington DC about the themes of his latest novel Homeland: information, freedom and networks.

Drawing upon his relationship with internet activist Aaron Swartz, Doctorow discusses the connection between personal liberty and access to information. Individual freedom, he says, relies upon a healthy access to the information we need to make informed decisions about the future of our lives and our polity.

Whilst historically the ready flow of information has been constrained in a number of ways, such access becomes even more constrained as we move into a networked world, a phenomenon that is exactly the opposite of what we have come to expect.

Today, says Doctorow, we are all constrained by the digital locks upon the devices we own and yet further by laws that make the investigation and removal of such locks a criminal offence.

For Doctorow, the best defence against regimes that seek to lock us out of control over the devices we own and the networks upon which we rely is an informed public working together through grassroots organisations to ensure government officials are aware of the danger of ignoring constituents in favour of corporate interests.

Here, Doctorow is one of the most important thinkers of our generation. His ability to look not only at the past and the present but also towards possible futures gives him an extraordinary ability to stitch together a compelling narrative of how an existence the majority of us take for granted (individual freedom) is routinely curtailed by the very devices that promise to further unlock it. His in-depth views on the subject are well worth a listen.

Almost more important, however, is the rider to Doctorow’s presentation and the promise he made to the parents of Aaron Swartz: to speak of the danger of depression that many of us face every day.

In an age of connectivity it is easy to believe the perusal of a Facebook, Twitter or other social network stream means as much as a message, an email or phonecall when the truth is it doesn’t.

The message here is clear, if you take enough time to watch from afar, take also the time to touch base for real. It may just make all the difference.

Like my page, The Command Line Podcast is released under a Creative Commons License, meaning you are free to download, share and remix the original work.

Adding a printer to Linux Mint, LMDE or Ubuntu: an Encore

Some time ago I blogged about the difficulty of installing printers under Linux Mint 12 and Ubuntu 11.10, a post that to date remains the most popular on this blog.

After messing about with installing printers again, I’d like to expand upon that post.

Recently I felt the need to change the OS on my primary laptop, a black Macbook 2,1.

Until then I had been using Linux Mint 12 and despite coming with the somewhat unpopular Gnome-Shell it had proved quite stable and usable.

This time, instead of moving to the latest regular Linux Mint release (currently Mint 14, Nadia), I decided to install Linux Mint Debian Edition (LMDE), the distribution that I use on my desktop in my office. LMDE, however, proved less than ideal on the Macbook so I’ve since replaced it with Ubuntu 12.10.

When it came to installing printers under LMDE and Ubuntu I had hoped that I would not encounter the frustration I blogged about last time. Unfortunately the same problem exists under both distributions, so once again I was forced to utilise the Gnome 2.x printer configuration application, system-config-printer described in my earlier post.

As it turns out, I actually prefer the old Gnome 2.x printer application rather than the newer Gnome 3.x one that ships with Gnome-Shell, Cinnamon and Unity.

Although built upon GTK 2.x, it retains all the features that were present under Gnome 2.x (such as printer properties and the ability to easily delete jobs from the print queue) that for some reason seem to have gone AWOL in the Gnome 3.x printer application.

Unfortunately, although the app is installed by default under Mint, LMDE and (I believe) Ubuntu, it does not appear in the menu for any of these desktop environments.

On the upside, there are at least two ways an application can be added to the menu with relative ease.

Problem: Gnome 2.x Printers Application does not appear in menu for Cinnamon, Gnome-Shell or Unity.

Solution 1: One way to add this application to the menu is to fire up the Alacarte Menu Editor (also called Main Menu) and add an entry for the printer application by hand.

Procedure: Make sure alacarte and system-config-printer are installed by opening a terminal and typing the following:

$ sudo apt-get install alacarte system-config-printer
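
If you would rather look before you install, the following is a small sketch that only checks whether the two programs can be found on your PATH. The function name missing_commands is made up for this example, and it checks for the commands themselves rather than asking the package manager.

```shell
# missing_commands CMD...
# Hypothetical helper: print the name of each command that cannot be
# found on PATH. Prints nothing when every command is available.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" > /dev/null 2>&1 || echo "$cmd"
    done
}
```

With the function defined, `missing_commands alacarte system-config-printer` lists whichever of the two still needs installing, and prints nothing if both are already there.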

Now, Alacarte should be accessible under Accessories in Cinnamon or by searching in Gnome-Shell or Unity.

Failing that, it can be launched from the command line by typing the following command:

$ alacarte

Next, navigate to the sub-menu where you would like to add the new launcher; I use System Tools | Preferences.

Click the ‘New Item’ button, add a name, comment and the command ‘system-config-printer’, and find a nice icon (something like /usr/share/icons/gnome-colors-common/scalable/devices/printer.svg should do).

Finally, click OK and you should be good to go.

Solution 2: A second, more elegant, way of making sure you have easy access to your printer settings is to add a .desktop file to your ~/.local/share/applications folder. This file is read by your desktop environment and a menu entry is automatically created for you.

I won’t go into detail on just what .desktop files are and how they are interpreted by your system, as Joe over at the Linux Critic blog has a great post titled the Anatomy of a .desktop File that does just that and I encourage you to go and read his post.

What I will do here is show you how to do what I have done on my system.

Procedure: First, open a new file called system-config-printer.desktop in your favourite text editor. As we know we need to save this file in our ~/.local/share/applications directory, let’s go ahead and open it there straight away.

$ gedit ~/.local/share/applications/system-config-printer.desktop

Next copy and paste the following into the file.

#!/usr/bin/env xdg-open

[Desktop Entry]
Version=1.0
Type=Application
Terminal=false
Icon[en_AU]=printer1
Name[en_AU]=Printers (Non-Gnome Shell Config)
Exec=system-config-printer
Comment[en_AU]=Traditional Gnome Printer Management Application
Name=Printers (Non-Gnome Shell Config)
Comment=Traditional Gnome Printer Management Application
Icon=/usr/share/icons/gnome-colors-common/22x22/devices/printer.png
Categories=Settings

Finally, save the file and exit your text editor.
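
If you prefer to skip the text editor, the same thing can be done with a here-document. This is only a sketch: write_printer_launcher is a name I made up, and the entry it writes is a trimmed version of the one above that leaves out the en_AU variants and the xdg-open line.

```shell
# write_printer_launcher [APPLICATIONS_DIR]
# Hypothetical helper: write a trimmed copy of the launcher above.
# APPLICATIONS_DIR defaults to ~/.local/share/applications.
write_printer_launcher() {
    dir=${1:-$HOME/.local/share/applications}
    mkdir -p "$dir" || return 1
    # The quoted 'EOF' stops the shell from expanding anything in the body.
    cat > "$dir/system-config-printer.desktop" <<'EOF'
[Desktop Entry]
Version=1.0
Type=Application
Terminal=false
Exec=system-config-printer
Name=Printers (Non-Gnome Shell Config)
Comment=Traditional Gnome Printer Management Application
Icon=/usr/share/icons/gnome-colors-common/22x22/devices/printer.png
Categories=Settings
EOF
}
```

Calling `write_printer_launcher` with no arguments writes the file into your own applications folder. If desktop-file-validate (from the desktop-file-utils package) happens to be installed, pointing it at the new file is a quick way to catch typos.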

Whichever of the above solutions you’ve followed, you should now have a new Printers menu item under your Preferences sub-menu. If you don’t, go ahead and log out and back in again.

Configuring Synaptics Touchpad on a Macbook Under Linux

It has been a while since I’ve posted anything up but today I’ve got a couple of little touchpad tips.

My current everyday machine is an old Macbook 2,1 that I’ve had for quite a long time, just over 4 years if memory serves me correctly. Now most of the time I really enjoy the Apple hardware in this machine, if not the software. There is, however, the more than occasional time that I need to make the hardware play nice with the software I choose to use. Most of the time it’s not a case of a piece of hardware not working at all but – because it is Apple after all – of it working in a way I don’t want, or mostly working but just not perfectly. The Synaptics touchpad is one of these devices.

Today, after finally getting the shits with accidentally getting somewhat near the touchpad with my palm and again finding myself editing the wrong damn sentence of my PhD thesis, I got my google fu out to find a solution.

And what a solution I found. So good, in fact, that I need do nothing but direct you towards it as all I did was to pretty much take the configuration provided and drop it in the correct place. Job done.

However! Before you all get too carried away with Synaptics goodness let’s just really quickly solve a problem I’ve found under Linux Mint once or twice. Namely, how does one configure touchpad options that Gnome Shell/Linux Mint does not reveal in the GUI? Shame this nice little solution will be well and truly superseded by what follows. Oh well.

Problem: In Linux Mint 12 with Gnome Shell, the GUI installed by default provides access to only a few of the settings that the Synaptics touchpad supports.

Solution: Installing the package gpointing-device-settings provides the ‘Pointing Devices’ settings GUI.

Procedure:
$ sudo aptitude install gpointing-device-settings

Right, now that’s out of the way, if you came here to really configure your Synaptics touchpad you’re going to want to get to many more settings than are revealed through gpointing-device-settings, and for that you’ll need a commandline and synclient. What you really want to do, however, is head over to the useless use of cat blog and read the excellent post on ‘Tuning the Macbook touchpad in Linux’.
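
To give a flavour of what synclient looks like, here is a minimal sketch aimed at the stray-palm problem. It is not the configuration from that post: the function name enable_palm_detection is my own, and while PalmDetect, PalmMinWidth and PalmMinZ are options of the synaptics driver, the numbers are illustrative guesses that will need tuning. Whether palm detection does anything useful also depends on what the touchpad hardware reports.

```shell
# enable_palm_detection
# Hypothetical helper: switch on the synaptics driver's palm detection.
# The option names are synaptics driver options; the values are
# illustrative guesses, not tuned settings.
enable_palm_detection() {
    synclient PalmDetect=1 PalmMinWidth=8 PalmMinZ=100
}
```

Running `synclient -l` before and after shows the current value of every option. Settings changed this way apply to the running session only, so anything worth keeping has to be re-applied at login or moved into the X configuration.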