Wednesday 5 June 2013

My ideas on Character Recognition

Character Recognition Summary written 06/03/13 10:26 [Wednesday]
Since my teenage years I have been interested in visual perception: how humans (and animals) see things and recognise them and thereby negotiate the world. Rather than go into the entire history of this interest of mine, though, my intention here is to summarise the programming work I have done over recent years to implement ideas I have had on how humans in particular view the world through their eyes.
In 2006 the computer I owned was capable, for about the first time since I had owned a computer, of doing in a reasonable time the lengthy computation I needed to compare printed characters with exemplars (that is, to try to identify which character of the alphabet each one was) according to a formula I had thought up as a measure of similarity. I published on my website a description of my ideas, with some specific results from what I may call experimentation. Towards the end of 2006 I was grappling with the problem of printed characters running together through over-inking instead of standing separately, and was thinking in terms of the aspect ratio characters have on average as a method for trying to separate individual characters.
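(The similarity formula described above is not reproduced here, so the following is a stand-in for illustration only, not the actual measure: one crude way to score the similarity of a character against an exemplar is to binarise both to same-sized bitmaps and count the fraction of pixels on which they agree.)

```python
def similarity(char, exemplar):
    """Illustrative stand-in for a character/exemplar similarity measure:
    the fraction of pixels on which two same-sized binary bitmaps
    (lists of rows of 0/1 values) agree.  Not the formula described
    in the text, which is not given here."""
    assert len(char) == len(exemplar) and len(char[0]) == len(exemplar[0])
    matches = sum(a == b
                  for row_c, row_e in zip(char, exemplar)
                  for a, b in zip(row_c, row_e))
    total = len(char) * len(char[0])
    return matches / total
```

To identify a character one would compute this score against every exemplar in the alphabet and take the best-scoring one.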
This led in a natural course to my trying, in 2007, to vary the greyscale threshold for distinguishing black from white in such a way as to separate out characters one from the next. Instead of looking for seeming characters with an aspect ratio in a particular range, though, I had the idea of measuring how fragments of black emerged as the greyscale threshold was raised, and of preferring the range of threshold where the fragments appeared most stable. This had the advantages that the estimation could be done locally for neighbourhoods within the scan (so that shadows across part of a document would not throw the estimation out) and that the method could be applied to general images, not just pictures of objects within a known range of aspect ratio.
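A simplified sketch of this idea (the details are assumptions: 4-connected fragments, and ‘most stable’ taken to mean the longest run of thresholds over which the fragment count stays constant):

```python
def count_fragments(grey, t):
    """Count 4-connected black fragments after thresholding a greyscale
    image (list of rows of 0..255 values) at t: pixels < t are black."""
    h, w = len(grey), len(grey[0])
    seen = [[False] * w for _ in range(h)]
    n = 0
    for y in range(h):
        for x in range(w):
            if grey[y][x] < t and not seen[y][x]:
                n += 1                      # new fragment found
                stack = [(y, x)]
                seen[y][x] = True
                while stack:                # flood-fill the fragment
                    cy, cx = stack.pop()
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and grey[ny][nx] < t and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return n

def stable_threshold(grey, thresholds=range(16, 240, 16)):
    """Return the threshold at the centre of the longest run of
    thresholds over which the fragment count is constant."""
    counts = [count_fragments(grey, t) for t in thresholds]
    best_len, best_start, run_start = 1, 0, 0
    for i in range(1, len(counts)):
        if counts[i] != counts[i - 1]:
            run_start = i
        if i - run_start + 1 > best_len:
            best_len, best_start = i - run_start + 1, run_start
    ts = list(thresholds)
    return ts[best_start + best_len // 2]
```

Run over a local neighbourhood rather than the whole scan, the same routine gives a per-region threshold, which is what makes the method robust to shadows across part of a document.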
By 2008, instead of using the raw greyscale for each pixel, I was doing what I call a ‘blackdensity’ computation, so that black pixels near a given pixel increase the measure of blackness at that pixel. By taking a local average in this way (an average in which closer pixels have more weight), mistaken measurements from the scan at particular pixels are evened out. One thing coming out of this methodology was that peaks of blackdensity (‘saliences’) could be counted, and the way the count varied as the resolution varied could be observed. This led to the hope that there would be a ‘natural’ scale of distances (that is, resolution) for a given imaged pattern, so that the same object seen from different distances - giving the same pattern of saliences but scaled differently - might still be recognised.
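A sketch of the blackdensity computation and of counting saliences. The weighting function here (1 / (1 + distance²)) and the window radius are illustrative choices, not necessarily the original ones; the only properties relied on are that closer pixels weigh more and that the result is a weighted average.

```python
def blackdensity(grey, radius=2):
    """Weighted local average of blackness (255 minus the grey value),
    with nearer pixels weighted more: weight = 1 / (1 + dy^2 + dx^2).
    The weighting is an illustrative assumption."""
    h, w = len(grey), len(grey[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        wgt = 1.0 / (1 + dy * dy + dx * dx)
                        num += wgt * (255 - grey[ny][nx])
                        den += wgt
            out[y][x] = num / den       # weighted average evens out noise
    return out

def saliences(density):
    """Count peaks ('saliences') of the density map: interior pixels
    strictly greater than all eight neighbours."""
    h, w = len(density), len(density[0])
    peaks = 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            c = density[y][x]
            if all(c > density[y + dy][x + dx]
                   for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                   if (dy, dx) != (0, 0)):
                peaks += 1
    return peaks
```

Varying `radius` plays the role of varying the resolution: counting saliences at several radii gives the count-against-scale curve from which a ‘natural’ scale might be read off.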
From 2009 to 2011, because of the condition of mind I was in, I got bogged down in too much detail, and the work on ‘Visual field analysis’ was in abeyance, except for the general idea emerging in my mind of using a measure of ‘busyness’ of fragments to indicate how useful the information contained in the pattern was. The correct way to measure information in a greyscale pattern, I now think, is to compute what I call the clustermeasure. This has a formula very like that of the similaritymeasure I was using in 2006, except that the similaritymeasure applies to two different patterns being compared. One thing I did achieve in 2009 was a very lucid explanation of the clustermeasure and its additive simplicity as new clusters are added.