Barrass-Brough on Blogger: August 2012

16/05/12 15:25 [Wednesday]

I have been thinking about character recognition and visual field analysis again. My latest work involved trying to settle on a best resolution for a given visual field, as a prerequisite to recognising objects at all. My thought was that finding a measure of ‘busyness’ and observing how it altered as resolution increased might be the way to go. I see now that if I think in terms of information content instead of busyness, what is required is an optimal trade-off between that measure and the resolution, since processing cost increases as resolution increases. In other words, would-be pattern recognisers in the animal kingdom need to gain maximum information (ultimately through recognising objects in the environment) for the least possible expenditure of time and effort on processing.

What I have further thought is that the measure of ‘clustering’ I developed should be used as the measure of information content. I had toyed with simply counting black fragments (on the basis that many fragments means many objects being observed), but it strikes me that, whatever the number of fragments, if they are better clustered then they are better defined and thereby more likely to give up useful information through being recognised. I can also measure clustering for a greyscale field, which obviates the need to distinguish black from white. If a pixel at x_i has blackness b_i (inverse greyscale, 0 to 255), then the measure of clustering is

$$\sum_{i,j} b_i b_j \exp\left(-d(x_i - x_j)^2\right)$$

In effect we are counting each unit of blackness as a separate black pixel.
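
A minimal sketch of this measure in Python, under stated assumptions: d is read as a decay constant multiplying the squared Euclidean distance between pixel positions, the sum runs over all ordered pairs including i = j, and the name `clustering` and the sample fields are purely illustrative.

```python
import math

def clustering(field, d=1.0):
    """Sum of b_i * b_j * exp(-d * |x_i - x_j|^2) over all ordered pixel pairs.

    `field` is a list of rows of blackness values (0 = white, 255 = black).
    Treating d as a decay constant and including the i == j terms are assumptions.
    """
    # Pixels with zero blackness contribute nothing, so drop them up front
    pixels = [(x, y, b) for y, row in enumerate(field)
              for x, b in enumerate(row) if b]
    total = 0.0
    for xi, yi, bi in pixels:
        for xj, yj, bj in pixels:
            dist2 = (xi - xj) ** 2 + (yi - yj) ** 2
            total += bi * bj * math.exp(-d * dist2)
    return total

# Same total blackness, arranged as one clump versus four isolated corners
clustered = [[1, 1, 0, 0],
             [1, 1, 0, 0],
             [0, 0, 0, 0],
             [0, 0, 0, 0]]
scattered = [[1, 0, 0, 1],
             [0, 0, 0, 0],
             [0, 0, 0, 0],
             [1, 0, 0, 1]]
print(clustering(clustered) > clustering(scattered))
```

The clumped field scores higher than the scattered one. Under the same assumptions, a single pixel of blackness 2 scores 4, exactly what two unit-black pixels at zero distance from each other would score, which is the sense in which each unit of blackness counts as a separate black pixel.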

I am wondering whether to use $\sum_{i,j} b_i b_j$, taken as a function of resolution, as an estimate of processing cost. The processing the computer does consists of adding up a lot of exponentials, and cost is saved only where b_i = 0. For animal processing systems, though, I feel they must model each unit of blackness separately, which leads to very dark fields being puzzling and headachey.
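
One observation that may help here, assuming the cost sum runs over all ordered pairs of pixels including i = j: it factorises into the square of the total blackness, so it can be found in a single pass over the field.

$$\sum_{i,j} b_i b_j = \Big(\sum_i b_i\Big)\Big(\sum_j b_j\Big) = \Big(\sum_i b_i\Big)^2$$

If the intended sum excludes the i = j terms, subtract $\sum_i b_i^2$ from the right-hand side.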

30/07/12 13:17 [Monday]

About two weeks ago I wrote a program based on the ideas above, but found I needed to alter the measure to be maximised to

$$\frac{\sum_{i,j} b_i b_j \exp\left(-d(x_i - x_j)^2\right)}{\frac{1}{n}\sum_{i,j} b_i b_j}$$

where n is the number of pixels (width × height of the rectangular field). The reason is that the numerator has a number of non-negligible terms proportional to n rather than n²: for each pixel i, the only b_j values that matter are those for which $\exp(-d(x_i - x_j)^2)$ is non-negligible, and the number of such pixels is independent of the width and height of the field (as long as neither is too small). The unweighted sum $\sum_{i,j} b_i b_j$ has n² terms, so dividing it by n puts the two on the same footing.
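
The scaling argument can be sketched in Python as follows. This is not the program mentioned above, only an illustration: the name `ratio_measure`, the `cutoff` threshold for "negligible", and the reading of d as a decay constant are all assumptions.

```python
import math

def ratio_measure(field, d=1.0, cutoff=1e-6):
    """Clustering sum divided by (1/n) * sum_{i,j} b_i b_j.

    Only neighbours inside a fixed window are visited, so the work in the
    numerator grows like n rather than n^2. `cutoff` is a hypothetical
    threshold below which a pair weight is treated as negligible.
    """
    height, width = len(field), len(field[0])
    n = width * height
    # Any pair outside this window has weight exp(-d * dist^2) below `cutoff`
    radius = int(math.sqrt(-math.log(cutoff) / d))
    numerator = 0.0
    for y, row in enumerate(field):
        for x, b in enumerate(row):
            if not b:
                continue  # zero blackness contributes nothing
            for yj in range(max(0, y - radius), min(height, y + radius + 1)):
                for xj in range(max(0, x - radius), min(width, x + radius + 1)):
                    dist2 = (x - xj) ** 2 + (y - yj) ** 2
                    numerator += b * field[yj][xj] * math.exp(-d * dist2)
    total = sum(map(sum, field))
    cost = total * total / n  # (1/n) * sum over all ordered pairs of b_i * b_j
    return numerator / cost if cost else 0.0

small = [[1] * 8 for _ in range(8)]
large = [[1] * 16 for _ in range(16)]
print(ratio_measure(small), ratio_measure(large))
```

For a uniformly grey field the ratio changes only modestly between the 8 × 8 and the 16 × 16 case, the difference coming from edge effects, whereas the unnormalised numerator roughly quadruples.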

Using this measure to find the best resolution for the field over my sample of cases (i.e. finding the resolution which maximises the ratio of information content to processing cost, as above) gives results like the following:

It must be admitted these divisions do correspond well with the natural scale of structures within each image. For the picture of the garden each quarter of it can be seen to be basically light (especially the quarter showing the sky) or dark. For the portion of a printed letter the reason the resolution arrived at is so high (corresponding in fact to the scale of the width of lines making up printed characters) is that the black print shows up so clearly against a very white background.
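
The search over resolutions might look something like the sketch below. How resolution was varied, and how d relates to the pixel size at each resolution, are not stated here, so this version makes its own assumptions: resolution is reduced by averaging non-overlapping square blocks, and d is held fixed in units of the downsampled cells. The function names are hypothetical, and the sketch is not expected to reproduce the divisions described above.

```python
import math

def ratio_measure(field, d=1.0):
    """Clustering sum divided by (1/n) * sum_{i,j} b_i b_j (brute force)."""
    pixels = [(x, y, b) for y, row in enumerate(field)
              for x, b in enumerate(row) if b]
    n = len(field) * len(field[0])
    total = sum(b for _, _, b in pixels)
    if total == 0:
        return 0.0
    clustering = sum(bi * bj * math.exp(-d * ((xi - xj) ** 2 + (yi - yj) ** 2))
                     for xi, yi, bi in pixels for xj, yj, bj in pixels)
    return clustering / (total * total / n)

def downsample(field, block):
    """Average non-overlapping block x block squares (assumes exact divisibility)."""
    size_y, size_x = len(field) // block, len(field[0]) // block
    return [[sum(field[y * block + dy][x * block + dx]
                 for dy in range(block) for dx in range(block)) / block ** 2
             for x in range(size_x)]
            for y in range(size_y)]

def best_block_size(field, candidates, d=1.0):
    """Return (best, scores): the top-scoring block size and every candidate's score."""
    scores = {block: ratio_measure(downsample(field, block), d) for block in candidates}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical 8 x 8 test field: left half black, right half white
field = [[255] * 4 + [0] * 4 for _ in range(8)]
best, scores = best_block_size(field, [1, 2, 4, 8])
print(best, scores)
```

A larger block size means a coarser resolution. At the coarsest setting the whole field collapses to a single cell, for which the ratio is exactly 1.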

The question is where to take this next. The next step is to analyse each subdivision of the image arrived at, using the same technique of distinguishing light from dark at a natural grain size. Repeated subdivision will end when cells are found which are unsuitable for further subdivision because they vary so little in greyscale across their entire extent. This stage will be marked by very low values of the ratio measure defined above, because there will be no information content to speak of within each cell.
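
The recursion could be sketched as below, with two simplifying assumptions. Each cell is split into four equal quadrants, rather than at the grain size the ratio measure would pick, and the stopping test looks directly at the greyscale range within a cell instead of at the ratio measure. The name `subdivide` and the thresholds `min_range` and `min_size` are illustrative only.

```python
def subdivide(field, x0=0, y0=0, min_range=16, min_size=2):
    """Recursively quarter a field; return leaf cells as (x, y, width, height).

    A cell becomes a leaf when its blackness values span less than
    `min_range`, or when it is too small to split into quadrants of at
    least `min_size` pixels on a side.
    """
    height, width = len(field), len(field[0])
    values = [b for row in field for b in row]
    flat = max(values) - min(values) < min_range
    if flat or width < 2 * min_size or height < 2 * min_size:
        return [(x0, y0, width, height)]
    half_w, half_h = width // 2, height // 2
    leaves = []
    for ys, ye in ((0, half_h), (half_h, height)):
        for xs, xe in ((0, half_w), (half_w, width)):
            cell = [row[xs:xe] for row in field[ys:ye]]
            leaves += subdivide(cell, x0 + xs, y0 + ys, min_range, min_size)
    return leaves

# Hypothetical 8 x 8 field whose top-left quarter is black
field = [[255 if x < 4 and y < 4 else 0 for x in range(8)] for y in range(8)]
print(subdivide(field))
```

For this field the recursion stops after one split, because each quarter is uniform. A field with finer detail keeps splitting only in the cells that contain it.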

Tuesday, 7 August 2012

07/08/12 20:05 [Tuesday]

Friday, 3 August 2012

Character recognition