Detecting picture data

geraldholdsworth
Posts: 294
Joined: Tue Nov 04, 2014 9:42 pm
Location: Inverness, Scotland

Detecting picture data

Post by geraldholdsworth » Fri May 19, 2017 1:18 pm

Hi all,

Does anyone have an algorithm for working out whether a given block of data is likely to be picture data?

I won't be writing this on a BBC or RISC OS (it will run on Windows), but I'm looking for BBC screen or sprite data and need to tell it apart from code or other data. The data won't necessarily be a full screen (say, 20K for a MODE 2 screen), as it could be only part of one. Sounds easy... I just can't get my head around how to do it programmatically.

Cheers,

Gerald.
Gerald Holdsworth
Repton Resource Page
www.reptonresourcepage.co.uk

davidb
Posts: 1832
Joined: Sun Nov 11, 2007 10:11 pm

Re: Detecting picture data

Post by davidb » Fri May 19, 2017 1:41 pm

You could look for spans of multiple bytes with the same value. 6502 code isn't likely to contain a lot of those. Possibly narrow this down to bytes that represent runs of the same pixel colour.
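A rough sketch of that idea in Python, in case it helps. The function names and the `min_run` default are made up for illustration and would need tuning against real files. The last function assumes the MODE 2 layout, where each byte holds two pixels with the left pixel in bits 7, 5, 3, 1 and the right pixel in bits 6, 4, 2, 0.

```python
def longest_run(data: bytes) -> int:
    """Length of the longest run of identical consecutive bytes."""
    best = run = 0
    prev = None
    for b in data:
        run = run + 1 if b == prev else 1
        best = max(best, run)
        prev = b
    return best


def run_fraction(data: bytes, min_run: int = 4) -> float:
    """Fraction of bytes lying inside a run of at least min_run equal bytes.

    min_run is an arbitrary starting point, not a tested threshold.
    """
    if not data:
        return 0.0
    covered = 0
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        if j - i >= min_run:
            covered += j - i
        i = j
    return covered / len(data)


def is_solid_mode2_byte(b: int) -> bool:
    """True if both MODE 2 pixels packed in this byte have the same colour.

    Shifting the left-pixel bits (7, 5, 3, 1) down by one lines them up
    with the right-pixel bits (6, 4, 2, 0) for comparison.
    """
    return ((b >> 1) & 0x55) == (b & 0x55)
```

A block with a high `run_fraction`, or with a large share of bytes passing `is_solid_mode2_byte`, would be a candidate for picture data. Zero-filled padding would score highly too, so this is only a first filter.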

Rich Talbot-Watkins
Posts: 1090
Joined: Thu Jan 13, 2005 5:20 pm
Location: Palma, Mallorca

Re: Detecting picture data

Post by Rich Talbot-Watkins » Fri May 19, 2017 2:21 pm

A variant of that might be to count the percentage of pairs of adjacent bytes which differ by 3 bits or fewer (or some other threshold arrived at by experimentation). I'd expect image data to have a far higher percentage of such pairs than machine code, although other types of data might also come out as false positives.
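Something along these lines, as a minimal Python sketch. The function name is hypothetical and `max_bits=3` is just the figure suggested above, not a tested value.

```python
def close_pair_fraction(data: bytes, max_bits: int = 3) -> float:
    """Fraction of adjacent byte pairs that differ in at most max_bits bits."""
    if len(data) < 2:
        return 0.0
    close = 0
    for a, b in zip(data, data[1:]):
        # XOR leaves a 1 in every bit position where the two bytes differ.
        if bin(a ^ b).count("1") <= max_bits:
            close += 1
    return close / (len(data) - 1)
```

For a baseline, uniformly random bytes would give 93/256 (about 36%) of pairs within three bits, so a useful cut-off presumably sits well above that. Where exactly is something to find by running it over known screens and known code.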

paulb
Posts: 765
Joined: Mon Jan 20, 2014 9:02 pm

Re: Detecting picture data

Post by paulb » Fri May 19, 2017 3:50 pm

I can imagine the statistics people analysing 6502 instructions and building up some kind of probability model (a Markov chain, maybe). Then, running through a sequence of bytes, anything that doesn't match the predicted successor would probably suggest data rather than instructions.

This is all hand-waving, of course, and you'd need to be careful with instruction operands. Perhaps any probable instruction would cause the following bytes to be treated as operands (according to that instruction's requirements), so that the next item in the sequence would be taken from the next instruction location rather than the next byte.
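To make the hand-waving slightly more concrete, here is a toy Python sketch of a first-order Markov (bigram) model over probable opcodes. Everything in it is an assumption for illustration: the class and function names are invented, `PARTIAL_LENGTHS` covers only a handful of 6502 opcodes (a real table would need all of them), unknown opcodes are treated as one byte long, and the add-one smoothing is just the simplest choice.

```python
import math
from collections import Counter, defaultdict

# Deliberately incomplete opcode -> instruction length table (illustrative only).
PARTIAL_LENGTHS = {
    0xA9: 2,  # LDA #imm
    0x8D: 3,  # STA abs
    0x20: 3,  # JSR abs
    0x4C: 3,  # JMP abs
    0xD0: 2,  # BNE rel
    0xE8: 1,  # INX
    0xCA: 1,  # DEX
    0x60: 1,  # RTS
}


def opcode_stream(data, lengths):
    """Yield probable opcodes, skipping the operand bytes that follow each."""
    i = 0
    while i < len(data):
        yield data[i]
        i += max(1, lengths.get(data[i], 1))  # unknown opcode: assume 1 byte


class OpcodeBigramModel:
    """Counts opcode-to-opcode transitions seen in known machine code."""

    def __init__(self, lengths):
        self.lengths = lengths
        self.counts = defaultdict(Counter)

    def train(self, code):
        ops = list(opcode_stream(code, self.lengths))
        for prev, cur in zip(ops, ops[1:]):
            self.counts[prev][cur] += 1

    def mean_log_prob(self, data):
        """Average log-probability per transition; lower suggests 'not code'."""
        ops = list(opcode_stream(data, self.lengths))
        if len(ops) < 2:
            return 0.0
        total = 0.0
        for prev, cur in zip(ops, ops[1:]):
            row = self.counts.get(prev, Counter())
            # Add-one smoothing over the 256 possible byte values.
            total += math.log((row[cur] + 1) / (sum(row.values()) + 256))
        return total / (len(ops) - 1)
```

You would train it on blocks known to be code, then compare `mean_log_prob` for an unknown block against the scores typical of code. Where to draw the line is an open question and would need experimenting.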


Re: Detecting picture data

Post by geraldholdsworth » Sun May 21, 2017 11:07 am

Thank you - there is certainly some deep thought required to achieve this. Might take me some time, but some excellent pointers here.


Re: Detecting picture data

Post by paulb » Sun May 21, 2017 12:49 pm

Somewhat related to this is part-of-speech tagging, which is used in natural language processing to classify each word in a text. I'm not claiming that such taggers are applicable here, but you can get a feel for the kind of thing I was suggesting by reading up a bit on that topic.

