Whilst fishing out the Econet books I've been promising to scan since the week after they were published, I also found the Inter-Word Reference Manual, and as we'd been talking about it a lot lately, I did that first.
I set the book up on a little copywriter's easel and took a picture of each double-page spread. It took about 15 minutes, much quicker than I could have managed with a flatbed scanner.
Then I did a dummy crop of one of the images to get the X and Y offset and crop box size so I could process them all with convert:
Code:
for i in *.JPG; do convert "$i" -crop 4919x3454+514+227 -resize 2048 "IWord$i.jpg"; done
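For anyone wanting to repeat this, the crop geometry is in the form WIDTHxHEIGHT+XOFFSET+YOFFSET. Here is a rough sketch of working it out from the top-left and bottom-right corners of a dummy crop. The function name is made up, and the bottom-right corner figures are simply back-calculated from the geometry above for illustration:

```shell
# Hypothetical helper: crop_geometry LEFT TOP RIGHT BOTTOM
# Prints an ImageMagick geometry string, WIDTHxHEIGHT+LEFT+TOP.
crop_geometry() {
    echo "$(( $3 - $1 ))x$(( $4 - $2 ))+$1+$2"
}

# Corners here are illustrative, derived from the geometry used above.
crop_geometry 514 227 5433 3681
```

The result can then be passed straight to the -crop option.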
I did want to convert them to PNG, but that was taking far too long. I can do that next time around if it's going to be more useful.
Here's a sample.
I've put the whole lot here:
http://www.beebmaster.co.uk/Downloads/S ... 0Manual.7z
And a PDF I made, also using convert:
http://www.beebmaster.co.uk/Downloads/S ... Manual.pdf
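The PDF step is roughly the following. This is only a sketch: the function name and the output filename in the usage line are made up, and it assumes the page images sort into the right order by filename, since the shell expands the wildcard alphabetically.

```shell
# Hypothetical helper: make_pdf OUTPUT.pdf IMAGE...
# Hands all the page images to convert, which writes one PDF page per image.
make_pdf() {
    out=$1
    shift
    convert "$@" "$out"
}

# Usage (filenames are assumptions):
#   make_pdf IWordManual.pdf IWord*.jpg
```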
Not bad for a first effort is my verdict, and I can probably improve the technique as I go along. Any tips gratefully received.
I don't think I have a facility to make an OCR/searchable PDF, but if anyone knows how it can be done in Ubuntu or Raspbian, I'll gladly give it a go. Otherwise I'm happy to upload full-size cropped images if anybody else can help turn them into a nice PDF.