Inter-Word Reference Manual

avoid work duplication! collaborate on the archival of acorn literature!
BeebMaster
Posts: 3313
Joined: Sun Aug 02, 2009 5:59 pm
Location: Lost in the BeebVault!

Inter-Word Reference Manual

Post by BeebMaster » Mon Jun 22, 2020 10:41 am

After years of feeling ashamed that I never scan anything, yesterday I decided to do something about it. I recently got a basic tripod and a remote control switch for my camera, thinking they would help me get crisper pictures of things, and not having much to photograph just at the minute, I decided to try them out on books.

Whilst fishing out the Econet books I've been promising to scan since the week after they were published, I also found the Inter-Word Reference Manual, and as we'd been talking about it a lot lately, I did that first.

I set the book up on a little copywriter's easel, and took a picture of each double-page. It took about 15 minutes or so, much quicker than I would be able to do with a flatbed scanner.

Then I did a dummy crop of one of the images to get the X and Y offset and crop box size so I could process them all with convert:

Code:

for i in *.JPG; do convert "$i" -crop 4919x3454+514+227 -resize 2048 "IWord$i.jpg"; done
That was the most time-consuming part: it took 35 minutes to do 56 images. I've no idea why it took so long; I used to be able to do a manual select..crop..apply..resize..apply..save as.. quicker than that! At least it's automated, so I can go away and do something else whilst it's running.
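For anyone repeating the batch crop, here is a minimal sketch of the same loop wrapped in two small functions. It assumes ImageMagick's convert as used above; the names out_name and crop_all are made up for illustration and are not from the original post. The only real change is that the camera extension is dropped, so IMG_1234.JPG becomes IWordIMG_1234.jpg rather than IWordIMG_1234.JPG.jpg.

```shell
# Hypothetical helper: map a camera file name to the output name,
# dropping the directory part and the original extension.
out_name() {
    base="${1##*/}"
    printf 'IWord%s.jpg\n' "${base%.*}"
}

# Crop and resize every *.JPG in the current directory.
# The geometry is the one quoted in the post; adjust it for other photos.
crop_all() {
    for f in *.JPG; do
        [ -e "$f" ] || continue      # no matches: skip the literal pattern
        convert "$f" -crop 4919x3454+514+227 -resize 2048 "$(out_name "$f")"
        echo "$f"                    # progress indicator
    done
}

# Usage, from inside the photo directory:
#   time crop_all
```

Running it under time, as in the usage comment, shows how long the whole batch takes on a given machine.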

I did want to convert them to PNG, but that was taking far too long; I can do that next time around if it's going to be more useful.

Here's a sample.
IWord11.jpg
I've put the whole lot here:

http://www.beebmaster.co.uk/Downloads/S ... 0Manual.7z

And a PDF I made, also using convert:

http://www.beebmaster.co.uk/Downloads/S ... Manual.pdf

Not bad for a first effort is my verdict; I can probably improve the technique as I go along. Any tips gratefully received.

I don't think I have a facility to make an OCR/searchable PDF, but if anyone knows how it can be done in Ubuntu or Raspbian, I'll gladly give it a go. Otherwise I'm happy to upload full-size cropped images if anybody else can help turn them into a nice PDF.
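One possible route to a searchable PDF on Ubuntu or Raspbian is sketched below. It assumes the tesseract-ocr and poppler-utils packages are installed (for the tesseract and pdfunite commands) and that the page images are named IWord*.jpg in the current directory. The function names, the DRY_RUN switch and the output name IWordRefMan-ocr.pdf are made up for illustration. The ocrmypdf tool is an alternative that works on an existing PDF instead.

```shell
# Hypothetical dry-run wrapper: with DRY_RUN=1 commands are printed, not run.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# OCR one page image into a searchable single-page PDF beside it.
# "tesseract IMAGE OUTBASE -l eng pdf" writes OUTBASE.pdf.
ocr_page() {
    run tesseract "$1" "${1%.*}" -l eng pdf
}

# OCR every IWord*.jpg, then join the per-page PDFs into one file.
# Pages are taken in glob (lexical) order, so zero-padded names matter.
ocr_book() {
    set --                           # collect per-page PDF names in "$@"
    for img in IWord*.jpg; do
        [ -e "$img" ] || continue
        ocr_page "$img" || return 1
        set -- "$@" "${img%.*}.pdf"
    done
    [ "$#" -gt 0 ] || { echo "no page images found" >&2; return 1; }
    run pdfunite "$@" IWordRefMan-ocr.pdf
}

# Usage:
#   DRY_RUN=1 ocr_book   # show the commands first
#   ocr_book             # then run them for real
```

OCR quality will depend on how sharp and evenly lit the photographs are, so it is worth trying a few pages before committing to the whole book.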

flaxcottage
Posts: 4075
Joined: Thu Dec 13, 2012 8:46 pm
Location: Derbyshire

Re: Inter-Word Reference Manual

Post by flaxcottage » Mon Jun 22, 2020 10:51 pm

Considering that the scanning was done using a digital camera that ain't half bad. :D

I bought a Czur ET-16 scanner to do the same thing and its output was just as good as yours. :lol: Considering its price and the advertising hype I was underwhelmed by the scan quality and the scanning software. Both the scanner and the software were buggy and not really fit for purpose.

You would get better quality with a flat-bed scanner but it would take ages. If you scanned at brightness +16 and contrast +40, the bleed through from the page behind would be virtually removed.

I also looked at FineReader from ABBYY. It had some very good points and could batch split your scans into single pages, batch crop them and batch whiten the pages followed by creating a pdf of the book. With practice that could take 5 minutes or so. It will also make searchable pdfs. The snag is the cost! :?
- John

Re: Inter-Word Reference Manual

Post by BeebMaster » Mon Jun 22, 2020 11:19 pm

I did it again today and it was a bit better; it's going to be a process of trial and error. The problem with a lot of these books is that they are printed on slightly shiny paper, which always causes a reflection that makes the ink look a shade of grey rather than black.

I also did the TCP/IP installation guide, but accidentally skipped one of the pages, so I'll have to do it again. Then I did the AUN Manager's guide, but the camera wouldn't focus on one of the pages, which was nearly blank, so I set it to manual focus. That looked all right through the viewfinder, but quite a few of the subsequent images have come out blurred.

Re: Inter-Word Reference Manual

Post by flaxcottage » Tue Jun 23, 2020 7:54 am

You could get rid of the reflection on shiny paper by placing the lights lower than the camera lens, angling them across the paper, and working in a low-light environment. LED light strips could be good here.
- John

Re: Inter-Word Reference Manual

Post by BeebMaster » Tue Jun 23, 2020 4:03 pm

By applying brightness & contrast filters using the second picture set, I've got a much better result. I can also carve up each photograph into the left and right pages in the same operation:

Code:

time for i in *.JPG; do
  convert "$i" -crop 1635x2335+1232+1459 -brightness-contrast +24 -contrast-stretch 2% "IWord$i-left.jpg"
  convert "$i" -crop 1635x2335+2867+1459 -brightness-contrast +24 -contrast-stretch 2% "IWord$i-right.jpg"
  echo "$i"
done
Here's page 16 again:
IWordIMG_6789.JPG-left.jpg
For some reason the last picture (which is a blank left page and the blank inside back cover) always goes wacky with those settings, even though there are other blank pages which come out all right:
IWordIMG_6834.JPG-left.jpg
but I'm sure that can be omitted from the final PDF.
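The two crop geometries in that loop differ only in the X offset: the right-hand page starts one page-width further along than the left-hand page (1232 + 1635 = 2867). A small sketch of that arithmetic, using a hypothetical helper called page_geometries that is not part of the original post:

```shell
# Hypothetical helper: print the left and right crop geometries for a
# double-page photo, given one page's width and height and the top-left
# corner of the left page. The right page begins at X + width.
page_geometries() {
    w=$1 h=$2 x=$3 y=$4
    printf '%sx%s+%s+%s\n' "$w" "$h" "$x" "$y"
    printf '%sx%s+%s+%s\n' "$w" "$h" "$((x + w))" "$y"
}

page_geometries 1635 2335 1232 1459
# prints:
#   1635x2335+1232+1459
#   1635x2335+2867+1459
```

This only holds when both pages are the same size and sit level in the frame, which is an assumption about how the book was set up on the easel.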

Re: Inter-Word Reference Manual

Post by BeebMaster » Tue Jun 23, 2020 4:46 pm

Dear me (he said before the watershed), having trouble with PDFs again today!

Code:

pi@raspberrypi:/media/BMNFS/BBC/Manuals/Scanning/ISW Scanning/Inter-Word Reference Manual v2 $ convert IW*.jpg IWordRefMan.pdf
convert-im6.q16: unknown `IWordRefMan.pdf' @ error/pdf.c/WritePDFImage/2031.
pi@raspberrypi:/media/BMNFS/BBC/Manuals/Scanning/ISW Scanning/Inter-Word Reference Manual v2 $ convert IW*.jpg IWordRefMan.pdf

libgomp: Thread creation failed: Resource temporarily unavailable
pi@raspberrypi:/media/BMNFS/BBC/Manuals/Scanning/ISW Scanning/Inter-Word Reference Manual v2 $ convert IW*.jpg IWordRefMan.pdf
convert-im6.q16: unknown `IWordRefMan.pdf' @ error/pdf.c/WritePDFImage/2031.
pi@raspberrypi:/media/BMNFS/BBC/Manuals/Scanning/ISW Scanning/Inter-Word Reference Manual v2 $ convert 'IW*.jpg' IWordRefMan.pdf
pi@raspberrypi:/media/BMNFS/BBC/Manuals/Scanning/ISW Scanning/Inter-Word Reference Manual v2 $ 
And that's after it froze completely and I took the overclocking down a notch, then it still froze, so I've disabled overclocking altogether for the time being. What is it with the Pi4 and PDFs??? Something the Pi developers need to fix!
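If the failures above come from the Pi running out of threads or memory (an assumption; the log does not confirm the cause), two things that might be worth trying are sketched here. The first caps ImageMagick's resource use with its -limit option; the second uses img2pdf, a separate tool that wraps JPEG files into a PDF without re-encoding them. The function names, the DRY_RUN switch and the limit values are made up for illustration and would need tuning.

```shell
# Hypothetical dry-run wrapper: with DRY_RUN=1 commands are printed, not run.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Build OUT.pdf from the given images with ImageMagick held to one
# thread and a modest memory ceiling (example values, not tuned).
make_pdf_limited() {
    out=$1; shift
    run convert -limit thread 1 -limit memory 256MiB -limit map 512MiB "$@" "$out"
}

# Alternative: img2pdf embeds the JPEG data as-is instead of decoding it.
make_pdf_img2pdf() {
    out=$1; shift
    run img2pdf "$@" -o "$out"
}

# Usage:
#   make_pdf_limited IWordRefMan.pdf IW*.jpg
#   make_pdf_img2pdf IWordRefMan.pdf IW*.jpg
```

Setting DRY_RUN=1 first prints the exact command line, which is handy for checking that the glob expands to the pages in the intended order.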

richardtoohey
Posts: 3900
Joined: Thu Dec 29, 2011 5:13 am
Location: Tauranga, New Zealand

Re: Inter-Word Reference Manual

Post by richardtoohey » Thu Jun 25, 2020 3:38 am

Pi machines are quite underpowered, and throwing large images at them to convert to PDFs is probably a bit too much for them, especially the low-end machines. ImageMagick etc. is quite a beast. How much RAM does it have? If you open top in another console window, can you see where the pressure is (RAM, CPU, ?)? But this is a bit off-topic, sorry. :oops:
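As a quick complement to watching top, a rough sketch for checking free memory before starting a conversion. It assumes a Linux system with /proc/meminfo, which Raspbian has; the function name mem_available_kb is made up for illustration.

```shell
# Read /proc/meminfo-style text on stdin and print MemAvailable in kB.
mem_available_kb() {
    awk '$1 == "MemAvailable:" { print $2 }'
}

# Report available memory if the file is readable on this system.
if [ -r /proc/meminfo ]; then
    echo "MemAvailable: $(mem_available_kb < /proc/meminfo) kB"
fi
```

Where GNU time is installed, running the conversion as /usr/bin/time -v convert ... also reports the maximum resident set size once the command finishes, which shows how much RAM ImageMagick actually needed.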
