[senit] converting PDF files

Paul Nisbet Paul.Nisbet at ed.ac.uk
Mon Jan 14 13:02:05 GMT 2008

As well as the other methods mentioned previously, you can use most OCR
packages that are normally used for scanning, because they can open PDFs and
then convert them to other formats such as Word or plain text. We use
FineReader (£68 from www.dyslexic.com, trial copy from www.abbyy.com).

The main problem with converting PDFs to anything else is that the
formatting often goes awry in the process. But since you'll be doing a lot
of editing on the text to make Braille anyway, you might find that the
easiest (and free) method is to open the PDF in Acrobat Reader and then
'Save as text'. Then open it in whatever editor you use and amend as
necessary.


Paul




_______________________________________________
Paul D. Nisbet
Senior Research Fellow
Communication Aids for Language and Learning (CALL) Centre
Moray House School of Education
University of Edinburgh
Paterson's Land, Holyrood Road
Edinburgh EH8 8AQ
Tel. 0131 651 6236     Fax 0131 651 6234
email Paul.Nisbet at ed.ac.uk
http://callcentrescotland.org.uk  
http://callcentrescotland/digitalexams 
http://booksforall.org.uk  
 

-----Original Message-----
From: senit-bounces at lists.becta.org.uk
[mailto:senit-bounces at lists.becta.org.uk] On Behalf Of Adrian Higginbotham
Sent: 11 January 2008 10:00
To: senit at lists.becta.org.uk
Subject: RE: [senit] converting PDF files

It's a tricky area. Getting text out using many of the applications
suggested is relatively easy, but it often requires a lot of work retouching
the content: putting in headings, chapter titles and other structure, and
taking out page numbers (or making it clear that they are page numbers
rather than leaving them scattered randomly through the text).
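
As a rough illustration of the page-number clean-up described above, here is
a minimal Python sketch. It assumes that page numbers in the extracted text
sit on lines of their own, written as "12", "Page 12" or "- 12 -"; that
pattern and the function name strip_page_numbers are illustrative assumptions
rather than anything a particular converter guarantees, and a genuine line
holding only a number would be removed too.

```python
import re

# Assumption: a page number is a line holding only a short number,
# optionally written as "Page 12" or "- 12 -". Other documents may
# need a different pattern.
PAGE_NUMBER = re.compile(r"^\s*(?:page\s+)?-?\s*\d{1,4}\s*-?\s*$", re.IGNORECASE)


def strip_page_numbers(text: str) -> str:
    """Drop lines that look like stand-alone page numbers."""
    kept = [line for line in text.splitlines() if not PAGE_NUMBER.match(line)]
    return "\n".join(kept)
```

Numbers inside a sentence are left alone, so only the stray stand-alone lines
disappear. Headings and chapter titles would still need putting in by hand.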

Products which undertake the conversion to accessible formats specifically
with this purpose in mind cut down on this sort of work, but their success is
very dependent on the quality of the source material - if there's no
structure in the original then no converter is going to magically create it.

The products I'm aware of that specifically try to create accessible formats
are:
Riverdocs - as someone already said it's quite technical and expensive -
really designed for quite large scale use so priced accordingly.

Easy converter - from Dolphin - can convert any of HTML, Word, Braille,
DAISY, PDF to any of these.  Possibly some conversion to audio too, I think.
Worth a demo and consideration; again not cheap compared to some
non-specialist products, but it does cut out a lot of work.

Also Adobe Acrobat - the paid-for, fully featured product, not the free
Reader application - has quite a lot of tools built in to help tidy up
content to improve accessibility, and allows you to export to a number of
formats including HTML.  Using the 'make accessible' feature it will even
attempt to add structural mark-up where none exists in the source.

Remember also that Adobe Reader is now much better at facilitating access to
PDF files for access tech users, so before undertaking conversion it is
always worth checking the ease of use of the PDF file itself with a
screenreader / magnifier - either yourself or with a student.  Reader does
recognise the presence of assistive technology on the host machine, so it can
allow some level of access even on security-enabled files where you can't
achieve this on your own machine - for example, a screenreader user may be
able to copy and paste content where a non-user cannot.

Adrian Higginbotham
Project manager: Learning services
Becta
Tel: Direct dial 024 7679 7333 - Becta switchboard 02476-416994.
Email: Adrian.Higginbotham at becta.org.uk
Web: http://www.becta.org.uk/
BECTA, Millburn Hill Road, Science Park, Coventry, CV4 7JJ 

-----Original Message-----
From: senit-bounces at lists.becta.org.uk
[mailto:senit-bounces at lists.becta.org.uk] On Behalf Of Steve Lee
Sent: 11 January 2008 09:26
To: senit at lists.becta.org.uk
Subject: Re: [senit] converting PDF files

On 10/01/2008, Claire Barnes <clairebarnes at willowdeneschool.co.uk> wrote:
> You can actually cut and paste text from pdf files (e.g into MS Word) 
> but you lose all formatting and usually all the diagrams.

Depending on the fonts used in the PDF I've found you often end up with
unusable text.

Gmail has an option to display PDF attachments, but it doesn't work if
there are images.

Perhaps Acrobat has options to manipulate the accessibility features
available in newer PDF formats, but that's an expensive solution and there
are loads of free converters.

The pdftotext part of XPDF is available for Windows (here:
ftp://ftp.foolabs.com/pub/xpdf/xpdf-3.02pl2-win32.zip) and, as it's a
command-line tool, it should be good for batch operations.
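
To give an idea of what a batch run might look like, here is a minimal Python
sketch that calls pdftotext once for each PDF in a folder. It assumes
pdftotext is installed and on the PATH; the function names build_command and
convert_folder are made up for this example, and the -layout option (which
asks pdftotext to keep the page layout as far as it can) is optional.

```python
import subprocess
from pathlib import Path


def build_command(pdf: Path, exe: str = "pdftotext") -> list[str]:
    """Return the pdftotext call that writes <name>.txt next to <name>.pdf."""
    # -layout asks pdftotext to preserve the physical layout where it can.
    return [exe, "-layout", str(pdf), str(pdf.with_suffix(".txt"))]


def convert_folder(folder: str, exe: str = "pdftotext") -> list[Path]:
    """Convert every PDF in the folder; return the PDFs that failed."""
    failed = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        # Raises FileNotFoundError if the pdftotext program cannot be found.
        result = subprocess.run(build_command(pdf, exe))
        if result.returncode != 0:
            failed.append(pdf)
    return failed
```

Calling convert_folder("worksheets") (a hypothetical folder name) would leave
a .txt file beside each PDF and return a list of any files that pdftotext
could not convert, which is usually a sign of a scanned or protected document.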

--
Steve Lee
--
Jambu - Alternative Access to Computers
www.fullmeasure.co.uk




