I don’t think this is an old version of LO. I suspect it was written in Word, then the PDF was produced, then the OpenDocument version was created from the PDF. I have both the PDF and the ODT versions. There is no HTML version.
I can’t use the Perl script. I don’t know Perl, and I suspect that by the time I was confident enough to try, I could have cut and pasted everything I need manually! I put the PDF through an online converter and got a very clean DOCX version, so that at least is a start.

The document uses page styles. What I find confusing is that the original ODT version has 91 custom page styles for 47 pages. The DOCX version has only 39 page styles, which still seems a lot. Page styles aren’t something I use much though, so I don’t know if this is normal or an artefact of the conversion. It doesn’t appear to use footers; instead there is text in a frame, fixed to the bottom of the page by inserting enough line breaks. Again, this may be from the conversion.

To put this in context, this is a policy document from a public body, originally issued for information in PDF and ODT formats. Both versions use columns, which makes reading onscreen a bit of a nightmare. It also includes tables split across columns and across pages, without repeating headers. All in all I get the impression that the original document layout is a bit kludged together. While that may be a feature of the conversion, given the mess the ODT file is in, I can’t see any benefit from providing it other than ticking an accessibility box, and if tested it would probably fail. I can’t see a screen reader making much sense of it.

A side topic from my original problem, but the UK government guidance on accessible formats is here: http://www.gov.uk/guidance/publishing-accessible-documents

Sent from Mail for Windows

From: Steve Edmonds
Sent: 04 July 2022 20:50
To: users@global.libreoffice.org
Subject: Re: [libreoffice-users] Badly formatted document

Is it possible it is an old document from an earlier version of LO? A while back the way LO anchored images (and frames) was changed. Images in tables were particularly affected, which also threw out text positioning, and I had to go through all my manuals and change my image anchoring to "as character".

On 05/07/2022 04:59, Ian Bertram wrote:
> I have been sent a graphic heavy document in ODT format. However it looks as
> if it has been badly converted from a pdf file. The layout is scrambled,
> headers don’t align properly and there are a host of other issues. It is also
> in columns. Is there a simple way to strip out everything bar the words? I
> have tried saving it as a txt file, but this loses a lot of the paragraph
> numbering and introduces other layout issues. Saving in rtf format is even
> worse.
>
> The best I have managed so far has been by converting all the text to a
> single style and removing the columns. All the graphics however overlap text
> and it is often very difficult to find the anchor point.
>
> Sent from Mail for Windows

--
To unsubscribe e-mail to: users+unsubscr...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
Privacy Policy: http://www.documentfoundation.org/privacy