1 #LyX 1.3 created this file. For more info see http://www.lyx.org/
14 \use_numerical_citations 0
15 \paperorientation portrait
18 \paragraph_separation indent
20 \quotes_language english
24 \paperpagestyle default
28 Conversion to OpenOffice
38 \begin_inset LatexCommand \tableofcontents{}
48 It is sometimes useful to be able to convert
52 files to word-processor formats, particularly
57 It is more straightforward to convert to
66 generally makes a good job of converting to
77 performs the conversion from
90 document must be exported to
99 documents that have not been produced from
103 to be converted, but sometimes that may not work as unknown assumptions
107 The programs are written in
111 for a PC running Linux or Windows®
121 is referred to in this document, read it as Windows®
130 This is for Release 4.
132 \begin_inset LatexCommand \ref{Changes}
136 for changes and bug fixes.
142 This may seem a bit like ringing someone up to find out their telephone
143 number! Although you will have installed the program if you are reading
144 this, there are one or two extra points to be made.
150 You need to have Python installed.
151 Linux users will probably find it included in their distribution.
154 Windows users can download it from e.g.
157 http://www.python.org/download/
160 The GUI dialogs require the
168 packages to be installed.
171 Configuration Completion
174 You need to run the script
178 to specify your default paths and construct the configuration files used
181 \begin_inset LatexCommand \ref{sub:Config.py}
193 There are two configuration files constructed by
198 \layout Subsubsection
204 \begin_inset LatexCommand \label{homeDir}
208 The first line contains your home directory path.
209 This is set to the folder within which you installed
221 \begin_inset LatexCommand \label{defPath}
225 The second line contains the full default path of your tex documents.
228 The third line contains the default path where picture files contained in
229 your documents are stored.
230 If the program cannot find a path for a picture it assumes this as the
234 The fourth line contains a printer set-up string for
243 as the document needs something.
244 It is probably best to change the printer, if you need it, from within
248 \layout Subsubsection
254 \begin_inset LatexCommand \label{tempPaths}
258 This contains the various platform-dependent paths and presets that are
259 needed internally by the Python programs.
270 \begin_inset LatexCommand \label{conversion}
274 This is done as follows:
293 without any parameters, or click the
297 bash script (which may be relocated, or you may create a link to it on
298 your Desktop; do not create a link directly to
302 as the default directory will be wrong); a file dialog will appear to select
314 from a shell console, specifying
322 extension but optionally including the full path:
330 a file selection GUI will not then appear
333 If the conversion is successful then a text window will appear saying so,
334 and the result will be an
345 directory, with the same name as your
350 Otherwise a brief error message will appear in a text window, unless a
351 program execution error has occurred, when you may need to inspect the
357 \begin_inset LatexCommand \ref{tempPath}
363 \begin_inset LatexCommand \ref{errors}
370 If you wish to obtain a
374 or other format file, open the
404 \begin_inset LatexCommand \ref{homeDir}
408 home directory, or on a shortcut to it if you have made one; a file dialog
409 will appear to select the
420 from the command prompt, specifying
428 extension but optionally including the full path:
435 a file selection GUI will not then appear
438 If the conversion is successful then a text window will appear saying so,
439 and the result will be an
444 \begin_inset LatexCommand \ref{homeDir}
448 home directory, with the same name as your
453 Otherwise a brief error message will appear in a text window, unless a
454 program execution error has occurred, when you may need to inspect the
460 \begin_inset LatexCommand \ref{tempPath}
466 \begin_inset LatexCommand \ref{errors}
473 If you wish to obtain a
477 or other format file, open the
489 \begin_inset LatexCommand \label{errors}
501 \begin_inset LatexCommand \ref{tempPath}
505 temporary folder for a brief error report.
508 Viewing the Converted Document
511 When you view the converted document with
515 it may reformat itself after a short time, and you may lose your viewing
516 position, especially if the document is large.
517 If you have placed the cursor then left-arrow/right-arrow should restore
521 The reason for this has not been discovered, but may arise from
522 \begin_inset Quotes eld
526 \begin_inset Quotes erd
529 in specifying the paragraph formats more completely in XML than is perhaps
530 necessary (to save a lot of extra mucking about when parsing).
533 Before submitting for publication you can make a trivial amendment and re-save
534 the document, which overcomes this (that is, if you have not had to make
535 other amendments anyway :-) )
541 This project is only partially complete, but has reached a useful stage.
546 UserGuide part 1 can be converted successfully, which covers a lot of ground.
549 Features not implemented
556 features have not been implemented:
559 document layout: some features are omitted (see
560 \begin_inset LatexCommand \ref{sec:Styles}
574 customised list labels
577 language (English assumed)
581 paragraph layout: only label widths have been implemented
584 character layout: mostly implemented (see
585 \begin_inset LatexCommand \ref{sec:Styles}
589 ), but colour, language and toggling are not implemented
592 document paging has not been calculated, so
596 is not implemented, page references are converted into chapter or section
597 references, and forced page breaks are not implemented
600 AMS standards have not been implemented as presumably that august body prefers
613 form rather than in word-processor format
620 \begin_inset LatexCommand \label{mathcal}
640 fonts are recognised by
651 Algorithm and Floatfit Figure
666 equation array lines are only labelled if referred to by cross-references
669 sideways text in tables can only be used in table header rows,
670 and only for the whole row; this is because
674 does not permit otherwise (as far as the author has been able to discover)
685 (a suitable alternative in
692 floats are not implemented by
696 , so figure and table floats are put in a paragraph-aligned frame to keep
697 the caption and object together
700 margin notes are converted to footnotes
709 Other implementation aspects:
712 tables are not allowed to be split across pages
719 environment protected blanks are converted into soft ones as otherwise
720 the word-processor can split long lines in the middle of words.
721 However a protected blank at the start of a line is retained for obvious
725 The following graphics formats can be correctly sized:
733 jpeg, pcx, png, ps, xbm
737 other types are sized as 12 x 9 cm
739 this is because no size information is available in the
763 images appear as a sized box including details of the image, and will print
764 correctly on a Postscript® printer.
765 Otherwise convert the image to another format before including it.
768 If an object is defined which is not a standard
772 one the program does its best but may produce funny results as it cannot
773 guess what is really wanted.
774 This applies in particular to
778 documents not produced by
783 The first program is adaptive and adds unrecognised objects to its dictionaries.
787 \begin_inset LatexCommand \label{Changes}
791 Changes between Releases
797 None, but this was for Linux users only.
803 The Python programs were made platform-independent and extra configuration-support
804 files were introduced for Linux and Windows (XP by default)
810 The following bugs were fixed
813 Table column formats now recognise >{centering}, >{raggedleft} and >{raggedright
817 Paragraph Spacing (see
818 \begin_inset LatexCommand \ref{spacing}
822 ) now works properly (for sizes specified in
841 Table and figure floats are now correctly sized
844 Small graphic items are now properly sized
847 Support for Windows 98 has been included
853 GUI support has been introduced.
862 packages being available on your computer, which is generally true.
873 has been relaxed in so far as
881 , which means you only have to run
885 to convert a document.
888 All the final zipping and clean-up previously done in Bash or .bat script
889 files is now incorporated into
893 , so those files are no longer needed.
900 now enables the configuration files to suit your platform to be generated
904 Thus it is no longer necessary to distinguish between Windows XP and Windows 98.
910 The conversion is performed by two
914 programs, and you need to have the
919 They are both platform-independent.
929 file and stores the result in a pickled dictionary located in your temporary
931 The idea is to parse into a (large!) Python dictionary so that all similar
932 objects are grouped together.
933 Then other dictionaries can be used to convert styles, formats and special
934 characters to suit the target file type.
935 In other words this program makes no assumption about the target file type,
936 and its results can be used to convert to any other file type, given a
937 suitable program and supporting dictionaries to implement that.
938 It contains options to print out its results either by direct calls from
939 the terminal or from a supervising program of some kind.
940 It generates the following files in your temporary directory for
945 \begin_inset LatexCommand \label{files}
951 \labelwidthstring MMMMMM
957 contains the pickled dictionary which is the principal result used in the
958 second stage of conversion
960 \labelwidthstring 00.00.0000
966 contains headings for use by a supervisor program (not supplied
967 \begin_inset LatexCommand \ref{c_progs}
971 ) to display the results
973 \labelwidthstring 00.00.0000
979 contains a report of the object types found and incorporated in
983 , again for use by a supervisor program to inspect the results.
986 The latter two files may be inspected directly if desired, but see
987 \begin_inset LatexCommand \ref{normal}
992 The first one is in Python pickled form and is not suitable for direct
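The idea of grouping similar objects in a dictionary and storing it in pickled form can be sketched as follows. This is only an illustration using the standard pickle module: the function names, the object types and the file name are hypothetical and are not taken from the supplied programs.

```python
import os
import pickle
import tempfile


def group_by_type(objects):
    """Group (type, content) pairs so that similar objects sit together."""
    grouped = {}
    for obj_type, content in objects:
        grouped.setdefault(obj_type, []).append(content)
    return grouped


def save_parsed(grouped, path):
    """Write the dictionary to disk in pickled (binary) form."""
    with open(path, "wb") as fh:
        pickle.dump(grouped, fh)


def load_parsed(path):
    """Read a pickled dictionary back, as a second-stage program might."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


if __name__ == "__main__":
    # Hypothetical parse result: two sections and one equation.
    parsed = group_by_type([
        ("section", "Introduction"),
        ("equation", "x = 1"),
        ("section", "Status"),
    ])
    target = os.path.join(tempfile.gettempdir(), "parsed_example.pkl")
    save_parsed(parsed, target)
    print(load_parsed(target))
```

Because the pickled file is binary, it is read back with pickle.load rather than opened in a text editor.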
999 This program reads in
1003 and produces the final
1008 It requires the following dictionaries and
1014 templates as supplied:
1016 \labelwidthstring MMMMMMMM
1018 EndOffice converts compound object end tags
1020 \labelwidthstring MMMMMMMM
1022 EscOffice converts escape characters
1024 \labelwidthstring MMMMMMMM
1026 FontSizeOffice converts font sizes
1028 \labelwidthstring MMMMMMMM
1030 FontStyleMathsOffice converts styles for equations
1032 \labelwidthstring MMMMMMMM
1034 FontStyleOffice converts text styles
1036 \labelwidthstring MMMMMMMM
1038 StartOffice converts compound object start tags
1040 \labelwidthstring MMMMMMMM
1044 charCount, mathsSizes, mathsSymb
1050 contain sizing information
1052 \labelwidthstring MMMMMMMM
1054 OfficeXML is a folder containing the template
1058 files from which the final document is constructed
1062 \begin_inset LatexCommand \label{sub:Config.py}
1069 Available in Release 4, this script provides a GUI-based method of creating
1070 the configuration files
1078 , which are thus not supplied.
1079 You specify the three paths for:
1083 \begin_inset LatexCommand \label{tempPath}
1087 the location for temporary files
1090 the default path for your tex files, which is used by
1094 as the initial folder in its file dialog box
1097 the default path for graphics to be included in the document
1100 For each path a path-selection dialog is presented to navigate to the desired
1101 folder: double-click on the folder and close the widget; a confirmation
1102 button dialog will then be presented with three buttons:
1109 to accept the result
1116 to reject it, in which case the path-selection dialog is presented again
1123 which closes the program leaving any previously constructed configuration
1124 files undisturbed i.e.
1125 this script may be re-run at any time to change the defaults.
1128 The configuration files are constructed from this information.
1129 The platform is deduced from the type of file path delimiter character
1130 encountered; if there is any doubt a radio-button dialog will be presented
1131 to select the platform.
1132 After configuration, the only significance of the platform type is the type
1133 of file path delimiter used by the programs.
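Deducing the platform from the path delimiter might look like the minimal sketch below. The function name and the returned labels are assumptions made for illustration; the supplied script may decide differently, and a result of None stands for the doubtful case in which the radio-button dialog would be shown.

```python
def deduce_platform(path):
    """Guess the platform from the delimiter found in a folder path.

    Returns "Windows", "Linux", or None when the path gives no clear answer
    (no delimiter at all, or both kinds present).
    """
    has_backslash = "\\" in path
    has_slash = "/" in path
    if has_backslash and not has_slash:
        return "Windows"
    if has_slash and not has_backslash:
        return "Linux"
    return None  # doubtful: the user would have to choose


if __name__ == "__main__":
    for example in ("/home/user/tex", "C:\\Documents\\tex", "tex"):
        print(example, "->", deduce_platform(example))
```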
1139 If an error occurs in either of the Python programs, the file
1143 will exist in your temporary directory and any files constructed as part
1144 of the process up to the error will remain in that folder.
1145 A brief note of the error may be reported on the terminal, depending on
1146 where the error occurred, which will also be found in a file named
1150 in the temporary folder.
1151 An even briefer one will be found there in the file
1155 (which is really created for the benefit of the script and/or supervisor
1156 programs to determine unambiguously that an error has occurred; it is deleted
1158 If the error was a file error the offending file name will be reported.
1164 The author can be contacted at nctsm@safe-mail.net
1167 If you would like to be informed of upgrades and bug fixes, please tell
1172 \begin_inset LatexCommand \label{sec:Styles}
1182 The following Layout/Document options have been implemented:
1184 \labelwidthstring 00.00.0000
1187 size all page size options are available; default is treated as A4
1189 \labelwidthstring 00.00.0000
1198 may be selected instead of
1202 \labelwidthstring 00.00.0000
1226 the only units recognised are
1238 ches; any other setting will default to
1243 if the inner margin differs from the outer and two-sided paper has been
1244 selected then pages will be mirrored correctly
1246 \labelwidthstring 00.00.0000
1249 size all are recognised; the default is assumed to be
1253 \labelwidthstring 00.00.0000
1255 font the default font is
1259 times, palatino, helvetica, avant
1265 [New Century Schoolbook] and
1277 are not and default to
1283 we have gained a font in
1287 by assuming the default is
1291 \labelwidthstring 00.00.0000
1294 spacing all are recognised, default is single-spacing; custom settings
1295 are assumed to be specified by integers
1297 \labelwidthstring 00.00.0000
1303 Changes to other settings will not be recognised and the defaults are assumed:-
1308 \SpecialChar \menuseparator
1314 \SpecialChar \menuseparator
1320 \SpecialChar \menuseparator
1326 \SpecialChar \menuseparator
1332 \SpecialChar \menuseparator
1338 \SpecialChar \menuseparator
1344 \SpecialChar \menuseparator
1350 \SpecialChar \menuseparator
1366 The Label Width may be set
1370 \begin_inset LatexCommand \label{spacing}
1378 option only, which may be specified in
1387 The minimum spacing corresponds to 2.1mm
1390 Otherwise the defaults are assumed.
1396 The following options are implemented in Layout/Character:
1398 \labelwidthstring 00.00.0000
1402 \labelwidthstring 00.00.0000
1406 \labelwidthstring 00.00.0000
1416 \labelwidthstring 00.00.0000
1418 Colour black assumed
1420 \labelwidthstring 00.00.0000
1422 Language English assumed
1424 \labelwidthstring 00.00.0000
1426 Toggling defaults assumed
1429 More on Dictionaries
1432 The dictionaries have stylised names as it was initially envisaged that
1433 other programs apart from
1437 could be written to convert to other formats.
1438 During early testing a plain text version called
1447 It required different dictionaries from
1451 , and template dictionaries were used as the starting point for constructing
1452 sets of customised dictionaries.
1453 Because of this open approach
1457 reads in some dictionaries based on its own name, as it too was derived
1458 from a template program.
1459 However it is unlikely that other conversion programs will be needed or
1460 desired, so none of this stuff has been supplied, although the first Python
1461 program remains in generalised form.
1467 Equation objects have to be corrected for
1468 \begin_inset Quotes eld
1472 \begin_inset Quotes erd
1475 syntax, as otherwise
1480 These may not be errors in fact as it may genuinely be desired to print
1482 \begin_inset Formula $\sum$
1489 expects an argument after the symbol and prints an error indicator.
1490 To avoid this, {} is inserted after
1491 \begin_inset Formula $\sum$
1494 which does not show and satisfies
1499 There are numerous such potential corrections to be made, and each equation
1500 has to be parsed for them.
1501 The trickiest is to correct for unmatched brackets of any kind, which
1502 can be useful for tensor equations such as
1503 \begin_inset Formula $x_{(1}=\left\{ \begin{array}{cc}
1504 \alpha_{0} & \alpha_{0}^{s}\end{array}\right\} $
1508 So far unmatched left brackets (of any kind) are corrected (the unmatched
1509 bracket is enclosed in ordinary double quotes which do not show), but the
1510 syntax analyser cannot (yet) correct for unmatched right brackets.
1511 If you need them, enclose them in ordinary double quotes.
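The treatment of unmatched left brackets can be illustrated with a small stack-based sketch. It is not the parser used by the programs: the function name is hypothetical, it works on a plain string, and it ignores escaped brackets and multi-character delimiters. Like the behaviour described above, it leaves unmatched right brackets alone.

```python
PAIRS = {"(": ")", "[": "]", "{": "}"}
OPENER_FOR = {close: open_ for open_, close in PAIRS.items()}


def quote_unmatched_left(expr):
    """Enclose every unmatched left bracket in ordinary double quotes."""
    stack = []      # positions of left brackets still waiting for a partner
    unmatched = []  # positions of left brackets known to have no partner
    for pos, ch in enumerate(expr):
        if ch in PAIRS:
            stack.append(pos)
        elif ch in OPENER_FOR:
            # Look down the stack for the nearest opener of the same kind.
            for depth in range(len(stack) - 1, -1, -1):
                if expr[stack[depth]] == OPENER_FOR[ch]:
                    # Openers above it were skipped over, so they are unmatched.
                    unmatched.extend(stack[depth + 1:])
                    del stack[depth:]
                    break
            # No opener found: an unmatched right bracket, left untouched.
    unmatched.extend(stack)
    marked = set(unmatched)
    return "".join('"' + ch + '"' if pos in marked else ch
                   for pos, ch in enumerate(expr))


if __name__ == "__main__":
    print(quote_unmatched_left("x_{(1}"))
```

Scanning down the stack, rather than checking only its top, lets the closing brace in the example find its partner even though an unmatched round bracket lies between them.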
1514 I have noticed a problem when
1522 :- including a reserved text word like 'and' in an equation can cause the
1523 rest of the equation to be set to Roman style rather than just that word;
1524 this causes confusion as
1530 usually means render in non-italic Roman style until its scope ends, which
1535 implies putting an ordinary double quote " at the start and end of the
1536 scope, ruining the equation if the scope is wrong.
1537 I have left this interpretation of
1543 in place as I'm not sure what else to do, so beware! The same applies to
1568 which in any case cannot be rendered
1569 \begin_inset LatexCommand \ref{mathcal}
1577 \begin_inset LatexCommand \label{confFiles}
1584 The Python programs require one parameter, which is the name of the file
1585 to be converted, without the
1590 If this is supplied without a path then the default path (cf.
1592 \begin_inset LatexCommand \ref{defPath}
1596 ) is assumed, otherwise the full path supplied will be used instead.
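The rule for the parameter can be sketched as below. The function name is hypothetical, and the extension is passed as an argument with ".tex" assumed as its default purely for illustration.

```python
import os


def resolve_document(name, default_path, extension=".tex"):
    """Return the full path of the file to be converted.

    name is given without its extension. If it carries no path, the
    default path is assumed; otherwise the path supplied is used as it is.
    """
    if os.path.dirname(name):
        return name + extension
    return os.path.join(default_path, name + extension)


if __name__ == "__main__":
    print(resolve_document("UserGuide", "documents"))
    print(resolve_document(os.path.join("work", "UserGuide"), "documents"))
```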
1601 script is used this is supplied as its parameter.
1604 The programs use the home directory (cf.
1606 \begin_inset LatexCommand \ref{homeDir}
1610 ) to control where to find things.
1613 The programs use the paths
1614 \begin_inset LatexCommand \ref{tempPaths}
1622 file to control where temporary files are to be created during the conversion
1624 They must be specified as follows
1632 where temporary files are constructed
1668 zz|ConvLaTex|Dictionaries|
1674 is the document sub-folder in your home directory
1677 Line 6: the path separator used by your platform:
1690 Line 7: the separator for
1699 Line 8: the separator for