Two pointers

As you know, this is where I take out my frustrations by having a good grumble about life, the universe, and everything. Yesterday, in a meeting, we learned of two developments coming to the academic world. In neither case is it an unmixed blessing. I should add that this is in no way a grumble against my colleagues, who foresightedly and openmindedly have told us about what is heading our way so that we can be prepared.

The first, and mostly harmless, concerns publication of research data. No longer will it be enough to say “Contact the author for the data/programs used in this paper”; the data must be available at a specific link published in the paper. It is even claimed that such statements will still be required in papers with no data at all, such as pure mathematics papers.

I do not know whether this is being forced on the humanities as well as the sciences, or whether it is just an anomaly of mathematics being misclassified as a science. In any case, I hope that someone can provide us with some boilerplate text for the purpose. (One of my colleagues suggested something along the lines of “No data were hurt in the research underlying this pure mathematics”.)

But how do you publish “no data”? If mathematics is founded on set theory, a file containing the empty set symbol might do the job (an empty file might be problematic). But perhaps the bureaucrats might be happier with a spreadsheet with no entries.

Journals already require publication of funding information; many specify a conflict of interest statement; some need an ethical approval statement. I fear that in future a pure mathematics publication will resemble a beautiful work of art in a standard frame designed by a committee.

The second, more serious because it addresses a real problem, concerns accessibility of lecture notes. Now this doesn’t mean, as you might first think, that we will have to dumb down our lecture notes to make them accessible to creatures of restricted intelligence. Instead, it seems that there is a PDF reader program which reads aloud the contents of a PDF file for disabled students. (No, I am not talking about a lecturer!) It seems that PDF documents produced via LaTeX are not well handled by this piece of software, and so we are to blame and must suffer.

I have spent a lot of time during my career trying to make lecture notes as clear and beautiful as possible. I have always taken the view of conventional typographers, that typesetting is a window through which you see the beautiful scenery, and the best typesetting is the one which provides the most unobtrusive window pane. Now we are told that the programs to fix this problem (the best of which is called “bookdown”, if I caught the name right) seriously degrade the LaTeX typesetting in the interests of accessibility. So the majority of students will be penalised to help the minority. Is this good? This is a moral question I can’t judge.

With a bit less than twice the work, one could produce two files, one a carefully crafted LaTeX file, the other a reader-friendly file. The downside would be maintenance; changes would have to be made in two places, and the files could very easily get out of sync.

Let me interpolate two speculations here; I have no evidence for either of these.

First, it seems highly likely that this PDF reader is optimized for files produced by Word, as a hangover of the Microsoft hegemony. If this is so, then we are being punished for using a better product than Microsoft can produce.

Second, this measure (entirely arbitrary, if my first speculation is correct) will delight the bureaucrats. Here is a measure, a number produced by computer and hence completely objective, of our concern for handicapped students. How long until it is incorporated into staff appraisals and disciplinary measures?

Now I have a suggestion here. The great computer scientist Donald Knuth, in the days before the term “web” became synonymous with “internet” in the public mind, devised a system for producing programs and documentation together, which he called Web. A Web document could be fed to two preprocessors called Tangle and Weave, though I don’t remember which was which. One of them produced as output a Pascal program (Knuth’s preferred language at the time; I believe there is a C version of Web now). The other output a TeX document which consisted of the program with documentation. Knuth published two major programs, TeX and METAFONT, in this form as books.

Surely we could have a much simpler system here. An input file (which would superficially resemble LaTeX, to ease the learning curve) could be processed in two ways: the output of one would be a reader-friendly PDF, or means to produce one, such as a bookdown file; the other would be a non-crippled LaTeX file. Both could be made available to students, and maintenance would only need to be done in one place.
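To make the suggestion concrete, here is a minimal sketch in Python of what such a two-way preprocessor might look like. Everything in it is an assumption made for illustration: the markers `%print:` and `%reader:`, the function name `split_source`, and the sample input are invented, and it does not call bookdown or any existing tool. It shows only the single-source idea: unmarked lines go to both outputs, marked lines go to one.

```python
"""Sketch of a single-source preprocessor with two outputs (hypothetical markers)."""

# Assumed markers, invented for this sketch. Both begin with '%', so the
# shared source file is itself still valid LaTeX (the marked lines are comments).
PRINT_ONLY = "%print:"    # line kept only in the carefully typeset LaTeX output
READER_ONLY = "%reader:"  # line kept only in the reader-friendly output


def split_source(source: str) -> tuple:
    """Return (latex_text, reader_text) generated from one shared source."""
    latex_lines, reader_lines = [], []
    for line in source.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(PRINT_ONLY):
            latex_lines.append(stripped[len(PRINT_ONLY):].lstrip())
        elif stripped.startswith(READER_ONLY):
            reader_lines.append(stripped[len(READER_ONLY):].lstrip())
        else:
            # Unmarked lines are shared by both outputs.
            latex_lines.append(line)
            reader_lines.append(line)
    return "\n".join(latex_lines), "\n".join(reader_lines)


# A made-up fragment of lecture notes used as sample input.
SOURCE = r"""\section{Roots of a quadratic}
%print: \vspace{2ex}
%reader: The roots are given by the usual formula.
$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
"""

latex_text, reader_text = split_source(SOURCE)
```

The reader-friendly text would still have to be passed to whatever tool produces the accessible file; the point of the sketch is only that changes are made in one place.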

Ultimately, like Hamlet, I know where the exit door is, and I know it is not locked. When my previous university brought in draconian rules for staff performance and appraisal, including restrictions on seminar attendance, I decided that rather than stay and fight I would retire and live on my pension; and so I would be doing, had not St Andrews come to the rescue with the offer of a half-time position. I still have the option of retiring and living on my pension. But I am in such a good, friendly and supportive department that I don’t expect this to become necessary, and I would take that door only with great reluctance.

About Peter Cameron

I count all the things that need to be counted.

14 Responses to Two pointers

  1. Jon Awbrey says:

    So hard to know who to follow, Sisyphus or Coriolanus

  2. Joshua Paik says:

    Dear Peter,

    I imagine it is sufficient to convert your text to PDF-A. Then the work is x+epsilon, not 2(x-epsilon). That is all the due diligence you should be expected to do. If someone in the humanities wants to check, frankly they won’t be able to tell the difference between random Greek-sounding symbols and what’s actually there.


  3. Anton Cox says:

    My understanding is that the problem with PDFs produced with LaTeX is that formulas are not encoded in a way that respects their logical substructure. If you have the formula for the roots of a quadratic, the automatic reader needs to know that one bit is the numerator (and that part of that is a square root incorporating certain other parts) and another bit is the denominator. MathJax, by contrast, apparently does construct the formula in a structurally consistent way, and is fine with automatic readers (of which there are a number). So I don’t think this is anything to do with Microsoft.

    What bookdown does is rather close to what you suggest – it uses a combination of LaTeX (using MathJax) and markdown and can then be compiled either as a pdf, a word document, an individual webpage, or a suite of webpages. In fact it is even closer to weave and tangle than that as it is written in R and can also contain either code fragments or run and display the results of code fragments when compiled. Unfortunately it only uses vanilla LaTeX – I think you can use packages etc for the pdf output, but then they do not work with the other formats.

    I agree it is not great, but I ended up converting my latex files into markdown files and making both the pdfs and the html versions available. Mostly it was cut and paste, but the formatting of non-maths (such as tables and sections and lists, or referencing) is rather different. An example of the non-pdf version is This was a 140 page pdf, but converting it was not so terrible (and they are sufficiently similar that maintaining both in parallel has proved quite easy). And although this example is not very Maths heavy, my other teaching material also works fine with a lot more mathematical content.

  4. I’m wondering what the relevant bureaucracy thinks of .tex source code. This is probably the most accessible format around, but I’m not sure whether the relevant tools exist and are known to the offices. And what will they say about pictures?

  5. Dima says:

    One can compile LaTeX to HTML; such output is accessible (perhaps after adding a couple of extra lines to the HTML file). Cf

    I don’t know why they only talk about using LaTeXML and Pandoc, as TeX Live comes with its own LaTeX to HTML compiler, htlatex.

  6. Dima says:

    htlatex, documented here: can produce maths in MathML, perhaps that’s what the document reader needs.

  7. Years ago, when I was a subscriber to Tugboat, the journal for the TeX community, there was a lot of discussion about software that would read aloud the contents of a TeX file. This seems more sensible than reading the PDF; for example, fractions are cited as being difficult in PDF, but in LaTeX the command is simply \frac{x}{y}, the numerator and denominator in the order you need to read them. Indeed, I bumped into Rick Thomas in town today, and he told me about a severely visually disabled Masters student who insisted on having LaTeX files rather than PDF. I wonder whether this has been considered?
    Off topic, some years ago I reviewed the first issue of a new mathematics journal for a librarians’ publication. This was just as the use of author-supplied LaTeX by publishers was starting to catch on, and this journal used that. I felt I had to point out that there were disadvantages as well as advantages; one paper in the journal had put a big complicated fraction in the exponent. This would have been better done either with a solidus or with \exp.

  8. Tony Forbes says:

    Is this related to a recent announcement by Wiley? Submissions to J. Combinatorial Designs must now be in Microsoft WORD (!). PDF files beautifully formatted by LaTeX are no longer accepted.

  9. Jim Hefferon says:

    There is a lot of interest in producing accessible PDFs, and people are making progress. The TeX Users Group has sponsored a number of projects, and there have been a lot of talks at meetings. One place to look is at TUG’s journal page, which has a keyword for the topic:

  10. Scott Harper says:

    One of my papers ends with a publisher’s note stating that they “remain neutral with regard to jurisdictional claims in published maps”. Needless to say, the only maps in the paper are those defined between finite sets.

    • How long will it be until all this stuff takes up more space than the contents of a paper? Any bets? (This is a serious question; two of my papers were less than 2 pages long.)

    • Scary! Please read this; authors, please publish in diamond OA journals if you can, and diamond OA editors, please don’t even be tempted to go down the surveillance route!
