Creating multi-language documentation with Sphinx

In a previous article I covered how CakePHP would potentially be moving to Sphinx for the 2.0 documentation. Some of the other CakePHP developers and I have been working on this option to see if it has any legs. It turns out that Sphinx is actually a pretty great tool. It works well out of the box for generating single-language docs, but multi-language docs are poorly documented and require some digging.

Thankfully, the Bazaar project also has documentation in several languages. I used their implementation as the basis for our own multi-language documentation.

Splitting up the content

We decided very quickly that each language would be in a separate directory, and each version of the documentation would be a separate branch. This allows you to easily generate all the documentation for all the languages for a single version, without resorting to branch shenanigans. This resulted in a repository that looks like:

    ├── Makefile
    ├── _templates
    │   └── // custom templates here
    ├── build
    ├── config
    │   ├── __init__.py
    │   └── all.py
    ├── en
    │   ├── Makefile
    │   ├── _static
    │   ├── .. rest of the documentation here.
    │   ├── conf.py
    │   └── index.rst
    └── es
        ├── Makefile
        ├── _static
        ├── conf.py
        ├── .. rest of the documentation here
        └── index.rst

Each language is a top-level directory, alongside the templates and documentation-wide configuration files. Having languages as top-level directories makes sense if you think about the final URLs we might want, such as http://book.cakephp.org/2.0/en/index.html. With version and language as the first path segments, it’s easy to switch languages and versions.

The config folder contains an all.py file, which holds the generic configuration used across the various translations of the documentation. Each language also has a conf.py file that contains language-specific configuration values. Finally, templates are shared, as all translations need to look the same.
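To make the sharing concrete, here is a minimal sketch of how a language's conf.py can pull in config/all.py with from config.all import * plus a little path manipulation. The setting values, the '../_templates' path, and the load_conf helper are assumptions for illustration, not the actual CakePHP files. The script recreates a tiny version of the layout in a temporary directory and then executes conf.py from inside its own folder, which is roughly how Sphinx loads it.

```python
import os
import sys
import tempfile
import textwrap

# Shared settings, as they might appear in config/all.py (values are made up).
ALL_PY = textwrap.dedent("""
    project = 'Example Docs'
    templates_path = ['../_templates']
    html_static_path = ['_static']
""")

# Language-specific settings, as they might appear in en/conf.py.
EN_CONF_PY = textwrap.dedent("""
    import os
    import sys

    # Make the repository root importable so the `config` package is found.
    sys.path.insert(0, os.path.abspath('..'))

    # Pull in every setting shared by all translations.
    from config.all import *

    # Language-specific values are defined (or overridden) after the import.
    language = 'en'
""")


def write(path, text):
    """Write text to path, creating parent directories as needed."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, 'w') as handle:
        handle.write(text)


def load_conf(conf_dir):
    """Execute conf.py from inside its own directory and return its names."""
    namespace = {}
    previous = os.getcwd()
    os.chdir(conf_dir)
    try:
        with open('conf.py') as handle:
            exec(compile(handle.read(), 'conf.py', 'exec'), namespace)
    finally:
        os.chdir(previous)
    return namespace


# Recreate a tiny version of the repository layout in a scratch directory.
root = tempfile.mkdtemp()
write(os.path.join(root, 'config', '__init__.py'), '')
write(os.path.join(root, 'config', 'all.py'), ALL_PY)
write(os.path.join(root, 'en', 'conf.py'), EN_CONF_PY)

settings = load_conf(os.path.join(root, 'en'))
print(settings['project'], settings['language'], settings['templates_path'])
```

Because templates_path is resolved relative to the directory holding conf.py, a shared value like '../_templates' points every language at the same top-level templates folder.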

Makefiles and building

Sphinx uses Makefiles by default, and while make is fairly old, it is still a great tool for build tasks, especially simple ones. I decided to use one master Makefile plus an additional Makefile for each translation. This way you can run make for one language, or for all of them, pretty easily. The top level Makefile looks like:

    # Makefile for building all the docs at once.
    # Inspired by the Makefile used by bazaar.
    # http://bazaar.launchpad.net/~bzr-pqm/bzr/2.3/

    PYTHON = python

    .PHONY: all clean html latexpdf epub htmlhelp

    # Dependencies to perform before running other builds.
    SPHINX_DEPENDENCIES = \
        es/Makefile

    # Copy-paste the english Makefile everywhere it's needed.
    %/Makefile : en/Makefile
        $(PYTHON) -c "import shutil; shutil.copyfile('$<', '$@')"

    # Make the HTML version of the documentation with correctly nested language folders.
    html: $(SPHINX_DEPENDENCIES)
        cd en && make html LANG=en
        cd es && make html LANG=es

    htmlhelp: $(SPHINX_DEPENDENCIES)
        cd en && make htmlhelp LANG=en
        cd es && make htmlhelp LANG=es

    epub: $(SPHINX_DEPENDENCIES)
        cd en && make epub LANG=en
        cd es && make epub LANG=es

    latexpdf: $(SPHINX_DEPENDENCIES)
        cd en && make latexpdf LANG=en
        cd es && make latexpdf LANG=es

    clean:
        rm -rf build/*

For each task, I manually enumerate the languages and call the same task on every translation, passing the language in as a flag. A precondition of each build is that the English Makefile is copied into every translation, so I don’t have to copy and paste Makefile changes around.

The translation Makefiles look fairly similar to the standard Sphinx Makefiles, with a few additions for languages. The most important changes look like:
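For anyone who would rather drive the builds from a script, the same two steps (sync the English Makefile, then run one make target per language) can be sketched in Python. This is only an illustration of what the master Makefile does, not something we use; the function names and the LANGUAGES tuple are hypothetical.

```python
import os
import shutil
import subprocess

# The same set of languages the master Makefile enumerates by hand.
LANGUAGES = ('en', 'es')


def sync_makefiles(root, source_lang='en', languages=LANGUAGES):
    """Copy the source language's Makefile into every other translation."""
    source = os.path.join(root, source_lang, 'Makefile')
    copied = []
    for lang in languages:
        if lang == source_lang:
            continue
        target = os.path.join(root, lang, 'Makefile')
        shutil.copyfile(source, target)
        copied.append(target)
    return copied


def build_commands(target, languages=LANGUAGES):
    """Return (directory, argv) pairs matching `cd <lang> && make <target> LANG=<lang>`."""
    return [(lang, ['make', target, 'LANG=' + lang]) for lang in languages]


def build_all(root, target):
    """Sync the Makefiles, then run the make target once per language."""
    sync_makefiles(root)
    for directory, argv in build_commands(target):
        subprocess.check_call(argv, cwd=os.path.join(root, directory))
```

Calling build_all('.', 'html') from the repository root would then behave like make html in the master Makefile, assuming make and Sphinx are installed.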

    ALLSPHINXOPTS   = -d $(BUILDDIR)/doctrees/$(LANG) $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .

    # more stuff

    html:
        $(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html/$(LANG)
        @echo
        @echo "Build finished. The HTML pages are in $(BUILDDIR)/html/$(LANG)."

By re-targeting the build and doctree directories, all the languages can build into one destination. This saves the additional step of collecting and collating the various builds.

So there you have it: a reasonably straightforward way of generating multi-language documentation with Sphinx.
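To see what the re-targeting amounts to, here is a small sketch that assembles the sphinx-build arguments the html rule would produce for each language. The sphinx_build_args helper is hypothetical, and the '../build' default assumes BUILDDIR in each language's Makefile points at the shared top-level build folder, which the excerpt above does not show.

```python
def sphinx_build_args(builder, lang, build_dir='../build'):
    """Build the sphinx-build argument list for one language.

    Mirrors the Makefile rule: doctrees go to <build_dir>/doctrees/<lang>
    and the output goes to <build_dir>/<builder>/<lang>.
    """
    doctrees = f'{build_dir}/doctrees/{lang}'
    output = f'{build_dir}/{builder}/{lang}'
    # -b selects the builder and -d sets the doctree cache directory;
    # '.' is the source directory, i.e. the language folder itself.
    return ['sphinx-build', '-b', builder, '-d', doctrees, '.', output]


for lang in ('en', 'es'):
    print(' '.join(sphinx_build_args('html', lang)))
```

Each language gets its own doctree cache and its own output folder under the same build directory, so the builds never overwrite each other.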

Comments

This solution looks very nice, clean and organized. I went for git based documentation too, but ended up with a solution that couldn’t take care of separate versions and different languages.

I am definitely going to follow your approach.

Fahad on 27/2/11

I have been using Sphinx for 5 years, especially for indexing. Very good choice, and glad to hear it will be in Cake someday.

vangel on 5/5/11

The thing is, you aren’t using the all.py file, nor sharing the themes, and the calling method of “make html” uses the local conf.py instead of the all.py file.

Can you write up how you shared the themes and config?

Oscar on 11/7/11

Oscar: The Makefile goes through each translation and calls make html on each language. You are right in that this uses the local conf.py file for each language. However, each language’s conf.py file includes the all.py file using from config.all import * and some path manipulation. The templates are shared by defining the paths correctly. It’s not as complicated as it looks :)

mark story on 14/7/11

That was what I missed, the import in the conf.py files. Thanks! Now I have my automated multilanguage documentation :)

I’ll possibly write a post on my blog (blog.oscarcp.com) with the instructions (for future reference) and link back here. Do you mind?

Oscar on 19/7/11

You should simplify your Makefile:


    # Makefile for building all the docs at once.

    langs = en es

    html: $(foreach lang,$(langs),html-$(lang))

    html-%: $(SPHINX_DEPENDENCIES)
        @echo "cd $* && make html LANG=$*"


hector on 20/7/11

hector: Thanks for the tip :D

mark story on 20/7/11

Sphinx 1.1 provides a bit more gettexty way of doing this – http://sphinx.pocoo.org/latest/intl.html

LRN on 17/9/11

That looks good, but say the ‘master translation’ (for want of a better word) was edited or changed, how would that be propagated through to the translators?

phill on 23/8/12

Hello!

If I am writing descriptions in my code like
"""
database model
"""
how do I distinguish what should go to the English docs and what to, e.g., the Spanish ones? Is it possible?

Archarachne on 23/8/13

Archarachne: For CakePHP we did not re-use any of the code comments, as they were English only. If you wanted to generate docs from your code I think you’d be stuck with only one language.

mark story on 24/8/13
