39 SWIG and Doxygen Translation

This chapter describes SWIG's support for translating Doxygen comments found in interface and header files into a target language's normal documentation language. Currently only Javadoc and Pydoc are supported.

39.1 Doxygen Translation Overview

The Doxygen Translation Module of SWIG is an ongoing effort that began as a Google Summer of Code proposal in Summer 2008. It adds an extra layer of functionality to SWIG, allowing automated translation of Doxygen formatted comments from input files into a documentation language more suited to the target language. Currently this module only translates into Javadoc and Pydoc for the SWIG Java and Python Modules, but other target languages are to be added in time.

Questions about running SWIG are best answered by the SWIG Basics chapter and by the chapters for the target language modules (for now, only Java and Python). The behaviour of this functionality is wildly unpredictable if the interface file is not valid to begin with!

39.2 Preparations

To make use of the comment translation system, your documentation comments must be in properly formatted Doxygen. They can be present in your main interface file or any header file that it imports. You are advised to check that your comments compile properly with Doxygen before you try to translate them. Doxygen itself is a more thorough tool and can give you better feedback for correcting any syntax errors that may be present. Please look at Doxygen's Documenting the code for the proper comment format specifications. However, SWIG's Doxygen parser will still point out most of the errors and warnings found in comments (like unterminated strings or missing ending tags).

/*! This is describing class Shape
 \author Bob
 */

class Shape {

Currently, all of the Doxygen comment styles are supported (see Documenting the code). Here they are:

/**
 * Javadoc style comment, multiline
 */
/*!
 * Qt-style comment, multiline
 */
/**
 Any of the above, but without intermediate *'s
 */
/// Single-line comment
//! Another single-line comment

Also, any of the above with '<' added after the comment-starting symbol, like /**<, /*!<, ///<, or //!<, will be treated as a post-comment and will be assigned to the node before the comment.
Any number of '*' or '/' characters in a Doxygen comment is considered to be a separator and is not included in the final comment, so you may safely use comments like /*********/ or //////////.
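The opener rules above can be illustrated with a small Python sketch. It is only a model of the rules as stated in this section, not SWIG's parser; the helper name classify is hypothetical.

```python
import re

# Hypothetical helper modelling the rules above; not SWIG's actual parser.
# A Doxygen comment opens with /**, /*!, /// or //!, optionally followed by '<'.
_OPENER = re.compile(r"^(/\*[*!]|//[/!])(<?)")

def classify(comment):
    """Return 'pre', 'post', or None for a comment that is not Doxygen."""
    m = _OPENER.match(comment.lstrip())
    if m is None:
        return None          # e.g. a plain /* ... */ or // comment
    return "post" if m.group(2) else "pre"

for text in ("/** before a node */", "/*!< after a node */",
             "///< after a node", "// plain comment"):
    print(repr(text), "->", classify(text))
```

A comment classified as 'post' documents the node that precedes it, while a 'pre' comment documents the node that follows.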

Please note that, as SWIG parses the input file itself with a strict grammar, there is only limited support for the various cases of comment placement in the file.
Comments can be placed before C/C++ expressions on separate lines:

/**
 * Some comment
 */
void someOtherFunction();
/**
 * Some comment
 */
void someFunction();

class Shape {
  /**
   * Calculate the area in cm^2
   */
  int getArea();
};

After C/C++ expressions at the end of the line:

int someVariable = 9; ///< This is a var holding magic number 9
void doNothing(); ///< This does nothing, nop

and in some special cases, like function parameter comments:

void someFunction(
         int a ///< Some parameter 
     );

or enum element comments:

enum E_NUMBERS
{
    EN_ZERO, ///< The first enum item, gets zero as its value
    EN_ONE, ///< The second, EN_ONE=1
    EN_THREE
};

Just remember, if SWIG reports a syntax error when parsing the file because of your comment, try moving the comment to some other, 'safer' place as described above.
Also, currently only comments directly before or after the nodes are supported. Doxygen structural comments are stripped out and not assigned to anything.

39.2.1 Enabling Doxygen Translation

Translation of Doxygen comments is disabled by default and must be explicitly enabled with the -doxygen command line switch for the languages that support it (currently Java and Python).
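As a rough illustration, a build script might assemble the SWIG command line as sketched below. The helper name swig_command and the interface file name shapes.i are assumptions made for this example; only the -doxygen switch comes from this section, combined with the usual -c++ and target language options.

```python
def swig_command(interface, language, cplusplus=True):
    """Build a SWIG argument list with Doxygen translation enabled."""
    # Only Java and Python are described as supporting -doxygen here.
    if language not in ("java", "python"):
        raise ValueError("Doxygen translation is only described for Java and Python")
    args = ["swig"]
    if cplusplus:
        args.append("-c++")
    args += ["-" + language, "-doxygen", interface]
    return args

# shapes.i is a hypothetical interface file name.
print(" ".join(swig_command("shapes.i", "python")))
```

The resulting list can be passed to subprocess.run when SWIG is installed and on the PATH.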

39.2.2 Doxygen-specific %feature Directives

Translation of Doxygen comments is influenced by the following %feature directives:

doxygen:notranslate

Turns off translation of Doxygen comments to the target language syntax: the original comment will be copied to the output unchanged. This is useful if you want to use Doxygen itself to generate documentation for the target language instead of the corresponding language tool (javadoc, sphinx, ...).

doxygen:ignore:<command-name>

Specify that the Doxygen command with the given name should be ignored. This is useful for custom Doxygen commands which can be defined using the ALIASES option of Doxygen itself but which are unknown to SWIG. "command-name" is the real name of the command, e.g. you could use

%feature("doxygen:ignore:transferfull");

if you use a custom Doxygen transferfull command to indicate that the return value ownership is transferred to the caller, as this information doesn't make much sense for the other languages without explicit ownership management.

Doxygen syntax is rather rich and, in addition to simple commands such as @transferfull, it is also possible to define commands with arguments. As explained in the Doxygen documentation, the range of the arguments can be a single word, everything until the end of the line, or everything until the end of the next paragraph. Currently, only the "end of line" case is supported, using the range="line" argument of the feature directive:

// Ignore occurrences of
//
//    @compileroptions Some special C++ compiler options.
//
// in Doxygen comments as C++ options are not of interest to the target language
// developers.
%feature("doxygen:ignore:compileroptions", range="line");

In addition, it is also possible to have custom pairs of begin/end tags, similarly to the standard Doxygen @code/@endcode, for example. Such tags can also be ignored using the special value of range starting with end to indicate that the range is an interval, for example:

%feature("doxygen:ignore:forcpponly", range="end"); // same as "end:endforcpponly"

would ignore everything between @forcpponly and @endforcpponly commands in Doxygen comments. By default, the name of the end command is the same as of the start one with "end" prefix, following Doxygen conventions, but this can be overridden by providing the end command name after the colon.

This example shows how custom tags can be used to bracket anything specific to C++ and prevent it from appearing in the target language documentation. Conversely, another pair of custom tags could be used to put target language specific information in the C++ comments. In this case, only the custom tags themselves should be ignored, but their contents should be parsed as usual and contents="parse" can be used for this:

%feature("doxygen:ignore:beginPythonOnly", range="end:endPythonOnly", contents="parse");

Putting everything together, if these directives are in effect:

%feature("doxygen:ignore:transferfull");
%feature("doxygen:ignore:compileroptions", range="line");
%feature("doxygen:ignore:forcpponly", range="end");
%feature("doxygen:ignore:beginPythonOnly", range="end:endPythonOnly", contents="parse");

then the following C++ Doxygen comment:

/**
    A contrived example of ignoring too many commands in one comment.

    @forcpponly
    This is C++-specific.
    @endforcpponly

    @beginPythonOnly
    This is specific to @b Python.
    @endPythonOnly

    @transferfull Command ignored, but anything here is still included.

    @compileroptions This function must be compiled with /EHa when using MSVC.
 */
void func();

would be translated to this comment in Python:

def func():
    r"""
    A contrived example of ignoring too many commands in one comment.

    This is specific to **Python**.

    Command ignored, but anything here is still included.
    """
    ...
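The effect of the four directives can be modelled with a short Python sketch. This is a hypothetical simulation of the behaviour described above, not SWIG's implementation; it handles only the ignore rules and does not translate other commands such as @b.

```python
import re

# Each entry mirrors one %feature("doxygen:ignore:<name>", ...) directive above:
# command name -> (range, contents).
RULES = {
    "transferfull": (None, None),
    "compileroptions": ("line", None),
    "forcpponly": ("end", None),
    "beginPythonOnly": ("end:endPythonOnly", "parse"),
}

def apply_ignores(text, rules=RULES):
    """Hypothetical model of doxygen:ignore; not SWIG's implementation."""
    for name, (rng, contents) in rules.items():
        start = "@" + re.escape(name) + r"\b"
        if rng is None:
            # Drop only the command itself, keep the text that follows it.
            text = re.sub(start + r"[ \t]*", "", text)
        elif rng == "line":
            # Drop the command and everything up to the end of the line.
            text = re.sub(start + r"[^\n]*", "", text)
        else:
            # range="end" defaults to "end" + name; "end:<name>" overrides it.
            end_name = rng.partition(":")[2] or "end" + name
            end = "@" + re.escape(end_name) + r"\b"
            if contents == "parse":
                # Drop the two tags, keep what is between them.
                text = re.sub("(?:" + start + "|" + end + r")[ \t]*", "", text)
            else:
                # Drop the tags and everything between them.
                text = re.sub(start + ".*?" + end, "", text, flags=re.S)
    # Strip indentation and collapse runs of blank lines left behind.
    out = []
    for line in (ln.strip() for ln in text.splitlines()):
        if line or (out and out[-1]):
            out.append(line)
    return "\n".join(out).strip()

COMMENT = """
    A contrived example of ignoring too many commands in one comment.

    @forcpponly
    This is C++-specific.
    @endforcpponly

    @beginPythonOnly
    This is specific to @b Python.
    @endPythonOnly

    @transferfull Command ignored, but anything here is still included.

    @compileroptions This function must be compiled with /EHa when using MSVC.
"""
print(apply_ignores(COMMENT))
```

The three paragraphs that survive correspond to the three paragraphs of the translated Python comment, before any further command translation is applied.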

doxygen:nolinktranslate (Java-only currently)

Turn off automatic link-objects translation.

doxygen:nostripparams (Java-only currently)

Turn off stripping of @param and @tparam Doxygen commands if the parameter is not found in the function signature.
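The default stripping behaviour that doxygen:nostripparams turns off can be pictured with the following Python sketch. It is a simplified, hypothetical model that assumes each @param or @tparam description fits on one line; the function name is invented for the example.

```python
import re

def strip_unknown_params(comment, signature_params):
    """Drop @param/@tparam lines naming parameters absent from the signature."""
    kept = []
    for line in comment.splitlines():
        # Accept both the '@' and the '\' command prefixes.
        m = re.match(r"\s*[@\\]t?param\s+(\w+)", line)
        if m and m.group(1) not in signature_params:
            continue  # parameter not in the signature: strip the line
        kept.append(line)
    return "\n".join(kept)

comment = "\n".join([
    "Moves the Shape.",
    "@param dx Offset along x.",
    "@param dz Not a parameter of move().",
])
print(strip_unknown_params(comment, {"dx", "dy"}))
```

With the feature enabled, the comment would instead be passed through with every parameter line intact.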

39.2.3 Additional Command Line Options

ALSO TO BE ADDED (Javadoc auto brief?)

39.3 Doxygen To Javadoc

If translation is enabled, Javadoc formatted comments should be automatically placed in the correct locations in the resulting module and proxy files.

39.3.1 Basic Example

Here is an example segment from an included header file

/*! This is describing class Shape
 \author Bob
 */

class Shape {
public:
  Shape() {
    nshapes++;
  }
  virtual ~Shape() {
    nshapes--;
  };
  double  x, y; /*!< Important Variables */
  void    move(double dx, double dy); /*!< Moves the Shape */
  virtual double area(void) = 0; /*!< \return the area */
  virtual double perimeter(void) = 0; /*!< \return the perimeter */
  static  int nshapes;
};

Running SWIG with the -doxygen option should result in the following code being present in Shape.java


/**
 * This is describing class Shape 
 * @author Bob 
 * 
 */

public class Shape {

...

/**
 * Important Variables 
 */
  public void setX(double value) {
    ShapesJNI.Shape_x_set(swigCPtr, this, value);
  }

/**
 * Important Variables 
 */
  public double getX() {
    return ShapesJNI.Shape_x_get(swigCPtr, this);
  }

/**
 * Moves the Shape 
 */
  public void move(double dx, double dy) {
    ShapesJNI.Shape_move(swigCPtr, this, dx, dy);
  }

/**
 * @return the area 
 */
  public double area() {
    return ShapesJNI.Shape_area(swigCPtr, this);
  }

/**
 * @return the perimeter 
 */
  public double perimeter() {
    return ShapesJNI.Shape_perimeter(swigCPtr, this);
  }
}

The generated Java code should be identical to what would have been generated without this feature enabled. When the Doxygen Translator Module encounters a comment in which it finds nothing useful, or which it cannot parse, this should not affect the functionality of the SWIG generated code.

The Javadoc translator will handle most of the tag conversions (see the table below). It will also automatically translate link-object parameters in \see and \link...\endlink commands. For example, 'someFunction(std::string)' will be converted to 'someFunction(String)'. If this does not work well for you, or if you don't want such behaviour, you can turn it off using the 'doxygen:nolinktranslate' feature. Also, all '\param' and '\tparam' commands are stripped out if the specified parameter is not present in the function. Use 'doxygen:nostripparams' to avoid this.

Javadoc translator features summary (see %feature directives): doxygen:nolinktranslate and doxygen:nostripparams.

39.3.2 Javadoc Tags

Here is the list of all Doxygen tags and the description of how they are translated to Javadoc
Doxygen tags:

\a wrapped with <i> html tag
\arg wrapped with <li> html tag
\author translated to @author
\authors translated to @author
\b wrapped with <b> html tag
\c wrapped with <code> html tag
\cite wrapped with <i> html tag
\code translated to {@code ...}
\cond translated to 'Conditional comment: <condition>'
\copyright replaced with 'Copyright:'
\deprecated translated to @deprecated
\e wrapped with <i> html tag
\else replaced with '}Else:{'
\elseif replaced with '}Else if: <condition>{'
\em wrapped with <i> html tag
\endcode see note for \code
\endcond replaced with 'End of conditional comment.'
\endif replaced with '}'
\endlink see note for \link
\endverbatim see note for \verbatim
\exception translated to @exception
\f$, \f[, \f], \f{, \f} LaTeX formulas are left unchanged
\if replaced with 'If: <condition> {'
\ifnot replaced with 'If not: <condition> {'
\image translated to <img/> html tag only if target=HTML
\li wrapped with <li> html tag
\link translated to {@link ...}
\n replaced with new line char
\note replaced with 'Note:'
\overload prints 'This is an overloaded ...' according to Doxygen docs
\p wrapped with <code> html tag
\par replaced with <p alt='title'>...</p>
\param translated to @param
\remark replaced with 'Remarks:'
\remarks replaced with 'Remarks:'
\result translated to @return
\return translated to @return
\returns translated to @return
\sa translated to @see
\see translated to @see
\since translated to @since
\throw translated to @throws
\throws translated to @throws
\todo replaced with 'TODO:'
\tparam translated to @param
\verbatim translated to {@literal ...}
\version translated to @version
\warning translated to 'Warning:'
\$ prints $ char
\@ prints @ char
\\ prints \ char
\& prints & char
\~ prints ~ char
\< prints < char
\> prints > char
\# prints # char
\% prints % char
\" prints " char
\. prints . char
\:: prints ::
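A few rows of the table can be expressed as a lookup-driven translation, as in the Python sketch below. It covers only a hypothetical subset of the commands, treats the word commands (\a, \b, \c and so on) as applying to the single following word, and is not how SWIG itself is implemented.

```python
import re

# A hypothetical subset of the table above, grouped by kind of translation.
RENAME = {"author": "@author", "authors": "@author", "param": "@param",
          "tparam": "@param", "result": "@return", "return": "@return",
          "returns": "@return", "sa": "@see", "see": "@see",
          "throw": "@throws", "throws": "@throws"}
LABEL = {"note": "Note:", "remark": "Remarks:", "todo": "TODO:",
         "warning": "Warning:"}
WRAP = {"a": "i", "e": "i", "em": "i", "b": "b", "c": "code", "p": "code"}

_WRAP_RE = re.compile(r"[\\@](%s)[ \t]+(\w+)"
                      % "|".join(sorted(WRAP, key=len, reverse=True)))
_CMD_RE = re.compile(r"[\\@](\w+)")

def to_javadoc(text):
    # First wrap the word following \a, \b, \c, ... in its HTML tag.
    def wrap(m):
        tag = WRAP[m.group(1)]
        return "<%s>%s</%s>" % (tag, m.group(2), tag)
    text = _WRAP_RE.sub(wrap, text)
    # Then rename or relabel the remaining commands; unknown ones are kept.
    def rename(m):
        name = m.group(1)
        return RENAME.get(name, LABEL.get(name, m.group(0)))
    return _CMD_RE.sub(rename, text)

print(to_javadoc(r"\note see \c area"))
print(to_javadoc(r"\returns the \em total area"))
```

Commands missing from the three dictionaries pass through unchanged in this sketch, whereas SWIG handles them according to the full table and the unsupported tag list.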

39.3.3 Unsupported tags

Doxygen has a wealth of tags such as @latexonly that have no equivalent in Javadoc. As a result, several tags that have no translation (or no particular use, such as some linking and section tags) are suppressed, with only their content printed out (where that makes sense, typically for text content). If you are interested in more of the specifics of Javadoc, please visit How to Write Doc Comments for the Javadoc Tool.
Here is the list of these tags:

\addindex \addtogroup \anchor \attention
\brief \bug \callgraph \callergraph
\class \copybrief \copydetails \copydoc
\date \def \defgroup \details
\dir \dontinclude \dot \dotfile
\enddot \endhtmlonly \endinternal \endlatexonly
\endmanonly \endmsc \endrtfonly \endxmlonly
\enum \example \extends
\file \fn \headerfile \hideinitializer
\htmlinclude \htmlonly \implements \include
\includelineno \ingroup \internal \invariant
\interface \latexonly \line \mainpage
\manonly \memberof \msc \mscfile
\name \namespace \nosubgrouping \package
\page \paragraph \post \pre
\private \privatesection \property \protected
\protectedsection \protocol \public \publicsection
\ref \related \relates \relatedalso
\relatesalso \retval \rtfonly \section
\short \showinitializer \skip \skipline
\snippet \struct \subpage \subsection
\subsubsection \tableofcontents \test \typedef
\union \until \var \verbinclude
\weakgroup \xmlonly \xrefitem \category

If one of the following Doxygen tags appears as the first tag in a comment, the whole comment block is ignored:

\addtogroup \callgraph \callergraph \category
\class \def \defgroup \dir
\enum \example \file \fn
\headerfile \hideinitializer \interface \internal
\mainpage \name \namespace \nosubgrouping
\overload \package \page \property
\protocol \relates \relatesalso \showinitializer
\struct \typedef \union \var
\weakgroup
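The rule above amounts to checking the first command found in a comment against a fixed set, which the following Python sketch models. The function name is hypothetical and the set simply repeats the commands listed above.

```python
import re

# Commands that cause the whole comment block to be ignored when they
# appear as the first tag (taken from the list above).
STRUCTURAL = {
    "addtogroup", "callgraph", "callergraph", "category", "class", "def",
    "defgroup", "dir", "enum", "example", "file", "fn", "headerfile",
    "hideinitializer", "interface", "internal", "mainpage", "name",
    "namespace", "nosubgrouping", "overload", "package", "page", "property",
    "protocol", "relates", "relatesalso", "showinitializer", "struct",
    "typedef", "union", "var", "weakgroup",
}

def is_ignored(comment_body):
    """Return True if the first Doxygen command in the comment is structural."""
    m = re.search(r"[\\@](\w+)", comment_body)
    return bool(m) and m.group(1) in STRUCTURAL

print(is_ignored(r"\file shapes.h Declarations of the shape classes"))
print(is_ignored(r"Moves the Shape. \return nothing"))
```

Only the first command matters in this model: a structural command that appears later in the comment does not cause the block to be dropped.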

39.3.4 Further Details

TO BE ADDED.

39.4 Doxygen To Pydoc

If translation is enabled, Pydoc formatted comments should be automatically placed in the correct locations in the resulting module and proxy files. The problem is that Pydoc has no tag mechanism like Doxygen or Javadoc, so most Doxygen commands are translated into plain English text.

39.4.1 Basic Example

Here is an example segment from an included header file

/*! This is describing class Shape
 \author Bob
 */

class Shape {
public:
  Shape() {
    nshapes++;
  }
  virtual ~Shape() {
    nshapes--;
  };
  double  x, y; /*!< Important Variables */
  void    move(double dx, double dy); /*!< Moves the Shape */
  virtual double area(void) = 0; /*!< \return the area */
  virtual double perimeter(void) = 0; /*!< \return the perimeter */
  static  int nshapes;
};

Running SWIG with the -doxygen option should result in the following code being present in Shapes.py


...

class Shape(_object):
    """
    This is describing class Shape 
    Authors:
    Bob 

    """
    
    ...
    
    def move(self, *args):
        """
        Moves the Shape 
        """
        return _Shapes.Shape_move(self, *args)

    def area(self):
        """
        Return:
        the area 
        """
        return _Shapes.Shape_area(self)

    def perimeter(self):
        """
        Return:
        the perimeter 
        """
        return _Shapes.Shape_perimeter(self)

If any parameters of a function or a method are documented in the Doxygen comment, their description is copied into the generated output using Sphinx documentation conventions. For example

/**
    Set a breakpoint at the given location.

    @param filename The full path to the file.
    @param line_number The line number in the file.
 */
bool SetBreakpoint(const char* filename, int line_number);

would be translated to

def SetBreakpoint(*args):
    r"""
    Set a breakpoint at the given location.

    :type filename: string
    :param filename: The full path to the file.
    :type line_number: int
    :param line_number: The line number in the file.
    """

The types used for the parameter documentation come from the doctype typemap, which is defined for all the primitive types and a few others (e.g. std::string and shared_ptr<T>). For other non-primitive types, the documented type is just the C++ name of the type with the namespace scope delimiters (::) replaced with a dot. To change this, you can define your own typemaps for the custom types, e.g.:

%typemap(doctype) MyDate "datetime.date";
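The fallback rule for non-primitive types, together with a typemap override, can be modelled in a few lines of Python. The dictionary below stands in for user-defined doctype typemaps and the type geo::Shape is invented for the example; the typemaps predefined for primitive types are not modelled.

```python
# Hypothetical stand-in for user-defined %typemap(doctype) entries.
OVERRIDES = {"MyDate": "datetime.date"}

def doc_type(cpp_type, overrides=OVERRIDES):
    """Return the type name to show in the :type: line of the docstring."""
    if cpp_type in overrides:
        return overrides[cpp_type]
    # Default for non-primitive types: C++ name with '::' replaced by '.'.
    return cpp_type.replace("::", ".")

print(doc_type("geo::Shape"))
print(doc_type("MyDate"))
```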

Currently, Doxygen comments attached to variables do not appear in the proxy file, so no translated comment is generated for them.

Whitespace and tables
Whitespace is preserved when translating comments, so it makes sense to have Doxygen comments formatted in a readable way. This includes tables, where the tags <th>, <td> and </tr> are translated to '|'. The line following a line with <th> tags contains dashes. If whitespace is taken care of, comments in Python are much more readable. Example:

/**
 * <table border = '1'>
 * <caption>Animals</caption>
 * <tr><th> Column 1 </th><th> Column 2 </th></tr>
 * <tr><td> cow      </td><td> dog      </td></tr>
 * <tr><td> cat      </td><td> mouse    </td></tr>
 * <tr><td> horse    </td><td> parrot   </td></tr>
 * </table>
 */

translates to Python as:

  Animals
  | Column 1 | Column 2 |
  -----------------------
  | cow      | dog      |
  | cat      | mouse    |
  | horse    | parrot   |
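The table rule can be reproduced with a short Python sketch. It is a hypothetical model that works on the comment lines after the leading ' * ' has been removed and that assumes one table row per line; it is not SWIG's implementation and it does not add the leading indentation shown above.

```python
import re

def translate_table(lines):
    """Translate HTML table rows into '|' separated text rows."""
    out = []
    for line in lines:
        if "<caption>" in line:
            out.append(re.sub(r"</?caption>", "", line).strip())
            continue
        if "<tr>" not in line:
            continue  # <table> and </table> produce no output of their own
        has_header = "<th>" in line
        # <tr>, </th> and </td> are dropped; <th>, <td> and </tr> become '|'.
        row = re.sub(r"<tr>|</th>|</td>", "", line)
        row = re.sub(r"<th>|<td>|</tr>", "|", row).strip()
        out.append(row)
        if has_header:
            out.append("-" * len(row))  # dashes under the header row
    return out

TABLE = [
    "<table border = '1'>",
    "<caption>Animals</caption>",
    "<tr><th> Column 1 </th><th> Column 2 </th></tr>",
    "<tr><td> cow      </td><td> dog      </td></tr>",
    "</table>",
]
print("\n".join(translate_table(TABLE)))
```

Because the whitespace inside the cells is kept as written, the columns line up in the output only if they were aligned in the original comment.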

Overloaded functions
Since all the overloaded functions in C++ are wrapped into one Python function, the Pydoc translator will combine the comments of all the overloaded functions and put them into the comment for the wrapping function.
If you intend to use the resulting proxy files with the Doxygen documentation generator rather than Pydoc, you may want to turn off the translator completely (doxygen:notranslate feature). SWIG will then just copy the comments to the proxy file and reformat them if needed, but all the comment content will be left as is. As Doxygen doesn't support special commands in Python comments (see the Doxygen docs), you may want to use a tool such as doxypy (http://code.foosel.org/doxypy) to do the work.

39.4.2 Pydoc translator

Here is the list of all Doxygen tags and the description of how they are translated to Pydoc
Doxygen tags:

\a wrapped with '_'
\arg prepended with ' --'
\author prints 'Author:'
\authors prints 'Author:'
\b wrapped with '__'
\cite wrapped with single quotes
\cond translated to 'Conditional comment: <condition>'
\copyright prints 'Copyright:'
\deprecated prints 'Deprecated:'
\e wrapped with '_'
\else replaced with '}Else:{'
\elseif replaced with '}Else if: <condition>{'
\em wrapped with '_'
\endcond replaced with 'End of conditional comment.'
\endif replaced with '}'
\exception replaced with 'Throws:'
\if replaced with 'If: <condition> {'
\ifnot replaced with 'If not: <condition> {'
\li prepended with ' --'
\n replaced with new line char
\note replaced with 'Note:'
\overload prints 'This is an overloaded ...' according to Doxygen docs
\par replaced with 'Title: ...'
\param translated to 'Arguments:\n param(type) --description'
\remark replaced with 'Remarks:'
\remarks replaced with 'Remarks:'
\result replaced with 'Result:'
\return replaced with 'Result:'
\returns replaced with 'Result:'
\sa replaced with 'See also:'
\see replaced with 'See also:'
\since replaced with 'Since:'
\throw replaced with 'Throws:'
\throws replaced with 'Throws:'
\todo replaced with 'TODO:'
\tparam translated to 'Arguments:\n param(type) --description'
\version replaced with 'Version:'
\warning translated to 'Warning:'
\$ prints $ char
\@ prints @ char
\\ prints \ char
\& prints & char
\~ prints ~ char
\< prints < char
\> prints > char
\# prints # char
\% prints % char
\" prints " char
\. prints . char
\:: prints ::

39.4.3 Unsupported tags

Doxygen has a wealth of tags such as @latexonly that have no equivalent in Pydoc. As a result, several tags that have no translation (or no particular use, such as some linking and section tags) are suppressed, with only their content printed out (where that makes sense, typically for text content).
Here is the list of these tags:

\addindex \addtogroup \anchor \attention
\brief \bug \callgraph \callergraph
\class \copybrief \copydetails \copydoc
\date \def \defgroup \details
\dir \dontinclude \dot \dotfile
\code \endcode \endverbatim \endlink
\enddot \endhtmlonly \endinternal \endlatexonly
\endmanonly \endmsc \endrtfonly \endxmlonly
\enum \example \extends \f$
\f[ \f] \f{ \f}
\file \fn \headerfile \hideinitializer
\htmlinclude \htmlonly \implements \include
\image \link \verbatim \p
\includelineno \ingroup \internal \invariant
\interface \latexonly \line \mainpage
\manonly \memberof \msc \mscfile
\name \namespace \nosubgrouping \package
\page \paragraph \post \pre
\private \privatesection \property \protected
\protectedsection \protocol \public \publicsection
\ref \related \relates \relatedalso
\relatesalso \retval \rtfonly \section
\short \showinitializer \skip \skipline
\snippet \struct \subpage \subsection
\subsubsection \tableofcontents \test \typedef
\union \until \var \verbinclude
\weakgroup \xmlonly \xrefitem \category
\c

39.4.4 Further Details

TO BE ADDED.

39.5 Developer Information

39.5.1 Module Design

If this functionality is turned on, SWIG places all comments found into the SWIG parse tree. Nodes contain an additional attribute called DoxygenComment when a comment is present. Individual nodes containing Doxygen with Structural Indicators, such as @file, as their first command, are also present in the parse tree. These individual "blobs" of Doxygen, such as:

/*! This is describing function Foo
 \param x some random variable
 \author Bob
 \return Foo
 */

are passed on individually to the DoxygenTranslator Module. This module builds its own private parse tree and hands it to a separate class for translation into the target documentation language. For example, JavaDocConverter is the Javadoc module class.
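The two-stage design described above, a private parse step followed by a separate converter, can be sketched in Python as follows. The function names and the flat list of entries are assumptions made for illustration and do not reflect the actual classes or parse tree of the DoxygenTranslator module.

```python
import re

def parse_blob(blob):
    """Stage 1: split a Doxygen comment into (command, text) entries."""
    # Remove the comment opener (/** or /*!) and the closing */.
    body = re.sub(r"^\s*/\*[*!]|\*/\s*$", "", blob.strip())
    entries = []
    for line in body.splitlines():
        line = line.strip().lstrip("*").strip()
        if not line:
            continue
        m = re.match(r"[\\@](\w+)\s*(.*)", line)
        entries.append((m.group(1), m.group(2)) if m else (None, line))
    return entries

def render_javadoc(entries):
    """Stage 2: a toy converter that writes the entries as a Javadoc comment."""
    lines = [text if cmd is None else "@%s %s" % (cmd, text)
             for cmd, text in entries]
    return "/**\n" + "\n".join(" * " + ln for ln in lines) + "\n */"

BLOB = r"""/*! This is describing function Foo
 \param x some random variable
 \author Bob
 \return Foo
 */"""

print(render_javadoc(parse_blob(BLOB)))
```

Keeping the parsed entries separate from the rendering step is what would allow a second converter, for example one producing Pydoc, to reuse the same parse result. The toy converter only works here because \param, \author and \return keep their names in Javadoc.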

39.5.2 Debugging Doxygen parser and translator

There are two handy command line switches that enable printing of detailed debug information.

  -debug-doxygen-parser     - Display Doxygen parser module debugging information
  -debug-doxygen-translator - Display Doxygen translator module debugging information

39.5.3 Tests

This part of SWIG currently has 7 runtime tests in both Java and Python.

  doxygen_parsing
  doxygen_translate
  doxygen_translate_all_tags
  doxygen_basic_translate
  doxygen_basic_notranslate
  doxygen_translate_links
  doxygen_misc_constructs

All these tests are included in common.mk and are built with commands like 'make check-test-suite' or 'make check-python-test-suite'. To run them individually, type make <testname>.cpptest -s in the language-specific subdirectory of the Examples/test-suite directory. For example:

  Examples/test-suite/java $ make doxygen_misc_constructs.cpptest -s

If the test fails, both the expected and the translated comments are printed to standard output and also written to the files expected.txt and got.txt. Since it is often difficult to find a single character difference in several lines of text, a diff tool can be used, for example:

  Examples/test-suite/java $ kdiff3 expected.txt got.txt

Runtime tests in Java are implemented using Javadoc doclets. For this to work, tools.jar from the JDK must be in your classpath, or the JAVA_HOME environment variable must be defined and point to the JDK location.
The Java comment parsing code (the testing part) is located in commentParser.java. Read it to understand how the checking process works. The file can also be run as a stand-alone program, with 'java commentParser ', in which case it prints the list of comments found in the specified directory (in the format used by the runtime tests). So, to create a new test of the Doxygen comment translator, copy any existing one and replace the actual comment content (the section of entries of the form 'wantedComments.put(...)') with the output of the above command.
Runtime tests in Python are plain string comparisons using the __doc__ properties.
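A Python runtime check of this kind can be pictured with the sketch below, which compares a __doc__ string with the expected text and returns a unified diff when they differ. The helper check_doc and the stand-in function move are hypothetical and are not taken from the SWIG test suite.

```python
import difflib

def check_doc(obj, expected):
    """Return an empty list if obj.__doc__ matches, else a unified diff."""
    got = obj.__doc__ or ""
    if got == expected:
        return []
    return list(difflib.unified_diff(expected.splitlines(), got.splitlines(),
                                     "expected.txt", "got.txt", lineterm=""))

def move(dx, dy):
    """Moves the Shape"""  # stand-in for a SWIG-generated proxy method

print(check_doc(move, "Moves the Shape"))
for line in check_doc(move, "Moves the shape"):
    print(line)
```

Showing a diff rather than the two full strings makes single character differences much easier to spot.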

39.6 Extending to Other Languages

In general, an extension to another language requires a fairly deep understanding of the target language module, such as Modules/python.cxx for Python. Searching for "doxygen" in the java.cxx module can give you a good idea of the process for placing documentation comments into the correct areas. The basic gist is that anywhere a comment may reside on a node, there needs to be a catch for it in front of where that function, class, or other object is written out to a target language file. The other half of extension is building a target documentation language comment generator that handles one blob at a time. However, this is relatively simple and nowhere near as complex as the wrapper generating modules in SWIG. See DoxygenTranslator/JavaDocConverter.cpp for a good example. The target language module hands the DoxygenTranslator the blob to translate, and receives back a translated text.

What is given to the Doxygen Translator

/*! This is describing function Foo
 \param x some random variable
 \author Bob
 \return Foo
 */

What is received back by java.cxx

/** This is describing function Foo
 *
 * @param x some random variable
 * @author Bob
 * @return Foo
 */

Development of the comment translator itself is simplified by the fact that the DoxygenTranslator module can easily include a main function and thus be developed, compiled, and tested independently of SWIG.