Automatic Documentation Generation & Hosting

gbroques
Posts: 164
Joined: Thu Jan 23, 2020 3:28 am
Location: St. Louis, Missouri

Post by gbroques »

Hello all,
I've recently been spurred by the discussion on this thread to investigate automatically building and hosting our documentation.

The impetus was my confusion at being directed to compile the source code just to see up-to-date documentation.

In an ideal world, I wouldn't have to compile the source code to see the latest up-to-date documentation. I could simply navigate to a URL and browse the docs in a readable and useful format.

The process of building and hosting our documentation should be automatic, and there are services to assist open-source projects with this task.

For example, I have experience with the open-source documentation hosting platform Read the Docs, which uses the Python documentation generator Sphinx, and with integrating this service with Travis CI, which we already use to automate our build and continuous integration process.

Since Sphinx and Read the Docs are built for Python, they can readily consume Python docstrings to generate documentation.

Sphinx also has the Breathe extension for generating documentation from Doxygen output, which would cover FreeCAD's C++ code.
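
To make the Sphinx + Breathe idea a bit more concrete, here is a minimal sketch of what a conf.py could look like. The project title and the XML path are placeholders I made up, not FreeCAD's actual layout, and it assumes Doxygen is run first with GENERATE_XML = YES so that Breathe has something to read.

```python
# conf.py: minimal Sphinx configuration sketch.
# All values below are illustrative assumptions, not FreeCAD's real setup.

project = "FreeCAD API"  # hypothetical project title

extensions = [
    "sphinx.ext.autodoc",  # pulls documentation from Python docstrings
    "breathe",             # bridges Doxygen XML output into Sphinx
]

# Breathe reads the XML that Doxygen writes when GENERATE_XML = YES.
# The directory below is a placeholder path.
breathe_projects = {"FreeCAD": "doxygen/xml"}
breathe_default_project = "FreeCAD"
```

With something like this in place, Python modules would be documented through autodoc directives and C++ classes through Breathe directives in the same set of pages.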

I don't necessarily care what services, technologies, or libraries we use.

What I want is:

1. Automatic building and hosting of documentation as part of our continuous integration process
2. Versioned documentation so that I can refer to the latest docs, the latest stable version, or an older version
3. Readable, well-formatted, and pretty documentation that supports both C++ and Python

David_D outlines two solutions in this post:

1. Stick with Doxygen
2. Move to Sphinx

David_D also mentions upsides of moving to Sphinx:
David_D wrote:
  • Documentation generated quickly and easily. Lowers the bar to entry for those who want to document things.
  • Greater flexibility in how the documentation is structured, instead of sticking to doxygen's rather c++ oriented structure. Currently, I find the api doc's structure rather arbitrary. I think we can make it much more user friendly.
  • Good looking.
yorik chimed in on this post:
yorik wrote: A note about doxygen/Sphinx, I tried Sphinx some time ago (everything is still there in the /Doc folder) also because I felt doxygen was too ill-suited for python docs... But quickly came to the same shortcomings: For ex. Sphinx is unable to document classes spawned on the fly by C++ code, for example the Document Object. So we ended up setting up dummy objects (in Mod/TemplatePyMod), but it removed all the point of auto doc generation...

And after a while, since Sphinx wasn't able to do much more than doxygen anyway, and its main advantage was better output/styling, but I discovered you could apply CSS styles to the doxygen output, and therefore do quite a lot of styling there too, I saw less the point of continuing to use Sphinx, and stopped caring about it
He doesn't seem married to the idea of using Sphinx, and seems fine with Doxygen, although he was initially in favor of Sphinx because it's better suited to generating Python documentation.

My questions are:
1. How does this proposal fit in with our current hosted API docs?
2. Our current hosted API docs are automatically generated. Are they automatically deployed, hosted, and versioned?
3. Are there alternative technologies and services for documentation besides the ones I listed above?
4. If I were to figure out building and hosting our docs with Travis CI, Sphinx, and Read the Docs, then would someone with merge permissions accept these changes?
5. This is an open discussion. What do you feel our current automatically generated documentation lacks? How can it be improved?

vocx mentions one current limitation of our docs in this post:
vocx wrote: Also, remember that https://www.freecadweb.org/api is incomplete. This generated documentation is stripped from many things to make it small (600 MB), so it can be hosted online. The full documentation that can be produced offline from the sources is 4.5 GB in size.
The following "This is now a thread about documentation" post offers good related discussion on this topic.

Example Read the Docs Hosted & Sphinx Generated Docs Site for a FreeCAD Python Workbench
https://ose-3d-printer-workbench.readth ... index.html
vocx
Veteran
Posts: 5205
Joined: Thu Oct 18, 2018 9:18 pm

Re: Automatic Documentation Generation & Hosting

Post by vocx »

gbroques wrote: Thu Jun 04, 2020 2:43 am He doesn't seem married to the idea of using Sphinx, and seems fine with Doxygen, although he was initially in favor of Sphinx because it's better suited to generating Python documentation.
You also have to consider that tools in the open source world develop fast. I presume that at the time Yorik tested Sphinx, in 2013 or so, Breathe probably wasn't mature, and Sphinx wasn't as developed as it is now, so it couldn't handle well our particular mix of C++ and Python.

If you see the files in src/Doc/sphinx, many of them were last touched 7 and 9 years ago. So, I don't think Yorik was opposed to the idea of using Sphinx, it's just that at that time the tools were just not good enough. The experiments from David_D lead me to believe that the situation is different now. This should be possible, it's just that somebody has to delve into it and make it work.
1. How does this proposal fit in with our current hosted API docs?
2. Our current hosted API docs are automatically generated. Are they automatically deployed, hosted, and versioned?
The online API is basically manually updated by Yorik. At least that was the case before. I'm not sure if the situation has changed since we moved to the DigitalOcean servers. Since I haven't seen any announcement about it, I don't think it has changed.

If you notice, the online API documentation is incomplete, the reason is that the entire documentation is quite heavy, around 5 GB (a lot of Graphviz diagrams), so it takes considerable space. That's the reason the online API is a reduced version with only around 600 MB to 1 GB of content. So, you have to consider this as well for hosting. How much hosting can we get for free, or for a price?
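
Before deciding what to cut, it may help to measure where the space actually goes. The sketch below is only an assumption about how one might do that: it totals the generated files by extension, so you can see whether the Graphviz images really dominate. The function names are hypothetical, and you would point it at wherever your local build writes the HTML.

```python
from collections import Counter
from pathlib import Path


def size_by_extension(doc_root):
    """Return the total number of bytes per file suffix under doc_root."""
    totals = Counter()
    for path in Path(doc_root).rglob("*"):
        if path.is_file():
            # Group files without a suffix under a readable label.
            totals[path.suffix.lower() or "(none)"] += path.stat().st_size
    return totals


def report(doc_root, top=5):
    """Print the largest file types first, in megabytes."""
    for suffix, size in size_by_extension(doc_root).most_common(top):
        print(f"{suffix:>8}  {size / 1e6:10.1f} MB")
```

Calling report() on the output directory of a full build and of the reduced build would show which file types account for the difference.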

But also, can we do something to our Doxygen configuration files to reduce the amount of documentation produced? For example, currently our documentation seems to also generate pages for PyCXX, Salome SMESH, Zipios++, etc. These are various external dependencies that are included in our source tree. They are not really part of FreeCAD, so do we need to document them at all? Maybe we could save a lot of space by not generating the SMESH documentation.

Independently of whether we use Sphinx or not, the current Doxygen configuration file could also be improved in many ways.
4. If I were to figure out building and hosting our docs with Travis CI, Sphinx, and Read the Docs, then would someone with merge permissions accept these changes?
I'm sure nobody is against better programming documentation, we just need to figure the details, how to generate it easily, how to make it readable and pretty, and the amount of hosting that we would need (would it cost any significant amount of dollars?). How often does the documentation need to be updated? Every week? Every month? The development version can change pretty fast, so I think every month is sensible, and obviously keeping the documentation of a stable version seems like a good idea.
Always add the important information to your posts if you need help. Also see Tutorials and Video tutorials.
To support the documentation effort, and code development, your donation is appreciated: liberapay.com/FreeCAD.
gbroques
Posts: 164
Joined: Thu Jan 23, 2020 3:28 am
Location: St. Louis, Missouri

Re: Automatic Documentation Generation & Hosting

Post by gbroques »

vocx wrote: Thu Jun 04, 2020 3:17 am You also have to consider that tools in the open source world develop fast. I presume that at the time Yorik tested Sphinx, in 2013 or so, Breathe probably wasn't mature, and Sphinx wasn't as developed as it is now, so it couldn't handle well our particular mix of C++ and Python.

If you see the files in src/Doc/sphinx, many of them were last touched 7 and 9 years ago. So, I don't think Yorik was opposed to the idea of using Sphinx, it's just that at that time the tools were just not good enough. The experiments from David_D lead me to believe that the situation is different now. This should be possible, it's just that somebody has to delve into it and make it work.
Good points!
vocx wrote: Thu Jun 04, 2020 3:17 am The online API is basically manually updated by Yorik.
No good.

If this is the case, then I imagine Yorik will be interested in how to automate this process :D
vocx wrote: Thu Jun 04, 2020 3:17 am If you notice, the online API documentation is incomplete
This may not be a concern if we leave out information that people don't care about, or isn't helpful to include.

Is this a concern? Are we currently excluding important information that should be included?

What are examples of information we're currently excluding that should be available in our online API docs?
vocx wrote: Thu Jun 04, 2020 3:17 am So, you have to consider this as well for hosting. How much hosting can we get for free, or for a price?
I did some research into Read the Docs. They don't state an explicit limit anywhere, but there has to be some kind of limit.

I'll send an email to support@readthedocs.org about what their limit is, and report back.
vocx wrote: Thu Jun 04, 2020 3:17 am But also, can we do something to our Doxygen configuration files to reduce the amount of documentation produced? For example, currently our documentation seems to also generate pages for PyCXX, Salome SMESH, Zipios++, etc. These are various external dependencies that are included in our source tree. They are not really part of FreeCAD, so do we need to document them at all? Maybe we could save a lot of space by not generating the SMESH documentation.

Independently of whether we use Sphinx or not, the current Doxygen configuration file could also be improved in many ways.
Right, maybe this is the best place to start. Let's improve the current Doxygen configuration file first.

You mention generating pages for third-party libraries or external dependencies. I might argue that we don't need to include them at all, but would appreciate others weighing in.

Or at least exclude a sub-set of specific external dependencies that we don't care about like SMESH.
vocx wrote: Independently of whether we use Sphinx or not, the current Doxygen configuration file could also be improved in many ways.
What are other ways the current doxygen configuration file could be improved?

WAYS DOXYGEN CONFIGURATION FILE CAN BE IMPROVED
  • Exclude external dependencies (e.g. PyCXX, Salome SMESH, Zipios++, etc. )
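
As a rough sketch of the first item, Doxygen's EXCLUDE_PATTERNS option can skip whole directories. The little helper below just builds the extra configuration lines; the patterns are my guesses at where the bundled libraries live and need to be checked against the real source tree.

```python
def exclusion_overrides(patterns):
    """Build Doxygen config lines that exclude paths matching the patterns."""
    lines = ["# Extra settings, meant to be appended after the main configuration"]
    # "+=" adds to a list option instead of replacing its current value.
    lines += [f"EXCLUDE_PATTERNS += {pattern}" for pattern in patterns]
    return "\n".join(lines) + "\n"


# Hypothetical patterns: verify the directory names before relying on them.
THIRD_PARTY_PATTERNS = ["*/3rdParty/*", "*/CXX/*", "*/zipios++/*"]

if __name__ == "__main__":
    print(exclusion_overrides(THIRD_PARTY_PATTERNS))
```

Doxygen matches these wildcards against the absolute path of each file, which is why each pattern starts with "*/".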
vocx
Veteran
Posts: 5205
Joined: Thu Oct 18, 2018 9:18 pm

Re: Automatic Documentation Generation & Hosting

Post by vocx »

gbroques wrote: Thu Jun 04, 2020 10:56 pm ...
Is this a concern? Are we currently excluding important information that should be included?

What are examples of information we're currently excluding that should be available in our online API docs?
Go to the online documentation, then follow the links that you want. At some point you will get an unreachable page, because that page is missing. Basically, the online documentation misses many different articles, so many links point to dead ends. When you are reading the documentation, you want to jump and explore the different classes and functions, but if pages are missing, obviously you are left with incomplete information, which is frustrating.

So, there isn't something in particular that is missing, it's just that the entire documentation is incomplete.
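
If somebody wants to quantify this, a small script can list the dead ends in a locally generated copy of the HTML. This is only a sketch under my own assumptions (hypothetical function names, local relative links only, standard library only), not something that exists in the FreeCAD tree.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlsplit


class _HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_dead_links(doc_root):
    """Return (page, href) pairs whose local target file does not exist."""
    root = Path(doc_root)
    dead = []
    for page in sorted(root.rglob("*.html")):
        collector = _HrefCollector()
        collector.feed(page.read_text(encoding="utf-8", errors="replace"))
        for href in collector.hrefs:
            parts = urlsplit(href)
            # Skip external links, pure #fragment links and site-absolute paths.
            if parts.scheme or parts.netloc or not parts.path or parts.path.startswith("/"):
                continue
            target = page.parent / unquote(parts.path)
            if not target.exists():
                dead.append((page.relative_to(root).as_posix(), href))
    return dead
```

Running find_dead_links() on the reduced build would give a concrete list of the missing pages instead of having to click around until something breaks.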
Right, maybe this is the best place to start. Let's improve the current Doxygen configuration file first.
...
What are other ways the current doxygen configuration file could be improved?
See this thread, Doxygen: expanding inherited functions.

On my computer, the arrows don't open. This may be something to do with the CSS stylesheet, the JavaScript, etc. Do you know how to fix that?

In general, we would like to have a nicer template as well, I guess. We can configure the number of levels to display, the colors, the fonts, and things like that.

Read Doxygen.

The main configuration file is src/Doc/BuildDevDoc.cfg.in. It seems to have been built with Doxygen 1.7.1, but it is essentially the same as the ones created with older versions. As you can see in that directory, there are older configuration files, BuildDocDoxy.cfg (1.6.1) and BuildDocDoxyFull.cfg (1.4.7). I feel the configuration file was created a long time ago and hasn't been updated significantly since, so it's possible the current documentation doesn't make use of Doxygen's most recent features.

The current version is 1.8.x, so we should create a new configuration file with doxygen -g, and then update it with the information in the current BuildDevDoc.cfg.in. But do notice that this file relies on CMake to fill in some variables during the general CMake configuration. This is the reason troubleshooting this Doxygen business is a bit tiresome: you need to make changes to the configuration file, then run CMake, then compile the documentation with make DevDoc, and test. It takes quite a long time to generate the documentation unless you have many fast CPUs. And periodically you should try it from a completely empty directory to be sure it produces the correct output, and that you aren't seeing the result of old files. If you could test it on a small scale and then scale it up to the entire system, that would be good.
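
One way to test on a small scale, sketched here under assumptions: take the configuration file that CMake has already generated, override a few settings so that only one directory is processed, and feed the result to Doxygen on standard input ("doxygen -" reads the configuration from stdin, and later settings override earlier ones). The paths in the example are placeholders, and the helper names are hypothetical.

```python
import subprocess


def small_scale_config(base_cfg, input_dir, output_dir):
    """Return Doxygen config text that narrows a full config to one directory."""
    lines = [
        # Pull in the full, CMake-generated configuration first.
        # Assumes the path contains no spaces.
        f"@INCLUDE = {base_cfg}",
        # Then override it: a single input directory and a scratch output.
        f'INPUT = "{input_dir}"',
        f'OUTPUT_DIRECTORY = "{output_dir}"',
        # Skipping Graphviz makes trial runs much faster.
        "HAVE_DOT = NO",
    ]
    return "\n".join(lines) + "\n"


def run_doxygen(config_text):
    """Run Doxygen with the configuration passed on standard input."""
    subprocess.run(["doxygen", "-"], input=config_text, text=True, check=True)


if __name__ == "__main__":
    # Placeholder paths: adjust to your own build and source directories.
    cfg = small_scale_config("build/src/Doc/BuildDevDoc.cfg", "src/Mod/Arch", "/tmp/doxy-test")
    print(cfg)
```

Writing to a fresh output directory each time also avoids looking at stale files from an earlier run.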
gbroques
Posts: 164
Joined: Thu Jan 23, 2020 3:28 am
Location: St. Louis, Missouri

Re: Automatic Documentation Generation & Hosting

Post by gbroques »

vocx wrote: Thu Jun 04, 2020 11:24 pm Go to the online documentation, then follow the links that you want. At some point you will get an unreachable page, because that page is missing. Basically, the online documentation misses many different articles, so many links point to dead ends. ... which is frustrating.

... the entire documentation is incomplete.

See this thread, Doxygen: expanding inherited functions.

On my computer, the arrows don't open.

In general, we would like to have a nicer template as well, I guess. We can configure the number of levels to display, the colors, the fonts, and things like that.
OK this is great information, and you provide really good pointers for someone to investigate this further.

I've updated and revised my running list below.

WAYS DOXYGEN CONFIGURATION FILE CAN BE IMPROVED
  • Exclude external dependencies (e.g. PyCXX, Salome SMESH, Zipios++, etc. )
  • Documentation is incomplete, dead ends, and pages are missing
  • Arrows don't expand locally (see Doxygen: expanding inherited functions for details)
  • Configure a prettier more modern looking template
READ THE DOCS HOSTING
Also, support got back to me about file size limits on Read the Docs.

They said 5 GB of static content isn't a problem, and there are no size limits.

However, there are time and memory limits:
  • 15 minutes build time
  • 3GB of memory
  • 2 concurrent builds
We can increase build limits on a per-project basis. Send an email to support@readthedocs.org providing a good reason why your documentation needs more resources.
Source: https://docs.readthedocs.io/en/stable/builds.html

They also reminded me:
... Doxygen is not an explicitly supported documentation platform on Read the Docs. It is possible to use it but it involves a bit of a hack. We support Sphinx and MkDocs out of the box.
However, it should be possible to use Read the Docs and Sphinx with the Breathe extension for Sphinx.

We may hit those limits, but maybe not depending on how much we slim down the docs by techniques like excluding external dependencies.

Also possible they might be open to increasing those limits for us.
vocx
Veteran
Posts: 5205
Joined: Thu Oct 18, 2018 9:18 pm

Re: Automatic Documentation Generation & Hosting

Post by vocx »

gbroques wrote: Sat Jun 06, 2020 3:12 pm However, there are time and memory limits:
  • 15 minutes build time
I'm not sure how this impacts us, but I definitely think 15 minutes of build time is too little. I need more than 30 minutes to build the documentation from scratch, but maybe they have many processors to distribute the load and this isn't a problem?
... Doxygen is not an explicitly supported documentation platform on Read the Docs. It is possible to use it but it involves a bit of a hack. We support Sphinx and MkDocs out of the box.
I am not thinking about hosting the current Doxygen documentation to be honest. If you investigate that, it's fine, but I think what makes more sense is for you to continue the experiments of David_D with Sphinx+Breathe, and then we can think about hosting that online.
Also possible they might be open to increasing those limits for us.
Yes, I think in the past the Travis people have been accommodating. It's just that somebody needs to be the point of contact.
vocx
Veteran
Posts: 5205
Joined: Thu Oct 18, 2018 9:18 pm

Re: Automatic Documentation Generation & Hosting

Post by vocx »

gbroques wrote: Thu Jun 04, 2020 10:56 pm ...
What are examples of information we're currently excluding that should be available in our online API docs?
See here, exactly what I was talking about, https://www.freecadweb.org/api -- broken....
yorik
Founder
Posts: 13466
Joined: Tue Feb 17, 2009 9:16 pm
Location: Brussels
Contact:

Re: Automatic Documentation Generation & Hosting

Post by yorik »

The API docs are indeed updated manually from time to time, mostly by me. This is indeed bad :D

About using other newer tools than Sphinx, that's fine with me too, anything that works better is always welcome. But there is one really interesting thing about doxygen: FreeCAD is really a mix of C++ and Python code. Some workbenches (FEM, Path..) are very hybrid. Lots of Python functionality is generated in C++. So focusing the doc generation on Python is IMHO missing a very big part of what makes FreeCAD powerful. And doxygen is I think the best tool we have to reflect that hybrid situation.

My humble opinion is that, before looking for the next best magical tool, we should work more on the contents. Make the output of doxygen better. Classify modules better. Add everything that is not yet properly documented. Remove from doxy generation the unuseful stuff (what I started with the "make WebDoc" target). Make a clearer distinction between C++ and Python parts. Make Python pages clearer. There is a lot of styling options possible.

Updating the doc automatically is no big deal. We should actually turn it into a git repo, and make it accessible with GitHub Pages or something. Then anybody could rebuild it and simply submit a PR... I'll try to rebuild it and do that today.

*EDIT* https://github.com/FreeCAD/API
vocx
Veteran
Posts: 5205
Joined: Thu Oct 18, 2018 9:18 pm

Re: Automatic Documentation Generation & Hosting

Post by vocx »

yorik wrote: Mon Jun 15, 2020 10:04 am ... Lots of Python functionality is generated in C++. So focusing the doc generation on Python is IMHO missing a very big part of what makes FreeCAD powerful. And doxygen is I think the best tool we have to reflect that hybrid situation
...
We aren't focusing only on Python, that's the whole point of this thread. With a component called "Breathe", Sphinx can use Doxygen underneath to parse C++ files, and in this way generate documentation for C++ sources and for Python sources at the same time. This was already investigated by David_D, who was the one who added new docstrings some weeks ago to Arch. According to him, this worked fine, but he never finished his investigations. Hopefully gbroques is still looking into this.

This is now a thread about documentation
David_D
Posts: 81
Joined: Fri Jun 29, 2018 6:43 am
Location: Christchurch, New Zealand

Re: Automatic Documentation Generation & Hosting

Post by David_D »

Hey, sorry for the late chime in. I've just come off some final exams. :cry: :cry:

I'd like to echo yorik, in that the tool we use to do the documentation is a much smaller part of the picture than writing the actual documentation. When I was working through the sphinx experiment and documenting Arch, the actual documentation took far more time and effort than setting up sphinx.

Overall, whatever tool we use, or how frequently we rebuild the documentation, the code base will not be any less impenetrable unless we write documentation. Sphinx and Doxygen both work acceptably with both programming languages. If we wanted to, we could have both existing at the same time, and just let the users pick their preferred interface. They are not mutually exclusive.

That being said, I think sphinx is good, given my minor investigation. Having worked out many of the kinks in the c++ to sphinx system, I cannot think of any features that doxygen has that sphinx does not. It can do the c++ essential things, like showing dependency graphs and linking between parents and children. The most important thing however, is that I found sphinx significantly more flexible and extensible than doxygen. I believe that given FreeCAD's peculiarities, flexibility is essential.