
From: Mohammad Nejati [Ashtum] (ashtumashtum_at_[hidden])
Date: 2023-05-09 14:10:42


Hi, I'm Mohammad Nejati: a C++ enthusiast, Boost aficionado, and a
respecter of good implementation and documentation. I have recently
been hired by The C++ Alliance and I am glad that I have a chance to
dedicate my time to contribute to the C++ community.

We have recently begun to design a new website for boost.org and Boost
library documentation:
https://github.com/cppalliance/site-docs

I am working on adding full-text search to the new Boost library
documentation website. With it, users will be able to search within a
specific library's documentation or across all Boost libraries without
resorting to an external search engine. A good search experience lets
library users consult the documentation quickly during development,
and a customized search engine can make reference documentation more
discoverable.

This is as new for me as it likely is for you, and any help and
feedback would be appreciated. Below, I explain what we have done and
what we have found so far:

We deployed the Antora Lunr Extension on our in-development site docs:
https://docs.cppalliance.org/user-guide/index.html

Two deployment options exist for search:
* In the client. With this model, the index is downloaded to the
browser, and the search algorithm is executed using the resources of
the browser’s host machine.
* On the server. Here the index is hosted in the cloud, and the search
algorithm runs on the computing resources of the cloud provider.
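
To make the client-side model concrete, here is a minimal sketch in
Python (standing in for the JavaScript that would actually run in the
browser). It is not how the Antora Lunr Extension is implemented; the
page data and the names `tokenize`, `build_index`, and `search` are made
up for illustration. The index is built once at site-build time,
serialized to JSON (the part a browser would download), and then queried
locally without any request to a server.

```python
import json
import re
from collections import defaultdict

# Hypothetical pages; a real site would extract text from the generated docs.
PAGES = {
    "asio/tcp.html": "Asynchronous TCP socket operations",
    "beast/websocket.html": "WebSocket stream built on a TCP socket",
    "json/parse.html": "Parsing JSON text into values",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index: term -> sorted list of page URLs."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in tokenize(text):
            index[term].add(url)
    return {term: sorted(urls) for term, urls in index.items()}


def search(index, query):
    """Return the pages that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return []
    hits = set(index.get(terms[0], []))
    for term in terms[1:]:
        hits &= set(index.get(term, []))
    return sorted(hits)


# Build step: this JSON string is what the browser would download.
payload = json.dumps(build_index(PAGES))

# Client step: load the index and search it locally, with no server involved.
local_index = json.loads(payload)
print(search(local_index, "tcp socket"))
print(search(local_index, "json"))
```

A real engine such as Lunr also stems terms and ranks the results, which
this sketch leaves out.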

The client-side strengths:
* It is very responsive because there is no request/response involved
in the search process.
* There is no need to run a server-side search engine and keep it
updated with new content.
* Low maintenance cost because there is no load on the server to
respond to search queries.
* It can work offline (considering that we can locally build reference
documentation).

A good example of client-side search is docs.rs (https://docs.rs/),
where it is used to search the reference documentation of each crate.
Here is an example of the search page (searching `socket` in the
`Tokio` library):
https://docs.rs/tokio/1.28.0/tokio/?search=socket

The metadata in these search results is applicable to C++ as well.

The server-side strengths:
* Wider search scopes: server-hosted search indices can be huge.
* Better results, thanks to more powerful cloud-computing resources.
* The possibility of semantic search.
* Analytics: server-collected statistics can be used to better serve users.

These deployment options are not mutually exclusive; both are
deployable, each where appropriate.
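
As a rough illustration only, one way the two models could coexist is
to pick a backend per query. Nothing in this sketch is decided: the
routing rule, the `Backend` enum, and the `choose_backend` function are
hypothetical, and the sketch is in Python rather than the JavaScript a
real page would use. It simply maps the strengths listed above
(offline use for the client; wide scope and semantic search for the
server) onto a decision.

```python
from enum import Enum


class Backend(Enum):
    CLIENT = "client"  # search the index already downloaded to the browser
    SERVER = "server"  # send the query to a hosted search service


def choose_backend(scope, online, semantic=False):
    """Pick a deployment model for one query (hypothetical rule).

    scope:    "library" for a single library, "all" for every library
    online:   whether the server can be reached at all
    semantic: whether the user asked a natural-language question
    """
    if not online:
        # Only the client-side index works offline.
        return Backend.CLIENT
    if semantic or scope == "all":
        # Huge indices and semantic search are server-side strengths.
        return Backend.SERVER
    return Backend.CLIENT


print(choose_backend("library", online=True))
print(choose_backend("all", online=True))
print(choose_backend("all", online=False))
```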

We are working on a design document for search functionality:
https://github.com/cppalliance/site-docs/blob/develop/doc/search-functionality.adoc
Please feel free to provide feedback by opening an issue in GitHub or
asking questions on the list.

We are also currently exploring the possibility of fine-tuning a large
language model (LLM) specifically for C++ and Boost-related content.
The model would be fine-tuned on a large corpus of code snippets, tutorials,
and documentation. Our long-term goal is to create an interactive
learning platform where users can ask complex questions in natural
language and receive accurate and high-quality answers in return.
While this project is still in the experimental stage, we believe that
we can create a more engaging and interactive learning experience for
users, making it easier for them to learn and use libraries
effectively.

Thank you for your time, and I look forward to your suggestions and feedback.

Best regards,
Mohammad Nejati

