I would like to implement the DataFrame library for Boost. I have written a project proposal and attached it.
Since this project is an expansion of a previous project, there are a few directions I could work in, and I would like to pursue whichever adds the most to the project:
1. We could keep the existing code base as it is and implement new features on top of it.
2. We could try to restructure the code into Modules (a C++20 feature) and then implement the features.
3. Or we could restructure the code to use C++20 Ranges (the One Ranges proposal) rather than expression templates (ET). This makes more sense to me, as I believe that uBLAS is being ported to C++20.
The DataFrame would include the following features:
1. All the features already present (union, combine, join).
2. Reading from and writing to a DataFrame using JSON (boost::property_tree to DataFrame).
3. Operator support for addition, subtraction, multiplication, division, modulo, power, etc., as well as support for comparison operators.
4. Functions to perform apply, apply_element_wise, aggregate, transform and expand on a given DataFrame.
5. Data analysis tools for standard deviation, variance, mean, etc.
6. Re-indexing methods like replace, duplicate, filter, etc.
7. Reshaping methods for sorting, append, pivot, etc.
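
As a rough idea of what items 3 and 5 could look like at the column level, here is a minimal sketch. Column, mean, variance and standard_deviation are hypothetical names chosen for illustration, not the existing API; the variance shown is the population variance, and the final interface would follow the proposal.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a single numeric DataFrame column.
struct Column {
    std::vector<double> values;
};

// Element-wise addition of two columns of equal length (item 3).
Column operator+(const Column& a, const Column& b) {
    assert(a.values.size() == b.values.size());
    Column out;
    out.values.reserve(a.values.size());
    for (std::size_t i = 0; i < a.values.size(); ++i)
        out.values.push_back(a.values[i] + b.values[i]);
    return out;
}

// Arithmetic mean of a non-empty column (item 5).
double mean(const Column& c) {
    assert(!c.values.empty());
    return std::accumulate(c.values.begin(), c.values.end(), 0.0)
           / static_cast<double>(c.values.size());
}

// Population variance: mean of squared deviations from the mean.
double variance(const Column& c) {
    const double m = mean(c);
    double sum_sq = 0.0;
    for (double v : c.values) sum_sq += (v - m) * (v - m);
    return sum_sq / static_cast<double>(c.values.size());
}

// Standard deviation as the square root of the population variance.
double standard_deviation(const Column& c) {
    return std::sqrt(variance(c));
}
```

The other arithmetic and comparison operators would follow the same element-wise pattern as operator+.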
Full details can be found in the project proposal.
I have been in contact with David Bellot for the past week regarding GSoC. I have completed the first draft of the competency check and sent it to him. I would love to work with him, and I am requesting that he be assigned as mentor for this project.
This is an official application for GSoC '20. I am open to any and all suggestions.