Subject: [ublas] [GSoC] DataFrame project bonding
From: Wei Wang (wangweiaperion_at_[hidden])
Date: 2019-05-07 04:14:04


Hi,
My name is Wei Wang, and I am lucky to have been selected as the student for
the GSoC19 Boost::ublas project. My project is to build a data structure that
works like pandas.DataFrame or the data frame in R. According to the GSoC
website, my mentor will be Bellot, which I'm very glad about.
I have two questions related to logistics:
(1) Where should I work on the code? I found an empty organization on
GitHub (https://github.com/BoostGSoC19), but I'm still not sure how I am
supposed to submit my code.
(2) Should I fork the whole ublas project? Or should I simply start building
my own project directly under boost/numeric/ublas?
Two more questions related to the project requirements:
(1) I have read an implementation by a previous student (
https://github.com/BoostGSoC17/data_frame), which is pretty good, but it
somewhat conflicts with my own design ideas. Is it okay to start a new
project? I'd also like to ask what your expectations for this project are.
I'm targeting pandas.DataFrame (though it won't be that full-featured); the
basics are:
- indexing
- slicing
- sorting by column
- relational operations such as select and join
- set operations on rows such as union, difference, and intersection
- grouping (possibly)
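To make the first three items concrete, here is a rough sketch of the kind
of interface I have in mind. It is only an illustration: the names
data_frame, add_column, slice and sort_by are hypothetical, nothing here is
existing ublas API or taken from the GSoC17 code, and storing every column
as doubles is a simplification (a real design would need mixed column
types). It assumes C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch only: not ublas API. Every column holds doubles
// to keep the example short.
class data_frame {
public:
    using column = std::vector<double>;

    // Add a named column; all columns must have the same number of rows.
    void add_column(const std::string& name, column values) {
        assert(cols_.empty() || values.size() == rows());
        cols_[name] = std::move(values);
    }

    std::size_t rows() const {
        return cols_.empty() ? 0 : cols_.begin()->second.size();
    }

    // Indexing: look up a column by name (throws std::out_of_range if absent).
    const column& operator[](const std::string& name) const {
        return cols_.at(name);
    }

    // Slicing: copy of the rows in the half-open range [first, last).
    data_frame slice(std::size_t first, std::size_t last) const {
        assert(first <= last && last <= rows());
        data_frame out;
        for (const auto& [name, values] : cols_)
            out.cols_[name] = column(values.begin() + first, values.begin() + last);
        return out;
    }

    // Sorting: reorder all columns by ascending values of column `key`.
    data_frame sort_by(const std::string& key) const {
        const column& k = cols_.at(key);
        std::vector<std::size_t> order(k.size());
        std::iota(order.begin(), order.end(), 0);
        std::stable_sort(order.begin(), order.end(),
                         [&k](std::size_t a, std::size_t b) { return k[a] < k[b]; });
        data_frame out;
        for (const auto& [name, values] : cols_) {
            column sorted;
            sorted.reserve(values.size());
            for (std::size_t i : order) sorted.push_back(values[i]);
            out.cols_[name] = std::move(sorted);
        }
        return out;
    }

private:
    std::map<std::string, column> cols_;
};
```

Sorting goes through a row permutation so that all columns stay aligned;
the relational and set operations could probably reuse the same idea of
computing row indices first and then gathering the columns.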
(2) What should I show in my final submission? Will it be evaluated on
whether my code can be merged? Or will I simply be given some test cases and
judged on whether my code passes them?

Cheers,
Wei