Subject: [ublas] [newbie] submatrix and rest of a matrix
From: David Bellot (david.bellot_at_[hidden])
Date: 2009-11-17 12:55:48
Hi Ublas users,
this is a newbie question:
I am implementing a cross-validation algorithm, and for that purpose I need
to split a big matrix X1 and a big vector Y1 many times, in different ways.
The idea is to use a percentage of the initial dataset (X1 and Y1) to fit a
model and the rest of this dataset to test it.
Let's say my fitting procedure is
double fit(const matrix<double>& X, const vector<double>& y)
and initially I have my big dataset defined as
matrix<double> X1;
vector<double> Y1;
During the loop of the cross-validation algorithm, I split X1 and Y1 in the
following manner to obtain an Xtraining and Ytraining dataset and an Xtest and
Ytest dataset:
|-----------------------------------|
| Xtraining |   Xtest   | Xtraining |
|           |           |           |
|           |           |           |
|-----------------------------------|
|-----------------------------------|
| Ytraining |   Ytest   | Ytraining |
|-----------------------------------|
Of course, Xtest and Ytest are at a different position at each step of the
loop.
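Xtest and Ytest are easy to obtain with a matrix_range<matrix<double> >
(and a vector_range<vector<double> > for Ytest), since each is one
contiguous block. However, Xtraining and Ytraining are made of two separate
blocks, so a single range cannot describe them and they require a copy of
the data to a temporary matrix (and vector).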
And this is my problem! The dataset is too big and making a copy costs too
much. I cannot afford having 2 copies of the dataset in memory (and copying
them all the time).
So how can I do that efficiently (indirect_array? something else?), and do I
need to redefine the prototype of my fit function?
Best Regards,
David
-- David Bellot, PhD david.bellot_at_[hidden] http://david.bellot.free.fr