Today marks the end of 1st week of GSoC 2021. This week I worked on:
- Adapted the csv parser for mlpack.
- Made some minor changes in load fucntion to enable the calling of new parser.
- Created
file_type
for mlpack.
The main goal here is to make the parser as generic as possible. Cornad suggested that I should not use the arma fucntions which are not present in public API as they might get changed anytime which might lead to issues in future. So, we decided to either rewrite those for mlpack or we can use libraries like type_traits
.
Changes in the file structre:
I added 2 new files in mlpack/core/data/
- csv_parser.hpp
- csvparserimpl.hpp
Changes in user API: Currently mlpack relies on armadillo for linear algebra operations, but what if someone want to use some other library? Following on this idea we make a small change in the user facing API.
template<typename eT>
bool Load(const std::string& filename,
arma::Mat<eT>& matrix,
const bool fatal = false,
const bool transpose = true,
const arma::file_type inputLoadType = arma::auto_detect);
Currently Load fucntion loads an arma matrix from file. We have a template parameter eT which basically is the type of data we want to store in the matrix. But now we are planning to pass the whole martix as the template parameter.
template<typename MatType>
bool LoadCSVV(MatType& x, std::fstream& f, std::string&);
You can see here that we have MatType
as the template parameter, so user can basically have any matrix.
Apart from these I had to make some changes in load_impl.hpp
Link to my PR, you can follow the discussion there if you want.
Omar helped me alot by solving my doubts whenever I got stuck, but I still think I should be asking more doubts. So that’s the goal for the coming week, Ask as many doubts as you can!!!
I am planing to write another post on monday which will talk about what I am planning to do this week. Also I want to improve the look of the blogs a bit, so lets see how it goes.