Boost tokenizer does not work, and shows up invalid read of size on valgrind

Using the latest Boost (1.39) with gcc 4.2.1 on SUSE Linux, I am trying to load a file and then split its lines into a vector of strings. However, when this is run, the last string comes out corrupt. When I ran it under valgrind, the very first error was an invalid read of size 1 in the guts of boost::char_separator.

==19963== Invalid read of size 1
==19963==    at 0x8056870: bool boost::char_separator<char, std::char_traits<char> >::operator()<__gnu_cxx::__normal_iterator<char const*, std::string>, std::string>(__gnu_cxx::__normal_iterator<char const*, std::string>&, __gnu_cxx::__normal_iterator<char const*, std::string>, std::string&) (token_functions.hpp:430)
==19963==    by 0x8056E3E: boost::token_iterator<boost::char_separator<char, std::char_traits<char> >, __gnu_cxx::__normal_iterator<char const*, std::string>, std::string>::initialize() (token_iterator.hpp:70)
==19963==    by 0x8056EA6: boost::token_iterator<boost::char_separator<char, std::char_traits<char> >, __gnu_cxx::__normal_iterator<char const*, std::string>, std::string>::token_iterator(boost::char_separator<char, std::char_traits<char> >, __gnu_cxx::__normal_iterator<char const*, std::string>, __gnu_cxx::__normal_iterator<char const*, std::string>) (token_iterator.hpp:77)
==19963==    by 0x8056FAA: boost::tokenizer<boost::char_separator<char, std::char_traits<char> >, __gnu_cxx::__normal_iterator<char const*, std::string>, std::string>::begin() const (tokenizer.hpp:86)

Here is the program:

    BOOST_AUTO_TEST_CASE( test_log_append )
    {
        string logFile = "test/logfile.txt";

        // Load the log file into a vector, of strings, and test content
        ifstream ifs(logFile.c_str());
        BOOST_REQUIRE_MESSAGE(ifs, "Could not open log file\n");

        stringstream ss;
        ss << ifs.rdbuf(); // Read the whole file into a string

        char_separator<char> sep("\n"); // Split the file content unix=\n pc =\n\r
        typedef boost::tokenizer<boost::char_separator<char> > tokenizer;
        tokenizer tokens(ss.str(), sep); // <<<<<<<< valgrind barfs here

        std::vector<std::string> lines;
        lines.reserve(9);
        std::copy(tokens.begin(), tokens.end(), back_inserter(lines)); // <<<<<<<< valgrind barfs here

        for(int i = 0; i < lines.size(); i++) {
            cerr << "'" << lines[i] << "'\n";
        }
    }

The input in logfile.txt is of the form:

    MSG:[16:36:09 14.7.2009] First Message
    LOG:[16:36:09 14.7.2009] LOG
    WAR:[16:36:09 14.7.2009] ERROR
    ERR:[16:36:09 14.7.2009] WARNING
    DBG:[16:36:09 14.7.2009] DEBUG
    OTH:[16:36:09 14.7.2009] OTHER
    OTH:[16:36:09 14.7.2009] OTHER2
    MSG:[16:36:09 14.7.2009] Last Message

The output is of the form:

    'MSG:[16:36:09 14.7.2009] First Message'
    'LOG:[16:36:09 14.7.2009] LOG'
    'WAR:[16:36:09 14.7.2009] ERROR'
    'ERR:[16:36:09 14.7.2009] WARNING'
    'DBG:[16:36:09 14.7.2009] DEBUG'
    'OTH:[16:36:09 14.7.2009] OTHER'
    'OTH:[16:36:09 14.7.2009] OTHER2'
    '�:[16:36:09 14.7.2009] Last Message'

Notice that the last string is corrupt. Is the tokenizer known to be buggy in Boost 1.39, or am I doing it all wrong?

Best regards,
Ta,
Avi

AMDG

Avi Bahra wrote:
> Using the latest Boost (1.39) with gcc 4.2.1 on SUSE Linux, I am trying to load a file and then split its lines into a vector of strings. However, when this is run, the last string comes out corrupt. When I ran it under valgrind, the very first error was an invalid read of size 1 in the guts of boost::char_separator.

<snip>

> Here is the program:
>
> BOOST_AUTO_TEST_CASE( test_log_append )
> {
>     string logFile = "test/logfile.txt";
>
>     // Load the log file into a vector, of strings, and test content
>     ifstream ifs(logFile.c_str());
>     BOOST_REQUIRE_MESSAGE(ifs, "Could not open log file\n");
>
>     stringstream ss;
>     ss << ifs.rdbuf(); // Read the whole file into a string
>
>     char_separator<char> sep("\n"); // Split the file content unix=\n pc =\n\r
>     typedef boost::tokenizer<boost::char_separator<char> > tokenizer;
>     tokenizer tokens(ss.str(), sep); // <<<<<<<< valgrind barfs here

ss.str() returns a temporary std::string. boost::tokenizer stores a reference to the string. The temporary string is destroyed and tokenizer is left with a dangling reference.

In Christ,
Steven Watanabe