Boost Users :

From: Mockey Chen (mockey.chen_at_[hidden])
Date: 2006-07-15 23:13:18


Hi, experts,
I am using Boost.Regex to parse the value of the Content-Type header in MIME.

The BNF for a typical Content-Type looks like this:

   Content-Type = "Content-Type:" SP media-type
   media-type = type "/" subtype *( ";" gen-param )
   type = token
   subtype = token

   gen-param = pname [ "=" pval ]
   pname = token
   pval = token / quoted-string
An example Content-Type value:
multipart/mixed; boundary="theboundary" ;id=33

There may be zero or more parameters in a Content-Type.

My parsing code is as follows:

#include <boost/regex.hpp>
#include <iostream>
#include <string>

#define COUT(x) std::cout << x << std::endl
#define VAR(v) std::cout << #v << ": " << v << std::endl

#define TOKEN "[a-zA-Z0-9]+"
#define TYPE "(" TOKEN ")"
#define SUBTYPE "(" TOKEN ")"
#define PNAME "(" TOKEN ")"
#define PVALUE "(" TOKEN ")"
#define PARAM "(;" PNAME "=" PVALUE ")"
#define CONTENT_TYPE_EXPRESS TYPE "/" SUBTYPE PARAM "*"

int main()
{
    std::string ct("multipart/mixed;boundary=theboundary;id=33");
    COUT(CONTENT_TYPE_EXPRESS);
    boost::regex regex_content_type(CONTENT_TYPE_EXPRESS);
    boost::smatch result;
    if (boost::regex_match(ct, result, regex_content_type)) {
        VAR(result.size());
        for (unsigned int i=0; i<result.size(); ++i) {
            if (result[i].matched) {
                COUT("result[" << i << "] = " << result[i]);
            }
        }
    }
    else {
        COUT("match failed.");
    }

    return 0;
}

/**
 output:

([a-zA-Z0-9]+)/([a-zA-Z0-9]+)(;([a-zA-Z0-9]+)=([a-zA-Z0-9]+))*
result.size(): 6
result[0] = multipart/mixed;boundary=theboundary;id=33
result[1] = multipart
result[2] = mixed
result[3] = ;id=33
result[4] = id
result[5] = 33
 */

As the output shows, only the last parameter (id=33) ends up in the
sub-matches. I also want to get the parameter "boundary" and its value.

Is there any way to do it?

Sebastian Redl said:
Not really. Regex doesn't support getting each submatch in a repeated
expression. Xpressive does, but in this case, I think you'd be better
off modelling the content-type grammar in Spirit and using that.

Anyway, this question should have gone to boost-users, so I forward this to
boost-users.

Thanks in advance.

-- 
Regards.
                       Mockey Chen