Boost logo

Boost :

From: chintan rao (chintanraoh_at_[hidden])
Date: 2008-03-22 22:31:08


Hi,

On Sun, Mar 23, 2008 at 5:37 AM, Phil Endecott <
spam_from_boost_dev_at_[hidden]> wrote:

> chintan rao wrote:
> > I was thinking of implementing trie and various variation of tries,
> namely
> > Ternary trees, Suffix trees and DAWG(Directed Acyclic Word Graphs) , for
> > Google Soc.
>
> That's interesting. Do you have a motivating application?

Till now i was gathering ideas here at the mailing list and other places.
Will have to prepare a final application.

> I last thought about tries when Boost.Flyweight was being discussed.
> Could your tries be used there?

 I have not looked into Boost.Flyweight but I suppose they can be used :).

At one time I was considering writing an Apache log file analysis
> program. This would have used some sort of trie or similar structure
> to store the URLs (the aim is to be able to analyse very large logs in
> RAM). At the time I was considering PATRICIA tries.

Some people had suggested PATRACIA tries. I looked into the algorithm and
added it to my todo list.
Actually even the DAWG idea was suggested by someone else too :). I also
looked in Digital Search Trees :).

> I also have some code that stores large numbers of very similar filenames:
>
> /photos/christmas_2007/img_3277.jpg
> /photos/christmas_2007/img_3278.jpg
> ....
>
> These are currently stored in a binary file which is mmap()ed. The
> time to read the file is non-trivial and is bad for program start-up
> time. So perhaps I need a trie that can be stored in the mmap()ed
> file, i.e. not using pointers, like the Boost.Interprocess containers do.
>
> > I was thinking something like this
> >
> > This is the abstract of the interface for trie
> > Interface:
> > trie<type_2> something;
> > trie[string key]=type2 value; // O(key.size());
> > value trie::delete(string key); // O(key.size());
> > trie::iterator trie::find(string key); // O(key.size())'
> > int trie::size() // O(1) time. Gives the total no of
> elements in
> > the trie
> > trie::iterator trie::begin() //gives the beginning ie the
> > lexicographically least key :)
> > trie::iterator trie::end() //gives a iterator which is same as
> > x=x+trie.size();
> > string trie::get_key(iterator x) gets the key pointed to by iterator
> x;
> >
> > trie::iterator x; //worst case O(height of tree)
> > x++; // traverses the tree left to right and goes to the "next" node
> > x--; // traverses the tree right to left and goes to "prev" node
> >
> > *x // referrences the value of the key x is pointing to.
> > trie.delete(x) // deletes the key pointed to by x;
> >
> > trie.count_prefix(string key) // counts the no of keys with prefix
> key
> >
> > Examples:
> > trie::interator x;
> > trie<vector<int> > t;
> > t["hello"].push_back(9);
> > t["eat"].push_back(10);
> >
> > x=t.find("hello");
> > x--; //or --x; x now points to "eat" since eat comes before "hello"
> > lexicographically;
> > cout<<(*x)[0]<<endl;
> >
> > x++;//or ++x, x now points to "hello";
> >
> > x=t.end(); // or x++ will point to t.end().
> >
> > do
> > {
> > x--;
> > int size=(*x).size();
> > cout<<t.get_key(x)<<endl;
> > for(int j=0;j<size;j++)
> > cout<<(*x)[j]<<" ";
> > cout<<endl;
> > }
> > while(x!=t.begin());
> >
> > Please suggest changes to the interface and other things.
>

>
> I think you should be able to make it (almost) identical to std::map's
> interface. You should certainly use std::map as your starting point,
> and only deviate from it when you have a good reason to do so.
>
Should be easily possible :).

>
>
> Regards, Phil.
>
Regards,
Chintan

>
>
>
> _______________________________________________
> Unsubscribe & other changes:
> http://lists.boost.org/mailman/listinfo.cgi/boost
>


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk