|
History

June 2006

I had the idea for this project at the beginning of 2006 and started working on it in June 2006 by grabbing the English articles. I grabbed the whole content of Wikipedia with some scripts and programs and transformed the XHTML content to XML. This was not the best way to obtain the content, however: I later realized that full dumps of Wikipedia are available at download.wikipedia.org.

July 2006

I started developing wikipedia-suggest using the XML articles produced in July 2006. The idea was to rank the list of results with a link analysis approach: each article votes for its favorite documents, using the links to other articles it contains and the frequency of each link. After this analysis, each article has a rank (the number of links pointing to it) and can be added by title to a trie (a finite state machine). Each node of this trie contains a static table of the best articles for that position.

The figure below shows an example of a trie with a static table associated with the queries "a", "w", and "wi".
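
The ranking and trie construction described above can be illustrated with a minimal Python sketch. It is not taken from the project's programs: the function names (rank_articles, build_trie, suggest), the table size k, the lowercasing of titles, and the choice to ignore self-links are all assumptions made for the example, and the link data is invented.

```python
from collections import Counter


def rank_articles(links):
    """Rank each article by the total frequency of links pointing to it.

    links maps a source title to {target title: link frequency}.
    Self-links are ignored (an assumption, not stated in the original).
    """
    ranks = Counter({title: 0 for title in links})
    for source, targets in links.items():
        for target, freq in targets.items():
            if target != source:
                ranks[target] += freq
    return ranks


class TrieNode:
    def __init__(self):
        self.children = {}  # next character -> TrieNode
        self.best = []      # static table: best (title, rank) pairs for this prefix


def build_trie(ranks, k=2):
    """Insert titles by decreasing rank; each node keeps the first k it sees."""
    root = TrieNode()
    ordered = sorted(ranks.items(), key=lambda item: (-item[1], item[0]))
    for title, rank in ordered:
        node = root
        for ch in title.lower():
            node = node.children.setdefault(ch, TrieNode())
            if len(node.best) < k:
                node.best.append((title, rank))
    return root


def suggest(root, prefix):
    """Return the precomputed best titles for a prefix, or [] if none match."""
    node = root
    for ch in prefix.lower():
        node = node.children.get(ch)
        if node is None:
            return []
    return [title for title, _ in node.best]


# Invented toy link graph, for illustration only.
links = {
    "Wikipedia": {"Wiki": 2, "Web": 1},
    "Wiki": {"Wikipedia": 3, "Web": 1},
    "Web": {"Wikipedia": 1},
    "Apple": {"Web": 1},
}
ranks = rank_articles(links)
trie = build_trie(ranks, k=2)
print(suggest(trie, "a"))   # ['Apple']
print(suggest(trie, "w"))   # ['Wikipedia', 'Web']
print(suggest(trie, "wi"))  # ['Wikipedia', 'Wiki']
```

Because the tables are filled once at build time, answering a query only requires walking the prefix and reading the table stored at the final node, which is presumably why a static table per node was chosen.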
All these programs are available under the GPL licence in this [archive].

August 2006

I improved Wikipedia suggest by adding the following functionalities:
|