|
Download

Grabbing

At the beginning, I wrote some programs and scripts to grab the whole content of Wikipedia articles. But I realized that this is not a good way to obtain the content of Wikipedia. It is faster and better to use the Wikipedia dumps available at download.wikipedia.org. All languages are available: English, German, French, ...

Analysis

One program converts the XML articles from download.wikipedia.org into a trie, and a second program then performs queries on this trie. For the moment, I have written two versions of the query program.
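
To make the trie idea above more concrete, here is a minimal sketch in Python. It is an illustration only, not the code from the archive: the inline sample dump, the choice to store article titles at the end of each word, and the names `build_trie`, `query_word` and `query_prefix` are all assumptions made for this example. Real dumps are far larger and declare an XML namespace on the root element, which this simplified sample omits.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for a dump file (assumption:
# real dumps are much larger and use an XML namespace on <mediawiki>).
SAMPLE_DUMP = """
<mediawiki>
  <page><title>Trie</title><revision><text>A trie is a tree of prefixes.</text></revision></page>
  <page><title>Tree</title><revision><text>A tree is a connected graph.</text></revision></page>
</mediawiki>
"""


class TrieNode:
    def __init__(self):
        self.children = {}   # next character -> TrieNode
        self.titles = set()  # titles of articles containing the word ending here


def build_trie(xml_text):
    """Index every word of every article under the titles it appears in."""
    root = TrieNode()
    for page in ET.fromstring(xml_text).iter("page"):
        title = page.findtext("title")
        text = page.findtext("revision/text") or ""
        for word in re.findall(r"\w+", text.lower()):
            node = root
            for ch in word:
                node = node.children.setdefault(ch, TrieNode())
            node.titles.add(title)
    return root


def _find(root, prefix):
    """Walk down the trie along `prefix`; return None if the path is missing."""
    node = root
    for ch in prefix.lower():
        node = node.children.get(ch)
        if node is None:
            return None
    return node


def query_word(root, word):
    """Titles of the articles that contain exactly this word."""
    node = _find(root, word)
    return set(node.titles) if node is not None else set()


def query_prefix(root, prefix):
    """Titles of the articles that contain any word starting with `prefix`."""
    start = _find(root, prefix)
    found = set()
    stack = [start] if start is not None else []
    while stack:
        node = stack.pop()
        found |= node.titles
        stack.extend(node.children.values())
    return found
```

With this sketch, `build_trie(SAMPLE_DUMP)` returns the root node, `query_word(root, "trie")` looks up a single word, and `query_prefix(root, "tr")` collects the articles for every indexed word beginning with "tr". Storing the titles in the nodes keeps both kinds of query proportional to the length of the search string plus the size of the matching subtree.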
All these programs are available under the GPL licence in this [archive]; you can read the README file inside the tarball for quick help.

Compiled resources

If you do not want to compile the files from download.wikipedia.org yourself, or you do not have time to download them, you can download the compiled resources here:
|