Corpus Wish List

Over the summer, once our server environment is set up, we'll be installing corpora and other resources such as treebanks and wordnets. (This page uses "corpora" as a cover term for all of these.) Please use this page to request corpora you'd like to see. Ideally, a request should include the actual corpus title and a URL where information about it can be found. All LDC corpora are fair game. Free non-LDC corpora should be no problem. Other non-free corpora will also be considered. If you don't know of a particular corpus but want a certain kind of resource (e.g., a dependency bank for language X), go ahead and add that as well. If you see a non-specific request like this and know of an appropriate resource, please fill in a pointer.

Any further information (such as what you would like to use the corpus for) is also welcome.

  • Prague Czech-English Dependency Treebank (LDC2004T25) URL
  • Czech National Corpus URL
  • ECI Multilingual Text URL
  • Buckeye corpus URL
  • British National Corpus URL
  • CHAINS (Characterizing INdividual Speakers) URL

-- EmilyBender - 12 Apr 2005, DavidBrodbeck - 07 Jun 2007

Topic revision: r7 - 2007-06-07 - 22:36:27 - DavidBrodbeck
