Software installed on Patas

This is not a complete list of the software installed on Patas; for example, it omits many of the standard packages that ship with Red Hat. It mainly covers software of particular interest to linguists.

Text editors:

Name      Installation path  Run as      Notes
Emacs 22  /opt/emacs22/      emacs
Emacs 21  system default     emacs-21.4
Katoob    /opt/katoob/       katoob      Arabic bidirectional editor. Requires X forwarding.
Nano      system default     nano
Vim       system default     vim

Linguistics software:

Note: Some of this software was copied over from pongo and has not been tested on patas. If you find something that doesn't work, please email linghelp@u.

Name                      Installation path                             Purpose
Carmel v3.4               /NLP_TOOLS/FST/carmel/latest/bin/             Finite-state transducer
Cluto 2.1.1               /opt/cluto-2.1.1/                             "a software package for clustering low- and high-dimensional datasets and for analyzing the characteristics of the various clusters."
fsm-4.0                   /opt/fsm-4.0/                                 Tools to construct and work with weighted finite-state machines.
GIZA++ 2001-01-30         /opt/GIZA++/                                  Training of statistical translation models
Mallet 0.4                /NLP_TOOLS/tool_sets/mallet/v0.4/mallet-0.4/  "MALLET is an integrated collection of Java code useful for statistical natural language processing, document classification, clustering, information extraction, and other machine learning applications to text."
SRILM 1.5.3               /NLP_TOOLS/lm/srilm/latest/                   "a toolkit for building and applying statistical language models"
Stanford NLP 1.6          /NLP_TOOLS/parsers/stanford_parser/latest/    "a Java implementation of probabilistic natural language parsers"
WordNet 2.1               /opt/WordNet-2.1/                             Lexical database of English
Xerox Finite-State Tools  /opt/xerox-tools/

Perl modules:

Name                 Purpose
Lingua::Stem         Stems words to their root form.
Plucene              Perl port of the Lucene search engine.
Text::Levenshtein    Computes the Levenshtein edit distance.
Text::Similarity     Supermodule for a class of modules for measuring the similarity of text documents.
WordNet::QueryData   Direct Perl interface to the WordNet database.
WordNet::Similarity  Implements a variety of semantic similarity and relatedness measures based on information found in the lexical database WordNet.

Python modules:

Name      Purpose
NLTK      "NLTK, the Natural Language Toolkit, is a suite of open source Python modules, data sets and tutorials supporting research and development in natural language processing."
editdist  Calculates the Levenshtein edit distance between two strings.
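
For readers unfamiliar with the quantity that editdist (and Text::Levenshtein on the Perl side) computes, the following is a minimal pure-Python sketch of the Levenshtein edit distance. It does not use or reproduce the editdist API; the function name levenshtein is hypothetical and chosen only for this illustration.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b.

    Illustrative only: this is not the editdist module's interface.
    """
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the shorter string in the inner loop
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

The sketch keeps only two rows of the dynamic-programming table, so memory use grows with the length of the shorter string rather than with the product of both lengths.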

LaTeX packages:

Note: Many of these are only installed on the head node, so submitting a LaTeX job to Condor may not work as expected.

Name      Purpose                                                                         Notes
Tipa 1.3  A system for processing IPA (International Phonetic Alphabet) symbols in LaTeX  Manual (in PDF and DVI format) is available in /usr/local/doc/tipa-1.3
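
Because many LaTeX packages are installed only on the head node, it can help to check whether a package is visible on the machine where a job actually runs before relying on it. The sketch below is one possible check, under the assumptions that Python 3 is available and that the TeX installation provides the standard kpsewhich lookup tool. The function name latex_package_available and the style-file name tipa.sty are illustrative assumptions, not something this page documents.

```python
import shutil
import subprocess


def latex_package_available(style_file, finder="kpsewhich"):
    """Return True if `finder` can locate `style_file` (e.g. "tipa.sty").

    Hypothetical helper: assumes a TeX installation that ships kpsewhich.
    """
    if shutil.which(finder) is None:
        return False  # no TeX lookup tool visible on this machine
    result = subprocess.run([finder, style_file],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL)
    # kpsewhich prints the resolved path and exits with 0 when the file is found
    return result.returncode == 0 and bool(result.stdout.strip())


if __name__ == "__main__":
    print("tipa.sty found:", latex_package_available("tipa.sty"))
```

Running the same script on the head node and inside a Condor job would show whether the two environments differ for a given package.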

