Quick start guide
The easiest way to use TML is as a command line tool. To do so, you need the following:
- Java properly installed; version 1.6 or above is required. To check your version, run:
  java -version
- MySQL properly installed and working. To check this, run:
  mysql -u root
  (You may need to add the -p option if the root account has a password.)
- The latest TML distribution, available as a zip file from the download page. Unzip the contents and check README.txt for installation instructions.
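The Java check above can also be scripted. The following is a minimal sketch rather than part of TML: the function names java_version_ok and detect_java_version are hypothetical, and the script assumes the version string follows either the legacy "1.x.y" scheme or the newer "9", "11.0.2", "17" scheme. It only checks that a mysql client is on the PATH and does not try to connect.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) if the given Java version
# string is 1.6 or newer. Not part of TML.
java_version_ok() {
    version=$1
    major=${version%%.*}                      # text before the first dot
    case $major in ''|*[!0-9]*) return 1 ;; esac
    if [ "$major" -eq 1 ]; then
        rest=${version#*.}                    # drop the leading "1."
        minor=${rest%%.*}                     # "6" in "1.6.0_45"
        case $minor in ''|*[!0-9]*) return 1 ;; esac
        [ "$minor" -ge 6 ]
    else
        [ "$major" -ge 9 ]                    # newer scheme: 9, 11, 17, ...
    fi
}

# Extract the quoted version from the output of `java -version`,
# which is written to stderr.
detect_java_version() {
    java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

if command -v java >/dev/null 2>&1; then
    v=$(detect_java_version)
    if java_version_ok "$v"; then
        echo "Java $v: OK"
    else
        echo "Java $v: version 1.6 or above is required"
    fi
else
    echo "java not found on PATH"
fi

# Presence check only; run `mysql -u root` by hand to confirm the server works.
if command -v mysql >/dev/null 2>&1; then
    echo "mysql client found"
else
    echo "mysql client not found on PATH"
fi
```

The version comparison is kept in its own function so it can be exercised with sample strings without Java being installed.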
Using TML from the command line
Run TML from the command line without arguments to see all available options:
java -jar tml-cmm-3.2.jar
Common LSA tasks:
Adding documents to a new repository:
java -jar tml-cmm-3.2.jar -I -repo /path/to/repository --idocs /path/to/txt/files --iclean
Adding documents to an existing repository:
java -jar tml-cmm-3.2.jar -I -repo /path/to/repository --idocs /path/to/txt/files
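The two import commands above differ only in the --iclean option. As a rough sketch, a wrapper could choose between them automatically. The function name tml_import_command is hypothetical, and the rule it applies (a repository path that does not exist yet counts as "new") is an assumption, not documented TML behavior. The function only prints the command, and it does not handle paths containing spaces.

```shell
#!/bin/sh
# Hypothetical helper: print the TML import command for a repository
# and a folder of text files.
# ASSUMPTION: a repository directory that does not exist yet is treated
# as new and gets --iclean; an existing one is left as it is.
tml_import_command() {
    repo=$1
    docs=$2
    cmd="java -jar tml-cmm-3.2.jar -I -repo $repo --idocs $docs"
    if [ ! -d "$repo" ]; then
        cmd="$cmd --iclean"
    fi
    printf '%s\n' "$cmd"
}

# Dry run: show the command that would be used.
tml_import_command /path/to/repository /path/to/txt/files
```

Printing the command instead of running it makes it possible to review the chosen options before anything touches the repository.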
Calculating distances between all documents in the repository with default parameters:
java -jar tml-cmm-3.2.jar -O -repo /path/to/repository --ocorpus "type:document" --operations PassagesSimilarity
Calculating distances between all documents projected onto their own semantic space, keeping 2 dimensions:
java -jar tml-cmm-3.2.jar -O -repo /path/to/repository --ocorpus "type:document" --odim NUM --odimth 2 --operations PassagesSimilarity
Calculating distances between all documents projected onto their own semantic space, with terms selected by document frequency (DF) of 2 or more, TF-IDF as the weighting scheme, and 20 percent of the dimensions kept:
java -jar tml-cmm-3.2.jar -O -repo /path/to/repository --ocorpus "type:document" --otsel DF --otselth 2 --otwl TF --otwg Idf --odim PCT --odimth 20 --operations PassagesSimilarity
Same as before, but reading the parameters from a file:
java -jar tml-cmm-3.2.jar -O -repo /path/to/repository --ocorpus "type:document" --ocparams corpus.properties --operations PassagesSimilarity
With corpus.properties containing:
termselcrit = DF
termselthre = 2
reduxcrit = PCT
reduxthre = 20
localtw = TF
globaltw = Idf
lanczos = false
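For convenience, the properties file listed above can be created from the shell with a here-document. This is only a sketch: it writes corpus.properties to the current directory, overwriting any existing file, and runs TML only if tml-cmm-3.2.jar is found in the same directory (that location is an assumption).

```shell
#!/bin/sh
# Write the seven parameters shown above to corpus.properties
# in the current directory (an existing file is overwritten).
cat > corpus.properties <<'EOF'
termselcrit = DF
termselthre = 2
reduxcrit = PCT
reduxthre = 20
localtw = TF
globaltw = Idf
lanczos = false
EOF

# ASSUMPTION: the jar sits in the current directory. If it does not,
# the script just reports that the file was written.
if [ -f tml-cmm-3.2.jar ]; then
    java -jar tml-cmm-3.2.jar -O -repo /path/to/repository --ocorpus "type:document" --ocparams corpus.properties --operations PassagesSimilarity
else
    echo "corpus.properties written; tml-cmm-3.2.jar not found, skipping run"
fi
```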
Common CMM tasks:
Adding documents to a new repository while annotating them with the Stanford parser:
java -jar tml-cmm-3.2.jar -I -repo /path/to/repository --idocs /path/to/txt/files --iclean --iannotators PennTreeAnnotator
Adding documents to an existing repository while annotating them with the Stanford parser:
java -jar tml-cmm-3.2.jar -I -repo /path/to/repository --idocs /path/to/txt/files --iannotators PennTreeAnnotator
Extracting a Concept Map from document sample.txt:
java -jar tml-cmm-3.2.jar -O -repo /path/to/repository --ocorpus "type:sentence AND parent:sample" --operations CmmProcess
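In the example above, the query appears to reference the document by its file name without the .txt extension (sample.txt becomes parent:sample). Assuming that convention holds in general, which this guide does not state explicitly, a small hypothetical helper, cmm_query_for, could build the --ocorpus value for any file:

```shell
#!/bin/sh
# Hypothetical helper: build the --ocorpus query for one document.
# ASSUMPTION: the parent id is the file name without its .txt extension,
# as inferred from the sample.txt -> parent:sample example.
cmm_query_for() {
    name=$(basename "$1" .txt)
    printf 'type:sentence AND parent:%s\n' "$name"
}

# Dry run: print (do not execute) the extraction command for sample.txt.
query=$(cmm_query_for /path/to/txt/files/sample.txt)
echo "java -jar tml-cmm-3.2.jar -O -repo /path/to/repository --ocorpus \"$query\" --operations CmmProcess"
```

If the repository uses a different naming rule for parent documents, only the body of cmm_query_for needs to change.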
For a full list of the available operations, check the package tml.vectorspace.operations in the API docs.