The graphical user interface (GUI) for MALLET, a popular topic modeling implementation, is a useful alternative to the terminal or command line input that MALLET normally uses. Freely downloadable here, it is a quick and easy way to get started with topic modeling without needing to be comfortable on the command line. To start, download the file. Once the download has finished, open TopicModelingTool.jar to begin. You will need Java installed on your machine in order to run it.

With the program started, first select an input file or directory. This is the file or group of files that you want to run topic modeling on. After you select the correct input directory, specify your output directory. This is where MALLET will write your results, both as a flat-file CSV version and as an HTML version for viewing in your web browser. The default is a new folder within your "downloads" folder, but it is helpful to move it to a more permanent spot. If you want to make your results shareable online, you can specify here where on your online server you'd like the results to go, and they will be sent there in fully interactive form.

Lastly, before you run the program, click the "Advanced" button next to the number of topics you want to create. Make sure that the "Remove Stopwords" box is checked and that the Mallet Default file is selected in the Stopword File tab at the top of the window. If you are working with a corpus that is not modern English, i.e. one that has lots of "thy" or "thou" or regional dialects, you may want to customize the stopword list. The basic English stopwords list is housed here on GitHub. You can download this file, make your own additions, and then select your modified file in place of the default Mallet list in the "Advanced" section. Now you are ready to hit "learn topics" and run MALLET.
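If you would rather script your additions to the stopword list than edit the file by hand, the step can be sketched in Python as below. This is only an illustration: it assumes the downloaded stopword file is plain text with one word per line, and the function names and file names (`en.txt`, `en_custom.txt`) are hypothetical, not part of MALLET or the GUI tool.

```python
from pathlib import Path


def extend_stopwords(base_words, extra_words):
    """Merge two word lists into one lowercase, de-duplicated, sorted list."""
    merged = {w.strip().lower() for w in base_words if w.strip()}
    merged.update(w.strip().lower() for w in extra_words if w.strip())
    return sorted(merged)


def write_custom_stoplist(base_path, extra_words, out_path):
    """Read a stoplist (assumed: one word per line), add words, save a copy."""
    base_words = Path(base_path).read_text(encoding="utf-8").splitlines()
    words = extend_stopwords(base_words, extra_words)
    Path(out_path).write_text("\n".join(words) + "\n", encoding="utf-8")
    return words


# Example call (hypothetical file names; adjust to where you saved the list):
# write_custom_stoplist("en.txt", ["thy", "thou", "thee", "hath"], "en_custom.txt")
```

The resulting file (here `en_custom.txt`) is the one you would then select in the "Advanced" section in place of the default list. The original download is left untouched, so you can always go back to it.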

...