EBBA 21255
Word Frequency Cloud
A word cloud is a visual representation of free-form text. Word clouds usually show single words, with the importance of each word indicated by font size or color. This format is useful for quickly seeing the most prominent terms and for locating a term alphabetically to judge its relative prominence. For our purposes, the word cloud visualizes the frequencies of the words in a ballad: the larger the word, the more often it appears in the ballad.
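As a rough illustration of the idea, the sketch below counts the words in a text and scales each count linearly into a font size. The function names, the simple tokenizer, and the 10 to 48 point size range are assumptions made for this example, not a description of how the archive builds its clouds.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count lowercase word tokens in a text (hypothetical helper)."""
    # Assumption: a word is a run of letters or apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


def font_sizes(freqs, min_size=10, max_size=48):
    """Scale each word count linearly into a font size range."""
    if not freqs:
        return {}
    lo, hi = min(freqs.values()), max(freqs.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all counts are equal
    return {
        word: min_size + (count - lo) * (max_size - min_size) / span
        for word, count in freqs.items()
    }
```

Under these assumptions, the most frequent word in a ballad receives the largest size and words that appear once receive the smallest.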
TFIDF Word Cloud
TFIDF, or tf-idf (short for term frequency–inverse document frequency), is a numerical statistic that reflects how important a word is to a document in a collection or corpus. It is often used as a weighting factor in information retrieval, text mining, and user modeling. The tf-idf value increases proportionally to the number of times a word appears in the document and is offset by the number of documents in the corpus that contain the word, which adjusts for the fact that some words are more common in general. In other words, tf-idf diminishes the weight of terms that occur in many documents of the corpus and increases the weight of terms that occur in few. The TFIDF cloud therefore takes into account how a word is used across the entire corpus, whereas the word frequency cloud uses only the frequency of a word in a single document.
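The weighting described above can be sketched in a few lines. This example uses one common variant (relative term frequency multiplied by the natural log of the inverse document frequency); the exact formula used for the TFIDF cloud is not stated here, so treat the variant and the function name as assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {word: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        weights.append({
            # Assumed variant: tf = count / document length, idf = ln(N / df).
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return weights
```

With this variant, a word that appears in every document receives a weight of zero, while a word that is frequent in one document and absent from the others receives a high weight in that document.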
Document Topics Histogram
The document topics histogram makes use of latent Dirichlet allocation (LDA) topic modelling, a generative statistical model in which sets of observations are explained by unobserved groups that account for why some parts of the data are similar. For our purposes, the observations are words collected into documents, and the LDA model forms a fixed number of topics (160 in our case), where each word's presence is attributable to one of the document's topics. In other words, for each ballad in the corpus, we find how frequently its words are associated with each specific topic.
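To make the histogram concrete, the sketch below assumes that a fitted LDA model has already attributed each word token in a ballad to one topic, and simply tallies those attributions into per-topic shares. It does not fit an LDA model; the function names and the input format are hypothetical and chosen only for illustration.

```python
from collections import Counter


def topic_histogram(assignments, n_topics):
    """assignments: one topic index per word token in a document.

    Returns the share of tokens attributed to each topic.
    """
    total = len(assignments)
    if total == 0:
        return [0.0] * n_topics
    counts = Counter(assignments)
    return [counts[k] / total for k in range(n_topics)]


def top_topics(hist, n=3):
    """Return the n (topic index, share) pairs with the largest shares."""
    return sorted(enumerate(hist), key=lambda pair: pair[1], reverse=True)[:n]
```

For a model with 160 topics, `n_topics` would be 160 and most entries of the histogram for a single ballad would typically be zero or close to it.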
Word Association Network
English Broadside Ballad Archive
Housed at the University of California at Santa Barbara, Department of English
Director: Patricia Fumerton – Associate Director: Carl Stahmer – Assistant Director: Kristen McCants Forbes