labMTsimple.storyLab functions¶
-
labMTsimple.storyLab.copy_static_files()¶ Deprecated method to copy files from this module’s static directory into the directory where shifts are being made.
-
labMTsimple.storyLab.emotion(tmpStr, someDict, scoreIndex=1, shift=False, happsList=[])¶ Take a string and the happiness dictionary, and rate the string.
If shift=True, will return a vector (also then needs the happsList).
-
labMTsimple.storyLab.emotionFileReader(stopval=1.0, lang='english', min=1.0, max=9.0, returnVector=False)¶ Load the dictionary of sentiment words.
stopval is our lens, $\Delta_h$: the labMT dataset (which must be tab-delimited) is read into a dict with this lens applied.
With returnVector=True, returns tmpDict, tmpList, wordList. Otherwise, returns just the dictionary.
-
labMTsimple.storyLab.emotionV(frequencyVec, scoreVec)¶ Given the frequency vector and the score vector, compute the happs.
Doesn’t use numpy, but equivalent to np.dot(freq,happs)/np.sum(freq).
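As an illustration of the equivalence stated above, here is a minimal pure-Python sketch of a frequency-weighted average. The function name and the choice to return None for an all-zero frequency vector are assumptions made for this example, not the library's behavior.

```python
def weighted_average_happiness(frequency_vec, score_vec):
    """Frequency-weighted mean of the scores, computed without numpy.

    Mirrors np.dot(freq, happs) / np.sum(freq). Returning None when the
    total frequency is zero is a choice made for this sketch only.
    """
    total = sum(frequency_vec)
    if total == 0:
        return None
    weighted_sum = sum(f * s for f, s in zip(frequency_vec, score_vec))
    return weighted_sum / total


# Two words: one seen once with score 2.0, one seen three times with score 6.0.
average = weighted_average_happiness([1, 3], [2.0, 6.0])
```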
-
labMTsimple.storyLab.link_static_files()¶ Same as copy_static_files, but makes symbolic links.
-
labMTsimple.storyLab.shift(refFreq, compFreq, lens, words, sort=True)¶ Compute a shift, and return the results.
If sort=True, returns the three sorted lists and sumTypes. Otherwise, returns just the two shift lists and sumTypes (the words don’t need to be sorted).
-
labMTsimple.storyLab.shiftHtml(scoreList, wordList, refFreq, compFreq, outFile, corpus='LabMT', advanced=False, customTitle=False, title='', ref_name='reference', comp_name='comparison', ref_name_happs='', comp_name_happs='', isare='')¶ Make an interactive shift for exploring and sharing.
The most insane-o piece of code here (lots of file copying, writing vectors into HTML files, etc.).
Accepts a score list, a word list, two frequency files, and the name of an HTML file to generate.
Creates the HTML file and a directory called static that hosts the .js and .css files it needs.
-
labMTsimple.storyLab.shiftHtmlJupyter(scoreList, wordList, refFreq, compFreq, outFile, corpus='LabMT', advanced=False, customTitle=False, title='', ref_name='reference', comp_name='comparison', ref_name_happs='', comp_name_happs='', isare='', saveFull=True, selfshift=False, bgcolor='white')¶ Shifter that generates HTML in two pieces, designed to work inside of a Jupyter notebook.
Saves the file under the given name (with .html extension) and also writes a filename-wrapper.html. The wrapper file has the HTML headers and everything else needed to be a standalone file. The named HTML file is just the guts of the page, because the complete markup isn’t needed inside the notebook.
-
labMTsimple.storyLab.shiftHtmlPreshifted(scoreList, wordList, refFreq, compFreq, outFile, corpus='LabMT', advanced=False, customTitle=False, title='', ref_name='reference', comp_name='comparison', ref_name_happs='', comp_name_happs='', isare='')¶ Make an interactive shift for exploring and sharing.
The most insane-o piece of code here (lots of file copying, writing vectors into HTML files, etc.).
Accepts a score list, a word list, two frequency files, and the name of an HTML file to generate.
Creates the HTML file and a directory called static that hosts the .js and .css files it needs.
-
labMTsimple.storyLab.stopper(tmpVec, score_list, word_list, stopVal=1.0, ignore=[], center=5.0)¶ Take a frequency vector, and 0 out the stop words.
Will always remove the nig* words.
Return the 0’ed vector.
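The idea of zeroing out stop words can be sketched as follows. This assumes that a word counts as a stop word when its score lies within stopVal of center, and that every word in ignore is also zeroed. The function name and the example data are hypothetical, and the sketch covers only the lens and the ignore list, not the words that the library always removes.

```python
def zero_stop_words(freq_vec, score_list, word_list,
                    stop_val=1.0, ignore=(), center=5.0):
    """Return a copy of freq_vec with stop-word frequencies set to 0.

    Assumption for this sketch: a word is a stop word when
    abs(score - center) < stop_val, or when it appears in `ignore`.
    """
    ignored = set(ignore)
    stopped = list(freq_vec)  # copy, so the input vector is left untouched
    for i, (score, word) in enumerate(zip(score_list, word_list)):
        if abs(score - center) < stop_val or word in ignored:
            stopped[i] = 0
    return stopped


# Hypothetical words, scores, and counts, for illustration only.
words = ["laughter", "the", "terrorist", "sunny"]
scores = [8.5, 5.2, 1.3, 7.0]
freqs = [10, 20, 30, 40]
stopped_freqs = zero_stop_words(freqs, scores, words, ignore=["sunny"])
```

With the default lens of 1.0 around a center of 5.0, only "the" falls inside the stop window; "sunny" is zeroed because it is listed in ignore.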
-
labMTsimple.storyLab.stopper_mat(tmpVec, score_list, word_list, stopVal=1.0, ignore=[], center=5.0)¶ Take a frequency vector, and 0 out the stop words.
A sparse-aware matrix stopper. Frequency vectors are the rows: [i,:].
Will always remove the nig* words.
Return the 0’ed matrix, sparse.
labMTsimple.speedy sentiDict class¶
-
class
labMTsimple.speedy.sentiDict(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶ An abstract class to score them all.
-
makeListsFromDict()¶ Make lists from a dict, used internally.
-
makeMarisaTrie(save_flag=False)¶ Turn a dictionary into a marisa_trie.
-
matcherDictBool(word)¶ matcherDictBool(word) just checks if a word is in the dict.
-
matcherTrieBool(word)¶ matcherTrieBool(word) just checks if a word is in the list. Returns 0 or 1.
Works for both trie types; only one is needed to make the plots. It is only used for coverage, so don’t worry about using it with a dict.
-
matcherTrieDict(word, wordVec, count)¶ Not sure what this one does.
-
matcherTrieMarisa(word, wordVec, count)¶ Not sure what this one does.
-
my_marisa= (<marisa_trie.RecordTrie object>, <marisa_trie.RecordTrie object>)¶ Declare this globally.
-
openWithPath(filename, mode)¶ Helper function for searching for files.
-
scoreTrieDict(wordDict, idx=1, center=0.0, stopVal=0.0)¶ Score a wordDict using the dict backend.
INPUTS:
-wordDict is our favorite hash table of word and count.
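A rough sketch of scoring a word-count hash table against a plain dict follows. The entry layout (a tuple whose position idx holds the score), the function name, and the None return for an empty match are all assumptions made for illustration; they are not taken from the library.

```python
def score_word_counts(word_counts, scored_words, idx=1, center=0.0, stop_val=0.0):
    """Count-weighted average score of the words found in scored_words.

    Hypothetical layout: scored_words maps word -> tuple, and entry[idx]
    is the score. Words with abs(score - center) < stop_val are skipped.
    """
    total_count = 0
    weighted_sum = 0.0
    for word, count in word_counts.items():
        entry = scored_words.get(word)
        if entry is None:
            continue  # word is not covered by the dictionary
        score = entry[idx]
        if abs(score - center) < stop_val:
            continue  # word falls inside the stop window
        total_count += count
        weighted_sum += count * score
    if total_count == 0:
        return None  # sketch-only choice for "nothing matched"
    return weighted_sum / total_count


# Hypothetical dictionary entries: (index, score).
scored = {"happy": (0, 8.0), "sad": (1, 2.0), "the": (2, 5.0)}
counts = {"happy": 3, "sad": 1, "the": 10, "zzz": 4}
lensed_score = score_word_counts(counts, scored, center=5.0, stop_val=1.0)
```

In this example "zzz" is ignored because it has no entry, and "the" is skipped because its score sits at the center of the stop window.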
-
scoreTrieMarisa(wordDict, idx=1, center=0.0, stopVal=0.0)¶ Score a wordDict using the marisa_trie backend.
INPUTS:
-wordDict is our favorite hash table of word and count.
-
stopper(tmpVec, stopVal=1.0, ignore=[])¶ Take a frequency vector, and 0 out the stop words.
Will always remove the nig* words.
Return the 0’ed vector.
-
wordVecifyTrieDict(wordDict)¶ Make a word vec from word dict using dict backend.
INPUTS:
-wordDict is our favorite hash table of word and count.
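Turning a word-count hash table into a word vector might look like the sketch below. It assumes each dictionary entry stores the word's position in the vector as its first element; that layout and the function name are hypothetical.

```python
def word_vec_from_counts(word_counts, scored_words):
    """Build a frequency vector aligned with the dictionary's word order.

    Hypothetical layout: scored_words maps word -> tuple, and entry[0]
    is the word's index in the vector. Uncovered words are dropped.
    """
    vec = [0] * len(scored_words)
    for word, count in word_counts.items():
        entry = scored_words.get(word)
        if entry is not None:
            vec[entry[0]] += count
    return vec


# Hypothetical dictionary entries: (index, score).
scored = {"happy": (0, 8.0), "sad": (1, 2.0), "the": (2, 5.0)}
word_vec = word_vec_from_counts({"happy": 3, "sad": 1, "the": 10, "zzz": 4}, scored)
```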
-
wordVecifyTrieMarisa(wordDict)¶ Make a word vec from word dict using marisa_trie backend.
INPUTS:
-wordDict is our favorite hash table of word and count.
-
The following subclasses of the sentiDict class are available:
-LabMT
-ANEW
-LIWC07
-MPQA
-OL
-WK
-LIWC01
-LIWC15
-PANASX
-Pattern
-SentiWordNet
-AFINN
-GI
-WDAL
-EmoLex
-MaxDiff
-HashtagSent
-Sent140Lex
-SOCAL
-SenticNet
-Emoticons
-SentiStrength
-VADER
-Umigon
-USent
-EmoSenticNet
These don’t get the data attributes, so we’ll just leave them out.
labMTsimple.speedy sentiDict subclasses¶
-
class
labMTsimple.speedy.LabMT(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶ LabMT class.
Now takes the full name of the language.
-
class
labMTsimple.speedy.ANEW(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶ ANEW class.
-
loadDict(bananas, lang)¶ Load the corpus into a dictionary, straight from the origin corpus file.
-
-
class
labMTsimple.speedy.LIWC07(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶ This is the default, define it anyway
-
class
labMTsimple.speedy.MPQA(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶ MPQA class.
-
loadDict(bananas, lang)¶ Load the corpus into a dictionary, straight from the origin corpus file.
-
-
class
labMTsimple.speedy.OL(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
loadDict(bananas, lang)¶ Load the corpus into a dictionary, straight from the origin corpus file.
-
-
class
labMTsimple.speedy.WK(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.LIWC01(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.LIWC15(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.PANASX(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.Pattern(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.SentiWordNet(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.AFINN(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.GI(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.WDAL(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.EmoLex(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.MaxDiff(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.HashtagSent(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.Sent140Lex(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.SOCAL(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.SenticNet(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.Emoticons(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.SentiStrength(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.VADER(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.Umigon(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.USent(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
-
class
labMTsimple.speedy.EmoSenticNet(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶
labMTsimple.speedy sentiDict subclasses auto¶
-
class
labMTsimple.speedy.LIWC(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')¶ LIWC class.
-
loadDict(bananas, lang)¶ Load the corpus into a dictionary, straight from the origin corpus file.
-
-
labMTsimple.speedy.u(x)¶ Python 2/3 agnostic unicode function