Book Edition: 2009, 1e Lists and strings do not have exactly the same functionality. Lists have the added power that you can change their elements: >>> beatles[0] = "John Lennon" >>> del beatles[-1] >>> beatles ['John Lennon', 'Paul', 'George'] On the other hand, if we try to do that with a string—changing the 0th character in query to 'F'—we get: >>> query[0] = 'F' Traceback (most recent call last): File "<stdin>", line 1, in ? TypeError: object does not support item assignment This is because strings are immutable: you can’t change a string once you have created it. However, lists are mutable, and their contents can be modified at any time. As a result, lists support operations that modify the original value rather than producing a new value. --- --- --- --- --- What Is Unicode? Unicode supports over a million characters. Each character is assigned a number, called a code point. In Python, code points are written in the form \uXXXX, where XXXX is the number in four-digit hexadecimal form. Within a program, we can manipulate Unicode strings just like normal strings. However, when Unicode characters are stored in files or displayed on a terminal, they must be encoded as a stream of bytes. Some encodings (such as ASCII and Latin-2) use a single byte per code point, so they can support only a small subset of Unicode, enough for a single language. Other encodings (such as UTF-8) use multiple bytes and can represent the full range of Unicode characters. Text in files will be in a particular encoding, so we need some mechanism for translating it into Unicode—translation into Unicode is called decoding. Conversely, to write out Unicode to a file or a terminal, we first need to translate it into a suitable encoding—this translation out of Unicode is called encoding, and is illustrated in Figure 3-3. From a Unicode perspective, characters are abstract entities that can be realized as one or more glyphs. Only glyphs can appear on a screen or be printed on paper. 
A font is a mapping from characters to glyphs. In Python, a Unicode string literal can be specified by preceding an ordinary string literal with a u, as in u'hello'. Arbitrary Unicode characters are defined using the \uXXXX escape sequence inside a Unicode string literal. We find the integer ordinal of a character using ord(). For example: >>> ord('a') 97 The hexadecimal four-digit notation for 97 is 0061, so we can define a Unicode string literal with the appropriate escape sequence: >>> a = u'\u0061' >>> a u'a' >>> print a a --- --- --- --- --- 4.7 Algorithm Design A major part of algorithmic problem solving is selecting or adapting an appropriate algorithm for the problem at hand. Sometimes there are several alternatives, and choosing the best one depends on knowledge about how each alternative performs as the size of the data grows. Whole books are written on this topic, and we only have space to introduce some key concepts and elaborate on the approaches that are most prevalent in natural language processing. The best-known strategy is known as divide-and-conquer. We attack a problem of size n by dividing it into two problems of size n/2, solve these problems, and combine their results into a solution of the original problem. For example, suppose that we had a pile of cards with a single word written on each card. We could sort this pile by splitting it in half and giving it to two other people to sort (they could do the same in turn). Then, when two sorted piles come back, it is an easy task to merge them into a single sorted pile. See Figure 4-3 for an illustration of this process. Another example is the process of looking up a word in a dictionary. We open the book somewhere around the middle and compare our word with the current page. If it’s earlier in the dictionary, we repeat the process on the first half; if it’s later, we use the second half. This search method is called binary search since it splits the problem in half at every step. 
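The binary search procedure just described can be sketched as a short function (an illustrative implementation with invented data, not NLTK code):

```python
def binary_search(sorted_words, target):
    """Return True if target occurs in sorted_words, halving the
    search space at every step (like opening a dictionary near the
    middle and deciding which half to keep)."""
    lo, hi = 0, len(sorted_words)
    while lo < hi:
        mid = (lo + hi) // 2
        if sorted_words[mid] == target:
            return True
        elif sorted_words[mid] < target:
            lo = mid + 1   # target sorts later: keep the second half
        else:
            hi = mid       # target sorts earlier: keep the first half
    return False

words = sorted(['pond', 'sword', 'system', 'mandate', 'ceremony'])
print(binary_search(words, 'sword'))   # True
print(binary_search(words, 'parrot'))  # False
```

Because the list is halved at each step, looking up one word among a million takes only about twenty comparisons.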
In another approach to algorithm design, we attack a problem by transforming it into an instance of a problem we already know how to solve. For example, in order to detect duplicate entries in a list, we can pre-sort the list, then scan through it once to check whether any adjacent pairs of elements are identical. --- --- --- --- --- Stemmers NLTK includes several off-the-shelf stemmers, and if you ever need a stemmer, you should use one of these in preference to crafting your own using regular expressions, since NLTK’s stemmers handle a wide range of irregular cases. The Porter and Lancaster stemmers follow their own rules for stripping affixes. Observe that the Porter stemmer correctly handles the word lying (mapping it to lie), whereas the Lancaster stemmer does not. >>> porter = nltk.PorterStemmer() >>> lancaster = nltk.LancasterStemmer() >>> [porter.stem(t) for t in tokens] ['DENNI', ':', 'Listen', ',', 'strang', 'women', 'lie', 'in', 'pond', 'distribut', 'sword', 'is', 'no', 'basi', 'for', 'a', 'system', 'of', 'govern', '.', 'Suprem', 'execut', 'power', 'deriv', 'from', 'a', 'mandat', 'from', 'the', 'mass', ',', 'not', 'from', 'some', 'farcic', 'aquat', 'ceremoni', '.'] >>> [lancaster.stem(t) for t in tokens] ['den', ':', 'list', ',', 'strange', 'wom', 'lying', 'in', 'pond', 'distribut', 'sword', 'is', 'no', 'bas', 'for', 'a', 'system', 'of', 'govern', '.', 'suprem', 'execut', 'pow', 'der', 'from', 'a', 'mand', 'from', 'the', 'mass', ',', 'not', 'from', 'som', 'farc', 'aqu', 'ceremony', '.'] --- --- --- --- --- Lemmatization The WordNet lemmatizer removes affixes only if the resulting word is in its dictionary. This additional checking process makes the lemmatizer slower than the stemmers just mentioned. Notice that it doesn’t handle lying, but it converts women to woman. 
>>> wnl = nltk.WordNetLemmatizer() >>> [wnl.lemmatize(t) for t in tokens] ['DENNIS', ':', 'Listen', ',', 'strange', 'woman', 'lying', 'in', 'pond', 'distributing', 'sword', 'is', 'no', 'basis', 'for', 'a', 'system', 'of', 'government', '.', 'Supreme', 'executive', 'power', 'derives', 'from', 'a', 'mandate', 'from', 'the', 'mass', ',', 'not', 'from', 'some', 'farcical', 'aquatic', 'ceremony', '.'] The WordNet lemmatizer is a good choice if you want to compile the vocabulary of some texts and want a list of valid lemmas (or lexicon headwords). --- --- --- --- --- 3.8 Segmentation This section discusses more advanced concepts, which you may prefer to skip on the first time through this chapter. Tokenization is an instance of a more general problem of segmentation. In this section, we will look at two other instances of this problem, which use radically different techniques to the ones we have seen so far in this chapter. Sentence Segmentation Manipulating texts at the level of individual words often presupposes the ability to divide a text into individual sentences. As we have seen, some corpora already provide access at the sentence level. In the following example, we compute the average number of words per sentence in the Brown Corpus: >>> len(nltk.corpus.brown.words()) / len(nltk.corpus.brown.sents()) 20.250994070456922 In other cases, the text is available only as a stream of characters. Before tokenizing the text into words, we need to segment it into sentences. NLTK facilitates this by including the Punkt sentence segmenter (Kiss & Strunk, 2006). Here is an example of its use in segmenting the text of a novel. (Note that if the segmenter’s internal data has been updated by the time you read this, you will see different output.) 
>>> sent_tokenizer=nltk.data.load('tokenizers/punkt/english.pickle') >>> text = nltk.corpus.gutenberg.raw('chesterton-thursday.txt') >>> sents = sent_tokenizer.tokenize(text) >>> pprint.pprint(sents[171:181]) ['"Nonsense!', '" said Gregory, who was very rational when anyone else\nattempted paradox.', '"Why do all the clerks and navvies in the\nrailway trains look so sad and tired,...', 'I will\ntell you.', 'It is because they know that the train is going right.', 'It\nis because they know that whatever place they have taken a ticket\nfor that ...', 'It is because after they have\npassed Sloane Square they know that the next stat...', 'Oh, their wild rapture!', 'oh,\ntheir eyes like stars and their souls again in Eden, if the next\nstation w...', '"\n\n"It is you who are unpoetical," replied the poet Syme.'] --- --- --- --- --- Like every other NLTK module, distance.py begins with a group of comment lines giving a one-line title of the module and identifying the authors. (Since the code is distributed, it also includes the URL where the code is available, a copyright statement, and license information.) Next is the module-level docstring, a triple-quoted multiline string containing information about the module that will be printed when someone types help(nltk.metrics.distance). # Natural Language Toolkit: Distance Metrics # # Author: Edward Loper <edloper@gradient.cis.upenn.edu> # Steven Bird <sb@csse.unimelb.edu.au> # """ Distance Metrics. Compute the distance between two items (usually strings). As metrics, they must satisfy the following three requirements: 1. d(a, a) = 0 2. d(a, b) >= 0 3. d(a, c) <= d(a, b) + d(b, c) """ After this comes all the import statements required for the module, then any global variables, followed by a series of function definitions that make up most of the module. Other modules define “classes,” the main building blocks of object-oriented programming, which falls outside the scope of this book. 
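One of the functions such a module might define is a string edit distance. Here is a minimal Levenshtein sketch (an illustration only, not NLTK's actual nltk.metrics.distance code) which satisfies the three requirements listed in the docstring:

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of insertions,
    deletions, and substitutions needed to turn string a into b."""
    prev = list(range(len(b) + 1))   # distances from '' to prefixes of b
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution (or match)
        prev = curr
    return prev[-1]

# The three metric requirements from the module docstring:
assert edit_distance('kitten', 'kitten') == 0      # 1. d(a, a) = 0
assert edit_distance('kitten', 'sitting') >= 0     # 2. d(a, b) >= 0
assert (edit_distance('cat', 'hat') <=             # 3. triangle inequality
        edit_distance('cat', 'bat') + edit_distance('bat', 'hat'))
```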
(Most NLTK modules also include a demo() function, which can be used to see examples of the module in use.) Some module variables and functions are only used within the module. These should have names beginning with an underscore, e.g., _helper(), since this will hide the name. If another module imports this one, using the idiom: from module import *, these names will not be imported. You can optionally list the externally accessible names of a module using a special built-in variable like this: __all__ = ['edit_distance', 'jaccard_distance']. --- --- --- --- --- Debugging Techniques Since most code errors result from the programmer making incorrect assumptions, the first thing to do when you detect a bug is to check your assumptions. Localize the problem by adding print statements to the program, showing the value of important variables, and showing how far the program has progressed. If the program produced an “exception”—a runtime error—the interpreter will print a stack trace, pinpointing the location of program execution at the time of the error. If the program depends on input data, try to reduce this to the smallest size while still producing the error. Once you have localized the problem to a particular function or to a line of code, you need to work out what is going wrong. It is often helpful to recreate the situation using the interactive command line. Define some variables, and then copy-paste the offending line of code into the session and see what happens. Check your understanding of the code by reading some documentation and examining other code samples that purport to do the same thing that you are trying to do. Try explaining your code to someone else, in case she can see where things are going wrong. Python provides a debugger which allows you to monitor the execution of your program, specify line numbers where execution will stop (i.e., breakpoints), and step through sections of code and inspect the value of variables. 
You can invoke the debugger on your code as follows: >>> import pdb >>> import mymodule >>> pdb.run('mymodule.myfunction()') It will present you with a prompt (Pdb) where you can type instructions to the debugger. Type help to see the full list of commands. Typing step (or just s) will execute the current line and stop. If the current line calls a function, it will enter the function and stop at the first line. Typing next (or just n) is similar, but it stops execution at the next line in the current function. The break (or b) command can be used to create or list breakpoints. Type continue (or c) to continue execution as far as the next breakpoint. Type the name of any variable to inspect its value. We can use the Python debugger to locate the problem in our find_words() function. Remember that the problem arose the second time the function was called. We’ll start by calling the function without using the debugger, using the smallest possible input. The second time, we’ll call it with the debugger. >>> import pdb >>> find_words(['cat'], 3) ['cat'] >>> pdb.run("find_words(['dog'], 3)") > <string>(1)<module>() (Pdb) step --Call-- > <stdin>(1)find_words() (Pdb) args text = ['dog'] wordlength = 3 result = ['cat'] Here we typed just two commands into the debugger: step took us inside the function, and args showed the values of its arguments (or parameters). We see immediately that result has an initial value of ['cat'], and not the empty list as expected. The debugger has helped us to localize the problem, prompting us to check our understanding of Python functions. --- --- --- --- --- Defensive Programming In order to avoid some of the pain of debugging, it helps to adopt some defensive programming habits. Instead of writing a 20-line program and then testing it, build the program bottom-up out of small pieces that are known to work. Each time you combine these pieces to make a larger unit, test it carefully to see that it works as expected. 
Consider adding assert statements to your code, specifying properties of a variable, e.g., assert(isinstance(text, list)). If the value of the text variable later becomes a string when your code is used in some larger context, this will raise an AssertionError and you will get immediate notification of the problem. Once you think you’ve found the bug, view your solution as a hypothesis. Try to predict the effect of your bugfix before re-running the program. If the bug isn’t fixed, don’t fall into the trap of blindly changing the code in the hope that it will magically start working again. Instead, for each change, try to articulate a hypothesis about what is wrong and why the change will fix the problem. Then undo the change if the problem was not resolved. As you develop your program, extend its functionality, and fix any bugs, it helps to maintain a suite of test cases. This is called regression testing, since it is meant to detect situations where the code “regresses”—where a change to the code has an unintended side effect of breaking something that used to work. Python provides a simple regression-testing framework in the form of the doctest module. This module searches a file of code or documentation for blocks of text that look like an interactive Python session, of the form you have already seen many times in this book. It executes the Python commands it finds, and tests that their output matches the output supplied in the original file. Whenever there is a mismatch, it reports the expected and actual values. For details, please consult the doctest documentation. Apart from its value for regression testing, the doctest module is useful for ensuring that your software documentation stays in sync with your code. 
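For example, a doctest-checked module might contain a function like the following toy definition; doctest.testmod() re-runs the interpreter sessions embedded in the docstrings and reports any mismatches:

```python
def plural(word):
    """Return the plural of an English noun (regular cases only).

    >>> plural('sword')
    'swords'
    >>> plural('pony')
    'ponies'
    """
    if word.endswith('y'):
        return word[:-1] + 'ies'
    return word + 's'

if __name__ == '__main__':
    import doctest
    doctest.testmod()   # silent when all embedded sessions pass
```

If a later edit broke plural() so that plural('pony') returned 'ponys', running the module would immediately report the expected value 'ponies' alongside the actual one.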
Perhaps the most important defensive programming strategy is to set out your code clearly, choose meaningful variable and function names, and simplify the code wherever possible by decomposing it into functions and modules with well-documented interfaces. --- --- --- --- --- CHAPTER 5 Categorizing and Tagging Words The process of classifying words into their parts-of-speech and labeling them accordingly is known as part-of-speech tagging, POS tagging, or simply tagging. Parts-of-speech are also known as word classes or lexical categories. The collection of tags used for a particular task is known as a tagset. 5.1 Using a Tagger A part-of-speech tagger, or POS tagger, processes a sequence of words, and attaches a part of speech tag to each word (don’t forget to import nltk): >>> text = nltk.word_tokenize("And now for something completely different") >>> nltk.pos_tag(text) [('And', 'CC'), ('now', 'RB'), ('for', 'IN'), ('something', 'NN'), ('completely', 'RB'), ('different', 'JJ')] 5.2 Tagged Corpora Many corpora distributed with NLTK, including the Brown Corpus, have been POS tagged. Representing Tagged Tokens By convention in NLTK, a tagged token is represented using a tuple consisting of the token and the tag. We can create one of these special tuples from the standard string representation of a tagged token, using the function str2tuple(): >>> tagged_token = nltk.tag.str2tuple('fly/NN') >>> tagged_token ('fly', 'NN') >>> tagged_token[0] 'fly' >>> tagged_token[1] 'NN' We can construct a list of tagged tokens directly from a string. The first step is to tokenize the string to access the individual word/tag strings, and then to convert each of these into a tuple (using str2tuple()). >>> sent = ''' ... The/AT grand/JJ jury/NN commented/VBD on/IN a/AT number/NN of/IN ... other/AP topics/NNS ,/, AMONG/IN them/PPO the/AT Atlanta/NP and/CC ... Fulton/NP-tl County/NN-tl purchasing/VBG departments/NNS which/WDT it/PPS ... said/VBD ``/`` ARE/BER well/QL operated/VBN and/CC follow/VB generally/RB ... 
accepted/VBN practices/NNS which/WDT inure/VB to/IN the/AT best/JJT ... interest/NN of/IN both/ABX governments/NNS ''/'' ./. ... ''' >>> [nltk.tag.str2tuple(t) for t in sent.split()] [('The', 'AT'), ('grand', 'JJ'), ('jury', 'NN'), ('commented', 'VBD'), ('on', 'IN'), ('a', 'AT'), ('number', 'NN'), ... ('.', '.')] Reading Tagged Corpora Several of the corpora included with NLTK have been tagged for their part-of-speech. Here’s an example of what you might see if you opened a file from the Brown Corpus with a text editor: The/at Fulton/np-tl County/nn-tl Grand/jj-tl Jury/nn-tl said/vbd Friday/nr an/at investigation/nn of/in Atlanta’s/np$ recent/jj primary/nn election/nn produced/vbd `` no/at evidence/nn ''/'' that/cs any/dti irregularities/nns took/vbd place/nn ./. Other corpora use a variety of formats for storing part-of-speech tags. NLTK’s corpus readers provide a uniform interface so that you don’t have to be concerned with the different file formats. In contrast with the file extract just shown, the corpus reader for the Brown Corpus represents the data as shown next. Note that part-of-speech tags have been converted to uppercase; this has become standard practice since the Brown Corpus was published. >>> nltk.corpus.brown.tagged_words() [('The', 'AT'), ('Fulton', 'NP-TL'), ('County', 'NN-TL'), ...] >>> nltk.corpus.brown.tagged_words(simplify_tags=True) [('The', 'DET'), ('Fulton', 'N'), ('County', 'N'), ...] Whenever a corpus contains tagged text, the NLTK corpus interface will have a tagged_words() method. Here are some more examples, again using the output format illustrated for the Brown Corpus: >>> print nltk.corpus.nps_chat.tagged_words() [('now', 'RB'), ('im', 'PRP'), ('left', 'VBD'), ...] >>> nltk.corpus.conll2000.tagged_words() [('Confidence', 'NN'), ('in', 'IN'), ('the', 'DT'), ...] >>> nltk.corpus.treebank.tagged_words() [('Pierre', 'NNP'), ('Vinken', 'NNP'), (',', ','), ...] 
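The word/tag notation in these files can be parsed with plain string operations. Here is a sketch of what a str2tuple-style helper might do (the function name is my own, not NLTK's implementation), including the uppercasing that the corpus reader applies:

```python
def word_tag(s, sep='/'):
    """Split 'word/tag' on the LAST separator, so a word that itself
    contains '/' still parses; uppercase the tag as the Brown corpus
    reader does."""
    word, _, tag = s.rpartition(sep)
    return (word, tag.upper())

print(word_tag('fly/NN'))   # ('fly', 'NN')
print(word_tag('The/at'))   # ('The', 'AT')
```

Splitting on the last separator rather than the first is the important design choice, since the tag always comes at the end of the token.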
Not all corpora employ the same set of tags; see the tagset help functionality and the readme() methods mentioned earlier for documentation. Initially we want to avoid the complications of these tagsets, so we use a built-in mapping to a simplified tagset: >>> nltk.corpus.brown.tagged_words(simplify_tags=True) [('The', 'DET'), ('Fulton', 'NP'), ('County', 'N'), ...] >>> nltk.corpus.treebank.tagged_words(simplify_tags=True) [('Pierre', 'NP'), ('Vinken', 'NP'), (',', ','), ...] Tagged corpora for several other languages are distributed with NLTK, including Chinese, Hindi, Portuguese, Spanish, Dutch, and Catalan. These usually contain non-ASCII text, and Python always displays this in hexadecimal when printing a larger structure such as a list. >>> nltk.corpus.sinica_treebank.tagged_words() [('\xe4\xb8\x80', 'Neu'), ('\xe5\x8f\x8b\xe6\x83\x85', 'Nad'), ...] >>> nltk.corpus.indian.tagged_words() [('\xe0\xa6\xae\xe0\xa6\xb9\xe0\xa6\xbf\xe0\xa6\xb7\xe0\xa7\x87\xe0\xa6\xb0', 'NN'), ('\xe0\xa6\xb8\xe0\xa6\xa8\xe0\xa7\x8d\xe0\xa6\xa4\xe0\xa6\xbe\xe0\xa6\xa8', 'NN'), ...] >>> nltk.corpus.mac_morpho.tagged_words() [('Jersei', 'N'), ('atinge', 'V'), ('m\xe9dia', 'N'), ...] >>> nltk.corpus.conll2002.tagged_words() [('Sao', 'NC'), ('Paulo', 'VMI'), ('(', 'Fpa'), ...] >>> nltk.corpus.cess_cat.tagged_words() [('El', 'da0ms0'), ('Tribunal_Suprem', 'np0000o'), ...] If your environment is set up correctly, with appropriate editors and fonts, you should be able to display individual strings in a human-readable way. If the corpus is also segmented into sentences, it will have a tagged_sents() method that divides up the tagged words into sentences rather than presenting them as one big list. This will be useful when we come to developing automatic taggers, as they are trained and tested on lists of sentences, not words. 5.4 Automatic Tagging The Default Tagger The simplest possible tagger assigns the same tag to each token. 
This may seem to be a rather banal step, but it establishes an important baseline for tagger performance. In order to get the best result, we tag each word with the most likely tag. Let’s find out which tag is most likely (now using the unsimplified tagset): >>> tags = [tag for (word, tag) in brown.tagged_words(categories='news')] >>> nltk.FreqDist(tags).max() 'NN' Now we can create a tagger that tags everything as NN. >>> raw = 'I do not like green eggs and ham, I do not like them Sam I am!' >>> tokens = nltk.word_tokenize(raw) >>> default_tagger = nltk.DefaultTagger('NN') >>> default_tagger.tag(tokens) [('I', 'NN'), ('do', 'NN'), ('not', 'NN'), ('like', 'NN'), ('green', 'NN'), ('eggs', 'NN'), ('and', 'NN'), ('ham', 'NN'), (',', 'NN'), ('I', 'NN'), ('do', 'NN'), ('not', 'NN'), ('like', 'NN'), ('them', 'NN'), ('Sam', 'NN'), ('I', 'NN'), ('am', 'NN'), ('!', 'NN')] Unsurprisingly, this method performs rather poorly. On a typical corpus, it will tag only about an eighth of the tokens correctly, as we see here: >>> default_tagger.evaluate(brown_tagged_sents) 0.13089484257215028 The Regular Expression Tagger >>> patterns = [ ... (r'.*ing$', 'VBG'), # gerunds ... (r'.*ed$', 'VBD'), # simple past ... (r'.*es$', 'VBZ'), # 3rd singular present ... (r'.*ould$', 'MD'), # modals ... (r'.*\'s$', 'NN$'), # possessive nouns ... (r'.*s$', 'NNS'), # plural nouns ... (r'^-?[0-9]+(.[0-9]+)?$', 'CD'), # cardinal numbers ... (r'.*', 'NN') # nouns (default) ... ] >>> regexp_tagger = nltk.RegexpTagger(patterns) >>> regexp_tagger.tag(brown_sents[3]) [('``', 'NN'), ('Only', 'NN'), ('a', 'NN'), ('relative', 'NN'), ('handful', 'NN'), ('of', 'NN'), ('such', 'NN'), ('reports', 'NNS'), ('was', 'NNS'), ('received', 'VBD'), ("''", 'NN'), (',', 'NN'), ('the', 'NN'), ('jury', 'NN'), ('said', 'NN'), (',', 'NN'), ('``', 'NN'), ('considering', 'VBG'), ('the', 'NN'), ('widespread', 'NN'), ...] 
>>> regexp_tagger.evaluate(brown_tagged_sents) 0.20326391789486245 The Lookup Tagger A lot of high-frequency words do not have the NN tag. Let’s find the hundred most frequent words and store their most likely tag. We can then use this information as the model for a “lookup tagger” (an NLTK UnigramTagger): >>> fd = nltk.FreqDist(brown.words(categories='news')) >>> cfd = nltk.ConditionalFreqDist(brown.tagged_words(categories='news')) >>> most_freq_words = fd.keys()[:100] >>> likely_tags = dict((word, cfd[word].max()) for word in most_freq_words) >>> baseline_tagger = nltk.UnigramTagger(model=likely_tags) >>> baseline_tagger.evaluate(brown_tagged_sents) 0.45578495136941344 5.5 N-Gram Tagging Unigram Tagging Unigram taggers are based on a simple statistical algorithm: for each token, assign the tag that is most likely for that particular token. For example, it will assign the tag JJ to any occurrence of the word frequent, since frequent is used as an adjective (e.g., a frequent word) more often than it is used as a verb (e.g., I frequent this cafe). A unigram tagger behaves just like a lookup tagger (Section 5.4), except there is a more convenient technique for setting it up, called training. 
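The model behind a lookup or unigram tagger is just a dictionary mapping each word to its most likely tag. The same construction can be sketched without NLTK, using a toy tagged corpus (hypothetical data standing in for the Brown Corpus counts):

```python
from collections import Counter, defaultdict

# A toy tagged corpus standing in for brown.tagged_words(categories='news').
tagged = [('the', 'AT'), ('jury', 'NN'), ('said', 'VBD'),
          ('the', 'AT'), ('jury', 'NN'), ('the', 'NP')]

# For each word, count how often each tag occurs with it...
cfd = defaultdict(Counter)
for word, tag in tagged:
    cfd[word][tag] += 1

# ...and keep only the most frequent tag as the lookup tagger's model.
likely_tags = {w: counts.most_common(1)[0][0] for w, counts in cfd.items()}

def lookup_tag(word, default='NN'):
    return likely_tags.get(word, default)   # back off to NN for unknowns

print(lookup_tag('the'))     # 'AT': seen twice as AT, once as NP
print(lookup_tag('parrot'))  # 'NN': unseen word, default applies
```

Training a real UnigramTagger does essentially this, at corpus scale.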
In the following code sample, we train a unigram tagger, use it to tag a sentence, and then evaluate: >>> from nltk.corpus import brown >>> brown_tagged_sents = brown.tagged_sents(categories='news') >>> brown_sents = brown.sents(categories='news') >>> unigram_tagger = nltk.UnigramTagger(brown_tagged_sents) >>> unigram_tagger.tag(brown_sents[2007]) [('Various', 'JJ'), ('of', 'IN'), ('the', 'AT'), ('apartments', 'NNS'), ('are', 'BER'), ('of', 'IN'), ('the', 'AT'), ('terrace', 'NN'), ('type', 'NN'), (',', ','), ('being', 'BEG'), ('on', 'IN'), ('the', 'AT'), ('ground', 'NN'), ('floor', 'NN'), ('so', 'QL'), ('that', 'CS'), ('entrance', 'NN'), ('is', 'BEZ'), ('direct', 'JJ'), ('.', '.')] >>> unigram_tagger.evaluate(brown_tagged_sents) 0.9349006503968017 We train a UnigramTagger by specifying tagged sentence data as a parameter when we initialize the tagger. The training process involves inspecting the tag of each word and storing the most likely tag for any word in a dictionary that is stored inside the tagger. General N-Gram Tagging When we perform a language processing task based on unigrams, we are using one item of context. In the case of tagging, we consider only the current token, in isolation from any larger context. Given such a model, the best we can do is tag each word with its a priori most likely tag. This means we would tag a word such as wind with the same tag, regardless of whether it appears in the context the wind or to wind. An n-gram tagger is a generalization of a unigram tagger whose context is the current word together with the part-of-speech tags of the n-1 preceding tokens. The NgramTagger class uses a tagged training corpus to determine which part-of-speech tag is most likely for each context. Here we see a special case of an n-gram tagger, namely a bigram tagger. 
First we train it, then use it to tag untagged sentences: >>> bigram_tagger = nltk.BigramTagger(train_sents) >>> bigram_tagger.tag(brown_sents[2007]) [('Various', 'JJ'), ('of', 'IN'), ('the', 'AT'), ('apartments', 'NNS'), ('are', 'BER'), ('of', 'IN'), ('the', 'AT'), ('terrace', 'NN'), ('type', 'NN'), (',', ','), ('being', 'BEG'), ('on', 'IN'), ('the', 'AT'), ('ground', 'NN'), ('floor', 'NN'), ('so', 'CS'), ('that', 'CS'), ('entrance', 'NN'), ('is', 'BEZ'), ('direct', 'JJ'), ('.', '.')] >>> unseen_sent = brown_sents[4203] >>> bigram_tagger.tag(unseen_sent) [('The', 'AT'), ('population', 'NN'), ('of', 'IN'), ('the', 'AT'), ('Congo', 'NP'), ('is', 'BEZ'), ('13.5', None), ('million', None), (',', None), ('divided', None), ('into', None), ('at', None), ('least', None), ('seven', None), ('major', None), ('``', None), ('culture', None), ('clusters', None), ("''", None), ('and', None), ('innumerable', None), ('tribes', None), ('speaking', None), ('400', None), ('separate', None), ('dialects', None), ('.', None)] Notice that the bigram tagger manages to tag every word in a sentence it saw during training, but does badly on an unseen sentence. As soon as it encounters a new word (i.e., 13.5), it is unable to assign a tag. It cannot tag the following word (i.e., million), even if it was seen during training, simply because it never saw it during training with a None tag on the previous word. Consequently, the tagger fails to tag the rest of the sentence. Its overall accuracy score is very low: >>> bigram_tagger.evaluate(test_sents) 0.10276088906608193 Combining Taggers One way to address the trade-off between accuracy and coverage is to use the more accurate algorithms when we can, but to fall back on algorithms with wider coverage when necessary. For example, we could combine the results of a bigram tagger, a unigram tagger, and a default tagger, as follows: 1. Try tagging the token with the bigram tagger. 2. 
If the bigram tagger is unable to find a tag for the token, try the unigram tagger. 3. If the unigram tagger is also unable to find a tag, use a default tagger. Most NLTK taggers permit a backoff tagger to be specified. The backoff tagger may itself have a backoff tagger: >>> t0 = nltk.DefaultTagger('NN') >>> t1 = nltk.UnigramTagger(train_sents, backoff=t0) >>> t2 = nltk.BigramTagger(train_sents, backoff=t1) >>> t2.evaluate(test_sents) 0.84491179108940495 5.6 Transformation-Based Tagging A potential issue with n-gram taggers is the size of their n-gram table (or language model). If tagging is to be employed in a variety of language technologies deployed on mobile computing devices, it is important to strike a balance between model size and tagger performance. An n-gram tagger with backoff may store trigram and bigram tables, which are large, sparse arrays that may have hundreds of millions of entries. A second issue concerns context. The only information an n-gram tagger considers from prior context is tags, even though words themselves might be a useful source of information. It is simply impractical for n-gram models to be conditioned on the identities of words in the context. In this section, we examine Brill tagging, an inductive tagging method which performs very well using models that are only a tiny fraction of the size of n-gram taggers. Brill tagging is a kind of transformation-based learning, named after its inventor. The general idea is very simple: guess the tag of each word, then go back and fix the mistakes. In this way, a Brill tagger successively transforms a bad tagging of a text into a better one. As with n-gram tagging, this is a supervised learning method, since we need annotated training data to figure out whether the tagger’s guess is a mistake or not. However, unlike n-gram tagging, it does not count observations but compiles a list of transformational correction rules. 
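This backoff chain can be sketched with two toy tagger classes (hypothetical names that mimic the structure of NLTK's backoff parameter, not NLTK's own classes):

```python
class DefaultTagger:
    """Tag every word with the same tag, like nltk.DefaultTagger('NN')."""
    def __init__(self, tag):
        self.default = tag
    def tag_word(self, word):
        return self.default

class LookupTagger:
    """Tag from a fixed word->tag table, deferring to a backoff tagger
    for any word outside the table."""
    def __init__(self, table, backoff=None):
        self.table = table
        self.backoff = backoff
    def tag_word(self, word):
        if word in self.table:
            return self.table[word]
        if self.backoff is not None:
            return self.backoff.tag_word(word)
        return None   # no backoff: give up on this word

t0 = DefaultTagger('NN')
t1 = LookupTagger({'the': 'AT', 'said': 'VBD'}, backoff=t0)
print(t1.tag_word('said'))    # 'VBD': found in t1's own table
print(t1.tag_word('parrot'))  # 'NN': falls back to the default tagger
```

Because each tagger holds a reference to the next, the chain can be extended indefinitely, just as t2 backs off to t1 and t1 to t0 in the NLTK example above.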
The process of Brill tagging is usually explained by analogy with painting. Suppose we were painting a tree, with all its details of boughs, branches, twigs, and leaves, against a uniform sky-blue background. Instead of painting the tree first and then trying to paint blue in the gaps, it is simpler to paint the whole canvas blue, then “correct” the tree section by over-painting the blue background. In the same fashion, we might paint the trunk a uniform brown before going back to over-paint further details with even finer brushes. Brill tagging uses the same idea: begin with broad brush strokes, and then fix up the details, with successively finer changes. 5.7 How to Determine the Category of a Word Now that we have examined word classes in detail, we turn to a more basic question: how do we decide what category a word belongs to in the first place? In general, linguists use morphological, syntactic, and semantic clues to determine the category of a word. Morphological Clues The internal structure of a word may give useful clues as to the word’s category. For example, -ness is a suffix that combines with an adjective to produce a noun, e.g., happy → happiness, ill → illness. So if we encounter a word that ends in -ness, this is very likely to be a noun. Similarly, -ment is a suffix that combines with some verbs to produce a noun, e.g., govern → government and establish → establishment. English verbs can also be morphologically complex. For instance, the present participle of a verb ends in -ing, and expresses the idea of ongoing, incomplete action (e.g., falling, eating). The -ing suffix also appears on nouns derived from verbs, e.g., the falling of the leaves (this is known as the gerund). Syntactic Clues Another source of information is the typical contexts in which a word can occur. For example, assume that we have already determined the category of nouns. 
Then we might say that a syntactic criterion for an adjective in English is that it can occur immediately before a noun, or immediately following the words be or very. According to these tests, near should be categorized as an adjective: (2) a. the near window b. The end is (very) near. Semantic Clues Finally, the meaning of a word is a useful clue as to its lexical category. For example, the best-known definition of a noun is semantic: “the name of a person, place, or thing.” Within modern linguistics, semantic criteria for word classes are treated with suspicion, mainly because they are hard to formalize. Nevertheless, semantic criteria underpin many of our intuitions about word classes, and enable us to make a good guess about the categorization of words in languages with which we are unfamiliar. For example, if all we know about the Dutch word verjaardag is that it means the same as the English word birthday, then we can guess that verjaardag is a noun in Dutch. However, some care is needed: although we might translate zij is vandaag jarig as it’s her birthday today, the word jarig is in fact an adjective in Dutch, and has no exact equivalent in English. New Words All languages acquire new lexical items. A list of words recently added to the Oxford Dictionary of English includes cyberslacker, fatoush, blamestorm, SARS, cantopop, bupkis, noughties, muggle, and robata. Notice that all these new words are nouns, and this is reflected in calling nouns an open class. By contrast, prepositions are regarded as a closed class. That is, there is a limited set of words belonging to the class (e.g., above, along, at, below, beside, between, during, for, from, in, near, on, outside, over, past, through, towards, under, up, with), and membership of the set only changes very gradually over time. 
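The morphological clues discussed earlier (-ness, -ment, -ing) can be expressed as a toy category guesser (purely illustrative; real taggers combine such clues with context):

```python
def guess_category(word):
    """Guess a word's category from its suffix alone.
    Only a heuristic: e.g., 'sing' ends in -ing but is not a participle."""
    if word.endswith('ness') or word.endswith('ment'):
        return 'noun'      # happy -> happiness, govern -> government
    if word.endswith('ing'):
        return 'verb'      # present participle or gerund
    return 'unknown'

print(guess_category('happiness'))   # noun
print(guess_category('government'))  # noun
print(guess_category('falling'))     # verb
```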
Morphology in Part-of-Speech Tagsets

Common tagsets often capture some morphosyntactic information, that is, information about the kind of morphological markings that words receive by virtue of their syntactic role. Consider, for example, the selection of distinct grammatical forms of the word go illustrated in the following sentences:

(3) a. Go away!
    b. He sometimes goes to the cafe.
    c. All the cakes have gone.
    d. We went on the excursion.

Each of these forms—go, goes, gone, and went—is morphologically distinct from the others. Consider the form goes. This occurs in a restricted set of grammatical contexts, and requires a third person singular subject. Thus, the following sentences are ungrammatical.

(4) a. *They sometimes goes to the cafe.
    b. *I sometimes goes to the cafe.

By contrast, gone is the past participle form; it is required after have (and cannot be replaced in this context by goes), and cannot occur as the main verb of a clause.

(5) a. *All the cakes have goes.
    b. *He sometimes gone to the cafe.

We can easily imagine a tagset in which the four distinct grammatical forms just discussed were all tagged as VB. Although this would be adequate for some purposes, a more fine-grained tagset provides useful information about these forms that can help other processors that try to detect patterns in tag sequences. In addition to this set of verb tags, the various forms of the verb to be have special tags: be/BE, being/BEG, am/BEM, are/BER, is/BEZ, been/BEN, were/BED, and was/BEDZ (plus extra tags for negative forms of the verb). All told, this fine-grained tagging of verbs means that an automatic tagger that uses this tagset is effectively carrying out a limited amount of morphological analysis. Most part-of-speech tagsets make use of the same basic categories, such as noun, verb, adjective, and preposition. However, tagsets differ both in how finely they divide words into categories, and in how they define their categories.
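To make the coarse-versus-fine-grained contrast concrete, here is a small sketch (mine, not the book's) of a lookup tagger for the four forms of go, using Penn Treebank-style verb tags (VB base form, VBZ third-singular present, VBN past participle, VBD past tense). The dictionary and the function name tag_word are illustrative assumptions.

```python
# Sketch: coarse vs. fine-grained tagging of the forms of "go",
# with Penn Treebank-style verb tags.
FINE_TAGS = {
    "go": "VB",     # Go away!
    "goes": "VBZ",  # He sometimes goes to the cafe.
    "gone": "VBN",  # All the cakes have gone.
    "went": "VBD",  # We went on the excursion.
}

def tag_word(word, fine=True):
    """Look up a tag for one of the four forms of 'go'.

    With fine=False, all four forms collapse to the single coarse
    tag VB, discarding the morphological distinctions."""
    if word not in FINE_TAGS:
        return None
    return FINE_TAGS[word] if fine else "VB"

print([tag_word(w) for w in ["go", "goes", "gone", "went"]])
print([tag_word(w, fine=False) for w in ["go", "goes", "gone", "went"]])
```

A downstream processor looking for patterns like "have + VBN" can only exploit them under the fine-grained scheme; under the coarse one, grammatical "have gone" and ungrammatical "*have goes" look identical.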
For example, is might be tagged simply as a verb in one tagset, but as a distinct form of the lexeme be in another tagset (as in the Brown Corpus). This variation in tagsets is unavoidable, since part-of-speech tags are used in different ways for different tasks. In other words, there is no one “right way” to assign tags, only more or less useful ways depending on one’s goals.

--- --- --- --- ---

7.5 Named Entity Recognition

NLTK provides a classifier that has already been trained to recognize named entities, accessed with the function nltk.ne_chunk(). If we set the parameter binary=True, then named entities are just tagged as NE; otherwise, the classifier adds category labels such as PERSON, ORGANIZATION, and GPE.

>>> sent = nltk.corpus.treebank.tagged_sents()[22]
>>> print nltk.ne_chunk(sent, binary=True)
(S The/DT (NE U.S./NNP) is/VBZ one/CD ... according/VBG to/TO (NE Brooke/NNP T./NNP Mossman/NNP) ...)
>>> print nltk.ne_chunk(sent)
(S The/DT (GPE U.S./NNP) is/VBZ one/CD ... according/VBG to/TO (PERSON Brooke/NNP T./NNP Mossman/NNP) ...)

• Entity recognition is often performed using chunkers, which segment multi-token sequences and label them with the appropriate entity type. Common entity types include ORGANIZATION, PERSON, LOCATION, DATE, TIME, MONEY, and GPE (geo-political entity).

--- --- --- --- ---

Some Code
(temp) C:\Users\Ashish Jain>pip install --upgrade nltk
Processing c:\users\ashish jain\appdata\local\pip\cache\wheels\ae\8c\3f\b1fe0ba04555b08b57ab52ab7f86023639a526d8bc8d384306\nltk-3.5-cp37-none-any.whl
Requirement already satisfied, skipping upgrade: tqdm in e:\programfiles\anaconda3\envs\temp\lib\site-packages (from nltk) (4.48.2)
Requirement already satisfied, skipping upgrade: joblib in e:\programfiles\anaconda3\envs\temp\lib\site-packages (from nltk) (0.16.0)
Collecting regex
  Downloading regex-2020.9.27-cp37-cp37m-win_amd64.whl (268 kB)
     |████████████████████████████████| 268 kB 3.3 MB/s
Collecting click
  Using cached click-7.1.2-py2.py3-none-any.whl (82 kB)
Installing collected packages: regex, click, nltk
  Attempting uninstall: nltk
    Found existing installation: nltk 3.4.5
    Uninstalling nltk-3.4.5:
      Successfully uninstalled nltk-3.4.5
Successfully installed click-7.1.2 nltk-3.5 regex-2020.9.27

(temp) C:\Users\Ashish Jain>pip show nltk
Name: nltk
Version: 3.5
Summary: Natural Language Toolkit
Home-page: http://nltk.org/
Author: Steven Bird
Author-email: stevenbird1@gmail.com
License: Apache License, Version 2.0
Location: e:\programfiles\anaconda3\envs\temp\lib\site-packages
Requires: tqdm, regex, click, joblib
Required-by: textblob, sumy

import nltk
print("nltk:", nltk.__version__)
nltk: 3.5

import matplotlib
import matplotlib.pyplot as plt
# Without "%matplotlib inline", you get error "Javascript Error: IPython is not defined" in JupyterLab.
%matplotlib inline
# For scrollable output image
%matplotlib nbagg

with open('files_1/Unicode.txt', mode = 'r') as f:
    txt = f.read()
txt[:80]
'What Is Unicode?\nUnicode supports over a million characters.
Each character is a'

Tokenize

# Tokenize into words
words = nltk.tokenize.word_tokenize(txt)
print(words[:5])
print("Number of words:", len(words))

['What', 'Is', 'Unicode', '?', 'Unicode']
Number of words: 302

Stopwords

from nltk.corpus import stopwords
print("Number of English stopwords:", len(stopwords.words('english')))

Number of English stopwords: 179

Word-Frequency Plot

from nltk.probability import FreqDist
fdist1 = FreqDist(words)

%matplotlib inline
fig = plt.figure(figsize=(12,5))
fdist1.plot(100, cumulative=True)

Converting input text to an NLTK text

# text = nltk.Text(txt)  # [Text: W h a t I s ...]
text = nltk.Text(words)
print(text)

[Text: What Is Unicode ? Unicode supports over a...]

Word Collocations (Bigram and Trigram)

text.collocation_list(5)
[('string', 'literal'), ('code', 'point'), ('Unicode', 'characters'), ('Unicode', 'string')]

from nltk.collocations import *
trigram_measures = nltk.collocations.TrigramAssocMeasures()
finder = TrigramCollocationFinder.from_words(text)
finder.nbest(trigram_measures.pmi, 10)

[('abstract', 'entities', 'that'), ('by', 'preceding', 'an'), ('escape', 'sequence', 'inside'), ('just', 'like', 'normal'), ('preceding', 'an', 'ordinary'), ('specified', 'by', 'preceding'), ('\\uXXXX', 'escape', 'sequence'), ('encodingâ€', '”', 'this'), ('four-digit', 'hexadecimal', 'form'), ('like', 'normal', 'strings')]

Clean HTML

with open('files_1/tempate.html', mode = 'r') as f:
    in_html = f.read()
nltk.clean_html(in_html)
# NotImplementedError: To remove HTML markup, use BeautifulSoup's get_text() function

Word Distance

Ref: nltk.org "Word Distance"

import pkgutil
for importer, modname, ispkg in pkgutil.iter_modules(nltk.__path__):
    print("Found submodule %s (is a package: %s)" % (modname, ispkg))

Found submodule app (is a package: True)
Found submodule book (is a package: False)
Found submodule ccg (is a package: True)
Found submodule chat (is a package: True)
Found submodule chunk (is a package: True)
...
dir(nltk)[:5]
['AbstractLazySequence', 'AffixTagger', 'AlignedSent', 'Alignment', 'AnnotationTask']

[i for i in dir(nltk) if 'dist' in i]
['binary_distance', 'custom_distance', 'distance', 'edit_distance', 'edit_distance_align', 'interval_distance', 'jaccard_distance', 'masi_distance']

string_distance_examples = [
    ("rain", "shine"),
    ("abcdef", "acbdef"),
    ("language", "lnaguaeg"),
    ("language", "lnaugage"),
    ("language", "lngauage"),
]

for i in string_distance_examples:
    print(i[0], i[1], ":", nltk.binary_distance(i[0], i[1]))

rain shine : 1.0
abcdef acbdef : 1.0
language lnaguaeg : 1.0
language lnaugage : 1.0
language lngauage : 1.0

for i in string_distance_examples:
    print(i[0], i[1], ":", nltk.edit_distance(i[0], i[1]))

rain shine : 3
abcdef acbdef : 2
language lnaguaeg : 4
language lnaugage : 3
language lngauage : 2
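nltk.edit_distance computes the Levenshtein distance: the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. As a sketch of the idea (my own plain dynamic-programming version, not NLTK's source), the following reproduces the numbers printed above:

```python
def levenshtein(s, t):
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to transform s into t."""
    # prev[j] holds the distance from s[:i-1] to t[:j] (previous row).
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]  # distance from s[:i] to the empty prefix of t
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # delete cs
                            curr[j - 1] + 1,      # insert ct
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]

print(levenshtein("rain", "shine"))        # 3, matching nltk.edit_distance
print(levenshtein("abcdef", "acbdef"))     # 2
print(levenshtein("language", "lnaguaeg")) # 4
```

Note that "abcdef" → "acbdef" costs 2 here because a swap of adjacent characters counts as two substitutions; nltk.edit_distance also accepts a transpositions flag that scores such swaps as a single operation.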
Friday, October 2, 2020
Natural Language Toolkit (NLTK) - Highlights (Book by Steven Bird)
Thursday, October 1, 2020
7 Frequent Python 'os' Package Uses
(base) C:\Users\ashish\Desktop\TEST>tree /f
C:.
│   TEST2.txt
│
└───TEST1
    │   TEST1.2.txt
    │
    └───TEST1.1
            TEST1.1.1.txt

= = = = =

1. List all the subdirectories and files:

(base) C:\Users\ashish\Desktop\TEST>python
Python 3.7.6 (default, Jan 8 2020, 20:23:39) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.listdir('.')
['TEST1', 'TEST2.txt']
>>>

= = = = =

2. Recursively get all the subdirectories and files under the current folder (signified by "."):

>>> for dirpath, subdirs, files in os.walk("."):
...     print("dirpath:", dirpath)
...     for s in subdirs:
...         print("subdir:", s)
...     for f in files:
...         print("file:", f)
...
dirpath: .
subdir: TEST1
subdir: TEST4
file: TEST2.txt
file: TEST3.txt
dirpath: .\TEST1
subdir: TEST1.1
file: TEST1.2.txt
dirpath: .\TEST1\TEST1.1
file: TEST1.1.1.txt
dirpath: .\TEST4
>>>

= = = = =

3. Call an OS shell (or Windows CMD) command from the Python shell:

>>> os.system("tree /f")
C:.
│   TEST2.txt
│   TEST3.txt
│
├───TEST1
│   │   TEST1.2.txt
│   │
│   └───TEST1.1
│           TEST1.1.1.txt
│
└───TEST4
0

Note: 0 at the end signifies a 'successful' exit.

= = = = =

4. Create a "Path" from strings: "os.path.join" joins strings with the character representing the path separator for the OS.

>>> os.path.join("DIR1", "DIR2")
'DIR1\\DIR2'
>>> os.path.join("DIR1", "DIR2", "DIR3")
'DIR1\\DIR2\\DIR3'
>>> os.path.sep
'\\'

= = = = =

5. Get the current working directory:

>>> os.getcwd()
'C:\\Users\\ashish\\Desktop\\TEST'

= = = = =

6. Change directory:

>>> os.chdir("TEST1")
>>> os.listdir()
['TEST1.1', 'TEST1.2.txt']
>>> os.getcwd()
'C:\\Users\\ashish\\Desktop\\TEST\\TEST1'

= = = = =

7. Get environment variables:

>>> import os
>>> os.environ['PATH']
'E:\\programfiles\\Anaconda3;E:\\programfiles\\Anaconda3\\Lib...'

= = = = =
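Uses #2 and #4 combine naturally: a very common pattern is to walk a directory tree with os.walk and build full file paths with os.path.join. Here is a small self-contained sketch; the helper name list_files and the temporary-directory layout (echoing the TEST example above) are my own.

```python
import os
import tempfile

def list_files(root):
    """Recursively collect the full path of every file under root,
    joining each directory path and filename with os.path.join."""
    paths = []
    for dirpath, subdirs, files in os.walk(root):
        for f in files:
            paths.append(os.path.join(dirpath, f))
    return sorted(paths)

# Recreate a tree like the TEST example in a temporary directory.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "TEST1", "TEST1.1"))
for rel in [("TEST2.txt",),
            ("TEST1", "TEST1.2.txt"),
            ("TEST1", "TEST1.1", "TEST1.1.1.txt")]:
    open(os.path.join(root, *rel), "w").close()  # create empty file

for p in list_files(root):
    print(os.path.relpath(p, root))
```

Because os.path.join uses os.path.sep, the same code produces backslash-separated paths on Windows and forward slashes elsewhere.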
Monday, September 28, 2020
Difficult Conversations. How to Discuss What Matters Most (D. Stone, B. Patton, S. Heen, 2010)
Preface to Second Edition

The book 'Difficult Conversations' has been used to train oil-rig operators in the North Sea, Iñupiat negotiators in the oil-rich Northern Slope of Alaska, and business leaders at Saudi Aramco. It’s been used at the Boston Area Rape Crisis Center and the headquarters and field offices of UNAIDS. Doctors, nurses, and administrators in hospitals across the United States have used it to deliver better patient care and develop more humane workplaces. Within the U.S. government, it’s distributed at the Department of Justice, the IRS, the Federal Reserve, and the Postal Service. During one administration, the White House made it required reading for its top sixteen hundred political appointees. Law schools, business schools, and colleges assign it, as do high school teachers, life coaches, therapists, and ministers. The explanation for all of this is simple: people are people. We have perceptions and thoughts and feelings, and we work and play with other human beings who have their own perceptions, thoughts, and feelings. Today, the space for difficult conversations has expanded, as these examples suggest:

• Your organization is flat, aligned, and right-sized, but you still can’t stand your boss.
• You fly three thousand miles and drive two hours to visit your elderly widowed father, and the first words out of his mouth are “You’re late!”
• You’ve got four e-mail addresses, two voice-mail accounts, and sit only feet away from your five closest colleagues, but not one of them has found a way to talk to you about what they apparently call your “confrontational style.”
• No matter how hard you try, you can’t seem to get your sales, manufacturing, and product development teams to see themselves as members of the same organization.
~ ~ ~

Relationships that deal productively with the inevitable stresses of life are more durable; people who are willing and able to “stick through the hard parts” emerge with a stronger sense of trust in each other and the relationship, because now they have a track record of having worked through something hard and seen that the relationship survived.

~ ~ ~

The ability to manage difficult conversations effectively is foundational to achieving almost any significant change. And in addition to supporting major change initiatives, these skills are increasingly needed simply to sustain business as usual. The relentless press of competition has forced most businesses to grow in scale to achieve efficiencies and competitive clout. Many industries are now global in reach. At the same time, the need to be responsive to the market – nimble, flexible, adaptive – has driven many organizations to be less hierarchical and to operate in a matrix that introduces more complexity to decision making and the ability to get things done. This is a recipe for more conflict – and for more difficult conversations.

~ ~ ~

Introduction

A Difficult Conversation Is Anything You Find It Hard to Talk About

Sexuality, race, gender, politics, and religion come quickly to mind as difficult topics to discuss, and for many of us they are. Anytime we feel vulnerable or our self-esteem is implicated, when the issues at stake are important and the outcome uncertain, when we care deeply about what is being discussed or about the people with whom we are discussing it, there is potential for us to experience the conversation as difficult. And, of course, there’s the stuff of everyday life, conversations that feel more ordinary but cause anxiety nonetheless: returning merchandise without a receipt, asking your secretary to do some photocopying, telling the painters not to smoke in the house.
~ ~ ~

The Dilemma: Avoid or Confront, It Seems There Is No Good Path

Perhaps the neighbors’ dog keeps you up at night. "Maybe I’ll get used to it," you might think. But then the dog barks again, and you resolve that tomorrow you are going to talk to the neighbors once and for all. Because at some level we know the truth: if we try to avoid the problem, we’ll feel taken advantage of, our feelings will fester, we’ll wonder why we don’t stick up for ourselves, and we’ll rob the other person of the opportunity to improve things. Yet confronting the problem carries its own risks: we may be rejected or attacked; we might hurt the other person in ways we didn’t intend; and the relationship might suffer.

If you follow the steps presented in this book, you will find difficult conversations becoming easier and causing less anxiety. In fact, the people we’ve worked with, who have learned new approaches to dealing with their most challenging conversations, report less anxiety and greater effectiveness in all of their conversations. The problems are complex, and the people involved are not easy to deal with. But we have discovered that, regardless of context, the things that make difficult conversations difficult, and the errors in thinking and acting that compound those difficulties, are the same. The people involved may be so emotionally troubled, the stakes so high, or the conflict so intense that a book – or even professional intervention – is unlikely to help. But first and more important, this book will help you understand better what you’re up against and why it makes sense to shift from a “message delivery stance” to a “learning stance.” Only then will you be able to understand and implement the steps of a learning conversation.

~ ~ ~

In studying hundreds of conversations of every kind we have discovered that there is an underlying structure to what’s going on, and understanding this structure, in itself, is a powerful first step in improving how we deal with these conversations.
Everything problematic that two people say, think, and feel in a difficult conversation falls into one of these three “conversations.” And everything in your difficult conversations does too.

1. The “What Happened?” Conversation. Most difficult conversations involve disagreement about what has happened or what should happen. Who said what and who did what? Who’s right, who meant what, and who’s to blame?

The “What Happened?” Conversation: What’s the Story Here?

1.1 The Truth Assumption

As we argue vociferously for our view, we often fail to question one crucial assumption upon which our whole stance in the conversation is built: I am right, you are wrong. This simple assumption causes endless grief. What am I right about? I am right that you drive too fast. I am right that you are unable to mentor younger colleagues. I am right that your comments at Thanksgiving were inappropriate. I am right that the patient should have received more medication after such a painful operation. I am right that the contractor overcharged me. I am right that I deserve a raise. I am right that the brochure is fine as it is. The number of things I am right about would fill a book. There’s only one hitch: I am not right. How could this be so? It seems impossible. Surely I must be right sometimes! Well, no. The point is this: difficult conversations are almost never about getting the facts right. They are about conflicting perceptions, interpretations, and values. They are not about what a contract states, they are about what a contract means. They are not about which child-rearing book is most popular, they are about which child-rearing book we should follow. They are not about what is true, they are about what is important.

1.2 The Intention Invention

The second argument in the “What Happened?” Conversation is over intentions – yours and mine. Did you yell at me to hurt my feelings or merely to emphasize your point?
Did you throw my cigarettes out because you’re trying to control my behavior or because you want to help me live up to my commitment to quit? What I think about your intentions will affect how I think about you and, ultimately, how our conversation goes. The error we make in the realm of intentions is simple but profound: we assume we know the intentions of others when we don’t. Worse still, when we are unsure about someone’s intentions, we too often decide they are bad. The truth is, intentions are invisible. We assume them from other people’s behavior. In other words, we make them up, we invent them. But our invented stories about other people’s intentions are accurate much less often than we think. Why? Because people’s intentions, like so much else in difficult conversations, are complex. Sometimes people act with mixed intentions. Sometimes they act with no intention, or at least none related to us. And sometimes they act on good intentions that nonetheless hurt us. Because our view of others’ intentions (and their views of ours) are so important in difficult conversations, leaping to unfounded assumptions can be a disaster.

1.3 The Blame Frame

The third error we make in the “What Happened?” Conversation has to do with blame. Most difficult conversations focus significant attention on who’s to blame for the mess we’re in. When the company loses its biggest client, for example, we know that there will shortly ensue a ruthless game of blame roulette. We don’t care where the ball lands, as long as it doesn’t land on us. Personal relationships are no different. Your relationship with your stepmother is strained? She’s to blame. She should stop bugging you about your messy room and the kids you hang out with. Talking about fault is similar to talking about truth — it produces disagreement, denial, and little learning. It evokes fears of punishment and insists on an either/or answer.
Nobody wants to be blamed, especially unfairly, so our energy goes into defending ourselves.

2. The Feelings Conversation. Every difficult conversation also asks and answers questions about feelings. We conduct an internal debate over whether this means we are competent or incompetent, a good person or bad, worthy of love or unlovable. These are not questions of right and wrong, but questions of interpretation and judgment.

The Feelings Conversation: What Should We Do with Our Emotions?
Difficult conversations are not just about what happened; they also involve emotion. The question is not whether strong feelings will arise, but how to handle them when they do. Should you tell your boss how you really feel about his management style, or about the colleague who stole your idea? Should you share with your sister how hurt you feel that she stayed friends with your ex? And what should you do with the anger you are likely to experience if you decide to talk with that vendor about his sexist remarks? In the presence of strong feelings, many of us work hard to stay rational. Getting too deep into feelings is messy, clouds good judgment, and in some contexts — for example, at work — can seem just plain inappropriate. Bringing up feelings can also be scary or uncomfortable, and can make us feel vulnerable. After all, what if the other person dismisses our feelings or responds without real understanding? Or takes our feelings to heart in a way that wounds them or irrevocably damages the relationship? And once we’ve gotten our feelings off our chest, it’s their turn. Are we up to hearing all about their anger and pain?

An Opera Without Music (Part of the 'Feelings Conversation')

The problem with this reasoning is that it fails to take account of one simple fact: difficult conversations do not just involve feelings, they are at their very core about feelings. Feelings are not some noisy byproduct of engaging in difficult talk, they are an integral part of the conflict. Engaging in a difficult conversation without talking about feelings is like staging an opera without the music. You’ll get the plot but miss the point. Consider some of your own difficult conversations. What feelings are involved? Hurt or anger? Disappointment, shame, confusion? Do you feel treated unfairly or without respect? For some of us, even saying “I love you” or “I’m proud of you” can feel risky.
In the short term, engaging in a difficult conversation without talking about feelings may save you time and reduce your anxiety. It may also seem like a way to avoid certain serious risks – to you, to others, and to the relationship. But the question remains: if feelings are the issue, what have you accomplished if you don’t address them? Understanding feelings, talking about feelings, managing feelings – these are among the greatest challenges of being human. There is nothing that will make dealing with feelings easy and risk-free. Most of us, however, can do a better job in the Feelings Conversation than we are now. It may not seem like it, but talking about feelings is a skill that can be learned.

3. The Identity Conversation. In the “What Happened?” Conversation, moving away from the truth assumption frees us to shift our purpose from proving we are right to understanding the perceptions, interpretations, and values of both sides, and to offer our views as perceptions, interpretations, and values – not as “the truth.” In situations that give rise to difficult conversations, it is almost always true that what happened is the result of things both people did — or failed to do.
This line of reasoning suggests that we stay out of the Feelings Conversation altogether. The Identity Conversation looks inward: it’s all about who we are and how we see ourselves. In short: before, during, and after the difficult conversation, the Identity Conversation is about what I am saying to myself about me. "Why does my sense of who I am in the world matter here?" Or Jack might be thinking, “This is about the brochure, not about me.” In fact, anytime a conversation feels difficult, it is in part precisely because it is about You, with a capital Y. In fact, what if your boss gives you good reasons for turning you down? Instead of wanting to persuade and get your way, you want to understand what has happened from the other person’s point of view, explain your point of view, share and understand feelings, and work together to figure out a way to manage the problem going forward. We need to have a learning conversation. This book will help you turn difficult conversations into learning conversations by helping you handle each of the Three Conversations more productively and improving your ability to handle all three at once. This will help you shift to a learning stance when it’s your difficult conversation and you aren’t feeling very open. Then we turn to the mechanics of how to talk productively about the issues that matter to you: finding the best ways to begin, inquiring and listening to learn, expressing yourself with power and clarity, and solving problems jointly, including how to get the conversation back on track when the going gets rough. Michael’s version of the story is different from Jack’s: In the past couple of years I’ve really gone out of my way to try to help Jack out, and it seems one thing or another has always gone wrong. But what really made me angry was the way Jack was making excuses about the chart instead of just fixing it. He knew it wasn’t up to professional standards. 
~ ~ ~

When we argue, we tend to trade conclusions – the “bottom line” of what we think: “Get a new mattress” versus “Stop trying to control me.” “I’m going to New York to make it big” versus “You’re naive.” “Couples counseling is helpful” versus “Couples counseling is a waste of time.” But neither conclusion makes sense in the other person’s story. Telling someone to change makes it less rather than more likely that they will.

Understand each other’s stories

Understanding each other’s stories from the inside won’t necessarily “solve” the problem, but as with Karen and Trevor, it’s an essential first step. Second, we each have access to different information. Sitting on his uncle’s shoulders, Andrew shouted with delight as football players, cheerleaders, and the school band rolled by on lavish floats. His Uncle Doug, indifferent to trucks, hadn’t noticed a single truck. Of course, neither Doug nor Andrew walked away from the parade thinking, “I enjoyed my particular perspective on the parade based on the information I paid attention to.” Each walked away thinking, “I enjoyed the parade.” Each assumes that what he paid attention to was what was significant about the experience. Often we go through an entire conversation – or indeed an entire relationship – without ever realizing that each of us is paying attention to different things, that our views are based on different information.

~ ~ ~

Jack doesn’t know that Michael’s graphic designer has taken an unscheduled personal leave in the midst of their busiest season, affecting not just this project but others as well. Jack doesn’t know that Michael has been dissatisfied with some of Jack’s work in the past. But rather than assuming we already know everything we need to, we should assume that there is important information we don’t have access to.
Two especially important factors in how we interpret what we see are: (1) our past experiences and (2) the implicit rules we’ve learned about how things should and should not be done.

~ ~ ~

To celebrate the end of a long project, Bonnie and her coworkers scraped together the money to treat their supervisor, Caroline, to dinner at a nice restaurant.

~ ~ ~

This was not the first time Ollie had arrived late, and Thelma was so frustrated that she had trouble focusing for the first twenty minutes of their meeting. Ollie was frustrated that Thelma was frustrated. Thelma’s rule is “It is unprofessional and inconsiderate to be late.” Ollie’s rule is “It is unprofessional to obsess about small things so much that you can’t focus on what’s important.” Because Thelma and Ollie both interpret the situation through the lens of their own implicit rule, they each see the other person as acting inappropriately. Our implicit rules often take the form of things people “should” or “shouldn’t” do: “You should spend money on education, but not on clothes.” “You should never criticize a colleague in front of others.” “You should never leave the toilet seat up.” “Squeeze the toothpaste in the middle.” “Don’t let the kids watch more than two hours of TV.” The list is endless. There’s nothing wrong with having these rules.

~ ~ ~

Consider the disagreement between Tony and his wife, Keiko. To her shock, Tony says he’s not going with her to visit his sister, but instead is going to watch the football game on TV. When Keiko asks why, Tony mumbles something about this being a “big game,” and adds, “I’ll stop by the hospital tomorrow.” Keiko is deeply troubled by this. “That’s the most selfish, shallow, ridiculous thing I’ve ever heard!” But she catches herself in her own certainty, and instead of saying, “How could you do such a thing?” she negotiates herself to a place of curiosity.
Since Tony believes that his sister won’t care whether he comes today or tomorrow, he chooses in favor of his mental health. It can be awfully hard to stay curious about another person’s story when you have your own story to tell, especially if you’re thinking that only one story can really be right. After all, your story is so different from theirs, and makes so much sense to you. Part of the stress of staying curious can be relieved by adopting what we call the “And Stance.” We usually assume that we must either accept or reject the other person’s story, and that if we accept theirs, we must abandon our own. The And Stance is based on the assumption that the world is complex, that you can feel hurt, angry, and wronged, and they can feel just as hurt, angry, and wronged. Sometimes people have honest disagreements, but even so, the most useful question is not “Who’s right?” but “Now that we really understand each other, what’s a good way to manage this problem?” You may be thinking that the advice to shift from certainty and arguing to curiosity and the And Stance generally makes sense, but that there must be exceptions. Even if you understand another person’s story with genuine insight and empathy, you may still stumble on the next step, thinking that however much their story makes sense to them, you are still “right” and they are still “wrong.” For example, what about the conversation you have with your daughter about her smoking? What may help is to tell him about the impact his drinking has on you, and, further, to try to understand his story.

~ ~ ~

Later that evening things went from bad to worse:

LORI: I really resented it at the party, the way you treated me in front of our friends.
LEO: The way I treated you? What are you talking about?
LORI: About the ice cream. You act like you’re my father or something. You have this need to control me or put me down.
LEO: Lori, I wasn’t trying to hurt you.
You said you were on a diet, and I’m just trying to help you stick to it. You’re so defensive. You hear everything as an attack on you, even when I’m trying to help.
LORI: Help!? Humiliating me in front of my friends is your idea of helping?

Exploring “Lori’s mistake” requires us to understand how our minds work when devising stories about what others intend, and to learn to recognize the set of questionable assumptions upon which these stories are built. Here’s the problem: While we care deeply about other people’s intentions toward us, we don’t actually know what their intentions are.

We Assume Intentions from the Impact on Us

Much of the first mistake can be traced to one basic error: we make an attribution about another person’s intentions based on the impact of their actions on us. We feel hurt; therefore they intended to hurt us. When Margaret finally saw the doctor a week later, she asked curtly how his vacation had been. Yet knowing that he was not acting out of selfishness, but from an unrelated and generous motivation, left Margaret feeling substantially better about having to wait the extra week. With business and even personal relationships increasingly conducted via e-mail, voice mail, faxes, and conference calls, we often have to read between the lines to figure out what people really mean. What’s ironic — and all too human — about our tendency to attribute bad intentions to others is how differently we treat ourselves.

Are There Never Bad Intentions?

The easiest and most common way of expressing these assumptions about the intentions of others is with an accusatory question: “How come you wanted to hurt me?” “Why do you ignore me like this?” “What have I done that makes you feel it’s okay to step all over me?” We might think we are sharing our hurt, frustration, anger, or confusion, but instead we are accusing.
For example, Leo is defensive throughout, and at the end, when he says that he sometimes wonders if Lori “starts these fights on purpose,” he actually accuses Lori of bad intentions. If interviewed about their conversation afterward, both Lori and Leo would report that they were the victim of the other’s bad intentions. You think your boss isn’t giving you enough responsibility. When we think others have bad intentions toward us, it affects our behavior. As we’ve seen, the mistake Lori makes of assuming she knows Leo’s intentions, though seemingly small, has big consequences. Working to understand what the other person is really saying is particularly important because when someone says “You intended to hurt me” that isn’t quite what they mean. The father who is too busy at work to attend his son’s basketball game doesn’t intend to hurt his son. He would prefer not to hurt his son. If the father responds to his son’s complaint by saying “I didn’t intend to hurt you,” he’s not addressing his son’s real concern: “You may not have intended to hurt me, but you knew you were hurting me, and you did it anyway.” It is useful to attempt to clarify your intentions. Another problem with assuming that good intentions sanitize a negative impact is that intentions are often more complex than just “good” or “bad.” Are Leo’s intentions purely angelic? And he is sending a message to Lori that says, “I’m more interested in defending myself than I am in investigating the complexities of what might be going on for me in our relationship.” Interestingly, when people take on the job of thinking hard about their own intentions, it sends a profoundly positive message to the other person about the importance of the relationship. ~ ~ ~ Separating impact from intentions requires us to be aware of the automatic leap from “I was hurt” to “You intended to hurt me.” You can make this distinction by asking yourself three questions: 1. Actions: “What did the other person actually say or do?” 2. 
Impact: “What was the impact of this on me?” 3. Assumption: “Based on this impact, what assumption am I making about what the other person intended?” Hold Your View as a Hypothesis. Once you have clearly answered these three questions, the next step is to make absolutely certain that you recognize that your assumption about their intentions is just an assumption. It is a guess, a hypothesis. Your hypothesis is not based on nothing ; you know what was said or done. But as we’ve seen, this is not a lot of evidence to go on. Your guess might be right and it might be wrong. In fact, your reaction might even say as much about you as it does about what they did. Perhaps you’ve had a past experience that gives their action special meaning to you. Share the Impact on You; Inquire About Their Intentions. You can use your answers to the three questions listed above to begin the difficult conversation itself: say what the other person did, tell them what its impact was on you, and explain your assumption about their intentions, taking care to label it as a hypothesis that you are checking rather than asserting to be true. Consider how this would change the beginning of the conversation between Lori and Leo. Instead of beginning with an accusation, Lori can begin by identifying what Leo said, and what the impact was on her: LORI: You know when you said, “Why don’t you lay off the ice cream”? Well, I felt hurt by that. LEO: You did? LORI: Yeah. LEO: I was just trying to help you stay on your diet. Why does that make you upset? LORI: I felt embarrassed that you said it in front of our friends. Then what I wonder is whether you said it on purpose to embarrass or hurt me. I don’t know why you’d want to do that, but that’s what I’m thinking when it happens. LEO: Well, I’m certainly not doing it on purpose. I guess I didn’t realize it was so upsetting. I’m confused about what it is you want me to say if I see you going off your diet... 
The conversation is only beginning, but it is off to a better start. ~ ~ ~ When you share your assumptions about their intentions, simply be clear that you are sharing assumptions – guesses – and that you are sharing them for the purpose of testing whether they make sense to the other person. When we find ourselves in Leo’s position – being accused of bad intentions – we have a strong tendency to want to defend ourselves: “That is not what I intended.” We are defending our intentions and our character. Remember that the accusation about our bad intentions is always made up of two separate ideas: (1) we had bad intentions and (2) the other person was frustrated, hurt, or embarrassed. And if you start by listening and acknowledging the feelings, and then return to the question of intentions, it will make your conversation significantly easier and more constructive. Understanding how we distort others’ intentions, making difficult conversations even more difficult, is crucial to untangling what happened between us. You blame your assistant, not just because she’s a convenient target for your frustration or because letting others know it was she and not you who screwed up may help salvage your reputation, but because it is the simple truth: this was her fault. You can blame her explicitly, saying something like “I don’t know how you could have let this happen!” Or, if you tend to be less confrontational (or have been taught that blaming people isn’t helpful), you can blame her implicitly, with something less threatening, like “Let’s do better next time.” Either way, she’ll get the message: she’s to blame. Who is the bad person in this relationship? Focusing on blame is a bad idea. Primarily, because blame is often irrelevant and unfair. You can’t move away from blame until you understand what blame is, what motivates us to want to blame each other, and how to move toward something else that will better serve your purposes in difficult conversations. 
When we ask the question “Who is to blame?” we are really asking three questions in one. Did your assistant’s actions (or inaction) cause you to have the wrong storyboards? When blame is in play, you can expect defensiveness, strong emotion, interruptions, and arguments about what “good assistants,” “loving spouses,” or “any reasonable person” should or shouldn’t do. When we blame someone, we are offering them the role of “the accused,” so they do what accused people do: they defend themselves any way they can. The first question is “How did we each contribute to bringing about the current situation?” Or put another way: “What did we each do or not do to get ourselves into this mess?” The second question is “Having identified the contribution system, how can we change it? What can we do about it as we go forward?” In short, contribution is useful when our goal is to understand what actually happened so that we can improve how we work together in the future. In the worlds of both business and personal relationships, too often we deal in blame when our real goals are understanding and change. To illustrate, let’s return to the ExtremeSport story and imagine two contrasting conversations between you and your assistant. The first conversation focuses on blame, the second on contribution. YOU: I wanted to talk to you about my presentation at ExtremeSport. You packed the wrong storyboards. The situation was unbelievably awkward, and made me look terrible. We simply can’t work this way. ASSISTANT: I heard. I’m so sorry. I just, well, you probably don’t want to hear my excuses. YOU: I just don’t understand how you could let this happen. ASSISTANT: I’m really sorry. YOU: I know you didn’t do it on purpose, and I know you feel bad, but I don’t want this to happen again. You understand what I’m saying? ASSISTANT: It won’t. I promise you. 
All three elements of blame are present: you caused this, I’m judging you negatively, and implicit in what I am saying is that one way or another you will be punished, especially if it happens again. In contrast, a conversation about contribution might sound like this: YOU: I wanted to talk to you about my presentation at ExtremeSport. When I arrived I found the wrong storyboards in my briefcase. ASSISTANT: I heard. I’m so sorry. I feel terrible. YOU: I appreciate that. I’m feeling bad too. Let’s retrace our steps and think about how this happened. I suspect we may each have contributed to the problem. From your point of view, did I do anything differently this time? ASSISTANT: I’m not sure. We were working on three accounts at once, and on the one just before this one, when I asked about which boards you wanted packed, you got angry. I know it is my responsibility to know which boards you want, but sometimes when things get hectic, it can get confusing. YOU: If you’re unsure, you should always ask. But it sounds like you’re saying I don’t always make it easy to do that. ASSISTANT: Well, I do feel intimidated sometimes. When you get really busy, it’s like you don’t want to be bothered. The day you left you were in that kind of mood. I was trying to stay out of your way, because I didn’t want to add to your frustration. I had planned to double-check which boards you wanted when you got off the phone, but then I had to run to the copy center. After you left I remembered, but I knew you usually double-checked your briefcase, so I figured it was okay. YOU: Yeah, I do usually double-check, but this time I was so overwhelmed I forgot. I think we’d both better double-check every time. And I do get in those moods. I know it can be hard to interact with me when I’m like that. I need to work on being less impatient and abrupt. But if you’re unsure, I need you to ask questions no matter what kind of mood I’m in.
ASSISTANT: So you want me to ask questions even if I think it will annoy you? YOU: Yes, although I’ll try to be less irritable. Can you do that? ASSISTANT: Well, talking about it like this makes it easier. I realize it’s important. YOU: You can even refer to this conversation. You can say, “I know you’re under pressure, but you made me promise I’d ask this...” Or just say, “Hey, you promised not to be such a jerk!” ASSISTANT: [laughs] Okay, that works for me. YOU: And we might also think about how you could track better which appointments are going to be for which campaigns.... In the second conversation, you and your assistant have begun to identify the contributions that you each brought to the problem, and the ways in which each of your reactions are part of an overall pattern: You feel anxious and distracted about an upcoming presentation, and snap at your assistant. She assumes you want her out of your way, and withdraws. Something falls through the cracks, and then you are even more annoyed and worried the next time you are preparing, since you’re no longer sure you can trust your assistant to help you. So you become more abrupt, increasingly unapproachable, and the communication between you continues to erode. Mistakes multiply. As you get a handle on the interactive system the two of you have created, you can see what you each need to do to avoid or alter that system in the future. As a result, this second conversation is much more likely than the first to produce lasting change in the way you work together. Indeed, the first conversation runs the risk of reinforcing the problem. Since part of the system is that your assistant feels discouraged from talking to you because she fears provoking your anger, a conversation about blame is likely to make that tendency worse, not better. If you go that way, she’ll eventually conclude that you’re impossible to work with, and you’ll report that she’s incompetent. 
Contribution Is Joint and Interactive A contribution system includes inputs from both people. Think about a baseball pitcher facing a batter. If the batter strikes out in a crucial situation, he might explain that he wasn’t seeing well, that his wrist injury was still bothering him, or perhaps that he simply failed to come through in the clutch. The pitcher, however, might describe the strikeout by saying, “I knew he was thinking curve, so I came in with a high fast-ball,” or, “I was in a zone. I knew I had him before he even got in the batter’s box.” Who is right, the batter or the pitcher? Of course, the answer is both, at least in part. Whether the batter strikes out or hits a home run is a result of the interaction between the batter and the pitcher. Depending on your perspective, you might focus on the actions of one or the other, but the actions of both are required for the outcome. It’s the same in difficult conversations. Other than in extreme cases, such as child abuse, almost every situation that gives rise to a conversation is the result of a joint contribution system. Focusing on only one or the other of the contributors obscures rather than illuminates that system. ~ ~ ~ After a car accident, for example, an automaker expecting to be sued may resist making safety improvements for fear it will seem an admission that the company should have done something before the accident. “Truth commissions” often are created because of this trade-off between assigning blame and gaining an understanding of what really happened. Focusing on Blame Hinders Problem-Solving When the dog disappears, who’s to blame? When your real goal is finding the dog, fixing the ceiling, and preventing such incidents in the future, focusing on blame is a waste of time. It neither helps you understand the problem looking back, nor helps you fix it going forward. 
Fundamentally, using the blame frame makes conversations more difficult, while understanding the contribution system makes a difficult conversation easier and more likely to be productive. ~ ~ ~ Contribution asks: “What did I do that helped cause the situation?” You can find contribution even in situations where you carry no blame; you did contribute to being mugged. In his autobiography, A Long Walk to Freedom, Nelson Mandela provides an example of how people who have been overwhelmingly victimized can still seek to understand their own contribution to their problems. One of the most common contributions to a problem, and one of the easiest to overlook, is the simple act of avoiding. ~ ~ ~ "Fighting to show love" has limitations. Yet it and many other less-than-ideal dynamics are surprisingly common, at home and in the workplace. Why? First, because despite its problems the familiar pattern is comfortable, and the members of the group work to keep each person playing their role. Second, because changing a contribution system requires more than just spotting it and recognizing its limitations. The people involved also have to find another way to provide its benefits. George and his parents need to find better ways to demonstrate affection and maintain closeness. And this is likely to require some tough work in their Feelings and Identity Conversations. In an organization, this explains why people find it hard to change how they work together even when they see the limitations of common role assumptions, such as “Leaders set strategy; subordinates implement it.” To change how people interact, they need both an alternate model everyone thinks is better and the skills to make that model work at least as well as the current approach. ~ ~ ~ Two Tools for Spotting Contribution If you are still unable to see your contribution, try one of the following two approaches. 1. 
Role Reversal Ask yourself, “What would they say I’m contributing?” Pretend you are the other person and answer the question in the first person, using pronouns such as I, me, and my. Seeing yourself through someone else’s eyes can help you understand what you’re doing to feed the system. 2. The Observer’s Insight Step back and look at the problem from the perspective of a disinterested observer. Imagine that you are a consultant called in to help the people in this situation better understand why they are getting stuck. How would you describe, in a neutral, nonjudgmental way, what each person is contributing? If you have trouble getting out of your own shoes in this way, ask a friend to try for you. If what your friend comes up with surprises you, don’t reject it immediately. Rather, imagine that it is true. Ask how that could be, and what it would mean. ~ ~ ~ The 'Feelings' Conversation From Chapter: 5 Have Your Feelings (Or They Will Have You) % Feelings Matter: They Are Often at the Heart of Difficult Conversations % We Try to Frame Feelings Out of the Problem. Do not do this. Framing the feelings out of the conversation is likely to result in outcomes that are unsatisfying for both people. The real problem is not dealt with, and further, emotions have an uncanny knack for finding their way back into the conversation, usually in not very helpful ways. 1. Unexpressed Feelings Can Leak into the Conversation 2. Unexpressed Feelings Can Burst into the Conversation 3. Unexpressed Feelings Make It Difficult to Listen 4. Unexpressed Feelings Take a Toll on Our Self-Esteem and Relationships ~ ~ ~ A Way Out of the Feelings Bind There are ways to manage the problem of feelings. Working to get feelings into the conversation is almost always helpful as long as you do so in a purposive way. While the drawbacks of avoiding feelings are inevitable, the drawbacks of sharing feelings are not. 
If you are able to share feelings with skill, you can avoid many of the potential costs associated with expressing feelings and even reap some unexpected benefits. This is the way out of the feelings bind. By following a few key guidelines you can greatly increase your chances of getting your feelings into your conversations and into your relationships in ways that are healthy, meaningful, and satisfying: first, you need to sort out just what your feelings are; second, you need to negotiate with your feelings; and third, you need to share your actual feelings, not attributions or judgments about the other person. Finding Your Feelings: Learn Where Feelings Hide % Explore Your Emotional Footprint % Accept That Feelings Are Normal and Natural. % Recognize That Good People Can Have Bad Feelings. % Learn That Your Feelings Are as Important as Theirs. % Find the Bundle of Feelings Behind the Simple Labels % Don’t Let Hidden Feelings Block Other Emotions. ~ ~ ~ A Landscape of Sometimes Hard-to-Find Feelings Love: Affectionate, caring, close, proud, passionate Anger: Frustrated, exasperated, enraged, indignant Hurt: Let down, betrayed, disappointed, needy Shame: Embarrassed, guilty, regretful, humiliated, self-loathing Fear: Anxious, terrified, worried, obsessed, suspicious Self-Doubt: Inadequate, unworthy, inept, unmotivated Joy: Happy, enthusiastic, full, elated, content Sadness: Bereft, wistful, joyless, depressed Jealousy: Envious, selfish, covetous, anguished, yearning Gratitude: Appreciative, thankful, relieved, admiring Loneliness: Desolate, abandoned, empty, longing ~ ~ ~ % Find the Feelings Lurking Under Attributions, Judgments, and Accusations % Lift the Lid on Attributions and Judgments. 
% We Translate Our Feelings Into Judgments: “If you were a good friend you would have been there for me.” Attributions: “Why were you trying to hurt me?” Characterizations: “You’re just so inconsiderate.” Problem-Solving: “The answer is for you to call me more often.” % Use the Urge to Blame as a Clue to Find Important Feelings. % Don’t Treat Feelings as Gospel: Negotiate with Them % Don’t Vent: Describe Feelings Carefully: 1. Frame Feelings Back into the Problem 2. Express the Full Spectrum of Your Feelings 3. Don’t Evaluate — Just Share % Express Your Feelings Without Judging, Attributing, or Blaming. % Don’t Monopolize: Both Sides Can Have Strong Feelings at the Same Time. % An Easy Reminder: Say “I Feel...” ~ ~ ~ The Identity Conversation From Chapter: 6 Ground Your Identity: Ask Yourself What’s at Stake % Difficult Conversations Threaten Our Identity Three Core Identities There are probably as many identities as there are people. But three identity issues seem particularly common, and often underlie what concerns us most during difficult conversations: - Am I competent? - Am I a good person? - Am I worthy of love? • Am I Competent? “I agonized about whether to bring up the subject of my salary. Spurred on by my colleagues, I finally did. Before I could even get started, my supervisor said, ‘I’m surprised you want to discuss this. The truth is, I’ve been disappointed by your performance this year.’ I felt nauseous. Maybe I’m not the talented chemist I thought I was.” • Am I a Good Person? “I had intended to break up with Sandra that night. I began in a roundabout way, and as soon as she got the drift, she started to cry. It hurt me so much to see her in such pain. The hardest thing for me in life is hurting people I care about; it goes against who I am spiritually and emotionally. I just couldn’t bear how I was feeling, and after a few moments I was telling her how much I loved her and that everything would work out between us.” • Am I Worthy of Love? 
“I began a conversation with my brother about the way he treats his wife. He talks down to her and I know it really bothers her. I was hugely nervous bringing it up, and my words were getting all twisted. Then he shouted, ‘Who are you to tell me how to act?! You’ve never had a real relationship in your whole life!’ After that, I could hardly breathe, let alone talk. All I could think about was how I wanted to get out of there.” Suddenly, who we thought we were when we walked into the conversation is called into question. % An Identity Quake Can Knock Us Off Balance % There’s No Quick Fix % Vulnerable Identities: The All-or-Nothing Syndrome Example: If I am not completely good, then I am probably bad. % Denial Clinging to a purely positive identity leaves no place in our self-concept for negative feedback. If I think of myself as a super-competent person who never makes mistakes, then feedback suggesting that I have made a mistake presents a problem. The only way to keep my identity intact is to deny the feedback — to figure out why it’s not really true, why it doesn’t really matter, or why what I did wasn’t actually a mistake. % Exaggeration: The alternative to denial is exaggeration. In all-or-nothing thinking, taking in negative feedback requires us not just to adjust our self-image, but to flip it. If I’m not completely competent, then I’m completely incompetent: “Maybe I’m not as creative and special as I thought I was. I’ll probably never amount to anything. Maybe I’ll even get fired.” % We Let Their Feedback Define Who We Are. When we exaggerate, we act as if the other person’s feedback is the only information we have about ourselves. We put everything up for grabs, and let what they say dictate how we see ourselves. We may turn in a hundred memos on time, but if we are criticized for being late with the 101st memo, we think to ourselves, “I can never do anything right.” This one piece of information fills our whole identity screen.
This example may seem ridiculous, but we all think like this on occasion, and not only around dramatic or traumatic events. If the waitress gives you a funny look as she collects her tip, you’re cheap. If you don’t help your friends paint their house, you’re selfish. If your brother says you don’t visit his children enough, you’re an uncaring aunt. It’s easy to see why exaggeration is such a debilitating reaction. % Ground Your Identity % Step One: Become Aware of Your Identity Issues % Step Two: Complexify Your Identity (Adopt the And Stance) % Three Things to Accept About Yourself 1. You Will Make Mistakes. 2. Your Intentions Are Complex. 3. You Have Contributed to the Problem. % During the Conversation: Learn to Regain Your Balance 1. Let Go of Trying to Control Their Reaction 2. Prepare for Their Response 3. Imagine That It’s Three Months Or Ten Years From Now 4. Take a Break % Their Identity Is Also Implicated % Raising Identity Issues Explicitly % Find the Courage to Ask for Help ~ ~ ~ Chapter: 7 Create a Learning Conversation What’s Your Purpose? When to Raise It and When to Let Go % To Raise or Not to Raise: How to Decide? - How Do I Know I’ve Made the Right Choice? - Work Through the Three Conversations % Three Kinds of Conversations That Don’t Make Sense - Is the Real Conflict Inside You? - Is There a Better Way to Address the Issue Than Talking About It? - Do You Have Purposes That Make Sense? - Remember, You Can’t Change Other People. - Don’t Focus on Short-Term Relief at Long-Term Cost. - Don’t Hit-and-Run. % Letting Go % Adopt Some Liberating Assumptions - It’s Not My Responsibility to Make Things Better; It’s My Responsibility to Do My Best. - They Have Limitations Too. - This Conflict Is Not Who I Am. - Letting Go Doesn’t Mean I No Longer Care. % If You Raise It: Three Purposes That Work 1. Learning Their Story 2. Expressing Your Views and Feelings 3. 
Problem-Solving Together % Stance and Purpose Go Hand in Hand These three purposes accommodate the fact that you and the other person see the world differently, that you each have powerful feelings about what is going on, and that you each have your own identity issues to work through. Each of you, in short, has your own story. You need purposes that can reckon with this reality. These are the purposes that emerge from a learning stance, from working through the Three Conversations and shifting your internal orientation from certainty to curiosity, from debate to exploration, from simplicity to complexity, from “either/or” to “and.” They may seem simple – perhaps even simplistic. But their straightforwardness masks both the difficulty involved in doing them well and the power they have to transform the way you handle your conversations. Working from a learning stance with these purposes in mind, the rest of this book explores in detail how to conduct a learning conversation, from getting started to getting unstuck. ~ ~ ~ Chapter: 8 Getting Started: Begin from the Third Story % Why Our Typical Openings Don’t Help - We Begin Inside Our Own Story - We Trigger Their Identity Conversation from the Start % Step One: Begin from the Third Story - Think Like a Mediator - Not Right or Wrong, Not Better or Worse - Just Different - The Third Story. Example: Opening Lines From Inside Your Story: If you contest Dad’s will, it’s going to tear our family apart. From the Third Story: I wanted to talk about Dad’s will. You and I obviously have different understandings of what Dad intended, and of what’s fair to each of us. I wanted to understand why you see things the way you do, and to share with you my perspective and feelings. In addition, I have strong feelings and fears about what a court fight would mean for the family; I suspect you do too. -- -- -- From Inside Your Story : I was very upset by what you said in front of our supervisor. 
From the Third Story: I wanted to talk to you about what happened in the meeting this morning. I was upset by something you said. I wanted to explain what was bothering me, and also hear your perspective on the situation. From Inside Your Story: Your son Nathan can be difficult in class — disruptive and argumentative. You’ve said in the past that things at home are fine, but something must be troubling him. From the Third Story: I wanted to share with you my concerns about Nathan’s behavior in class, and hear more about your sense of what might be contributing to it. I know from our past conversation that you and I have different thinking on this. My sense is that if a child is having trouble at school, something is usually bothering him at home, and I know you’ve felt strongly that that’s not true in this case. Maybe together we can figure out what’s motivating Nathan and how to handle it. % If They Start the Conversation, You Can Still Step to the Third Story % Step Two: Extend an Invitation 1. Describe Your Purposes 2. Invite, Don’t Impose 3. Make Them Your Partner in Figuring It Out 4. Be Persistent % Some Specific Kinds of Conversations - Delivering Bad News - Making Requests : “I Wonder If It Would Make Sense...?” - Revisiting Conversations Gone Wrong : Talk About How to Talk About It. 
% A Map for Going Forward: Third Story, Their Story, Your Story - What to Talk About: The Three Conversations - What to Talk About: Explore where each story comes from: “My reactions here probably have a lot to do with my experiences in a previous job...” Share the impact on you: “I don’t know whether you intended this, but I felt extremely uncomfortable when...” Take responsibility for your contribution: “There are a number of things I’ve done that have made this situation harder...” Describe feelings: “I’m anxious about bringing this up, but at the same time, it’s important to me that we talk about it...” Reflect on the identity issues: “I think the reason this subject hooks me is that I don’t like thinking of myself as someone who...” % How to Talk About It: Listening, Expression, and Problem-Solving ~ ~ ~ Chapter: 9 Learning: Listen from the Inside Out % Listening Transforms the Conversation % Listening to Them Helps Them Listen to You % The Stance of Curiosity: How to Listen from the Inside Out - Forget the Words, Focus on Authenticity - The Commentator in Your Head: Become More Aware of Your Internal Voice - Don’t Turn It Off, Turn It Up - Managing Your Internal Voice - Negotiate Your Way to Curiosity. - Don’t Listen: Talk. Sometimes you’ll find that your internal voice is just too strong to take on. You try to negotiate your way to curiosity, but you just can’t get there. If you’re sitting on feelings of pain or outrage or betrayal, or, conversely, if you’re overcome with joy or love, then listening may be a hopeless task. 
% Three Skills: Inquiry, Paraphrasing, and Acknowledgment - Inquire to Learn - Don’t Make Statements Disguised as Questions - Don’t Use Questions to Cross-Examine - Ask Open-Ended Questions - Ask for More Concrete Information - Ask Questions About the Three Conversations - Make It Safe for Them Not to Answer % Paraphrase for Clarity - Check Your Understanding - Show That You’ve Heard % Acknowledge Their Feelings - Answer the Invisible Questions - How to Acknowledge - Order Matters: Acknowledge Before Problem-Solving - Acknowledging Is Not Agreeing % A Final Thought: Empathy Is a Journey, Not a Destination ~ ~ ~ Chapter 10 Expression: Speak for Yourself with Clarity and Power % Orators Need Not Apply % You’re Entitled (Yes, You) - No More, But No Less - Beware Self-Sabotage - Failure to Express Yourself Keeps You Out of the Relationship - Feel Entitled, Feel Encouraged, But Don’t Feel Obligated % Speak the Heart of the Matter - Start with What Matters Most - Say What You Mean: Don’t Make Them Guess - Don’t Rely on Subtext. - Avoid Easing In. - Don’t Make Your Story Simplistic: Use the “Me-Me” And % Telling Your Story with Clarity: Three Guidelines 1. Don’t Present Your Conclusions as The Truth 2. Share Where Your Conclusions Come From 3. Don’t Exaggerate with “Always” and “Never” : Give Them Room to Change % Help Them Understand You - Ask Them to Paraphrase Back - Ask How They See It Differently — and Why ~ ~ ~ Chapter 11 Problem-Solving: Take the Lead % Skills for Leading the Conversation If your conversations are going to get anywhere, you’re going to have to take the lead. There are a set of powerful “moves” you can make during the conversation – reframing, listening, and naming the dynamic – that can help keep the conversation on track, whether the other person is being cooperative or not. When the other person heads in a destructive direction, reframing puts the conversation back on course. It allows you to translate unhelpful statements into helpful ones. 
Listening is not only the skill that lets you into the other person’s world; it is also the single most powerful move you can make to keep the conversation constructive. And naming the dynamic is useful when you want to address a troubling aspect of the conversation. It is a particularly good strategy if the other person is dominating the conversation and seems unwilling to follow your lead.
% Reframe, Reframe, Reframe
% You Can Reframe Anything
Truth -> Different Story
Accusations -> Intentions and impact
Blame -> Contributions
Judgments, characterizations -> Feelings
What's wrong with you -> What's going on for them
% The “You-Me” And
% It’s Always the Right Time to Listen
- Be Persistent About Listening
% Name the Dynamic: Make the Trouble Explicit
% Now What? Begin to Problem-Solve
- It Takes Two to Agree
- Gather Information and Test Your Perceptions
- Propose Crafting a Test
: Say What Is Still Missing
: Say What Would Persuade You
: Ask What (If Anything) Would Persuade Them
- Ask Their Advice
- Invent Options
- Ask What Standards Should Apply
- The Principle of Mutual Caretaking
- If You Still Can’t Agree, Consider Your Alternatives
- It Takes Time
~ ~ ~
Chapter 12 Putting It All Together
Five steps to a difficult conversation:
Step One: Prepare by Walking Through the Three Conversations
Step Two: Check Your Purposes and Decide Whether to Raise It
Step Three: Start from the Third Story
Step Four: Explore Their Story and Yours
Step Five: Problem-Solving
When deciding whether to raise the issue, ask:
• Is this the best way to address the issue and achieve your purposes?
• Is the issue really embedded in your Identity Conversation?
• Can you affect the problem by changing your contributions?
• If you don’t raise it, what can you do to help yourself let go?
Now with sub-steps:
Step 3: Start from the Third Story
1. Describe the problem as the difference between your stories. Include both viewpoints as a legitimate part of the discussion.
2. Share your purposes.
3. Invite them to join you as a partner in sorting out the situation together.
Step 4: Explore Their Story and Yours
• Listen to understand their perspective on what happened. Ask questions. Acknowledge the feelings behind the arguments and accusations. Paraphrase to see if you’ve got it. Try to unravel how the two of you got to this place.
• Share your own viewpoint, your past experiences, intentions, feelings.
• Reframe, reframe, reframe to keep on track. From truth to perceptions, blame to contribution, accusations to feelings, and so on.
Step 5: Problem-Solving
• Invent options that meet each side’s most important concerns and interests.
• Look to standards for what should happen. Keep in mind the standard of mutual caretaking; relationships that always go one way rarely last.
• Talk about how to keep communication open as you go forward.
Tuesday, September 22, 2020
Sentiment Analysis Testing on Some Difficult Sentences
We are going to test three sentiment analyzers:
1. TextBlob
2. BERT-based sentiment analyzer
3. vaderSentiment
The sentences are shown below (a link to the Excel file is given at the bottom).

Note: vaderSentiment could not be found in "conda-forge"; "conda install vaderSentiment -c conda-forge" failed, so we install it with pip:

(temp) C:\Users\Ashish Jain>pip install vaderSentiment
Collecting vaderSentiment
  Downloading vaderSentiment-3.3.2-py2.py3-none-any.whl (125 kB)
     |████████████████████████████████| 125 kB 1.6 MB/s
Requirement already satisfied: requests in e:\programfiles\anaconda3\envs\temp\lib\site-packages (from vaderSentiment) (2.24.0)
Requirement already satisfied: idna<3,>=2.5 in e:\programfiles\anaconda3\envs\temp\lib\site-packages (from requests->vaderSentiment) (2.10)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in e:\programfiles\anaconda3\envs\temp\lib\site-packages (from requests->vaderSentiment) (1.25.10)
Requirement already satisfied: chardet<4,>=3.0.2 in e:\programfiles\anaconda3\envs\temp\lib\site-packages (from requests->vaderSentiment) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in e:\programfiles\anaconda3\envs\temp\lib\site-packages (from requests->vaderSentiment) (2020.6.20)
Installing collected packages: vaderSentiment
Successfully installed vaderSentiment-3.3.2

About vaderSentiment: as of 23-Sep-2020, changes in the current version include refactoring for Python 3 compatibility, improved modularity, and incorporation into NLTK (current version 3.5). Ref: nltk.org

About the scoring of vaderSentiment:
1. The compound score is computed by summing the valence scores of each word in the lexicon, adjusted according to the rules, and then normalized to be between -1 (most extreme negative) and +1 (most extreme positive). This is the most useful metric if you want a single unidimensional measure of sentiment for a given sentence. Calling it a "normalized, weighted composite score" is accurate.
It is also useful for researchers who would like to set standardized thresholds for classifying sentences as either positive, neutral, or negative. Typical threshold values (used in the literature cited on this page) are:
% positive sentiment: compound score >= 0.05
% neutral sentiment: (compound score > -0.05) and (compound score < 0.05)
% negative sentiment: compound score <= -0.05
2. The pos, neu, and neg scores are ratios for the proportions of text that fall in each category (so these should all add up to 1, or close to it with floating-point arithmetic). These are the most useful metrics if you want multidimensional measures of sentiment for a given sentence. For example:
Not bad at all >> {'pos': 0.487, 'compound': 0.431, 'neu': 0.513, 'neg': 0.0}
This scoring can be modified by taking the maximum of 'pos', 'neu', or 'neg'.

Python Code

import pandas as pd
from textblob import TextBlob
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix
from sklearn.metrics import f1_score
from collections import Counter
import requests
import DrawConfusionMatrix as dcm
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

df = pd.read_excel('files_1/Sentences for Sentiment Analysis.xlsx')
df_1 = df.drop_duplicates(subset='Sentence', keep="last")
df_2 = df_1[df_1.Sentiment.isin(['Positive', 'Negative', 'Neutral'])]

def get_sentiment_textblob(sentence):
    blob = TextBlob(sentence)
    sentiment = 'Positive'
    if blob.sentiment.polarity < 0:
        sentiment = 'Negative'
    elif blob.sentiment.polarity == 0:
        sentiment = 'Neutral'
    return sentiment

df_2['textblob'] = df_2.Sentence.apply(lambda x: get_sentiment_textblob(x))

cm = confusion_matrix(df_2['Sentiment'], df_2['textblob'])
print("Accuracy: {:.3f}".format(accuracy_score(df_2['Sentiment'], df_2['textblob'])))
print(Counter(df_2['textblob']))

Accuracy: 0.519
Counter({'Positive': 33, 'Neutral': 28, 'Negative': 18})

classes = ['Negative', 'Neutral', 'Positive']
dcm.plot_confusion_matrix(cm, classes=classes, use_seaborn=True)

For TextBlob: (confusion matrix plot shown in the original post)

For the BERT-based code, we have to start a Flask server and get the sentiment output from there (Ref 1). Note: the BERT-based sentiment analyzer returns only 'Positive' or 'Negative':

def get_sentiment_bert(sentence):
    return requests.get("http://127.0.0.1:5000/?text=" + sentence).json()['sentiment']

df_2['bert_sentiment'] = df_2.Sentence.apply(lambda x: get_sentiment_bert(x))

cm = confusion_matrix(df_2['Sentiment'], df_2['bert_sentiment'])
print("Accuracy: {:.3f}".format(accuracy_score(df_2['Sentiment'], df_2['bert_sentiment'])))
print(Counter(df_2['bert_sentiment']))

Accuracy: 0.722
Counter({'Negative': 43, 'Positive': 36})

classes = ['Negative', 'Neutral', 'Positive']
dcm.plot_confusion_matrix(cm, classes=classes, use_seaborn=True)

print(f1_score(df_2['Sentiment'], df_2['bert_sentiment'], average='macro'))
print(f1_score(df_2['Sentiment'], df_2['bert_sentiment'], average='micro'))
print(f1_score(df_2['Sentiment'], df_2['bert_sentiment'], average='weighted'))

macro: 0.5029855988760098
micro: 0.7215189873417721
weighted: 0.6898667484760773

# vaderSentiment
analyzer = SentimentIntensityAnalyzer()

def get_sentiment_vader(sentence):
    vs = analyzer.polarity_scores(sentence)
    if vs['compound'] >= 0.05:
        rtn_val = 'Positive'
    elif vs['compound'] > -0.05 and vs['compound'] < 0.05:
        rtn_val = 'Neutral'
    else:
        rtn_val = 'Negative'
    return rtn_val

df_2['vader_sentiment'] = df_2.Sentence.apply(lambda x: get_sentiment_vader(x))

Accuracy: 0.557
Counter({'Positive': 41, 'Neutral': 23, 'Negative': 15})

Confusion matrix, without normalization
[[11 10  9]
 [ 0  4  3]
 [ 4  9 29]]

The BERT-based code performs best, with an accuracy of about 0.72.

The link to the Excel file containing the sentences: GitHub

Ref 1: Sentiment Analysis using BERT, DistilBERT and ALBERT
Ref 2: Sentiment Analysis Tutorial (2014, Bing Liu)
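As a sanity check, two of the headline numbers above can be reproduced with plain Python and no third-party libraries. This is a minimal sketch: it maps a VADER compound score to a label using the thresholds quoted earlier, and recomputes the vaderSentiment accuracy from the printed confusion matrix (assuming rows and columns are in the order Negative, Neutral, Positive, matching the `classes` list used in the code).

```python
# Map a VADER compound score to a label, using the thresholds cited above.
def label_from_compound(compound):
    if compound >= 0.05:
        return 'Positive'
    if compound <= -0.05:
        return 'Negative'
    return 'Neutral'

# "Not bad at all" had compound = 0.431 in the example above.
print(label_from_compound(0.431))   # -> Positive

# Recompute accuracy from the vaderSentiment confusion matrix printed above:
# accuracy = sum of the diagonal (correct predictions) / total sentence count.
cm = [[11, 10, 9],
      [0, 4, 3],
      [4, 9, 29]]
correct = sum(cm[i][i] for i in range(3))        # 11 + 4 + 29 = 44
total = sum(sum(row) for row in cm)              # 79 sentences
print(round(correct / total, 3))                 # -> 0.557, matching the report
```

Note that for multiclass problems, micro-averaged F1 equals accuracy, which is why the micro F1 reported for BERT (0.7215) matches its accuracy (0.722).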
Monday, September 21, 2020
Flutter Notes (Week 2, Sep 2020)
Ques: How to fix the initialization error for DefaultKotlinSourceSetKt?

Error: Could not initialize class org.jetbrains.kotlin.gradle.plugin.sources.DefaultKotlinSourceSetKt

Answer: This error appears on upgrading Android Studio (from the October 2019 release to the September 2020 release, v4.0.1) and the SDK components. The fix is to update the Kotlin plugin via Android Studio -> Tools -> Kotlin -> Check for Updates, and then change the Kotlin version in build.gradle as below:

ext.kotlin_version = '1.3.72'  (old value: 1.3.10)

Issues appearing in the "Build" process logs:

1. Deprecation warnings from the Android Gradle plugin:

WARNING: API 'variant.getMergeAssets()' is obsolete and has been replaced with 'variant.getMergeAssetsProvider()'.
It will be removed in version 5.0 of the Android Gradle plugin.
For more information, see https://d.android.com/r/tools/task-configuration-avoidance.
To determine what is calling variant.getMergeAssets(), use -Pandroid.debug.obsoleteApi=true on the command line to display more information.
WARNING: API 'variantOutput.getProcessResources()' is obsolete and has been replaced with 'variantOutput.getProcessResourcesProvider()'.
It will be removed in version 5.0 of the Android Gradle plugin.
For more information, see https://d.android.com/r/tools/task-configuration-avoidance.
To determine what is calling variantOutput.getProcessResources(), use -Pandroid.debug.obsoleteApi=true on the command line to display more information.

2. > Task :app:compileFlutterBuildDebugArm
╔====
║ A new version of Flutter is available!
║
║ To update to the latest version, run "flutter upgrade".
╚====

C:\Users\Ashish Jain>flutter --version
Waiting for another flutter command to release the startup lock...
Flutter 1.9.1+hotfix.2 • channel stable • https://github.com/flutter/flutter.git
Framework • revision 2d2a1ffec9 (1 year, 1 month ago) • 2019-09-06 18:39:49 -0700
Engine • revision b863200c37
Tools • Dart 2.5.0

3. Manifest merger failed:

Overlay manifest:package attribute declared at AndroidManifest.xml:2:5-37 value=(com.survival8.survival) has a different value=(com.survival8.one) declared in main manifest at AndroidManifest.xml:2:5-32
Suggestion: remove the overlay declaration at AndroidManifest.xml and place it in the build.gradle:
flavorName { applicationId = "com.survival8.survival" }

4. On replacing the entire code from a back-up of code developed on Ubuntu ("\home\administrator\flutter" was our path on the Ubuntu machine):

Error: C:\Users\Ashish Jain\AndroidStudioProjects\survival\android\app\home\administrator\flutter\packages\flutter_tools\gradle\flutter.gradle (The system cannot find the path specified)

We have a "local.properties" file at:
C:\Users\Ashish Jain\AndroidStudioProjects\survival\android\local.properties

Contents from the Ubuntu back-up code:

# This file must *NOT* be checked into Version Control Systems,
# as it contains information specific to your local configuration.
#
# Location of the SDK. This is only used by Gradle.
# For customization when using a Version Control System, please read the header note.
# Mon Sep 21 19:31:06 IST 2020
flutter.buildMode=release
flutter.versionName=1.0.0
flutter.sdk=/home/administrator/flutter
sdk.dir=C\:\\Users\\Ashish Jain\\AppData\\Local\\Android\\Sdk
flutter.versionCode=1

Here, we need to change the property "flutter.sdk" according to our current system:
flutter.sdk=E\:\\programfiles\\flutter

C:\Users\Ashish Jain\AndroidStudioProjects>flutter upgrade
Flutter 1.20.4 • channel stable • https://github.com/flutter/flutter.git
Framework • revision fba99f6cf9 (7 days ago) • 2020-09-14 15:32:52 -0700
Engine • revision d1bc06f032
Tools • Dart 2.9.2
Running flutter doctor...
Doctor summary (to see all details, run flutter doctor -v):
[√] Flutter (Channel stable, 1.20.4, on Microsoft Windows [Version 10.0.18363.1082], locale en-US)
[!] Android toolchain - develop for Android devices (Android SDK version 29.0.2)
    X Android license status unknown.
      Try re-installing or updating your Android SDK Manager.
      See https://developer.android.com/studio/#downloads or visit https://flutter.dev/docs/get-started/install/windows#android-setup for detailed instructions.
[√] Android Studio (version 4.0)

C:\Users\Ashish Jain\AndroidStudioProjects>flutter --version
Flutter 1.20.4 • channel stable • https://github.com/flutter/flutter.git
Framework • revision fba99f6cf9 (7 days ago) • 2020-09-14 15:32:52 -0700
Engine • revision d1bc06f032
Tools • Dart 2.9.2

5. When opening a project, open the "android" folder, not the parent directory such as "survival" or "survival8".

6. Where to view the Gradle version that is running:

7. Way to clone the 'stable' Flutter GitHub repository:

$ git clone -b stable https://github.com/flutter/flutter.git
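One more note on the "Android license status unknown" warning that flutter doctor reports above: a common way to clear it (not covered in the original troubleshooting steps, so treat this as a suggestion rather than the fix the author used) is to accept the Android SDK licenses through Flutter's own tooling. This requires the Android SDK command-line tools to be installed.

```shell
# Interactively accept the Android SDK licenses so that
# "flutter doctor" stops flagging the Android toolchain.
flutter doctor --android-licenses

# Re-run the check afterwards to confirm the [!] entry is resolved.
flutter doctor
```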